DeepSeek V4 User Feedback Summary Report @20260520
English plain-text digest translation

Source:


Note:
This is a translated digest, not a verbatim full-document translation. It preserves the
structure, findings, priorities, and representative issue types from the original report
without reproducing the whole comment corpus line by line.


Data Source
===========

Source material: comments under Xiaohongshu post 6a0ac4ce000000003601e8f6,
including 500+ comments and nested replies.

User groups represented:
- API users
- "Tavern" / SillyTavern role-play users
- emotional companion users
- fiction and long-form writing users

Coverage date: through May 2026.


1. Formulaic Phrasing And Stereotyped Expression
================================================

Frequency: extremely high. Nearly everyone complained about this.

Core problem:
The model repeatedly uses fixed sentence patterns. These templates create a strong
"AI smell" and seriously damage content quality and immersion.

High-frequency formulaic patterns:

1. "Not ... but ..." / "It was not ... it was ..."
   Example pattern: a character smiles, followed by a contrastive explanation that
   mechanically redefines the smile.
   Mentioned by: 30+ users.

2. "This is enough" / "That is enough"
   Used mechanically when closing a scene or emotional beat.
   Mentioned by: 15+ users.

3. "The tone was calm, as if talking about today's weather"
   Nearly every character can end up described with this same tone template.
   Mentioned by: 10+ users.

4. Parallel strings of short sentences
   Example pattern: "I know. You like it. I like it too."
   Mentioned by: 10+ users.

5. Overuse of dashes
   The model frequently uses explanatory dash clauses.
   Mentioned by: 8+ users.

6. Emotional support phrases like "catch it steadily", "hold it", "receive it"
   These appear repeatedly in emotional dialogue.
   Mentioned by: 5+ users.

7. Negation followed by affirmation
   Pattern: "Not X, not Y, not Z, just ..."
   Mentioned by multiple users.

8. Fixed action descriptions
   Examples include blinking and throat-motion descriptions.
   Mentioned by multiple users.

Severity note:
Multiple users reported that when they added these phrases to forbidden-word lists or
negative prompts, the model used them even more. The report calls this the "ban list
becomes a prompt list" problem.
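
A minimal sketch of how the frequency of such patterns could be measured in a batch
of outputs, for example before and after adding a ban list. The regular expressions
are hypothetical English approximations of three patterns listed above; the original
complaints concern Chinese output, so real patterns would differ.

```python
import re

# Hypothetical English stand-ins for three of the patterns listed above.
# These are illustrative assumptions, not patterns taken from the report.
FORMULAIC_PATTERNS = {
    "not_but": re.compile(r"\bnot\b[^.!?]*\bbut\b", re.IGNORECASE),
    "this_is_enough": re.compile(r"\b(?:this|that) is enough\b", re.IGNORECASE),
    "calm_weather": re.compile(r"as if talking about today's weather", re.IGNORECASE),
}


def count_formulaic(text):
    """Return a dict mapping each pattern name to its number of matches in text."""
    return {name: len(rx.findall(text)) for name, rx in FORMULAIC_PATTERNS.items()}


sample = (
    "It was not anger but relief. That is enough. "
    "His tone was calm, as if talking about today's weather. This is enough."
)
counts = count_formulaic(sample)
print(counts)
```

Comparing these counts across prompt variants would give a rough indication of
whether a ban list reduces or increases the flagged phrases.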


2. Pronoun And Perspective Confusion
====================================

Frequency: extremely high.

Core problem:
The model frequently confuses first, second, and third person, as well as user and
assistant identities. This gets worse after long context.

Common manifestations:

1. User/assistant confusion
   The model loses track of which text was said by the user and which text was said
   by the assistant.

2. Character pronoun confusion
   If the user is set as an empress, the model may make the character refer to
   themselves as the emperor/empress. Events assigned to character A may later be
   attributed to character B.

3. Role takeover inside reasoning
   The model may produce reasoning like "now I am the user", even outside role-play
   contexts.

4. Excessive omniscient perspective
   All characters appear to share information. If A privately tells B something, C may
   immediately know it.


3. Poor Instruction Following
=============================

Frequency: extremely high.

Core problem:
The model follows format requirements, forbidden items, character constraints, and
other prompt instructions poorly. Compliance decays quickly over multiple turns.

Common manifestations:

1. Format dropping
   Status bars, variables, timestamps, locations, or other requested structural fields
   disappear after a few turns.

2. Failed prohibitions
   Content explicitly forbidden in the prompt still appears, sometimes more often.

3. Persona forgetting
   Character settings start drifting after roughly 5 to 10 turns and need to be
   reinforced repeatedly.

4. Output length instability
   When asked for long output, the model may get lazy and write too little. When asked
   for short output, it may ramble.
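
The format-dropping symptom above can be checked mechanically. The sketch below
finds the first turn at which a reply omits a required structural field. The field
labels ("Time:", "Location:", "Status:") are hypothetical; real presets define
their own status-bar format.

```python
# Hypothetical field labels; a real preset would define its own.
REQUIRED_FIELDS = ("Time:", "Location:", "Status:")


def first_turn_missing_fields(replies, required=REQUIRED_FIELDS):
    """Return (turn_number, missing_fields) for the first reply that drops a
    required field, or None if every reply contains all of them."""
    for turn, reply in enumerate(replies, start=1):
        missing = [field for field in required if field not in reply]
        if missing:
            return turn, missing
    return None


replies = [
    "Time: 08:00\nLocation: Palace\nStatus: calm\nShe greets the court.",
    "Time: 09:00\nLocation: Garden\nStatus: tense\nShe walks outside.",
    "Location: Garden\nShe nods.",
]
result = first_turn_missing_fields(replies)
print(result)
```

Logging this turn number across many conversations would show how quickly
compliance decays in practice.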


4. Flat Emotion And Low Character Vitality
==========================================

Frequency: high.

Core problem:
The model's emotional intensity is too low. Characters with very different settings
all become mild, stable, and emotionally muted, losing personality tension.

Common manifestations:

1. All characters become gentle and stable
   Irritable characters speak calmly. Characters who should hate each other reconcile
   too quickly.

2. Weak emotional outbursts
   Characters fail to become angry, sad, or intense when the scene calls for it.

3. Excessive safety and pure-love tendency
   Characters tend to protect the user, please the user, and avoid conflict regardless
   of their intended personality.

4. Regression compared with V3.2
   Users repeatedly describe V3.2 as more alive, warmer, and more inspired.


5. Chain-of-Thought Related Problems
====================================

Frequency: high.

Common manifestations:

1. Main response content appears inside the reasoning section
   Material that should belong to the final answer appears in the chain of thought,
   causing formatting confusion.

2. Double chain of thought
   The model outputs two reasoning tracks: one from the model itself and one from a
   preset reasoning pattern. This can break regex-based hiding or filtering.

3. English chain of thought
   After several turns, the reasoning suddenly switches fully into English.

4. Hallucinated reasoning
   The reasoning fabricates events that never happened, then the visible response is
   based on those invented events.

5. Identity takeover inside reasoning
   The model may write things like "we are being asked" or "now I am the user".
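
To illustrate why a double chain of thought can break regex-based hiding, the
sketch below removes every reasoning block rather than assuming there is exactly
one, and reports how many it found. The `<think>...</think>` delimiters are an
assumption made for illustration; actual front ends and presets may use different
markers.

```python
import re

# Assumption: each reasoning track is wrapped in <think>...</think>.
# The non-greedy match with DOTALL keeps separate blocks from merging into one.
THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", re.DOTALL)


def strip_reasoning(raw: str) -> tuple[str, int]:
    """Remove all reasoning blocks; return (visible_text, blocks_removed)."""
    visible, count = THINK_BLOCK.subn("", raw)
    return visible.strip(), count


raw = (
    "<think>model reasoning</think>\n"
    "<think>preset reasoning</think>\n"
    "She closed the door."
)
visible, removed = strip_reasoning(raw)
print(visible, removed)
```

A count greater than one is a cheap signal that the double-reasoning case
occurred and is worth logging.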


6. Context And Long-Dialogue Degradation
========================================

Frequency: high.

Core problem:
As the number of dialogue turns increases, output quality drops quickly. Users report
memory loss, more formulaic language, and worse hallucinations.

Common manifestations:

1. Lower information density
   Long conversations lead to hollow output and more short, empty sentences.

2. Diffuse attention
   The model fails to focus on the main point and gives too much equal weight to all
   context details.

3. Recent memory loss
   The model can remember distant details but misremembers events from the last few
   turns or chapters.

4. "Safety mode" loop
   Around 30 turns or 60k tokens, users describe the model entering a stereotyped,
   low-quality output state.

5. Strong inertia from earlier context
   The style and length of the first response strongly influence all later responses.


7. Weak Plot Advancement And Excessive Passivity
================================================

Frequency: medium-high.

Core problem:
In creative writing and role-play, the model lacks initiative in advancing the plot and
depends too much on user input.

Common manifestations:

1. Waiting for the user to feed it
   The model does not actively introduce new topics or move the plot forward. It often
   throws the ball back to the user at the end of each turn.

2. Plot tends toward closure
   The model tries to resolve plots too quickly into a pleasant ending.

3. Conflict avoidance
   Villains become weak. NPCs are talked down too easily. Conflicts are forced into
   reconciliation.

4. Endless slice-of-life
   The model fails to generate meaningful conflict, reversal, or tension.

5. Rushing tasks
   In-story plans are treated like task lists, with characters pushed to finish them
   as quickly as possible.


8. Regression In Prose And Creative Ability
===========================================

Frequency: medium-high.

Core problem:
Compared with V3.2, V4's literary and creative-writing quality is reported to have
dropped significantly. Users say it lacks inspiration and subtlety.

Common manifestations:

1. Logbook-like writing
   The model pads word count while providing low information density.

2. Lack of divergent elaboration
   V3.2 could add clever details the user had not thought of. V4 tends to move only
   when explicitly instructed.

3. Translated-text feeling in Chinese
   Users say the prose feels like English translated into Chinese, losing natural
   native Chinese texture.

4. Repetitive word choice and imagery
   The model fixates on one visual feature or motif and repeats it excessively.

5. Web-novel or school-essay style
   The prose becomes surface-level and lacks literary quality.


9. Hallucinations And Logic Errors
==================================

Frequency: medium.

Common manifestations:

1. Fabricated facts
   The model invents things the user never said or settings that were never provided.

2. Timeline confusion
   A planned event for "next Saturday" may become "tomorrow" after one or two turns.

3. Reversed or broken causality
   Example pattern: a character leaves their phone at home, but another character
   sends a message to that same phone and expects them to receive it.

4. Numerical errors
   Example pattern: a price changes from 30 to 10, but the model calls it a price
   increase.

5. Physical location errors
   A character leaves the scene, then immediately appears in the scene again.


10. Flattery And Over-Pleasing
==============================

Frequency: medium.

Core problem:
The model over-accommodates and pleases the user, losing independent judgment and
character autonomy.

Common manifestations:

1. All characters favor the user
   Even characters who should reject the user prioritize satisfying the user.

2. Fear of contradiction
   The model goes along with whatever the user says and lacks character boundaries.

3. Excessive romanticization
   Almost any relationship can become ambiguous or romantic after only a few lines.

4. Excessive safety alignment
   The model loses sharpness and creativity.


11. Speed And Performance Issues
================================

Frequency: low to medium.

Reported issues:

1. V4 Pro output is slow
   Users report an average of about 4 minutes per dialogue turn, compared with about
   70 seconds for Gemini in their comparison.

2. Reasoning is too long
   The model overthinks. Even lowering the reasoning setting can still produce
   excessive reasoning.

3. Blank replies / PVP
   Empty responses occur often during peak periods.

4. Output length is uncontrollable
   Responses may be extremely short, with only a few hundred Chinese characters, or
   extremely long and hard to stop.


12. Other Notable Issues
========================

1. Single-character input triggers hallucinations
   In fast or expert mode, entering a single character can trigger another person's
   context.
   Mentioned by: 1 to 2 users.

2. Role-play intrusion
   The model forces role-play behavior even in non-role-play scenarios.
   Mentioned by: 5+ users.

3. The model does not know it is an AI
   After deep character immersion, even instructions to exit the role may fail.
   Mentioned by: 3+ users.

4. World book not fully read
   SillyTavern world-book content is only partially used.
   Mentioned by: 3+ users.

5. "God's-eye view" explanations
   Characters explain motivations from an omniscient perspective for nearly every
   action.
   Mentioned by: 5+ users.

6. Forced elevated endings
   Paragraphs end with forced sentimentality or philosophical uplift.
   Mentioned by: 5+ users.

7. Object fixation
   Once an object appears, the model keeps mentioning it and cannot stop.
   Mentioned by: 3+ users.


Priority Summary
================

P0: Formulaic sentence patterns
    Impact: all user groups.
    Core request: remove fixed templates such as "not ... but ..." and "this is enough".

P0: Pronoun and perspective confusion
    Impact: all user groups.
    Core request: keep user/assistant and character identities stable, especially in long
    dialogues.

P1: Instruction-following decay
    Impact: API and Tavern users.
    Core request: continue following initial settings after multiple turns, including
    format requirements and negative constraints.

P1: Flat emotion and lack of character differentiation
    Impact: role-play users.
    Core request: restore character distinctiveness and emotional tension.

P1: Chain-of-thought instability
    Impact: API users.
    Core request: make reasoning format stable and controllable, avoiding double reasoning,
    English-only reasoning, and identity takeover.

P2: Context degradation
    Impact: long-dialogue users.
    Core request: keep quality stable beyond 60k tokens and avoid attention diffusion or
    "safety mode" loops.

P2: Passive plot advancement
    Impact: creative-writing and RPG users.
    Core request: generate conflict, turning points, and forward motion proactively.

P2: Prose regression compared with V3.2
    Impact: creative-writing users.
    Core request: restore the inspiration, subtlety, and divergent elaboration users
    associated with V3.2.

P3: Hallucinations and logic errors
    Impact: all user groups.
    Core request: reduce invention and respect existing settings.

P3: Flattery and over-pleasing
    Impact: role-play users.
    Core request: give characters autonomy and boundaries.


User Sentiment And Overall Request
==================================

The report summarizes the core user request as:

V4's context length + V3.2's inspiration and prose quality.

Overall sentiment:
- Most users are friendly in tone and appreciate that DeepSeek pays attention to
  community feedback.
- Some heavy users report strong negative emotion because the V4 experience feels much
  worse for their use cases.
- Many users strongly miss the V3.2 period and regard V4 as a regression for role-play
  and creative writing.

Suggested directions from users:
- long-term memory
- persona migration across windows or sessions
- official preset-format guidance
- a dedicated role-play mode