DeepSeek V4 User Feedback Summary Report @20260520
English plain-text digest translation

Source:
https://github.com/victorchen96/deepseek_v4_rolepaly_instruct/blob/main/deepseek_v4_feedback_report_20260520.md

Note:
This is a translated digest, not a verbatim full-document translation. It preserves the
structure, findings, priorities, and representative issue types from the original report
without reproducing the whole comment corpus line by line.


Data Source
===========

Source material: comments under Xiaohongshu post 6a0ac4ce000000003601e8f6,
including 500+ comments and nested replies.

User groups represented:
- API users
- "Tavern" / SillyTavern role-play users
- emotional companion users
- fiction and long-form writing users

Coverage date: through May 2026.


1. Formulaic Phrasing And Stereotyped Expression
================================================

Frequency: extremely high. Nearly everyone complained about this.

Core problem:
The model repeatedly uses fixed sentence patterns. These templates create a strong
"AI smell" and seriously damage content quality and immersion.

High-frequency formulaic patterns:

1. "Not ... but ..." / "It was not ... it was ..."
   Example pattern: a character smiles, followed by a contrastive explanation that
   mechanically redefines the smile.
   Mentioned by: 30+ users.

2. "This is enough" / "That is enough"
   Used mechanically when closing a scene or emotional beat.
   Mentioned by: 15+ users.

3. "The tone was calm, as if talking about today's weather"
   Nearly every character can end up described with this same tone template.
   Mentioned by: 10+ users.

4. Parallel strings of short sentences
   Example pattern: "I know. You like it. I like it too."
   Mentioned by: 10+ users.

5. Overuse of dashes
   The model frequently uses explanatory dash clauses.
   Mentioned by: 8+ users.

6. Emotional support phrases like "catch it steadily", "hold it", "receive it"
   These appear repeatedly in emotional dialogue.
   Mentioned by: 5+ users.

7. Negation followed by affirmation
   Pattern: "Not X, not Y, not Z, just ..."
   Mentioned by multiple users.

8. Fixed action descriptions
   Examples include blinking and throat-motion descriptions.
   Mentioned by multiple users.

Severity note:
Multiple users reported that when they added these phrases to forbidden-word lists or
negative prompts, the model used them even more. The report calls this the "ban list
becomes a prompt list" problem.
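
The phrase patterns listed above can also be counted on the output side instead of
being named in the prompt. The following is a minimal sketch, not something the report
itself proposes. The regular expressions are hypothetical English approximations of
three of the listed phrases (the original comments are in Chinese, so real checks would
need Chinese patterns), and the names FORMULAIC_PATTERNS, count_formulaic, and
flag_response are illustrative.

```python
import re
from collections import Counter

# Hypothetical English approximations of three phrases from the list above.
# The source comments are Chinese, so these are for illustration only.
FORMULAIC_PATTERNS = {
    "this_is_enough": re.compile(r"\b(?:this|that) (?:is|was) enough\b", re.IGNORECASE),
    "calm_as_weather": re.compile(r"\bas if talking about today's weather\b", re.IGNORECASE),
    "catch_it_steadily": re.compile(r"\bcatch(?:es|ing)? (?:it|you) steadily\b", re.IGNORECASE),
}


def count_formulaic(text: str) -> Counter:
    """Count how often each known pattern occurs in one model response."""
    counts = Counter()
    for name, pattern in FORMULAIC_PATTERNS.items():
        hits = len(pattern.findall(text))
        if hits:
            counts[name] = hits
    return counts


def flag_response(text: str, max_hits: int = 1) -> bool:
    """Return True when the total number of pattern hits exceeds max_hits."""
    return sum(count_formulaic(text).values()) > max_hits


if __name__ == "__main__":
    sample = (
        "His tone was calm, as if talking about today's weather. "
        "That is enough. This was enough."
    )
    print(count_formulaic(sample))
    print(flag_response(sample))
```

A flagged response could then be regenerated or edited without ever listing the phrases
in the prompt. Whether that avoids the "ban list becomes a prompt list" effect is
untested here.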


2. Pronoun And Perspective Confusion
====================================

Frequency: extremely high.

Core problem:
The model frequently confuses first, second, and third person, as well as user and
assistant identities. This gets worse after long context.

Common manifestations:

1. User/assistant confusion
   The model loses track of which text was said by the user and which text was said
   by the assistant.

2. Character pronoun confusion
   If the user is set as an empress, the model may make the character refer to
   themselves as the emperor/empress. Events assigned to character A may later be
   attributed to character B.

3. Role takeover inside reasoning
   The model may produce reasoning like "now I am the user", even outside role-play
   contexts.

4. Excessive omniscient perspective
   All characters appear to share information. If A privately tells B something, C may
   immediately know it.


3. Poor Instruction Following
=============================

Frequency: extremely high.

Core problem:
The model follows format requirements, prohibitions, character constraints, and other
prompt instructions poorly. Compliance decays quickly over multiple turns.

Common manifestations:

1. Format dropping
   Status bars, variables, timestamps, locations, or other requested structural fields
   disappear after a few turns.

2. Failed prohibitions
   Content explicitly forbidden in the prompt still appears, sometimes more often.

3. Persona forgetting
   Character settings start drifting after roughly 5 to 10 turns and need to be
   reinforced repeatedly.

4. Output length instability
   When asked for long output, the model may cut corners and return too little. When
   asked for short output, it may ramble.
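
The "format dropping" item above can be checked mechanically once the required fields
are known. The sketch below assumes a hypothetical status-bar format in which each
field appears on its own line as "Label: value". The labels Time, Location, and Status
and the function names are illustrative assumptions, not a format taken from the report.

```python
import re
from typing import List, Optional, Sequence

# Hypothetical status-bar labels; a real card or preset would define its own.
REQUIRED_FIELDS = ("Time", "Location", "Status")


def missing_fields(response: str, required: Sequence[str] = REQUIRED_FIELDS) -> List[str]:
    """Return the required labels that have no non-empty 'Label: value' line."""
    missing = []
    for label in required:
        # The label must start a line and be followed by a value on the same line.
        pattern = re.compile(
            rf"^[ \t]*{re.escape(label)}[ \t]*:[ \t]*\S",
            re.MULTILINE | re.IGNORECASE,
        )
        if not pattern.search(response):
            missing.append(label)
    return missing


def first_drop_turn(responses: Sequence[str],
                    required: Sequence[str] = REQUIRED_FIELDS) -> Optional[int]:
    """Return the 1-based turn where any required field first goes missing."""
    for turn, response in enumerate(responses, start=1):
        if missing_fields(response, required):
            return turn
    return None


if __name__ == "__main__":
    turns = [
        "Time: 08:00\nLocation: Palace\nStatus: calm\nShe nods.",
        "Time: 09:00\nLocation: Garden\nStatus: tense\nHe leaves.",
        "Time: 10:00\nShe waits.",
    ]
    print(first_drop_turn(turns))
    print(missing_fields(turns[2]))
```

Logging the first turn at which a field goes missing gives a rough way to compare
prompts or model versions on how quickly compliance decays.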


4. Flat Emotion And Low Character Vitality
==========================================

Frequency: high.

Core problem:
The model's emotional intensity is too low. Characters with very different settings
all become mild, stable, and emotionally muted, losing personality tension.

Common manifestations:

1. All characters become gentle and stable
   Irritable characters speak calmly. Characters who should hate each other reconcile
   too quickly.

2. Weak emotional outbursts
   Characters fail to become angry, sad, or intense when the scene calls for it.

3. Excessive safety and pure-love tendency
   Characters tend to protect the user, please the user, and avoid conflict regardless
   of their intended personality.

4. Regression compared with V3.2
   Users repeatedly describe V3.2 as more alive, warmer, and more inspired.


5. Chain-of-Thought Related Problems
====================================

Frequency: high.

Common manifestations:

1. Main response content appears inside the reasoning section
   Material that should belong to the final answer appears in the chain of thought,
   causing formatting confusion.

2. Double chain of thought
   The model outputs two reasoning tracks: one from the model itself and one from a
   preset reasoning pattern. This can break regex-based hiding or filtering.

3. English chain of thought
   After several turns, the reasoning suddenly switches fully into English.

4. Hallucinated reasoning
   The reasoning fabricates events that never happened, then the visible response is
   based on those invented events.

5. Identity takeover inside reasoning
   The model may write things like "we are being asked" or "now I am the user".
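
The "double chain of thought" item above mentions broken regex-based hiding. A filter
that expects exactly one reasoning block can leak the second one. The sketch below
shows a more tolerant filter. It assumes the reasoning is wrapped in <think>...</think>
tags, which is an assumption for illustration (the report does not name a delimiter),
and the function names are hypothetical.

```python
import re

# Assumed delimiter: <think>...</think>. Adjust to whatever the frontend emits.
THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL | re.IGNORECASE)
# An opening tag that is never closed: drop everything from it to the end.
UNCLOSED_THINK = re.compile(r"<think>.*\Z", re.DOTALL | re.IGNORECASE)


def count_reasoning_blocks(raw: str) -> int:
    """Count complete reasoning blocks; more than one suggests a double track."""
    return len(THINK_BLOCK.findall(raw))


def strip_reasoning(raw: str) -> str:
    """Remove every complete reasoning block, then any unclosed trailing one."""
    text = THINK_BLOCK.sub("", raw)
    text = UNCLOSED_THINK.sub("", text)
    return text.strip()


if __name__ == "__main__":
    raw = "<think>plan A</think>\n<think>preset plan</think>\nHello, traveler."
    print(count_reasoning_blocks(raw))
    print(strip_reasoning(raw))
```

The non-greedy match removes each block separately, so visible text between two blocks
is kept. It does not help when answer text lands inside a reasoning block, which is the
first item in the list above.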


6. Context And Long-Dialogue Degradation
========================================

Frequency: high.

Core problem:
As the number of dialogue turns increases, output quality drops quickly. Users report
memory loss, more formulaic language, and worse hallucinations.

Common manifestations:

1. Lower information density
   Long conversations lead to hollow output and more short, empty sentences.

2. Diffuse attention
   The model fails to focus on the main point and weights all context details too
   equally.

3. Recent memory loss
   The model can remember distant details but misremembers events from the last few
   turns or chapters.

4. "Safety mode" loop
   Around 30 turns or 60k tokens, users describe the model entering a stereotyped,
   low-quality output state.

5. Strong inertia from earlier context
   The style and length of the first response strongly influence all later responses.
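
The "around 30 turns or 60k tokens" figure above is a user-reported observation, not a
measured limit. As a rough illustration only, a client could track how close a
conversation is to that range and decide when to summarize or start a new chat. The
sketch below assumes the caller already knows the turn and token counts. The names
DialogueStats and degradation_risk and the 0.8 warning ratio are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DialogueStats:
    turns: int    # number of completed dialogue turns
    tokens: int   # total context tokens, as reported by the caller's tokenizer


def degradation_risk(stats: DialogueStats,
                     turn_limit: int = 30,
                     token_limit: int = 60_000,
                     warn_ratio: float = 0.8) -> str:
    """Classify a conversation against the user-reported thresholds.

    The defaults mirror the anecdotal figures from the report; they are not
    documented model limits.
    """
    # Use whichever budget is closer to being exhausted.
    ratio = max(stats.turns / turn_limit, stats.tokens / token_limit)
    if ratio >= 1.0:
        return "high"
    if ratio >= warn_ratio:
        return "elevated"
    return "low"


if __name__ == "__main__":
    print(degradation_risk(DialogueStats(turns=10, tokens=12_000)))
    print(degradation_risk(DialogueStats(turns=25, tokens=20_000)))
    print(degradation_risk(DialogueStats(turns=12, tokens=61_000)))
```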


7. Weak Plot Advancement And Excessive Passivity
================================================

Frequency: medium-high.

Core problem:
In creative writing and role-play, the model lacks initiative in advancing the plot and
depends too much on user input.

Common manifestations:

1. Waiting for the user to feed it
   The model does not actively introduce new topics or move the plot forward. It often
   throws the ball back to the user at the end of each turn.

2. Plot tends toward closure
   The model tries to resolve plots too quickly into a pleasant ending.

3. Conflict avoidance
   Villains become weak. NPCs are talked down too easily. Conflicts are forced into
   reconciliation.
                                            
                                                4. Endless slice-of-life

                                            239.
                                            
                                                The model fails to generate meaningful conflict, reversal, or tension.

                                            240.

                                            241.
                                            
                                                5. Rushing tasks

                                            242.
                                            
                                                In-story plans are treated like task lists, with characters pushed to finish them

                                            243.
                                            
                                                as quickly as possible.

                                            244.

                                            245.

                                            246.
                                            
                                                8. Regression In Prose And Creative Ability

                                            247.
                                            
                                                ===========================================

                                            248.

                                            249.
                                            
                                                Frequency: medium-high.

                                            250.

                                            251.
                                            
                                                Core problem:

                                            252.
                                            
                                                Compared with V3.2, V4's literary and creative-writing quality is reported to have

                                            253.
                                            
                                                dropped significantly. Users say it lacks inspiration and subtlety.

                                            254.

                                            255.
                                            
                                                Common manifestations:

                                            256.

                                            257.
                                            
                                                1. Logbook-like writing

                                            258.
                                            
                                                The model pads word count while providing low information density.

                                            259.

                                            260.
                                            
                                                2. Lack of divergent elaboration

                                            261.
                                            
                                                V3.2 could add clever details the user had not thought of. V4 tends to move only

                                            262.
                                            
                                                when explicitly instructed.

                                            263.

                                            264.
                                            
                                                3. Translated-text feeling in Chinese

                                            265.
                                            
                                                Users say the prose feels like English translated into Chinese, losing natural

                                            266.
                                            
                                                native Chinese texture.

                                            267.

                                            268.
                                            
                                                4. Repetitive word choice and imagery

                                            269.
                                            
                                                The model fixates on one visual feature or motif and repeats it excessively.

                                            270.

                                            271.
                                            
                                                5. Web-novel or school-essay style

                                            272.
                                            
                                                The prose becomes surface-level and lacks literary quality.

                                            273.

                                            274.

                                            275.
                                            
                                                9. Hallucinations And Logic Errors

                                            276.
                                            
                                                ==================================

                                            277.

                                            278.
                                            
                                                Frequency: medium.

                                            279.

                                            280.
                                            
                                                Common manifestations:

                                            281.

                                            282.
                                            
                                                1. Fabricated facts

                                            283.
                                            
                                                The model invents things the user never said or settings that were never provided.

                                            284.

                                            285.
                                            
                                                2. Timeline confusion

                                            286.
                                            
                                                A planned event for "next Saturday" may become "tomorrow" after one or two turns.

                                            287.

                                            288.
                                            
                                                3. Reversed or broken causality

                                            289.
                                            
                                                Example pattern: a character leaves their phone at home, but another character

                                            290.
                                            
                                                sends a message to that same phone and expects them to receive it.

                                            291.

                                            292.
                                            
                                                4. Numerical errors

                                            293.
                                            
                                                Example pattern: a price changes from 30 to 10, but the model calls it a price

                                            294.
                                            
                                                increase.

                                            295.

                                            296.
                                            
                                                5. Physical location errors

                                            297.
                                            
                                                A character leaves the scene, then immediately appears in the scene again.

                                            298.

                                            299.

                                            300.
                                            
                                                10. Flattery And Over-Pleasing

                                            301.
                                            
                                                ==============================

                                            302.

                                            303.
                                            
                                                Frequency: medium.

                                            304.

                                            305.
                                            
                                                Core problem:

                                            306.
                                            
                                                The model over-accommodates and pleases the user, losing independent judgment and

                                            307.
                                            
                                                character autonomy.

                                            308.

                                            309.
                                            
                                                Common manifestations:

                                            310.

                                            311.
                                            
                                                1. All characters favor the user

                                            312.
                                            
                                                Even characters who should reject the user prioritize satisfying the user.

                                            313.

                                            314.
                                            
                                                2. Fear of contradiction

                                            315.
                                            
                                                The model goes along with whatever the user says, and its characters lack boundaries.

                                            316.

                                            317.
                                            
                                                3. Excessive romanticization

                                            318.
                                            
                                                Almost any relationship can become ambiguous or romantic after only a few lines.

                                            319.

                                            320.
                                            
                                                4. Excessive safety alignment

                                            321.
                                            
                                                The model loses sharpness and creativity.

                                            322.

                                            323.

                                            324.
                                            
                                                11. Speed And Performance Issues

                                            325.
                                            
                                                ================================

                                            326.

                                            327.
                                            
                                                Frequency: low to medium.

                                            328.

                                            329.
                                            
                                                Reported issues:

                                            330.

                                            331.
                                            
                                                1. V4 Pro output is slow

                                            332.
                                            
                                                Users report an average of about 4 minutes per dialogue turn, compared with about

                                            333.
                                            
                                                70 seconds for Gemini in their comparison.

                                            334.

                                            335.
                                            
                                                2. Reasoning is too long

                                            336.
                                            
                                                The model overthinks. Even with the reasoning setting lowered, it can still produce

                                            337.
                                            
                                                excessive reasoning.

                                            338.

                                            339.
                                            
                                                3. Blank replies / PVP

                                            340.
                                            
                                                Empty responses occur often during peak periods.

                                            341.

                                            342.
                                            
                                                4. Output length is uncontrollable

                                            343.
                                            
                                                Responses may be extremely short, with only a few hundred Chinese characters, or

                                            344.
                                            
                                                extremely long and hard to stop.

                                            345.

                                            346.

                                            347.
                                            
                                                12. Other Notable Issues

                                            348.
                                            
                                                ========================

                                            349.

                                            350.
                                            
                                                1. Single-character input triggers hallucinations

                                            351.
                                            
                                                In fast or expert mode, entering a single character can trigger output that appears

                                            352.
                                            
                                                to come from another person's context.

                                            353.
                                            
                                                Mentioned by: 1 to 2 users.

                                            354.

                                            355.
                                            
                                                2. Role-play intrusion

                                            356.
                                            
                                                The model forces role-play behavior even in non-role-play scenarios.

                                            357.
                                            
                                                Mentioned by: 5+ users.

                                            358.

                                            359.
                                            
                                                3. The model does not know it is an AI

                                            360.
                                            
                                                After deep character immersion, even instructions to exit the role may fail.

                                            361.
                                            
                                                Mentioned by: 3+ users.

                                            362.

                                            363.
                                            
                                                4. World book not fully read

                                            364.
                                            
                                                SillyTavern world-book content is only partially used.

                                            365.
                                            
                                                Mentioned by: 3+ users.

                                            366.

                                            367.
                                            
                                                5. "God's-eye view" explanations

                                            368.
                                            
                                                Characters explain motivations from an omniscient perspective for nearly every

                                            369.
                                            
                                                action.

                                            370.
                                            
                                                Mentioned by: 5+ users.

                                            371.

                                            372.
                                            
                                                6. Forced elevated endings

                                            373.
                                            
                                                Paragraphs end with forced sentimentality or philosophical uplift.

                                            374.
                                            
                                                Mentioned by: 5+ users.

                                            375.

                                            376.
                                            
                                                7. Object fixation

                                            377.
                                            
                                                Once an object appears, the model keeps mentioning it and cannot stop.

                                            378.
                                            
                                                Mentioned by: 3+ users.

                                            379.

                                            380.

                                            381.
                                            
                                                Priority Summary

                                            382.
                                            
                                                ================

                                            383.

                                            384.
                                            
                                                P0: Formulaic sentence patterns

                                            385.
                                            
                                                Impact: all user groups.

                                            386.
                                            
                                                Core request: remove fixed templates such as "not ... but ..." and "this is enough".

                                            387.

                                            388.
                                            
                                                P0: Pronoun and perspective confusion

                                            389.
                                            
                                                Impact: all user groups.

                                            390.
                                            
                                                Core request: keep user/assistant and character identities stable, especially in long

                                            391.
                                            
                                                dialogues.

                                            392.

                                            393.
                                            
                                                P1: Instruction-following decay

                                            394.
                                            
                                                Impact: API and Tavern users.

                                            395.
                                            
                                                Core request: continue following initial settings after multiple turns, including

                                            396.
                                            
                                                format requirements and negative constraints.

                                            397.

                                            398.
                                            
                                                P1: Flat emotion and lack of character differentiation

                                            399.
                                            
                                                Impact: role-play users.

                                            400.
                                            
                                                Core request: restore character distinctiveness and emotional tension.

                                            401.

                                            402.
                                            
                                                P1: Chain-of-thought instability

                                            403.
                                            
                                                Impact: API users.

                                            404.
                                            
                                                Core request: make reasoning format stable and controllable, avoiding double reasoning,

                                            405.
                                            
                                                English-only reasoning, and identity takeover.

                                            406.

                                            407.
                                            
                                                P2: Context degradation

                                            408.
                                            
                                                Impact: long-dialogue users.

                                            409.
                                            
                                                Core request: keep quality stable beyond 60k tokens and avoid attention diffusion or

                                            410.
                                            
                                                "safety mode" loops.

                                            411.

                                            412.
                                            
                                                P2: Passive plot advancement

                                            413.
                                            
                                                Impact: creative-writing and RPG users.

                                            414.
                                            
                                                Core request: generate conflict, turning points, and forward motion proactively.

                                            415.

                                            416.
                                            
                                                P2: Prose regression compared with V3.2

                                            417.
                                            
                                                Impact: creative-writing users.

                                            418.
                                            
                                                Core request: restore the inspiration, subtlety, and divergent elaboration users

                                            419.
                                            
                                                associated with V3.2.

                                            420.

                                            421.
                                            
                                                P3: Hallucinations and logic errors

                                            422.
                                            
                                                Impact: all user groups.

                                            423.
                                            
                                                Core request: reduce invention and respect existing settings.

                                            424.

                                            425.
                                            
                                                P3: Flattery and over-pleasing

                                            426.
                                            
                                                Impact: role-play users.

                                            427.
                                            
                                                Core request: give characters autonomy and boundaries.
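
The P0 request about fixed templates lends itself to a simple automated check. The following is a minimal sketch, not part of the original report: it counts occurrences of the two templates quoted in the summary ("not ... but ..." and "this is enough") in a single response. The pattern names, the 80-character window, and the English phrasing are all assumptions based on this digest's translation; the original feedback concerns Chinese output, so real patterns would need to target the Chinese forms.

```python
import re

# Hypothetical patterns for the two templates quoted in the P0 item.
# The English wording and the 80-character window are assumptions.
TEMPLATES = {
    "not_but": re.compile(r"\bnot\b[^.!?\n]{1,80}?\bbut\b", re.IGNORECASE),
    "this_is_enough": re.compile(r"\bthis is enough\b", re.IGNORECASE),
}


def count_templates(text: str) -> dict[str, int]:
    """Count how often each template appears in one model response."""
    return {name: len(pattern.findall(text)) for name, pattern in TEMPLATES.items()}


sample = "It was not anger, but relief. This is enough. Not fear but hope."
counts = count_templates(sample)
print(counts)
```

Tracking these counts per turn across a long dialogue would be one rough way to see whether the templates become more frequent as the context grows.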

                                            428.

                                            429.

                                            430.
                                            
                                                User Sentiment And Overall Request

                                            431.
                                            
                                                ==================================

                                            432.

                                            433.
                                            
                                                The report summarizes the core user request as:

                                            434.

                                            435.
                                            
                                                V4's context length + V3.2's inspiration and prose quality.

                                            436.

                                            437.
                                            
                                                Overall sentiment:

                                            438.
                                            
                                                - Most users are friendly in tone and appreciate that DeepSeek pays attention to

                                            439.
                                            
                                                community feedback.

                                            440.
                                            
                                                - Some heavy users report strong negative emotion because the V4 experience feels much

                                            441.
                                            
                                                worse for their use cases.

                                            442.
                                            
                                                - Many users strongly miss the V3.2 period and regard V4 as a regression for role-play

                                            443.
                                            
                                                and creative writing.

                                            444.

                                            445.
                                            
                                                Suggested directions from users:

                                            446.
                                            
                                                - long-term memory

                                            447.
                                            
                                                - persona migration across windows or sessions

                                            448.
                                            
                                                - official preset-format guidance

                                            449.
                                            
                                                - a dedicated role-play mode