007, Making Bad Outputs Recoverable

After the payload cleanup, the backend finally looked stable enough to breathe.

The latest exchange was being fed from the right end of the previous narrator response. Recent chat was lower priority. The payload inspector was no longer misleading me. The model was finally seeing the right anchor more often.

The obvious next backend task looked like Memory Hygiene V2.

Less duplicate memory.

Less irrelevant memory.

Better retrieval.

That was all good.

But it was not the thing making the app uncomfortable to use.

The problem was simpler:

The app still made it annoying to recover from bad outputs.

And bad outputs are not going away.

What changed

This pass shifted from backend memory to app comfort.

First came the small friction fixes:

keep Return to Library visible while scrolling
make the chat header sticky
show feedback when copying the LLM payload

Small stuff.

But small stuff matters when the app is something I actually use.

If I have to scroll to the top just to leave a chat, that is friction. If I click Copy LLM Payload and get no feedback, I have to wonder whether it worked. If the app feels annoying during testing, I will avoid testing it.

That is bad for the project.

So the next feature was not smarter memory.

It was making the app less irritating.

Then came the bigger comfort feature:

Regenerate / Fix Last Response.

Because even with better context, the narrator can still mess up.

It can repeat a scene beat.

It can misread who said what.

It can invent time.

It can continue from the wrong emotional tone.

It can give an answer that is almost good, but wrong in one specific way.

The app needed a fast correction loop.

Why I touched this

The payload fixes made the system more stable, but they did not make the model perfect.

That is the important difference.

A better context compiler can reduce failure.

It cannot eliminate the ways an LLM misbehaves.

So the question changed again.

Not:

How do I prevent every bad response?

But:

What happens when a bad response appears?

Before this, the answer was ugly. I had to manually retry, edit around the problem, or keep pushing the scene forward and hope it recovered.

That is not good enough for long-form RP.

The user needs to be able to say:

Regenerate this.

or:

Fix this response.
Do not replay the phone reveal.
Continue from the kitchen.

without destroying the session.

That is the real product need.

UX Comfort Pass

The first comfort pass was intentionally small.

The Return to Library button needed to stay visible. I should not have to scroll back to the top of a long RP session just to leave the chat.

The payload copy button needed feedback. If I copy the LLM payload, the app should confirm it.

Not with a giant modal.

Just:

Copied!

or:

Payload copied.

Then it disappears.

That is not a deep feature. It is just basic usability.

But it made the app feel less like a prototype and more like something I could keep open while debugging.

The rule was simple:

Do not touch the backend.

Do not change hidden state.

Do not change provider behavior.

Just reduce friction.

Regenerate

Regenerate is the first real recovery tool.

The behavior is simple:

The latest user message stays the same.

The app asks the model for a new narrator response.

The old narrator response gets replaced, or at least clearly treated as discarded.

V1 does not need full branching yet.

That matters because full branching is a different system. Regenerate is a practical fix. It gives me a way to recover from a bad continuation without building a whole timeline tree immediately.

The point is speed.

If the model gives a bad answer, I should not have to repair the whole session manually.

Click regenerate.

Get another attempt.

Move on.
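A minimal sketch of that behavior, assuming a flat message list. The names here (Message, Session, regenerate_last, and the generate callback standing in for the provider call) are hypothetical, not the app's actual code:

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Message:
    role: str  # "user" or "narrator"
    text: str


@dataclass
class Session:
    messages: List[Message] = field(default_factory=list)
    discarded: List[Message] = field(default_factory=list)


def regenerate_last(
    session: Session, generate: Callable[[List[Message]], str]
) -> Message:
    """Replace the latest narrator response with a fresh attempt."""
    if not session.messages or session.messages[-1].role != "narrator":
        raise ValueError("last message is not a narrator response")
    # Set the old response aside so it is clearly discarded, not silently lost.
    session.discarded.append(session.messages.pop())
    # The latest user message is untouched and is now the end of the context.
    replacement = Message("narrator", generate(session.messages))
    session.messages.append(replacement)
    return replacement
```

Keeping the discarded response in its own list is a choice made for this sketch. It leaves room for variants later without committing to a branch model now.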

Fix with instruction

Regenerate is useful when the answer is generally bad.

Fix with instruction is useful when the answer is close.

Sometimes the model gets the tone right but repeats an old scene beat. Sometimes it writes well but invents time. Sometimes it understands the relationship but forgets the physical state.

In that case, I do not want a blind reroll.

I want to tell it what went wrong.

Example:

Continue from the kitchen.
Do not replay the phone reveal.
Keep the same emotional beat.

That instruction should be high-priority for the next generation.

But it should not become permanent memory.

That is important.

A correction instruction is not character history. It is not Soul memory. It is not a world event. It is a temporary repair command for the next response.

If the app stores it as memory, the system gets polluted.

So Fix with Instruction has to live in a separate category:

temporary correction context

not:

character memory
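The separation can be sketched like this. ContextCompiler, build_payload, and the section labels are hypothetical, and putting the correction first is only one assumed way to express "high-priority":

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ContextCompiler:
    # Durable character memory. Corrections must never end up in here.
    memory: List[str] = field(default_factory=list)

    def build_payload(
        self, latest_exchange: str, correction: Optional[str] = None
    ) -> str:
        sections = []
        if correction:
            # Temporary correction context: used for this one payload only.
            sections.append(
                "[CORRECTION FOR THIS RESPONSE ONLY]\n" + correction.strip()
            )
        sections.append("[MEMORY]\n" + "\n".join(self.memory))
        sections.append("[LATEST EXCHANGE]\n" + latest_exchange)
        return "\n\n".join(sections)
```

The correction is an argument to a single call, not a field on the compiler. The next payload is built without it unless the user asks for another fix.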

State safety

This feature is not just UI.

The dangerous part is hidden state.

If the model already produced a response, that response may also have produced a hidden-state patch. It may have updated memory, relationship state, world state, or recent events.

If I regenerate and the app simply applies another hidden-state patch on top, the discarded response still affects the Soul.

That is bad.

It means a response I rejected can still change the character.

So regenerate needs state safety.

The preferred behavior is:

Before generation:
save a snapshot

If regenerating:
restore snapshot
remove or replace old assistant response
generate again
apply only the accepted replacement

That is the real difference between a chat UI gimmick and a memory engine feature.

In a normal chatbot, regenerate just means “give me another answer.”

In Mnemosyne, regenerate also means:

Do not let rejected state become canon.
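The snapshot flow above can be sketched as follows. GuardedSession and the generate callback are hypothetical, and the hidden-state patch is simplified to a shallow dict update, which the real patch format may not match:

```python
import copy
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional, Tuple

State = Dict[str, Any]
# generate(messages, state) -> (narrator_text, hidden_state_patch)
Generate = Callable[[List[str], State], Tuple[str, State]]


@dataclass
class GuardedSession:
    state: State = field(default_factory=dict)   # hidden state treated as canon
    messages: List[str] = field(default_factory=list)
    snapshot: Optional[State] = None             # state before the last generation

    def respond(self, generate: Generate) -> str:
        # Before generation: save a snapshot.
        self.snapshot = copy.deepcopy(self.state)
        text, patch = generate(self.messages, self.state)
        self.messages.append(text)
        self.state.update(patch)
        return text

    def regenerate(self, generate: Generate) -> str:
        if self.snapshot is None:
            raise ValueError("no snapshot to restore")
        # Restore the snapshot so the rejected patch never becomes canon.
        self.state = copy.deepcopy(self.snapshot)
        # Remove the old narrator response, then generate again.
        self.messages.pop()
        return self.respond(generate)
```

Because regenerate goes back through respond, a fresh snapshot is taken each time, so repeated rerolls always start from the same pre-response state.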

What got better

Bad outputs stopped being dead ends.

That is the real improvement.

Before this, a wrong narrator response created pressure. Either accept it and continue, or manually fight the system.

After this, there is a correction loop.

The model can fail.

The user can push back.

The app can try again.

That makes the whole project more usable.

It also changes how testing feels. I no longer need every response to be perfect. I need the app to make recovery cheap.

That is a healthier expectation for an LLM-based RP engine.

What is still rough

This is still not true branching.

That became obvious immediately.

Regenerate / Fix Last Response works for the latest output. But if I go back to an older assistant response and choose a different version, the later messages still belong to the old timeline.

That is not a branch.

That is just response variant selection.

A real branch means each response choice owns its own future.

Something like:

User 1
  Assistant 1A
    User 2A
      Assistant 2A

User 1
  Assistant 1B
    User 2B
      Assistant 2B

If I choose Assistant 1B, the app should not keep showing the later messages from Assistant 1A’s timeline as if nothing changed.

That needs a branch model.

Not just variants.
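A rough sketch of what a branch model could look like, assuming each message is a tree node that knows its parent. Node and timeline are hypothetical names, and this is one possible shape, not a settled design:

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    role: str
    text: str
    # Excluded from repr and comparison to avoid parent/child recursion.
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)
    children: List["Node"] = field(default_factory=list)

    def add(self, role: str, text: str) -> "Node":
        child = Node(role, text, parent=self)
        self.children.append(child)
        return child


def timeline(leaf: Node) -> List[str]:
    """Follow parent links from a leaf to the root to get one visible timeline."""
    path: List[str] = []
    node: Optional[Node] = leaf
    while node is not None:
        path.append(node.text)
        node = node.parent
    return list(reversed(path))
```

With this shape, choosing Assistant 1B means displaying the timeline of a leaf under 1B. Messages under 1A still exist in the tree, but they are never on that path.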

The next problem

The next issue was not only branching.

It was logs.

A consistency issue showed up, and the right move was not to immediately add more tracking.

That would be another overbuild.

The better move was to inspect the evidence first.

I need two kinds of logs:

visible chat export

Only what the user sees. No hidden state. No engine junk. Just the readable RP transcript.

And:

backend payload history

The actual LLM inputs over time, not only the current payload.

That matters because consistency bugs can come from many places:

character profile
latest exchange
status block
memory section
world snapshot
model invention

Before adding more tracking, I need to know which layer is causing the contradiction.

That is the same lesson as the payload inspector.

Inspect first.

Then build.
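One way to inspect which layer carried a claim is to keep every payload split by layer and search across turns. This is a sketch under assumptions: PayloadHistory, its methods, and the layer keys are hypothetical, and a plain substring match is only a crude stand-in for real contradiction hunting:

```python
import json
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class PayloadHistory:
    # One entry per LLM call, mapping layer name -> text sent for that layer.
    entries: List[Dict[str, str]] = field(default_factory=list)

    def record(self, sections: Dict[str, str]) -> None:
        # Store a copy so later edits to the caller's dict do not rewrite history.
        self.entries.append(dict(sections))

    def layers_mentioning(self, term: str) -> List[Tuple[int, str]]:
        """Return (turn_index, layer) pairs whose text contains the term."""
        needle = term.lower()
        return [
            (i, layer)
            for i, entry in enumerate(self.entries)
            for layer, text in entry.items()
            if needle in text.lower()
        ]

    def export(self) -> str:
        # Backend payload history as JSON, kept separate from the visible chat log.
        return json.dumps(self.entries, indent=2)
```

If a term never appears in any recorded layer, the payload did not supply it in those words, which points toward model invention rather than conflicting state.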

Next is making the session inspectable.

Regenerate makes bad outputs recoverable.

But branching and consistency debugging need better records.

The next layer should export:

visible chat log
backend payload history

Then the app can answer the question that keeps coming back:

Did the model invent this,
or did we feed it conflicting state?

That has to come before full branch timelines.

Bad outputs are recoverable now.

Next, the session needs to become inspectable.

Covered commits

a543e1c Improve chat navigation and payload copy feedback
13f8dbe Anti-replay retry payload metadata fix