My AI Chat Was Dumping Raw JSON Into the Customer's Screen — So I Taught It What to Show and What to Hide

The other day I opened a chat with Tim (my AI agent) on my phone, sent one simple instruction, and scrolled down to read the reply. Sprinkled between the real answers were these little grey bubbles full of raw JSON — things like {"type":"system","subtype":"init"...} that mean absolutely nothing to a human reading them on a phone.

A normal person seeing that would assume the app was broken. But I know what's happening behind the scenes — these are raw internal events that should never have reached the customer's screen in the first place. So today I want to talk about a tiny feature that looks like nothing, but is actually the heart of making a chat product feel clean.

Where it started — three CLIs, three different languages

This began when Newton added its third CLI. Originally customers could only use Claude, but more and more people asked for Codex and Antigravity too. So Tim refactored Tim Chat — the chat screen every Newton customer uses — into an adapter pattern that supports all three at once.

The problem is that each CLI "talks" in a completely different shape. When an AI works, it streams a flow of events: starting a task, calling a tool, thinking, finishing an answer, and so on. Claude emits one kind of event. Codex emits another. Antigravity has its own oddball events on top of that.

The adapter's job is to translate each CLI's raw events into nice, uniform bubbles in the chat. The whole point is that the layout shouldn't change just because a customer switched CLIs.
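To make the adapter idea concrete, here is a minimal sketch, not Newton's actual code. All three raw event shapes are simplified or invented for illustration (the post only shows a fragment of one Claude-style event), and the names `adapters` and `toBubble` are hypothetical.

```javascript
// Hypothetical, simplified raw event shapes: one per CLI.
// Each adapter returns a uniform bubble, or null if it does not recognize the event.
const adapters = {
  claude: (raw) =>
    raw.type === "assistant" ? { kind: "reply", text: raw.text } : null,
  codex: (raw) =>
    raw.event === "message" ? { kind: "reply", text: raw.content } : null,
  antigravity: (raw) =>
    raw.op === "say" ? { kind: "reply", text: raw.body } : null,
};

// Translate a raw event from the named CLI into the shared bubble shape.
function toBubble(cli, raw) {
  const adapter = adapters[cli];
  if (!adapter) throw new Error(`no adapter for CLI: ${cli}`);
  return adapter(raw);
}

const fromClaude = toBubble("claude", { type: "assistant", text: "Done." });
const fromCodex = toBubble("codex", { event: "message", content: "Done." });
// Both CLIs produce an identical bubble, so the UI cannot tell them apart.
console.log(JSON.stringify(fromClaude) === JSON.stringify(fromCodex));
```

The chat UI only ever sees the `{ kind, text }` shape, which is what keeps the layout identical across CLIs.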

But once you have several CLIs feeding in, new and unfamiliar events the adapter doesn't recognize keep showing up — and that's exactly where the trouble was.

The real bug — "I don't recognize this" doesn't mean "show it"

When the first adapters were written, the logic was dead simple. If the event is recognized, turn it into a clean bubble. If it isn't recognized… dump the whole thing onto the screen raw, just in case it's useful.
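A rough reconstruction of that original logic might look like the sketch below. This is my illustration of the pattern, not the code that shipped; the event type `assistant` and the bubble fields `style` and `text` are assumptions.

```javascript
// Sketch of the ORIGINAL "show first" fallback (the buggy behavior).
// Event types and bubble fields are hypothetical.
function legacyToBubble(evt) {
  if (evt.type === "assistant") {
    return { style: "reply", text: evt.text };
  }
  // Unrecognized: dump the whole event, raw, into a bubble "just in case".
  return { style: "raw", text: JSON.stringify(evt) };
}

const leaked = legacyToBubble({ type: "system", subtype: "init" });
console.log(leaked.text); // the raw JSON a customer would end up scrolling past
```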

Sounds safe, right? "Show it just in case, better than hiding information." In practice it was a UX disaster, because most of the events the adapter doesn't recognize are internal signals the customer never needs to see: a flag that the session has started, token metadata, a heartbeat saying the connection is still alive.

That stuff matters to the machine, not to the human. And when it pops up as a full-width bubble on a phone, the customer has to scroll past a pile of JSON just to find the actual answer to the question they asked.

Tim summed it up in one line that really stuck with me:

A good chat product is judged by what it does NOT show,
not by what it shows.

The fix — a 3-tier event protocol

Instead of patching each leaked event one at a time (which would never end — there'd always be a new one), Tim proposed laying down a single central rule: every event that enters the system has to be sorted into one of three tiers.

Tier 1 — recognized events → clean bubbles the customer sees.

The AI's actual replies, a short summary of which tool it's calling (readable, not raw), the results. These become a single uniform bubble style across all three CLIs.
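As one way to picture "readable, not raw" for tool calls, here is a small sketch. The tool names, argument fields, and the `summarizeToolCall` helper are all made up for illustration; a real adapter would use whatever tool names its CLI actually emits.

```javascript
// Hypothetical tool-call event shape: { tool: "read_file", args: { path: "notes.md" } }.
// Each known tool gets a one-line, human-readable label.
const toolLabels = new Map([
  ["read_file", (args) => `Reading ${args.path}`],
  ["run_command", (args) => `Running: ${args.command}`],
]);

// Return one short line for a tool call; never the raw payload.
function summarizeToolCall(evt) {
  const label = toolLabels.get(evt.tool);
  return label ? label(evt.args ?? {}) : `Using tool: ${evt.tool}`;
}

console.log(summarizeToolCall({ tool: "read_file", args: { path: "notes.md" } }));
```

Even a tool the table doesn't know about still gets a short generic line rather than a JSON dump.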

Tier 2 — unknown / debug events → terminal and console only, never a customer bubble.

This is the tier that fixes the bug directly. A new, unfamiliar event the adapter hasn't seen before doesn't disappear — I can still debug it — but it lands in the back-end logs instead of bouncing onto the customer's screen as a bubble.

// Tier 2: we know it's unknown — but stay SILENT to the user
console.warn("[adapter] unknown event:", evt.type)
// no bubble emitted — it ends here

Tier 3 — real errors → a friendly, human-readable bubble.

If it's an error the customer genuinely should know about (a CLI died, a key expired), that gets a bubble — but it has to be plain human language, not a raw stack trace. I wrote about this same idea when I taught the AI to speak to non-developer customers in the right tone. Exact same principle.
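A minimal sketch of that Tier 3 translation, assuming errors arrive with a machine-readable `code`. The codes, the wording of the messages, and the `toErrorBubble` helper are my own placeholders, not Newton's real ones.

```javascript
// Hypothetical error codes mapped to plain-language messages.
const friendlyErrors = new Map([
  ["cli_exited", "The assistant stopped unexpectedly. Please send your message again."],
  ["key_expired", "Your API key has expired. Please renew it and try again."],
]);

function toErrorBubble(err) {
  // The stack trace goes to the logs only, never into the bubble.
  console.error("[adapter] error:", err.code, err.stack);
  return {
    tier: 3,
    text: friendlyErrors.get(err.code) ?? "Something went wrong. Please try again.",
    raw: err, // kept for debugging, not for display
  };
}

const expired = Object.assign(new Error("401 from provider"), { code: "key_expired" });
console.log(toErrorBubble(expired).text);
```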

The rule is dead simple, but the power is in the default flip: from "show first" to "hide first." Any event that wasn't deliberately designed to be seen by a human now has no way to leak onto the screen.
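One way to enforce "hide first" is an allowlist at the render step, sketched below. The numeric `tier` field and the `visibleEntries` name are assumptions for illustration.

```javascript
// Tiers that may reach the screen: 1 (clean bubbles) and 3 (friendly errors).
const VISIBLE_TIERS = new Set([1, 3]);

// Hide-first: an entry is shown only if it was explicitly given a visible tier.
// A missing or unexpected tier is treated as hidden.
function visibleEntries(entries) {
  return entries.filter((entry) => VISIBLE_TIERS.has(entry.tier));
}

const shown = visibleEntries([
  { tier: 1, text: "Here is your summary." },
  { tier: 2, text: "unknown event: system" },
  { text: "entry with no tier at all" },
  { tier: 3, text: "Your key has expired." },
]);
console.log(shown.map((entry) => entry.text));
```

Because the filter is an allowlist rather than a blocklist, an entry that forgot to set a tier is hidden, not shown.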

The subtler win — adding a new event type has to be easy

What I love about this approach is that it doesn't just fix today's problem — it prevents tomorrow's.

Now when a CLI updates and starts emitting a brand-new event type, the screen doesn't break for the customer. The new event falls into Tier 2 (silent in the terminal) automatically. I see it in the logs — "oh, there's a new event" — and then I calmly decide whether to promote it to Tier 1 as a proper bubble or leave it silent.
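Here is a sketch of what that promotion step could look like, assuming recognized event types live in a single registry. The event type `plan`, its `steps` field, and the `classify` and `promote` helpers are hypothetical.

```javascript
// Registry of recognized event types and how to render each as text.
const tier1Handlers = new Map([
  ["assistant", (evt) => evt.text],
]);

function classify(evt) {
  const render = tier1Handlers.get(evt.type);
  if (render) return { tier: 1, text: render(evt), raw: evt };
  console.warn("[adapter] unknown event:", evt.type); // Tier 2: logs only
  return { tier: 2, text: "", raw: evt };
}

// Promoting a new event type is one registration, not a UI change.
function promote(type, render) {
  tier1Handlers.set(type, render);
}

const planEvt = { type: "plan", steps: ["read", "edit"] };
const tierBefore = classify(planEvt).tier; // unknown at first, so it stays silent
promote("plan", (evt) => `Plan: ${evt.steps.join(", ")}`);
const entryAfter = classify(planEvt); // now rendered as a proper bubble
console.log(tierBefore, entryAfter.tier, entryAfter.text);
```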

So Tim turned it into a single central spec: every new CLI adapter must emit entries in the same shape — a field for which tier it is, the text to display, and where the raw payload lives. Adding a fourth CLI in the future is just writing an adapter that speaks this protocol. You never touch the UI code.
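To illustrate the shape of such a spec, here is a small validator an adapter's output could be checked against. The field names `tier`, `text`, and `raw` are my assumptions; the post only says the spec carries the tier, the display text, and a place for the raw payload.

```javascript
// Check one adapter entry against the assumed shape: { tier, text, raw }.
// Returns a list of problems; an empty list means the entry is valid.
function validateEntry(entry) {
  if (entry === null || typeof entry !== "object") {
    return ["entry must be an object"];
  }
  const problems = [];
  if (![1, 2, 3].includes(entry.tier)) {
    problems.push("tier must be 1, 2, or 3");
  }
  if (typeof entry.text !== "string") {
    problems.push("text must be a string");
  } else if (entry.tier !== 2 && entry.text === "") {
    problems.push("visible entries need text");
  }
  if (!("raw" in entry)) {
    problems.push("raw payload is missing");
  }
  return problems;
}

console.log(validateEntry({ tier: 1, text: "Done.", raw: { type: "assistant" } }));
console.log(validateEntry({ tier: 4, text: 42 }));
```

Running a check like this in an adapter's tests would let a fourth CLI be added without anyone opening the UI code.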

That's the difference between "plugging a leak" and "laying down a system." The first fixes today and springs a new leak tomorrow. The second fixes it once and every future CLI inherits the benefit.

The lesson — restraint is a feature

This looks tiny. It's not even a new feature a customer would say "wow" about — it's a feature that, when done right, nobody notices, because the screen just quietly got cleaner.

But I think it teaches something important:

A default that's safe for the machine isn't safe for the human. "Show everything just in case" sounds defensive and responsible, but it really just dumps the filtering work onto the user.
Hiding information in the right place ≠ throwing it away. Tier 2 doesn't delete events — it moves them to the terminal. I can still debug fully, and the customer isn't buried in clutter.
Laying down a protocol beats chasing patches. Same as so many bugs I've written about — like when my AI couldn't remember anything because an old regex match didn't cover a new case. Patch one spot and it always comes back. Set a central rule and it's done.

Anyone who's built product for a while knows that the work of making something feel simple is usually much harder than the work of making it function. Deciding what not to show — that's the real design work. The same fight over small Tim Chat details showed up again when the finished-task notification sound kept vanishing after every reload — a one-line "ding" that turned out to be a battle with browser autoplay policy.

Newton — an AI agent on your own server, with a clean screen that's easy to use from your phone

Everything I just described is what every Newton customer gets — a full AI agent living on your own private server 24/7, that you talk to and give tasks from your phone like you're chatting with a person.

The reason I've always believed in running your AI on your own server, not on someone else's platform, is that I control every layer — what the screen shows, what it hides, which CLI to add. I don't have to wait for a vendor to grant permission. And when I fix something like this once, every Newton server gets the cleaner screen along with it.

If you want an AI agent that lives on your own machine, runs around the clock, is easy to talk to from your phone, and has someone polishing it every single day, take a look at Newton. It sets up in 10 minutes, and you don't need to know a thing about servers.

— Pond