Understanding Narrative Compression: Day 1 — The problem
Why we're information-rich and understanding-poor — and why current AI makes it worse.
Here’s a comparison that should bother you more than it does: the average knowledge worker produces and consumes more information in a single week than a medieval scholar encountered in a lifetime. And yet. Ask that knowledge worker to tell you the three things that actually matter about the project they’ve been on for six months, and watch what happens.
Hesitation. Qualifications. A long exhale followed by “it’s complicated.”
It is not complicated. They just don’t have a format that lets them hold the situation as a whole.
The default mode of every information system we’ve built — human and digital — is accumulation. More data, more context, more memory, more tokens. The assumption underneath is always the same: if you gather enough information, understanding will emerge. Comprehension, on this view, is a phase transition that kicks in once the volume gets large enough.
It doesn’t. And we can be precise about why.
Let S be the total signal a system has accumulated — every message, every update, every document. Let U represent understanding: your ability to model the situation well enough to predict what happens next and act on it. The implicit assumption of every information system we build is:
U = f(S)

Understanding as a monotonically increasing function of signal volume. Get enough S, and U follows.
This is wrong. Understanding isn't a monotonic function of accumulation. Past a threshold, more signal decreases your ability to model the situation — because noise scales with volume while structure doesn't. What you actually get is something closer to:
U = f(structure(S)) − g(noise(S))

Understanding depends on the structure you can extract from the signal, penalized by a noise term that grows with volume. Dump more unstructured information into the pile, and the second term eats the first.
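Here’s a minimal sketch of that shape. Everything in it is an illustrative assumption (a logarithmic structure term, a linear noise term, made-up coefficients), but the qualitative behavior is the point: understanding rises with volume, peaks, and then declines as accumulation continues.

```python
import numpy as np

# Toy model of the claim above. The functional forms and coefficients
# are assumptions chosen purely for illustration: structure extracted
# from a pile of signal grows ~logarithmically with volume, while
# noise grows ~linearly.
volume = np.arange(1, 10_001)           # |S|: total accumulated signal
structure_term = 5.0 * np.log(volume)   # f(structure(S)): diminishing returns
noise_term = 0.004 * volume             # g(noise(S)): scales with volume
understanding = structure_term - noise_term

peak = volume[np.argmax(understanding)]
print(f"In this toy model, understanding peaks at |S| = {peak},")
print("then declines as more signal accumulates.")
```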
You’ve seen this. You’ve seen it in the Slack channel with 10,000 messages where nobody can actually tell you the project status — not because the information isn’t there, but because it’s all there, undifferentiated: every update, tangent, and emoji reaction is weighted the same. You’ve seen it in the AI assistant that remembers five hundred facts about how you work but can’t tell you the three that matter, because it has no basis for knowing what “matters” means.
You’ve seen it in the quarterly review deck with forty-seven slides that leaves the entire room less clear than it was when they sat down — because the deck is a chronicle of what happened, not a model of what it means.
And here’s where it gets worse. Current AI isn’t solving this problem. Current AI is the problem’s most sophisticated expression.
Every major platform is optimizing for the same thing: recall. Longer context windows. Better retrieval. More memory. The engineering bet is that if the system can remember enough, understanding follows. They’re betting on:
U = f(S)

— the exact assumption that’s already failing.
My research paper formalizes a distinction that makes this concrete. A chronicle is a temporal sequence of events without causal or intentional structure. Chat logs, version control histories, meeting transcripts, email threads — these are chronicles. They answer when. They cannot answer why, toward what end, or on the basis of what understanding.
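To make the distinction concrete, here’s a minimal sketch in code. The type names are mine, not the paper’s formalism: a chronicle is nothing but an ordered list of timestamped events, and everything that answers why or toward what end lives in fields a chronicle simply doesn’t have.

```python
from dataclasses import dataclass, field

@dataclass
class Event:
    """One entry in a chronicle: something happened at some time."""
    timestamp: str
    description: str

# A chronicle is nothing more than an ordered list of events.
# It can answer "when" by construction, and nothing else.
Chronicle = list[Event]

@dataclass
class SituationModel:
    """Illustrative only: the fields a chronicle lacks.
    The names are assumptions, not the paper's formalism."""
    goal: str                                                   # toward what end
    causes: dict[str, list[str]] = field(default_factory=dict)  # why: event -> its causes
    constraints: list[str] = field(default_factory=list)        # what can't happen next

# Nothing in a Chronicle value can populate a SituationModel;
# the causal and intentional structure has to be added, not retrieved.
```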
The gap between chronicle and understanding is the central bottleneck of knowledge work. And current AI systems — with their ever-expanding context windows and retrieval pipelines — are building faster, more comprehensive chronicles. They are engineering solutions to a storage problem when the actual bottleneck is representation.
The instinct, at this point, is to reach for better search. Better retrieval. Better summarization. But summarization is just a shorter accumulation. It reduces volume without changing format. A summary of a meeting is still a chronicle — a compressed list of what happened. It doesn’t transmit causal structure, temporal dynamics, or predictive constraints.
In information-theoretic terms, summarization reduces |S| but doesn’t change the ratio of structure to noise. You need a format that does something fundamentally different: one that maximizes the amount of situational understanding transmitted per unit of signal.
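In symbols, with U and S as defined above (the notation is mine; the series gives the formal definition tomorrow), the target is:

```latex
% Illustrative notation, not a formal definition: what a better
% format should maximize is understanding per unit of signal.
\max_{\text{format}} \; \frac{U(S)}{|S|}
```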
There’s a term for that ratio. It’s called inferential yield. And the format that maximizes it has been doing this work for as long as humans have been communicating about complex situations.
But that’s tomorrow.
Today’s essay is the problem. Tomorrow is the thesis — what inferential yield actually means, formally, and why one specific format maximizes it.
The problem is clear: we’re not short on information. We’re short on compression. Everything else in this series follows from that.
Day 2 drops tomorrow. Subscribe so you don’t miss it.
🗓️ Book a session at reddep.com
🧡

