For the love of God, learn to paragraph
LLMs' wrongs redressed
Do you know what makes great writing? Great paragraphs. Back in middle school you may have learned paragraphing as the practice of indenting to introduce a new idea. You may have learned that paragraphs are coherent units of discourse, part of the comprehension apparatus of a text. You may have been taught that when writing, you should craft your paragraphs to guide the reader to expect a new thought and to integrate the new or revised information being delivered. The indentation, or white space, is a signpost for your reader that says what came before was one thing; what comes next is a different thing; adjust your focus accordingly.
Readers appreciate guidance. People generally like to know what’s coming. Paragraphs are crucial when the content of a text is unfamiliar; readers need to rely on structural cues to organize what they read. Good paragraphs at the beginning of a text build trust, as the reader can relax and presume coherence. Practiced readers can perceive paragraph boundaries even without indentation.
Strong paragraphs are essential for making strong arguments, for lodging ideas in a reader’s memory. Paragraphs are bricks. Each one has to hold together internally and bear weight in relation to the ones around it. If you can’t remember something you just read, it was likely a motley pile. LLMs have difficulty generating coherent paragraphs for reasons I will explain below. LLMs will litter their prose with bold and italicized phrases, bullets, and one-sentence-per-line formatting to disguise weak paragraphs.
I think about paragraphs from the perspective of a writer who wants to serve her reader quality prose. An economist might characterize paragraphing as a coordination game: two agents with partially aligned interests need to converge on the same interpretation of a structure. Conventions like paragraph breaks, headings, bullets, and typographic cues function as a signaling channel. A writer has an intended discourse plan, involving units, how each relates to the next, where shifts occur, what should be foregrounded, and what should be treated as support. The reader must infer that plan from the surface cues under time and attention constraints. Both parties benefit when the reader can reconstruct the writer’s intended segmentation with low friction, because the writer’s communicative goal gets achieved and the reader expends less interpretive work.
In a coordination game there are multiple plausible equilibria, each genre- and platform-dependent. You read a legal document, a journal article, a school essay, a long twitter thread, or a Substack post very differently. Each has different expectations about unit size, topic management, transitions, and how much repetition is acceptable. Over time, local equilibria are established and small cues become efficient, as both sides know what they mean. When a cue is violated, say by a random indentation, readers may not know immediately why something feels off. Expectations about paragraphs are usually unconscious and become salient only when they are not met. Annoyance seems to come out of the blue.
So why are LLMs bad at paragraphing? First, there is no local equilibrium with LLM prose unless your prompt specifies one. I make the standard paragraph coordination game explicit to my pro models. I write in a relatively formal register, often making complex arguments, and I need strong paragraphs to guide my readers to a destination. But writers who don’t specify whom they are writing for, and don’t care what their readers expect, put themselves at the mercy of an LLM that has been reinforced to mimic surface-level coherence.
The second reason is that LLM training data is saturated with internet text where discourse units may have few rules. Many platforms reward cue density and rapid attention cycling. Writers adapted to web usability research showing how readers tend to scan rather than read continuously. Highlighting and layout guide a reader’s eyes toward what should be noticed and where to click. Internet prose often wants you to act, not remember. The internet is also filled with reports with executive summaries and numbered sections that look like structure but often don’t have a clear argument. LLMs learn from internet prose that strong paragraphing is rare.
Third, an LLM’s training signal strongly rewards locally plausible continuation and offers only weak guidance about where a paragraph should begin or end. LLMs are optimized to predict the next token from preceding tokens, so models can produce fluent sentences and familiar formatting without necessarily coordinating those cues with a larger discourse plan. When LLMs generate a paragraph break, they are responding to local patterns, to a sentence that feels like a conclusion, a shift in topic words, the statistical likelihood that a break appears after a certain number of sentences. They are not tracking the global structure of an argument. Research on coherence boosting underscores that models can underuse distant context during generation, which makes it harder to place breaks that track an argument’s tiers over many sentences. The result is paragraphing that often reads as plausible in the moment, while weakly anchored to the overall structure.
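The local-cue behavior described above can be illustrated with a toy sketch. This is entirely my own construction, not how any real model segments text: a break heuristic that looks only at adjacent sentences (lexical overlap and run length) and has no view of the argument as a whole.

```python
import re

def local_breaks(sentences, max_run=4):
    """Insert a paragraph break when adjacent sentences share no content
    words, or when a run of sentences simply gets long. Both cues are
    purely local: the function never looks past the previous sentence."""
    paragraphs, current = [], []
    prev_words = set()
    for s in sentences:
        # Crude "content words": alphabetic tokens longer than 3 characters.
        words = {w.lower() for w in re.findall(r"[a-zA-Z]+", s) if len(w) > 3}
        overlap = len(words & prev_words)
        if current and (overlap == 0 or len(current) >= max_run):
            paragraphs.append(current)
            current = []
        current.append(s)
        prev_words = words
    if current:
        paragraphs.append(current)
    return paragraphs

sents = [
    "Paragraphs are bricks in an argument.",
    "Each brick must bear weight for the bricks around it.",
    "My cat knocked a mug off the desk this morning.",
    "Weak paragraphs collapse under a careful reader.",
]
print([len(p) for p in local_breaks(sents)])  # → [2, 1, 1]
```

Note what goes wrong: the off-topic cat sentence and the final on-topic sentence trigger identical breaks, because the heuristic cannot distinguish a digression from a return to the main thread. That is the failure mode of break decisions made from local patterns alone.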
You may notice that LLMs often generate bulleted lists, like, let’s say, three types of poor paragraphing and the cognitive burden each one imposes on the reader:
• Under-segmentation: a long block of prose without enough indentations, where structure has to be inferred and readers must guess where units end and begin. This increases memory load and raises the probability that the reader groups propositions incorrectly.
• Mis-segmentation: indentation not aligned with meaning, where readers invest in a shift that the text does not deliver, then have to repair the model by rereading, reclassifying, or abandoning integration in favor of skimming.
• Over-signaling: excessive headings, bold, and micro-breaks where the cue system loses discriminative value and the reader no longer knows which signals are structural and which are decoration.
Bullet points are pebbles, not bricks. They sit atop each other without bearing any weight. It is simpler to say that under-segmentation results in confusion (as the reader is holding lots of prose in active memory); mis-segmentation results in betrayal (as the reader has to re-read to follow an argument); and over-signaling results in numbness (as the reader tries to figure out why all the bold words are jumping off the page).
LLMs “learned” during RLHF (reinforcement learning from human feedback) to produce typographical tricks like bullets to make text “readable,” likely because human evaluators working under time pressure registered formatting as evidence of coherence. Bold font and italics can make a text seem important and profound, as can the many aphorisms sprinkled throughout. But as I’ve explained in depth with regard to the “not X but Y” LLM tic, these metannoying constructions are a cognitive burden on serious readers who expect that a writer put thought and craft into guiding the reader along a path.
I agree with the theory that national reading scores have declined because of screens, if what is meant is the poorly paragraphed internet prose read on screens. People who have read more internet prose than long-form journalism or books on the printed page may no longer expect paragraphs to carry meaning at all. The more you read text where segmentation is designed for scanning, the harder it is to understand how paragraphs do real work. Young people who have mostly read on screens may not be primed to expect anything from a paragraph break; they do not gain important skills in building macro-structures and tracking a “main idea” across units. Learning to see paragraphs as functional units in a larger argument requires practice.
Writers also know it takes practice to craft good paragraphs. First drafts are rarely perfect. The point of revision is to fix things like weak paragraphs. An essay or story might be written in bits and pieces over time. Afterward you read it in its entirety and ask: does this break serve the argument? Should I combine these paragraphs? Split this big paragraph up? It turns out that good paragraphing is a global task best performed on already-produced material. Default LLM generation does not include an explicit revision phase. LLMs give a first-draft surface segmentation without any post-hoc alignment.
If you care about paragraphs, as I do, and carefully follow rules such as “do not use a pronoun in the first sentence of a paragraph whose referent is in the previous paragraph,” you see the extent to which LLM segmentation decisions can have little or no function.
One of the first things I do when opening an essay is look at the first line of each paragraph. If I see a recurring set of empty openers such as ‘This matters because,’ ‘Here’s the thing,’ ‘It’s worth noting that,’ ‘What’s interesting is,’ ‘The reality is,’ ‘But here’s the catch,’ ‘To be clear,’ ‘At its core,’ I stop reading. Each one of these false-transition openers tells you to pay attention without explaining how a new block of text connects to what you just read. LLM prose offers local emphasis markers to spike the reader’s attention without guidance as to how to think about the larger argument.
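The first-line check can even be mechanized. Here is a hedged sketch: the opener list comes from the examples above, but the function itself and its paragraph-splitting convention (blank-line-separated blocks) are my own assumptions.

```python
# "False-transition" openers, taken from the examples in the essay.
EMPTY_OPENERS = (
    "this matters because", "here's the thing", "it's worth noting that",
    "what's interesting is", "the reality is", "but here's the catch",
    "to be clear", "at its core",
)

def flag_empty_openers(text):
    """Return indices of paragraphs that open with a false transition.
    Paragraphs are assumed to be separated by blank lines."""
    flagged = []
    for i, para in enumerate(text.split("\n\n")):
        first_line = para.strip().lower()
        # str.startswith accepts a tuple of candidate prefixes.
        if first_line.startswith(EMPTY_OPENERS):
            flagged.append(i)
    return flagged

essay = (
    "Paragraphs carry arguments.\n\n"
    "Here's the thing: readers rely on structure.\n\n"
    "To be clear, cues must mean something."
)
print(flag_empty_openers(essay))  # → [1, 2]
```

A scan like this only flags the symptom, of course; the diagnosis still requires reading whether the opener conceals a missing connection to the previous paragraph.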
Fortunately, everyone benefits from strong paragraphs. If you are using an LLM to draft prose, understand that default generation will not result in strong paragraphs. Prompt your LLM to write a one-sentence-per-paragraph plan that states each paragraph’s role in your larger argument and to generate each paragraph against that plan. Then, revise, attending to paragraph breaks and topic sentences, asking whether each unit delivers on its promise. Use your trowel to place each brick carefully. Keep your whole plan in view and treat every paragraph break as a promise to the reader that, at length, you have their understanding in mind.


