What AI Polish Does to Hunter S. Thompson

June 15, 2026 · by Michael Morrison · 14 min read

AI models can write two kinds of prose with opposite needs — one where surface noise is texture, one where surface noise is failure. Knowing which is which is the difference between liberating a writer's voice and quietly butchering it.

I know, it’s probably blasphemy to even mention AI writing in the same sentence as Hunter S. Thompson, but let’s run with it. An AI humanizer applied to his actual writing would have butchered him. The gonzo is the signature. The mangled syntax, the staccato outbursts, the way a sentence pivots mid-thought into something disreputable — those are not flaws an editor needs to clean up. They are the prose. Polish those out and you have not produced Thompson. You have produced a competent stranger’s version of Thompson, which is to say: an unremarkable column. And that doesn’t begin to address the struggles AI would have with nakedly promoting inhuman amounts of illicit drug use.

I know all of this because I tried. Not on Thompson himself — he’s beyond the reach of any AI editor I’ve built. But one of the reporters in a spoof newspaper I’ve been working on is a Thompson-shaped voice, the kind that runs sentences off the road on purpose, and a perfectly reasonable editorial pass I’d built kept cleaning him up into someone else.

I didn’t expect to learn this the way I learned it. I ran into it building two AI-generated narrative products on opposite sides of the same craft question, and watching one of them get worse when I applied the editorial pattern that was making the other one better. The discovery wasn’t an idea I worked out in theory. It was something I noticed in production while a quality pipeline I’d been proud of was quietly degrading work I cared about.

The puzzle that came out of it has stayed with me. The same AI model — same weights, same provider, same architecture — can produce two kinds of prose with opposite needs. One kind wants surface noise as texture. The other kind wants surface noise as failure. Most people building with these tools have not yet noticed that the difference dictates almost everything else: where the human hand goes, where the model gets to be reckless, how the work gets reviewed, even whether the work should be pregenerated or generated live at the moment of use.

If you build with AI, you will eventually have to know this. If you don’t, you will produce a competent stranger’s version of the work you meant to make. And if your first reaction is “why lean on AI at all instead of writing the thing yourself?”, that’s fair, and I’ve made that case in “Unlocked, Not Cheated.” I won’t relitigate it here. I’m building this way; the question that interests me is what the work itself demands.

Two kinds of writing

The cleanest names I’ve found for the two modes are performance writing and literary writing.

Performance writing is comedy, sketch, satire, improv-shaped scripts, gonzo journalism. Its quality lives in the moment — the audience is leaning in, alert, surprised, laughing. The structural shape is regular (setup, escalation, punchline; or premise, riff, riff, riff) and a competent model gets the shape right reliably. What the model gets wrong at the surface, the format absorbs. A slightly malformed line in a comic bit reads as commitment to the absurdity, not as failure. Awkwardness is part of the texture, which makes the audience forgiving because they are entertained, and surprise is the entire reward they came for.

Literary writing is bedtime prose, lyric essay, the kind of fiction that wants to be read in a quiet room. Its quality lives in the register — the audience is leaning out, winding down, settling into the rhythm. Consistency is the ritual; surprise is the disruption. The structural shape is a deceleration curve, felt at the sentence level. What the model gets wrong at the surface, the format does not absorb. A slightly off word — too sharp, too modern, too clinical, too pleased with itself — ejects the reader from the state the prose was trying to create. There is no laugh track to ride through. The audience is unforgiving because they are vulnerable; they came for an effect that breaks on contact with a wrong note. You don’t spend ten minutes establishing a serious tone with a calm, cool character only to force them into a surprise pratfall. That’s jarring, unexpected in a bad way.

The same model can produce both performance and literary writing. But the operational regime that produces each well runs in opposite directions from the other, and applying the wrong one to either format is a quiet form of creative vandalism.

I want to explain why this is, because the reasons are more useful than the framework.

The editor stack that Thompson would’ve gunned down

Stalefish Labs is a small studio, for the most part just me. I’ve been building two narrative-AI products under one roof: Sojourn, a bedtime story app for couples to read aloud to each other; and The Wayward Herald, a spoof newspaper where historical figures inexplicably find themselves dealing with mundane and often absurd modernity, the stuff we all understand but struggle with — DMV trips, HOA meetings, the strange labyrinthine route through an IKEA showroom when you really just came for Swedish meatballs.

I built a shared editorial stack across both products. Sensors that flag prose problems, an editor that decides what to fix, a reviser that executes the fix. The idea was: same engineering investment, same quality bar, two front-end formats. Reasonable, and partly wrong.

The stack worked beautifully on Sojourn from the start. It caught the small things that break bedtime prose — anachronisms in a story set in no specific time, modern verbs slipping into a register that wanted older ones, deceleration curves the writer flattened out under pressure. It produced prose that read as if it had been gone over by a careful human editor, because that was the design intent. Bedtime calm, wistful and full of wonder, but no jump scares or banana peel slips.

On Wayward, I had to start suppressing the same stack.

The reason was funny in itself. Each Wayward reporter who interviews the user after witnessing mayhem has a specific voice — different cadences, different verbal tics, different broken syntax. A reporter modeled loosely on a gonzo journalist will run sentences into a ditch on purpose. A reporter modeled on a Victorian naturalist will pile up clauses past the point any modern editor would tolerate. A reporter modeled on a Beat-era voice will let a paragraph dissolve into a single half-coherent observation that is the entire joke. The editor stack — trained, like every editorial system, to catch awkward prose — kept fixing those textures. The output came back smoother, cleaner, more correct, and considerably less funny. The gonzo-shaped voice came out reading like a slightly tired New York Times writer who had been asked to gesture at gonzo. Not good.

[Image: A Victorian naturalist in tweed examining a Costco receipt with deep scholarly concern. Caption: A Wayward reporter, mid-disquisition on a Costco receipt.]

The first time I said it out loud — that an AI humanizer applied to the real Hunter S. Thompson would butcher him — was the moment the pattern clicked into a principle. Thankfully I wasn’t fully targeting the real HST or there would’ve been gunplay involved in this atrocity.

In comedy, surface noise is not noise. It is the medium. The deviations from clean prose are the signature. Polish them out and you have removed the writer.

In bedtime, surface noise is exactly noise. It is the thing the prose is fighting against. Polish is the medium. Leave the surface noisy and you have left the writer absent.

The same editorial pass that liberates one of those formats steals from the other, and trying to average between them only diminishes both. The work is recognizing which is which.
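To make the suppression concrete, here is a minimal sketch of a sensor and editor pass that consults a per-format policy before acting. It is not the actual Stalefish Labs stack: the sensor names, the word list, the 30-word threshold, and the `POLICY` table are all hypothetical, and the reviser is left out because in practice it would be a model call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass(frozen=True)
class Flag:
    sensor: str    # name of the sensor that raised the flag
    sentence: str  # the sentence it points at


def run_on(sentence: str) -> bool:
    """Hypothetical sensor: flags very long sentences (threshold is an assumption)."""
    return len(sentence.split()) > 30


def modern_word(sentence: str) -> bool:
    """Hypothetical sensor: flags words too modern for a timeless register."""
    modern = {"email", "smartphone", "wifi"}  # illustrative list only
    words = {w.strip(".,;:!?").lower() for w in sentence.split()}
    return bool(words & modern)


SENSORS: Dict[str, Callable[[str], bool]] = {
    "run_on": run_on,
    "modern_word": modern_word,
}

# Assumed policy: which flags the editor may act on, per format.
# Literary work gets the full pass; performance work keeps its texture.
POLICY: Dict[str, Set[str]] = {
    "literary": {"run_on", "modern_word"},
    "performance": set(),
}


def sense(sentences: List[str]) -> List[Flag]:
    """Run every sensor over every sentence and collect the flags."""
    return [
        Flag(name, s)
        for s in sentences
        for name, check in SENSORS.items()
        if check(s)
    ]


def edit(flags: List[Flag], fmt: str) -> List[Flag]:
    """Keep only the flags this format's policy allows the reviser to fix."""
    return [f for f in flags if f.sensor in POLICY[fmt]]


draft = ["She checked her email by candlelight."]
literary_fixes = edit(sense(draft), "literary")
performance_fixes = edit(sense(draft), "performance")
```

With this draft, the literary policy passes one `modern_word` flag on to the reviser, and the performance policy passes nothing. The sensors see the same sentence either way; only the decision about what counts as a problem changes with the format.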

Where the human hand goes

Once the distinction is named, a less obvious question follows: where does human craft attention concentrate in each format? Because the AI is involved in both, and the human is involved in both, and the work of the human is not the same in both.

In comedy, the premise carries almost everything. “Edgar Allan Poe attempts to assemble an IKEA shelving unit” is funny before a single word is generated. The casting and the situation, the collision of contexts, is the act of authorship. The model’s job is to fall down the chute of that absurdity with energy and commitment, and the more reckless the model is in its descent, the better the result. The author’s hand sits up high in the workflow, in the setup. Below the setup, the prose details fall out semi-automatically. They don’t need polish; they need conviction. The human crafts a fun setup and mostly lets the AI play in it.

In literary writing, the premise can be left general because the work is in how it is rendered, sentence by sentence. “A gardener tends the garden at the end of her life” is a premise that has been written a thousand times. What makes a specific telling worth a reader’s attention is the texture of gardener and garden and evening light and the particular bowl on the particular table. The author’s hand sits low in the workflow, in the prose. Above the prose, the premise is a vessel. Below the prose, there is nothing more to do. The setup isn’t the magic the way it is for comedy; the magic is in the prose itself, so the AI can’t just romp freely.

That gives a clean rule of thumb for where to spend human attention per format:

  • Comedy: humans craft the setups. The model writes the details. We trust it to be reckless.
  • Literary: humans craft the prose. The model writes against tight authorial scaffolding. I polish what comes out with serious editorial care.

In neither case is the AI doing the creative work alone. The locus of human craft just sits in different places. Comedy wants premises and trust. Literary work wants prose and restraint.

There’s a useful piece of evidence for this in how each format fails when you get it backwards. Comedy fails upward when humans try to control the prose too tightly: every joke comes out sounding like it was workshopped to death. Bedtime fails downward when humans trust the model to set the register: every sentence comes out fine on its own and wrong in the room, a forest-and-trees problem. In both failures the output is technically correct. The grammar holds and the plot holds. What’s missing is the thing the format exists to deliver, so you get comedy that isn’t funny and bedtime stories that jar.

Why this answers the operational question

The same distinction answers a question most AI-product teams treat as a pure engineering decision: generate live, or pregenerate? Latency and cost dominate that conversation. The craft question dominates the answer.

Comedy wants variety, and live generation supplies it; pre-rendering a comic format would calcify the bits and drain the bite. Literary work wants consistency, and pre-rendering supplies it; live-generating bedtime prose would put a vulnerable reader at the mercy of the model’s worst sentence of the night. The model’s variability is a feature in one format and a bug in the other, and the operational architecture should follow the format rather than the convenience of the build. Wayward runs live. Sojourn runs pregenerated. Same studio, same models, opposite architectures. Both are correct. Either applied to the wrong format is a quiet disaster. For what it’s worth, I fought tooth and nail to get Sojourn to work live-generated, and it came close to working; maybe it still could some day. But ultimately the stories demanded the care of pregeneration plus a loving editorial pass afterward to hit the mark of bedtime stories that hold the line.
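One way to keep that decision honest is to write it down as data, so the format picks the architecture rather than the build. The sketch below is illustrative only: the `Mode`, `Regime`, and `regime_for` names are hypothetical, and the table simply restates the choices described above (Wayward live, Sojourn pregenerated).

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    PERFORMANCE = "performance"  # surprise is the reward; audience is alert
    LITERARY = "literary"        # register is the reward; audience is vulnerable


@dataclass(frozen=True)
class Regime:
    generation: str   # "live" or "pregenerated"
    human_focus: str  # where human craft concentrates: "setup" or "prose"
    polish: bool      # whether the editorial pass runs on the output


# Assumed mapping, restating the essay's rule of thumb as a lookup table.
REGIMES = {
    Mode.PERFORMANCE: Regime(generation="live", human_focus="setup", polish=False),
    Mode.LITERARY: Regime(generation="pregenerated", human_focus="prose", polish=True),
}

# The two products named in the essay, tagged by writing mode.
PRODUCTS = {
    "Wayward": Mode.PERFORMANCE,
    "Sojourn": Mode.LITERARY,
}


def regime_for(product: str) -> Regime:
    """Look up the operational regime from the product's writing mode."""
    return REGIMES[PRODUCTS[product]]


wayward = regime_for("Wayward")
sojourn = regime_for("Sojourn")
```

The point of the indirection is that nothing in `PRODUCTS` mentions latency or cost. A new product gets its regime by answering the format question first, and formats that sit on the line would need more than two rows.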

The portable principle

If you are making creative things with AI — writing, illustration, music, design, games — you will eventually arrive at the same fork. The temptation is to treat it as a technology question or a budget question. It isn’t, primarily. It’s a craft question, and it asks two things.

What kind of writing is your format? Performance, where surprise is the reward and the audience is alert? Or literary, where register is the reward and the audience is vulnerable? Most formats fall cleanly on one side; some sit on the line, and those are the hardest. Reporting is mostly literary but tolerates flashes of performance. Songwriting is performance in the recording but literary in the lyric. Knowing which side a piece of work lives on is the first question, and answering it wrong costs you the work.

Where does human craft live in your workflow? In the setup, where the model is then trusted to be reckless? Or in the prose, where the model is then polished by careful human (or human-supervised) editorial attention? Different answer for different formats, and the answer dictates almost everything operational about the project.

Most AI-creative products I see in the wild get one or both of these answers wrong. They over-edit performance work into competent strangers’ versions of what it should have been, or they under-edit literary work and produce prose that sounds plausible in isolation and wrong in a room. The model is not the problem in either case. The discipline is the problem. The format is asking for something specific, and the discipline isn’t supplying it.

The honest uncertainty

The framework is sharper than the world is. Not every format falls cleanly on one side of the line. Long-form journalism, the personal essay, screenwriting, the kind of literary fiction that wants to be a little funny: these sit on the line, and figuring out where exactly is its own discipline, one I’m only at the start of having opinions about. The operational specifics will move as the models do, too; what gets polished, what gets pregenerated, where the editor stack helps and where it damages will all shift with the tools. The framework is anchored in how the work is consumed, not in how the model is trained, and that part I think is durable. But the binary is a lens, not a law.

What I’m more confident of is the underlying observation: the model is not the constraint. The format is the constraint, and the operational regime should follow the format, not the convenience of the build.

What stays

The Hunter S. Thompson question keeps coming back to me when I think about this, because it sits at the intersection of a few things that matter.

The first is that AI tools — generative ones especially — are exquisitely sensitive to the discipline applied to them. The same model, pointed at the same problem, can produce work that liberates a writer’s voice or work that erases it. The discipline is the difference, and the discipline is invisible from outside the work.

The second is that the discipline isn’t a technical specification. You can’t read it off a docs page. It has to be felt, format by format, by the person making the work. The editor stack that almost butchered the Wayward reporters was a technically competent system; it just had no idea what it was reviewing. That’s the kind of mistake that only gets caught by a human reading the output and saying, this is worse, and the reason is that the thing removed was the thing the joke was made of.

The third is that the discipline scales the way every other piece of judgment scales: slowly, through paying attention, through making things and watching them get better or worse. There’s no shortcut. You build a system that polishes prose, you point it at comedy, and you find out — because the comedy gets less funny — that comedy is not literary work.

One observation I’m setting aside for a future essay, because it’s a bigger argument than this one: performance and literary aren’t only craft categories. They also describe two different audience postures — alert and leaning in, or vulnerable and leaning out. The first is the posture of attention; the second is the posture of intimacy. The format question this essay walks through has a relational shape underneath it, and that shape turns out to be doing work in a lot of places I didn’t expect. I’ll come back to it.

What I take from this, and what I’d offer to anyone else building with these tools: assume that every format you work in has a discipline of its own, and that the discipline is unlikely to be obvious from the model’s documentation or from the way the previous format you worked in behaved. The model is plural. Your craft has to be plural too.

Polish, applied at the right point in the right kind of work, makes the work itself. Applied to the wrong point in the wrong kind of work, it produces a competent stranger.

Hunter S. Thompson, somewhere, is glad we figured this out before we hit publish. And probably more than a little pissed that his name is even being referenced in association with AI.
