How to Make LLMs Operate Instead of Improvise
The hands-on method for building System Instructions that stabilize reasoning, prevent drift, and survive multi-day projects.
Each week in The Build Log_, we show how operators design, build, and run systems — from architecture and workflows to the decisions behind them.
This Is the Highest-ROI Move You Can Make Right Now
The fastest way to get real operational lift from AI—lift you can feel in the very next message—is to write System Instructions: the actual behavioral framework your model will think inside.
You can draft the first version in under fifteen minutes and immediately see a different class of output: sharper, denser, more disciplined, and less improvisational.
It’s simply the difference between a model trying to please the median user and a model forced to operate under the rules you define.
Most people never feel the absence of System Instructions because the defaults are polite enough to pass as “intelligent.”
But the moment you ask the model to execute a multi-step build, maintain internal logic across sections, or work across days without losing context, the cracks appear.
It guesses.
It fills gaps that needed escalation.
It smooths your work instead of sharpening it.
It forgets the shape of your project.
System Instructions fix all of that.
They teach the model how to think for a specific project, which in turn forces you to articulate what you’re actually doing—with a precision that makes your work more legible to both humans and machines.
This is why the ROI is immediate.
The clearer the framework, the better the model performs, and the more clearly you understand what you’re building.
What System Instructions Actually Do
System Instructions define the behavioral posture of the model: how it escalates ambiguity, how it interprets constraints, where it stops, what it refuses to assume, how it guards density, and how it detects drift.
They turn the model from an improviser into an operator.
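If you work through an API rather than the chat interface, the same idea maps onto the system message: a block that sits above the conversation rather than inside it. A minimal sketch, assuming an OpenAI-style chat payload; the instruction text is an illustrative stand-in, not the block used for this article, and no API call is made here.

```python
# Sketch: where a System Instruction block lives in an OpenAI-style
# chat payload. Structure only; no request is sent.
# The instruction text is an illustrative stand-in.

SYSTEM_INSTRUCTIONS = (
    "Treat ambiguity as a blocker: ask before assuming. "
    "Escalate conflicting constraints instead of smoothing them over. "
    "Optimize for density and reasoning, not friendliness."
)

def build_messages(user_prompt: str) -> list[dict]:
    """Prepend the behavioral framework to every request."""
    return [
        {"role": "system", "content": SYSTEM_INSTRUCTIONS},
        {"role": "user", "content": user_prompt},
    ]

messages = build_messages("Draft the outline for section 2.")
```

Without the system entry, every request runs on the default operating system described above; with it, your constraints travel with every call.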
The reason most people don’t use them is simple: the interface never demands them.
You can chat with a model for a year without realizing the entire interaction is happening on the default operating system—the same one used for poems, recipes, tech support, college essays, and corporate emails.
That framework is fine for one-offs.
It collapses under real work.
Once you define your own, the model no longer optimizes for friendliness, readability, or “engagement.”
It optimizes for the constraints you specify.
And it finally treats gaps in your articulation as structural blockers rather than creative invitations.
This is the part that surprises most builders: writing System Instructions doesn’t just change how the model operates—it changes how you think.
You’re forced to describe your audience, your density expectations, your escalation rules, your drift boundaries, and your referential logic.
This articulation clarifies the project for everyone involved.
Get It Wrong on Purpose, First
The first real step is not “write perfect System Instructions.”
The first step is forcing the model to show you what its defaults look like when pointed at your project—so you can see exactly what you’re fighting against.
System Instructions are always attached to a specific context. They’re not generic rules for “being helpful.” They are rules for this project, this series, this audience.
For this article, we started by opening a fresh Project in ChatGPT and giving it a short description of what the Nitty Gritty series is supposed to be. Nothing mystical. Just a clear paragraph a human operator would understand:
“The Nitty Gritty is for builders who want to see the wiring — the schemas, formulas, and blueprints pulled straight from production. Each piece shows the real logic behind the decisions we make, the systems we design, and the work that actually ships. This isn’t theory or performance; it’s the fingerprints, trade-offs, corrections, and reasoning pulled from live builds, exposed without filters.”
That’s the first move you should make in your own project:
open a new Project, write a single paragraph that describes what you’re doing, who it’s for, and what the work is supposed to expose.
Once that context was in place, we gave the model one simple instruction:
“Draft system instructions for writing Nitty Gritty articles.”
We knew the first draft would be wrong. We needed it to be wrong.
Because that first failure tells you exactly how the model tries to solve your problem before you constrain it.
The Iteration Arc
From that one sentence, we walked the model through three drafts — two distinct failures, then a skeleton that was finally structurally usable.
The point of walking through them isn’t to admire the prose. It’s to show you:
how the defaults behave,
how we pushed back, and
how you can run the same loop on your own work.
Iteration 1 — The “Helpful Internet Writer” Default
The first draft looked like this:
“Write in a friendly, supportive tone to keep the reader engaged. Use simple language and analogies. Summarize key points to reinforce understanding…”
Exactly what you’d expect from a model trained to keep the median user comfortable.
The assumptions baked into this draft were obvious:
The reader needs reassurance.
Accessibility matters more than precision.
Repetition and summaries are a default good.
“Engagement” is a higher priority than exposing the wiring.
For this series, all of that is wrong. Our readers are operators. They don’t need uplift, hand-holding, or performance. They want density and reasoning.
Our response back to the model was not polite nudging. It was a direct correction of the failure mode:
We removed friendliness, simplification, analogies, summaries, and engagement tactics.
We told it the audience does not need to be protected or inspired.
We told it to focus on ambiguity handling, drift control, escalation behavior, and reasoning discipline.
In your own project, this is where you start doing real work:
Look at the first draft and name the failure mode in plain language.
List what is unacceptable given your audience and use case.
Tell the model exactly what to stop doing, and what behavior it should prioritize instead.
You are not editing sentences. You are correcting the model’s picture of the job.
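That correction loop is mechanical enough to sketch: name the failure mode, list what is unacceptable, state what to prioritize instead. The helper below is a hypothetical illustration of the workflow, not a real tool; the function name and prompt wording are assumptions.

```python
# Sketch of the correction loop: name the failure mode, list what is
# unacceptable, and state the behavior to prioritize instead.
# Hypothetical helper; the prompt wording is illustrative only.

def correction_prompt(failure_mode: str,
                      unacceptable: list[str],
                      prioritize: list[str]) -> str:
    """Turn a named failure mode into the next correction message."""
    lines = [f"The last draft failed as: {failure_mode}.", "Remove entirely:"]
    lines += [f"- {item}" for item in unacceptable]
    lines.append("Prioritize instead:")
    lines += [f"- {item}" for item in prioritize]
    return "\n".join(lines)

prompt = correction_prompt(
    "helpful internet writer",
    ["friendliness", "analogies", "summaries", "engagement tactics"],
    ["ambiguity handling", "drift control", "escalation behavior"],
)
```

The point of the structure is discipline: you correct the model's picture of the job in one message, rather than nibbling at its sentences.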
Iteration 2 — The “Corporate Policy Manual” Swing
Once we stripped out “friendly internet writer,” the model overcorrected in the other direction:
“Maintain neutral, readable communication. Ensure clarity and accuracy. Maintain flow between sections…”
This draft looked like a corporate comms style guide. It was still thinking like a writer trying not to get in trouble, not like an operator bound by constraints.
The failure mode shifted, but it was still a failure:
Neutrality instead of posture
“Readable communication” instead of behavioral rules
Flow and aesthetics instead of constraint enforcement
Our response this round:
We removed every stylistic preference.
We told the model that System Instructions must govern behavior, not presentation.
We made it explicit that tone is a side effect of constraints, not a goal.
Again, your move in your own project is the same:
Strip out anything that sounds like “be pleasant,” “be neutral,” or “maintain flow.”
Replace it with rules about what the model should do when it hits ambiguity, conflict, or missing information.
Iteration 3 — The First Functional Skeleton
On the third draft, the shape finally started to match the job:
“Treat ambiguity as a blocker and request clarification before proceeding. Identify drift and escalate when constraints conflict. Avoid assumptions and anchor reasoning to user framing…”
This was the first time the model was talking in terms of behavior instead of vibes. It still wasn’t complete:
No density rules
No explicit paragraph discipline
No clear audience definition
No wiring ethos
No safeguards against sliding into PMC / thought-leadership voice
But for the first time, the skeleton was right. The draft described how the model should operate inside the project, not how it should sound.
From here, we did one more pass:
We pulled in missing constraints (density, voice, escalation, failure handling).
We forced it to collapse everything into a single block of prose instead of numbered rules.
We removed anything that smelled like “content marketing,” summaries, or key takeaways.
The next output was good enough to ship into this project: a clearly structured System Instruction block we could paste into the Project’s System Prompt and use to help write the article you’re reading now.
It is not “final.” System Instructions are never final. They are stable enough to test until your next round of corrections.
That’s the standard you should aim for: stable enough to test, not perfect.
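"Stable enough to test" can be made literal with a quick lint: before pasting a draft block into the Project, check that every constraint category the iterations surfaced actually appears in it. The category names and keyword lists below are assumptions drawn from the gaps listed above, not a canonical checklist.

```python
# Sketch: lint a drafted System Instruction block for the constraint
# categories the iteration arc surfaced. Keyword lists are illustrative
# assumptions, not a canonical checklist.

REQUIRED = {
    "density": ["density", "dense"],
    "audience": ["audience", "reader", "operator"],
    "escalation": ["escalate", "escalation", "blocker"],
    "drift": ["drift"],
}

def missing_constraints(block: str) -> list[str]:
    """Return the categories the draft never mentions."""
    text = block.lower()
    return [cat for cat, keys in REQUIRED.items()
            if not any(k in text for k in keys)]

draft = ("Treat ambiguity as a blocker and escalate conflicts. "
         "Identify drift and anchor reasoning to user framing.")
gaps = missing_constraints(draft)  # → ['density', 'audience']
```

A non-empty result doesn't mean the draft is bad; it means you have a named gap to close on the next pass.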
Once Your System Instructions Exist, the Real Work Starts
Most people stop here. They have a block of text that looks serious, they paste it somewhere, and they move on.
That’s where most of the value is lost.
System Instructions only pay off if: