Software engineering has well-established principles. DRY. Separation of concerns. Write tests. Make small atomic PRs. Code review. These have been refined over decades and are basically conventional wisdom at this point. But working with LLMs and coding agents? That's an entirely new genre of skills and there's no established playbook yet. So what are the Fundamental Principles of Getting Good Output from LLMs? Whether it's coding or anything else, I have a few key tenets that, from my experience coding with agents full-time for the past 10 months, hold true in all cases. I haven't seen these principles synthesized like this anywhere, so I'll do my best. Very open to feedback or ideas especially if you disagree (DM me on X).

1. Alignment

You're absolutely right. Alignment is NUMBER ONE. This is the single most important principle. If you take nothing else from this post, take this: make sure you and your agent are aligned on what the end goal is. Specifically. What you actually mean. What the success state is. What "done" means. And, optionally, how to get there.

Funnily enough, alignment isn't just an LLM skill either. It's a life skill. Miscommunications are costly everywhere. Relationships, businesses, engineering teams. But with LLMs the cost is especially sneaky because the agent won't push back and say "I don't understand." It'll just confidently build the wrong thing.

James Clear has an apt analogy in Atomic Habits: "If a pilot leaving from LAX adjusts the heading just 3.5 degrees south, you will land in Washington, D.C., instead of New York. Such a small change is barely noticeable at takeoff — the nose of the airplane moves just a few feet — but when magnified across the entire United States, you end up hundreds of miles apart." Same thing here. Before your agent writes a single line of code, you need to be confident it understands what you want and is thinking about it the same way you are.

If you send it off with a different mental model of the goal than yours, it will do the wrong thing in subtle ways that are hard to debug. Especially with greenfield code where the agent writes so much that it's impossible to read it all. This is even worse when you're working on something niche or out-of-distribution, where the training data has a common answer for something similar but not quite the same.[1]

The models in 2026 are incredibly smart. Most bad AI output these days isn't because the model is dumb.[2] It's because you and the model were never aligned on the goal.

This article puts it well: "LLMs work best when the user defines their acceptance criteria before the first line of code is generated." Know what success looks like before you start. Sounds obvious, but almost nobody does it consistently.
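As a concrete (and entirely hypothetical) illustration, acceptance criteria for a small feature might be written down like this before the first prompt — the feature and details are mine, invented for the example:

```markdown
## Feature: CSV export button

Done means:
- An "Export CSV" button appears on the reports page.
- Clicking it downloads the currently filtered rows, not the full table.
- Column order matches the on-screen table; dates are ISO 8601.
- An empty filter result downloads a file with headers only, no error.
```

Five minutes of this up front is the alignment step. Everything the agent produces afterward gets measured against it instead of against a vibe.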

2. Explain the why, not the how

Don't be prescriptive. LLMs are commoditized intelligence. That intelligence is literally what you're paying for. So if you're telling them exactly what to do step by step in excruciating detail and coming up with the plan yourself, you're basically doing the work for them. Sure, it'll work, but you're turning an incredibly capable reasoning engine into a dumb execution layer. You might as well write the code yourself at that point.[3] Everyone can see where the future is headed. Learn how to get the same result with less effort by communicating with the super genius in the computer, not by micromanaging it.

The skill that matters way more than knowing each specific implementation detail is being able to communicate what you want and why you want it. Explain your reasoning and your perspective. Anthropic's constitution blog post says this well: giving a model specific lists of things to do and not do is brittle. It doesn't generalize when the model encounters something out of distribution, something not explicitly covered by the rules. But giving it guiding principles and explaining why — that generalizes. The model can evaluate any new scenario against the principle, instead of hitting an edge case and going "it's not in the rules, I don't know what to do" and doing something arbitrary.

This is alignment at a deeper level. You're not just aligning on the goal, you're aligning on the reasoning behind the goal. Don't be prescriptive. Guidelines over brittle rules. That makes everything the AI does more robust.

3. Give the agent a laboratory

So now your agent understands what success looks like, the reasoning behind it, and has the freedom to figure out the approach. The next question is: can it actually verify that it succeeded? Can it check its own work without coming back to you?[4]

This is where leverage comes from. If the agent can verify its own output against the success criteria, it can iterate without you in the loop. It just keeps working until it meets the bar you set. LLMs are really good at hill-climbing toward verifiable goals. That's what they have been RL'd to do. When you give them a way to check their own work, you're enabling one of their core strengths. It unlocks autonomy. And that's leverage: a multiplier on your effort because the work is no longer blocked on you. You don't have to keep checking in and nudging. The agent handles it and works until completion.
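The loop this enables can be sketched in a few lines of shell. This is a minimal illustration, assuming the success criterion can be expressed as a command (a test suite, a linter, a screenshot diff); the function and variable names are mine, not from any particular harness:

```shell
# Sketch of the self-verification loop: run the verifiable check,
# retry until it passes or the budget runs out. Illustrative only.
retry_until_green() {
  check_cmd="$1"     # the verifiable success criterion, as a command
  budget="${2:-5}"   # how many attempts before escalating to a human
  i=1
  while [ "$i" -le "$budget" ]; do
    if $check_cmd; then
      echo "green after $i attempt(s)"
      return 0
    fi
    # In a real harness the agent edits code here, then re-checks.
    i=$((i + 1))
  done
  echo "still red after $budget attempts; escalate to a human" >&2
  return 1
}
```

An agent running a loop like this only surfaces to you on the failure path, which is exactly the point: the happy path needs no human at all.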

If the agent asks YOU to verify something, you've hit the limits of what your harness can do. That's a good signal to stop and figure out how to give it that capability. Every time it needs a human in the loop to validate something, you lose the autonomy and the leverage that makes this whole thing powerful.

This article calls this "giving the agent a laboratory" and it's one of the best framings I've seen. The agent needs to be able to run its own experiments, see its own errors, iterate on its own output. That's why I give my agents Docker containers with their own browser. They can spin up dev servers, test UIs, scrape the web, install whatever they want. They can go absolutely nuts in there and I don't care. In fact that's exactly what I want. I also recently added gog CLI to my claude-ting setup and made an entire Google account just for my agents. They can make Google docs, send emails, whatever. I guess they can watch YouTube too... if they could process videos. The point is: the more tools they have, the more they can do without asking me. This is also what made OpenClaw so popular and powerful.
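A sketch of what such a laboratory image could look like, assuming a Docker-based setup; the base image and tool choices are illustrative assumptions, not my actual config:

```dockerfile
# Disposable sandbox for an agent: it can install, break, and retry
# anything in here without touching the host. Tool choices are illustrative.
FROM node:22-bookworm

# A headless browser plus basics for UI testing and scraping.
RUN apt-get update && apt-get install -y --no-install-recommends \
    chromium git curl jq \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /workspace
# Run with something like:
#   docker run --rm -it -v "$PWD":/workspace agent-lab bash
```

The exact contents matter less than the property: nothing in the container is precious, so the agent never has to ask permission to experiment.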

4. Iteration is inevitable

No matter how good your alignment is, no matter how clear your success criteria are, things will go wrong. It's hard to define everything perfectly upfront. In practice, mistakes happen and things get missed.

Could be your mistake. Could be the LLM's mistake. After all, these are probabilistic processes. Since we're always just sampling from a probability distribution of tokens, there's going to be a certain degree of inherent randomness. The model might just get it wrong sometimes.

So plan for iteration. Don't expect perfection on the first turn. The system should be designed to converge on the right answer, not to nail it in one shot. This is why the laboratory matters. Not because the first attempt will be perfect, but because the second and third attempts need a fast feedback loop to get there.[5]

5. Every mistake is engineerable

Iteration is inevitable, but the same mistake happening twice? That's a failure of your harness. When something goes wrong, don't just fix it. Make sure it can never happen again.

Mitchell Hashimoto writes about this as "engineering the harness." Mitchell, you're absolutely right. There are two levers:

  • Prompting — Update your CLAUDE.md, AGENTS.md, or whatever docs your agent reads at the start of every session. If the agent keeps making the same kind of mistake, write a rule that prevents it. The docs should encode every hard-won lesson. Skills or other specialized docs work for this too if the lesson isn't universally applicable to your codebase.
  • Tooling — Write scripts, linters, automated checks that catch the error before it gets to you. If a class of mistake can be detected programmatically, it should be.
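The tooling lever can be as small as a single script. Here's a hedged sketch of one such check: catching a class of mistake (stray debug statements, in this invented example) mechanically, before it reaches review. The function name and the banned pattern are illustrative assumptions:

```shell
# Sketch of the "tooling" lever: a programmatic check for a recurring
# class of mistake. Swap the pattern for whatever your agent keeps doing.
check_no_debug() {
  dir="$1"
  # grep -rn exits 0 when it finds a match, so a match means failure here.
  if grep -rn "console.log" "$dir" 2>/dev/null; then
    echo "FAIL: stray debug statements found in $dir" >&2
    return 1
  fi
  echo "OK: $dir is clean"
}
```

Wire something like this into a pre-commit hook or CI and the agent gets the signal automatically, without you having to spot the mistake yourself.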

Even better: add a line to your AGENTS.md about continuous improvement of the harness. Have the agent itself reflect after completing a task. What went wrong? What would have helped? What should be added to the docs? Each task improves the next one. Your system should be getting smarter over time, not just you.[6]
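A sketch of what such a section could look like; the wording is mine, not quoted from any real file:

```markdown
## Continuous improvement

After completing any task, reflect briefly:
- What went wrong, and why?
- What context, rule, or tool would have prevented it?
- If the lesson generalizes across this repo, append it to this file
  (or the relevant skill) before ending the session.
```

The specifics will vary by repo; the property that matters is that the reflection step is written down where the agent reads it every session, not left to your memory.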

6. Automate everything

If you find yourself doing the same flow with an LLM multiple times, stop. That's unnecessary finger exercise. Make the workflow instructions into a reusable skill you can reference. Same leverage principle: build once, use forever.

As you use the skill, you'll naturally find issues with it. Edge cases it doesn't handle, steps that could be better, unnecessary parts. You update it. It gets better. That's the spirit of continuous improvement. The skill is a living thing that gets refined every time you run it.
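For concreteness, a skill can be as simple as a short markdown file. This sketch assumes the SKILL.md convention used by Claude Code-style agents (YAML frontmatter plus instructions); the skill name and steps are invented for the example:

```markdown
---
name: release-notes
description: Drafts release notes from commits since the last git tag.
---

1. Find the last tag with `git describe --tags --abbrev=0`.
2. Summarize `git log <tag>..HEAD --oneline`, grouped into features and fixes.
3. Write the draft to RELEASE_NOTES.md; only ask the user if something
   is genuinely ambiguous.
```

Every time the workflow runs and something is off, the fix goes back into this file, not into a one-off prompt.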

And eventually some workflows don't even need you to kick them off. They can be fully automated. Scheduled agent runs. Cron jobs. Things that just happen in the background while you sleep. That's the endgame of leverage.
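For example, a fully automated run can be a single crontab entry. This is a hypothetical sketch: `claude -p` is Claude Code's non-interactive print mode, but the schedule, prompt, paths, and log file here are all illustrative assumptions:

```shell
# Hypothetical crontab entry: a nightly headless agent run at 03:00.
0 3 * * * cd /home/me/project && claude -p "Triage new issues and draft responses" >> /var/log/agent-nightly.log 2>&1
```

The same idea works with any headless agent CLI and any scheduler; the point is that the workflow no longer needs you even to start it.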


The bigger picture

LLMs are commoditized intelligence. Intelligence used to be the bottleneck for everything. Now you can spin it up on demand.

That gives you leverage. You build a system once, then leverage it infinitely. A good CLAUDE.md benefits every future session. A test harness benefits every future bug fix. A skill benefits every future invocation. The work compounds.

It also gives you autonomy. The AI frees up your mental bandwidth. You stop being the one implementing and start being the one deciding what to implement. What to build next. What the right approach is. How the pieces fit together. Your job becomes the WHAT, not the HOW.[7]

As many before me have said, taste and motivation become a lot more important in this era. You're not just limited by technical ability anymore. You're limited by how good your ideas are, whether you can see what needs to be built, and whether you have the drive to ship it.[9]

What separates people who are effective with AI from people who aren't isn't the model. It's the harness. The tools, the docs, the eval criteria, the feedback loops, the workflows. Harness engineering is software engineering, just a different abstraction layer. And honestly, the specific code quality of any given output barely matters anymore. It's trivial to refactor. What matters is the speed at which you can ship without introducing regressions, and how much of the process you can automate. In other words, how much leverage you can create.

In my opinion, the harness is AGI. Not the model by itself. A god-tier harness plus current models could almost equal AGI. We're way closer than people think, though I feel that the goalposts have shifted a bit. The bottleneck isn't model intelligence.[8] It's how you wield it and the systems you design around it.

Go build your harness.

  1. I ran into this trying to get an LLM to implement "Live RSI", which is TradingView-style RSI that updates every tick, basically what the RSI would be if the current price were the current candle's close. The model kept giving me standard RSI / a broken implementation of live RSI instead. It didn't understand the concept even when I thought I'd explained it clearly (I guess I hadn't), and made the wrong implementation multiple times. Annoying.
  2. That said, sometimes the model is just dumb. I've had quite a few experiences with Claude Opus 4.6 in Claude Code where it was absolutely boneheadedly stupid. I'd specify exactly what needed to happen, think I'd made it crystal clear, and it would just drop details. Forget things I told it from the start of the conversation. Miss stuff that should be obvious. This happened on multiple occasions and was incredibly frustrating. It didn't seem to be a problem before mid-January, and has been weirdly inconsistent since. This is honestly one of the main reasons why I think Codex > Claude right now. Codex thinks deeper, follows instructions well, catches edge cases almost to a fault. Claude has his strengths but the weakness is that he just misses things. I really hope they fix this with the next Opus release so I can switch back.
  3. An archaic practice of 2023 and earlier where developers wrote code by hand, typing the syntax character by character into a text editor.
  4. I forgot to talk about this explicitly in this post, but Moss's blog post on backpressure is a great read on this topic and definitely shaped my thinking here.
  5. Also, f*** spec-driven development. Some people think you can write a 100-page document that defines exactly the application you want to build and the AI will build it perfectly. One, it's impossible to account for everything, and two, at that point you're just coding with extra steps. Iterate more.
  6. Claude Code's auto-memory system kind of works for this too, but I'm not sold on how robust it is. If a learning is universally applicable to the repo and relevant across coding agents, it should be codified explicitly and checked into the repo, not floating in one agent's memory.
  7. At the moment, this is really only achievable by SWEs. The models don't build things with taste and can't reliably make big things work end-to-end in one go. Good SWE taste is a hard thing to train into a model, so I expect it'll be like this for the foreseeable future. SWEs are definitely not out of a job yet. You still need to make sure the code actually works and know how to debug, and being technical in general is a huge advantage.
  8. Obviously, as Dario himself has stated, some things are still missing. Continuous learning. Driving tasks across long time horizons. Etc. But I think most of these can be solved at the harness layer, not necessarily the model layer. It'll get easier as RL improves and LLMs' task-solving abilities generalize further.
  9. The other most important skill to practice right now is extreme clarity of thought and being good at communicating it in writing. That's the interface layer between you and the machine. If you can't think clearly and write clearly, your work will devolve into slop regardless of which model you're using. Basically, it's a skill issue.