Software Documentation That Engineers Actually Read

The Problem With Most Documentation

Most software documentation fails in one of two ways: there's too little of it, or there's too much of it in the wrong places. Either way, it fails the engineers who need it.

Too little documentation means engineers waste time reverse-engineering decisions that should have been written down, new team members take months to become productive, and tribal knowledge evaporates whenever someone leaves. Too much documentation — comprehensive wikis that nobody reads, auto-generated API docs with no examples, architectural diagrams that haven't been updated in two years — creates noise that buries the signal.

The goal isn't comprehensive documentation. The goal is useful documentation: the minimum viable set of documents that genuinely helps engineers understand the system, make good decisions, and solve problems without interrupting each other. Here's how to build it.

The Documentation Hierarchy

Different documentation types serve different purposes. Understanding which type serves which purpose helps you write the right document instead of the most comprehensive one.

Level 1 — Orientating documentation: Helps new engineers understand what the system is and how to start working with it. READMEs, architecture overviews, "getting started" guides.

Level 2 — Decision documentation: Captures why things are the way they are. Architecture Decision Records, design documents, post-mortems.

Level 3 — Reference documentation: Describes the specifics of an interface or system. API docs, configuration references, data model documentation.

Level 4 — Operational documentation: Helps engineers operate the system in production. Runbooks, on-call guides, deployment procedures.

Each type has a different audience, a different level of detail, and a different maintenance burden. Write the right type for the job.

READMEs That Actually Orient

A README is a contract with the engineer who just pulled your repository for the first time. It has thirty seconds to tell them what they need to know to get started, or they'll give up and ping Slack instead.

A useful README answers exactly these questions, in this order:

What is this? One or two sentences. Not marketing copy. What does this service do in concrete terms?

How do I run it locally? Step-by-step instructions assuming nothing. Include every dependency, every environment variable, every setup command. Test these instructions on a clean machine periodically — they drift.

What are the key concepts? If the system has domain-specific terms or non-obvious architectural concepts, explain them briefly. Link to deeper documentation.

How do I run the tests? The command, what it runs, and how to interpret the output.

How do I deploy? Or where to find deployment documentation if it's complex enough to warrant its own document.

Who owns this? Team name, Slack channel, on-call rotation. The README is often how someone figures out who to contact when things break.

That's it. Don't put architectural design in the README — that belongs in an ADR or design doc. Don't put API reference in the README — that belongs in API docs. The README exists to orient, and it should do that quickly.
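The question list above can double as an automated check. What follows is a minimal sketch, assuming the README uses Markdown-style headings that start with "#" and that each question maps to a recognizable heading keyword. The keywords in REQUIRED_SECTIONS and the function name missing_sections are hypothetical and should be adapted to your own conventions.

```python
# Hypothetical heading keywords, one per question a README should answer.
REQUIRED_SECTIONS = [
    "what is this",
    "run locally",
    "key concepts",
    "tests",
    "deploy",
    "owner",
]


def missing_sections(readme_text: str) -> list[str]:
    """Return the required keywords that no heading in the README mentions."""
    # Assumption: headings are lines that start with '#'.
    headings = [
        line.lstrip("#").strip().lower()
        for line in readme_text.splitlines()
        if line.startswith("#")
    ]
    return [
        keyword
        for keyword in REQUIRED_SECTIONS
        if not any(keyword in heading for heading in headings)
    ]


if __name__ == "__main__":
    sample = "# Payments\n## What is this?\n## Run locally\n## Tests\n"
    # Prints the keywords with no matching heading in the sample.
    print(missing_sections(sample))
```

A check like this only confirms that the sections exist. Whether the instructions in them still work is something only a run on a clean machine can tell you.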

What Kills READMEs

Stale instructions. A README that lies is worse than no README. If the setup instructions no longer work, the README has negative value — it wastes time and erodes trust. Either fix it or delete the section.

Assuming context. "Configure your environment variables" is not setup documentation. "Copy .env.example to .env and fill in the following values:" is.

Everything in one file. A 2,000-line README is not a README. It's a poorly organized wiki. Extract the depth into linked documents.
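The "assuming context" failure can be partly caught by a script that ships with the repository. Here is a minimal sketch that compares a .env file against .env.example and reports variables that are missing or still empty. The parsing is deliberately simplistic (KEY=VALUE lines and "#" comments only), and the function names are hypothetical.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def unset_variables(example_text: str, env_text: str) -> list[str]:
    """Return keys from the example file that are missing or empty in the env file."""
    example = parse_env(example_text)
    actual = parse_env(env_text)
    # actual.get(key) is None for a missing key and "" for an empty one.
    return [key for key in example if not actual.get(key)]
```

In practice the two arguments would come from reading .env.example and .env off disk, and the setup instructions in the README could tell new engineers to run the check before starting the service.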

Architecture Decision Records: Documentation That Ages Well

ADRs are covered in detail in another post, but they belong in this discussion because they're the type of documentation teams most reliably skip, and the most valuable when present.

An ADR captures why a significant decision was made. Not what the decision was (that's visible in the code), but the context, alternatives, and trade-offs. This ages extremely well because the context of a decision is exactly what gets lost over time.

The key practice: write the ADR as part of the decision-making process, not after. Bring the draft to the architecture review. Update it based on the discussion. Merge it with the code change it documents. Three sentences written while the decision is fresh are worth more than three paragraphs written six months later from memory.
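One way to lower the cost of writing the ADR early is to generate its skeleton. The sketch below assumes a plain-text record built around the elements named above (context, alternatives, trade-offs). The field names, the numbering scheme, and the ADR class itself are illustrative, not a standard format.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ADR:
    """A hypothetical, minimal Architecture Decision Record."""

    number: int
    title: str
    context: str
    decision: str
    alternatives: list[str] = field(default_factory=list)
    trade_offs: list[str] = field(default_factory=list)
    decided_on: date = field(default_factory=date.today)

    def render(self) -> str:
        """Render the record as plain text to commit alongside the code change."""

        def bullets(items: list[str]) -> str:
            return "\n".join(f"- {item}" for item in items) or "- (none recorded)"

        return "\n".join([
            f"ADR {self.number:04d}: {self.title}",
            f"Date: {self.decided_on.isoformat()}",
            "",
            "Context",
            self.context,
            "",
            "Decision",
            self.decision,
            "",
            "Alternatives considered",
            bullets(self.alternatives),
            "",
            "Trade-offs",
            bullets(self.trade_offs),
            "",
        ])
```

The "(none recorded)" placeholder is a deliberate nudge: an empty alternatives section shows up in review as a visible gap rather than silently disappearing.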

API Documentation Engineers Will Actually Use

Auto-generated API documentation is a floor, not a ceiling. OpenAPI/Swagger specifications, generated from code annotations, provide accurate reference documentation automatically. This is necessary. It's not sufficient.

What generated docs don't provide:

Getting started guides. How does a new integration developer make their first successful API call? Walk them through authentication, a simple request, and error handling in a single, end-to-end example. This can't be generated.

Conceptual explanations. The difference between a draft and a published resource. When to use PATCH vs PUT. What "idempotency key" means in your context. Generated docs can describe fields; they can't explain concepts.

Error code documentation. Every machine-readable error code your API returns deserves a human-readable explanation and suggested remediation. "Order cannot be placed while payment is pending" is useful. "ERR_422" is not.

Realistic examples. Generated examples often use placeholder values. Examples built from representative data, the kinds of payloads your API actually produces and consumes, reduce integration errors.

Changelog. What changed between API versions, and what should integrators do about it?
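The error code item in the list above can be sketched as a small catalog that pairs each machine-readable code with an explanation and a suggested remediation. The codes, the wording of the remediations, and the helper functions below are invented for illustration and are not taken from any real API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorDoc:
    code: str
    explanation: str
    remediation: str


# Hypothetical catalog; a real one would be kept next to the API's error definitions.
ERROR_CATALOG = {
    doc.code: doc
    for doc in [
        ErrorDoc(
            code="PAYMENT_PENDING",
            explanation="Order cannot be placed while payment is pending.",
            remediation="Wait for the payment to settle, then retry the request.",
        ),
        ErrorDoc(
            code="CART_EMPTY",
            explanation="The order contains no items.",
            remediation="Add at least one item before placing the order.",
        ),
    ]
}


def describe_error(code: str) -> str:
    """Return a human-readable description, or flag the code as undocumented."""
    doc = ERROR_CATALOG.get(code)
    if doc is None:
        return f"{code}: undocumented error code."
    return f"{doc.code}: {doc.explanation} {doc.remediation}"


def undocumented_codes(codes_in_use: set[str]) -> set[str]:
    """Codes the API can return that have no catalog entry."""
    return codes_in_use - set(ERROR_CATALOG)
```

Running undocumented_codes against the set of codes the service can actually emit, for example in CI, is one way to keep the catalog from drifting behind the API.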

Runbooks: Documentation That Works at 3am

A runbook is operational documentation for a specific service or system: the procedures an on-call engineer needs when something goes wrong. It's the documentation equivalent of deciding in advance. The thinking happens calmly, before the incident, so it doesn't have to happen during one.

A runbook that isn't clear enough to follow at 3am under incident stress is not a runbook. It's a document.

Good runbooks are:

Specific. "Check the service health" is not a runbook step. "Run kubectl get pods -n payment-service and verify all pods are in Running state" is.

Linked to alerting. When an alert fires, the runbook link should be in the alert. Engineers should never have to search for the runbook for a specific alert.

Actionable. For each symptom, what do you check? What do you do? What's the escalation path if the standard remediation doesn't work?

Tested. Runbooks that have never been followed in practice are full of errors. Run through them during game days or chaos engineering exercises to find the gaps before they become incident gaps.

Current. Every architecture change that affects operations should trigger a runbook review. Runbooks that describe a system that no longer exists are dangerous.
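The "linked to alerting" property above is straightforward to enforce mechanically. Here is a minimal sketch, assuming alert definitions have been loaded into dictionaries with a name and an annotations mapping, and that the link lives under a runbook_url key. Both the shape and the key name are assumptions modeled loosely on common alerting setups, not on any specific tool's schema.

```python
def alerts_without_runbook(alerts: list[dict]) -> list[str]:
    """Return the names of alerts that lack an http(s) runbook link."""
    missing = []
    for alert in alerts:
        annotations = alert.get("annotations") or {}
        url = str(annotations.get("runbook_url", "")).strip()
        # Treat anything that is not an http(s) URL as a missing link.
        if not url.startswith(("http://", "https://")):
            missing.append(alert.get("name", "<unnamed>"))
    return missing


if __name__ == "__main__":
    # Hypothetical alert definitions for illustration only.
    alerts = [
        {
            "name": "PaymentPodsNotRunning",
            "annotations": {"runbook_url": "https://example.com/runbooks/payment-pods"},
        },
        {"name": "QueueBacklogHigh", "annotations": {}},
    ]
    for name in alerts_without_runbook(alerts):
        print(f"Alert {name} has no runbook link")
```

A check like this verifies only that a link is present. Whether the runbook behind it is specific, tested, and current still depends on the practices listed above.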

The Documentation That Doesn't Need to Exist

Not everything needs documentation. Being selective about what you document is as important as writing good documentation for the things that matter.

You don't need to document:

Code that is self-explanatory (well-named functions with clear logic)
Implementation details that are visible in the code
Decisions that are trivially reversible
Architecture diagrams for their own sake, with no reader in mind

You do need to document:

The non-obvious reasons behind decisions
Anything a new engineer would need to know to get productive
Operational procedures for production systems
API contracts that other teams consume
Complex domain concepts that aren't obvious from the code

Making Documentation Sustainable

Documentation that nobody maintains is documentation that nobody trusts. Make documentation maintenance sustainable:

Co-locate documentation with code. Docs that live next to the code they describe are updated when the code changes. Docs in a separate wiki are updated when someone remembers.

Make documentation part of the definition of done. If a feature requires an API change, the ADR and API docs are part of that feature, not separate work.

Review documentation in code review. If a PR changes behavior and the relevant runbook or README isn't updated, the PR isn't done.

Delete stale documentation ruthlessly. Outdated documentation is worse than no documentation. A quarterly documentation audit that deletes more than it creates is a sign of a healthy documentation practice.
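The code review practice above can be backed by a lightweight automated reminder. The sketch below assumes source code lives under src/ and documentation lives under docs/ or runbooks/, or in README files. The directory names and function names are hypothetical, and a real check would take the list of changed files from your CI system.

```python
from pathlib import PurePosixPath

# Assumed repository layout; adjust to match your own.
CODE_DIRS = ("src",)
DOC_DIRS = ("docs", "runbooks")


def top_dir(path: str) -> str:
    """Return the first directory component, or '' for a file at the repo root."""
    parts = PurePosixPath(path).parts
    return parts[0] if len(parts) > 1 else ""


def is_doc(path: str) -> bool:
    """READMEs anywhere in the tree, plus anything under a documentation directory."""
    name = PurePosixPath(path).name.lower()
    return name.startswith("readme") or top_dir(path) in DOC_DIRS


def is_code(path: str) -> bool:
    return top_dir(path) in CODE_DIRS and not is_doc(path)


def needs_doc_reminder(changed_paths: list[str]) -> bool:
    """True when a change touches code but no documentation file."""
    touches_code = any(is_code(p) for p in changed_paths)
    touches_docs = any(is_doc(p) for p in changed_paths)
    return touches_code and not touches_docs
```

This is better treated as a reminder than a gate. Plenty of code changes legitimately need no documentation update, so a comment on the PR asking the reviewer to confirm is likely a better fit than a failed build.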

The benchmark for useful documentation is simple: does it help the people who need to use it? If your on-call engineer reaches for the runbook and finds what they need, the runbook is working. If your new hire reads the README and can run the service in an hour, the README is working. If nobody reads the wiki, the wiki isn't working.

Write for the reader. Write for the moment they need it most.

If you're building out an engineering documentation practice or auditing your current state, I'm happy to consult.