← Back to Autonomy ← Companion to Vol. I, No. 2
A Short Illustrated Monograph · Vol. I, No. 3

Ten Ideas Every Engineer Needs Beyond the Code

By Majid Mazouchi
Problem Fit Workflows Documentation Error UX Accessibility Compliance Data Rights Fairness True Cost Sunsetting

Preface

From System to Product

The previous volume of this series argued that a working demo is not a working system. This one argues that a working system is not a working product. The technical work in volume two is necessary but not sufficient. Most products that ship still fail, and they fail for reasons the code cannot fix.

A demo proves it can work. A system proves it keeps working. A product proves someone wants it, can find it, can use it, and chooses to keep using it.

The ideas in this volume are easy to skip — they live outside the editor, outside the IDE, outside the satisfaction of green tests. They concern the people who use what you build, the laws that govern it, the data it consumes, the team that maintains it, and the day, eventually, when it has to be turned off. None of them are technical in the strict sense. All of them are engineering decisions in the broadest sense — choices made early that compound, for better or worse, for years.

If volume two was about the gap between possible and durable, this one is about the gap between durable and worth keeping.

A note on scope Not every idea here applies to every project equally. An internal tool used by ten people has different obligations than a public-facing service used by ten million. The discipline is in matching the rigor to the responsibility — over-engineering an exploratory prototype is just as wasteful as under-engineering a regulated product.
Part I

What You're Actually Building

A correct answer to the wrong question is the most expensive kind of correct. Most engineering effort is spent on the second question; the first deserves equal weight.
I.

Problem-Fit Comes Before Tech-Fit

Build the right thing, then build the thing right

A solution looking for a problem is, with rare luck, a hobby. With normal luck, it is a cost center.

Engineering culture rewards technical novelty: a clever architecture, a sophisticated model, a benchmark beaten. None of these matter if the thing being built doesn't solve a real, painful, frequent problem for someone willing to change behavior to use it. Most products that fail don't fail in production — they ship to indifference. The technical demo worked. The market shrugged.

The framing that helps: are you building a painkiller or a vitamin? Painkillers solve acute, specific pain. Users find them, adopt them quickly, pay for them. Vitamins are nice-to-have improvements; users say they want them in interviews and never quite get around to using them. Most engineering teams believe they are building a painkiller. Most are not.

Drag the dot to position your feature. The quadrant tells you whether it's a product or a problem.

Top-Left

Cool tech, no demand. The graveyard of clever startups.

Solution looking for problem
Top-Right

Real pain, real solution. This is the only square that ships.

Product opportunity
Bottom-Left

Don't go here. Don't pass go.

Avoid
Bottom-Right

Important problem, technology not yet ready. R&D, not product.

R&D territory
No real demand Real user need Not feasible Feasible today
Figure i.1 The quadrant most engineering-led teams gravitate to is the upper-left — clever tech, but the user need was assumed rather than verified. Verifying the right side of the diagram is what user research, customer development, and "talking to ten users" exist for.
Practical Before writing code, write a one-paragraph problem statement that names a specific user, a specific painful situation, the current workaround, and the cost of that workaround. If you cannot fill in those four blanks from real conversations, you are guessing — and the technical work that follows is leveraged on a guess.
II.

The User's Workflow, Not Just the User

Tools live inside processes, not in isolation

A great tool that doesn't fit the workflow is just clutter — extra cognitive load between the user and the thing they were already doing.

Engineers tend to optimize the central step — the algorithm, the prediction, the transformation. Users live in the whole pipeline. They come from somewhere (an email, a spreadsheet, a meeting), your tool is one step, and they leave to somewhere else (a report, a colleague, a decision). If you don't understand what comes before and after your step, your tool either won't be adopted, or it will be adopted with workarounds that defeat its purpose.

What the demo handles
predict()
What the user actually does
receive request
(email)
find & download
data files
clean & format
in Excel
predict()
interpret
results
write report
share via email
or Teams
Where adoption fails
friction here
friction here
friction here
demo wins
friction here
friction here
friction here
Figure ii.1 The demo handles one step beautifully. Six other steps live around it — and each one is where adoption fails. A 10× improvement on step four, with no change to the others, often nets to zero in real-world use.
Practical Map the full workflow before writing the tool. Sit with three users for an hour each and watch them do the thing. Note every click, copy-paste, and context-switch. Your tool should reduce total workflow time, not just optimize the step you find interesting.
Part II

The Human Surface

Code talks to code; products talk to people. Most of what users judge is not the algorithm — it is everything wrapped around it.
III.

Documentation Is a Feature

Code without docs has exactly one user

A library nobody can onboard to is a library nobody uses. The threshold for a working product is not "the code runs" — it is "a stranger can succeed with it in fifteen minutes."

Documentation is not a single thing; it is four things, each answering a different question. The Diátaxis framework names them clearly. Most projects have only the third — auto-generated reference — and wonder why adoption stalls. Reference is what an expert needs. Tutorials are what a beginner needs. Most users are not yet experts.

Tutorials
"Teach me"

Hand-held learning. Concrete steps, guaranteed-to-work outcomes. The user follows along; the user feels successful. Five-minute "hello world" beats three-hour exhaustive intro.

How-To Guides
"Help me solve this"

Goal-oriented recipes. The user has a specific problem; the guide tells them how to do this one thing. Cookbook style. Skimmable.

Reference
"What does this do?"

Exhaustive, accurate, machine-friendly. Every parameter, every return, every edge case. Auto-generated is fine — and is also where most projects stop.

Explanation
"Why is it like this?"

Concepts, design rationale, trade-offs, history. Reads like an essay. Helps experienced users build a mental model so they can extend it themselves.

Figure iii.1 The Diátaxis framework (Daniele Procida). Most projects ship reference and call it documentation. New users need tutorials. Returning users need how-to guides. Architects need explanation. Each is a distinct artifact with distinct rules.
Practical Treat the README as a five-minute tutorial — copy-pasteable commands, expected output, one realistic example. Add how-to guides as users ask "how do I…" questions. Reference grows automatically. Explanation gets written when you need to onboard a teammate or defend a design choice.
IV.

Errors, Empty States & Edge Paths

A product is judged by how it behaves when things go wrong

The happy path is where you sell the demo. The unhappy paths are where the product lives.

Real users hit empty databases on day one. They paste malformed data. They lose connectivity halfway through. They try things you didn't anticipate. A product spends most of its operational time in these edge paths, and the quality of its responses there determines how it feels to use. Three rules for an error message: tell the user (1) what happened, (2) why, and (3) what to do about it.

Demo grade
Error 500: Internal Server Error
Product grade
We couldn't save your changes — the server timed out. Your edits are still in the editor. Retry, or copy the text somewhere safe while we investigate.
Demo grade
ValueError: invalid literal for int() with base 10: '42.0'
Product grade
"42.0" isn't a whole number. Try removing the decimal, or use the decimal-allowed field below.
Empty state — demo
No results.
Empty state — product
No tickets match these filters. Try removing the date range, or [view all open tickets].
Figure iv.1 Three before-and-after pairs. The "bad" versions are technically accurate. The "good" versions tell the user what happened, suggest a next step, and respect their time. Both took the same engineering effort to ship.
Practical Audit every throw, every error toast, every empty UI state. For each, write the message a calm, helpful colleague would say. Stack traces belong in the logs; users get sentences. Special love for the empty state on day one — that is the user's first impression of an empty database.
V.

Accessibility & Internationalization

Excluding users by accident is still excluding them

A product that works for you and not for someone with a screen reader is not a product, it is a beta — and a quiet declaration of who counts.

Roughly 15% of the world has some form of disability. Color-blindness alone affects ~8% of men. Half the world doesn't read English first. Right-to-left scripts (Arabic, Hebrew) flip every assumption you embedded in your CSS. None of these are edge cases — they are the ordinary distribution of human users. Designing for them well is design done right; designing for them badly excludes a meaningful slice of your potential audience without anyone noticing internally.

Try the same component in different conditions:
Confirm Your Booking

You're about to reserve Seat 14B on the 8:45 AM departure. Total: $189.

Standard rendering: passes WCAG contrast, comfortable size, LTR layout. Most users handle this fine.
Figure v.1 The same component, four conditions. "Low contrast" fails WCAG AA for normal-vision users — and is unreadable for many low-vision users. "Tiny font" fails for ~30% of users over 50. "RTL" reveals every left-aligned assumption in the layout. All three are everyday conditions, not edge cases.
Practical Three tests that catch most of it. (1) Tab through your UI with the keyboard only. (2) Run an automated checker (axe, WAVE, Lighthouse) — it won't catch everything but it catches the embarrassing things. (3) If you're going international, prepare for RTL and pluralization rules early; retrofitting them is painful.
Part III

Trust, Law & Data

Compliance is not a barrier to building. It is a constraint that shapes the architecture — and one that compounds expensively when discovered late.
VI.

Compliance Is Architecture

Retrofit is the most expensive kind of refactor

"We'll add compliance later" is a sentence that has cost more engineering hours than any other in the modern era.

Regulatory frameworks are not paperwork. They have architectural consequences. GDPR's right to deletion implies you can find every copy of a user's data and remove it — including from backups, analytics warehouses, and any model trained on it. SOC 2 implies an audit trail of who accessed what and when. The EU AI Act's transparency requirements imply explainable decisions for high-risk systems. HIPAA implies encryption at rest, in transit, and in audit logs. None of these are bolt-on features. They are properties of the architecture from day one — or expensive surgeries on day five hundred.

Select your context above to see which frameworks apply and what they require.

Figure vi.1 A simplified picker for the most common frameworks. Real legal advice belongs to lawyers — but understanding which frameworks apply is an engineering responsibility, because they shape the architecture you ship. Note: simplified for illustration; actual obligations are nuanced.
Practical Three architectural primitives that cover most of it: a single, queryable record of where every piece of personal data lives; an immutable audit log of all access; and a policy-as-code layer that decides who can see what, evaluated at every read. Build these in week one, not month thirty.
VII.

The Data Lifecycle

Data is borrowed, not owned

Where did it come from? What consent covered it? How long can you keep it? When someone asks you to forget them — can you?

Data has a lifecycle, and every stage has obligations. Most teams handle "use" well — that is the engineering they enjoy. The other stages — collection, consent, retention, deletion — get less attention until a regulator, a journalist, or a deletion request makes them suddenly central. The right time to think about how you delete a user is before you collect them.

i.Collect
ii.Consent
iii.Use
iv.Retain
v.Delete
Figure vii.1 Click each stage. Most engineering teams optimize stage three. The other four are where most data incidents happen. "We don't have data, we have liability" is the polite description of skipping stage two.
Practical Tag every dataset with provenance, consent basis, retention period, and deletion procedure — the same way you tag code with license and author. If you can't fill in those four fields, you don't yet have a dataset, you have an exposure.
VIII.

Fairness & Failure Modes

Aggregate metrics hide subgroup failures

A model is only as fair as the worst subgroup it serves. The aggregate number is a comforting average that hides the people the system fails.

Performance distributes unevenly. A 95% accurate classifier can be 99% on the majority subgroup and 78% on a minority one. If the dataset under-represents that subgroup, the model never learned them well. If the failure mode is consequential — credit decisions, medical triage, hiring — the disparity becomes a harm. Aggregate metrics let you ship without seeing it. Subgroup metrics force you to look.

Reporting style:
Overall model accuracy: 94.5%  — "ships well in the demo"
Figure viii.1 Toggle to "by subgroup": the comforting 94.5% conceals a 78% failure on Group D — a population the system serves badly while the dashboard says everything is fine. This is how systems acquire the technically-correct-but-socially-disastrous label.
Practical Define subgroups that matter for your domain (age, gender, geography, device, language, customer tier — domain-specific). Report performance per subgroup as a default in every model card. Set per-subgroup minimums alongside the aggregate target. Run a structured FMEA — failure mode and effects analysis — before shipping any high-stakes decision system.
Part IV

The Long Tail

Most of a system's life happens after launch. The economics, the maintenance, and the eventual end deserve as much design as the launch itself.
IX.

The True Cost of Ownership

Compute is rarely the largest line

A system that costs $200 a month in compute and consumes ten engineer-hours a week in operational toil is not a $200 system. It is a $4,000 system with a misleading invoice.

The cost of a product includes everything it takes to keep running: compute, sure, but also on-call time, incident response, support tickets, postmortem time, the cognitive overhead of a complicated runbook, and the opportunity cost of the engineers tied to it. Most TCO calculations omit the human time, and most cost surprises come from precisely that omission. When in doubt, count the meetings.

Compute / mo
On-call / mo
Support / mo
Incidents / mo
Total monthly TCO
 
Figure ix.1 Push the on-call slider up. The compute bill barely moves. The total triples. This is the line item that quietly determines whether a product is sustainable — and the one most often invisible in the budget review.
Practical Track operational toil as a line item, not a footnote. A useful target: less than 50% of the team's time on operations (Google's classic SRE rule). When toil rises, the right answer is usually engineering investment in automation or simplification — not "work harder."
X.

Maintenance, Sunsetting & the Exit Ramp

Every system has an end. Plan for it.

A product without a sunset plan becomes a zombie — drifting on, owned by no one, costing money, blocking the next thing.

Software is not built once and forgotten. The team that built it rotates. The dependencies it relies on age out. The use case it was designed for evolves. Eventually — sometimes after eighteen months, sometimes after eighteen years — the most valuable thing you can do for the product is end it cleanly. The discipline of controlled sunsetting is what separates a mature product organization from a graveyard of half-living services.

Two timelines, side by side: the controlled sunset (announce, deprecate, migrate, end-of-life) and the unmanaged decay that happens when no one was assigned to do this work.

Figure x.1 The controlled timeline ends with a quiet, dignified shutdown that respects users. The unmanaged one ends with an emergency, an angry support thread, or a regulator. The work to get one outcome rather than the other is largely planning, written down, before the moment of decision.
A feature is not complete when it works in a demo. It is not complete when it works in production. It is complete when it has been used, maintained, and — eventually, gracefully — retired.
Practical For every production system, three artifacts that take a morning to write and save quarters of pain: a named owner in CODEOWNERS, a runbook with the top three failure modes and what to do, and a one-page deprecation policy ("if we sunset this, here is the timeline and the user communication"). Most teams write the third only after they need it. Most teams regret that.
Set in EB Garamond & Cardo.  ·  A reference monograph for engineers between durable and worth keeping.