Published on

Most Bugs Live at the Junction Between Layers: Two Pauses Before You Write Code

Authors
  • avatar
    Name
    Jack Qin
    Twitter

The earlier pieces were all about "how to write." This one is about something different: what to think about before you write. Because there's a large class of bug that isn't "the code is wrong" but "one step was skipped before starting" — they were settled before the first line was written, and no amount of careful implementation can save them.

This class of bug has two sources that look unrelated but share one cure. One is inconsistency from duplicated code; the other is format assumptions at layer boundaries. Neither can be prevented by "writing more carefully," because the problem isn't in any single line of code — it's in the relationships between code and code, between layer and layer. What blocks them are two pauses before you start: one to search for reuse, one to draw the data flow. This piece takes apart what each pause defends against and why it works.

Two classes of bug, one common thread

The first class is inconsistency from duplicated code. Copy-pasting a piece of validation logic into another file saves you five minutes at the time; three months later someone fixes a bug in the original, the copy doesn't budge, and behavior starts to diverge. The key insight here: the cost of copy-paste isn't at the moment of copying — it's at every future moment when only one of the copies gets changed. Duplication is the number-one source of inconsistency bugs because it turns "one fact" into "multiple copies that drift independently."

The second class is format assumptions at layer boundaries. The API returns format A but the frontend assumes B; the database stores X and the service drops a field converting it to Y; the same logic is implemented across multiple layers, each one differently. Behind this is a statistical fact: most bugs occur at the junction between layers, not inside a layer. Because inside a layer you can see all the context, while the two sides of a boundary are often written by different people at different times, and their assumptions about "what the data looks like" were never aligned.

The common thread of both classes: both can be blocked before you start, with a single "pause + question." So below I write these two pauses out as actionable steps.

Pause one: before writing new code, search whether it already exists

This pause defends against duplication. Its logic is simple — you can't stay consistent with code you don't know exists, so the first step is to go find it.

Step one: search first.

# search for similar function names
grep -r "functionName" .

# search for similar logic
grep -r "keyword" .

Step two: ask these questions.

QuestionIf yes...
Is there already a similar function?Use it or extend it
Has this pattern been used elsewhere?Follow the existing pattern
Could this become a shared utility?Create it in the right place
Am I copying code from another file?Stop — extract it into shared

The last row is the most important: copy-paste is a stop signal, not a shortcut. When your fingers are about to Ctrl+C a piece of logic, that's exactly the moment to stop and extract it into shared.

When to abstract and when not to — there's a symmetric trap here, and both sides are equally common:

  • Do abstract: the same code appears 3+ times; the logic is complex enough to harbor bugs; multiple people may need it.
  • Don't abstract: used only once; a trivial one-liner; the abstraction itself is more complex than the duplication.

Abstraction has a cost. DRY where it's worth being DRY, but don't build an abstraction more confusing than the duplication in a place used only once — "over-abstraction" and "unchecked duplication" are two sides of the same coin: both come from failing to honestly assess "how many times will this logic actually appear."

A sneaky pitfall: asymmetric mechanisms producing the same output. This is an advanced variant of the duplication problem, worth noting on its own. When two different mechanisms must produce the same set of files (e.g. init uses recursive directory copy, update uses manual files.set()), a structural change (rename, move, new subdirectory) only propagates through the automatic mechanism. The manual one quietly drifts — it's not copied code, but there's an implicit "must produce the same result" contract between it and the automatic mechanism, and nothing enforces that contract.

  • Symptom: init works perfectly, but update creates files at the wrong path, or simply misses files.
  • Prevention checklist: when migrating a directory structure, search out all code paths referencing the old structure; if one path is auto-derived (glob/copy) and another is manually enumerated, the manual one must be updated in sync; add a regression test comparing the output of both mechanisms.

After a bulk change, do the triple: look back — did everything get changed? search again — grep for anything missed. think again — is this worth abstracting now?

Pause two: before building a cross-layer feature, draw the data flow

This pause defends against format assumptions at boundaries. Its logic: since bugs concentrate at junctions, lay every junction out explicitly before you start.

Step one: draw the data flow.

source → transform → store → fetch → transform → display

For each arrow, ask three questions: what format is the data in? What could go wrong here? Who's responsible for validating it? Each arrow is a potential format-assumption point, and drawing it turns an implicit assumption into an explicit question.

Step two: identify the boundaries.

BoundaryCommon problems
API ↔ ServiceType mismatch, missing field
Service ↔ DatabaseFormat conversion, null handling
Backend ↔ FrontendSerialization, date format
Component ↔ ComponentProps shape change

Step three: define a contract for each boundary. Each boundary must answer: what's the exact input format? What's the exact output format? What errors can occur?

Three common cross-layer mistakes, each with its cure:

  • Implicit format assumption: assuming a date format without checking. → Cure: explicit format conversion at the boundary.
  • Scattered validation: the same thing validated across multiple layers. → Cure: validate once at the entry point, inner layers trust already-validated data.
  • Leaky abstraction: a component knows the database schema. → Cure: each layer knows only its neighbor.

That last one, "each layer knows only its neighbor," is the core rule of this pause. A component shouldn't know the database's column names, a service shouldn't know the frontend's rendering details — the moment one layer reaches past its neighbor to know a further layer, the boundary leaks, and any change at the far end punches straight through.

Counter-examples

// Counter-example 1: copying validation logic from another file instead of extracting it into shared
// fileA.ts
function isValidEmail(s: string) {
  return /.+@.+/.test(s)
}
// fileB.ts
function isValidEmail(s: string) {
  return /.+@.+/.test(s)
} // ❌ copy-paste, will diverge later
// Cure: extract into a shared utility, import in both places
// Counter-example 2: an implicit cross-layer format assumption
// the backend returns "2026-05-30T00:00:00Z" (UTC), the frontend treats it as local time directly
const day = new Date(dto.date).getDate() // ❌ one time-zone offset and you're a day off
// Cure: explicitly agree on and convert the date format at the boundary
// Counter-example 3: leaky abstraction, the component knows the database column names
function Row({ data }) {
  return <td>{data.asset_tbl_col_07}</td>; // ❌ the component shouldn't know the DB schema
}
// Cure: map the DTO to a semantic model at the boundary, the component knows only the model
// Counter-example 4: the same validation scattered across layers, each written its own way
// validated once at the entry point, validated again in the service (with slightly different rules) // ❌
// Cure: validate once at the entry point, inner layers trust already-validated data

Putting it into practice

  1. grep before you start: before writing any function that "looks generic," search whether it already exists; copy-paste is a stop signal, not a shortcut.
  2. Rule of three: abstract when the same logic appears a third time; don't build an abstraction for one use.
  3. Draw it when it crosses 3 layers: when a feature spans 3+ layers, involves multiple teams, has complex data formats, or has historically had bugs, draw the data flow and pin a contract at each arrow.
  4. Validate once, at the entry: don't let the same rule live separately across multiple layers.
  5. Each layer knows only its neighbor: a component shouldn't know the database schema; map the DTO to a semantic model at the boundary.
  6. Triple after a bulk change: look back → grep for misses → consider abstracting.

The transferable layer

On the surface, one pause is about reuse and the other about data flow, but they defend against the same kind of thing: relational bugs — not in any single line of code, but in the relationships between code and code, between layer and layer. Duplication is the relationship of "multiple copies that should stay consistent but nothing enforces it"; a boundary assumption is the relationship of "two sides whose understanding of the data shape should align but never did." This class of bug can't be prevented by "writing more carefully," because the problem isn't in the line you're writing at all.

The only effective approach is to spend a few dozen seconds before you start laying these relationships out explicitly: does this logic already exist? How many boundaries does this data cross, and what's the contract on each side of each one? These two pauses before writing code tend to block more bugs than all the care during writing put together.