Merge or Split a Module: Decompose the Domain by "Shared Lifecycle," Not by "Business Relatedness"

Authors
  • Jack Qin

Discussions of domain decomposition almost always focus on "how to split," but the genuinely hard part — and the genuinely easy-to-get-wrong part — is its inverse: when not to split, and when to merge two things together. And when judging "should we merge," the most common faulty basis is "are they business-related": is a tank alert related to tank configuration? It is, so merge them. Is a flow meter related to a tank? Also related, so merge that too? Follow the "business-related" line and you end up either kneading everything into one ball or drawing cross-module calls between any two "related" capabilities.

The problem is in the criterion itself. "Business-related" is too weak a signal — in a complete business system, almost any two capabilities are somewhat related. The criterion you should actually use is a different one: do they share the same aggregate root, the same lifecycle. Capabilities that share a lifecycle (different facets of the same entity) should be merged; otherwise, even if business-related, they should be split. This criterion is far more precise than "relatedness," because it points to an observable fact — are they changing the state of the same thing.

This post follows on from "The Modular Monolith Backend" (which covered "what enforces the boundaries") and turns instead to "how the boundaries themselves are drawn." Using the 11 backend modules of an environmental-monitoring platform as a worked example, it unpacks several judgments in a real decomposition: what to merge and what to split, why all reading-type data is immutable, why module collaboration splits into two paths, and why a "stateless" diagnostic module still counts as a module.


The Core Tension the Decomposition Resolves

This mining environmental-monitoring system has a wide business surface: dust particulate readings, water flow, tank levels, device calibration, GPS positioning, weather data, heatmaps, weekly reports, email, user permissions, and more. Carve out no modules and it congeals into a big ball of mud; carve too finely and you manufacture a heap of cross-module calls. That is the core tension of decomposition: too coarse and you lose isolation, too fine and you create a collaboration hell. The task is to find the granularity between the two.

The constraints were covered in the previous post: single-process deployment, a single PostgreSQL instance isolated by schema, cross-module calls only through .Contracts, and asynchronous collaboration through messages. There are 11 modules, each owning one schema (except one that is deliberately schema-less):

| # | Module | Schema | Responsibility |
| --- | --- | --- | --- |
| 1 | Monitoring | monitoring | Monitoring devices, PM readings, dust levels, site observations, scrape triggering (core domain, largest data volume) |
| 2 | FlowMeters | flow_meters | Water flow meter readings, water-usage tracking |
| 3 | TankManagement | tanks | Tank capacity, refills, corrections, dual-threshold alerts |
| 4 | Reporting | reporting | Weekly reports, AI-generated descriptions |
| 5 | Email | email | Email schedules, templates, senders, send history |
| 6 | Assets | assets | Device registration, calibration tracking, GPS positioning |
| 7 | Weather | weather | Weather data cache, station management |
| 8 | Geospatial | geospatial (PostGIS) | Heatmaps, geofencing, spatial queries |
| 9 | Identity | auth | Users, RBAC, groups, authentication |
| 10 | Settings | config | App settings, mine-site configuration, feature flags |
| 11 | HealthCheck | (no schema) | Cross-module device/sensor health diagnostics |

The Merge Criterion: Shared Lifecycle Merges, Otherwise Splits

The hardest part of module decomposition isn't "splitting" but "when to merge." This system has a few explicit merge decisions, each measurable by the "shared aggregate root/lifecycle" yardstick:

  • TankManagement merged four pieces: the early tank-alerts, tank-configuration, correction-management, and refill-management were merged into one module. The reason isn't "they all have to do with tanks" (that's the weak criterion) but that they share the same tank lifecycle — capacity configuration, level monitoring, refill records, correction records, dual-threshold alerts are essentially different facets of the one "tank" aggregate root. They all read and write the state of the same tank, and forcing them apart would only manufacture massive internal collaboration;
  • Assets merged asset management and asset positioning: both share the assets schema, with registration, calibration, and GPS positioning revolving around the same "asset" entity's lifecycle;
  • Monitoring stays the core domain and isn't subdivided further: it covers a lot (devices, readings, dust levels, site observations, road temperature data), but all of it revolves tightly around the one core "monitoring" aggregate.

Conversely, FlowMeters deliberately handles only flow readings — tanks, calibration, and refills go to their respective modules. The criterion is precisely lifecycle: a flow meter's data collection and a tank's level management are business-related, but have different lifecycles — one is a high-frequency reading stream (continuously appended facts), the other is discrete operational events (actions like refills and corrections). Different lifecycles, so split, even if they're business-adjacent.

Core idea: merge by "do they share the same aggregate root/lifecycle," not by "are they business-related." Related but lifecycle-independent: split. Related and lifecycle-shared: merge. The virtue of this yardstick is that it's restatable and arguable — "are they changing the state of the same aggregate root" is a verifiable fact, while "are they related" is just a feeling. Whether a decomposition decision can be restated and challenged determines whether it gets casually overturned in the next person's hands.
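The "different facets of one aggregate root" argument can be made concrete with a small sketch. This is an illustration under stated assumptions, not the platform's actual model: the member names (RecordRefill, ApplyCorrection, WarningLitres, CriticalLitres) and the clamping rules are hypothetical. What it is meant to show is that refills, corrections, and the dual-threshold alert all read or write the same LevelLitres state.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified aggregate root. All member names are illustrative assumptions.
public enum TankAlert { None, Warning, Critical }

public sealed class Tank
{
    private readonly List<decimal> _refills = new();
    private readonly List<decimal> _corrections = new();

    public Tank(decimal capacityLitres, decimal warningLitres, decimal criticalLitres)
    {
        if (criticalLitres > warningLitres || warningLitres > capacityLitres)
            throw new ArgumentException("Expected critical <= warning <= capacity.");
        CapacityLitres = capacityLitres;
        WarningLitres = warningLitres;
        CriticalLitres = criticalLitres;
    }

    public decimal CapacityLitres { get; }
    public decimal WarningLitres { get; }
    public decimal CriticalLitres { get; }
    public decimal LevelLitres { get; private set; }

    public IReadOnlyList<decimal> Refills => _refills;
    public IReadOnlyList<decimal> Corrections => _corrections;

    // Refill facet: changes the same level that the alert facet reads.
    public void RecordRefill(decimal litres)
    {
        if (litres <= 0) throw new ArgumentOutOfRangeException(nameof(litres));
        _refills.Add(litres);
        LevelLitres = Math.Min(CapacityLitres, LevelLitres + litres);
    }

    // Correction facet: a signed adjustment to the same level.
    public void ApplyCorrection(decimal deltaLitres)
    {
        _corrections.Add(deltaLitres);
        LevelLitres = Math.Clamp(LevelLitres + deltaLitres, 0m, CapacityLitres);
    }

    // Alert facet: dual thresholds evaluated against the shared state.
    public TankAlert CurrentAlert =>
        LevelLitres <= CriticalLitres ? TankAlert.Critical
        : LevelLitres <= WarningLitres ? TankAlert.Warning
        : TankAlert.None;
}
```

Had these facets lived in four modules, every method above would have become a cross-module call or an event round-trip around one shared number.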


Why Readings Must Be Immutable

The domain model has a consistent pattern: all "reading"-type data are immutable value objects.

  • Reading (PM reading), DustLevel (dust-level aggregate), FlowReading (flow reading), TankLevel (tank-level reading), WeatherObservation (weather observation), AssetLocation (GPS location), CalibrationRecord (calibration record) — all immutable.

Whereas aggregate roots (Site, Tank, Asset, FlowMeter, WeatherStation) are mutable, carrying configuration and state.

This distinction isn't a stylistic preference but corresponds to an ontological difference: a reading is "a fact collected in," a device is "an entity carrying configuration." Fact and configuration have fundamentally different semantics on "can it be changed" — a PM reading represents "at some moment, some sensor measured this value," an already-happened historical event, and changing it equals falsifying the historical record; a tank's capacity configuration represents "the current setting," which is meant to be updated by operations.

For a monitoring system that must do compliance auditing, this distinction is a hard requirement: a reading, once written, is an unalterable historical fact, and modeling it as an immutable value object locks "no one can change a historical reading" at the type level. The transferable insight: when a domain contains both "facts that have happened" and "current configuration/state," modeling the former as immutable and the latter as mutable often directly corresponds to the business's audit and correctness requirements — it's not technical fastidiousness.
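One way to express "locked at the type level" in C# is a record with init-only properties for the fact and an ordinary class with setters for the configuration. The sketch below assumes hypothetical property names (DeviceId, MeasuredAt, Pm10, CapacityLitres); the real types are not shown in this post.

```csharp
using System;

// Hypothetical shapes: property names are assumptions, not the real schema.
// A positional record exposes init-only properties, so a constructed reading cannot be reassigned.
public sealed record Reading(Guid DeviceId, DateTimeOffset MeasuredAt, decimal Pm10);

// An aggregate root carries current configuration, so its setter stays open.
public sealed class Tank
{
    public Guid Id { get; init; } = Guid.NewGuid();
    public decimal CapacityLitres { get; set; }
}

public static class Example
{
    public static (Reading Original, Reading Copy) Run()
    {
        var original = new Reading(Guid.NewGuid(), DateTimeOffset.UtcNow, 42.5m);
        // original.Pm10 = 0m;  // would not compile: the property is init-only

        // `with` builds a new value; the original reading is left untouched.
        var copy = original with { Pm10 = 40.0m };

        var tank = new Tank { CapacityLitres = 10_000m };
        tank.CapacityLitres = 12_000m; // configuration is expected to change
        return (original, copy);
    }
}
```

The compiler only guards the in-memory object. Keeping the stored rows unaltered is a separate concern for the persistence layer.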


Two Paths for Module Collaboration: Events vs. Cross-Schema Read-Only

Modules have real collaboration needs, but they take two different paths — and the basis for the split into two paths is whether the collaboration is a "write" or a "read."

Integration events (write-side collaboration) — when a state change has occurred and another module needs to respond, publish an event:

| Publishing module | Event | Who consumes / does what |
| --- | --- | --- |
| TankManagement | TankLevelCritical | Worker sends an alert email to the configured recipients |
| Assets | DeviceCalibrationExpired | Triggers the calibration-reminder flow |
| Monitoring | ScrapingCompleted | Geospatial triggers a heatmap recompute |
| Email | EmailSent / EmailFailed | Audit, follow-up processing |
| Identity | UserGroupMembershipChanged | Permission cache invalidation |

An event is "I changed, whoever cares can take it," and the publisher neither knows nor cares who's listening — that's the key to loose coupling.
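The "publisher doesn't know who's listening" property can be sketched with a tiny in-memory bus. This is a synchronous stand-in for illustration only, not the system's actual message transport: IIntegrationEvent, InProcessEventBus, and the payload fields of TankLevelCritical are all assumptions (only the event name comes from the table above).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the real message transport: synchronous and in-memory.
public interface IIntegrationEvent { }

// The event name is from the table above; the payload fields are assumptions.
public sealed record TankLevelCritical(Guid TankId, decimal LevelLitres) : IIntegrationEvent;

public sealed class InProcessEventBus
{
    private readonly Dictionary<Type, List<Action<IIntegrationEvent>>> _handlers = new();

    // Consumers register interest by event type.
    public void Subscribe<TEvent>(Action<TEvent> handler) where TEvent : IIntegrationEvent
    {
        if (!_handlers.TryGetValue(typeof(TEvent), out var list))
            _handlers[typeof(TEvent)] = list = new List<Action<IIntegrationEvent>>();
        list.Add(e => handler((TEvent)e));
    }

    // The publisher hands the event to the bus and never sees the subscriber list.
    public void Publish(IIntegrationEvent evt)
    {
        if (_handlers.TryGetValue(evt.GetType(), out var list))
            foreach (var handle in list) handle(evt);
    }
}
```

In this shape, TankManagement would only ever call Publish. Adding or removing an email alert handler touches the subscriber side and leaves the publishing module unchanged.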

Cross-schema read-only (read-side collaboration) — for purely reporting/diagnostic reads, directly Dapper-querying another module's tables is allowed:

  • Geospatial's heatmaps are derived from a cross-schema Dapper query over monitoring.readings;
  • The HealthCheck module is entirely cross-schema read-only, querying monitoring.*, flow_meters.*, assets.*, config.* for diagnostics.

The criterion is clear: read models OK, write coupling not. Module A may read Module B's tables to produce a report, but must never write Module B's tables — writes must go through events or commands.

Why the boundary is drawn at "read vs. write" rather than at "cross-schema or not" is worth getting clear on. What module isolation fundamentally guards against is coupling, and the damaging kind comes from writes: when A writes B's tables directly, A changes state that B owns and bypasses B's business rules, so B can no longer guarantee its own invariants. That is exactly what the module boundary exists to prevent. Reads create a much weaker dependency: A's query has to follow when B changes its schema, but A has not tampered with B's state or sidestepped B's rules. Cutting off reads as well mistakes a means (forbidding all cross-schema access) for the boundary's purpose (preventing write coupling), and the result is a pile of event round-trips that exist purely to shuffle data, with reporting needs squeezed into event hell by a "strict boundary." A precise boundary blocks what truly creates coupling, not every crossing.


HealthCheck: Why a "Stateless" Thing Still Counts as a Module

HealthCheck is special: it has no schema of its own, no domain model, owns no data, and is entirely cross-schema read-only diagnostic queries (flow meters over capacity, prolonged no-data, abnormal proportion of vehicles with zero readings, PM10 below the rolling average, etc.). Even its threshold configuration is owned by the Settings module.

Why does this still count as a module? Because it has a clear responsibility boundary (device/sensor health diagnostics) and an independent API surface. This forces a clarification of "the essence of a module": the essence of a module is "a cohesive set of capabilities," not "ownership of a set of data." Most modules happen to both cohere in capability and own data, so people easily mistake "owning data" for a necessary condition of a module. HealthCheck proves it isn't — it owns not a single row of data, yet its capability (diagnostics) is highly cohesive with a clear external surface. Gathering the diagnostic logic into one module is far clearer than scattering it across the data modules. To judge "should this be a standalone module," look at whether the capability is cohesive and the responsibility clear, not at whether it has its own tables.
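A module that owns no data can still have a crisp shape in code: rows and thresholds go in, findings come out. The sketch below illustrates one of the diagnostics mentioned above (prolonged no-data). The type names, the rule label, and the idea that the threshold arrives as a parameter supplied from Settings are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical read-model row, as it might come back from a cross-schema read-only query.
public sealed record LastReadingRow(Guid DeviceId, DateTimeOffset LastReadingAt);

public sealed record HealthFinding(Guid DeviceId, string Rule, TimeSpan SilentFor);

// No schema, no entities, no state: input rows and a threshold in, findings out.
public static class ProlongedNoDataCheck
{
    public static IReadOnlyList<HealthFinding> Run(
        IEnumerable<LastReadingRow> rows,
        TimeSpan maxSilence,      // assumed to be supplied by the Settings module
        DateTimeOffset now)
    {
        return rows
            .Select(r => new HealthFinding(r.DeviceId, "prolonged-no-data", now - r.LastReadingAt))
            .Where(f => f.SilentFor > maxSilence)
            .ToList();
    }
}
```

Passing `now` in explicitly keeps the check a pure function, which makes each diagnostic rule testable without a database or a clock.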


A Recurring Optimization Pattern: Pulling the Frontend's Multiple Queries Back to the Backend

Several modules' optimization records share a common pattern — pulling computation that the frontend used to round-trip multiple times back to a single backend computation:

  • TankManagement: the frontend originally had to fire 3N sequential queries to compute current levels (for N tanks, one query each for refills, corrections, and baseline); now the backend computes all levels in a single SQL query;
  • Monitoring: 6 legacy RPC calls merged into 4 REST endpoints;
  • Reporting: 3 separate chart-description generation functions merged into one async endpoint with a dataType parameter;
  • Email: variable-substitution logic moved from the frontend to the Worker — the frontend only configures the template, and the Worker queries the data source to resolve variables at send time.

Core idea: an N+1-style series of frontend round-trips is essentially a computation that should have been done once at the data layer leaking out to the client. The "leak" framing is useful: the frontend firing 3N queries to assemble a level isn't the frontend being dumb; the backend failed to reclaim "compute the level," a responsibility that should have been its own, so the frontend has to assemble the result through multiple round-trips. Once the backend reclaims it, the frontend's job shrinks from "orchestrating multiple queries" to "fetching one result," which is both faster and simpler. When you see the frontend doing a lot of orchestration, assembling one result across multiple requests, the cause is often not the frontend but a computation responsibility that leaked to the wrong layer.
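The tank-level case can be sketched as a single backend operation that takes all three inputs and returns finished levels. Two loud assumptions: the real system does this in one SQL query, whereas this sketch uses in-memory LINQ as a stand-in, and the arithmetic (baseline plus refills plus signed corrections) is a guessed formula, since the post does not state the actual one. All type names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row shapes. The arithmetic (baseline + refills + corrections) is an assumed formula.
public sealed record Baseline(Guid TankId, decimal Litres);
public sealed record Refill(Guid TankId, decimal Litres);
public sealed record Correction(Guid TankId, decimal DeltaLitres);

public static class TankLevels
{
    // One call for all tanks: the client receives a finished result
    // instead of stitching together 3N responses.
    public static IReadOnlyDictionary<Guid, decimal> Compute(
        IEnumerable<Baseline> baselines,
        IEnumerable<Refill> refills,
        IEnumerable<Correction> corrections)
    {
        var refillTotals = refills
            .GroupBy(r => r.TankId)
            .ToDictionary(g => g.Key, g => g.Sum(r => r.Litres));
        var correctionTotals = corrections
            .GroupBy(c => c.TankId)
            .ToDictionary(g => g.Key, g => g.Sum(c => c.DeltaLitres));

        // Tanks with no refills or corrections fall back to 0 for that term.
        return baselines.ToDictionary(
            b => b.TankId,
            b => b.Litres
                 + refillTotals.GetValueOrDefault(b.TankId)
                 + correctionTotals.GetValueOrDefault(b.TankId));
    }
}
```

The endpoint built on top of this returns one payload keyed by tank, so the frontend's orchestration code disappears rather than being optimized.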


Implementation Details

A Module's Standard Triple

Each module is organized by the same structure, which new modules fill in: the API surface (external HTTP endpoints), the domain model (aggregate roots + value objects), and events and jobs (the integration events published/consumed + Quartz scheduled jobs).

Feature Flags Pushed Down into Site Configuration

The Settings module's MineSite aggregate root carries a set of per-site feature flags:

public sealed class MineSite
{
    public string TimeZone { get; set; } = "Australia/Perth";  // authoritative timezone source
    public decimal? Latitude { get; set; }
    public decimal? Longitude { get; set; }

    // Per-site feature flags
    public bool? DustLevelEnabled { get; set; }
    public bool? FlowMeterEnabled { get; set; }
    public bool? HeatmapEnabled { get; set; }
    public bool? SensorEnabled { get; set; }
    public bool? AssetLocationEnabled { get; set; }
    public bool? GeofenceEnabled { get; set; }
}

Different sites enable different capabilities, controlled by this set of flags. MineSite.TimeZone is the authoritative source for system-wide site-level timezone computation — all timezone-related computation ultimately traces back here, never hardcoded elsewhere. This is again an application of "single source of truth": a thing like the timezone, which silently goes wrong the moment it's scattered, must have one authoritative origin.
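A hedged sketch of how callers might consume these two pieces. The MineSite here is a trimmed copy for self-containment, and the extension methods are hypothetical. In particular, treating an unset (null) flag as "disabled" is an assumption; the post does not define what null means. Resolving an IANA id such as "Australia/Perth" through TimeZoneInfo works on Linux and macOS, and on Windows it requires .NET 6 or later with ICU enabled.

```csharp
using System;

// Trimmed copy of the class above; only two members are needed for the sketch.
public sealed class MineSite
{
    public string TimeZone { get; set; } = "Australia/Perth";
    public bool? HeatmapEnabled { get; set; }
}

public static class MineSiteExtensions
{
    // Assumption: an unset (null) flag means "disabled". The source does not define null semantics.
    public static bool IsHeatmapEnabled(this MineSite site) => site.HeatmapEnabled ?? false;

    // All local-time conversion goes through the site's TimeZone, never a hardcoded id.
    public static DateTimeOffset ToSiteTime(this MineSite site, DateTimeOffset instant)
    {
        var zone = TimeZoneInfo.FindSystemTimeZoneById(site.TimeZone);
        return TimeZoneInfo.ConvertTime(instant, zone);
    }
}
```

Funnelling every conversion through one helper like ToSiteTime is what makes the "single source of truth" enforceable: a hardcoded zone id elsewhere becomes easy to spot in review.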

Configuration Ownership: Whoever Uses It Owns It

There's an ownership adjustment worth noting: the heatmap's rendering configuration (bbox bounds + zoom) was originally in the Settings module, and later moved to the Geospatial module. The reason is "whoever uses it owns it" — heatmap rendering is Geospatial's responsibility, so the related config should belong to Geospatial.

This adjustment guards against a common degeneration: a module like Settings easily becomes a "junk drawer where every config gets dumped." A junk drawer's problem is that it has no cohesion — the only thing its contents share is "they're all config," and "is it config" has nothing to do with the "cohesive capability" a module should have. Letting config follow "who uses it" finds each piece of config its truly cohesive home. The criterion: config should sit with the capability that uses it, not with other config — the latter classifies by "data type," the former by "responsibility," and modules should be split by responsibility.


Where It Applies: the Sweet Spot and Failure Points of This Granularity

What it buys is clear: boundaries drawn with a restatable basis (merge by shared lifecycle, split by independent lifecycle), clear collaboration paths (writes via events, reads may cross schema), immutable readings satisfying audit, consistent optimization direction (continually reclaiming N+1 to the backend). The easy pitfalls all correspond to one of the above criteria being violated:

  • Over-splitting manufactures internal collaboration hell: had TankManagement kept four separate modules, there'd be massive cross-module round-trips around the one tank — capabilities sharing an aggregate root should merge; don't split for the sake of "more modules";
  • The cross-schema read boundary must hold to "read-only": allow cross-schema writes and schema isolation exists in name only;
  • Config ownership must be clear, or Settings bloats into a junk drawer;
  • Not every module needs data: HealthCheck has no schema and no model, but with a clear responsibility it should stand alone.

When this granularity isn't a fit:

  • A narrow business domain: if the whole system is simple CRUD, 11 modules is over-engineering, and a handful — or no split at all — suffices;
  • A module truly needs independent scaling: currently all modules share one deployment; if Monitoring's data volume grows large enough to need independent scaling, take the "extract into a standalone service" path from "The Modular Monolith Backend." The .Contracts + events design is reserved precisely for this.

The sweet spot is: a broad business domain, clear domain boundaries, a small team, and the possibility of splitting out some modules in the future. The monitoring platform's domain is naturally suited to carving by "monitored object" (dust, water flow, tanks, assets, weather, geography), and the decomposition granularity closely matches the domain's structure.

The transferable layer: those seemingly subjective questions in domain decomposition — "merge or split," "does it count as a module," "who owns it" — actually all have criteria harder than "feels related": merge/split by shared aggregate root/lifecycle, module-or-not by whether the capability is cohesive, config ownership by who uses it. What these criteria share is that they all point to observable, arguable facts, not intuition. A decomposition decision can withstand the next person's hands only if it can be restated and challenged — a boundary that can't be restated gets casually overturned sooner or later.