Modular Monoliths: The API Layer Decision That Haunts Your Architecture

You’re staring at a Controllers folder with 87 files. The UserController alone handles authentication, profile management, preferences, and admin operations. Your modular monolith, those beautifully separated domain modules in their neat folders, feels like a lie. The presentation layer has become a monolith within your modular monolith.

This is the moment that breaks the fantasy. You’ve done the hard work of carving out bounded contexts, defining module boundaries, and keeping domains clean. But the API layer? It’s a sprawling mess that turns every deployment into a game of “did we break the checkout flow again?” The question isn’t academic anymore: Should each module own its API surface, or does a centralized presentation layer make more sense?

The Two Architectural Paths

Patterns emerging across engineering teams point to two fundamentally different philosophies:

Option 1: Per-Module APIs

Each bounded context declares its own routes, controllers, DTOs, and middleware. An orders module contains /orders/routes/create.py, /orders/routes/fulfillment.py, etc. If you extract it later, you lift the entire folder and go.
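
As a rough sketch of what "each module owns its routes" can look like, the snippet below uses a tiny hypothetical Router class as a stand-in for a framework primitive such as a Flask Blueprint or a FastAPI APIRouter. The names Router, orders_router, create_order, build_app, and the /orders prefix are assumptions made for illustration; they do not come from any particular codebase.

```python
from typing import Callable, Dict, Tuple

Response = Tuple[int, dict]
Handler = Callable[[dict], Response]


class Router:
    """Hypothetical stand-in for a framework router (e.g. a Blueprint)."""

    def __init__(self, prefix: str) -> None:
        self.prefix = prefix
        self.routes: Dict[Tuple[str, str], Handler] = {}

    def route(self, method: str, path: str) -> Callable[[Handler], Handler]:
        def decorator(handler: Handler) -> Handler:
            # Register the handler under (HTTP method, full path).
            self.routes[(method, self.prefix + path)] = handler
            return handler
        return decorator


# --- would live in orders/routes/create.py ---
orders_router = Router(prefix="/orders")


@orders_router.route("POST", "")
def create_order(payload: dict) -> Response:
    # A real handler would delegate to code under orders/services/.
    return 201, {"order_id": 1, "items": payload.get("items", [])}


# --- would live in the application entry point ---
def build_app(*routers: Router) -> Dict[Tuple[str, str], Handler]:
    """Merge every module's routes into a single dispatch table."""
    table: Dict[Tuple[str, str], Handler] = {}
    for router in routers:
        table.update(router.routes)
    return table


app = build_app(orders_router)
status, body = app[("POST", "/orders")]({"items": ["book"]})
```

The entry point only knows that a router exists; everything about the orders HTTP surface stays inside the orders folder, which is what makes the "lift the folder and go" extraction plausible.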

Option 2: Centralized Orchestration

A single Web API project aggregates everything. Your routes/ folder mirrors your module structure (routes/orders.py, routes/payments.py), but all HTTP concerns live in one place. Modules remain pure, framework-agnostic logic.

The right choice depends on variables most architects don’t measure until it’s too late: team cognitive load, framework stability, and how close you are to service extraction.

When the Centralized Layer Becomes a Prison

A developer recently posted their project structure showing a single WebAPI project buckling under its own weight. The controllers had become overwhelming: too large, and too conceptually muddled to navigate. The hidden tax of the centralized approach starts with searchability and compounds from there:

Ownership blurs: Which team owns the UserController when it touches authentication (platform team), profile features (user team), and admin functions (ops team)?
Merge conflicts multiply: Every feature touching the same controller file creates coordination overhead.
Framework coupling deepens: Once controllers sprawl, web framework details tend to leak into the modules they call, making framework migrations painful.

One commenter described moving from this approach to per-module APIs: “It helped draw clear boundaries of ownership for teams, and improved searchability/readability quite a bit.” The difference is stark: finding an endpoint becomes pressing Cmd+P, typing orders/routes, and seeing exactly five files instead of hunting through a 2,000-line controller.

The Per-Module Promise (And Its Coupling Cost)

The Backend Engineering Adventures newsletter advocates for per-module routes when frameworks are “well-defined, established, unlikely to change.” The pattern is clean:

orders/
├── models/
├── services/
├── routes/
│   ├── create.py
│   ├── fulfillment.py
│   └── returns.py
└── main.py

This approach shines when:

A module crosses the 10+ endpoint threshold, the complexity tipping point where extraction becomes thinkable
Multiple teams own different modules
You need module-specific middleware (different auth for internal vs external APIs)
Extraction is on a 3-6 month horizon
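
To make the module-specific middleware point concrete, here is a minimal sketch in which each module wraps its own handlers with its own auth policy. It is framework-free on purpose; require_header, the two header names, and both handlers are hypothetical names chosen for illustration, and a real framework would supply its own middleware hooks.

```python
from functools import wraps
from typing import Callable, Tuple

Response = Tuple[int, dict]
Handler = Callable[[dict], Response]


def require_header(name: str) -> Callable[[Handler], Handler]:
    """Build a middleware that rejects requests missing the given header."""
    def middleware(handler: Handler) -> Handler:
        @wraps(handler)
        def wrapped(request: dict) -> Response:
            if name not in request.get("headers", {}):
                return 401, {"error": f"missing {name}"}
            return handler(request)
        return wrapped
    return middleware


# Each module picks its own policy (both header names are assumptions).
internal_auth = require_header("X-Service-Token")  # internal-only module
external_auth = require_header("Authorization")    # public-facing module


@internal_auth
def list_fulfillment_queue(request: dict) -> Response:
    return 200, {"queue": []}


@external_auth
def get_order(request: dict) -> Response:
    return 200, {"order_id": request["params"]["id"]}
```

With a centralized layer, both policies would have to be configured in one shared place; here each module carries its policy next to the routes it protects.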

But there’s a cost. Each module now imports Flask, FastAPI, or Express. Your payments module might use decorators that only make sense in a web context. When you extract it, you’re not just moving business logic; you’re untangling framework dependencies.

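Worse, you risk circular dependencies. The newsletter’s warning is blunt: routes imports main - ❌ NO! Imports should flow one way, from the entry point down to the routes and never back up. Reverse that direction and you get testing nightmares and refactoring that feels like defusing a bomb.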
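
One way to keep that import direction honest is a small check that can run in CI. The sketch below uses Python's standard ast module to flag forbidden imports in a source string. The function name forbidden_imports, the module name orders.main, and the sample file contents are assumptions for illustration; relative imports are deliberately not handled.

```python
import ast
from typing import List, Set


def forbidden_imports(source: str, forbidden: Set[str]) -> List[str]:
    """Return every imported module name that matches a forbidden module."""
    hits: List[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names = [node.module]
        else:
            continue  # not an import, or a relative import (unchecked here)
        for name in names:
            # Match the forbidden module itself or any of its submodules.
            if any(name == f or name.startswith(f + ".") for f in forbidden):
                hits.append(name)
    return hits


# Hypothetical contents of two route files.
good_route = "from orders.services import place_order\n"
bad_route = "from orders.main import app\nimport orders.main\n"

print(forbidden_imports(good_route, {"orders.main"}))
print(forbidden_imports(bad_route, {"orders.main"}))
```

In practice you would read each file under a module's routes/ folder and fail the build when the returned list is non-empty.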

The Team-Size Inflection Point

Here’s where architecture meets sociology. A small team of 3-5 engineers can maintain a centralized API layer through shared mental models and constant communication. The cognitive load is manageable because everyone knows the full surface area.

But at 15+ engineers split across 3+ teams, the centralized layer becomes a coordination tax. One team can’t ship because another team’s PR is blocking the controller file. Code reviews require expertise across multiple domains. The layer that was supposed to simplify has become your most expensive meeting generator.

A commenter nailed this: “size of the team, how team members interact, technical knowledge of the team.” These are the real deciding factors. If your team is growing fast, per-module APIs aren’t just better, they’re survival.

The Framework-Free Module Compromise

There’s a hybrid approach that trades purity for flexibility: keep modules framework-free, but centralize the API layer initially.

my-python-api/
├── orders/           # No web imports here
│   ├── models/
│   └── services/
├── routes/           # All web concerns here
│   └── orders.py
└── main.py

This works brilliantly when:

Technical decisions are still volatile
You’re prototyping and might switch from REST to gRPC
You want to test module boundaries before committing to per-module APIs

The services layer returns plain objects. The routes layer handles HTTP status codes, serialization, and framework-specific concerns. When (if) you migrate to per-module APIs, you’re mostly moving files, not rewriting logic.
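
A minimal sketch of that split, using only the standard library: the service function returns a plain dataclass (or None), and the route function is the only code that knows about status codes and JSON. Order, find_order, get_order_route, and the in-memory _ORDERS store are hypothetical names invented for this example.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, Optional, Tuple


# --- orders/services: no web imports, returns plain objects ---
@dataclass
class Order:
    order_id: int
    status: str


# In-memory stand-in for whatever storage the module really uses.
_ORDERS: Dict[int, Order] = {1: Order(order_id=1, status="shipped")}


def find_order(order_id: int) -> Optional[Order]:
    """Return the order, or None when it does not exist."""
    return _ORDERS.get(order_id)


# --- routes/orders.py: the only place that knows about HTTP ---
def get_order_route(order_id: int) -> Tuple[int, str]:
    """Translate the service result into a status code and a JSON body."""
    order = find_order(order_id)
    if order is None:
        return 404, json.dumps({"error": "order not found"})
    return 200, json.dumps(asdict(order))
```

Because find_order never mentions HTTP, the same function could sit behind a gRPC handler or a CLI command without modification; only get_order_route would need replacing.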

The catch? You still have that centralized chokepoint. The newsletter author admits this is temporary: “I normally start with this approach and then eventually move to (1) if necessary.” It’s scaffolding, not a destination.

The Boundary Clarity Test

The DEV Community article on boundaries provides the ultimate decision framework. A strong boundary must:

Own a specific business capability (not just CRUD entities)
Control its data evolution exclusively
Publish clear contracts and events
Handle core decisions without constant sync calls

Apply this to your API layer decision. If your orders module truly owns fulfillment decisions, why does it share a controller file with payments? The boundary is leaking.

Per-module APIs enforce this discipline. When you open orders/routes/, you see exactly what capabilities orders expose. The cognitive model matches the business model. When you need to change how returns work, you know exactly where to look and who to talk to.

The Extraction Threshold

The most pragmatic advice across these sources: split when the pain of not splitting exceeds the pain of splitting. Concrete triggers include:

10+ endpoints in a module (complexity threshold)
Multiple teams actively developing different modules
Unique middleware/auth requirements emerging
3-6 month timeline for potential extraction

Before these triggers, a centralized layer might be “good enough.” After them, it’s technical debt accruing interest daily.

One team described their modular monolith as “different varieties of modular monoliths that handle a piece of that areas logic” but admitted the backend API became “spaghetti because it started as a lift and shift of a legacy portal.” The lift-and-shift origin story is common: centralized APIs feel natural when you’re porting a monolith. But they’re a transitional state, not a target architecture.

Making the Call

There’s no universal answer, but there is a universal process:

Start centralized if you have fewer than 5 engineers and each module has fewer than 5 entities
Measure the cognitive load: How long does it take a new engineer to find and modify an endpoint?
Track cross-team coordination: Are API layer changes requiring multi-team approvals?
Watch for middleware divergence: Do modules need different auth, rate limiting, or serialization?
Plan the migration: If you hit 2+ of the extraction triggers, schedule the split
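
The "2+ triggers" rule from the process above can be sketched as a small checklist function. The field names and the reading of "3-6 month timeline" as "extraction planned within 6 months" are assumptions made for this example, not thresholds taken from any of the sources.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModuleSignals:
    endpoint_count: int
    multiple_teams: bool                    # several teams actively developing modules
    needs_custom_middleware: bool           # unique auth / rate limiting / serialization
    months_to_extraction: Optional[float]   # None means no extraction is planned


def triggers_hit(signals: ModuleSignals) -> int:
    """Count how many of the four extraction triggers currently apply."""
    planned_soon = (
        signals.months_to_extraction is not None
        and signals.months_to_extraction <= 6  # assumed cutoff
    )
    return sum([
        signals.endpoint_count >= 10,
        signals.multiple_teams,
        signals.needs_custom_middleware,
        planned_soon,
    ])


def should_schedule_split(signals: ModuleSignals) -> bool:
    """Apply the rule of thumb: two or more triggers means plan the migration."""
    return triggers_hit(signals) >= 2


orders = ModuleSignals(endpoint_count=12, multiple_teams=True,
                       needs_custom_middleware=False, months_to_extraction=None)
print(should_schedule_split(orders))
```

The value of writing it down is less the code than the habit: the signals get reviewed on a schedule instead of being noticed only when the controller file is already unmanageable.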

The modular monolith is a journey, not a destination. The API layer decision is the fork in the road where you choose between team autonomy and architectural simplicity. Choose wrong, and you’ll spend months unpicking dependencies. Choose right, and your modules stay truly modular, even as they grow.

The controllers folder with 87 files is telling you something. Listen.

Where do you land?

Are you team “per-module APIs” or “centralize until extraction”? The debate is far from settled, and the right answer depends on your team’s context more than any architectural purity test.