The DDD Repo Myth: Realizing Domain Models at Scale

Somewhere around line 50,000, your meticulously crafted Domain-Driven Design repository starts to suffocate. Not because the business logic is wrong, but because you followed the book. You created that pristine src/domain folder, separated infrastructure concerns, and modeled aggregates that faithfully mirror your domain experts’ whiteboard diagrams. Then production traffic hit, and your rich domain model turned into a performance dumpster fire requiring 47 table joins to update a single status field. Developers studying the canonical blue book (Eric Evans’s Domain-Driven Design) often find themselves hunting for a reference implementation that doesn’t exist. No canonical 100k+ LOC repository showcases DDD by the book, because the book never accounted for your database connection pool limits.

The folder structure is the least interesting decision you’ll make. While you’re debating directory hierarchies, the real killer is hiding in your aggregate boundaries. Vaughn Vernon offered a more pragmatic path in his implementation guide, though some find the chapter sequencing leaves something to be desired. But neither book fully prepares you for the gap between clean architecture theory and practice when your monolith crosses six figures in lines of code.

The Fat Aggregate Trap

The most insidious DDD antipattern isn’t in your folder layout; it’s in your object graph. When modeling, we naturally think about what things contain before what they do. So you build the canonical Project aggregate:

public class Project
{
    public Guid Id { get; private set; }
    public string Name { get; private set; }

    private readonly List<ProjectTask> _tasks = new();
    private readonly List<TeamMember> _members = new();
    private readonly List<Document> _documents = new();

    // ... methods to manipulate everything
}

Looks clean. Follows the aggregate root concept perfectly. Then you write the repository:

public async Task<Project> GetByIdAsync(Guid id)
{
    return await _db.Projects
        .Include(p => p.Tasks)
        .Include(p => p.Members)
        .Include(p => p.Documents)
        // ... and probably five more tables
        .FirstOrDefaultAsync(p => p.Id == id);
}

Congratulations, you’ve built a God Aggregate. Every write operation (attaching a document, assigning a task, adding a comment) must load this entire object graph into memory. Database locks span half your schema. Concurrent updates deadlock. Your domain model is now a distributed systems stress test masquerading as business logic.
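To see why unrelated writes collide, here is a minimal in-memory sketch. It assumes optimistic concurrency with a single version number per aggregate, which is one common setup and not necessarily what your database does (pessimistic locks fail differently, but for the same reason). FatProject and InMemoryProjectStore are hypothetical names invented for the illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical fat aggregate: one version number guards the whole graph.
public class FatProject
{
    public Guid Id { get; }
    public int Version { get; private set; }
    public List<string> Tasks { get; } = new();
    public List<string> Documents { get; } = new();

    public FatProject(Guid id, int version = 0)
    {
        Id = id;
        Version = version;
    }

    // Each request works on its own copy of the entire graph.
    public FatProject Snapshot()
    {
        var copy = new FatProject(Id, Version);
        copy.Tasks.AddRange(Tasks);
        copy.Documents.AddRange(Documents);
        return copy;
    }

    public void BumpVersion() => Version++;
}

// Hypothetical store standing in for a table with a version column.
public class InMemoryProjectStore
{
    private readonly Dictionary<Guid, FatProject> _rows = new();

    public void Seed(FatProject project) => _rows[project.Id] = project.Snapshot();

    public FatProject Load(Guid id) => _rows[id].Snapshot();

    // Returns false when someone else saved first, even if the edits do not overlap.
    public bool TrySave(FatProject project)
    {
        if (_rows[project.Id].Version != project.Version)
            return false;

        project.BumpVersion();
        _rows[project.Id] = project.Snapshot();
        return true;
    }
}
```

If two requests load the same project, one adds a document and the other adds a task, the second TrySave returns false. Nothing about the two edits actually conflicts; they only share an aggregate.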

Lean Aggregates: Consistency Over Containment

The fix isn’t another folder reorganization; it’s surgery on your aggregate boundaries. The rule is simple but brutal: only data that must be consistent in the same transaction belongs together.

Consider the Document attachment rule. In the fat aggregate version, Project checks that its own Status is not Completed before allowing the attachment. But does attaching a document really require loading every task and team member? Probably not.

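public class Document
{
    public Guid Id { get; private set; }
    public Guid ProjectId { get; private set; }
    public string Name { get; private set; }

    // Private constructor: Attach is the only way to create a Document.
    private Document(Guid projectId, string name)
    {
        Id = Guid.NewGuid(); // assumption: ids are generated in the domain
        ProjectId = projectId;
        Name = name;
    }

    public static Document Attach(Guid projectId, string name)
    {
        if (string.IsNullOrWhiteSpace(name))
            throw new DomainException("Document name is required.");
        return new Document(projectId, name.Trim());
    }
}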

Now Document is its own aggregate. But wait, what about that business rule preventing attachments to completed projects? That spans two aggregates now, so it belongs in a domain service, not inside either aggregate:

public class DocumentDomainService
{
    public void EnsureCanAttach(Project project)
    {
        if (project.Status == ProjectStatus.Completed)
            throw new DomainException("Cannot attach documents to a completed project.");
    }
}

The application service orchestrates:

public async Task AttachDocument(Guid projectId, string fileName)
{
    var project = await _projectRepository.GetByIdAsync(projectId); // Light load
    _documentDomainService.EnsureCanAttach(project);

    var document = Document.Attach(projectId, fileName);
    await _documentRepository.AddAsync(document);
}

Suddenly your writes touch two small tables instead of ten. Concurrency contention drops. Your database stops crying. The modular monolith trend isn’t about microservices envy; it’s about realizing that monolithic codebases survive on these kinds of strict internal boundaries.
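How light the “light load” in the application service can get depends on what the rule needs. As a hedged sketch: if the attachment rule only reads the project’s status, the read can be a single-column projection instead of a full aggregate load. ProjectRow and ProjectStatusReader below are hypothetical names, and the IQueryable is backed by an in-memory list so the example runs on its own. With EF Core, the same Where/Select shape over a DbSet should normally translate to a query that selects only that column.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum ProjectStatus { Active, Completed }

// Hypothetical flat row; only Status matters for the attachment rule.
public record ProjectRow(Guid Id, string Name, ProjectStatus Status);

public class ProjectStatusReader
{
    private readonly IQueryable<ProjectRow> _projects;

    public ProjectStatusReader(IQueryable<ProjectRow> projects) => _projects = projects;

    // Projects one column; returns null when the project does not exist.
    public ProjectStatus? GetStatus(Guid id) =>
        _projects
            .Where(p => p.Id == id)
            .Select(p => (ProjectStatus?)p.Status)
            .FirstOrDefault();
}
```

This is an alternative to passing the whole Project into the domain service, not a replacement for it. If the rule later needs more than the status, load the lean Project aggregate instead.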

The Splitting Trap

But don’t reach for the refactoring tool just yet. There’s a wrong way to slim aggregates: splitting by table size rather than consistency boundaries.

Imagine you split ProjectTask and TaskAssignment into separate aggregates because the task table is getting big. But the business rule demands that a task cannot enter InProgress status without an assignee in the same transaction. If you split these, you now need distributed transactions or complex sagas to enforce what should be a simple invariant. You end up loading both aggregates anyway, constantly checking cross-aggregate conditions, and creating temporary invalid states that business users absolutely hate.
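Keeping the assignee inside the task aggregate makes that invariant a two-line check. The following is a minimal sketch, not a prescribed design: ProjectTaskStatus, AssignTo, Start, and Unassign are names assumed for illustration, and DomainException is defined here only so the snippet compiles on its own.

```csharp
using System;

public enum ProjectTaskStatus { Todo, InProgress, Done }

public class DomainException : Exception
{
    public DomainException(string message) : base(message) { }
}

public class ProjectTask
{
    public Guid Id { get; private set; }
    public ProjectTaskStatus Status { get; private set; } = ProjectTaskStatus.Todo;
    public Guid? AssigneeId { get; private set; }

    public ProjectTask(Guid id) => Id = id;

    public void AssignTo(Guid memberId) => AssigneeId = memberId;

    // The invariant lives next to the data it protects, so one save covers both.
    public void Start()
    {
        if (AssigneeId is null)
            throw new DomainException("A task needs an assignee before it can start.");
        Status = ProjectTaskStatus.InProgress;
    }

    // Removing the assignee has to respect the same rule from the other side.
    public void Unassign()
    {
        if (Status == ProjectTaskStatus.InProgress)
            throw new DomainException("An in-progress task must keep an assignee.");
        AssigneeId = null;
    }
}
```

Because status and assignee change on the same object, there is no window in which the task is InProgress without an owner, and no saga is needed to close one.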

The litmus test: if two things must always be valid together atomically, they belong in the same aggregate. Everything else can live apart and talk through inter-module communication, and that communication becomes your next debugging nightmare if you draw the line in the wrong place.

What 100k+ LOC Actually Looks Like

So where are those mythical reference implementations? They’re hiding in plain sight, but they don’t look like the textbook diagrams. Organizations that successfully scale DDD, like the teams maintaining decade-old monoliths that outlasted the microservices hype, don’t organize by layer. They organize by bounded context, with fierce internal consistency boundaries:

- Vertical slices over horizontal layers
- Package-private (internal, in C# terms) aggregates that expose only intention-revealing interfaces
- Domain services that handle cross-aggregate rules explicitly, not through lazy-loaded navigation properties
- Infrastructure concerns colocated with domain logic within modules, not banished to a separate infra folder
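In C#, the closest equivalent to package-private is internal, which hides a type from other assemblies. A rough sketch of what the list above can look like, assuming each bounded context is compiled as its own assembly: the Documents namespace, IDocumentsModule, and InMemoryDocumentStore are hypothetical names, and the in-memory store stands in for real persistence.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace Documents
{
    // The intention-revealing surface: the only contract other modules see.
    public interface IDocumentsModule
    {
        Guid Attach(Guid projectId, string name);
        int CountFor(Guid projectId);
    }

    // The aggregate itself is internal, so other assemblies cannot reach into it.
    internal sealed class Document
    {
        public Guid Id { get; }
        public Guid ProjectId { get; }
        public string Name { get; }

        private Document(Guid projectId, string name)
        {
            Id = Guid.NewGuid();
            ProjectId = projectId;
            Name = name;
        }

        public static Document Attach(Guid projectId, string name)
        {
            // ArgumentException keeps the sketch self-contained; a DomainException would fit too.
            if (string.IsNullOrWhiteSpace(name))
                throw new ArgumentException("Document name is required.", nameof(name));
            return new Document(projectId, name.Trim());
        }
    }

    // Infrastructure lives inside the module, next to the domain logic it serves.
    internal sealed class InMemoryDocumentStore
    {
        private readonly List<Document> _documents = new();

        public void Add(Document document) => _documents.Add(document);

        public int CountFor(Guid projectId) =>
            _documents.Count(d => d.ProjectId == projectId);
    }

    public sealed class DocumentsModule : IDocumentsModule
    {
        private readonly InMemoryDocumentStore _store = new();

        public Guid Attach(Guid projectId, string name)
        {
            var document = Document.Attach(projectId, name);
            _store.Add(document);
            return document.Id;
        }

        public int CountFor(Guid projectId) => _store.CountFor(projectId);
    }
}
```

Callers in other modules depend on IDocumentsModule and never see Document. Inside a single assembly, internal gives no such protection, so the boundary then rests on convention or on architecture tests.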

The reality is clear: longevity comes from behavioral cohesion, not architectural purity. Your repository structure should scream what the system does, not which pattern catalog you read.

The Pattern Catalog Problem

This obsession with correct DDD structure is a symptom of the same problem that critiques of software architecture pattern catalogs point at: we’ve been sold the idea that there are 2,000 distinct patterns to memorize when, in reality, most scalable systems are just five fundamental shapes wearing different trench coats. DDD repositories don’t fail because you put your entities in the wrong folder; they fail because you modeled your aggregates around data containment instead of transactional behavior.

When you hit that 100k-line cliff, you won’t be saved by moving files from src/domain to src/modules. You’ll be saved by realizing that Project doesn’t need to hold every Document in memory to enforce business rules.

Stop Looking for the Perfect Template

The threads asking for a typical DDD topology at scale miss the point. There isn’t one, because DDD isn’t a file layout; it’s a modeling discipline. The organizations shipping 100k+ LOC monoliths without performance catastrophes aren’t following a repository template; they’re ruthless about consistency boundaries.

Start with lean aggregates. Question every Include() in your repository. If you’re loading data to enforce a rule that spans aggregates, extract it to a domain service and accept the orchestration complexity. Trade that complexity for the ability to scale without sharding your database or rewriting into microservices out of desperation.

Your folder structure is a communication tool. Your aggregate boundaries are load-bearing walls. Get the latter right, and you can put the files wherever you want. Get them wrong, and no amount of directory nesting will save you from the God Aggregate apocalypse.