PostgreSQL as Your Entire Data Stack: The Multi-Tool Orthodoxy Is Crumbling

There is a deep cultural reflex in modern engineering: whenever a problem appears, reach for a packaged solution instead of thinking from first principles. The result is architectural cargo culting and lots of missed opportunities.

“We need an audit trail, let’s use a temporal database.” “Our application is slow, let’s cache that using an in-memory key-value store.” “Analytics are too slow, let’s spin up a data warehouse.”

Sound familiar? This is the multi-tool orthodoxy that has dominated data engineering for the last decade. But a growing counter-movement is asking an uncomfortable question: How many elements does a data-centric stack really need?

The answer, increasingly, is “one.” And that one is PostgreSQL.

The Case for Simplification

Eduardo Bellani’s recent blog post, “All you need is PostgreSQL,” makes a provocative argument that has been circulating through data engineering circles: the default assumption for your data problems should be that your company can do fine with just PostgreSQL. The post, which gained significant traction on Reddit’s r/dataengineering, challenges the deeply ingrained cultural reflex in modern engineering to reach for a packaged solution whenever a problem appears.

The result of this reflex? Architectural cargo culting. Companies stack product on top of product on top of PostgreSQL, inflating the number of moving parts, operational risk, headcount demand, and overall system entropy. Complexity grows not because the problems demand it, but because someone reached for a tool they saw in a conference talk.

This isn’t just a theoretical argument. Eduardo Bellani’s detailed walkthrough demonstrates how PostgreSQL 18, with standard extensions available on RDS, can handle auditing, write throughput, transactional queries, analytical queries, and application decoupling, all the things teams typically reach for separate tools to solve.

The Five Objections to PostgreSQL-Only

Bellani identifies five common arguments teams make for reaching beyond PostgreSQL:

“I’ll need auditing and reconstructing state”
“Write throughput is too low”
“The transactional queries are too slow”
“The analytical queries are too slow”
“My app will be coupled to the Database”

Each of these objections gets systematically dismantled with production-ready PostgreSQL 18 code. Let’s walk through the most compelling parts.

Auditing Without an Event Store

The auditing argument is one of the most common justifications for introducing event sourcing infrastructure. The thinking goes: “We need to track every state change, so we need a dedicated event store or temporal database.”

Bellani’s response is elegant: system-time temporal tables. Using the temporal_tables extension, PostgreSQL automatically writes old versions of each row to a history table on every UPDATE or DELETE. Combined with a tstzrange period, you get a complete audit trail suitable for AS OF queries and state reconstruction.

-- Add system-time period column to transfer table
alter table finance.transfer add column if not exists sys_period tstzrange default tstzrange(current_timestamp, null);
alter table finance.transfer alter column sys_period set not null;

-- Focused history table - only what we need for status transitions
create table if not exists finance.transfer_status_log (
  like finance.transfer
);

-- Use temporal_tables versioning procedure
create trigger transfer_save_status_history
  before insert or update or delete
  on finance.transfer
  for each row
  execute procedure versioning('sys_period', 'finance.transfer_status_log', true);

No event store. No append-only log infrastructure. Just a history table maintained by a trigger. The same pattern handles account auditing, transfer state machine history, and any other temporal tracking requirement.

The OLAP Objection: Where PostgreSQL Gets Interesting

The most common pushback to the “all you need is PostgreSQL” argument comes from the analytical side. As one Reddit commenter put it, PostgreSQL “really isn’t ideal for OLAP.” And they’re not wrong, if you’re scanning the full transaction history to compute a balance, a row-oriented database will struggle.

But the solution isn’t to reach for a columnar engine. It’s to think differently about the problem.

Bellani’s approach uses a balance ledger, a table that stores running balance snapshots after every transaction, maintained incrementally via triggers. This shifts work from read time to write time, turning what would be a full-table aggregation into a single index-backed lookup.

create table finance.balance_ledger (
  routing_number  finance.routing_number not null,
  account_number  finance.account_number not null,
  as_of           timestamptz not null,
  current_total   bigint not null,
  available_total bigint not null,
  primary key (routing_number, account_number, as_of)
);

The triggers that maintain this ledger are straightforward. When a settled transaction is inserted, both the current and available totals change. When a pending transaction is inserted, only the available total changes. Each trigger fetches the most recent ledger row for the account and appends a new snapshot.

This is the same principle behind incremental view maintenance: the cost of keeping derived data up to date is borne by the process changing the base data. The trade-off is acceptable because transactions are append-only, and per account, concurrent writes are rare.

The Benchmark That Silences Skeptics

The real test of any architecture is whether it works under load. Bellani ran a pgbench benchmark simulating a startup scenario: 10,000 transfers per day with an 80/20 read/write split, exercising the full write and read paths through the updatable view.

The results are striking:

Metric	Value
Overall TPS	766
Failed transactions	0 (0.000%)
Average latency	13.044 ms
Read latency (avg)	8.080 ms
Write latency (avg)	32.915 ms
Read transactions	36,793 (80%)
Write transactions	9,191 (20%)

The startup target of 10,000 transfers per day is roughly 0.12 TPS. The benchmark exceeded that by over 6,000x on a development laptop. On an RDS instance sized for the working set, performance would be significantly better.

The Capacity Planning That Makes It Work

The key insight that makes PostgreSQL viable for this workload is understanding the working set. Bellani’s analysis shows that for a mid-to-large bank processing 50 million transfers per day, the working set for 2 days of data is approximately 110GB. AWS RDS supports up to 4TiB of memory per instance for PostgreSQL-compatible instances.

Scale	Transfers/day	Working set size (2 days)
Startup	10,000	22M
Mid/large bank	50,000,000	110G
Global processor	900,000,000	1.9T

The working set estimation is based on careful analysis: each transfer row is roughly 160 bytes, with log tables at the same width and transaction tables slightly narrower at around 120 bytes. With index overhead factored in, the numbers hold up.

HOT Updates: The Write Throughput Hack Nobody Talks About

One of the most clever optimizations in this design is the use of Heap-Only Tuple (HOT) updates. By keeping the primary key aligned with immutable columns and ensuring that status and transfer_period don’t participate in any indexes, PostgreSQL can write new tuple versions to the same page and skip index updates entirely.

alter table finance.transfer set (fillfactor = 70);

This reserves 30% free space per page, allowing updated tuples to fit on the same page. The benefits are substantial:
– 2-3x faster updates compared to non-HOT updates
– No table and index bloat from status changes
– Simpler vacuum maintenance
– Smaller WAL, since there is less write activity overall

This is the kind of optimization that only becomes obvious when you’re thinking from first principles rather than reaching for a caching layer.

The Decoupling Argument: Views as Interfaces

One of the most persistent objections to a PostgreSQL-centric architecture is the fear of coupling. “If my app talks directly to the database, I can’t change the schema without breaking everything.”

Bellani’s response is to use views as the canonical way to implement modularity in SQL DBMSes. As Codd originally envisioned, views provide logical data independence: application programs remain unaffected when the internal representation of data changes.

The finance.transfer_activity view serves as a single unified stream through which applications read, insert, and update data. If the underlying table structure changes, the view definition is updated once, and every application continues to work unchanged.

This is decoupling achieved at the data level, with no network hops, no serialization overhead, and no distributed consistency problems. The view is the interface, and the base tables are the implementation.

The Benchmark That Matters

The benchmark results tell the real story. On a development laptop, running 10 concurrent clients with an 80/20 read/write split:

766 TPS overall with 0 failed transactions
8ms average read latency for the UNION ALL view over 4 tables plus balance ledger lookup
33ms average write latency for transactions exercising the full constraint set
0 serialization failures despite 10 concurrent clients under SERIALIZABLE isolation

The startup target of 10,000 transfers per day is roughly 0.12 TPS. The benchmark exceeded that by over 6,000x. This isn’t theoretical, it’s a working system on modest hardware.

Where PostgreSQL Hits Its Limits

The honest answer is that PostgreSQL isn’t ideal for everything. As one Reddit commenter noted, “the OLAP part is pretty weak, as expected, it is not designed for this.” The title shouldn’t be read as “All you need for data engineering is PostgreSQL.”

For massive analytical workloads scanning billions of rows, columnar engines like ClickHouse will outperform PostgreSQL by orders of magnitude. The PostgreSQL vs. columnar engines for analytics debate is real, and for good reason.

But the question isn’t whether PostgreSQL can replace every specialized tool. The question is whether the default assumption should be that PostgreSQL is sufficient, and whether the burden of proof should be on introducing additional infrastructure.

The Industry Is Moving Toward Convergence

The timing of this argument is no coincidence. The industry is in the midst of a major convergence trend. Databricks recently unveiled LTAP, EDB introduced converged analytics, and pgEdge launched ColdFront, all aiming to collapse the OLTP/OLAP divide.

What’s interesting is that these approaches all make different tradeoffs. Databricks asks enterprises to adopt a proprietary lakehouse as the operational center of gravity. Snowflake’s pg_lake requires applications to distinguish between PostgreSQL and analytical tables. EDB still requires archived data to be brought back into active PostgreSQL before it can be modified.

ColdFront, by contrast, treats Iceberg only as a transparent storage tier behind PostgreSQL, automatically moving older data out of the database while keeping applications on the same tables and SQL. This approach keeps PostgreSQL as the primary interface, which is exactly the philosophy Bellani advocates.

The DuckDB Dependency

One fascinating subplot in this convergence story is the increasing dependence on DuckDB. ColdFront uses DuckDB to execute queries against data stored in Iceberg. Snowflake’s pg_lake routes Iceberg queries through pgduck_server. Databricks’ Lakebase also relies on DuckDB internally.

This creates what analysts describe as a concentration risk: if DuckDB faces licensing changes, security vulnerabilities, or performance bottlenecks, the impact would ripple across multiple products simultaneously. It’s a reminder that simplification at one layer can create hidden dependencies at another.

The Verdict: How Far Can PostgreSQL Go?

The answer depends on your scale and your workload. For startups and mid-market companies processing up to 50 million transfers per day, the working set fits comfortably in memory on modern cloud instances. The benchmark shows 766 TPS with 0 failures on a development laptop, that’s 6,000x headroom over the startup target.

For global processors like Visa, handling 900 million transactions per day, the working set balloons to 1.9TB. That’s still within reach of AWS RDS instances supporting up to 4TiB of memory, but the operational complexity increases.

The real question isn’t whether PostgreSQL can replace every specialized tool. It’s whether the default assumption should be that PostgreSQL is sufficient, and whether the burden of proof should be on introducing additional infrastructure.

The Verdict

The multi-tool orthodoxy is crumbling under its own weight. Every new tool in the stack adds operational risk, headcount demand, and system entropy. The argument for PostgreSQL as a data-centric stack isn’t about technological purity, it’s about recognizing that most teams don’t need the complexity they’re reaching for.

As Bellani concludes: “All of this runs on a single vanilla PostgreSQL 18 instance with standard extensions available on RDS. No distributed-systems cosplay, no infrastructure proliferation, no operational complexity tax.”

The next time your team reaches for a specialized tool, ask yourself: have you actually pushed PostgreSQL to its limits, or are you just following the conference talk playbook? The answer might save you years of operational pain.

This post was inspired by Eduardo Bellani’s excellent deep dive and the subsequent discussion on r/dataengineering. For more on PostgreSQL’s growing role in the data stack, check out our analysis of PostgreSQL’s growing dominance in the data stack and the broader debate on simplifying vs. expanding the data stack.