There is a deep cultural reflex in modern engineering: whenever a problem appears, reach for a packaged solution instead of thinking from first principles. The result is architectural cargo culting and lots of missed opportunities.
“We need an audit trail, let’s use a temporal database.” “Our application is slow, let’s cache that using an in-memory key-value store.” “Analytics are too slow, let’s spin up a data warehouse.”
Sound familiar? This is the multi-tool orthodoxy that has dominated data engineering for the last decade. But a growing counter-movement is asking an uncomfortable question: How many elements does a data-centric stack really need?
The answer, increasingly, is “one.” And that one is PostgreSQL.
The Case for Simplification
Eduardo Bellani’s recent blog post, “All you need is PostgreSQL,” makes a provocative argument that has been circulating through data engineering circles: the default assumption for your data problems should be that your company can do fine with just PostgreSQL. The post, which gained significant traction on Reddit’s r/dataengineering, challenges the deeply ingrained cultural reflex in modern engineering to reach for a packaged solution whenever a problem appears.
The result of this reflex? Architectural cargo culting. Companies stack product on top of product on top of PostgreSQL, inflating the number of moving parts, operational risk, headcount demand, and overall system entropy. Complexity grows not because the problems demand it, but because someone reached for a tool they saw in a conference talk.
This isn’t just a theoretical argument. Eduardo Bellani’s detailed walkthrough demonstrates how PostgreSQL 18, with standard extensions available on RDS, can handle auditing, write throughput, transactional queries, analytical queries, and application decoupling, all the things teams typically reach for separate tools to solve.
The Five Objections to PostgreSQL-Only
Bellani identifies five common arguments teams make for reaching beyond PostgreSQL:
- “I’ll need auditing and reconstructing state”
- “Write throughput is too low”
- “The transactional queries are too slow”
- “The analytical queries are too slow”
- “My app will be coupled to the Database”
Each of these objections gets systematically dismantled with production-ready PostgreSQL 18 code. Let’s walk through the most compelling parts.
Auditing Without an Event Store
The auditing argument is one of the most common justifications for introducing event sourcing infrastructure. The thinking goes: “We need to track every state change, so we need a dedicated event store or temporal database.”
Bellani’s response is elegant: system-time temporal tables. Using the temporal_tables extension, PostgreSQL automatically writes old versions of each row to a history table on every UPDATE or DELETE. Combined with a tstzrange period, you get a complete audit trail suitable for AS OF queries and state reconstruction.
-- Add system-time period column to transfer table
alter table finance.transfer add column if not exists sys_period tstzrange default tstzrange(current_timestamp, null);
alter table finance.transfer alter column sys_period set not null;
-- Focused history table - only what we need for status transitions
create table if not exists finance.transfer_status_log (
like finance.transfer
);
-- Use temporal_tables versioning procedure
create trigger transfer_save_status_history
before insert or update or delete
on finance.transfer
for each row
execute procedure versioning('sys_period', 'finance.transfer_status_log', true);
No event store. No append-only log infrastructure. Just a history table maintained by a trigger. The same pattern handles account auditing, transfer state machine history, and any other temporal tracking requirement.
The OLAP Objection: Where PostgreSQL Gets Interesting
The most common pushback to the “all you need is PostgreSQL” argument comes from the analytical side. As one Reddit commenter put it, PostgreSQL “really isn’t ideal for OLAP.” And they’re not wrong, if you’re scanning the full transaction history to compute a balance, a row-oriented database will struggle.
But the solution isn’t to reach for a columnar engine. It’s to think differently about the problem.
Bellani’s approach uses a balance ledger, a table that stores running balance snapshots after every transaction, maintained incrementally via triggers. This shifts work from read time to write time, turning what would be a full-table aggregation into a single index-backed lookup.
create table finance.balance_ledger (
routing_number finance.routing_number not null,
account_number finance.account_number not null,
as_of timestamptz not null,
current_total bigint not null,
available_total bigint not null,
primary key (routing_number, account_number, as_of)
);
The triggers that maintain this ledger are straightforward. When a settled transaction is inserted, both the current and available totals change. When a pending transaction is inserted, only the available total changes. Each trigger fetches the most recent ledger row for the account and appends a new snapshot.
This is the same principle behind incremental view maintenance: the cost of keeping derived data up to date is borne by the process changing the base data. The trade-off is acceptable because transactions are append-only, and per account, concurrent writes are rare.
The Benchmark That Silences Skeptics
The real test of any architecture is whether it works under load. Bellani ran a pgbench benchmark simulating a startup scenario: 10,000 transfers per day with an 80/20 read/write split, exercising the full write and read paths through the updatable view.
The results are striking:
| Metric | Value |
|---|---|
| Overall TPS | 766 |
| Failed transactions | 0 (0.000%) |
| Average latency | 13.044 ms |
| Read latency (avg) | 8.080 ms |
| Write latency (avg) | 32.915 ms |
| Read transactions | 36,793 (80%) |
| Write transactions | 9,191 (20%) |
The startup target of 10,000 transfers per day is roughly 0.12 TPS. The benchmark exceeded that by over 6,000x on a development laptop. On an RDS instance sized for the working set, performance would be significantly better.
The Capacity Planning That Makes It Work
The key insight that makes PostgreSQL viable for this workload is understanding the working set. Bellani’s analysis shows that for a mid-to-large bank processing 50 million transfers per day, the working set for 2 days of data is approximately 110GB. AWS RDS supports up to 4TiB of memory per instance for PostgreSQL-compatible instances.
| Scale | Transfers/day | Working set size (2 days) |
|---|---|---|
| Startup | 10,000 | 22M |
| Mid/large bank | 50,000,000 | 110G |
| Global processor | 900,000,000 | 1.9T |
The working set estimation is based on careful analysis: each transfer row is roughly 160 bytes, with log tables at the same width and transaction tables slightly narrower at around 120 bytes. With index overhead factored in, the numbers hold up.
HOT Updates: The Write Throughput Hack Nobody Talks About
One of the most clever optimizations in this design is the use of Heap-Only Tuple (HOT) updates. By keeping the primary key aligned with immutable columns and ensuring that status and transfer_period don’t participate in any indexes, PostgreSQL can write new tuple versions to the same page and skip index updates entirely.
alter table finance.transfer set (fillfactor = 70);
This reserves 30% free space per page, allowing updated tuples to fit on the same page. The benefits are substantial:
– 2-3x faster updates compared to non-HOT updates
– No table and index bloat from status changes
– Simpler vacuum maintenance
– Smaller WAL, since there is less write activity overall
This is the kind of optimization that only becomes obvious when you’re thinking from first principles rather than reaching for a caching layer.
The Decoupling Argument: Views as Interfaces
One of the most persistent objections to a PostgreSQL-centric architecture is the fear of coupling. “If my app talks directly to the database, I can’t change the schema without breaking everything.”
Bellani’s response is to use views as the canonical way to implement modularity in SQL DBMSes. As Codd originally envisioned, views provide logical data independence: application programs remain unaffected when the internal representation of data changes.
The finance.transfer_activity view serves as a single unified stream through which applications read, insert, and update data. If the underlying table structure changes, the view definition is updated once, and every application continues to work unchanged.
This is decoupling achieved at the data level, with no network hops, no serialization overhead, and no distributed consistency problems. The view is the interface, and the base tables are the implementation.
The Benchmark That Matters
The benchmark results tell the real story. On a development laptop, running 10 concurrent clients with an 80/20 read/write split:
- 766 TPS overall with 0 failed transactions
- 8ms average read latency for the
UNION ALLview over 4 tables plus balance ledger lookup - 33ms average write latency for transactions exercising the full constraint set
- 0 serialization failures despite 10 concurrent clients under
SERIALIZABLEisolation
The startup target of 10,000 transfers per day is roughly 0.12 TPS. The benchmark exceeded that by over 6,000x. This isn’t theoretical, it’s a working system on modest hardware.
Where PostgreSQL Hits Its Limits
The honest answer is that PostgreSQL isn’t ideal for everything. As one Reddit commenter noted, “the OLAP part is pretty weak, as expected, it is not designed for this.” The title shouldn’t be read as “All you need for data engineering is PostgreSQL.”
For massive analytical workloads scanning billions of rows, columnar engines like ClickHouse will outperform PostgreSQL by orders of magnitude. The PostgreSQL vs. columnar engines for analytics debate is real, and for good reason.
But the question isn’t whether PostgreSQL can replace every specialized tool. The question is whether the default assumption should be that PostgreSQL is sufficient, and whether the burden of proof should be on introducing additional infrastructure.
The Industry Is Moving Toward Convergence
The timing of this argument is no coincidence. The industry is in the midst of a major convergence trend. Databricks recently unveiled LTAP, EDB introduced converged analytics, and pgEdge launched ColdFront, all aiming to collapse the OLTP/OLAP divide.
What’s interesting is that these approaches all make different tradeoffs. Databricks asks enterprises to adopt a proprietary lakehouse as the operational center of gravity. Snowflake’s pg_lake requires applications to distinguish between PostgreSQL and analytical tables. EDB still requires archived data to be brought back into active PostgreSQL before it can be modified.
ColdFront, by contrast, treats Iceberg only as a transparent storage tier behind PostgreSQL, automatically moving older data out of the database while keeping applications on the same tables and SQL. This approach keeps PostgreSQL as the primary interface, which is exactly the philosophy Bellani advocates.
The DuckDB Dependency
One fascinating subplot in this convergence story is the increasing dependence on DuckDB. ColdFront uses DuckDB to execute queries against data stored in Iceberg. Snowflake’s pg_lake routes Iceberg queries through pgduck_server. Databricks’ Lakebase also relies on DuckDB internally.
This creates what analysts describe as a concentration risk: if DuckDB faces licensing changes, security vulnerabilities, or performance bottlenecks, the impact would ripple across multiple products simultaneously. It’s a reminder that simplification at one layer can create hidden dependencies at another.
The Verdict: How Far Can PostgreSQL Go?
The answer depends on your scale and your workload. For startups and mid-market companies processing up to 50 million transfers per day, the working set fits comfortably in memory on modern cloud instances. The benchmark shows 766 TPS with 0 failures on a development laptop, that’s 6,000x headroom over the startup target.
For global processors like Visa, handling 900 million transactions per day, the working set balloons to 1.9TB. That’s still within reach of AWS RDS instances supporting up to 4TiB of memory, but the operational complexity increases.
The real question isn’t whether PostgreSQL can replace every specialized tool. It’s whether the default assumption should be that PostgreSQL is sufficient, and whether the burden of proof should be on introducing additional infrastructure.
The Verdict
The multi-tool orthodoxy is crumbling under its own weight. Every new tool in the stack adds operational risk, headcount demand, and system entropy. The argument for PostgreSQL as a data-centric stack isn’t about technological purity, it’s about recognizing that most teams don’t need the complexity they’re reaching for.
As Bellani concludes: “All of this runs on a single vanilla PostgreSQL 18 instance with standard extensions available on RDS. No distributed-systems cosplay, no infrastructure proliferation, no operational complexity tax.”
The next time your team reaches for a specialized tool, ask yourself: have you actually pushed PostgreSQL to its limits, or are you just following the conference talk playbook? The answer might save you years of operational pain.
This post was inspired by Eduardo Bellani’s excellent deep dive and the subsequent discussion on r/dataengineering. For more on PostgreSQL’s growing role in the data stack, check out our analysis of PostgreSQL’s growing dominance in the data stack and the broader debate on simplifying vs. expanding the data stack.




