Architecture • 2 min • 2026-04-18

Lessons from Scaling PostgreSQL to Millions of Requests

Real-world lessons on how PostgreSQL behaves under heavy load and what actually breaks first in production systems.

Intro

PostgreSQL is often underestimated in early-stage architectures. Many teams assume it will stop scaling at a certain point and rush toward sharding or NoSQL solutions prematurely.

In reality, PostgreSQL can handle significantly more load than expected — if the system around it is designed correctly.

Bottlenecks

Our scaling issues did not start with storage limits. They started with query design and access patterns:

  • inefficient indexes under mixed workloads
  • long-running transactions blocking critical paths
  • connection exhaustion under traffic spikes
  • read/write contention during peak usage

The database itself was not failing — it was being misused under load.

Migration strategy

Instead of immediately introducing sharding, we focused on:

  • query optimization and index tuning
  • separating read-heavy and write-heavy workloads
  • introducing read replicas for scaling reads
  • implementing connection pooling (PgBouncer)

This delayed the need for architectural complexity while solving real performance bottlenecks.

Event system

We also moved reporting and analytics workloads out of the transactional path using async processing.

Instead of blocking user operations:

User request → DB write → analytics query

We moved to:

User request → DB write → event → async analytics pipeline

This alone removed significant load from the primary database.

Infrastructure

Key improvements included:

  • strict connection pooling limits
  • caching layer for hot reads
  • query time monitoring
  • slow query logging with actionable alerts
  • separation of OLTP vs OLAP workloads

Results

  • query latency reduced significantly under peak load
  • DB CPU usage stabilized instead of spiking unpredictably
  • connection saturation issues disappeared
  • system became easier to reason about under stress

Lessons

The key insight:

Most “scaling problems” in databases are actually query design problems.

Sharding should be a last step, not a first reaction.