Benchmarks

Mammoth benchmarks are intentionally small and focused. They are meant to prove specific runtime behavior rather than produce universal performance claims.

Concurrent Delivery Benchmark

Location:

benchmark/concurrent_delivery.rb

Run:

bundle exec ruby benchmark/concurrent_delivery.rb

This benchmark exercises the same downstream execution boundary used by Mammoth when runtime.adapter: concurrent is enabled:

TransactionEnvelope
      ↓
Mammoth::ConcurrentDeliveryRuntime
      ↓
Mammoth::DeliveryProcessor
      ↓
DeliveryWorker-compatible sink

The default matrix compares:

concurrency: 1
concurrency: 5
concurrency: 10
concurrency: 25

with configurable synthetic sink latency.

Configuration

MAMMOTH_BENCH_TRANSACTIONS=5000 \
MAMMOTH_BENCH_EVENTS_PER_TRANSACTION=4 \
MAMMOTH_BENCH_LATENCY_MS=25 \
MAMMOTH_BENCH_CONCURRENCY=1,5,10,25,50 \
MAMMOTH_BENCH_PRESERVE_ORDER=false \
bundle exec ruby benchmark/concurrent_delivery.rb

Set MAMMOTH_BENCH_JSON=1 to emit machine-readable JSON after the table.

Interpretation

This benchmark should be read as a downstream delivery benchmark:

one upstream replication stream
        ↓
many downstream concurrent deliveries

It does not create extra PostgreSQL replication slots or replication connections. That separation is the core Mammoth runtime story.

Not Covered

This benchmark does not measure:

  • PostgreSQL write throughput
  • pgoutput decoding throughput
  • network behavior
  • retry behavior
  • checkpoint recovery
  • Toxiproxy failure scenarios

Those should be covered by separate end-to-end examples and resilience tests.

Findings

The benchmark measures Mammoth's ability to scale downstream delivery throughput while consuming a single PostgreSQL logical replication stream.

Benchmark Configuration

  • 10,000 transactions
  • 4 events per transaction
  • 40,000 total events
  • preserve_order: false

Fast Sink (10ms)

Simulates a fast downstream webhook.

Concurrency Transactions/sec Events/sec Avg Latency (ms) P95 Latency (ms) Elapsed (s)
1 96.50 385.98 10.204 10.404 103.631
5 482.26 1929.04 10.235 10.451 20.736
10 955.04 3820.17 10.287 11.047 10.471
25 2419.65 9678.61 10.173 10.330 4.133

Interpretation

Throughput scales nearly linearly as delivery concurrency increases.

At a concurrency level of 25, Mammoth achieves approximately:

  • 25x transaction throughput
  • 25x event throughput

while maintaining essentially identical delivery latency.


Realistic Webhook (50ms)

Simulates a more realistic external webhook endpoint.

Concurrency Transactions/sec Events/sec Avg Latency (ms) P95 Latency (ms) Elapsed (s)
1 19.85 79.40 50.206 50.405 503.795
5 99.27 397.07 50.234 50.419 100.737
10 198.40 793.61 50.181 50.402 50.403
25 495.11 1980.44 50.224 50.420 20.198

Interpretation

When downstream systems become slow, concurrency becomes increasingly valuable.

With a 50ms delivery latency:

  • Concurrency 1 processes only 19.85 transactions/sec.
  • Concurrency 25 processes 495.11 transactions/sec.

This demonstrates approximately a 25x throughput increase while maintaining a single PostgreSQL replication stream.


Architectural Implications

Mammoth separates:

PostgreSQL Logical Replication

→ TransactionEnvelope Aggregation

→ Concurrent Delivery Execution

This allows delivery throughput to scale independently from PostgreSQL replication resources.

Increasing delivery concurrency does not require additional logical replication connections.

A single PostgreSQL replication stream can drive thousands of event deliveries per second through the cdc-concurrent runtime.

Key Result

Increasing delivery concurrency from 1 to 25 improved throughput from:

  • 19.85 tx/sec
  • to 495.11 tx/sec

while maintaining a single PostgreSQL logical replication stream.