Table Synchronization Architecture
Overview
Springtail synchronizes tables from a primary PostgreSQL database using a sophisticated architecture that coordinates table copying with an ongoing three-stage replication pipeline (Log Writer → Log Reader → Committer). The PgCopyTable component operates independently to perform bulk table copies while the pipeline ensures consistency by tracking PostgreSQL transaction IDs (XIDs) and using snapshot-based visibility rules to determine which mutations should be applied during and after the copy. This document describes the architecture for synchronizing tables into Springtail, with particular emphasis on:- How the replication pipeline is stalled during table copies
- How PostgreSQL XIDs (pgxids) are tracked and used
- How snapshot-based visibility ensures correct synchronization points
- How log replay interacts with table synchronization
Key Components
PgLogMgr
Location:src/pg_log_mgr/pg_log_mgr.{cc,hh}
The orchestration hub for Springtail’s replication pipeline. Manages a three-stage processing pipeline:
- Log Writer Thread - Connects to PostgreSQL replication stream and writes to log files
- Log Reader Thread - Reads log files and parses transactions (via PgLogReader)
- Committer - Processes committed transactions and coordinates garbage collection
- Copy Thread - Coordinates table synchronization requests from Redis queue, using PgCopyTable to perform the actual copying while ensuring consistent snapshots through pipeline coordination
- Manages state transitions during table synchronization
- Coordinates pipeline stalling via STALL messages in the logger queue
- Executes table copies via PgCopyTable
- Assigns Springtail XIDs to transactions
PgCopyTable
Location:src/pg_repl/pg_copy_table.{cc,hh}
A separate component (not part of the main replication pipeline) that handles the actual table copying using PostgreSQL’s binary COPY protocol. Invoked by PgLogMgr’s Copy Thread and interacts with the pipeline through SyncTracker to ensure consistent snapshots.
Key Features:
- Multi-threaded: Uses 4 worker threads to copy tables in parallel
- Transaction isolation: Captures PostgreSQL snapshots (xmin/xmax/xips) at copy time
- Schema preservation: Extracts full table metadata including columns, types, indexes
- Sync coordination: Marks tables as “in-flight” via SyncTracker
- Replication messaging: Emits TABLE_SYNC messages via
pg_logical_emit_message()
- Lock table in ACCESS SHARE MODE (prevents schema changes)
- Capture snapshot via
pg_current_snapshot() - Mark table as in-flight in SyncTracker
- Execute
COPY table TO STDOUT WITH (FORMAT binary) - Parse binary data and insert into snapshot table
- Emit
pg_logical_emit_message()with TABLE_SYNC_MSG
SyncTracker
Location:src/pg_log_mgr/sync_tracker.{cc,hh}
Tracks table synchronization state and determines when mutations should be skipped during log replay. Acts as the bridge between PgCopyTable and the replication pipeline.
Four Primary Data Structures:
_resync_map- Tables where resync was issued but not yet picked up by copy thread_resync_picked_map- Tables picked for resync at specific XID_inflight_map- Tables whose COPY is currently in-flight (snapshot metadata stored here)_table_map- Completed syncs indexed by table_id (persists for skip logic)_sync_map- Completed syncs indexed by snapshot PG_XID (used for commit detection)
Snapshot::should_skip():
PgLogReader
Location:src/pg_log_mgr/pg_log_reader.{cc,hh}
Reads replication logs and applies mutations to the write cache. Coordinates with SyncTracker to skip mutations during table syncs.
Skip Logic Integration:
Every mutation (INSERT/UPDATE/DELETE) checks:
- Line 229-235: INSERT/UPDATE/DELETE mutations
- Line 313-317: TRUNCATE operations
- Line 487-491: CREATE_TABLE/ALTER_RESYNC DDL
- Line 501-505: CREATE_INDEX/ALTER_TABLE/DROP_TABLE DDL
COPY_SYNC message arrives (line 1193-1197) or during transaction commits (line 1351), _check_sync_commit() is called:
Synchronization Flow
Full Table Sync Process
1. Sync Request Initiation
Trigger Points:- Startup sync (STATE_STARTUP_SYNC)
- ALTER TABLE requiring resync
- Explicit resync via Redis queue
- Table validation changes (ALTER_RESYNC)
PgLogMgr::_copy_thread() blocks on Redis queue for sync requests
2. Pipeline Stall Initiation
Critical Section:PgLogMgr::_do_table_copies()
_logger_queue) bridges the writer and reader threads. When a STALL message is pushed:
- Writer thread continues writing replication data but queues it
- Reader thread processes the STALL message in its main loop
- Reader sets state to
STATE_SYNCINGand blocks - Writer receives acknowledgment and begins table copy
3. Table Copy Execution
Assign Target XID:target_xid for all tables in this sync batch.
Execute Copy:
- Pop table from queue
- Connect to PostgreSQL
-
Lock table:
LOCK TABLE schema.table IN ACCESS SHARE MODE -
Capture snapshot:
Returns:
(pg_xid, "xmin:xmax:xid,xid,...") -
Mark in-flight in SyncTracker:
-
Create snapshot table:
-
Execute COPY:
-
Parse binary data:
- Verify header:
"PGCOPY\n\377\r\n\0" - Read tuples in PostgreSQL binary format
- Insert into snapshot table
- Verify header:
-
Emit sync message:
4. Resume Pipeline
After all table copies complete:How the System is Stalled
Stall Mechanism Architecture
The stall mechanism uses inter-thread coordination via state synchronizer and queue messages: Components:- StateSynchronizer - Thread-safe state machine with atomic test-and-set
- Logger Queue - Passes STALL message from writer to reader
- Committer Queue - Receives TABLE_SYNC_START to block commits
- Writer continues writing - Replication stream isn’t disconnected, just queued
- Reader blocks safely - No partial transaction application
- Snapshots are consistent - Captured while mutations are queued
- No race conditions - State transitions are atomic
Commit Blocking Details
WhenSyncTracker::block_commits() is called:
- Stops advancing the committed XID
- Prevents garbage collection from removing data being copied
- Resumes when table swap completes
How PostgreSQL XIDs are Tracked and Used
XID Architecture Overview
Springtail maintains two parallel XID spaces:-
PostgreSQL XIDs (pg_xid) - 32-bit transaction IDs from PostgreSQL
- Subject to wraparound at 2^32
- Used for snapshot visibility
- Tracked with epoch for 64-bit uniqueness
-
Springtail XIDs (xid) - 64-bit global transaction IDs
- Monotonically increasing
- Never wrap around
- Used for internal MVCC and garbage collection
_xid_ts_tracker (WalProgressTracker) that maps pg_xid → Springtail xid.
XID Assignment Flow
During Normal Operation:target_xid), ensuring atomic visibility.
Result Structure:
XID Tracking During Sync
Three Critical XIDs:-
Target XID - Springtail XID assigned before copy starts
- Used for snapshot table creation
- Used for system table updates
- Used for swap operation
-
Copy PG XID - PostgreSQL XID when snapshot was taken
- Captured via
pg_current_xact_id() - Stored in
PgCopyResult::pg_xid - Used in TABLE_SYNC message
- Captured via
-
Snapshot XIDs - xmin/xmax/xips defining visibility
- Captured via
pg_current_snapshot() - Used by
Snapshot::should_skip()for mutations
- Captured via
Snapshot Visibility Rules
PostgreSQL Snapshot Format:"xmin:xmax:xid1,xid2,..."
Example: "1000:1010:1002,1005,1008"
- xmin = 1000 - Oldest transaction still running
- xmax = 1010 - One past highest completed XID
- xips = [1002, 1005, 1008] - Transactions in progress between xmin and xmax
Table Swap and Commit
When Swap Occurs
The swap happens when all in-progress transactions at snapshot time have committed. Check Logic in SyncTracker::check_commit():check_commit(X). If X matches the pg_xid from a table sync snapshot, all transactions visible to that snapshot have now committed, so it’s safe to swap.
Swap Process
In PgLogReader::_check_sync_commit():swap_sync_table() operation:
- Updates
springtail.tableswith new table root pointer - Updates
springtail.namespacesif needed - Creates index entries in
springtail.indexes - Invalidates client caches
- Returns DDL statements for FDW propagation
Recovery and Error Handling
Log Recovery on Startup
Entry Point:PgLogMgr::startup()
Recovery Phases
Phase 1: Repair Logs Scans replication logs to find last valid committed LSN:- Query
xid_mgrfor last committed XID - Revert
springtail.tables,springtail.namespaces, etc. to that XID - Ensures system catalog consistency
- Read log from last committed LSN
- Skip transactions already committed before crash
- Prevents duplicate application
- Replay transactions that were in-progress at crash time
- Use snapshot visibility to skip invalid mutations
- Process messages after last committed transaction
- Re-apply schema changes and mutations
- Rebuild write cache state
Handling Copy Failures
Worker Thread Error Handling:- Transient errors (connection loss, deadlock) → Re-queue table
- Table not found → Log error, mark as dropped, continue
- Other errors → Fatal, stop process
XID Consistency Across Restarts
XID Manager Persistence: The XidMgr tracks committed XIDs to persistent storage. On restart:Message Flow Diagram
Full Sync Lifecycle
Performance Considerations
Parallel Table Copying
- 4 worker threads copy tables concurrently
- Each worker maintains independent PostgreSQL connection
- Results aggregated when all workers complete
- Thread-safe queue coordinates work distribution
Binary COPY Protocol
- Efficient binary data transfer (vs. text COPY)
- No serialization overhead
- Direct field parsing into internal format
- Typical throughput: 100K+ rows/second per table
Memory Management
- Write cache batching: 4MB extent size before flush
- Snapshot tables: Stored in mutable B-trees, not memory
- Queue flow control: Memory/file hybrid mode based on watermarks
- Log archiving: Old logs cleaned based on min active timestamp
Stall Duration Minimization
The pipeline stall only occurs during:- Snapshot capture (~milliseconds)
- Table metadata extraction (~seconds)
Configuration
Key Constants
From pg_log_mgr.cc:Tuning Recommendations
- Increase worker threads for databases with many small tables
- Decrease batch size if memory pressure is high
- Increase queue size if replication lag spikes during sync
- Enable log archiving for regulatory compliance
Summary
Springtail’s table synchronization architecture achieves zero downtime and consistency through:- Snapshot isolation - Captures PostgreSQL snapshots to establish visibility boundaries
- Pipeline coordination - Stalls log processing during snapshot capture only
- Skip-based replay - Uses PostgreSQL XID visibility rules to skip redundant mutations
- Atomic swap - Swaps tables when all in-flight transactions commit
- Recovery support - Replays logs correctly after crashes
- Parallel execution - Copies multiple tables concurrently for performance
- No mutations are lost during table sync
- No mutations are double-applied
- Table data is consistent with a specific point in the replication stream
- The system can recover from failures at any stage