The Hidden Cost of Failed Long-Running Transactions in PostgreSQL
By Saikrishna Reddy
When a long-running transaction fails in PostgreSQL, it can leave behind a significant amount of database bloat. Let's explore why this happens and what it means for your database performance.
How PostgreSQL Handles Row Updates
MVCC and Row Versions
PostgreSQL uses Multi-Version Concurrency Control (MVCC), which means:
-- When you update a row
UPDATE users SET status = 'active' WHERE id = 1;
-- PostgreSQL actually:
-- 1. Creates a new version of the row
-- 2. Adds new index entries pointing to the new version (unless HOT)
-- 3. Marks the old version as expired instead of removing it
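You can watch this happen: every update writes the new version at a new physical location (ctid). A quick check, using the same users table:
-- Before and after an update, the row's ctid changes
SELECT ctid, xmin, xmax FROM users WHERE id = 1;
UPDATE users SET status = 'active' WHERE id = 1;
SELECT ctid, xmin, xmax FROM users WHERE id = 1;
-- The old tuple stays on disk until VACUUM removes it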
Index Updates
-- Creating an index
CREATE INDEX idx_users_status ON users(status);
-- During updates, PostgreSQL must:
-- 1. Add new index entries for new row versions
-- 2. Keep old index entries until cleanup
-- Unless using HOT updates (Heap-Only Tuples)
Note: HOT updates can skip index updates if:
- The updated columns aren't indexed
- There's enough free space in the page (reserved by setting fillfactor < 100)
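To check whether updates are actually taking the HOT path, pg_stat_user_tables tracks total versus HOT update counts:
-- Compare total updates against HOT updates for a table
SELECT relname, n_tup_upd, n_tup_hot_upd
FROM pg_stat_user_tables
WHERE relname = 'users';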
What Happens During Transaction Rollback?
When a long-running transaction that modified millions of rows fails:
- New row versions become "dead tuples"
- Index entries pointing to dead tuples remain
- Space isn't immediately reclaimed
-- Example of a failed transaction
BEGIN;
UPDATE large_table SET status = 'processed'
WHERE created_at < current_date; -- Affects millions of rows
-- Transaction fails or is rolled back
ROLLBACK;
-- Result:
-- - Millions of dead tuples created
-- - Index bloat
-- - Performance degradation
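You can measure the damage directly with the pgstattuple extension (shipped in PostgreSQL's contrib modules), which scans the table and reports dead-tuple statistics:
-- Requires the contrib extension; scans the whole table, so use with care
CREATE EXTENSION IF NOT EXISTS pgstattuple;
SELECT tuple_count, dead_tuple_count, dead_tuple_percent
FROM pgstattuple('large_table');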
Cleanup Approaches
Different databases handle this situation differently:
1. PostgreSQL's Lazy Approach (VACUUM)
-- Automatic cleanup via autovacuum
-- Check autovacuum status
SHOW autovacuum;
-- Manual vacuum
VACUUM large_table;
-- Monitor dead tuples
SELECT relname, n_dead_tup, n_live_tup
FROM pg_stat_user_tables
WHERE relname = 'large_table';
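For tables that churn millions of rows, the default autovacuum trigger (20% of the table changed) can lag far behind. The per-table storage parameters below make cleanup kick in sooner; the values are illustrative, not a recommendation:
-- Trigger autovacuum after ~5% of rows change instead of the default 20%
ALTER TABLE large_table SET (
    autovacuum_vacuum_scale_factor = 0.05,
    autovacuum_vacuum_threshold = 1000
);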
2. Eager Approach (Other Databases)
-- Undo-based databases (e.g., Oracle, MySQL's InnoDB) must apply undo
-- records for every modified row during rollback, and crash recovery
-- may have to finish that work before the database is fully usable
-- PostgreSQL doesn't implement this approach: rollback is nearly
-- instant because old versions are simply left in place for VACUUM
Performance Impact
Reading Dead Rows
-- When PostgreSQL reads a page, it must:
-- 1. Check transaction status for each row
-- 2. Filter out dead rows
-- 3. Process only valid rows
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM large_table WHERE id = 123;
-- Output might show:
-- Buffers: shared hit=100
-- even though only one row is needed, because dead tuples
-- fill the pages the query has to scan
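You can verify the effect by vacuuming and re-running the same query; a rough before/after sketch, assuming the bloated large_table from the rollback example:
VACUUM large_table;
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM large_table WHERE id = 123;
-- Buffers: shared hit typically drops once dead tuples and
-- their index entries have been pruned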
Space Utilization
-- Check table and index sizes (bloat shows up as space that
-- doesn't shrink after large rollbacks or deletes)
SELECT
    table_name,
    pg_size_pretty(pg_total_relation_size((quote_ident(table_schema) || '.' || quote_ident(table_name))::regclass)) AS total_size,
    pg_size_pretty(pg_table_size((quote_ident(table_schema) || '.' || quote_ident(table_name))::regclass)) AS table_size,
    pg_size_pretty(pg_indexes_size((quote_ident(table_schema) || '.' || quote_ident(table_name))::regclass)) AS index_size
FROM information_schema.tables
WHERE table_schema = 'public'
  AND table_type = 'BASE TABLE';
-- Function explanations:
-- pg_total_relation_size: Total disk space used by the table, including:
-- * Main table
-- * All indexes
-- * TOAST data
-- * FSM (Free Space Map)
-- * VM (Visibility Map)
-- pg_table_size: Size of the table excluding indexes, including:
-- * Main fork
-- * TOAST data
-- * FSM
-- * VM
-- * (excludes indexes only)
-- pg_indexes_size: Total size of all indexes on the table
-- TOAST (The Oversized-Attribute Storage Technique)
-- Stores large column values compressed and/or in a separate table
-- Kicks in automatically when a row exceeds roughly 2 KB:
-- * PostgreSQL first tries to compress large variable-length values
-- * Values that are still too big are moved out of line
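To see how much of a table actually lives in TOAST (plus the small FSM/VM forks), subtract the main fork from the table size; a minimal sketch for large_table:
-- pg_relation_size alone returns only the main fork
SELECT pg_size_pretty(
    pg_table_size('large_table') - pg_relation_size('large_table')
) AS toast_and_maps_size;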
Best Practices
- Transaction Management:
-- Break large transactions into smaller batches and run them
-- repeatedly until no rows remain (see the loop sketch below)
BEGIN;
UPDATE large_table
SET status = 'processed'
WHERE id IN (
SELECT id
FROM large_table
WHERE status = 'pending'
LIMIT 1000
);
COMMIT;
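A self-driving version of that pattern, as a minimal PL/pgSQL sketch (requires PostgreSQL 11+ for COMMIT inside a DO block, and must be run outside an explicit transaction):
DO $$
DECLARE
    rows_updated integer;
BEGIN
    LOOP
        UPDATE large_table
        SET status = 'processed'
        WHERE id IN (
            SELECT id FROM large_table
            WHERE status = 'pending'
            LIMIT 1000
        );
        GET DIAGNOSTICS rows_updated = ROW_COUNT;
        EXIT WHEN rows_updated = 0;
        COMMIT;  -- each batch commits, keeping dead-tuple buildup small
    END LOOP;
END $$;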
- Regular Maintenance:
-- Schedule regular VACUUM operations
VACUUM ANALYZE large_table;
-- Monitor the progress of VACUUMs currently running
SELECT * FROM pg_stat_progress_vacuum;
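To check whether autovacuum is keeping up, compare each table's dead-tuple count with when it was last vacuumed:
SELECT relname, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;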
- Table Configuration:
-- Set appropriate fillfactor for frequently updated tables
-- fillfactor determines how full PostgreSQL will pack table pages
-- Default is 100 (completely full)
ALTER TABLE frequently_updated_table SET (fillfactor = 80);
-- fillfactor explanation:
-- 100: Pages are packed completely full
-- * Best for read-only or append-only tables
-- * No room in the page for updated versions (forces non-HOT updates)
-- 80: Pages are packed to 80% full
-- * Leaves 20% free space for updated row versions
-- * Enables HOT updates (Heap-Only Tuples)
-- * Reduces index churn, since HOT updates skip index maintenance
-- * Good for tables with frequent updates
-- Note: fillfactor applies only to newly written pages; rewrite the
-- table (e.g., VACUUM FULL) to apply it to existing data
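To confirm the setting took effect, per-table storage parameters are visible in pg_class:
SELECT relname, reloptions
FROM pg_class
WHERE relname = 'frequently_updated_table';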
Conclusion
The choice between lazy and eager cleanup approaches represents different trade-offs:
- Lazy Approach (PostgreSQL)
  - Faster rollback
  - Background cleanup
  - Potential performance impact until cleanup
- Eager Approach
  - Immediate cleanup
  - Longer rollback time
  - Guaranteed clean state
Neither approach is universally right or wrong; the best fit depends on your specific requirements:
- Need for immediate database availability
- Tolerance for background cleanup
- Availability of maintenance windows
- Performance requirements
Understanding these trade-offs helps you make informed decisions about database design and maintenance strategies.