How to Use iqtoolkit/pgtools: HOT Update Optimization Checklist (Part 2)
How to Use iqtoolkit/pgtools: HOT Update Optimization Checklist (Part 2)
This is the second post in our series on using iqtoolkit/pgtools for PostgreSQL performance optimization.
In Part 1, we covered what HOT updates are, how to detect them, and how to apply fillfactor recommendations.
In this post you'll learn:
- How to validate that your fillfactor changes actually improved HOT behavior
- Why HOT% might _still_ be low after tuning
- Strategies for investigating index-related issues
- When (and when not) to reach for more aggressive rewrite operations
- Setting up continuous monitoring with pgtools
Validating Changes in Production
After applying ALTER TABLE ... SET (fillfactor=...) and a rewrite operation, you need solid data to confirm improvement.
The Validation Window
Here's the key insight: statistics reset on rewrite.
When you run VACUUM (FULL), CLUSTER, pg_squeeze, or pg_repack, PostgreSQL resets the counters in pg_stat_user_tables for that table. This is actually helpful—it gives you a clean baseline.
The validation process:
- Note the timestamp before your rewrite operation
- Run the rewrite (
VACUUM (FULL),CLUSTER,pg_squeeze, orpg_repack) - Let your application run for the same duration as before (e.g., if you collected stats over 24 hours, wait 24 hours again)
- Re-run the pgtools checklist and compare
Example Validation Query
After your change, you can run:
SELECT
schemaname,
tablename,
n_tup_upd,
n_tup_hot_upd,
ROUND(100.0 * n_tup_hot_upd / NULLIF(n_tup_upd, 0), 2) AS hot_ratio,
pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size
FROM pg_stat_user_tables
WHERE schemaname = 'public'
AND n_tup_upd > 1000 -- Focus on meaningful workloads
ORDER BY n_tup_hot_upd DESC;
What you're looking for:
hot_ratioshould improve compared to before the change- If it doesn't improve, that signals a different root cause (see troubleshooting below)
Troubleshooting: Why Didn't HOT% Improve?
So you lowered fillfactor, ran the rewrite, and HOT% is still low. Here are the most common culprits.
1. Updates Are Changing Indexed Columns
This is the most common reason HOT updates aren't happening.
The Problem:
If you're running UPDATE mytable SET indexed_column = value ..., PostgreSQL _must_ update all indexes on indexed_column. HOT is impossible in that case—it's a physical constraint, not a tuning opportunity.
How to Find It:
Use the pgtools index analysis script to identify which columns in your indexes are being frequently updated:
psql -d your_db -Xq -f analysis/index_column_update_analysis.sql \
| jq -r '.analysis[] | select(.update_frequency > 10) | {table, index, columns}'
Or, if you're using application-level tracking, correlate your UPDATE statements with your table indexes.
The Fix:
- Consider if the index is necessary. If updates always change a certain indexed column but queries rarely use that column, drop the index.
- Use a partial index if only a subset of rows need it (e.g.,
WHERE status = 'active'). - Defer the column update to a separate, less-frequent operation if your business logic allows.
2. Fillfactor Is Still Too High (or You Skipped the Rewrite)
Lower fillfactor _only helps new inserts and page splits_. If you set fillfactor=80 but never rewrote the table, the old pages are still packed at 100%.
The Fix:
- Run the rewrite operation (
VACUUM (FULL),CLUSTER,pg_squeeze, orpg_repack) _after_ setting fillfactor. - Choose the least disruptive approach for your workload:
pg_squeeze: Non-blocking external tool (recommended first option—uses replication slots, minimal performance impact).-
pg_repack: Non-blocking external tool (alternative—uses triggers which have more overhead than replication slots).-
VACUUM (FULL): Simplest built-in option, but locks the table (use during maintenance windows only).-
CLUSTER: Locks but also optimizes physical order on disk (good if you have a natural sort key).3. Vacuum Isn't Keeping Up
If dead tuples accumulate faster than VACUUM can reclaim them, page fragmentation gets worse, and HOT updates _still_ can't find free space.
Diagnosis:
SELECT
schemaname,
tablename,
n_dead_tup,
n_live_tup,
ROUND(100.0 * n_dead_tup / NULLIF(n_live_tup + n_dead_tup, 0), 2) AS dead_ratio
FROM pg_stat_user_tables
WHERE n_live_tup > 10000
ORDER BY dead_ratio DESC;
If dead_ratio is above ~10% on update-heavy tables, autovacuum is likely falling behind.
The Fix:
- Tune autovacuum parameters for that table:
ALTER TABLE mytable SET (
autovacuum_vacuum_scale_factor = 0.01, -- vacuum at 1% dead tuples
autovacuum_analyze_scale_factor = 0.005 -- analyze at 0.5%
);
- Consider manual
VACUUMruns on a schedule during off-peak hours. - Monitor
autovacuum_workersand ensure they're not resource-starved.
4. Row Width Is Expanding (LOB Columns, JSON, Text)
Large columns can make it impossible to fit an updated row on the same page, even with lower fillfactor.
Diagnosis:
SELECT
tablename,
pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS total_size,
pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename) / (n_live_tup + 1)) AS avg_row_size
FROM pg_stat_user_tables
WHERE schemaname = 'public'
AND n_live_tup > 0
ORDER BY avg_row_size DESC;
The Fix:
- Extract large columns to a separate table (consider a JSONB column in a "details" table instead of inline).
- Compress data if you're storing serialized text (JSONB compression, pgcrypto, bytea with compression).
- Use TOAST settings to force large values out-of-line:
ALTER TABLE mytable ALTER COLUMN large_col SET STORAGE EXTERNAL;
Advanced Monitoring: Setting Up Continuous Tracking
Once you've tuned a table, don't set it and forget it.
Create a Baseline View
Create a view that captures your "optimized" tables and their target HOT ratios:
CREATE VIEW hot_optimization_baseline AS
SELECT
schemaname,
tablename,
current_setting('fillfactor'::text)::int AS fillfactor_setting,
75 AS target_hot_ratio, -- Adjust based on your SLA
n_tup_upd,
n_tup_hot_upd,
ROUND(100.0 * n_tup_hot_upd / NULLIF(n_tup_upd, 0), 2) AS current_hot_ratio,
CASE
WHEN ROUND(100.0 * n_tup_hot_upd / NULLIF(n_tup_upd, 0), 2) >= 75 THEN 'GOOD'
WHEN ROUND(100.0 * n_tup_hot_upd / NULLIF(n_tup_upd, 0), 2) >= 50 THEN 'FAIR'
ELSE 'NEEDS ATTENTION'
END AS status
FROM pg_stat_user_tables
WHERE schemaname = 'public'
AND tablename IN (
'payments', 'orders', 'inventory' -- Your optimized tables
);
Alert on Regression
Query this baseline periodically (daily or weekly) and alert if status drops below your target:
psql -d your_db -Xq -c "
SELECT tablename, current_hot_ratio, status
FROM hot_optimization_baseline
WHERE current_hot_ratio < 75 AND n_tup_upd > 0;
" | mail -s "HOT Optimization Alert" dba-team@company.com
Real-World Case Study: The Payments Table
Let's walk through a complete optimization cycle:
Initial State:
paymentstable: 50M rows,fillfactor=100, ~500k updates/day- HOT ratio: 12%
- Index on
(customer_id, created_at) - Index on
(status, amount)
Analysis:
- Ran pgtools, found HOT recommendations
- Checked for indexed column updates:
UPDATE payments SET status = ...happens frequently - Checked dead tuples: 8% dead ratio—autovacuum was keeping up
Strategy:
- Status is indexed, so we can't make it HOT in all cases
- But we _can_ improve fillfactor and let autovacuum be more aggressive
- Set
fillfactor = 75(more conservative than default 100) - Ran
pg_squeezeon Sunday evening (non-blocking rewrite with minimal performance impact) - Tuned autovacuum for that table
Result (after 1 week):
- HOT ratio improved from 12% to 43%
- Index size growth slowed by ~30%
- Update latency dropped slightly
- Not perfect (still updating
status), but significant improvement
When _Not_ to Optimize for HOT
Not every table benefits from HOT tuning. Save your effort for:
- Update-heavy tables (>10k updates/day)
- Tables where most updates don't touch indexed columns
- Tables where index bloat is a known pain point
Skip HOT optimization for:
- Tables that are append-only or rarely updated
- OLAP-style tables (batch inserts, rare updates)
- Temporary or staging tables
- Situations where you're already using partial indexes or partitioning for other reasons
Next Steps
- Run pgtools on your workloads to identify candidates
- Validate baseline metrics before making changes
- Test fillfactor changes in staging first
- Monitor continuously after production rollout
- Iterate based on real workload patterns
The investment in careful monitoring often pays dividends in reduced index bloat and more predictable performance.
Coming up: In Part 3, we'll dive into using pgtools for index bloat analysis and recovery strategies for severely bloated indexes.