Tracking Your Database Health Score Over Time
Philip McClarence
Database tuning has an accountability problem. You spend two hours on a Tuesday fixing autovacuum settings, dropping unused indexes, and creating missing ones. Your team asks "did it help?" and the best you can offer is "queries feel faster." Without trend data, you cannot prove the work mattered -- and work you cannot prove matters is work that gets deprioritized.
What if you could show a graph proving your Vacuum domain score went from 60 to 95 after the change?
Why Point-in-Time Checks Are Not Enough
Running diagnostic queries during an incident tells you the current state. It says nothing about trajectory. Did the autovacuum tuning actually reduce dead tuple accumulation, or did a new high-churn table eat the gains? Is the sequential scan pattern gone, or did a new query introduce a different one?
Without historical data, you cannot distinguish between "we fixed the problem" and "the problem temporarily subsided."
Building Health Score Tracking Yourself
In theory, you could build this infrastructure:
-- You'd need something like this for every check
CREATE TABLE health_check_log (
    checked_at_utc timestamptz DEFAULT now(),
    domain         text,
    check_name     text,
    status         text,    -- pass, warning, critical
    metric_value   numeric,
    details        jsonb
);
-- Then schedule daily inserts for each check
INSERT INTO health_check_log (domain, check_name, status, metric_value)
SELECT 'vacuum', 'dead_tuple_ratio',
       CASE WHEN max_ratio > 20 THEN 'critical'
            WHEN max_ratio > 5  THEN 'warning'
            ELSE 'pass' END,
       max_ratio
FROM (
    SELECT max(100.0 * n_dead_tup / NULLIF(n_live_tup, 0)) AS max_ratio
    FROM pg_stat_user_tables
) t;
Now multiply that by every check across every domain. Define scoring thresholds for each. Build the trend queries. Create visualization. Maintain it as PostgreSQL versions change and new checks become relevant.
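To make the scale of that effort concrete, here is a minimal sketch of what the trend rollup amounts to, in Python for illustration. The row layout mirrors the health_check_log table above; the status-to-points weights are an assumption, not a scheme from the article.

```python
from collections import defaultdict

# Hypothetical scoring weights -- one plausible scheme, not a standard.
STATUS_POINTS = {"pass": 100, "warning": 60, "critical": 0}

def domain_scores(rows):
    """Roll health_check_log-style rows up into a 0-100 score per (day, domain).

    Each row: (checked_at_date, domain, check_name, status).
    A domain's daily score is the average of its checks' status points.
    """
    buckets = defaultdict(list)
    for day, domain, _check, status in rows:
        buckets[(day, domain)].append(STATUS_POINTS[status])
    return {key: sum(pts) / len(pts) for key, pts in buckets.items()}

rows = [
    ("2024-06-01", "vacuum", "dead_tuple_ratio", "critical"),
    ("2024-06-01", "vacuum", "autovacuum_enabled", "pass"),
    ("2024-06-02", "vacuum", "dead_tuple_ratio", "warning"),
    ("2024-06-02", "vacuum", "autovacuum_enabled", "pass"),
]
print(domain_scores(rows))
# day one averages to 50.0, day two to 80.0 -- the trend is visible
```

And that is just the rollup: you would still need the per-check thresholds, the scheduler, and the charts on top of it.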
In practice, nobody builds this. The effort exceeds what most teams can allocate, so health checks remain ad-hoc exercises performed during incidents and forgotten afterward.
The Three-Step Workflow
When you do have historical health data, the workflow becomes structured:
1. Review findings by priority. Start with P1 (critical) and P2 (high) findings -- these have the largest impact on your score and the highest risk.
-- Example P1 finding: autovacuum disabled on a table with 5M dead tuples
ALTER TABLE events RESET (autovacuum_enabled);
-- Score impact: Vacuum domain +15 points
2. Fix using generated SQL. Apply the fix, then wait for the next check cycle to verify improvement.
-- Example P2 finding: shared_buffers at 4% of RAM
ALTER SYSTEM SET shared_buffers = '8GB';
-- Requires restart, but Configuration domain will jump to 100
3. Suppress accepted trade-offs. If a finding represents a deliberate choice (e.g., you keep a table intentionally denormalized and accept the bloat), suppress it. This prevents noise in your trend data while keeping the finding visible for reference.
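The suppression step can be sketched in a few lines of Python. The record shape and field names here are hypothetical; the point is that suppressed findings are filtered out of scoring rather than deleted, so they stay reviewable.

```python
# Hypothetical finding records; "suppressed" marks accepted trade-offs.
findings = [
    {"domain": "vacuum", "check": "dead_tuple_ratio",
     "status": "critical", "suppressed": False},
    {"domain": "bloat", "check": "table_bloat_pct",
     "status": "warning", "suppressed": True},   # deliberate denormalization
]

def active(findings):
    """Findings that should affect the score. Suppression hides a finding
    from trend data without deleting it, so the record stays visible."""
    return [f for f in findings if not f["suppressed"]]

print([f["check"] for f in active(findings)])   # ['dead_tuple_ratio']
```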
What to Watch For in the Trends
The absolute score matters less than the direction:
- Score of 78 that was 65 last week -- things are improving, keep going
- Score of 90 that was 95 last week -- something degraded, investigate
- Overall score steady but one domain declining -- new queries deployed without supporting indexes, or autovacuum falling behind on new tables
A gradual decline in any single domain is the most valuable signal. Your overall score might mask it because other domains compensate. Per-domain tracking catches patterns like "Indexes dropped from 95 to 75 over three weeks" that would be invisible in an aggregate number.
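That masking effect is easy to demonstrate. A small sketch, with hypothetical numbers: given a few weeks of per-domain scores, flag any domain that slid by some threshold even when the overall average held steady.

```python
def declining_domains(history, min_drop=10):
    """Flag domains whose score fell by min_drop or more from the oldest
    to the latest sample, even if the overall average looks flat.

    history: {domain: [oldest_score, ..., latest_score]}
    """
    flagged = {}
    for domain, scores in history.items():
        drop = scores[0] - scores[-1]
        if drop >= min_drop:
            flagged[domain] = drop
    return flagged

# Overall average is flat (85 every week), but Indexes slid 20 points.
history = {
    "indexes": [95, 85, 75],
    "vacuum":  [75, 85, 95],
}
print(declining_domains(history))   # {'indexes': 20}
```

The aggregate never moves, yet the per-domain view catches exactly the "Indexes dropped from 95 to 75 over three weeks" pattern.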
Making It Sustainable
The most effective way to use health scores:
- Post-deployment checks -- Compare domain scores before and after. A 10-point drop in any domain warrants investigation.
- Sprint retrospectives -- Spend 5 minutes reviewing which domains improved, which degraded, and whether new findings appeared.
- Team KPI -- Share the dashboard alongside application metrics. When the score drops after a deployment, you have an early signal before it escalates.
- Alert on degradation -- A 10-point drop in any domain within a week triggers a review. A new P1 finding triggers immediate action.
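The degradation rules in the last bullet translate almost directly into code. A minimal sketch, assuming weekly samples and the thresholds stated above; the function name and return values are illustrative.

```python
def alert_level(prev_score, curr_score, new_p1_findings):
    """Map the degradation rules to actions: a new P1 finding demands
    immediate action; a 10-point drop within a week triggers a review;
    otherwise no alert."""
    if new_p1_findings:
        return "immediate"
    if prev_score - curr_score >= 10:
        return "review"
    return "none"

print(alert_level(90, 78, []))   # 12-point weekly drop -> review
print(alert_level(90, 88, ["autovacuum disabled on events"]))  # -> immediate
print(alert_level(90, 88, []))   # small wobble -> none
```

Run per domain, not just on the overall score, so a compensated decline still fires.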
The key insight is that the delta tells you more than the absolute number. Track the direction, and small problems never compound into incidents.