AUTOMATIONS · CS · RETENTION

Customer health + churn monitor automation.

A daily health score on every customer driven by product usage, engagement signals, and commercial state — with real-time CSM alerts when an account drops to at-risk and AI-generated save plans that cut CSM prep time by 90 minutes per intervention. Catch churn 30–60 days earlier than your gut would.

TYPICAL SAVINGS $60K–$540K/yr
DEPLOY TIME 4–8 weeks
COMPLEXITY Tier 2
MONTHLY COST $160–$900/mo
WHAT THIS IS

A real health monitor has four jobs.

Most health-score systems are dashboards. They show a number and let humans figure out what to do with it. That's not what this automation is. The job of a real health monitor is to detect a customer's health change earlier than a human would notice, decide what action that change calls for, and trigger that action in the system that owns it — without a human being asked to interpret a score.

Four jobs. One: pull signals from three different domains daily — product usage, engagement, commercial. Single-domain scores miss obvious risks (a customer can be using the product fine while their primary contact has gone silent). Two: score with reasoning, not just a number. The CSM has to be able to read the top three contributing factors in 5 seconds. Three: route by category. Healthy gets advocacy outreach, watch gets digest review, at-risk gets real-time alert plus an AI-generated save plan. Same scoring engine, three different downstream behaviors. Four: log every score with full context so you can backtest the model six months in.

When this is built right, your CSMs catch churn 30–60 days earlier than their gut would, save-plan prep time drops from 90 minutes to under 10, and your retention improves 4–8 percentage points within a year. When it's built wrong, you've shipped an alert system that fires on noise, CSMs mute the channel, and the automation actively makes retention worse by burning out the team.

BEFORE

Quarterly health reviews

CSM team does a quarterly health review by hand — eyeballs the dashboard, picks the 20 accounts that look bad, schedules calls. By the time the review happens, the at-risk accounts have already been bleeding for two months. Three of them have already verbally agreed to evaluate a competitor. The review catches the obvious cases and misses the silent ones — the customer who's still paying but disengaged.

AFTER

Daily score with intervention triggers

Every customer scored every day. Sarah's account drops from 78 to 52 on Tuesday — usage down 60% after a key champion left, no CSM contact in 3 weeks. CSM Slack DM fires at 9am Wednesday with the score breakdown and an AI-generated save plan. CSM books a call by Wednesday afternoon. By Friday, the customer's primary contact has been replaced and the score is climbing. 30 days earlier than the quarterly review would have caught it.

FIT CHECK

Who this is for, who it isn't.

Health monitoring pays back fastest for B2B SaaS with subscription revenue and a CSM team — but only if you've got the data infrastructure to feed signals into the scorer. Without product analytics + CRM + billing data wired to the same warehouse, this automation produces noise instead of insight.

HIGH LEVERAGE FOR

Build this if any of these are true.

  • You're a B2B SaaS with subscription revenue, 200+ active customers, and at least one CSM (or PSM, or implementation manager) responsible for retention.
  • Your gross retention rate is below 92% and you can't pin down where the leakage comes from. Health monitoring surfaces the categorical patterns first.
  • You have product analytics with reliable event tracking, a CRM with active CSM workflows, and a billing system you can query (Stripe, Chargebee, etc.). Without all three, the score is incomplete.
  • You're growing customer count faster than CSM headcount. Manual health reviews stop scaling around 75 accounts per CSM; this is what extends that to 200+.
  • You've already built the customer onboarding sequence automation. Health monitoring is the natural next step downstream — they share the same product event source.
SKIP IF

Skip or wait if any of these are true.

  • You don't have product analytics yet. Build that first. A health score without usage data is reading half the patient.
  • You're under 100 customers. Manual quarterly reviews are still cheaper and more accurate at that scale; the per-customer signal volume is too low for the AI to do better than a competent CSM.
  • Your CS team isn't sized to handle the alerts this will surface. The automation is going to find more at-risk customers than the quarterly review did. If there's no CSM to action them, you've built a sadness factory.
  • Your retention is already at 95%+. The marginal gains are small and the automation is hard to justify. Spend the build budget elsewhere.
  • You're hoping this replaces CSMs. It won't. The good version makes one CSM as effective as two; it doesn't replace them.
Decision rule: If you've got 200+ customers, gross retention under 92%, and the data infrastructure to feed three signal domains into a scoring engine, this is one of the highest-leverage Tier-2 automations available. Skip if data plumbing isn't ready or if your CS team can't absorb the alerts.
THE HONEST MATH

What this saves, by the numbers.

The savings here are mostly retained ARR from churn caught early — that line dominates everything else. CSM time saved on save-plan prep is real but smaller. Expansion revenue surfaced from healthy-tier signals is the third source, often underestimated.

UNIVERSAL FORMULA
(At-risk accounts caught × save rate × ACV) + (CSM hrs saved × loaded hourly cost) + (expansion surfaced × ACV × close rate)
At-risk accounts caught = the customers the automation flags 30–60 days earlier than the quarterly review would. Save rate = realistic 30–50% of caught at-risks (it's hard to save a customer who's already 80% out the door). Expansion surfaced = healthy-tier customers with growth signals the AE wouldn't have seen otherwise.
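The universal formula drops straight into code. A sketch; the expansion inputs (12 deals at a 50% close rate) are back-solved assumptions chosen to reproduce the small-operator figures below, not numbers from the source:

```python
def annual_savings(at_risk_caught: int, save_rate: float, acv: float,
                   csm_hours_saved: float, loaded_hourly: float,
                   expansion_surfaced: int, close_rate: float) -> float:
    """Universal formula: retained ARR + CSM time saved + expansion surfaced."""
    retained = at_risk_caught * save_rate * acv
    time_saved = csm_hours_saved * loaded_hourly
    expansion = expansion_surfaced * acv * close_rate
    return retained + time_saved + expansion

# Small-operator inputs: 36 at-risk caught/yr, 35% save rate, $4K ACV,
# 180 CSM hours at $70 loaded, hypothetical 12 expansion deals at 50% close
gross = annual_savings(36, 0.35, 4_000, 180, 70, 12, 0.5)   # $87K gross
net_year_1 = gross - 27_000                                  # minus build + tooling
```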
SMALL OPERATOR
300 customers · $4K ACV · 88% gross retention
$60K
per year saved
AT-RISK CAUGHT: 36/yr × 35% × $4K = $50K
CSM TIME: 180 hrs × $70 = $13K
EXPANSION: $24K (gross)
MINUS BUILD + TOOLING: $27K
NET YEAR 1: ~$60K
MID-SIZE
1,200 customers · $24K ACV · 90% gross retention
$240K
per year saved
AT-RISK CAUGHT: 120/yr × 40% × $24K = $1.15M (gross)
CSM TIME: 720 hrs × $80 = $58K
EXPANSION: $360K (gross)
MINUS TOOLING + OPS: $48K
NET YEAR 2+: ~$240K conservative
LARGER SCALE
8,000 customers · $96K ACV · 92% gross retention
$540K
per year saved
AT-RISK CAUGHT: 640/yr × 45% × $96K = $27.6M (gross)
CSM TIME: 3,200 hrs × $90 = $288K
EXPANSION: $2.3M (gross)
MINUS TOOLING + OPS + DATA: $120K
NET YEAR 2+: ~$540K conservative
What's not in those numbers: Compound retention impact (a customer saved in year 2 keeps paying in years 3 and 4 — typical NPV multiplier is 2.5–4×), reduced CSM burnout from clearer triage, NPS lift from earlier intervention, and second-order effects on net revenue retention from earlier expansion identification. Most operators see 2–3× the conservative numbers above by year three as the model accumulates training signal.
HOW IT WORKS

The architecture, end to end.

Customer health architecture is fundamentally different from event-driven automations — it's a daily polling loop, not a triggered pipeline. The trunk pulls three signal domains (product, engagement, commercial), the AI scores and categorizes, and three downstream paths route by tier. Healthy customers feed the advocacy and expansion engines. Watch-tier surfaces in the CSM weekly review with auto-nudges. At-risk gets real-time alerts plus a generated save-plan template.


TRUNK · DAILY POLLING + SCORING
TRIGGER
Daily polling cycle

Time-driven, not event-driven. Runs daily for every active customer at 3am local time.

02
PRODUCT
Pull usage + adoption metrics

WAU, feature adoption, time since last login, error rate. Each signal vs 30-day baseline.

03
ENGAGEMENT
Pull touchpoint + sentiment

CSM touches, NPS, sentiment from meeting notes, days since contact response.

04
COMMERCIAL
Pull billing + contract state

Failed payments, renewal proximity, downgrade events, expansion signals.

AI
AI / SCORE
Calculate health score + reason

Score 0–100, category, top 3 factors, predicted churn window, narrative for the CSM.

PATH · HEALTHY
HEALTHY
Update score + check expansion

No CSM action. Expansion signals flagged to AE. Healthy customers are where expansion revenue lives.

HEALTHY
Add to advocacy queue

Strong NPS / positive sentiment customers tagged for case studies, references, reviews, CAB.

PATH · WATCH
WATCH
Surface in CSM weekly review

No real-time alerts. Monday digest with trendline, factors, recommended action.

WATCH
Trigger automated nudge

CSM-branded auto-send keyed to the specific contributing factor. The only auto-send in the flow.

PATH · AT RISK
AT RISK
Real-time CSM alert

Slack DM to CSM (and lead for accounts above ARR threshold) within 60 sec of score change.

AT RISK
Generate save-plan template

AI-generated plan specific to the contributing factors. Saves CSM 90 min of prep per account.

OUTPUT
OUTPUT
Log to score history

Every score logged. 6 months of history turns this from a tool into a retention playbook.
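The trunk reduces to a small daily loop: pull three domains, score, route, log. A sketch with every integration injected as a callable; the function names are illustrative, not a real API:

```python
def run_daily_cycle(customers, pull_product, pull_engagement, pull_commercial,
                    score_fn, route_fn, log_fn):
    """One daily polling pass: signals -> score -> route -> log.
    All integrations are injected callables (hypothetical, stand-ins for your stack)."""
    for customer in customers:
        signals = {
            "product": pull_product(customer),        # WAU, adoption, last login
            "engagement": pull_engagement(customer),  # touches, NPS, sentiment
            "commercial": pull_commercial(customer),  # billing, renewal, downgrades
        }
        result = score_fn(customer, signals)          # score, category, top factors
        route_fn(result)                              # healthy / watch / at_risk paths
        log_fn(customer, signals, result)             # full snapshot for backtesting
```

Because this is time-driven rather than event-driven, the loop runs for every active customer whether or not anything changed; the routing layer is what keeps that from turning into noise.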

TOOLS YOU'LL USE

Stack combinations that actually work.

Three stack combinations cover most builds. The decision usually comes down to where your data already lives — if you've got a warehouse, build there; if you don't, use a vertical SaaS that brings its own data layer. The vertical SaaS option is faster to ship but less flexible at scale.

COMBO 1
Vitally + Salesforce + Amplitude
$640–$1,800/mo

Tradeoff: Fastest to ship. Vitally handles the score engine + CSM workflow + alerting natively. Salesforce keeps the CRM record canonical. Amplitude provides the product-event source. Hits a ceiling when you need custom scoring logic — vertical SaaS lets you tune weights but not radically change the model.

COMBO 2
HubSpot + dbt + Make.com + Claude
$320–$900/mo

Tradeoff: The HubSpot stack with custom scoring. dbt models the signals daily, Make orchestrates, Claude scores. Cheaper than Vitally and gives you full control over the scoring logic and prompt. Higher build complexity. Best for HubSpot-native shops with at least a part-time data engineer.

COMBO 3
Self-hosted: Postgres + n8n + Claude
$160–$520/mo

Tradeoff: Cheapest at scale. Self-hosted n8n on a $40/mo server, Postgres for score history, Claude Sonnet for the scoring prompt (~$0.30/customer/month at 8,000 customers). Best for compliance-heavy industries that can't ship customer data to a third-party SaaS. Highest build complexity.

MINIMUM VIABLE STACK
HubSpot Pro + Make.com + GPT-4o-mini

Cheapest viable. HubSpot Professional for the CRM and CSM workflows, Make.com for orchestration ($30/mo), GPT-4o-mini for scoring (~$0.10/customer/month). Skip the dbt layer for v1 — write the SQL in Make. Validates the model before investing in proper data infrastructure. About $300/mo for a 200-customer business.

PRODUCTION-GRADE STACK
Vitally + Salesforce + Amplitude + Slack

Production-grade for 1,000+ customers. Vitally Enterprise (~$800/mo), Salesforce Enterprise, Amplitude Growth tier, Slack Enterprise Grid. About $1,500–$2,400/mo all-in. Adds the white-glove vendor-managed experience and supports the customer health team workflows out of the box.

THE BUILD PATH

How to actually build this.

Six steps from zero to a production health monitor. The biggest mistake teams make is shipping the at-risk alerts before they've validated the scoring model — you fire 40 false-positive alerts in week one, the CSM team mutes the channel, and the automation never recovers trust.

01

Define what at-risk means for your business

Pin down what 'at-risk' actually predicts. Pull two years of churned customer data. For each, document what their usage looked like 60 days before churn, what their engagement looked like, what their commercial state looked like. The patterns that show up across 70%+ of churned customers are your at-risk signals.

What's at risk: Defining at-risk by gut feel. Without backtesting against actual churned customers, you're scoring on hypothesis, not data. The CSMs will know within a month and stop trusting the score.
ESTIMATE 5–10 days
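The pattern-mining step is a frequency count over churned-customer signal profiles. A sketch; the signal names in the example are hypothetical:

```python
from collections import Counter

def at_risk_signals(churned_profiles, threshold=0.7):
    """Return signals present in >= threshold of churned customers at T-60 days.

    churned_profiles: one set of observed signal names per churned customer,
    e.g. {"usage_drop_50pct", "no_csm_contact_21d"} (hypothetical names).
    """
    counts = Counter(sig for profile in churned_profiles for sig in profile)
    n = len(churned_profiles)
    # Signals that recur across most churners become the at-risk definition
    return {sig for sig, c in counts.items() if c / n >= threshold}
```

The output of this step is the at-risk definition that gets embedded in the scoring prompt in step 03.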
02

Get all three signal sources into one place

Product analytics, CRM, and billing data have to be queryable from one engine. If you have a data warehouse, build dbt models that join them. If you don't, build the daily extract job that pulls from each source into a Postgres table. Without unified data, the AI can't read the full picture per customer.

What's at risk: Stale data. If product analytics is 24 hours behind and billing is real-time, the score is calculating against inconsistent snapshots. Make freshness explicit — every signal table needs an as-of-date column.
ESTIMATE 8–14 days
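The as-of-date discipline can be enforced with a skew check before each scoring run. A sketch, with the 6-hour window as an illustrative default:

```python
from datetime import timedelta

def consistent_snapshot(signal_tables, max_skew_hours=6):
    """Return the common as-of time across signal tables, or None if skew is too large.

    signal_tables: dict of table name -> as-of datetime (from each table's
    as-of-date column). Scoring should run against the oldest as-of so all
    three domains describe the same moment in time.
    """
    times = list(signal_tables.values())
    oldest, newest = min(times), max(times)
    if newest - oldest > timedelta(hours=max_skew_hours):
        return None  # inconsistent snapshot: skip the run, alert the data team
    return oldest
```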
03

Build the scoring prompt + validate against history

Write the AI scoring prompt with your at-risk definition embedded. Expected output: score 0–100, category, top 3 contributing factors with weights, predicted churn window. Validate against the last 18 months of customers — would the model have caught the ones who actually churned? Aim for 70%+ recall before going live. Iterate on the prompt until you hit it.

What's at risk: Prompt that scores well on the validation set but degrades on new data. Hold out the most recent 3 months of customers from the validation set as a true test — the prompt should perform similarly there.
ESTIMATE 7–14 days
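The recall target plus the holdout check can be sketched as two small functions; the 10-point allowed gap between validation and holdout recall is an assumed tolerance:

```python
def recall(flagged_ids, churned_ids):
    """Fraction of actually-churned customers the model flagged as at-risk."""
    churned = set(churned_ids)
    if not churned:
        return 0.0
    return len(set(flagged_ids) & churned) / len(churned)

def validate(val_flags, val_churned, holdout_flags, holdout_churned,
             target=0.70, max_gap=0.10):
    """Pass only if recall hits target on the validation set AND holds up on the
    held-out most-recent 3 months (the gap guards against an overfit prompt)."""
    r_val = recall(val_flags, val_churned)
    r_hold = recall(holdout_flags, holdout_churned)
    return r_val >= target and (r_val - r_hold) <= max_gap
```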
04

Build the three routing paths

Healthy: write to CRM, check expansion signals, queue advocacy candidates. Watch: weekly review digest, automated nudge from CSM email. At-risk: real-time Slack alert, save-plan template generation. Build them in order of business risk — at-risk first (revenue impact), watch second, healthy last.

What's at risk: Alert fatigue at the at-risk tier. The first version will fire too many alerts. Add a CSM-feedback loop that lets them mark alerts as 'real' or 'noise' — the false-positive rate is what you tune the threshold against.
ESTIMATE 8–12 days
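The three routing paths reduce to one dispatch function. A sketch; all side-effecting actions are injected callables standing in for the Slack, CRM, and email integrations:

```python
def route(result, slack_alert, save_plan, weekly_digest, nudge, crm_update, advocacy):
    """Dispatch one scored customer down exactly one of the three paths.
    Every action argument is a hypothetical integration callable."""
    if result["category"] == "at_risk":
        slack_alert(result)      # real-time DM to the owning CSM
        save_plan(result)        # draft save plan from the contributing factors
    elif result["category"] == "watch":
        weekly_digest(result)    # Monday digest entry, no real-time alert
        nudge(result)            # factor-keyed automated nudge
    else:  # healthy
        crm_update(result)       # write score, check expansion signals
        advocacy(result)         # queue strong-NPS accounts for references
```

Building at-risk first means `slack_alert` and `save_plan` ship while the other four callables are still no-ops.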
05

Wire up score history + observability

Every score, category, and contributing factor written to a score history table with full snapshot of the input signals. Build the observability dashboard: score distribution, category transitions over time, alert volume, alert-to-action conversion rate, model accuracy over time. Without this, you can't tune anything.

What's at risk: Score history without snapshot signals. If you log only the score and not the inputs that produced it, you can't backtest model changes — you'd be retroactively scoring against current data, not historical data.
ESTIMATE 4–7 days
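A minimal score-history table, sketched here with SQLite for portability (production would use Postgres per Combo 3). The point is the last column: the raw signal snapshot is stored alongside the score:

```python
import json
import sqlite3

def init_history(conn):
    conn.execute("""
        CREATE TABLE IF NOT EXISTS score_history (
            customer_id TEXT NOT NULL,
            scored_at   TEXT NOT NULL,
            score       INTEGER NOT NULL,
            category    TEXT NOT NULL,
            factors     TEXT NOT NULL,  -- top contributing factors, JSON
            signals     TEXT NOT NULL   -- full input snapshot, JSON
        )""")

def log_score(conn, customer_id, scored_at, score, category, factors, signals):
    """Log the score AND the raw signals that produced it. Without the snapshot
    you cannot replay model changes against history."""
    conn.execute("INSERT INTO score_history VALUES (?, ?, ?, ?, ?, ?)",
                 (customer_id, scored_at, score, category,
                  json.dumps(factors), json.dumps(signals)))
```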
06

Run quarterly model audits

Last step is ongoing: every quarter, audit the scoring model against actual outcomes. Of the customers marked at-risk in Q1, how many churned? How many did you save? Of the customers marked healthy in Q1 who churned, what signals did the model miss? Update the prompt or the at-risk definition based on what the audit surfaces.

What's at risk: Skipping the audit and assuming the model is still calibrated. Customer behavior shifts (new product features, pricing changes, market conditions) — the model has to evolve too. Without quarterly audits, accuracy decays silently.
ESTIMATE 3–5 days · ongoing
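The quarterly audit questions map directly to set arithmetic over one quarter's flags and outcomes. A sketch:

```python
def quarterly_audit(flagged_at_risk, saved, churned):
    """Answer the four audit questions for one quarter.
    Inputs are customer-id collections; outcome data comes from CRM + billing."""
    flagged, saved, churned = set(flagged_at_risk), set(saved), set(churned)
    return {
        "flagged_then_churned": len(flagged & churned),    # caught but not saved
        "flagged_then_saved": len(flagged & saved),        # interventions that worked
        "missed_churners": sorted(churned - flagged),      # study these for missing signals
        "false_positives": len(flagged - churned - saved), # noise to tune out
    }
```

The `missed_churners` list is the input to the next prompt revision; the `false_positives` count is what you tune alert thresholds against.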
TOTAL BUILD TIME 4–8 weeks · 1 builder + 1 data engineer
COMMON ISSUES & FIXES

Where this fails in real deployments.

Five failure modes that wreck health-score systems in production. Every team that builds this has hit at least three.

01

Alert fatigue kills the channel

Week one of go-live, the at-risk alert fires 50 times across the team. Half of them turn out to be false positives — customers who briefly dropped usage during a vacation, or accounts the CSM was already actively working. By week three, CSMs are muting the alert channel. By week six, real at-risk alerts are being missed in muted channels.

How to avoid: Tier the alerts by ARR and confidence. High-ARR + high-confidence at-risk = Slack DM to the CSM directly. Medium = goes to the weekly digest. Low = gets logged but not surfaced. Add a CSM-feedback button on every alert ('this was real' / 'this was noise') and use the feedback to tune thresholds. Cap the maximum daily alert volume per CSM at 5.
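The tiering-plus-cap rule can be sketched as a single decision function; the ARR and confidence thresholds here are illustrative placeholders:

```python
def alert_channel(arr, confidence, sent_today, daily_cap=5,
                  arr_threshold=50_000, conf_threshold=0.8):
    """Decide where one at-risk alert goes: 'dm', 'digest', or 'log'.
    Threshold values are assumptions; tune against CSM real/noise feedback."""
    if sent_today >= daily_cap:
        return "digest"  # cap hit: overflow to the digest, never drop silently
    if arr >= arr_threshold and confidence >= conf_threshold:
        return "dm"      # high-ARR + high-confidence: direct Slack DM
    if arr >= arr_threshold or confidence >= conf_threshold:
        return "digest"  # medium: weekly digest
    return "log"         # low: logged but not surfaced
```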
02

The model misses the silent churners

Customer with steady usage, no support tickets, no obvious risk signals. They churn anyway because their primary contact left for a competitor and the new champion never engaged. The model never flagged them because all the surface signals looked fine. Six months in, you realize the model has 80% recall on loud churners and 20% recall on silent ones.

How to avoid: Add an 'engagement breadth' signal that tracks how many distinct people from the account are using the product. A drop from 8 to 1 logged-in user matters even if total usage is steady. Add 'champion change' detection: if your primary contact stops responding while a new contact starts engaging, that's a yellow flag worth catching.
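The engagement-breadth signal is cheap to compute from per-week distinct-user counts. A sketch, with the 50% drop ratio as an assumed threshold:

```python
def breadth_collapsed(active_users_by_week, drop_ratio=0.5):
    """Flag a silent-churn risk when distinct weekly users collapse, even if
    total usage holds steady.

    active_users_by_week: distinct logged-in users per week, oldest first.
    drop_ratio is an illustrative threshold, not a prescribed value.
    """
    if len(active_users_by_week) < 2:
        return False
    baseline = max(active_users_by_week[:-1])  # best recent breadth
    current = active_users_by_week[-1]
    return baseline > 0 and current <= baseline * drop_ratio
```

An account going from 8 distinct users to 1 trips this even when the remaining user's activity keeps total usage flat.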
03

The score becomes a vanity metric

Six months in, every CS team meeting starts with 'health scores are up.' Average score across the customer base is climbing. But gross retention isn't improving. Eventually you realize the CSMs have been gaming the score — closing tickets faster, scheduling pro-forma check-ins, reaching out just before each scoring run. The score is a measurement, not an outcome.

How to avoid: Never make the score itself a CSM KPI. The KPI is gross retention rate. The score is a tool to drive that KPI. Audit the relationship quarterly — if scores are climbing without retention climbing, the model has been gamed and needs revision. Penalize signals that are easy to manipulate (last-touch dates, ticket close rates) by weighting them lower than usage signals.
04

Stale signal data triggers wrong alerts

Product analytics pipeline breaks Monday morning. By Tuesday the daily score is calculating against 36-hour-old usage data while billing is real-time. Several customers show as 'usage dropped to zero' — but they were just blocked by the data lag. Real-time at-risk alerts fire to CSMs about customers who are actually fine. CSMs reach out, customers are confused, trust degrades.

How to avoid: Every signal source needs a freshness check before scoring runs. If product analytics is more than 6 hours stale, skip the daily run for affected customers and alert the data team. Score-history table should record which signals were available — never silently score against stale data.
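The freshness gate can be sketched as a pre-scoring filter; the 6-hour window mirrors the rule above, and the per-customer update map is a simplifying assumption:

```python
from datetime import datetime, timedelta

def fresh_enough(last_updated, now, max_age=timedelta(hours=6)):
    """Gate a signal source before scoring: stale input must skip the run,
    never score silently against it."""
    return now - last_updated <= max_age

def runnable_customers(customers, product_updated_at, now):
    """Return only customers whose product-analytics extract is fresh.
    The rest are skipped for today (and the data team alerted upstream)."""
    return [c for c in customers
            if fresh_enough(product_updated_at.get(c, datetime.min), now)]
```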
05

Save-plan templates feel generic

AI-generated save plans go out and the first customer responds 'this is clearly a template — you don't actually understand our business.' Looks worse than no outreach at all. CSMs stop using the templates, fall back to writing from scratch, and the 90-minutes-saved benefit evaporates.

How to avoid: The save plan is a draft for the CSM, not an outbound message. Pull customer-specific context into every paragraph: their actual usage drop with numbers, their actual contributing factors, the actual case study from a similar customer. The CSM reviews and personalizes for 5 minutes before sending — not 90, not 0. Auto-send is never enabled at the at-risk tier.
DIY VS HIRE

Build it yourself, or get help.

This is a Tier-2 build because it requires real data infrastructure, not just orchestration. The hardest part isn't the AI scoring — it's getting product, engagement, and commercial signals into one queryable place. If your data is already unified, the build is faster than it looks.

DO IT YOURSELF

Build it yourself

If you have a data engineer or a working data warehouse already.

SKILL RevOps + part-time data engineer. SQL fluency required. Comfortable with Make/n8n, dbt or equivalent transformation layer, and prompt engineering. CRM workflow knowledge for routing.
TIME 120–200 hours of build over 4–8 calendar weeks, plus 6–10 hours per quarter for model audits and prompt tuning ongoing.
CASH COST $0 in services. Tooling adds $160–$900/mo depending on customer count and stack. Data warehouse if you don't have one adds $100–$500/mo.
RISK Underestimating the data unification work. If your product analytics, CRM, and billing data don't already join cleanly, that's the biggest part of the build — not the AI scoring.
HIRE A PARTNER

Hire a partner

If retention is bleeding right now and you don't have data engineering capacity.

SCOPE Full design + build of the health monitor including at-risk definition workshop, data unification across product/CRM/billing sources, scoring prompt with backtesting against historical churn, three routing paths, save-plan generator, observability dashboard, and a 90-day tuning playbook.
TIMELINE 5–9 weeks from contract signed to fully shipped. 30-day stabilization where the partner monitors model accuracy and tunes thresholds.
CASH COST $22K–$60K project cost depending on data complexity and CRM choice. Higher end if data unification is part of scope (most teams need this).
PAYBACK 3–8 months for most B2B SaaS doing $10M+ ARR with retention under 92%. Faster if you've already had named at-risk accounts churn that you wish you'd caught earlier.
BEFORE YOU REACH OUT

Want to get in touch with a partner to build this for you? Run the free audit first. It gives any partner the context they need on your business — your stack, your volume, your highest-leverage automation — so the first conversation is about scope, not discovery.

Run the free audit
Decision rule: If you have a working data warehouse and a data engineer, build it yourself — the work is more about discipline than expertise. If your data is fragmented across systems or retention is bleeding now, hire a partner. The data unification work is what makes this a hard DIY without engineering capacity.
YOUR STACK, AUDITED

Want to know if this is the highest-leverage automation for your business?

Run a free audit. We'll tell you what would save you the most money — even if it isn't this one.

No credit card. No follow-up call unless you ask.