Daily ops

The short list of things to check each morning, the cron jobs that run overnight, and what to do when an alert fires.

Drafted from planning · v0.1

Morning scan (5 minutes)

Open the Kvick aggregate dashboard (planned). For each shop:

  • 5xx rate yesterday — should be 0% or near
  • Daily reconciliation status — should be clean
  • Audit chain verification — should be verified
  • Cron runs — should show daily run completed
  • AI spend MTD — should be under cap

If everything's green, move on with your day. The dashboard is for "confirm fine," not for live monitoring.
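The five checks above can be expressed as a single pass/fail function per shop. This is a minimal sketch, not the dashboard's implementation (the dashboard is still planned): the `ShopStatus` shape, its field names, and the `NEAR_ZERO_5XX` tolerance are all assumptions made for illustration.

```typescript
// Hypothetical shape of one shop's dashboard row; every field name is an assumption.
interface ShopStatus {
  slug: string;
  fiveXxRateYesterday: number; // fraction of requests, e.g. 0.002 means 0.2%
  reconciliationClean: boolean;
  auditChainVerified: boolean;
  dailyCronCompleted: boolean;
  aiSpendMtd: number; // same currency unit as aiCap
  aiCap: number;
}

// Assumed tolerance for "0% or near"; the runbook does not give a number.
const NEAR_ZERO_5XX = 0.001;

// Returns the names of the failed checks; an empty list means the shop is green.
function morningScan(s: ShopStatus): string[] {
  const problems: string[] = [];
  if (s.fiveXxRateYesterday > NEAR_ZERO_5XX) problems.push("5xx rate");
  if (!s.reconciliationClean) problems.push("reconciliation");
  if (!s.auditChainVerified) problems.push("audit chain");
  if (!s.dailyCronCompleted) problems.push("cron run");
  if (s.aiSpendMtd >= s.aiCap) problems.push("AI spend");
  return problems;
}
```

Returning the list of failed checks rather than a single boolean keeps the "confirm fine" case trivial (empty list) while still saying where to look when a shop is not green.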

What the overnight crons do

Daily 00:00 PT — reconciliation

For each shop:

  • Compare transactions (status='paid') for yesterday vs Stripe charges
  • Run audit-chain verification end-to-end
  • Generate low-stock report
  • Write summary to shop_config.last_daily_run
  • Email summary to shop owner (if any anomalies)
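The transaction-versus-Stripe comparison in the first step could look roughly like the sketch below. It assumes each paid transaction row stores the id of its Stripe charge and an amount in the smallest currency unit (which is how Stripe reports amounts); the type names, field names, and the `reconcile` function are hypothetical, and fetching the two lists from D1 and Stripe is left out.

```typescript
// Hypothetical row shapes; the real schema is not shown in this runbook.
interface PaidTransaction { id: string; stripeChargeId: string; amountCents: number; }
interface StripeCharge { id: string; amountCents: number; }

interface ReconciliationResult {
  missingInStripe: string[];  // transaction ids with no matching charge
  amountMismatches: string[]; // transaction ids whose amount differs from the charge
  unmatchedCharges: string[]; // Stripe charge ids with no paid transaction
}

function reconcile(txns: PaidTransaction[], charges: StripeCharge[]): ReconciliationResult {
  const chargesById = new Map<string, StripeCharge>();
  for (const c of charges) chargesById.set(c.id, c);

  const matched = new Set<string>();
  const result: ReconciliationResult = {
    missingInStripe: [], amountMismatches: [], unmatchedCharges: [],
  };

  for (const t of txns) {
    const charge = chargesById.get(t.stripeChargeId);
    if (charge === undefined) {
      result.missingInStripe.push(t.id);
      continue;
    }
    matched.add(charge.id);
    if (charge.amountCents !== t.amountCents) result.amountMismatches.push(t.id);
  }
  // Anything Stripe charged that no paid transaction points at.
  for (const c of charges) {
    if (!matched.has(c.id)) result.unmatchedCharges.push(c.id);
  }
  return result;
}
```

Under this sketch, a "clean" reconciliation is one where all three lists come back empty; anything else would count as an anomaly worth emailing.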

Weekly Mondays 06:00 PT — owner digest

For each shop:

  • Compile last week's revenue, top mechanic, top customer, and top SKU
  • Email to shop owner
  • Archive previous week's audit logs to R2 if past 90-day threshold

Monthly 1st 02:00 PT — close

For each shop:

  • Generate sales-tax CSV (GST + PST breakouts)
  • Generate accountant-friendly P&L summary
  • Archive audit_events older than 90 days to R2
  • Delete audit_events rows older than 90 days from D1, once archived
  • Email package to shop owner + their accountant
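As an illustration of the sales-tax CSV step, here is a minimal sketch that writes one row per transaction with separate GST and PST columns plus a totals row. It assumes tax amounts are already stored per transaction in cents, so no tax rates are computed here. The `TaxLine` shape, the column names, and `salesTaxCsv` are hypothetical, and the code assumes ids and dates contain no commas or quotes (no CSV escaping is done).

```typescript
// Hypothetical per-transaction tax record; field names are assumptions.
interface TaxLine {
  transactionId: string;
  paidOn: string; // ISO date, e.g. "2024-01-05"
  subtotalCents: number;
  gstCents: number;
  pstCents: number;
}

// Format integer cents as a fixed two-decimal amount.
const dollars = (cents: number): string => (cents / 100).toFixed(2);

function salesTaxCsv(lines: TaxLine[]): string {
  const rows: string[] = ["transaction_id,paid_on,subtotal,gst,pst,total"];
  let subtotal = 0;
  let gst = 0;
  let pst = 0;
  for (const l of lines) {
    const total = l.subtotalCents + l.gstCents + l.pstCents;
    rows.push([
      l.transactionId, l.paidOn,
      dollars(l.subtotalCents), dollars(l.gstCents), dollars(l.pstCents), dollars(total),
    ].join(","));
    subtotal += l.subtotalCents;
    gst += l.gstCents;
    pst += l.pstCents;
  }
  // Totals are summed in integer cents to avoid floating-point drift.
  rows.push(["TOTAL", "", dollars(subtotal), dollars(gst), dollars(pst), dollars(subtotal + gst + pst)].join(","));
  return rows.join("\n");
}
```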

Alerts and responses

| Alert | What it means | What to do |
|---|---|---|
| 5xx rate > 1% sustained | Worker bug or downstream issue | Check wrangler tail, identify error class, deploy fix |
| Stripe webhook signature failure 3+/hr | Secret may be rotated; webhook config may be wrong | Verify the secret in Cloudflare matches Stripe dashboard |
| D1 latency p95 > 200ms 10min sustained | Index missing or D1 health issue | Check D1 status; run EXPLAIN QUERY PLAN on slow queries |
| Audit chain verification failed | High-priority | Investigate immediately; freeze deploys; see disaster recovery |
| Twilio cost > 150% of budget | Shop sending more SMS than expected | Contact shop owner; check for marketing-campaign over-send |
| AI cost > monthly cap | Shop has hit budget | Worker pauses AI automatically; email owner |
| Cron didn't run | Expected start + 30min, no end log | Check Cloudflare cron status; manually re-run if needed |
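Three of the alert conditions above reduce to simple threshold checks. The sketch below shows them as pure functions under stated assumptions: the function names and inputs are hypothetical, the "sustained" windows for the 5xx and D1 alerts are not modelled, and how the counts and costs are collected is out of scope.

```typescript
// "Expected start + 30min, no end log" from the alert table.
const CRON_GRACE_MS = 30 * 60 * 1000;

// True when the grace period has passed and no end-of-run log entry exists.
function cronIsOverdue(expectedStart: Date, endLoggedAt: Date | null, now: Date): boolean {
  return endLoggedAt === null && now.getTime() > expectedStart.getTime() + CRON_GRACE_MS;
}

// "Twilio cost > 150% of budget": strictly greater than 1.5x the budget.
function twilioOverBudget(costMtd: number, budget: number): boolean {
  return costMtd > 1.5 * budget;
}

// "Stripe webhook signature failure 3+/hr": three or more in the last hour.
function webhookSignatureAlert(failuresLastHour: number): boolean {
  return failuresLastHour >= 3;
}
```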

Manual cron re-run

If a cron didn't run:

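# From any developer machine (bash/zsh); replace {slug} with the shop's slug
# and make sure CRON_SECRET is set in your environment
curl -X POST "https://helm-{slug}.kvick.bike/cron/daily-reconciliation" \
  -H "X-Cron-Secret: $CRON_SECRET"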

The Worker exposes cron-handler endpoints that mirror the scheduled handlers. They are secret-gated so that only Kvick can call them.
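One way the Worker might check the X-Cron-Secret header is sketched below. This is an assumption about the approach, not the Worker's actual code: `constantTimeEqual` and `isAuthorizedCronRequest` are hypothetical names, and the comparison merely avoids an early exit on the first mismatching character (JavaScript engines give no hard constant-time guarantee).

```typescript
// Compare two strings without returning early on the first difference.
function constantTimeEqual(a: string, b: string): boolean {
  let diff = a.length ^ b.length;
  const len = Math.max(a.length, b.length);
  for (let i = 0; i < len; i++) {
    // charCodeAt returns NaN past the end of a string; `| 0` coerces that to 0.
    diff |= (a.charCodeAt(i) | 0) ^ (b.charCodeAt(i) | 0);
  }
  return diff === 0;
}

// headerValue is what request.headers.get("X-Cron-Secret") would return.
function isAuthorizedCronRequest(headerValue: string | null, secret: string): boolean {
  // Refuse everything if the secret is unset, so a missing binding never opens the endpoint.
  if (secret.length === 0 || headerValue === null) return false;
  return constantTimeEqual(headerValue, secret);
}
```

A handler would typically respond 401 or 403 when this returns false and only then dispatch to the same function the scheduled trigger calls.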

Investigating a 5xx spike

  1. wrangler tail --env {slug} — stream logs
  2. Filter to 5xx: tail's filter UI, or grep the export
  3. Identify the error_class
  4. If it's a known class, fix and deploy
  5. If it's novel, get the request_id, look up the user, and reproduce locally
  6. Check the audit chain to see which mutations were attempted; this helps when ops state is corrupt
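Steps 2 and 3 amount to filtering an exported log to 5xx responses and counting by error_class. The sketch below assumes the export has already been parsed into records with `status`, `error_class`, and `request_id` fields; that record shape and the `countErrorClasses` helper are hypothetical, since the Worker's log format is not shown here.

```typescript
// Hypothetical parsed log record; only error_class and request_id are named in this runbook.
interface LogRecord {
  status: number;
  error_class?: string;
  request_id: string;
}

// Count 5xx responses per error_class so the dominant failure stands out.
function countErrorClasses(records: LogRecord[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const r of records) {
    if (r.status < 500 || r.status > 599) continue;
    const key = r.error_class ?? "unknown";
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}
```

A class that dominates the counts is usually the one to fix first; records that land under "unknown" are candidates for the novel-error path in step 5.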

Reading the audit log

For investigations, the audit log is the source of truth. See audit-everything for query patterns.

A common pattern:

SELECT e.at, e.staff_label, m.summary
FROM audit_events e JOIN audit_mutations m ON m.event_id = e.id
WHERE e.at > datetime('now', '-1 day')
ORDER BY e.at DESC;

Weekly owner contact

Once a week, send the owner a short email:

  • Summary of any blips and what we did about them
  • Heads-up on planned changes (schema migrations, new features)
  • "Anything you need?"

Quiet weeks: 1-2 sentence email. Busy weeks: a paragraph. Never radio silence — owners hate that.

See also