Daily ops
The short list of things to check each morning, the cron jobs that run overnight, and what to do when an alert fires.
Drafted from planning · v0.1
Morning scan (5 minutes)
Open the Kvick aggregate dashboard (planned). For each shop:
- 5xx rate yesterday: should be 0% or near
- Daily reconciliation status: should be clean
- Audit chain verification: should be verified
- Cron runs: should show daily run completed
- AI spend MTD: should be under cap
If everything's green, move on with your day. The dashboard is for "confirm fine," not for live monitoring.
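The morning checklist can be expressed as a small pure function, which may be useful once the dashboard exists. This is a minimal sketch only: the `ShopStatus` shape, its field names, and the `NEAR_ZERO_5XX` tolerance are assumptions for illustration, since the dashboard is still planned and the source does not define "near 0%".

```typescript
// Hypothetical shape of one shop's row on the aggregate dashboard.
interface ShopStatus {
  slug: string;
  fiveXxRateYesterday: number; // fraction of requests, e.g. 0.002 means 0.2%
  reconciliation: "clean" | "anomalies";
  auditChain: "verified" | "failed";
  dailyCronCompleted: boolean;
  aiSpendMtdCents: number;
  aiCapCents: number;
}

// Assumed tolerance for "0% or near"; the runbook does not give a number.
const NEAR_ZERO_5XX = 0.001;

// Returns the names of the checks that are not green for one shop.
// An empty array means "move on with your day".
function morningScan(s: ShopStatus): string[] {
  const problems: string[] = [];
  if (s.fiveXxRateYesterday > NEAR_ZERO_5XX) problems.push("5xx rate");
  if (s.reconciliation !== "clean") problems.push("reconciliation");
  if (s.auditChain !== "verified") problems.push("audit chain");
  if (!s.dailyCronCompleted) problems.push("daily cron");
  if (s.aiSpendMtdCents >= s.aiCapCents) problems.push("AI spend");
  return problems;
}
```

Returning the list of failing checks, rather than a single boolean, keeps the "confirm fine" use case cheap while still saying where to look when something is not green.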
What the overnight crons do
Daily 00:00 PT — reconciliation
For each shop:
- Compare transactions (status='paid') for yesterday vs Stripe charges
- Run audit-chain verification end-to-end
- Generate low-stock report
- Write summary to shop_config.last_daily_run
- Email summary to shop owner (if any anomalies)
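The comparison step of the daily reconciliation could look roughly like the sketch below. It assumes each paid transaction stores the ID of its Stripe charge and that amounts are held in cents; the type names, field names, and the `reconcile` and `isClean` helpers are hypothetical, not taken from the actual schema.

```typescript
// Hypothetical row from the transactions table (status = 'paid', yesterday).
interface PaidTransaction {
  stripeChargeId: string;
  amountCents: number;
}

// Hypothetical minimal view of a Stripe charge for the same day.
interface StripeCharge {
  id: string;
  amountCents: number;
}

interface ReconciliationResult {
  missingInStripe: string[];  // paid locally, no matching charge
  missingLocally: string[];   // charged in Stripe, no paid transaction
  amountMismatches: string[]; // both exist, amounts differ
}

function reconcile(
  transactions: PaidTransaction[],
  charges: StripeCharge[],
): ReconciliationResult {
  const chargeById = new Map<string, StripeCharge>();
  for (const c of charges) chargeById.set(c.id, c);

  const matched = new Set<string>();
  const result: ReconciliationResult = {
    missingInStripe: [],
    missingLocally: [],
    amountMismatches: [],
  };

  for (const t of transactions) {
    const charge = chargeById.get(t.stripeChargeId);
    if (charge === undefined) {
      result.missingInStripe.push(t.stripeChargeId);
      continue;
    }
    matched.add(charge.id);
    if (charge.amountCents !== t.amountCents) {
      result.amountMismatches.push(charge.id);
    }
  }

  for (const c of charges) {
    if (!matched.has(c.id)) result.missingLocally.push(c.id);
  }
  return result;
}

// "clean" here means no discrepancy of any kind.
function isClean(r: ReconciliationResult): boolean {
  return (
    r.missingInStripe.length === 0 &&
    r.missingLocally.length === 0 &&
    r.amountMismatches.length === 0
  );
}
```

Under these assumptions, a result that is not clean is what would trigger the anomaly email to the shop owner.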
Weekly Mondays 06:00 PT — owner digest
For each shop:
- Compile last week's: revenue, top mechanic, top customer, top SKU
- Email to shop owner
- Archive previous week's audit logs to R2 if past 90-day threshold
Monthly 1st 02:00 PT — close
For each shop:
- Generate sales-tax CSV (GST + PST breakouts)
- Generate accountant-friendly P&L summary
- Archive audit_events older than 90 days to R2
- Truncate D1 audit_events past 90 days
- Email package to shop owner + their accountant
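As an illustration of the sales-tax export, the sketch below builds a CSV with separate GST and PST columns and a totals row. The column layout, the `TaxLine` shape, and the assumption that tax amounts are already stored per transaction in cents are all hypothetical; the real export format should follow whatever the accountant expects.

```typescript
// Hypothetical per-transaction tax record, amounts in cents.
interface TaxLine {
  date: string; // ISO date, e.g. "2024-01-02"
  subtotalCents: number;
  gstCents: number;
  pstCents: number;
}

// Formats integer cents as a decimal dollar string with two places.
function formatCents(cents: number): string {
  return (cents / 100).toFixed(2);
}

// Builds the CSV text: header, one row per line item, then a TOTAL row.
function salesTaxCsv(lines: TaxLine[]): string {
  const rows: string[] = ["date,subtotal,gst,pst"];
  let subtotal = 0;
  let gst = 0;
  let pst = 0;

  for (const l of lines) {
    rows.push(
      [l.date, formatCents(l.subtotalCents), formatCents(l.gstCents), formatCents(l.pstCents)].join(","),
    );
    subtotal += l.subtotalCents;
    gst += l.gstCents;
    pst += l.pstCents;
  }

  rows.push(["TOTAL", formatCents(subtotal), formatCents(gst), formatCents(pst)].join(","));
  return rows.join("\n");
}
```

Summing in integer cents and formatting only at the end avoids floating-point drift in the totals.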
Alerts and responses
| Alert | What it means | What to do |
|---|---|---|
| 5xx rate > 1% sustained | Worker bug or downstream issue | Check wrangler tail, identify error class, deploy fix |
| Stripe webhook signature failure 3+/hr | Secret may be rotated; webhook config may be wrong | Verify the secret in Cloudflare matches Stripe dashboard |
| D1 latency p95 > 200ms 10min sustained | Index missing or D1 health issue | Check D1 status; run EXPLAIN QUERY PLAN on slow queries |
| Audit chain verification failed | High-priority | Investigate immediately; freeze deploys; see disaster recovery |
| Twilio cost > 150% of budget | Shop sending more SMS than expected | Contact shop owner; check for marketing-campaign over-send |
| AI cost > monthly cap | Shop has hit budget | Worker pauses AI automatically; email owner |
| Cron didn't run | Expected start + 30min, no end log | Check Cloudflare cron status; manually re-run if needed |
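The "Cron didn't run" rule in the table (expected start plus 30 minutes with no end log) is simple enough to sketch directly. The `CronRun` shape and function name below are assumptions; how start and end times are actually logged is not specified here.

```typescript
// Grace period from the alert table: expected start + 30 minutes.
const CRON_GRACE_MS = 30 * 60 * 1000;

// Hypothetical record of one scheduled run.
interface CronRun {
  expectedStart: Date;
  endLoggedAt: Date | null; // null until the run writes its end log
}

// True when the grace period has passed and no end log has been written.
function cronMissed(run: CronRun, now: Date): boolean {
  if (run.endLoggedAt !== null) return false;
  return now.getTime() > run.expectedStart.getTime() + CRON_GRACE_MS;
}
```

Note that this treats a run that started but never finished the same as one that never started, which matches the "no end log" wording of the alert.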
Manual cron re-run
If a cron didn't run:
# From any developer machine (bash; assumes CRON_SECRET is exported in the shell)
curl -X POST https://helm-{slug}.kvick.bike/cron/daily-reconciliation \
  -H "X-Cron-Secret: $CRON_SECRET"
The Worker exposes cron-handler endpoints that mirror the scheduled handlers. They are gated by a shared secret so that only Kvick can call them.
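One way the secret gate might work is sketched below. It is not the Worker's actual code: the function names and the choice of status codes are assumptions, and the comparison is a hand-rolled constant-time check so that the example has no runtime dependencies.

```typescript
// Compares two strings without exiting early on the first differing
// character. The length check does reveal whether the lengths match.
function constantTimeEqual(a: string, b: string): boolean {
  if (a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    diff |= a.charCodeAt(i) ^ b.charCodeAt(i);
  }
  return diff === 0;
}

// Decides the HTTP status for a manual cron request.
// `provided` is the X-Cron-Secret header value (null if absent);
// `expected` is the configured secret.
function cronGateStatus(
  method: string,
  provided: string | null,
  expected: string,
): 200 | 401 | 405 {
  if (method !== "POST") return 405;
  // Refuse everything if the secret is unset, so an empty header
  // can never match an empty configuration value.
  if (expected === "" || provided === null) return 401;
  return constantTimeEqual(provided, expected) ? 200 : 401;
}
```

In a Worker's fetch handler, `provided` would typically come from `request.headers.get("X-Cron-Secret")`, and `expected` from a secret binding (assumed here to be named `CRON_SECRET`, matching the variable in the curl command).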
Investigating a 5xx spike
- wrangler tail --env {slug}: stream logs
- Filter to 5xx: tail's filter UI, or grep the export
- Identify the error_class
- If it's a known class, fix and deploy
- If novel, get the request_id, look up the user, replicate locally
- Audit chain shows what mutations were attempted; useful when ops state is corrupt
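When working from an exported log rather than the live tail, grouping 5xx entries by error class makes the dominant failure obvious. The sketch below assumes the export is newline-delimited JSON with `status`, `error_class`, and `request_id` fields; that format is an assumption for illustration and may not match the real export.

```typescript
// Hypothetical shape of one exported log line.
interface LogEntry {
  status?: number;
  error_class?: string;
  request_id?: string;
}

interface ErrorClassSummary {
  errorClass: string;
  count: number;
  sampleRequestId: string | null; // first request_id seen for this class
}

// Groups 5xx entries by error_class, most frequent first.
// Blank lines and lines that are not valid JSON are skipped.
function summarize5xx(jsonLines: string): ErrorClassSummary[] {
  const byClass = new Map<string, ErrorClassSummary>();

  for (const line of jsonLines.split("\n")) {
    if (line.trim() === "") continue;

    let entry: LogEntry | null;
    try {
      entry = JSON.parse(line) as LogEntry | null;
    } catch {
      continue;
    }
    if (entry === null || typeof entry.status !== "number") continue;
    if (entry.status < 500 || entry.status > 599) continue;

    const errorClass = entry.error_class ?? "unknown";
    const existing = byClass.get(errorClass);
    if (existing !== undefined) {
      existing.count += 1;
    } else {
      byClass.set(errorClass, {
        errorClass,
        count: 1,
        sampleRequestId: entry.request_id ?? null,
      });
    }
  }

  return Array.from(byClass.values()).sort((a, b) => b.count - a.count);
}
```

The sample request ID per class gives a starting point for the "look up the user, replicate locally" step when the class is novel.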
Reading the audit log
For investigations, the audit log is the source of truth. See audit-everything for query patterns.
A common pattern:
SELECT e.at, e.staff_label, m.summary
FROM audit_events e JOIN audit_mutations m ON m.event_id = e.id
WHERE e.at > datetime('now', '-1 day')
ORDER BY e.at DESC;
Weekly owner contact
Once a week, send the owner a short email:
- Summary of any blips and what we did about them
- Heads-up on planned changes (schema migrations, new features)
- "Anything you need?"
Quiet weeks: 1-2 sentence email. Busy weeks: a paragraph. Never radio silence — owners hate that.