Security model
How Helm protects shop data — who can sign in, what they can see, how mutations are recorded, where secrets live.
Layer 1 (Google OAuth → device_sessions, 30-day sliding) and Layer 2 (PIN → staff_sessions, 60-second idle) are both wired and live at the deployed Worker. Audit-event/mutation writes are live on every mutating endpoint. API-level gating (refusing anonymous callers on endpoints) and the daily chain-hash verification cron are the remaining slice-1 follow-ons.
Threat model in one paragraph
Each Helm deployment serves one shop on a public Worker URL (helm-{shop}.kvick.bike or, for Swicked today, mockup-only-swicked-helm.solitary-poetry-5fcc.workers.dev). The attack surface is that URL, the operator browsers signing in from inside the shop, and the secrets stored in Cloudflare for outbound API calls. Realistic threats: a former employee whose access wasn't revoked, a credential-stuffing bot pounding the login endpoint, a misconfigured CORS allowing a malicious site to read API responses, a compromised Stripe key. The audit trail exists so that even when a threat lands, we can reconstruct what happened.
Authentication — two layers
ADR-0026 is the design. Summary:
| | Layer 1 (device) | Layer 2 (staff) |
|---|---|---|
| Asks | Is this browser allowed to use Helm? | Which staff is at the till right now? |
| Mechanism | Google OAuth → email allowlist → device_sessions row + helm_device cookie | PBKDF2-SHA256 5-digit PIN → staff_sessions row + helm_session cookie |
| TTL | 30-day sliding (refreshed on every authenticated request) | Configurable idle lockout (default 60 s, three-tier resolvable), 12-hour hard expiry |
| Friction | One time per device per 30 days | Whatever idle interval the Owner / role / staff has set |
A request hits the Worker. The Worker checks helm_device:
- No helm_device → 302 redirect to GET /api/auth/google/start
- Valid helm_device but no helm_session → render the PIN overlay
- Both valid → handle the request
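The three outcomes above can be sketched as a small decision function. This is only an illustration: gateRequest and its deviceValid / staffValid inputs are hypothetical names, and the real Worker presumably derives them from the device_sessions and staff_sessions rows behind the two cookies.

```javascript
// Hypothetical helper: decide what happens to a request based on which
// sessions are valid. The caller is assumed to have already looked up the
// helm_device and helm_session cookies and reduced them to booleans.
function gateRequest({ deviceValid, staffValid }) {
  if (!deviceValid) {
    // Layer 1 missing or expired: send the browser through Google OAuth.
    return { action: "redirect", status: 302, location: "/api/auth/google/start" };
  }
  if (!staffValid) {
    // Layer 1 is fine but nobody is signed in at the till: show the PIN overlay.
    return { action: "pin_overlay" };
  }
  // Both layers valid: let the router handle the request.
  return { action: "handle" };
}
```

Note that a valid helm_session without a valid helm_device still ends in the OAuth redirect, because the device check comes first.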
Layer 1 — Google OAuth + device sessions
- Each shop has a per-shop allowlist of authorized Google emails in shop_config.authorized_google_emails (JSON array, lowercased). Day-one Swicked allowlist: tanderson1963@gmail.com, james@swicked.com, chenoa@swicked.com, hello@kvick.ca.
- OAuth flow: Authorization Code with PKCE.
  - GET /api/auth/google/start: Worker generates state + PKCE code-verifier (stored short-term in a helm_oauth cookie), redirects to Google's authorization endpoint
  - GET /api/auth/google/callback: Worker validates state, exchanges code for tokens, checks the email's lowercase form against authorized_google_emails, mints a device_sessions row + helm_device cookie, redirects to /
- The helm_device cookie is HttpOnly, SameSite=Lax, Secure, Max-Age=2592000 (30 days). Server-side state, not a JWT.
- Every authenticated request slides device_sessions.last_seen_at and expires_at forward. A 30+ day idle browser falls back to OAuth.
- Sign-out: POST /api/auth/google/signout marks the row revoked (revoked_at, revoked_reason).
- Admin allowlist management: GET/POST/DELETE /api/auth/allowlist (Sys Admin only).
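A minimal sketch of three pieces of Layer 1: the allowlist check, the cookie attributes, and the sliding expiry. The function names, the Path=/ attribute, and the millisecond timestamps are assumptions made for illustration; only the column names and cookie attributes come from the description above.

```javascript
const DEVICE_TTL_SECONDS = 30 * 24 * 60 * 60; // 2592000, the Max-Age quoted above

// allowlistJson is the raw JSON array from shop_config.authorized_google_emails.
// Entries are stored lowercased, so the incoming email is normalised first.
function isAllowlisted(email, allowlistJson) {
  const allowlist = JSON.parse(allowlistJson);
  return allowlist.includes(email.trim().toLowerCase());
}

// Build the Set-Cookie value for a freshly minted device session.
// Path=/ is an assumption; the other attributes are the documented ones.
function deviceCookie(sessionId) {
  return `helm_device=${sessionId}; HttpOnly; Secure; SameSite=Lax; Path=/; Max-Age=${DEVICE_TTL_SECONDS}`;
}

// Slide the session forward on an authenticated request.
// Timestamps are epoch milliseconds here (an assumption about storage).
function slideDeviceSession(row, nowMs) {
  return { ...row, last_seen_at: nowMs, expires_at: nowMs + DEVICE_TTL_SECONDS * 1000 };
}
```

Because the cookie only carries an opaque session id, revoking the device_sessions row is enough to lock the browser out; nothing in the cookie itself needs to be invalidated.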
Layer 2 — per-staff PIN (ADR-0013)
- Each staff row has pin_hash (PBKDF2-SHA256, 100K iterations, 16-byte random salt) and pin_salt
- 5-digit PINs: fast to type during register-speed handoffs
- Sign-in: POST /api/auth/login with {pin} (only callable on a request that already has a valid helm_device cookie)
- Lockout: 5 wrong attempts in 60 seconds → 5-minute lockout. Per-IP rate-limit is the planned defence-in-depth follow-on.
- No "Sign in" button: the operator app opens with a lockout overlay. Idle timeout is three-tier resolvable (migration 018, 2026-05-14): COALESCE(staff.idle_lockout_seconds, roles.idle_lockout_seconds, shop_config.idle_lockout_seconds). Encoding: NULL = inherit from the layer below; 0 = never lock (e.g., a Sys Admin diagnosing without re-PIN'ing every minute); 15..3600 = seconds. CHECK constraints enforce the range so an Owner can't pick a 5-second lockout that would interrupt every transaction. The resolved value lands on /api/auth/me; the frontend uses it instead of a hard-coded constant.
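The three-tier idle-timeout resolution can be mirrored in JavaScript roughly as follows. The function names are hypothetical, and the sketch assumes shop_config always carries a non-null value so that resolution never falls through.

```javascript
// First non-null value wins, mirroring SQL COALESCE(staff, role, shop).
// Note that 0 is a real value ("never lock"), not "unset".
function resolveIdleLockout(staffValue, roleValue, shopValue) {
  for (const value of [staffValue, roleValue, shopValue]) {
    if (value !== null && value !== undefined) return value;
  }
  return null; // only reachable if shop_config is unset (assumed not to happen)
}

// Mirrors the CHECK constraint: NULL (inherit), 0 (never lock), or 15..3600 seconds.
function isValidIdleLockout(value) {
  if (value === null) return true;
  return Number.isInteger(value) && (value === 0 || (value >= 15 && value <= 3600));
}

// Frontend-side check: should the lockout overlay appear now?
function shouldLock(idleSeconds, lastActivityMs, nowMs) {
  if (idleSeconds === 0) return false; // 0 = never lock
  return nowMs - lastActivityMs >= idleSeconds * 1000;
}
```

For example, a role-level 0 overrides a shop-level 60 for every staff member in that role who has no personal override, which is the case the Sys Admin example describes.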
The 5-digit PIN keyspace (100,000 values) is brute-forceable in seconds without rate limits. With Layer 1 in place, though, an attacker can't even reach the PIN screen without an allowlisted Google identity.
The admin reset code
A 6-digit ADMIN_RESET_CODE (currently 466687 in src/index.js) still does two things, after Layer 1 succeeds:
- PIN reset: entered in the PIN-reset modal, it lets the operator set a new PIN for any staff member
- Direct sign-in: entered at the lockout overlay, it signs in as the auto-created Sys Admin staff row
It does not bypass Google OAuth. A browser with no helm_device cookie sees the OAuth redirect before any PIN UI shows.
Session management
Two cookies, two tables:
helm_device → device_sessions row. 30-day sliding TTL. One row per browser-per-OAuth-login. Persists across PIN sign-outs.
helm_session → staff_sessions row. 60-second idle, 12-hour hard expiry. One row per active till sign-in.
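The two staff-session limits (idle lockout and 12-hour hard expiry) can be checked together. A sketch under assumptions: staffSessionState is a hypothetical helper, and created_at / last_seen_at are assumed column names on staff_sessions holding epoch milliseconds.

```javascript
const HARD_EXPIRY_MS = 12 * 60 * 60 * 1000; // 12-hour hard expiry

// Classify a staff session. idleSeconds is the resolved idle lockout
// (60 by default, 0 meaning "never lock").
function staffSessionState(session, idleSeconds, nowMs) {
  // Hard expiry applies regardless of activity or idle setting.
  if (nowMs - session.created_at >= HARD_EXPIRY_MS) return "expired";
  if (idleSeconds !== 0 && nowMs - session.last_seen_at >= idleSeconds * 1000) {
    return "idle_locked";
  }
  return "live";
}
```

Either non-live state only affects helm_session; the helm_device row is untouched, so the operator sees the PIN overlay rather than the OAuth redirect.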
Both are server-side state. Revocation:
- Device: mark device_sessions.revoked_at = now() → that browser hits OAuth next request
- Staff: DELETE FROM staff_sessions WHERE id = ? or wait for idle
- Mass device revoke (e.g., employee left): remove their email from shop_config.authorized_google_emails; a future sweep cron will revoke all matching device_sessions rows automatically.
Authorization — role + per-staff overrides
Three layers stack:
- Built-in screen list (screens table): Today, Sales, Customers, Service, Inventory, Trades, Rentals, Orders, Reports, Settings
- Role defaults (role_permissions table): for each role × screen, the default can_see. Roles: Sys Admin, Owner, Sales, Mechanic, Junior, Service Lead.
- Per-staff overrides (staff_screen_permissions table): for each staff × screen, an explicit can_see that wins over the role default. NULL = "use role default."
Resolved permission: COALESCE(staff_override, role_default).
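In JavaScript the same resolution is a nullish-coalescing expression. The helper names and the 0/1 encoding of can_see are assumptions for illustration.

```javascript
// COALESCE(staff_override, role_default): the override wins unless it is NULL.
// ?? is used instead of || so that an explicit 0 ("hide") is respected.
function canSee(staffOverride, roleDefault) {
  return Boolean(staffOverride ?? roleDefault);
}

// Resolve the visible screens for one staff member.
// roleDefaults and staffOverrides map screen name -> can_see (0/1);
// a screen missing from staffOverrides means "no override row".
function visibleScreens(screens, roleDefaults, staffOverrides) {
  return screens.filter((screen) =>
    canSee(staffOverrides[screen] ?? null, roleDefaults[screen] ?? 0));
}
```

For instance, a Sales staff member whose role hides Reports but who has an override row with can_see = 1 for Reports ends up seeing it, while everyone else in the role does not.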
The Settings → Staff & permissions UI exposes both the role defaults and the per-staff overrides. Toggling a tab on a staff card writes to staff_screen_permissions for that staff member only. "Reset to role defaults" clears all override rows for that staff member.
See permission model for the table shapes.
API authorization: currently a gap, soon hardened
Today: Auth identifies who's calling but most endpoints do not refuse anonymous callers. This is a deliberate slice-1 gap, listed prominently in current state. Frontend nav-tab gating from the resolved screen permissions is the next concrete piece.
Soon: A middleware before the router checks the resolved staff+permissions and gates each endpoint per the screen its data belongs to:
- GET /api/customers/* requires screen:customers visibility
- POST /api/tickets/* requires screen:service
- Settings → /api/staff/* requires screen:settings AND role:owner|sys_admin
A small map from URL pattern to required screen lives at the top of src/index.js.
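One possible shape for that map and the check that consults it is sketched below. Everything here is hypothetical: the ROUTE_RULES name, the rule format, the 401/403 status codes, and the shape of the staff object are assumptions rather than the planned implementation.

```javascript
// Hypothetical route table: the first matching rule wins.
const ROUTE_RULES = [
  { method: "GET", pattern: /^\/api\/customers\//, screen: "customers" },
  { method: "POST", pattern: /^\/api\/tickets\//, screen: "service" },
  { method: "*", pattern: /^\/api\/staff\//, screen: "settings", roles: ["owner", "sys_admin"] },
];

// staff is null for anonymous callers, otherwise
// { role: "owner", screens: ["customers", ...] } with resolved permissions.
function authorize(method, path, staff) {
  const rule = ROUTE_RULES.find(
    (r) => (r.method === "*" || r.method === method) && r.pattern.test(path));
  // Assumption: routes without a rule are not gated in this sketch.
  // A hardened version would more likely fail closed.
  if (!rule) return { ok: true };
  if (!staff) return { ok: false, status: 401 };
  if (!staff.screens.includes(rule.screen)) return { ok: false, status: 403 };
  if (rule.roles && !rule.roles.includes(staff.role)) return { ok: false, status: 403 };
  return { ok: true };
}
```

Keeping the rules in one table means the gap described under "Today" can be closed endpoint by endpoint, by adding rows rather than touching each handler.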
Audit trail (live)
Every mutating request writes two rows through the recordMutation helper:
audit_events — high-level: id, chain_hash, prev_chain_hash, actor_id, actor_label, action ('customer.update'), target_table, target_id, at, ip, user_agent, request_id, summary
Each row's chain_hash = sha256(prev_chain_hash || canonical_json(this_event)), where || denotes concatenation. Tampering with any past row breaks the chain from that row forward. The daily chain-verify cron (not yet wired) will rehash the chain and report breaks.
audit_mutations — detail: id, event_id (FK to audit_events), table_name, row_id, before_json, after_json, diff_json
The wrapper around mutating endpoints captures the row state pre- and post-mutation. Diff is precomputed at write time; storage is the snapshots + diff.
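To illustrate what "diff precomputed at write time" could look like, here is a sketch that builds one audit_mutations row from the before and after snapshots. The { from, to } diff shape, the shallow (top-level only) comparison, and the function names are assumptions; the column names come from the list above.

```javascript
// Shallow diff of two row snapshots. before is null for inserts,
// after is null for deletes. Only changed columns appear in the result.
function shallowDiff(before, after) {
  const diff = {};
  const keys = new Set([...Object.keys(before ?? {}), ...Object.keys(after ?? {})]);
  for (const key of keys) {
    const from = before?.[key];
    const to = after?.[key];
    // Compare by JSON text so nested values are compared by content.
    if (JSON.stringify(from) !== JSON.stringify(to)) {
      diff[key] = { from: from ?? null, to: to ?? null };
    }
  }
  return diff;
}

// Assemble the detail row that accompanies an audit_events row.
function buildMutationRow(eventId, tableName, rowId, before, after) {
  return {
    event_id: eventId,
    table_name: tableName,
    row_id: rowId,
    before_json: JSON.stringify(before),
    after_json: JSON.stringify(after),
    diff_json: JSON.stringify(shallowDiff(before, after)),
  };
}
```

Storing the diff alongside both snapshots costs some space but lets a reviewer read "what changed" without recomputing anything.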
See audit-everything and ADR-0005: Tamper-chain audit log.
Secrets management
Secrets live in Cloudflare Worker bindings, never in source:
- STRIPE_SECRET_KEY (per shop)
- STRIPE_WEBHOOK_SECRET (per shop)
- TWILIO_AUTH_TOKEN (Kvick account-level; sub-accounts via SID)
- ANTHROPIC_API_KEY (Kvick account-level)
- ADMIN_RESET_CODE (per shop, rotated on deploy in production)
- SESSION_SIGNING_KEY (per shop, used to sign session tokens server-side)
Local dev uses .dev.vars (gitignored). Production uses wrangler secret put.
CORS
The operator app is served from the same Worker that serves the API. No cross-origin requests, no CORS configuration needed for the operator UI.
The public marketing site (separate Worker, separate origin) does not call any Helm API. If a future integration requires it, an explicit allowlist with Access-Control-Allow-Origin: https://swickedcycles.com is added per-endpoint.
Stripe webhook verification
The Stripe webhook endpoint (POST /api/webhooks/stripe) verifies Stripe-Signature against STRIPE_WEBHOOK_SECRET before processing the event. A failing signature returns 400 and no DB writes happen. See ADR-0011.
What's intentionally not in this model
- 2FA / TOTP — the threat profile doesn't justify the operator friction. POS staff sign in 30+ times a day; a 6-digit TOTP would make the system unusable. PIN + lockout + audit is the right point on the friction curve.
- SSO / SAML — for shops, no. Each shop's staff are local to that shop. The bible (this site) uses Cloudflare Access for SSO because it serves Kvick + maybe shop owners.
- End-to-end encryption — the shop is the data owner and Kvick is the operator; there is no third party to encrypt against. Database-at-rest is encrypted by Cloudflare; transport is TLS.
- DDoS protection — Cloudflare's default protections sit in front of the Worker. No additional layer needed.
See also
- Audit-everything principle
- Identity slice
- ADR-0005: Tamper-chain audit log
- ADR-0013: PBKDF2 PIN hashing — Layer 2 detail
- ADR-0026: Google OAuth + device sessions — Layer 1 decision rationale
- Disaster recovery — what to do when a session token leaks