Security model

How Helm protects shop data — who can sign in, what they can see, how mutations are recorded, where secrets live.

v0.3 — two-layer auth live in production (2026-05-13)

Layer 1 (Google OAuth → device_sessions, 30-day sliding) and Layer 2 (PIN → staff_sessions, 60-second idle) are both wired and live at the deployed Worker. Audit-event/mutation writes are live on every mutating endpoint. API-level gating (refusing anonymous callers on endpoints) and the daily chain-hash verification cron are the remaining slice-1 follow-ons.

Threat model in one paragraph

Each Helm deployment serves one shop on a public Worker URL (helm-{shop}.kvick.bike or, for Swicked today, mockup-only-swicked-helm.solitary-poetry-5fcc.workers.dev). The attack surface is that URL, the operator browsers signing in from inside the shop, and the secrets stored in Cloudflare for outbound API calls. Realistic threats: a former employee whose access wasn't revoked, a credential-stuffing bot pounding the login endpoint, a misconfigured CORS policy that lets a malicious site read API responses, and a compromised Stripe key. The audit trail exists so that even when a threat lands, we can reconstruct what happened.

Authentication — two layers

ADR-0026 is the design. Summary:

| | Layer 1: device | Layer 2: staff |
|---|---|---|
| Asks | Is this browser allowed to use Helm? | Which staff is at the till right now? |
| Mechanism | Google OAuth → email allowlist → device_sessions row + helm_device cookie | PBKDF2-SHA256 5-digit PIN → staff_sessions row + helm_session cookie |
| TTL | 30-day sliding (refreshed on every authenticated request) | Configurable idle lockout (default 60 s, three-tier resolvable), 12-hour hard expiry |
| Friction | One time per device per 30 days | Whatever idle interval the Owner / role / staff has set |

A request hits the Worker. The Worker checks helm_device first, then helm_session:

  • No helm_device → 302 redirect to GET /api/auth/google/start
  • Valid helm_device but no helm_session → render the PIN overlay
  • Both valid → handle the request
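The three branches above can be sketched as a single decision function. This is a minimal illustration, not the Worker's actual code: `gateRequest` and its two boolean inputs are hypothetical names, and the booleans are assumed to come from lookups against device_sessions and staff_sessions.

```javascript
// Hypothetical helper: decides what to do with a request based on which
// session cookies resolved to live rows.
function gateRequest({ deviceValid, staffValid }) {
  if (!deviceValid) {
    // Layer 1 missing or expired: send the browser through Google OAuth.
    return { action: "redirect", status: 302, location: "/api/auth/google/start" };
  }
  if (!staffValid) {
    // The device is trusted but nobody is signed in at the till.
    return { action: "pin_overlay" };
  }
  // Both layers are satisfied: hand the request to the router.
  return { action: "handle" };
}
```

Note that the device check comes first, so a request with a stale helm_session but no helm_device still lands on the OAuth redirect.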

Layer 1 — Google OAuth + device sessions

  • Each shop has a per-shop allowlist of authorized Google emails in shop_config.authorized_google_emails (JSON array, lowercased). Day-one Swicked allowlist: tanderson1963@gmail.com, james@swicked.com, chenoa@swicked.com, hello@kvick.ca.
  • OAuth flow: Authorization Code with PKCE.
    • GET /api/auth/google/start — Worker generates state + PKCE code-verifier (stored short-term in a helm_oauth cookie), redirects to Google's authorization endpoint
    • GET /api/auth/google/callback — Worker validates state, exchanges code for tokens, checks the email's lowercase form against authorized_google_emails, mints a device_sessions row + helm_device cookie, redirects to /
  • The helm_device cookie is HttpOnly, SameSite=Lax, Secure, Max-Age=2592000 (30 days). Server-side state, not a JWT.
  • Every authenticated request slides device_sessions.last_seen_at and expires_at forward. A 30+ day idle browser falls back to OAuth.
  • Sign-out: POST /api/auth/google/signout marks the row revoked (revoked_at, revoked_reason).
  • Admin allowlist management: GET/POST/DELETE /api/auth/allowlist (Sys Admin only).
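A rough sketch of three pieces from the list above: the allowlist check, the helm_device cookie attributes, and the sliding expiry. The function names are illustrative, `Path=/` is an assumption (the text does not state a path), and timestamps are assumed to be epoch milliseconds.

```javascript
const DEVICE_TTL_SECONDS = 2592000; // 30 days, matching the cookie's Max-Age

// True when the Google account's email is on the shop's allowlist.
// `allowlistJson` stands in for shop_config.authorized_google_emails,
// which the text describes as a lowercased JSON array.
function isEmailAllowed(email, allowlistJson) {
  const allowlist = JSON.parse(allowlistJson);
  return allowlist.includes(email.trim().toLowerCase());
}

// Builds a Set-Cookie value with the attributes listed above.
// Path=/ is an assumption, not something the text specifies.
function deviceCookie(sessionId) {
  return `helm_device=${sessionId}; HttpOnly; SameSite=Lax; Secure; Path=/; Max-Age=${DEVICE_TTL_SECONDS}`;
}

// Sliding TTL: each authenticated request pushes expires_at 30 days past now.
function slideDeviceSession(row, nowMs) {
  return { ...row, last_seen_at: nowMs, expires_at: nowMs + DEVICE_TTL_SECONDS * 1000 };
}
```

Because the cookie carries only an opaque session id, revoking the row server-side is enough to invalidate the browser; nothing in the cookie itself needs to expire early.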

Layer 2 — per-staff PIN (ADR-0013)

  • Each staff row has pin_hash (PBKDF2-SHA256, 100K iterations, 16-byte random salt) and pin_salt
  • 5-digit PINs — fast to type during register-speed handoffs
  • Sign-in: POST /api/auth/login with {pin} (only callable on a request that already has a valid helm_device cookie)
  • Lockout: 5 wrong attempts in 60 seconds → 5-minute lockout. Per-IP rate-limit is the planned defence-in-depth follow-on.
  • No "Sign in" button: the operator app opens with a lockout overlay.
  • Idle timeout is three-tier resolvable (migration 018, 2026-05-14): COALESCE(staff.idle_lockout_seconds, roles.idle_lockout_seconds, shop_config.idle_lockout_seconds).
    • NULL = inherit from the layer below
    • 0 = never lock (e.g., a Sys Admin diagnosing without re-entering the PIN every minute)
    • 15..3600 = seconds
  • CHECK constraints enforce the range so an Owner can't pick a 5-second lockout that would interrupt every transaction. The resolved value is exposed on /api/auth/me; the frontend uses it instead of a hard-coded constant.
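The PIN hashing parameters above (PBKDF2-SHA256, 100K iterations, 16-byte random salt) could look roughly like this. The sketch uses Node's crypto module so it runs standalone and synchronously; inside a Worker the same derivation would more likely go through Web Crypto. The 32-byte output length and hex storage are assumptions, as are the function names.

```javascript
const { pbkdf2Sync, randomBytes, timingSafeEqual } = require("node:crypto");

const ITERATIONS = 100000; // "100K iterations" from the text
const KEY_BYTES = 32;      // assumed output length
const SALT_BYTES = 16;     // 16-byte random salt from the text

// Derives the values stored in staff.pin_hash and staff.pin_salt (hex is assumed).
function hashPin(pin, salt = randomBytes(SALT_BYTES)) {
  const hash = pbkdf2Sync(pin, salt, ITERATIONS, KEY_BYTES, "sha256");
  return { pin_hash: hash.toString("hex"), pin_salt: salt.toString("hex") };
}

// Re-derives the hash from the submitted PIN and compares in constant time.
function verifyPin(pin, row) {
  if (!/^\d{5}$/.test(pin)) return false; // only 5-digit PINs are valid
  const expected = Buffer.from(row.pin_hash, "hex");
  const salt = Buffer.from(row.pin_salt, "hex");
  const actual = pbkdf2Sync(pin, salt, ITERATIONS, KEY_BYTES, "sha256");
  return actual.length === expected.length && timingSafeEqual(actual, expected);
}
```

The per-row salt means two staff members with the same PIN still get different pin_hash values.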

A 5-digit PIN has only 100,000 possible values, which is brute-forceable in seconds without rate limits. With Layer 1 in place, though, an attacker can't even reach the PIN screen without an allowlisted Google identity.
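To put rough numbers on the lockout rule (5 wrong attempts in 60 seconds, then a 5-minute lockout), here is a small sketch. It assumes the 60-second window is a sliding one and that the failure counter cannot be sidestepped; the text does not say what the counter is keyed on, so treat the result as an order-of-magnitude estimate. All names are illustrative.

```javascript
const MAX_ATTEMPTS = 5;
const WINDOW_MS = 60 * 1000;
const LOCKOUT_MS = 5 * 60 * 1000;

// `failures` is a list of timestamps (ms) of wrong-PIN attempts.
// Returns the time until which sign-in is locked, or null when not locked.
function lockedUntil(failures, nowMs) {
  const sorted = [...failures].sort((a, b) => a - b);
  for (let i = sorted.length - 1; i >= MAX_ATTEMPTS - 1; i--) {
    const newest = sorted[i];
    const oldest = sorted[i - (MAX_ATTEMPTS - 1)];
    // Five failures inside one window trigger a lockout from the newest one.
    if (newest - oldest <= WINDOW_MS && nowMs < newest + LOCKOUT_MS) {
      return newest + LOCKOUT_MS;
    }
  }
  return null;
}

// An attacker who never trips the lockout can place at most four guesses in
// any 60-second window, i.e. one guess every 15 seconds.
function daysToExhaustKeyspace(keyspace = 100000) {
  const msPerGuess = WINDOW_MS / (MAX_ATTEMPTS - 1);
  return (keyspace * msPerGuess) / (24 * 60 * 60 * 1000);
}

console.log(daysToExhaustKeyspace().toFixed(1)); // about 17.4 days for all 100,000 PINs
```

Under these assumptions the lockout stretches an exhaustive search from seconds to weeks, which is why the planned per-IP rate limit is described as defence in depth rather than the primary control.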

The admin reset code

A 6-digit ADMIN_RESET_CODE (currently 466687 in src/index.js) still does two things, after Layer 1 succeeds:

  1. PIN reset: entered in the PIN-reset modal, it lets the operator set a new PIN for any staff member
  2. Direct sign-in: entered at the lockout overlay, it signs the operator in as the auto-created Sys Admin staff row

It does not bypass Google OAuth. A browser with no helm_device cookie sees the OAuth redirect before any PIN UI shows.

Session management

Two cookies, two tables:

helm_device → device_sessions row. 30-day sliding TTL. One row per browser-per-OAuth-login. Persists across PIN sign-outs.

helm_session → staff_sessions row. Idle lockout (60 seconds by default, configurable per staff, role, or shop), 12-hour hard expiry. One row per active till sign-in.

Both are server-side state. Revocation:

  • Device: mark device_sessions.revoked_at = now() → that browser hits OAuth next request
  • Staff: DELETE FROM staff_sessions WHERE id = ? or wait for idle
  • Mass device revoke (e.g., employee left): remove their email from shop_config.authorized_google_emails; a future sweep cron revokes all matching device_sessions rows automatically.
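The liveness and revocation rules for both tables can be sketched as pure functions. This is an illustration under assumptions: timestamps are epoch milliseconds, staff_sessions rows carry created_at and last_seen_at, and device_sessions rows carry the email that signed in. None of those column details are confirmed by the text beyond the names it mentions.

```javascript
const HARD_EXPIRY_MS = 12 * 60 * 60 * 1000; // 12-hour hard expiry for staff sessions

// Mirrors COALESCE(staff, role, shop): the first non-null value wins.
// `??` is used rather than `||` so that 0 ("never lock") counts as a value.
function resolveIdleSeconds(staffValue, roleValue, shopValue) {
  return staffValue ?? roleValue ?? shopValue;
}

// A device session is live when it is neither revoked nor past expires_at.
function isDeviceSessionLive(row, nowMs) {
  return row.revoked_at == null && nowMs < row.expires_at;
}

// A staff session is live when it is inside both the hard and idle limits.
function isStaffSessionLive(row, idleSeconds, nowMs) {
  if (nowMs - row.created_at >= HARD_EXPIRY_MS) return false;
  if (idleSeconds === 0) return true; // 0 disables the idle lock only
  return nowMs - row.last_seen_at < idleSeconds * 1000;
}

// What a sweep could select: unrevoked rows whose email left the allowlist.
function devicesToRevoke(deviceRows, allowlist) {
  const allowed = new Set(allowlist.map((e) => e.toLowerCase()));
  return deviceRows.filter((r) => r.revoked_at == null && !allowed.has(r.email.toLowerCase()));
}
```

In this sketch the hard expiry still applies when the idle lockout is set to 0, which matches the table's description of the two limits as separate.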

Authorization — role + per-staff overrides

Three layers stack:

  1. Built-in screen list (screens table) — Today, Sales, Customers, Service, Inventory, Trades, Rentals, Orders, Reports, Settings
  2. Role defaults (role_permissions table) — for each role × screen, the default can_see. Roles: Sys Admin, Owner, Sales, Mechanic, Junior, Service Lead.
  3. Per-staff overrides (staff_screen_permissions table) — for each staff × screen, an explicit can_see that wins over the role default. NULL = "use role default."

Resolved permission: COALESCE(staff_override, role_default).
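A minimal sketch of that resolution in application code, assuming the role defaults and one staff member's overrides have already been loaded into plain objects. The object shapes and function names are illustrative, and treating a missing role default as "not visible" is an assumption.

```javascript
// roleDefaults:   { [role]: { [screen]: boolean } }   (from role_permissions)
// staffOverrides: { [screen]: boolean | null }        (from staff_screen_permissions)
function canSee(screen, role, roleDefaults, staffOverrides = {}) {
  const override = staffOverrides[screen];
  // Assumption: a screen with no role default row is hidden.
  const roleDefault = roleDefaults[role]?.[screen] ?? false;
  // null or undefined override means "use role default".
  return override ?? roleDefault;
}

// Filters the built-in screen list down to what this staff member can see.
function visibleScreens(screens, role, roleDefaults, staffOverrides) {
  return screens.filter((s) => canSee(s, role, roleDefaults, staffOverrides));
}
```

An explicit `false` override hides a screen the role would normally show, which is why the check uses `??` and not `||`.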

The Settings → Staff & permissions UI exposes both the role-default and per-staff layers. Toggling a tab on a staff card writes to staff_screen_permissions for that staff member only. "Reset to role defaults" clears all override rows for that staff member.

See permission model for the table shapes.

API authorization — currently gap, soon hardened

Today: Auth identifies who's calling but most endpoints do not refuse anonymous callers. This is a deliberate slice-1 gap, listed prominently in current state. Frontend nav-tab gating from the resolved screen permissions is the next concrete piece.

Soon: A middleware before the router checks the resolved staff+permissions and gates each endpoint per the screen its data belongs to:

  • GET /api/customers/* requires screen:customers visibility
  • POST /api/tickets/* requires screen:service
  • Settings → /api/staff/* requires screen:settings AND role:owner|sys_admin

A small map from URL pattern to required screen lives at the top of src/index.js.
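One possible shape for that map and the check in front of the router is sketched below. The rule format, the `authorize` name, the status codes, and the choice to let unmapped paths through are all assumptions made for illustration; the screen and role identifiers follow the examples above.

```javascript
// Hypothetical rule table: the first matching rule wins.
const ENDPOINT_RULES = [
  { method: "GET", prefix: "/api/customers", screen: "customers" },
  { method: "POST", prefix: "/api/tickets", screen: "service" },
  { method: "*", prefix: "/api/staff", screen: "settings", roles: ["owner", "sys_admin"] },
];

// `staff` is null for anonymous callers, otherwise { role, screens } where
// `screens` is the already-resolved list of visible screens.
function authorize(method, path, staff) {
  const rule = ENDPOINT_RULES.find(
    (r) => (r.method === "*" || r.method === method) && path.startsWith(r.prefix)
  );
  // Assumption: paths without a rule are not gated by this middleware.
  if (!rule) return { allowed: true };
  if (!staff) return { allowed: false, status: 401 };
  if (!staff.screens.includes(rule.screen)) return { allowed: false, status: 403 };
  if (rule.roles && !rule.roles.includes(staff.role)) return { allowed: false, status: 403 };
  return { allowed: true };
}
```

A hardened version might prefer to deny unmapped API paths by default, so that a newly added endpoint is closed until someone adds a rule for it.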

Audit trail (live)

Every mutating request writes two rows through the recordMutation helper:

audit_events — high-level: id, chain_hash, prev_chain_hash, actor_id, actor_label, action ('customer.update'), target_table, target_id, at, ip, user_agent, request_id, summary

chain_hash = sha256(prev_chain_hash || canonical_json(this_event)), where || is concatenation. Tampering with any past row invalidates that row's hash and every hash after it. The daily chain-verify cron (not yet wired) will rehash the chain and report breaks.
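The hash chain and the verification pass can be sketched as follows. Several details are assumptions because the text does not pin them down: canonical JSON is taken to mean recursively sorted object keys, the hashed event is taken to be every column except the two hash columns, and the first row's prev_chain_hash is taken to be an empty string. Node's crypto module is used so the sketch runs standalone.

```javascript
const { createHash } = require("node:crypto");

// Assumed canonical form: JSON with object keys sorted recursively.
// Values are expected to be JSON-safe (no undefined, no functions).
function canonicalJson(value) {
  if (Array.isArray(value)) return `[${value.map(canonicalJson).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const body = Object.keys(value)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${canonicalJson(value[k])}`);
    return `{${body.join(",")}}`;
  }
  return JSON.stringify(value);
}

// sha256(prev_chain_hash || canonical_json(event)), hex-encoded.
function chainHash(prevChainHash, event) {
  return createHash("sha256").update(prevChainHash + canonicalJson(event)).digest("hex");
}

// Walks rows in insertion order; returns the index of the first row whose
// stored hashes do not match a recomputation, or -1 when the chain is intact.
function firstBrokenLink(rows, genesis = "") {
  let prev = genesis;
  for (let i = 0; i < rows.length; i++) {
    const { chain_hash, prev_chain_hash, ...event } = rows[i];
    if (prev_chain_hash !== prev || chain_hash !== chainHash(prev, event)) return i;
    prev = chain_hash;
  }
  return -1;
}
```

Sorting keys matters because two JSON serializations of the same event with different key order would otherwise hash differently and look like tampering.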

audit_mutations — detail: id, event_id (FK to audit_events), table_name, row_id, before_json, after_json, diff_json

The wrapper around mutating endpoints captures the row state pre- and post-mutation. Diff is precomputed at write time; storage is the snapshots + diff.
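As an illustration of the snapshot-plus-diff idea, the sketch below computes a shallow column-level diff and assembles an audit_mutations-shaped object. The diff format (`{ column: { before, after } }`) and the helper names are assumptions; the text does not describe how recordMutation structures diff_json.

```javascript
// Shallow diff of two row snapshots: one entry per changed column.
// A null snapshot represents "row did not exist" (insert or delete).
function diffRows(before, after) {
  const diff = {};
  const keys = new Set([...Object.keys(before ?? {}), ...Object.keys(after ?? {})]);
  for (const key of keys) {
    const b = before?.[key] ?? null;
    const a = after?.[key] ?? null;
    if (JSON.stringify(b) !== JSON.stringify(a)) diff[key] = { before: b, after: a };
  }
  return diff;
}

// Assembles the detail row; `id` is left to the database in this sketch.
function buildMutationRow(eventId, tableName, rowId, before, after) {
  return {
    event_id: eventId,
    table_name: tableName,
    row_id: rowId,
    before_json: JSON.stringify(before),
    after_json: JSON.stringify(after),
    diff_json: JSON.stringify(diffRows(before, after)),
  };
}
```

Precomputing the diff at write time keeps audit views cheap to render, at the cost of storing information that could be derived from the two snapshots.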

See audit-everything and ADR-0005: Tamper-chain audit log.

Secrets management

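Secrets live in Cloudflare Worker bindings, not in source. The one exception this page records is the ADMIN_RESET_CODE value that currently sits in src/index.js (see the admin reset code section above):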

  • STRIPE_SECRET_KEY (per shop)
  • STRIPE_WEBHOOK_SECRET (per shop)
  • TWILIO_AUTH_TOKEN (Kvick account-level; sub-accounts via SID)
  • ANTHROPIC_API_KEY (Kvick account-level)
  • ADMIN_RESET_CODE (per shop, rotated on deploy in production)
  • SESSION_SIGNING_KEY (per shop, used to sign session tokens server-side)

Local dev uses .dev.vars (gitignored). Production uses wrangler secret put.
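Since both .dev.vars and wrangler secret put surface values on the Worker's `env` object, a startup-style check for missing bindings is easy to sketch. The helper name is hypothetical, and the assumption is that all six secrets listed above are required in every environment.

```javascript
// The binding names come from the list above.
const REQUIRED_SECRETS = [
  "STRIPE_SECRET_KEY",
  "STRIPE_WEBHOOK_SECRET",
  "TWILIO_AUTH_TOKEN",
  "ANTHROPIC_API_KEY",
  "ADMIN_RESET_CODE",
  "SESSION_SIGNING_KEY",
];

// Returns the names of secrets that are absent or empty on `env`.
function missingSecrets(env) {
  return REQUIRED_SECRETS.filter((name) => typeof env[name] !== "string" || env[name] === "");
}
```

A Worker could call this at the top of its fetch handler and fail closed when the list is non-empty, so a deploy with a missing binding does not silently run with undefined credentials.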

CORS

The operator app is served from the same Worker that serves the API. No cross-origin requests, no CORS configuration needed for the operator UI.

The public marketing site (separate Worker, separate origin) does not call any Helm API. If a future integration requires it, an explicit allowlist with Access-Control-Allow-Origin: https://swickedcycles.com would be added per endpoint.
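If such an allowlist were ever needed, a per-endpoint helper might look like the sketch below. The helper name is hypothetical; the origin is the one named above. It echoes the origin only when it is on the list and never falls back to a wildcard.

```javascript
// Origins permitted to read responses from an opted-in endpoint.
const ALLOWED_ORIGINS = new Set(["https://swickedcycles.com"]);

// Returns the CORS headers to attach for this request's Origin header,
// or an empty object when the origin is absent or not allowlisted.
function corsHeaders(requestOrigin) {
  if (!requestOrigin || !ALLOWED_ORIGINS.has(requestOrigin)) return {};
  return {
    "Access-Control-Allow-Origin": requestOrigin,
    // Tells caches that the response varies by the Origin request header.
    "Vary": "Origin",
  };
}
```

Returning no CORS headers for unknown origins leaves the browser's same-origin policy in charge, which is the default behaviour the operator UI relies on today.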

Stripe webhook verification

The Stripe webhook endpoint (POST /api/webhooks/stripe) verifies Stripe-Signature against STRIPE_WEBHOOK_SECRET before processing the event. A failing signature returns 400 and no DB writes happen. See ADR-0011.
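For reference, Stripe's signature scheme signs the string `{timestamp}.{raw body}` with HMAC-SHA256 and sends the result in the `v1` field of the Stripe-Signature header. The sketch below checks that by hand using Node's crypto module; it is illustrative only, the 300-second tolerance is an assumption, and production code would normally rely on the official Stripe library's webhook helper instead.

```javascript
const { createHmac, timingSafeEqual } = require("node:crypto");

const TOLERANCE_SECONDS = 300; // assumed replay window

// `rawBody` must be the exact request body string, before any JSON parsing.
function verifyStripeSignature(rawBody, header, secret, nowSeconds) {
  const parts = header.split(",").map((p) => p.trim().split("="));
  const timestamp = parts.find(([k]) => k === "t")?.[1];
  const signatures = parts.filter(([k]) => k === "v1").map(([, v]) => v);
  if (!timestamp || signatures.length === 0) return false;

  // Reject malformed or stale timestamps to limit replay.
  const ts = Number(timestamp);
  if (!Number.isFinite(ts) || Math.abs(nowSeconds - ts) > TOLERANCE_SECONDS) return false;

  const expected = createHmac("sha256", secret).update(`${timestamp}.${rawBody}`).digest();
  // Any matching v1 signature is accepted; comparison is constant-time.
  return signatures.some((sig) => {
    const given = Buffer.from(sig, "hex");
    return given.length === expected.length && timingSafeEqual(given, expected);
  });
}
```

A handler would return 400 when this returns false and skip all database writes, matching the behaviour described above.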

What's intentionally not in this model

  • 2FA / TOTP — the threat profile doesn't justify the operator friction. POS staff sign in 30+ times a day; a 6-digit TOTP would make the system unusable. PIN + lockout + audit is the right point on the friction curve.
  • SSO / SAML — for shops, no. Each shop's staff are local to that shop. The bible (this site) uses Cloudflare Access for SSO because it serves Kvick + maybe shop owners.
  • End-to-end encryption — the shop is the data owner and Kvick is the operator; there is no third party to encrypt against. Database-at-rest is encrypted by Cloudflare; transport is TLS.
  • DDoS protection — Cloudflare's default protections sit in front of the Worker. No additional layer needed.

See also