Security model

How Helm protects shop data — who can sign in, what they can see, how mutations are recorded, where secrets live.

v0.3 — two-layer auth live in production (2026-05-13)

Layer 1 (Google OAuth → device_sessions, 30-day sliding) and Layer 2 (PIN → staff_sessions, 60-second idle by default) are both wired and live on the deployed Worker. Audit-event and mutation writes are live on every mutating endpoint. API-level gating (refusing anonymous callers on endpoints) and the daily chain-hash verification cron are the remaining slice-1 follow-ons.

Threat model in one paragraph

Each Helm deployment serves one shop on a public Worker URL (helm-{shop}.kvick.bike or, for Swicked today, mockup-only-swicked-helm.solitary-poetry-5fcc.workers.dev). The attack surface is that URL, the operator browsers signing in from inside the shop, and the secrets stored in Cloudflare for outbound API calls. Realistic threats: a former employee whose access wasn't revoked, a credential-stuffing bot pounding the login endpoint, a misconfigured CORS policy that lets a malicious site read API responses, a compromised Stripe key. The audit trail exists so that even when a threat lands, we can reconstruct what happened.

Authentication — two layers

ADR-0026 is the design. Summary:

| | Layer 1: device | Layer 2: staff |
| --- | --- | --- |
| Asks | Is this browser allowed to use Helm? | Which staff is at the till right now? |
| Mechanism | Google OAuth → email allowlist → `device_sessions` row + `helm_device` cookie | PBKDF2-SHA256 5-digit PIN → `staff_sessions` row + `helm_session` cookie |
| TTL | 30-day sliding (refreshed on every authenticated request) | Configurable idle lockout (default 60 s, three-tier resolvable), 12-hour hard expiry |
| Friction | One time per device per 30 days | Whatever idle interval the Owner / role / staff has set |

A request hits the Worker, which checks the helm_device cookie first and the helm_session cookie second:

- No helm_device → 302 redirect to GET /api/auth/google/start
- Valid helm_device but no helm_session → render the PIN overlay
- Both valid → handle the request
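The three outcomes above can be sketched as a single decision function. This is a minimal illustration, not the Worker's actual code: `nextAuthStep` and the shape of its return value are hypothetical, and the two arguments stand for session rows that have already been looked up and validated (null when missing or invalid).

```javascript
// Hypothetical helper: decide what to do with a request once the
// device and staff sessions have been looked up (null = missing/invalid).
function nextAuthStep(deviceSession, staffSession) {
  if (!deviceSession) {
    // No valid helm_device cookie: send the browser through Google OAuth.
    return { action: "redirect", status: 302, location: "/api/auth/google/start" };
  }
  if (!staffSession) {
    // Device is trusted but nobody is signed in at the till.
    return { action: "pin_overlay" };
  }
  // Both layers satisfied: let the router handle the request.
  return { action: "handle" };
}
```

Keeping the decision separate from cookie parsing and database lookups makes the ordering (device first, staff second) easy to test in isolation.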

Layer 1 — Google OAuth + device sessions

- Each shop has a per-shop allowlist of authorized Google emails in shop_config.authorized_google_emails (JSON array, lowercased). Day-one Swicked allowlist: tanderson1963@gmail.com, james@swicked.com, chenoa@swicked.com, hello@kvick.ca.
- OAuth flow: Authorization Code with PKCE.
  - GET /api/auth/google/start: the Worker generates state and a PKCE code verifier (stored short-term in a helm_oauth cookie), then redirects to Google's authorization endpoint.
  - GET /api/auth/google/callback: the Worker validates state, exchanges the code for tokens, checks the lowercased email against authorized_google_emails, mints a device_sessions row and a helm_device cookie, then redirects to /.
- The helm_device cookie is HttpOnly, SameSite=Lax, Secure, Max-Age=2592000 (30 days). Server-side state, not a JWT.
- Every authenticated request slides device_sessions.last_seen_at and expires_at forward. A browser idle for more than 30 days falls back to OAuth.
- Sign-out: POST /api/auth/google/signout marks the row revoked (revoked_at, revoked_reason).
- Admin allowlist management: GET/POST/DELETE /api/auth/allowlist (Sys Admin only).

Layer 2 — per-staff PIN (ADR-0013)

- Each staff row has pin_hash (PBKDF2-SHA256, 100K iterations, 16-byte random salt) and pin_salt.
- 5-digit PINs: fast to type during register-speed handoffs.
- Sign-in: POST /api/auth/login with {pin} (only callable on a request that already has a valid helm_device cookie).
- Lockout: 5 wrong attempts in 60 seconds → 5-minute lockout. Per-IP rate-limit is the planned defence-in-depth follow-on.
- No "Sign in" button: the operator app opens with a lockout overlay.
- Idle timeout is three-tier resolvable (migration 018, 2026-05-14): COALESCE(staff.idle_lockout_seconds, roles.idle_lockout_seconds, shop_config.idle_lockout_seconds).
  - Encoding: NULL = inherit from the layer below; 0 = never lock (e.g., a Sys Admin diagnosing without re-PIN'ing every minute); 15..3600 = seconds.
  - CHECK constraints enforce the range so an Owner can't pick a 5-second lockout that would interrupt every transaction.
  - The resolved value lands on /api/auth/me; the frontend uses it instead of a hard-coded constant.

A 5-digit PIN has only 100,000 possible values, a keyspace that is brute-forceable in seconds without rate limits. With Layer 1 in front, though, an attacker can't even reach the PIN screen without an allowlisted Google identity.
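The lockout rule that guards this keyspace (5 wrong attempts in 60 seconds, then a 5-minute lockout) could be tracked roughly like this. The state shape and function names are hypothetical, and the sketch leaves open what the counter is keyed on (staff member, device, or both), since the text does not say.

```javascript
const WINDOW_MS = 60 * 1000;       // attempts are counted over 60 seconds
const MAX_FAILURES = 5;            // the fifth wrong attempt triggers the lockout
const LOCKOUT_MS = 5 * 60 * 1000;  // lockout lasts 5 minutes

// state: { failures: number[] (epoch ms of recent failures), lockedUntil: number }
function recordFailedAttempt(state, nowMs) {
  // Keep only failures that are still inside the 60-second window.
  const recent = state.failures.filter((t) => nowMs - t < WINDOW_MS);
  recent.push(nowMs);
  if (recent.length >= MAX_FAILURES) {
    return { failures: [], lockedUntil: nowMs + LOCKOUT_MS };
  }
  return { failures: recent, lockedUntil: state.lockedUntil };
}

function isLockedOut(state, nowMs) {
  return nowMs < state.lockedUntil;
}
```

With this rule in place an attacker gets at most five guesses per lockout cycle, so exhausting the PIN keyspace takes on the order of weeks of sustained attempts rather than seconds.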

The admin reset code

A 6-digit ADMIN_RESET_CODE (currently 466687 in src/index.js) still does two things, after Layer 1 succeeds:

- PIN reset: entered in the PIN-reset modal, it lets the operator set a new PIN for any staff member
- Direct sign-in: entered at the lockout overlay, it signs in as the auto-created Sys Admin staff row

It does not bypass Google OAuth. A browser with no helm_device cookie sees the OAuth redirect before any PIN UI shows.

Session management

Two cookies, two tables:

helm_device → device_sessions row. 30-day sliding TTL. One row per browser-per-OAuth-login. Persists across PIN sign-outs.

helm_session → staff_sessions row. Idle lockout (60 seconds by default, configurable through the three-tier rule), 12-hour hard expiry. One row per active till sign-in.

Both are server-side state. Revocation:

Device: mark device_sessions.revoked_at = now() → that browser hits OAuth next request
Staff: DELETE FROM staff_sessions WHERE id = ? or wait for idle
Mass device revoke (e.g., employee left): remove their email from shop_config.authorized_google_emails; a future sweep cron revokes all matching device_sessions rows automatically.
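The lifetime rules for both session types can be sketched as pure functions. This is a sketch under assumptions: timestamps are taken to be epoch milliseconds, and the staff_sessions column names `created_at` and `last_seen_at` are hypothetical (the device_sessions columns `last_seen_at`, `expires_at`, and `revoked_at` come from the text).

```javascript
const DEVICE_TTL_MS = 30 * 24 * 60 * 60 * 1000; // 30-day sliding window
const STAFF_HARD_EXPIRY_MS = 12 * 60 * 60 * 1000; // 12-hour hard expiry

// A device session is usable until it is revoked or its sliding expiry passes.
function deviceSessionLive(row, nowMs) {
  return row.revoked_at == null && nowMs < row.expires_at;
}

// Called on every authenticated request: push the expiry 30 days out.
function slideDeviceSession(row, nowMs) {
  return { ...row, last_seen_at: nowMs, expires_at: nowMs + DEVICE_TTL_MS };
}

// A staff session dies at the hard expiry regardless of activity, and
// earlier if it sits idle longer than the resolved idle lockout.
function staffSessionLive(row, nowMs, idleSeconds) {
  if (nowMs - row.created_at >= STAFF_HARD_EXPIRY_MS) return false;
  if (idleSeconds === 0) return true; // 0 = never idle-lock
  return nowMs - row.last_seen_at < idleSeconds * 1000;
}
```

Because both checks read server-side rows, revoking a device is just a write to `revoked_at`; nothing has to be pushed to the browser.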

Authorization — role + per-staff overrides

Three layers stack:

Built-in screen list (screens table) — Today, Sales, Customers, Service, Inventory, Trades, Rentals, Orders, Reports, Settings
Role defaults (role_permissions table) — for each role × screen, the default can_see. Roles: Sys Admin, Owner, Sales, Mechanic, Junior, Service Lead.
Per-staff overrides (staff_screen_permissions table) — for each staff × screen, an explicit can_see that wins over the role default. NULL = "use role default."

Resolved permission: COALESCE(staff_override, role_default).
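The COALESCE rule can be illustrated with plain objects standing in for the role_permissions and staff_screen_permissions rows. `resolveScreens` is a hypothetical helper, and treating a screen with no row at either layer as hidden is an assumption of this sketch.

```javascript
// roleDefaults / staffOverrides map screen name -> can_see
// (1/0 or true/false; null or absent means "no explicit value").
function resolveScreens(screens, roleDefaults, staffOverrides) {
  const resolved = {};
  for (const screen of screens) {
    // COALESCE(staff_override, role_default), defaulting to hidden.
    resolved[screen] = Boolean(staffOverrides[screen] ?? roleDefaults[screen] ?? false);
  }
  return resolved;
}

// Usage: a staff member whose role sees Sales and Reports, with
// Reports explicitly switched off and Settings explicitly switched on.
const example = resolveScreens(
  ["sales", "reports", "settings"],
  { sales: 1, reports: 1, settings: 0 },
  { reports: 0, settings: 1 }
);
```

An explicit override of 0 must beat a role default of 1, which is why the sketch uses `??` (skips only null and undefined) rather than `||` (which would also skip 0).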

The Settings → Staff & permissions UI exposes both layers. Toggling a tab on a staff card writes to staff_screen_permissions for that staff member only. "Reset to role defaults" clears all override rows for that staff.

See permission model for the table shapes.

API authorization (currently a gap, soon hardened)

Today: Auth identifies who's calling but most endpoints do not refuse anonymous callers. This is a deliberate slice-1 gap, listed prominently in current state. Frontend nav-tab gating from the resolved screen permissions is the next concrete piece.

Soon: A middleware before the router checks the resolved staff+permissions and gates each endpoint per the screen its data belongs to:

GET /api/customers/* requires screen:customers visibility
POST /api/tickets/* requires screen:service
Settings → /api/staff/* requires screen:settings AND role:owner|sys_admin

A small map from URL pattern to required screen lives at the top of src/index.js.
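One possible shape for that map and the check built on it is sketched below. Everything here is hypothetical (rule format, function names, the caller object), and the sketch leaves unmapped routes ungated, which is an assumption; a deny-by-default policy would be a reasonable alternative once every endpoint has a rule.

```javascript
// Hypothetical rule table built from the three examples above.
const ROUTE_RULES = [
  { method: "GET", prefix: "/api/customers/", screen: "customers" },
  { method: "POST", prefix: "/api/tickets/", screen: "service" },
  { method: "*", prefix: "/api/staff/", screen: "settings", roles: ["owner", "sys_admin"] },
];

function findRule(method, path) {
  const rule = ROUTE_RULES.find(
    (r) => (r.method === "*" || r.method === method) && path.startsWith(r.prefix)
  );
  return rule ?? null;
}

// caller: null for anonymous, else { role, screens: { [screen]: boolean } }.
// Returns an HTTP status to respond with, or null when no rule applies.
function authorize(method, path, caller) {
  const rule = findRule(method, path);
  if (rule === null) return null;               // unmapped route: not gated here
  if (!caller) return 401;                      // anonymous caller
  if (!caller.screens[rule.screen]) return 403; // screen not visible to this staff
  if (rule.roles && !rule.roles.includes(caller.role)) return 403;
  return 200;
}
```

Running this before the router means individual handlers do not need their own permission checks, and the rule table doubles as a readable inventory of what each endpoint requires.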

Audit trail (live)

Every mutating request writes two rows through the recordMutation helper:

audit_events — high-level: id, chain_hash, prev_chain_hash, actor_id, actor_label, action ('customer.update'), target_table, target_id, at, ip, user_agent, request_id, summary

Each row's chain_hash is sha256(prev_chain_hash || canonical_json(this_event)), so tampering with any past row breaks verification from that row onward. The daily chain-verify cron (not yet wired) will rehash the chain and report breaks.

audit_mutations — detail: id, event_id (FK to audit_events), table_name, row_id, before_json, after_json, diff_json

The wrapper around mutating endpoints captures the row state pre- and post-mutation. Diff is precomputed at write time; storage is the snapshots + diff.

See audit-everything and ADR-0005: Tamper-chain audit log.

Secrets management

Secrets live in Cloudflare Worker bindings, never in source:

STRIPE_SECRET_KEY (per shop)
STRIPE_WEBHOOK_SECRET (per shop)
TWILIO_AUTH_TOKEN (Kvick account-level; sub-accounts via SID)
ANTHROPIC_API_KEY (Kvick account-level)
ADMIN_RESET_CODE (per shop, rotated on deploy in production)
SESSION_SIGNING_KEY (per shop, used to sign session tokens server-side)

Local dev uses .dev.vars (gitignored). Production uses wrangler secret put.

CORS

The operator app is served from the same Worker that serves the API. No cross-origin requests, no CORS configuration needed for the operator UI.

The public marketing site (separate Worker, separate origin) does not call any Helm API. If a future integration requires it, an explicit allowlist with Access-Control-Allow-Origin: https://swickedcycles.com is added per-endpoint.
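If that integration ever arrives, a per-endpoint allowlist could look something like the sketch below. It is purely illustrative: `ALLOWED_ORIGINS` and `corsHeaders` are hypothetical names, and nothing like this is described as existing today.

```javascript
// Hypothetical explicit allowlist; anything not listed gets no CORS headers.
const ALLOWED_ORIGINS = new Set(["https://swickedcycles.com"]);

// Returns the headers to add to a response for the given Origin header value.
function corsHeaders(requestOrigin) {
  if (!ALLOWED_ORIGINS.has(requestOrigin)) return {};
  return {
    "Access-Control-Allow-Origin": requestOrigin,
    // Tell caches the response varies by requesting origin.
    "Vary": "Origin",
  };
}
```

Echoing back only an exact, allowlisted origin (never `*` and never a reflected arbitrary origin) is what keeps a malicious site from reading API responses.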

Stripe webhook verification

The Stripe webhook endpoint (POST /api/webhooks/stripe) verifies Stripe-Signature against STRIPE_WEBHOOK_SECRET before processing the event. A failing signature returns 400 and no DB writes happen. See ADR-0011.

What's intentionally not in this model

2FA / TOTP — the threat profile doesn't justify the operator friction. POS staff sign in 30+ times a day; a 6-digit TOTP would make the system unusable. PIN + lockout + audit is the right point on the friction curve.
SSO / SAML — for shops, no. Each shop's staff are local to that shop. The bible (this site) uses Cloudflare Access for SSO because it serves Kvick + maybe shop owners.
End-to-end encryption — the shop is the data owner and Kvick is the operator; there is no third party to encrypt against. Database-at-rest is encrypted by Cloudflare; transport is TLS.
DDoS protection — Cloudflare's default protections sit in front of the Worker. No additional layer needed.
