ADR-0015 — Idempotency keys on all external writes

  • Status: Accepted
  • Date: 2026-05-06
  • Decision-makers: Tom Anderson

Context

External API writes (Stripe charges, Twilio sends, GBP review replies, Meta posts) need to be safe under retry. A naive retry-on-network-error can double-charge a customer if the first request actually succeeded but the response got lost.

Stripe and many other modern APIs support idempotency keys: send the same key with two identical requests, and the second is a no-op that returns the original result. We just need to use them consistently, and deduplicate on our own side where a provider has no native support.
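To make the mechanism concrete, here is a toy in-memory stand-in for a provider that honors idempotency keys. It is not Stripe's implementation; `createFakeChargeApi` and the example key are hypothetical names used only for illustration.

```javascript
// Toy stand-in for an external API that honors idempotency keys.
// Assumption: results are cached by key for the lifetime of this object.
function createFakeChargeApi() {
  const resultsByKey = new Map();
  let chargesCreated = 0;

  return {
    charge(idempotencyKey, amountCents) {
      // A repeated key returns the original result and creates nothing new.
      if (resultsByKey.has(idempotencyKey)) {
        return resultsByKey.get(idempotencyKey);
      }
      chargesCreated += 1;
      const result = { chargeId: `ch_${chargesCreated}`, amountCents };
      resultsByKey.set(idempotencyKey, result);
      return result;
    },
    chargeCount: () => chargesCreated,
  };
}

const api = createFakeChargeApi();
const first = api.charge("acme_transactions_981_v1", 4200);
// Pretend the response to `first` was lost, so the caller retries.
const retry = api.charge("acme_transactions_981_v1", 4200);
```

After both calls, `retry` is the same result object as `first` and `api.chargeCount()` is 1, which is the property a retry-on-network-error path depends on.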

The pattern is well-known; the decision is to make it a hard rule rather than a suggestion.

Decision

Every Worker call to an external service that mutates state includes an idempotency key. The key is:

  • Generated when the operation is first attempted, not at retry time
  • Stored in the local record (transactions.stripe_idempotency_key, service_ticket_messages.twilio_idempotency_key, etc.)
  • Reused on every retry of that exact operation
  • Regenerated only if the operator explicitly clicks "Retry" (and the operation is logically a fresh attempt)
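The lifecycle rules above can be sketched roughly as follows. This is a minimal illustration, assuming a plain object stands in for a transactions row and that the key generator is passed in; `ensureIdempotencyKey`, `writeWithRetry`, `startOperatorRetry`, and the `attempt_n` field are hypothetical names, and a real implementation would persist the row before sending.

```javascript
// Assumption: `row` is a plain object standing in for a `transactions` record,
// and `generateKey` is whatever key generator the project uses.
function ensureIdempotencyKey(row, generateKey) {
  if (!row.stripe_idempotency_key) {
    // Generated once, when the operation is first attempted.
    row.stripe_idempotency_key = generateKey(row);
  }
  return row.stripe_idempotency_key;
}

// Automatic retries: the key is resolved once, before the loop, and reused.
async function writeWithRetry(row, generateKey, send, maxTries = 3) {
  const key = ensureIdempotencyKey(row, generateKey);
  let lastError;
  for (let i = 0; i < maxTries; i += 1) {
    try {
      return await send(row, key);
    } catch (error) {
      lastError = error; // e.g. a network error; try again with the same key
    }
  }
  throw lastError;
}

// Operator clicked "Retry": a logically fresh attempt, so a fresh key.
function startOperatorRetry(row, generateKey) {
  row.attempt_n += 1;
  row.stripe_idempotency_key = generateKey(row);
  return row.stripe_idempotency_key;
}
```

The point of the split is that only `startOperatorRetry` ever replaces a stored key; the automatic retry path can read the key but never regenerate it.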

Standard key format: {shop_slug}_{table}_{row_id}_v{attempt_n}. The v{n} allows operator-driven retries with a new attempt while keeping per-attempt idempotency.
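A small helper along these lines could produce keys in that format. It is a sketch only: the function name `buildIdempotencyKey`, its parameter names, and the validation rules are assumptions, not the project's actual helper.

```javascript
// Builds a key in the {shop_slug}_{table}_{row_id}_v{attempt_n} format.
// Assumption: attempt numbers start at 1 and only operator retries bump them.
function buildIdempotencyKey({ shopSlug, table, rowId, attemptN = 1 }) {
  if (!shopSlug || !table || rowId === undefined || rowId === null) {
    throw new TypeError("shopSlug, table, and rowId are required");
  }
  if (!Number.isInteger(attemptN) || attemptN < 1) {
    throw new RangeError("attemptN must be a positive integer");
  }
  return `${shopSlug}_${table}_${rowId}_v${attemptN}`;
}

// First attempt for transactions row 981 in the hypothetical shop "acme".
const firstAttemptKey = buildIdempotencyKey({
  shopSlug: "acme",
  table: "transactions",
  rowId: 981,
}); // "acme_transactions_981_v1"

// Operator-driven retry of the same row.
const operatorRetryKey = buildIdempotencyKey({
  shopSlug: "acme",
  table: "transactions",
  rowId: 981,
  attemptN: 2,
}); // "acme_transactions_981_v2"
```

Because the key is a pure function of the row identity and attempt number, a Worker restart that loses in-memory state can still reconstruct the same key from the stored record.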

For tool use within the AI bubble, each tool call (including D1 writes, if any) gets a key derived from the conversation ID, turn number, and tool name.
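One possible derivation is sketched below. The ADR does not specify the exact layout, so the separator, the `turn` prefix, and the name `buildToolCallKey` are assumptions.

```javascript
// Derives a deterministic key for a tool call from its position in a conversation.
// Assumption: the layout {conversationId}_turn{n}_{toolName} is illustrative only.
function buildToolCallKey(conversationId, turnNumber, toolName) {
  if (!conversationId || !toolName) {
    throw new TypeError("conversationId and toolName are required");
  }
  if (!Number.isInteger(turnNumber) || turnNumber < 0) {
    throw new RangeError("turnNumber must be a non-negative integer");
  }
  return `${conversationId}_turn${turnNumber}_${toolName}`;
}

// Replaying turn 7 of the same conversation yields the same key.
const toolKey = buildToolCallKey("conv_123", 7, "send_sms");
const replayedToolKey = buildToolCallKey("conv_123", 7, "send_sms");
```

Note that under this derivation two calls to the same tool within one turn would share a key; the ADR does not say how that case is handled.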

Consequences

Positive:

  • Network blips and Worker restarts can retry safely
  • Operator double-clicks on a "Save" button cannot double-charge
  • Webhook reprocessing (Stripe redelivers a webhook whenever we fail to return a 2xx response) is safe by design
  • Aligns with fail-quietly-recover-loudly retry semantics

Negative:

  • Every external-write code path needs to follow the pattern; missing one is a real bug
  • Tests need to verify the key behavior under retry

Mitigations:

  • The adapter helpers (src/lib/stripe.js, src/lib/twilio.js) wrap the fetch and require an idempotency key parameter; endpoints can't bypass it
  • A planned lint rule will forbid direct fetch() calls to external services from src/index.js; such calls must go through the adapters
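The adapter mitigation might look something like the sketch below. It is not the contents of src/lib/stripe.js: `createStripeAdapter`, its `post` method, and the injectable `fetchImpl` are hypothetical, though the `Idempotency-Key` header and form-encoded body follow Stripe's documented API conventions.

```javascript
// Hypothetical shape for an adapter like src/lib/stripe.js.
// Assumption: fetchImpl is injectable so tests can run without the network.
function createStripeAdapter({ secretKey, fetchImpl = globalThis.fetch }) {
  return {
    async post(path, params, idempotencyKey) {
      // Refuse to send a mutating request without a key.
      if (typeof idempotencyKey !== "string" || idempotencyKey.length === 0) {
        throw new TypeError("idempotencyKey is required for external writes");
      }
      const response = await fetchImpl(`https://api.stripe.com${path}`, {
        method: "POST",
        headers: {
          Authorization: `Bearer ${secretKey}`,
          "Content-Type": "application/x-www-form-urlencoded",
          "Idempotency-Key": idempotencyKey,
        },
        body: new URLSearchParams(params).toString(),
      });
      return response.json();
    },
  };
}
```

Making the key a required positional argument means a call site that forgets it fails on the first run rather than silently sending an unprotected write.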

Notes

Stripe webhook events carry the idempotency key of the API request that triggered them (in the event's request.idempotency_key field rather than an HTTP header), which we record in audit_events.request_id for traceability. Twilio has no native idempotency keys but returns a MessageSid for status tracking; our local idempotency key prevents duplicate sends from our side.

Server-side implementation (live 2026-05-13)

This ADR originally covered only outbound writes (Helm → Stripe/Twilio). Inbound writes (client → Helm) needed the same pattern once offline-architecture Slice 2 specified it. The implementation landed in migration 014_idempotency_records.sql plus withIdempotency(request, env, endpoint, handler) in src/index.js. See offline-architecture § Slice 2 for the full contract: header semantics, a 409 on body-hash mismatch, races resolved via a UNIQUE index, and 7-day retention. The first wrapped endpoint is POST /api/audit/manual; each new mutating endpoint opts in by wrapping its handler.

See also