Portfolio · design exploration · v3 · 2026-04-12

Transcript variant D
in-article mock

v2 picked variant D for the chat transcript component (gradient border container, role colour, typography split). The open question: does each turn also need a 2px left border strip in its role colour, or do the typography split and role label carry the identity on their own?

Below is a single fake writing post showing both variants in place. The first transcript uses D1 (with left border strip). The second transcript uses D2 (without). Everything else (prose, diagram, screenshot) is identical. A brief comparison callout sits at the end.

Writing · architecture · planner-app

The four layers that made paid API calls safe to ship

Published 2026-04-10 Reading 6 min Tags architecture · spend · cloudflare

I spent three sessions planning the migration from Railway to Cloudflare Workers before I wrote a single line of the new spend module. Most of that time went to answering one question: how do I make it structurally impossible to send a paid SMS twice for the same reminder.

The problem

The planner app sends SMS reminders through Twilio on a schedule. Every SMS costs real money. Under the old Railway architecture the scheduler was a single long-running Python process; a bug in the loop would get noticed inside a minute because logs scrolled past in the terminal. Under Cloudflare, the scheduler runs as a Cron Trigger every minute, in a fresh Worker with no memory of previous invocations. A bug that retried the same send without protection could send the same reminder fifty times before I noticed.

I wrote down what "protected" had to mean, and it came out as four layers: a killswitch to stop everything, a budget cap per user, an idempotency guarantee so retries don't double-send, and only then the actual vendor call. Each layer catches a different failure mode; skip one and a bug in the others becomes expensive.

Before writing any code, I wanted the idempotency layer in particular to be right. I knew the naive "check if sent, then send" pattern had a race condition in it, but I hadn't built the mental model for what replaced it. Here's the conversation that closed the gap.

Variant D1 · with left border strip
Transcript · idempotency design · 2026-04-09 · 4 turns
Dylan
The spend module needs idempotency or the whole thing is a liability. How do I make "send twice" structurally impossible rather than probabilistically unlikely?
Claude
Insert the row before calling Twilio. Put a UNIQUE constraint on (recipient, occurrence_id, date) and let the database reject the second write. On vendor failure after the insert, delete the row or mark it failed; never leave it in limbo.
Dylan
Does this need to be in a transaction?
Claude
No. The constraint does the work. A transaction would actually hurt: it holds the row lock across the Twilio call, which might take hundreds of milliseconds.
Transcript 1 · shown with variant D1

The insert-before-send pattern

The idempotency layer turned out to be the most interesting one. The naive approach is to check "have I sent this already" before calling Twilio, but that leaves a race window: two concurrent invocations both see no prior send, both call Twilio, and both write the log row afterwards. The fix is to invert the order: first insert a row representing the intent to send, with a UNIQUE constraint on (recipient, occurrence_id, date), then call Twilio. If two invocations race, exactly one succeeds at the INSERT; the other fails cleanly with a constraint violation and exits without sending.
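A minimal sketch of the claim step, assuming Python with an in-memory SQLite database standing in for the real store. The `send_claims` table, `claim_send`, `send_reminder`, and the `send` callback are hypothetical names for illustration, not the app's actual code.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    # In-memory SQLite is a stand-in for the production database (assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE send_claims (
            recipient     TEXT NOT NULL,
            occurrence_id TEXT NOT NULL,
            date          TEXT NOT NULL,
            UNIQUE (recipient, occurrence_id, date)
        )
        """
    )
    return conn


def claim_send(conn, recipient, occurrence_id, date) -> bool:
    """Record the intent to send. Returns False if the claim already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO send_claims (recipient, occurrence_id, date) "
                "VALUES (?, ?, ?)",
                (recipient, occurrence_id, date),
            )
        return True
    except sqlite3.IntegrityError:
        # The UNIQUE constraint rejected a second claim for the same reminder.
        return False


def send_reminder(conn, recipient, occurrence_id, date, send) -> str:
    # The claim comes first; the vendor call only runs for the winner.
    if not claim_send(conn, recipient, occurrence_id, date):
        return "rejected_duplicate"
    send(recipient)  # `send` is a placeholder for the real vendor call
    return "sent"
```

Calling `send_reminder` twice with the same (recipient, occurrence_id, date) returns "sent" and then "rejected_duplicate", and the `send` callback runs only once. The decision is made by the database rather than by application code, which is why it holds under concurrency.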

The other three layers (killswitch, budget cap, vendor call) are easier to reason about individually but only work if they compose in the right order. Put the killswitch first, the budget cap second, the idempotency insert third, the vendor call fourth. Reverse any two and you get a bug that costs money instead of a bug that costs nothing.

01 · KILLSWITCH · boolean check
02 · BUDGET · atomic increment
03 · INSERT · UNIQUE constraint (constraint violation → exit silently)
04 · EXECUTE · Twilio call
defense in depth · order matters
Figure 1 · four-layer spend protection, insert before execute
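The ordering in Figure 1 can be sketched as one function. This is an illustration under stated assumptions, not the app's code: Python with in-memory SQLite, a hypothetical schema (`flags`, `budgets`, `send_claims`), and a hypothetical `guarded_send`. Refunding the reserved budget when a later layer rejects the send is an assumption of this sketch.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    # Hypothetical schema; in-memory SQLite stands in for the real database.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE flags (name TEXT PRIMARY KEY, enabled INTEGER NOT NULL);
        CREATE TABLE budgets (
            user_id     TEXT PRIMARY KEY,
            spent_cents INTEGER NOT NULL,
            cap_cents   INTEGER NOT NULL
        );
        CREATE TABLE send_claims (
            recipient     TEXT NOT NULL,
            occurrence_id TEXT NOT NULL,
            date          TEXT NOT NULL,
            UNIQUE (recipient, occurrence_id, date)
        );
        INSERT INTO flags VALUES ('sms_enabled', 1);
        """
    )
    return conn


def refund(conn, user_id, cost_cents) -> None:
    # Give back budget reserved in layer 02 when a later layer stops the send.
    with conn:
        conn.execute(
            "UPDATE budgets SET spent_cents = spent_cents - ? WHERE user_id = ?",
            (cost_cents, user_id),
        )


def guarded_send(conn, user_id, recipient, occurrence_id, date, cost_cents, send) -> str:
    key = (recipient, occurrence_id, date)

    # 01 · killswitch: a single boolean check.
    row = conn.execute(
        "SELECT enabled FROM flags WHERE name = 'sms_enabled'"
    ).fetchone()
    if row is None or row[0] != 1:
        return "killed"

    # 02 · budget: one conditional UPDATE reserves the cost or changes nothing.
    with conn:
        cur = conn.execute(
            "UPDATE budgets SET spent_cents = spent_cents + ? "
            "WHERE user_id = ? AND spent_cents + ? <= cap_cents",
            (cost_cents, user_id, cost_cents),
        )
    if cur.rowcount == 0:
        return "over_budget"

    # 03 · insert: the UNIQUE constraint rejects a duplicate claim.
    try:
        with conn:
            conn.execute("INSERT INTO send_claims VALUES (?, ?, ?)", key)
    except sqlite3.IntegrityError:
        refund(conn, user_id, cost_cents)
        return "rejected_duplicate"

    # 04 · execute: only now call the vendor (placeholder callback).
    try:
        send(recipient)
    except Exception:
        with conn:
            conn.execute(
                "DELETE FROM send_claims "
                "WHERE recipient = ? AND occurrence_id = ? AND date = ?",
                key,
            )
        refund(conn, user_id, cost_cents)
        return "failed"
    return "sent"
```

Each early return happens before the vendor call, so a rejection at any layer costs nothing.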

What this bought me

Two weeks after the migration shipped, a bug in the reminder scheduler caused the same cron to fire twice in a 60-second window. Without the idempotency layer, that would have been two SMS sends for every active reminder in the system. With it, the second invocation hit the UNIQUE constraint, logged the rejection, and exited. Zero user-visible impact, zero duplicate sends, zero budget impact. I learned about the bug from an audit log query I ran the next day.

The query that surfaced it looked like this:

audit-query.sh · planner-app
$ wrangler d1 execute planner --remote --command \
    "SELECT outcome, COUNT(*) FROM sms_log WHERE date = '2026-04-08' GROUP BY outcome"
┌──────────────────────┬──────────┐
│ outcome              │ COUNT(*) │
├──────────────────────┼──────────┤
│ sent                 │ 47       │
│ rejected_duplicate   │ 47       │
└──────────────────────┴──────────┘
[ok] 47 sends, 47 rejections caught by UNIQUE constraint
[ok] budget impact: $0.00 · user impact: none
Figure 2 · audit query output · every send matched by a rejected duplicate

What I'd do differently

Principle 2 in my workspace architecture rules (defense in depth for spend protection) came out of this work. The thing I'd emphasise more, if I wrote it again: insert-before-send has to sit directly in front of the vendor call, no earlier than the third layer and never after the send itself. If the UNIQUE constraint is downstream of the budget check and the budget check has a bug, the constraint still saves you. The reverse does not hold.

A week after shipping the pattern, I checked in with Claude about whether the approach generalised beyond SMS:

Variant D2 · no left border strip
Transcript · pattern generalisation · 2026-04-10 · 4 turns
Dylan
The cron double-fired yesterday. Idempotency layer caught it, zero user impact. I think this pattern generalises?
Claude
Yes. Any "send something external, exactly once" requirement reduces to: pre-insert with UNIQUE constraint, then call the vendor, then roll back the insert on failure. Same shape for SMS, email, webhook dispatch, and background job enqueueing. The key insight is that the constraint runs before the expensive call, not after.
Dylan
What about at-least-once vs exactly-once semantics?
Claude
This is exactly-once from the user's perspective. At-least-once semantics from the vendor are fine, because the retry will hit the same constraint and get rejected identically. The cost of one failed INSERT is trivial compared to the cost of a duplicate send.
Transcript 2 · shown with variant D2
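The generalised shape from this transcript (pre-insert, call, roll back on failure) might look like the sketch below. It assumes Python with in-memory SQLite; `dispatch_log`, `dispatch_once`, and the `kind`/`key` columns are hypothetical names, and any exception from the action is treated as a definite failure.

```python
import sqlite3
from typing import Callable


def make_ledger() -> sqlite3.Connection:
    # In-memory SQLite stands in for the real database (assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dispatch_log ("
        " kind TEXT NOT NULL,"
        " key  TEXT NOT NULL,"
        " UNIQUE (kind, key))"
    )
    return conn


def dispatch_once(conn, kind: str, key: str, action: Callable[[], None]) -> bool:
    """Run `action` at most once per (kind, key).

    Returns True if the action ran, False if an earlier claim already exists.
    Re-raises the action's exception after releasing the claim.
    """
    try:
        with conn:  # the claim is committed before the external call
            conn.execute(
                "INSERT INTO dispatch_log (kind, key) VALUES (?, ?)", (kind, key)
            )
    except sqlite3.IntegrityError:
        return False

    try:
        action()  # SMS, email, webhook, job enqueue: any external side effect
    except Exception:
        # Roll back the claim so a later retry is allowed to try again.
        with conn:
            conn.execute(
                "DELETE FROM dispatch_log WHERE kind = ? AND key = ?", (kind, key)
            )
        raise
    return True
```

Deleting the claim is only safe when the failure is unambiguous. If the vendor might have accepted the request before the error surfaced, marking the row as failed and reviewing it by hand is the more conservative of the two options named in the first transcript.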

The pattern is in the workspace architecture rules now. Every spend-touching module in this portfolio inherits it by default, and the principle number is cited in every relevant diary entry. The cost was three planning sessions and one small refactor; the return is a class of bug that can no longer happen.

Direct comparison

Both variants share: the gradient Royal → Violet container border, coloured role labels (royal for Dylan, violet for Claude), and the typography split (Dylan in Geist sans, Claude in Geist Mono). The only difference is whether each turn has a 2px solid left border strip in its role colour.

D1 · with left border strip

Scannability: higher. The left border is a second, redundant cue that survives peripheral vision and greyscale printing. Role identity is legible even if the reader skips the role label.

Visual weight: heavier. Each turn reads as its own "card within the card". More chat-UI in feel.

Best when: transcripts are long (6 or 7 turns), or embedded alongside other heavy figures where the extra cue helps.

D2 · no left border strip

Scannability: still good. Role label colour and typography split already do the identity work; the border was a belt-and-braces layer on top.

Visual weight: lighter. Reads as a conversation inside a single container rather than stacked chat bubbles. Prose-adjacent.

Best when: transcripts are short (2 to 4 turns) and sit inside a reading flow rather than standing apart as a punctuation mark.

Recommendation: D2 for the portfolio's default transcript. Transcripts are capped at 8 turns and usually land in the 2 to 4 range; the typography split is distinctive enough that the left border becomes noise rather than signal. If a specific case study later has a long transcript that justifies the heavier treatment, promote it to D1 on a per-instance basis (or add a variant="scannable" prop on <ChatTranscript>). Start with the lighter default and earn the heavier one when needed.