Fresh Thoughts

View all →

Bytes & Drafts

Specifying the Codex Agent Lifecycle in TLA+

5 minute read Published: 2026-03-25

Codex is built around persistent conversation threads. In this model, an "agent" is not a short-lived worker process. It is a live session that can receive input, run a turn, emit events, and then remain available for later input.

That distinction matters because a turn and a thread are not the same thing.

  • A thread is the long-lived conversation session.
  • A turn is one unit of work executed inside that thread after input arrives.
  • Agent status is the public summary that other parts of the system use to decide whether to wait, send more input, or treat the thread as unavailable.

In this post, I'll try to model the lifecycle of one Codex thread in TLA+. The goal is narrow: specify the public status behavior precisely enough to remove ambiguity about what Completed, Interrupted, Shutdown, and NotFound actually mean.

How TLA+ Can Help Build Reliable AI Agentic Systems

4 minute read Published: 2026-03-21

When we move from building chatbots to building agents—systems that can call tools and perform actions—we move from the domain of language modeling into the domain of stateful systems.

In a simple chat interface, a model's mistake is usually a semantic one: it gives a wrong answer or hallucinates a fact. But when an agent has the authority to issue a refund, move a file, or trigger a deployment, a mistake can become an operational failure.

Often, these failures are not caused by the model's reasoning, but by how we have designed the system around it.

Safety & Liveness properties in TLC Mechanics

5 minute read Published: 2026-02-25

When you specify a concurrent system, you aren't just describing a state machine. You are describing how that machine is allowed to evolve over time. This creates two distinct correctness categories: forbidden states and required futures.

The first category, Safety, is about ensuring a "bad thing" never happens. The second, Liveness, is about ensuring a "good thing" eventually happens. In TLA+, these aren't just philosophical labels; they change how TLC explores your state space and what its counterexamples actually tell you.

Why on earth do we need TLA+?

19 minute read Published: 2025-11-16

Every few years the industry collectively rediscovers the same uncomfortable fact: you can have good engineers, solid testing, careful code reviews, and still ship a distributed system that behaves in ways nobody expected.

Software is when hardware changes, Part 2: When the clock disappears

22 minute read Published: 2025-11-10

If you have ever chased a bug where a system froze every few hours, only to wake up the moment you attached a debugger, you already know that time is rarely as simple as a counter that increments once a millisecond. Timers fire slightly early or slightly late, interrupts sneak in between instructions, peripherals run from their own oscillators, and distributed nodes see each other through links with variable delay. From the outside everything looks nicely clocked. Inside, every part of the stack negotiates its own idea of now.