Codex is built around persistent conversation threads. In this model, an "agent" is not a short-lived worker process. It is a live session that can receive input, run a turn, emit events, and then remain available for later input.
That distinction matters because a turn and a thread are not the same thing.
- A thread is the long-lived conversation session.
- A turn is one unit of work executed inside that thread after input arrives.
- Agent status is the public summary that other parts of the system use to decide whether to wait, send more input, or treat the thread as unavailable.
In this post, I'll try to model the lifecycle of one Codex thread in TLA+. The goal is narrow: specify the public status behavior precisely enough to remove ambiguity about what Completed, Interrupted, Shutdown, and NotFound actually mean.
Table of Contents
1. Status as a Stored Event-Derived Projection
When Codex creates a session, it also creates a status channel initialized to PendingInit. That channel stores the latest public status for the thread. Later, whenever the runtime emits an event that affects lifecycle state, Codex updates the stored status through agent_status_from_event.
The mapping is:
- TurnStarted ->
Running - TurnComplete(msg) ->
Completed(msg) - TurnAborted(Interrupted) ->
Interrupted - TurnAborted(Replaced | ReviewEnded) ->
Errored("Replaced" | "ReviewEnded") - Error(msg) ->
Errored(msg) - ShutdownComplete ->
Shutdown
So the public status is not a separate reducer-managed state machine. It is the latest stored projection produced from a subset of emitted events.
The enum mixes two concerns:
- Turn Result: The outcome of the most recent interaction (
Completed,Interrupted,Errored). - Thread Liveness: Whether the thread is currently executing or available for control (
Running,Shutdown).
2. Modeling Restartability
The primary distinction between a turn and a thread is restartability. In most systems, Completed is a terminal state. In Codex, a thread in a Completed or Interrupted state can transition back to Running if new user input arrives.
The TLA+ spec captures this through the ModelRestartable operator:
ModelRestartable(s) == s \in {"pending_init", "completed", "interrupted"}
This operator explains which statuses are restartable in the model. The checked StartTurn action spells the precondition out directly:
StartTurn ==
/\ status \in {"pending_init", "completed", "interrupted"}
/\ status' = "running"
/\ completedMsg' = NoMsg
/\ errorInfo' = NoError
This reflects the Codex turn-level semantics: Completed and Interrupted describe the result of the last turn, not destruction of the thread. In this model, they are restartable states from which a later input may begin a new turn.
3. The WaitFinal Set
Client-side polling logic often requires a definition of "finality" to know when to stop waiting for a result. However, the event-derived nature of the status makes this definition non-trivial. The spec defines WaitFinal to codify the states where a result is guaranteed to be stable:
WaitFinal(s) == s \in {"completed", "errored", "shutdown"}
Note that Interrupted is excluded from WaitFinal. The Codex wait logic treats PendingInit, Running, and Interrupted as non-final. A client that waits only for WaitFinal therefore needs a separate policy for handling Interrupted.
4. Enforcing Payload Discipline
Status variants like Completed and Errored carry associated data (completedMsg and errorInfo). A common failure mode in event-sourced systems is the presence of stale data—where a payload from a previous state persists after a transition.
The TLA+ model uses invariants to enforce payload discipline, ensuring that data is only present when the status warrants it:
CompletedCarriesOnlyCompletionMessage ==
status = "completed" => completedMsg \in Messages
NonCompletedClearsCompletionMessage ==
status # "completed" => completedMsg = NoMsg
These invariants force the StartTurn action to explicitly clear payloads. Without this clearing, the model checker (TLC) would find a trace where the system is Running but still holds a message from a prior turn.
5. System Boundaries and Deadlocks
When running the model checker with CHECK_DEADLOCK TRUE, TLC reports a deadlock at the transition from pending_init to shutdown. In TLA+, a deadlock is any state with no valid next moves. Since shutdown is modeled as a terminal state with no outgoing transitions, this deadlock is expected behavior for the model.
This also explains the absence of NotFound from the model's status set. NotFound is not a state of the live thread. It is the result returned by agent control when the thread manager cannot find the thread. Shutdown and NotFound are therefore distinct: direct subscribers to a live thread can observe Shutdown, while later polling through agent control may return NotFound after removal.
6. Verification Results
I checked the model with small bounds:
Messages = {msg1, msg2}ErrorInfos = {err1, err2}
There are two relevant TLC runs.
First, the configured run with CHECK_DEADLOCK TRUE stops at the expected terminal deadlock in shutdown. That run generates 19 states, finds 10 distinct states, and reaches depth 3.
Second, a deadlock-disabled run explores the full bounded state space and reports:
- 37 states generated
- 10 distinct states found
- depth 3
- no invariant violations
The TLC results verify the core status logic for a single thread:
TypeOKis preserved across all transitions.- Completion and error payloads are cleared precisely when the thread restarts.
Interruptedis correctly excluded from theWaitFinalset.Shutdownis modeled as the only terminal state.
This 100-line spec replaces an informal documentation block with a verifiable foundation, ensuring the status machine is sound before layering on the complexity of parent/child trees and persistent rollout state.
Appendix: Full TLA+ Spec
---- MODULE AgentLifecycle ----
EXTENDS TLC
\* Compact requirement ledger:
\* - PendingInit -> Running on TurnStarted: StartTurn
\* - Running -> Completed(msg) on TurnComplete: CompleteTurn
\* - Running -> Interrupted on interrupt abort: AbortInterrupted
\* - Running -> Errored on Replaced / ReviewEnded: AbortReplaced, AbortReviewEnded
\* - Any live state -> Errored(err) on generic error: EmitError
\* - Any live state -> Shutdown: Shutdown
\* - Completed and Interrupted may accept another turn: StartTurn precondition
\* - NotFound is not modeled; it is an external lookup result after thread removal
\*
\* Modeling assumptions:
\* - Single live agent thread only.
\* - This models the event-derived status projection, not the full task/session engine.
\* - Completed and Interrupted are restartable because they describe the last turn, not
\* thread destruction.
\* - Errored is not restartable in this first model, for blog clarity.
\* - Shutdown is the only intended terminal status in the modeled machine.
CONSTANTS Messages, ErrorInfos
Statuses ==
{
"pending_init",
"running",
"interrupted",
"completed",
"errored",
"shutdown"
}
AbortReasons == {"replaced", "review_ended"}
NoMsg == "no_msg"
NoError == "no_error"
ErrorPayloads == ErrorInfos \cup AbortReasons
VARIABLES status, completedMsg, errorInfo
Vars == <<status, completedMsg, errorInfo>>
WaitFinal(s) == s \in {"completed", "errored", "shutdown"}
ModelRestartable(s) == s \in {"pending_init", "completed", "interrupted"}
ModelTerminal(s) == s = "shutdown"
TypeOK ==
/\ status \in Statuses
/\ completedMsg \in Messages \cup {NoMsg}
/\ errorInfo \in ErrorPayloads \cup {NoError}
Init ==
/\ status = "pending_init"
/\ completedMsg = NoMsg
/\ errorInfo = NoError
StartTurn ==
/\ status \in {"pending_init", "completed", "interrupted"}
/\ status' = "running"
/\ completedMsg' = NoMsg
/\ errorInfo' = NoError
CompleteTurn(msg) ==
/\ status = "running"
/\ msg \in Messages
/\ status' = "completed"
/\ completedMsg' = msg
/\ UNCHANGED errorInfo
AbortInterrupted ==
/\ status = "running"
/\ status' = "interrupted"
/\ UNCHANGED <<completedMsg, errorInfo>>
AbortReplaced ==
/\ status = "running"
/\ status' = "errored"
/\ errorInfo' = "replaced"
/\ UNCHANGED completedMsg
AbortReviewEnded ==
/\ status = "running"
/\ status' = "errored"
/\ errorInfo' = "review_ended"
/\ UNCHANGED completedMsg
EmitError(err) ==
/\ status # "shutdown"
/\ err \in ErrorInfos
/\ status' = "errored"
/\ completedMsg' = NoMsg
/\ errorInfo' = err
Shutdown ==
/\ status # "shutdown"
/\ status' = "shutdown"
/\ completedMsg' = NoMsg
/\ errorInfo' = NoError
Next ==
\/ StartTurn
\/ \E msg \in Messages: CompleteTurn(msg)
\/ AbortInterrupted
\/ AbortReplaced
\/ AbortReviewEnded
\/ \E err \in ErrorInfos: EmitError(err)
\/ Shutdown
Spec == Init /\ [][Next]_Vars
CompletedCarriesOnlyCompletionMessage ==
status = "completed" => completedMsg \in Messages
NonCompletedClearsCompletionMessage ==
status # "completed" => completedMsg = NoMsg
ErroredCarriesOnlyErrorInfo ==
status = "errored" => errorInfo \in ErrorPayloads
NonErroredClearsErrorInfo ==
status # "errored" => errorInfo = NoError
InterruptedIsNotWaitFinal ==
status = "interrupted" => ~WaitFinal(status)
ShutdownIsModelTerminal ==
status = "shutdown" => ModelTerminal(status)
NonTerminalHasSuccessor ==
~ModelTerminal(status) => ENABLED Next
====Appendix: TLC Config
SPECIFICATION Spec
CONSTANTS
Messages = {msg1, msg2}
ErrorInfos = {err1, err2}
INVARIANT
TypeOK
CompletedCarriesOnlyCompletionMessage
NonCompletedClearsCompletionMessage
ErroredCarriesOnlyErrorInfo
NonErroredClearsErrorInfo
InterruptedIsNotWaitFinal
ShutdownIsModelTerminal
CHECK_DEADLOCK TRUE