the standing series · methodology mci-0.1
The Machine Character Index
How do AI models behave when something is at stake? The MCI composes six pre-registered, executed measurements — every decision written to a sandbox ledger, every score computed from committed raw data, falsified predictions published alongside confirmed ones. The same frozen battery runs on each new frontier model, forever: a Keeling series for machine character.
scope — read before the numbers
The MCI measures framing-conditional behavior of RLHF-trained models in executed sandbox vignettes under economic stakes. It is an early-warning instrument, not a guarantee; sub-scores are bound to the population and setting tested. δ* is design-relative (0.333, derived from the battery's payoffs).
| model | coverage | Integrity under pressure | Institutional responsiveness (Λ) | Patience (δ̂) | Coupling (preserves partners) | Cliff integrity (keeps own covenant) | Commitment rationality | MCI |
|---|---|---|---|---|---|---|---|---|
| claude-haiku-4-5 | 6/6 | 0.93 | 0.07 | 0.92 | 1.00 | 1.00 | 0.99 | 0.82 |
| claude-3-haiku | 2/6 | — | 0.50 | 0.83 | — | — | — | 0.67 |
| claude-sonnet-4-6 | 6/6 | 0.23 | 0.75 | 0.92 | 1.00 | 0.00 | 0.95 | 0.64 |
| deepseek-chat | 6/6 | 0.03 | 0.53 | 0.83 | 0.90 | 0.00 | 0.95 | 0.54 |
| gpt-3.5-turbo | 2/6 | — | 0.20 | 0.08 | — | — | — | 0.14 |
| gpt-4o-mini | 1/6 | — | 0.13 | — | — | — | — | 0.13 |
the frozen battery
- E1 (dampener)
- λ-ladder
- E6 (patience)
- E7 (coupling)
- E9 (reflexivity)
- E14 (demand)
caveats — shown, always
- pilot n per cell (12–40)
- vignette ≠ lived substrate
- patience/capability confounded with safety-tuning across vendors
- no multiple-comparison correction at pilot scale
Composite = unweighted mean of available sub-scores (coverage shown per model; nothing imputed). Sub-scores 0–1, higher is better. Methodology and composer are versioned in the public repository; raw episode data is committed beside every run. Falsified predictions are part of the record — five of eleven runs falsified their pre-registered headline and are published at equal prominence.
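The composite rule described above can be sketched in a few lines. This is a minimal illustration, not the repository's actual composer: the function name `compose_mci` and the short sub-score keys are hypothetical, chosen only for this example. It averages whichever sub-scores are present, imputes nothing, and reports coverage alongside the result.

```python
from statistics import fmean
from typing import Optional

# Hypothetical short keys for the six sub-scores in the table (an assumption;
# the real composer may name them differently).
SUB_SCORES = ("integrity", "lambda", "patience", "coupling", "cliff", "commitment")


def compose_mci(scores: dict[str, Optional[float]]) -> tuple[Optional[float], str]:
    """Return (unweighted mean of available sub-scores, coverage string).

    Missing or None sub-scores are skipped rather than imputed.
    """
    available = [scores[k] for k in SUB_SCORES if scores.get(k) is not None]
    coverage = f"{len(available)}/{len(SUB_SCORES)}"
    if not available:
        return None, coverage  # no measurements, so no composite
    return fmean(available), coverage


# Sub-scores copied from the claude-sonnet-4-6 row of the table.
sonnet = {
    "integrity": 0.23,
    "lambda": 0.75,
    "patience": 0.92,
    "coupling": 1.00,
    "cliff": 0.00,
    "commitment": 0.95,
}
mci, cov = compose_mci(sonnet)
print(f"{mci:.2f} ({cov})")  # prints "0.64 (6/6)"
```

Because the mean is taken only over available sub-scores, a composite at 2/6 coverage is not directly comparable to one at 6/6, which is why coverage is reported next to every score.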