Fixed-board blunder enumeration
This note documents a complementary benchmark to the usual “random blunder” trials: same opening position, enumerate every legal last draft pick for B (optimal + all deviations), and compare A’s payoff when A re-solves at the Exchange versus when A commits to the principal-line gift from the original full-game solution.
It makes the mechanism behind “static vs re-solving” visible: which mistakes change the Crucibles enough that a frozen Exchange action is no longer best?
Two policies at the Exchange (after B’s last pick)
Timing. B’s “blunder” is a suboptimal last draft pick. The draft then finishes; we are at the Poisoned Gift (Exchange). Everything below is evaluated after that pick—we are not re-running the draft from scratch.
Re-solving (A_rs): At the Exchange, A plays as the solver does on the true position: the equilibrium of the simultaneous Exchange subgame given the actual Crucibles. In other words, A re-solves the Exchange after the blunder, recomputing the gift/twist choice for the state that actually arose.
Static (A_st): At the Exchange, A does not re-solve. A uses the gift (die index + facing for the opponent) taken from the principal line of the game solved from the opening roll, before B’s mistake was known; B then best-responds to that fixed gift. This models deployment: “I cached my Exchange move from the first full solve and I execute it even though the opponent’s last pick changed the position.”
Why compare them? The gap is not “the solver never re-solves.” It is oracle-perfect Exchange play vs frozen ex ante Exchange commitment once the state has moved off the line the cache assumed. When st − rs < 0 (that is, A_st − A_rs < 0), sticking to the old gift is strictly worse for A than recomputing the Exchange on the real Crucibles.
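The two policies can be illustrated on a toy payoff table. This is a minimal sketch, not the psdg solver: the tables, the function names, and the restriction to pure strategies are all assumptions made for illustration (the real Exchange is simultaneous, so its equilibrium may be mixed).

```python
from typing import List

# Hypothetical toy payoff tables (not produced by the psdg solver).
# payoff[i][j] = A's result when A gives gift i and B answers with response j.
Payoff = List[List[int]]


def static_value(payoff: Payoff, cached_gift: int) -> int:
    """A_st: A plays the cached gift; B best-responds (minimises A's result)."""
    return min(payoff[cached_gift])


def resolved_value_pure(payoff: Payoff) -> int:
    """Stand-in for A_rs: A's best pure gift against a best-responding B.

    Simplifying assumption: pure strategies only, whereas the real
    Exchange equilibrium may be mixed.
    """
    return max(min(row) for row in payoff)


# Position the cache assumed: gift 0 is A's best choice.
on_line = [[1, 1], [-1, 1]]
# Position after B's last-pick blunder: gift 0 now loses, gift 1 wins.
after_blunder = [[-1, 1], [1, 1]]

# The cached gift comes from the position the first solve assumed.
cached_gift = max(range(len(on_line)), key=lambda i: min(on_line[i]))
a_rs = resolved_value_pure(after_blunder)
a_st = static_value(after_blunder, cached_gift)
print(cached_gift, a_rs, a_st, a_st - a_rs)  # 0 1 -1 -2
```

In this sketch the cached gift is one of the rows that re-solving also considers, so a_st can never exceed a_rs; the two differ only when the blunder changes which row is best.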
Script
In the public psdg repository:
solvers/python/fixed_board_blunder_sweep.py
```bash
# from psdg repo root after clone
cd solvers/python
# One random open (seed controls board + crystals, same scheme as other blunder tests)
python3 fixed_board_blunder_sweep.py --seed 42 --dice 6
# Explicit histogram (counts for tops 1..6) and crystals
python3 fixed_board_blunder_sweep.py --board 0,1,1,0,3,1 --crystal-a 1,5 --crystal-b 4,1
# Aggregate: many opens, enumerate all B blunders on each (scale --batch to taste; large runs are slow)
python3 fixed_board_blunder_sweep.py --batch 2000 --seed 42 --dice 6
```

Columns: A_rs = A’s result if the Exchange is played in equilibrium after the draft; A_st = A’s result if A uses the static principal-line gift and B best-responds. st − rs < 0 means freezing the line hurts A relative to re-solving on that branch.
Representative results (6 dice)
On 2000 random opens (seeds 42–2041, six dice, full enumeration of B’s last-pick blunders), about 5% of blunder rows have st − rs < 0 and about 13% of opens show at least one such row, the same order of magnitude already seen on smaller pilots. Full command-line summary: benchmark/output/fixed_board_blunder_batch2000_seed42.txt in psdg. The point for the site is mechanism (when freezing the principal-line gift stops matching the true subgame), not a second headline metric beside the main blunder tables.
Illustrative single open (seed 48): game value from the roll is −1 (B wins under optimal play). After B’s last pick, one blunder branch gives A_rs = +1 and A_st = −1—re-solving lets A win; static A still loses. Other blunders on that open do not split the two policies.
So the effect is not rare noise: a material fraction of positions admit a last-pick mistake that turns a frozen Exchange plan into a strict mistake relative to re-solving.
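The two rates quoted above (share of rows with st − rs < 0, and share of opens with at least one such row) are different aggregations of the same table. The sketch below shows how they could be computed; the row layout `(open_id, A_rs, A_st)`, the sample data, and the function name `summarize` are hypothetical and do not reflect the script's actual output format.

```python
from collections import defaultdict
from typing import Iterable, Tuple

# Hypothetical rows: (open_id, A_rs, A_st), one per enumerated last pick for B.
rows = [
    (0, 1, 1), (0, 1, 1), (0, -1, -1),
    (1, 1, -1), (1, -1, -1),
    (2, -1, -1), (2, 1, 1), (2, 1, 1),
]


def summarize(rows: Iterable[Tuple[int, int, int]]) -> Tuple[float, float]:
    """Return (share of rows with st - rs < 0, share of opens with any such row)."""
    total_rows = 0
    hurt_rows = 0
    open_hurt = defaultdict(bool)  # open_id -> has at least one hurt row
    for open_id, a_rs, a_st in rows:
        hurt = (a_st - a_rs) < 0  # freezing the gift is strictly worse here
        total_rows += 1
        hurt_rows += hurt
        open_hurt[open_id] |= hurt
    row_rate = hurt_rows / total_rows
    open_rate = sum(open_hurt.values()) / len(open_hurt)
    return row_rate, open_rate


row_rate, open_rate = summarize(rows)
print(f"rows hurt: {row_rate:.1%}, opens with a hurt row: {open_rate:.1%}")
```

The per-open rate is at least the per-row rate whenever every open contributes the same number of rows, which is why the two figures are reported separately.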
How this fits the research story
- The headline benchmark remains aggregate rates (e.g. blundering B vs static A over thousands of trials).
- Enumeration answers what is going wrong: same tableau, vary only the mistake, and you see exactly when commitment to the ex ante line stops matching optimal play in the true subgame.
- For game theory and deployment / alignment framing, it supports the claim that “oracle value” and “always play the first computed gift” are different objects once opponents can move the state off the principal line.
See also
- PSDG for game theory — simultaneous vs sequential Exchange, static vs re-solving, off-equilibrium play.
- Internal documentation only (development repository; not on psdg.pages.dev, not in public psdg): companion Markdown under public/docs/solvers/python/: solver-beaten-by-blunder.md (narrative / aggregate blunder experiment) and fixed-board-blunder-sweep.md (same sweep as this page).
