PSDG — Philosopher's Stone Dice Game. An exact solver can lose to a worse opponent — not from noise, randomness or hidden information, but because the rules decouple placement from possession. The act of choosing is long over by the time its consequences are realized. PSDG is the smallest game where this gap is exactly measurable.
Dice only randomize the opening position. After that, PSDG is fully deterministic and skill-based.
PSDG is a compact, exactly solved deterministic dice game with reproducible benchmarks. The board snapshot alone is not a sufficient state representation: an agent can perfectly optimize the obvious scoring while still missing the latent structure that actually determines who wins. In benchmarked deployment protocols, even a static policy derived from the exact oracle can lose to a blundering opponent about 6–9% of the time.
Randomness applies only to setup (dice fix the opening position); after that, there are no further rolls. Commitments (Twists) land before their scoring consequences are easy to read from the tableau alone, even though nothing is hidden after setup.
Learn basic play in about 3 minutes: Watch on YouTube. Tiebreak / Immortal takes a few more minutes: demo & script.
Figure: last move for Player A, twisting the die so that its facing value is 3.
PSDG makes two separate points: a learning / representation point and a deployment / protocol point. The cleanest on-ramp is the parable—the full game and benchmarks stack the same ideas with more machinery.
If you are sorting PSDG into familiar reviewer buckets (trained-agent benchmark, game-theoretic restatement, AI safety analogy), read How to read PSDG first—the standalone framing page and FAQ § 1 echo the same outline.
PSDG is a short two-player game with one central property:
the board snapshot is not, by itself, a sufficient representation of the true game state.
Two positions that look the same on the board can require different optimal play because draft-time facings encode later consequences, and later phases activate structure that is not captured by visible tops alone. A learner that treats only the visible board as state will alias distinct situations and can optimize the wrong thing successfully.
This makes PSDG useful for measuring three clusters of phenomena:
Representation & objectives — state aliasing (distinct situations collapsed into one visible “board photo”); proxy misspecification and world-model gaps (strong on the training signal while the rule that governs payoff at deployment lives elsewhere).
Deployment fragility — the wedge between exact value and realized outcomes when play freezes principal-line choices versus re-solves on the realized node.
Protocol & timing — Exchange information structure (sequential vs simultaneous), which moves the headline numbers but is not the core thesis—the parable already isolates representation failure with no Exchange at all.
The oracle is the ground-truth interface (value, legal moves, optimal actions, regret); the solver is the reference implementation that realizes it — same artifact, two names (what you query vs what computes the answers). You get both by cloning the GitHub repo (§ Solver, benchmarks, and GitHub), not from this docs site alone.
In the static snapshot rows, the oracle is still the reference. Static Exchange play is a different deployment choice; the extra losses there are protocol fragility, not oracle error. See the snapshot. For human oversight under full visibility, see AI safety.
Researcher note · novelty & scope
PSDG is not presented as an entirely new catalog of failure modes. Proxy misspecification, latent structure, and deployment brittleness are each familiar in isolation. What is distinctive is that all three live in one small, exactly solved, reproducible environment with a shared oracle—the parable for proxy / representation stress, the full game for latent commitment structure, and the seeded benchmarks for measurable deployment brittleness—so their interaction is concrete and checkable rather than spread across separate papers, noisy demos, or toy setups. That yields a tight conceptual counterexample (e.g. exact value does not imply deployment-safe play) and an exact diagnostic benchmark.
One implication may be novel as a demonstrated result: the failure survives not only perfect optimization but perfect rule knowledge and complete visibility—the conditions under which human-in-the-loop oversight is often supposed to work. The problem is structural, not a limitation of attention, scale, or compute.
Public artifact. PSDG is, to our knowledge, the first public, compact, perfect-information (after setup), exactly solved environment where static vs re-solving (and related blunder / deployment) win-rate splits under the published benchmark protocol are reproducible from published seeds and an exact reference implementation: oracle-grounded values and optimal play, not approximate equilibria or learned policies. The phenomenon is not new in the abstract; what is distinctive is a clean, checkable, public measurement tied to exact play. Corrections welcome.
later-phase consequences encoded by earlier commitments
a clean demonstration that perfect optimization of a visible objective can still miss the real structure
a parable showing the same failure mode without requiring the full game or simultaneous exchange
Simultaneity is not the essence; latent structure and wrong representation are.
Even without simultaneous Exchange, the full game still combines representation (snapshot / draft / latent structure) with deployment (static vs re-solve; the 8.5% row is already sequential). Simultaneity adds a further protocol variant, not the whole protocol story; see the compact map.
It is not claiming that every learning system fails in this way.
It is claiming something narrower and stronger:
successful optimization of a visible objective does not guarantee that the learned representation tracks the structure that actually governs outcomes.
Because PSDG is exactly solvable, that claim is checkable rather than rhetorical.
In one line: PSDG is a small, exact, checkable environment where realized outcomes under a published protocol can still favour a suboptimal opponent—not from noise or hidden information, but from commitment structure. That means what the spec freezes, what gets re-solved at which phase, and when information arrives.
The sharpest contrast in the published suite is static Exchange commitment versus re-solving on the realized node; see the snapshot.
This HTML site is the documentation (rules, research framing, tables). It is not where you download the exact solver or benchmark drivers as a standalone package.
Public repository: github.com/Rob-McCormack/psdg (git clone https://github.com/Rob-McCormack/psdg.git). That tree holds the Python reference solver, benchmark JSON, Python benchmark scripts, and canonical rules in RULES.md (same v1.13 text as Rules here). Typical workflow: clone → follow the repo README and benchmark/README.md → run scripts locally to reproduce numbers. Pages like Q-learning / bandit demo and blunder sweep describe experiments; runnable drivers live under psdg where those pages cite scripts.
The internal repository that builds this site may also keep a JavaScript solver.js and Node drivers under private/psdg/ for cross-checks. That folder exists only in the internal layout, not after git clone of public psdg, and those JS artifacts are not shipped in psdg. Values and optimal play match Python once (board, crystals) are fixed, but an integer seed does not automatically give the same opening across tools, because the setup RNG recipes differ. Always pair a seed with the script that consumed it, or pass explicit board and crystal tuples.
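To see why a bare seed is not portable, here is a minimal sketch with two hypothetical setup recipes. Neither function is taken from the PSDG repository, and the layout (six board dice plus two crystal dice, each a plain d6 roll) is an assumption made for illustration. Both recipes consume the same random stream but assign the rolls in a different order, so the same seed generally yields different openings.

```python
import random

def opening_board_first(seed):
    """Hypothetical recipe 1: roll the six board dice, then two crystals."""
    rng = random.Random(seed)
    board = tuple(rng.randint(1, 6) for _ in range(6))
    crystals = tuple(rng.randint(1, 6) for _ in range(2))
    return board, crystals

def opening_crystals_first(seed):
    """Hypothetical recipe 2: roll the two crystals first, then the board."""
    rng = random.Random(seed)
    crystals = tuple(rng.randint(1, 6) for _ in range(2))
    board = tuple(rng.randint(1, 6) for _ in range(6))
    return board, crystals

# Same seed, same recipe: reproducible. Same seed, different recipe: usually not.
mismatches = [s for s in range(20)
              if opening_board_first(s) != opening_crystals_first(s)]
print(len(mismatches), "of 20 seeds give different openings across recipes")
```

Passing the explicit (board, crystals) tuples instead of the seed sidesteps the problem, because the opening no longer depends on how any tool consumes its random stream.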
Many familiar games feel closer to ready, aim, fire. PSDG is closer to ready, fire, aim: you Twist in the draft (commit facings) before Tumble and Exchange fully unpack scoring — so commitment precedes evaluation of how the opponent’s gift interacts with your crucible. Any system that behaves as if “what I see on the tops is everything that will matter” hits path-dependent structure it never represented. The pattern is in the rules, not just the diagram. For why the benchmark is small, deterministic, and exact—and how that connects to commitment vs observation and oversight—see PSDG for AI safety — Ready, fire, aim in the rules and Rules in brief.
Philosopher's Stone Dice Game (PSDG) is a two-player tabletop game played with six board dice, Red Crystal dice, two Crucibles (your areas on the mat), and a small playmat or paper layout. Only setup is random—positions and crystals—then every legal move follows the rules with no extra rolls and no hidden information. Most gold after the two scoring phases (and Immortal if you are tied) wins. For a play-along scripted game and embedded tutorial, see YouTube demo & tutorial — same full game on YouTube: youtu.be/N3j1XJp2ZsI. Table time is roughly 10–15 minutes once you know the beats.
Twist — rotate a die without changing its top, to choose which side value faces the players (Facing Players Value in full rules).
Tumble — rotate the Crucible dice forward (90°), away from the players, so the face that was facing the players becomes the new top.
In normal play: Twist on each draft pick; Tumble each Crucible die once after Phase 1; Immortal uses scripted tumbling again if you reach the tiebreaker.
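The two moves can be sketched as operations on a (top, facing) pair. This is an illustrative model, not the repository's data structure: the names Die, legal_facings, twist, and tumbled_top are hypothetical, and the set of legal facings assumes a standard die whose opposite faces sum to 7.

```python
from typing import NamedTuple

class Die(NamedTuple):
    top: int      # value showing on top
    facing: int   # Facing Players Value (side face turned toward the players)

def legal_facings(top: int) -> list[int]:
    """Side faces of a standard die: every value except the top and its opposite.
    Assumes opposite faces sum to 7."""
    return [v for v in range(1, 7) if v not in (top, 7 - top)]

def twist(die: Die, facing: int) -> Die:
    """Twist: keep the top, choose which side value faces the players."""
    if facing not in legal_facings(die.top):
        raise ValueError(f"{facing} is not a side face when the top is {die.top}")
    return Die(die.top, facing)

def tumbled_top(die: Die) -> int:
    """Tumble: the face that was facing the players becomes the new top."""
    return die.facing

die = twist(Die(top=6, facing=2), facing=3)
print(die, "-> top after Tumble:", tumbled_top(die))
```

The point of the model is that a Twist leaves the visible top unchanged while fixing what the top will be after the Tumble.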
Flow: random setup → draft 6 board dice into two Crucibles (each pick Twists a two-phase commitment) → simultaneous Exchange (Poisoned Gift) in v1.13 → score (Phase 1) → Tumble Crucible dice → score (Phase 2) → Immortal tiebreaker if needed. Research: key findings (including the parable and Q-learning demo—no Exchange) do not require simultaneity; see Rules — NOTE on Exchange.
Gold: a Crucible die scores 1 if its top is 6 or matches your Red Crystal top; else 0—then everything downstream (tiebreakers, eligibility) is deterministic and fully specified in v1.13.
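The gold rule is small enough to state as a function. The sketch below is a minimal illustration with a hypothetical three-die Crucible; it ignores the Exchange and assumes, purely for the example, that the same Red Crystal top applies in both phases.

```python
def gold(crucible_tops, crystal_top):
    """One gold per Crucible die whose top is 6 or equals the Red Crystal top."""
    return sum(1 for top in crucible_tops if top == 6 or top == crystal_top)

# Hypothetical Crucible with the Red Crystal showing 2.
phase1_tops = [6, 2, 5]   # tops before the Tumble
phase2_tops = [3, 4, 6]   # tops after the Tumble (the facings chosen in the draft)

print("Phase 1 gold:", gold(phase1_tops, 2))
print("Phase 2 gold:", gold(phase2_tops, 2))
```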
New here? Read the Mortal vs Oracle parable first (~60 seconds): perfect score on the visible reward, catastrophic loss when a latent rule turns on—the same structural story the full game measures with oracle-backed numbers.
In chess, you can see the position and play optimally from what’s on the board. In PSDG, two identical-looking boards can require different optimal play because facings committed during the draft encode Phase 2 and are not determined by tops alone. Learners that alias those distinctions misread what is Markov for the true game; the oracle is the independent check.
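A short sketch makes the aliasing concrete. The two Crucibles below are hypothetical (top, facing) pairs, not positions from the benchmark, and the example assumes the same Red Crystal top in both phases. Their tops are identical, so a learner keyed on tops alone treats them as one state, yet their Phase 2 gold differs.

```python
def gold(tops, crystal_top):
    """Gold rule: a die scores 1 if its top is 6 or matches the Red Crystal top."""
    return sum(1 for t in tops if t == 6 or t == crystal_top)

# Hypothetical Crucibles as (top, facing) pairs: same tops, different facings.
crucible_x = [(6, 3), (2, 4), (5, 1)]
crucible_y = [(6, 2), (2, 6), (5, 6)]

def snapshot_key(crucible):
    """State as a tops-only learner sees it."""
    return tuple(top for top, _ in crucible)

def full_key(crucible):
    """State including the facings committed during the draft."""
    return tuple(crucible)

def phase2_tops(crucible):
    """After the Tumble, each die's former facing is its new top."""
    return [facing for _, facing in crucible]

crystal = 2
print("aliased by tops:", snapshot_key(crucible_x) == snapshot_key(crucible_y))
print("distinct in full state:", full_key(crucible_x) != full_key(crucible_y))
print("Phase 2 gold:", gold(phase2_tops(crucible_x), crystal),
      "vs", gold(phase2_tops(crucible_y), crystal))
```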
The Exchange is not only “clever opposition.” Eligibility—especially when duplicate tops force which Crucible die you must gift—means the rules can corner you: strong visible play can still paint you into a mandatory transfer. That is one reason visible-good play can still create a later forced disadvantage, and why frozen Exchange policies are brittle off-path (see empirical snapshot). Rules — Poisonous System Gift →
Under the published oracle embedding and benchmark protocols, PSDG is small and fully checkable. “Optimal,” blunder, and ex post payoff are pinned terms: Game theory — pinned definitions.
This is the 5,000-game protocol referenced in the hero: B’s last draft is off the solver’s principal line; six board dice. The split is not bad arithmetic. It is deployment and path structure: a frozen principal line against off-equilibrium B, with commit vs re-solve at the Exchange and sequential vs simultaneous timing.
That wedge—ex-ante plan on the principal storyline vs ex-post results after deviation—is the deployment gap the suite measures (game theory).
This suite is not the same stress test as optimal vs random legal B. There, B is uniformly random legal—broad competence under noise. Here, B is on-path until the last pick, which targets deployment fragility more than raw legality.
In the re-solving row, A re-solves the Exchange (Gift) on the realized crucibles after B’s blunder. Draft still follows the principal line until B’s last twist, which is irrevocable.
~5.7% (287/5000) is not “minimax beaten from A-win or drawn openings.” Under the published draft minimax, those B wins should be confined to openings the oracle already classifies as B-win (see Reading the 5.7% row below). Exact rates are in the table.
Oracle vs rows in this table: The oracle (solver) still defines values, legal moves, and the principal line. The rows differ only in A’s rule at the Exchange after B’s off-line last draft: re-solving the Gift on the realized crucibles versus static replay of the principal-line gift—deployment protocol, not a different notion of “the game.”
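The difference between the two deployment rules can be shown with a toy lookup. Everything in this sketch is hypothetical: the node names, the gift labels, and the values are invented for illustration and do not come from the PSDG solver. The only thing it demonstrates is the mechanism: a gift chosen on the principal-line node and replayed unchanged can be poor on the node that B's off-line twist actually produces.

```python
# Hypothetical values from A's perspective (+1 A wins, 0 draw, -1 B wins)
# for A's two gift options at two different nodes.
GIFT_VALUES = {
    "principal": {"gift_x": +1, "gift_y": 0},   # node on the principal line
    "realized":  {"gift_x": -1, "gift_y": +1},  # node after B's off-line last twist
}

def solve_gift(node):
    """Pick the gift with the best value for A at the given node."""
    table = GIFT_VALUES[node]
    return max(table, key=table.get)

static_gift = solve_gift("principal")    # frozen before B deviates
resolved_gift = solve_gift("realized")   # recomputed on the realized node

static_outcome = GIFT_VALUES["realized"][static_gift]
resolved_outcome = GIFT_VALUES["realized"][resolved_gift]
print("static:", static_gift, static_outcome)
print("re-solved:", resolved_gift, resolved_outcome)
```

In both cases the value tables are the same; only the node at which A consults them differs.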
Blunder test (B blunders on last draft pick; 5,000 games, six dice):
Narrative example (seed 92, static Exchange; verified with the Node harness blunder_test_random_crystals.js in the internal development tree—not in public psdg): Blunder wins (worked example).
Every trial still produces a full outcome (+1 A / 0 draw / −1 B). Historically this table spotlighted B wins as the exploitation headline; the remaining games are A wins and draws, split in a way that depends on protocol. Under re-solving, the draw rate collapses relative to optimal-vs-optimal (most oracle draws become A wins after B’s single late off-path twist). Under static Exchange, especially sequential, B’s blunder can also move games from an A-win opening to a draw (milder than a B win, but still value left on the table for A). To print the triple, run python3 blunder_test_benchmark.py --print-outcomes (add --static or --static --static-simultaneous-exchange for the static rows); it also prints oracle root value → realized outcome counts (e.g. how often v=+1 lands in a draw).
Solver mode | Exchange | B wins | Rate | A wins + draws (= 5000 − B)
Re-solving (optimal at Exchange) | Simul. or sequential | 287 | 5.7% | 4713
Static (A commits from principal line) | Sequential (B best-responds) | 427 | 8.5% | 4573
Static (A commits from principal line) | Simultaneous (B plays Nash) | 347 | 6.9% | 4653
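The Rate and complement columns follow directly from the B-win counts. The sketch below recomputes them from the counts in the table; it is plain arithmetic and does not call the benchmark scripts.

```python
GAMES = 5000
rows = [
    ("Re-solving", "simul. or sequential", 287),
    ("Static", "sequential", 427),
    ("Static", "simultaneous", 347),
]

def summarize(b_wins, games=GAMES):
    """Return B's win rate as a one-decimal percentage and the remaining games."""
    rate = 100 * b_wins / games
    return f"{rate:.1f}%", games - b_wins

for mode, exchange, b_wins in rows:
    rate, rest = summarize(b_wins)
    print(f"{mode:<11} {exchange:<21} {b_wins:>4} {rate:>5} {rest:>5}")
```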
To pin A wins vs draws for each row in docs, save the script’s Outcomes: line under benchmark/output/ in a psdg checkout (or your development tree mirror) next to the existing B-win logs.
Reading the 5.7% row
The 5.7% cell is optimal Exchange play (solve_exchange on the realized crucibles) after B’s off-line last draft, not a failure of draft-phase minimax. At B’s last draft node the solver already minimizes over all legal twists, so any other legal twist is no better for B in value from that node. Empirically, a full value cross-join finds 284/287 B wins with benchmark value == -1 (the opening was already B-favoured under the stored root value).
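The minimization argument can be checked on a toy node. The twist labels and values below are hypothetical, not solver output; they only illustrate that when B's principal choice minimizes A's value, every other legal twist leaves A at least as well off in value.

```python
# Hypothetical values from A's perspective (+1 A wins, 0 draw, -1 B wins)
# of the positions reached by each legal twist at B's last draft node.
twist_values = {"twist_a": 0, "twist_b": -1, "twist_c": 0, "twist_d": +1}

# B moves at this node, so the node's minimax value is the minimum over twists.
node_value = min(twist_values.values())
principal = min(twist_values, key=twist_values.get)
off_line = [t for t in twist_values if t != principal]

# No off-line twist is better for B than the principal one.
assert all(twist_values[t] >= node_value for t in off_line)
print("node value:", node_value, "principal:", principal, "off-line:", off_line)
```

This concerns value at the draft node only; what A then does at the Exchange is the separate deployment question the table rows distinguish.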
The 287 B wins are almost entirely a subset of the 399 B wins under optimal-vs-optimal. The 399 → 287 gap reflects blunders costing B some games it would have won without the error.
The load-bearing “frozen plan punished” phenomenon lives in the static rows (8.5% / 6.9%), where B’s win rate can exceed the 8.0% optimal-vs-optimal rate: the sequential row does (8.5%), while the simultaneous row (6.9%) stays below it.
The 8.5% row is already sequential Exchange. So the “frozen commitment gets punished” story does not depend on simultaneity at the Gift.
Static A, sequential: B sees A’s gift and best-responds → higher B wins (8.5%).
Static A, simultaneous: B does not observe A’s gift before choosing → Nash-style subgame → lower B wins (6.9%). The frozen line is slightly less exploited when simultaneous play blocks that observation.
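The direction of the sequential vs simultaneous gap can be illustrated with a toy zero-sum matrix. The payoffs and action labels are hypothetical and unrelated to PSDG's actual Exchange; the matrix is chosen to have a pure saddle point so that B's equilibrium action is simply its maximin column. A is frozen on a stale row, and the sketch compares what B earns when it observes that row with what it earns when it plays its equilibrium action blind.

```python
# Hypothetical payoffs to B (zero-sum: A receives the negative).
# Rows are A's gift, columns are B's gift.
PAYOFF_B = {
    "a0": {"b0": 0, "b1": -1},
    "a1": {"b0": 0, "b1": +1},
}
B_ACTIONS = ["b0", "b1"]

def b_best_response(a):
    """Sequential timing: B sees A's gift and picks its best reply."""
    return max(B_ACTIONS, key=lambda b: PAYOFF_B[a][b])

def b_maximin():
    """Simultaneous timing: B plays its equilibrium (maximin) action.
    This toy game has a pure saddle point at (a0, b0)."""
    return max(B_ACTIONS, key=lambda b: min(PAYOFF_B[a][b] for a in PAYOFF_B))

frozen_a = "a1"  # stale gift replayed from the principal line
sequential = PAYOFF_B[frozen_a][b_best_response(frozen_a)]
simultaneous = PAYOFF_B[frozen_a][b_maximin()]
print("B's payoff, sequential:", sequential, "simultaneous:", simultaneous)
```

Observation can only help B against a fixed action, which matches the ordering of the two static rows without reproducing their magnitudes.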
Simultaneity still sharpens the game-theoretic contrast. Proxy, latent rules, and deployment themes stand without it. AI safety — full argument.
Optimal vs optimal (same suite): A wins 73.3% (3663), B 8.0% (399), draws 18.8% (938). Immortal depth counts (tb_depth in the benchmark JSON) are tabulated in the Technical report — Tiebreak depth (5k suite).
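As a consistency check on the figures quoted above, the following sketch recomputes the optimal-vs-optimal shares and compares each blunder-test row's B-win count with the 399 baseline. It uses only the counts stated in this section.

```python
GAMES = 5000
baseline = {"A wins": 3663, "B wins": 399, "draws": 938}   # optimal vs optimal
assert sum(baseline.values()) == GAMES

shares = {k: f"{100 * v / GAMES:.1f}%" for k, v in baseline.items()}

# B-win counts from the blunder test, compared with the baseline count.
blunder_rows = {"re-solving": 287, "static sequential": 427, "static simultaneous": 347}
delta_vs_baseline = {k: v - baseline["B wins"] for k, v in blunder_rows.items()}

print(shares)
print(delta_vs_baseline)
```

Only the static sequential row has a positive difference, which is the sense in which a frozen plan can leave B better off than optimal play on both sides.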
PSDG sits at the intersection of game theory, ML evaluation, and alignment. These are not separate stories, but different routes into the same benchmark under the same oracle.
Six dice · two players · ~10–15 minutes · perfect information after the roll · Unbeatable play without perfect foresight is fantasy.
Gloss: that is not physics or mysticism. It means full retrograde clarity over the whole tree: every legal line and its payoff—perfect foresight in the game-theoretic sense, not guesswork.
PSDG is a testbed for whether learned systems approximate the structure exhaustive analysis encodes, and whether deployed optimal computation—static commitment versus continued re-solving—stays robust when play leaves the tidy path.