Optimal A vs random legal B (baseline)

Standalone stress test: Player A plays the exact solver’s optimal draft move at every decision (assuming optimal continuation thereafter), then follows the benchmark Exchange protocol below. Player B picks uniformly at random among all legal draft moves and, in the default exchange mode, offers a uniformly random legal Poisoned Gift at the Exchange; A best-responds to B’s gift. This is not the simultaneous Nash Exchange subgame for B: B is intentionally weak everywhere except legality.

Contrast: this script mainly probes general competence against noise (legal but unstructured play). The 5,000-game blunder suite is different: B follows the principal line until a deliberate last-pick deviation, which stresses deployment choices (static vs re-solving at the Gift, Exchange timing). That suite measures off-equilibrium fragility, which is not the same statistic as win rate vs random.

Published log (public repo): benchmark/output/optimal_vs_random_legal_batch10000_seed42.txt in psdg.
Driver: the batch was produced with optimal_vs_random_legal.js (Node). That script lives only in the internal repository layout, under private/psdg/benchmark/ at the root of that checkout. The public psdg repo does not include private/ or this script; it contains only the Python solver and benchmarks.

What the 31 draws mean (and do not mean)

Yes — about the opening, not about B “being optimal.”
For all 31 draws in the 10k run, the opening (board + crystals) had oracle value 0: if both sides followed optimal play from that roll under the solver embedding, the outcome class is a draw. So those positions are theoretically drawn under optimal-vs-optimal.

No — the dumb player is not “just as good as optimal.”
The realized games were A optimal vs B uniformly random (legal). Of the 1698 opens with oracle value 0, optimal A still beat random B in 1667 (~98.2%). Only 31 times did random B’s actual choices (draft + random gift), together with A’s optimal replies, still land in a draw instead of an A win.

So:

Optimal B from those opens could force the draw (value says neither side can improve unilaterally under the embedding).
Random B usually drifts into a loss against optimal A; rarely the random line stays “in the drawing region” ex post.

That is not the same as “random causes a draw as well as optimal.” It is: “from a draw-valued open, noise sometimes doesn’t throw the game away.”
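The percentages above can be rechecked from the counts quoted on this page. The snippet below is a minimal arithmetic sketch, not part of the benchmark driver; the variable names are illustrative, and it relies on the cross-check stated in the results section that no B win falls in the oracle-0 slice.

```python
# Counts quoted on this page for the 10,000-trial run.
oracle_draw_opens = 1698   # openings with oracle value 0
realized_draws = 31        # games that actually ended in a draw

# Every realized draw sits in the oracle-0 slice and no B win does,
# so the rest of that slice must be A wins.
a_wins_in_slice = oracle_draw_opens - realized_draws

a_win_rate = a_wins_in_slice / oracle_draw_opens
draw_rate = realized_draws / oracle_draw_opens

print(f"A wins from oracle-0 opens: {a_wins_in_slice} ({a_win_rate:.1%})")
print(f"Draws held by random B:     {realized_draws} ({draw_rate:.1%})")
```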

10,000 trials (six board dice, seed 42…10041, `random-br`)

Outcome	Count	Rate
A wins	9965	99.65%
Draw	31	0.31%
B wins	4	0.04%

Opening value from solveFromRoll (both sides optimal from the roll — Nash-modeled Exchange at the end of the solver tree):

Oracle value (A perspective)	Count	Rate
+1 (A wins under optimal)	8289	82.89%
0 (draw under optimal)	1698	16.98%
−1 (B wins under optimal)	13	0.13%

Cross-checks (same run):

0 trials with oracle +1 but B won (random never “stole” a win from an A-winning open).
4 trials with oracle −1 and B won — all B wins sit in that 13-trial slice (random B converted 4/13 of B-favored opens).
Draws by opening oracle: 31 draws with oracle 0; 0 draws with oracle +1; 0 with oracle −1.

So the 31 draws are not “random luck from an A-favored board.” They all come from oracle-0 opens, and even within that slice random B held the draw against optimal A only ~1.8% of the time (31 of 1698).
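As a consistency check, the full oracle-by-outcome table can be rebuilt from the two marginal tables and the cross-checks. This is a sketch using only numbers quoted on this page; the dictionary layout is an assumption made for illustration and does not reflect the log file's format.

```python
# Marginals quoted on this page.
oracle_totals = {+1: 8289, 0: 1698, -1: 13}
outcome_totals = {"A": 9965, "draw": 31, "B": 4}

# Cells stated directly by the cross-checks (draws and B wins per oracle value).
table = {
    +1: {"draw": 0, "B": 0},
    0: {"draw": 31, "B": 0},
    -1: {"draw": 0, "B": 4},
}

# A wins per oracle value are whatever remains in each row.
for value, row in table.items():
    row["A"] = oracle_totals[value] - row["draw"] - row["B"]

# Column sums must reproduce the outcome table.
for outcome, total in outcome_totals.items():
    assert sum(row[outcome] for row in table.values()) == total

for value in (+1, 0, -1):
    print(value, table[value])
```

The derived row for oracle −1 implies that optimal A still won 9 of the 13 B-favored opens against random B.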

How to read “amazing” vs “setup”

Board + crystals fix the opening’s +1 / 0 / −1 under optimal-vs-optimal (draw semantics).
A realized draw in this harness is (i) oracle 0 and (ii) random B’s trajectory did not collapse to an A win under optimal A.

More than 10,000 trials: useful if you want tighter confidence on very rare cells (e.g. B wins at 0.04%, which rests on only 4 games). For draws at ~0.31%, 10k already gives a rough order of magnitude; doubling trials shrinks the Monte Carlo standard error by only a factor of about √2 and will not change the qualitative story unless the script or seed regime changes.
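To put rough numbers on that noise, one option is a Wilson score interval for each observed rate. The sketch below is an illustration added here, not something the benchmark script is known to compute; it treats the 10,000 trials as independent draws, and `wilson_interval` is a helper defined only for this example.

```python
import math

def wilson_interval(successes, trials, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return center - half, center + half

trials = 10_000
for label, count in [("B wins", 4), ("Draws", 31)]:
    low, high = wilson_interval(count, trials)
    print(f"{label}: {count / trials:.2%} (approx. 95% interval {low:.3%} to {high:.3%})")
```

Under these assumptions the interval for B wins is wide relative to its point estimate, while the draw rate is pinned down to within roughly a factor of two.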

Relation to other site notes

Game theory — pinned definitions of optimal and protocol.
Blunder suite / static vs re-solving — different embedding (B blunders off the principal line; emphasis on commitment and Exchange timing).
This page — policy contrast: optimal A vs uniform random legal B, to separate “how hard is the game to lose against noise?” from blunder exploitation rates.
