Connections Evaluation Box Score
Latest runs for 23 models (each with >=11 puzzles and >40 guesses; sorted by solve rate, then accuracy, average time per game, and cost per game)
Model Ver Date GP W WIN-PCT ATT HIT MISS ERR ACC-PCT TIME AVG/G TOK TOK/G COST $/G
openai/o3 2.0.2 2025-10-03 11 11 1.000 44 44 0 0 1.000 53m26s 4m51s 222.9k 20.3k $0.76 $0.069
google/gemini-3-pro-preview 2.0.2 2025-11-18 20 20 1.000 81 80 0 1 .987 49m16s 2m27s 350.2k 17.5k $1.55 $0.078
openai/gpt-5 2.0.2 2025-10-03 11 11 1.000 45 44 1 0 .977 68m13s 6m12s 258.9k 23.5k $1.13 $0.103
x-ai/grok-4 2.0.2 2025-10-18 20 20 1.000 82 80 2 0 .975 139m16s 6m57s 543.7k 27.2k $2.53 $0.127
openrouter/sherlock-think-alpha 2.0.2 2025-11-16 20 20 1.000 83 80 3 0 .963 50m39s 2m31s 396.8k 19.8k $0.00 $0.000
x-ai/grok-4-fast 2.0.2 2025-10-03 20 20 1.000 83 80 3 0 .963 114m16s 5m42s 322.7k 16.1k $0.06 $0.003
anthropic/claude-opus-4.5 2.0.2 2025-11-24 20 20 1.000 84 80 4 0 .952 29m18s 1m27s 310.3k 15.5k $1.08 $0.054
anthropic/claude-4.5-sonnet 2.0.2 2025-11-25 20 19 .950 88 78 10 0 .886 29m18s 1m27s 303.1k 15.2k $0.96 $0.048
openai/gpt-5-mini 2.0.2 2025-12-03 20 19 .950 92 78 14 0 .847 78m55s 3m56s 393.8k 19.7k $0.30 $0.015
openrouter/sherlock-dash-alpha 2.0.2 2025-11-16 20 19 .950 91 77 14 0 .846 21m42s 1m5s 371.0k 18.6k $0.00 $0.000
google/gemini-2.5-pro 2.0.2 2025-10-18 20 18 .900 89 75 12 2 .842 41m4s 2m3s 457.1k 22.9k $1.30 $0.065
deepseek/deepseek-v3.2 2.0.2 2025-12-02 20 17 .850 92 72 20 0 .782 174m33s 8m43s 537.7k 26.9k $0.08 $0.004
moonshotai/kimi-k2-thinking 2.0.2 2025-11-12 20 17 .850 101 74 16 11 .732 337m26s 16m52s 716.8k 35.8k $0.65 $0.032
anthropic/claude-haiku-4.5 2.0.2 2025-10-16 11 9 .818 49 40 9 0 .816 9m20s 50s 207.0k 18.8k $0.22 $0.020
deepseek/deepseek-r1-0528 2.0.2 2025-10-18 20 16 .800 99 69 30 0 .696 369m31s 18m28s 855.4k 42.8k $0.98 $0.049
google/gemini-2.5-flash 2.0.2 2025-10-02 11 8 .727 49 32 17 0 .653 9m56s 54s 340.1k 30.9k $0.17 $0.016
openai/gpt-oss-120b 2.0.2 2025-10-18 20 14 .700 111 65 25 21 .585 97m23s 4m52s 775.2k 38.8k $0.12 $0.006
qwen/qwen3-max 2.0.2 2025-10-18 20 13 .650 106 63 29 14 .594 86m25s 4m19s 549.7k 27.5k $0.69 $0.034
moonshotai/kimi-k2-0905 2.0.2 2025-10-18 20 10 .500 100 53 47 0 .530 46m10s 2m18s 285.3k 14.3k $0.15 $0.008
openai/gpt-oss-20b 2.0.2 2025-10-18 20 8 .400 108 45 57 6 .416 162m48s 8m8s 1374.1k 68.7k $0.12 $0.006
z-ai/glm-4.6 2.0.2 2025-10-18 20 7 .350 115 45 48 22 .391 67m42s 3m23s 328.5k 16.4k $0.17 $0.008
microsoft/phi-4 2.0.2 2025-11-06 20 1 .050 102 17 79 6 .166 31m34s 1m34s 301.1k 15.1k $0.01 $0.001
baidu/ernie-4.5-21b-a3b-thinking 2.0.2 2025-10-18 20 0 .000 99 18 80 1 .181 134m58s 6m44s 891.8k 44.6k $0.11 $0.005
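
The derived columns and the ordering can be reproduced from the raw counts. The sketch below is not the evaluation harness's own code: the names `Run`, `fmt_pct`, `qualifies`, `sort_key`, and `box_score` are hypothetical. It assumes WIN-PCT = W / GP and ACC-PCT = HIT / ATT, truncated (not rounded) to three decimals, which matches rows such as 80/81 shown as .987. The sort directions (solve rate and accuracy descending, time and cost per game ascending) are also an assumption inferred from the row order.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """Raw per-model totals, mirroring the GP/W/ATT/HIT/TIME/COST columns."""
    model: str
    gp: int        # games (puzzles) played
    wins: int      # puzzles solved
    att: int       # total guesses submitted
    hit: int       # correct guesses
    seconds: int   # total wall-clock time in seconds
    cost: float    # total cost in dollars


def fmt_pct(num: int, den: int) -> str:
    """Box-score style ratio, truncated to three decimals (assumed, not rounded)."""
    thousandths = num * 1000 // den
    return "1.000" if thousandths >= 1000 else f".{thousandths:03d}"


def qualifies(run: Run) -> bool:
    """Filter from the caption: at least 11 puzzles and more than 40 guesses."""
    return run.gp >= 11 and run.att > 40


def sort_key(run: Run):
    """Solve rate and accuracy descending, then time and cost per game ascending."""
    return (
        -run.wins / run.gp,
        -run.hit / run.att,
        run.seconds / run.gp,
        run.cost / run.gp,
    )


def box_score(runs):
    """Return the qualifying runs in table order."""
    return sorted((r for r in runs if qualifies(r)), key=sort_key)


# Four rows copied from the table, plus one hypothetical row that fails the filter.
runs = [
    Run("x-ai/grok-4-fast", 20, 20, 83, 80, 114 * 60 + 16, 0.06),
    Run("openrouter/sherlock-think-alpha", 20, 20, 83, 80, 50 * 60 + 39, 0.00),
    Run("google/gemini-3-pro-preview", 20, 20, 81, 80, 49 * 60 + 16, 1.55),
    Run("openai/o3", 11, 11, 44, 44, 53 * 60 + 26, 0.76),
    Run("example/too-few-games", 5, 5, 20, 20, 600, 0.10),  # hypothetical
]

for r in box_score(runs):
    print(r.model, fmt_pct(r.wins, r.gp), fmt_pct(r.hit, r.att))
```

Under these assumptions the two models tied at 80/83 are separated by average time per game, which is why sherlock-think-alpha is listed ahead of grok-4-fast.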