Connections Evaluation Box Score

Latest runs for 39 models (20 games each, sorted by solve rate, avg time, cost)
Model                              Date         W    WIN%  HIT/ATT    ACC%  AVG/G  TOK/G   COST     $/G
google/gemini-3-flash-preview      2026-02-24  20  100.0%    80/85   94.1%    10s   4.9k  $0.10  $0.005
openai/gpt-5.3-codex               2026-02-26  20  100.0%    80/84   95.2%    36s   3.6k  $0.50  $0.025
x-ai/grok-4-fast                   2026-02-25  20  100.0%    80/85   94.1%    41s   8.5k  $0.06  $0.003
openai/o3                          2026-02-25  20  100.0%    80/82   97.6%   1m7s   8.2k  $1.05  $0.052
google/gemini-3.1-pro-preview      2026-02-24  20  100.0%    80/80  100.0%  1m15s   7.8k  $1.31  $0.066
google/gemini-3-pro-preview        2026-02-25  20  100.0%    80/81   98.8%  1m19s  10.7k  $1.94  $0.097
anthropic/claude-opus-4.6          2026-02-23  20  100.0%    80/80  100.0%  1m25s  17.7k  $3.53  $0.176
anthropic/claude-sonnet-4.6        2026-02-24  20  100.0%    80/80  100.0%  1m51s  29.4k  $3.53  $0.177
openai/gpt-5.2-pro                 2026-02-25  20  100.0%    80/80  100.0%   3m7s   3.9k  $7.44  $0.372
x-ai/grok-4                        2025-10-18  20  100.0%    80/82   97.6%  3m28s  13.6k  $2.53  $0.127
moonshotai/kimi-k2.5               2026-02-24  20  100.0%    80/89   89.9%  8m14s  13.4k  $0.51  $0.026
anthropic/claude-opus-4.5          2026-02-25  19   95.0%    78/84   92.9%    34s   7.6k  $1.52  $0.076
qwen/qwen3-max-thinking            2026-02-15  19   95.0%    78/87   89.7%  2m48s  15.3k  $0.88  $0.044
z-ai/glm-5                         2026-02-24  19   95.0%    77/89   86.5%  3m11s   9.2k  $0.41  $0.020
z-ai/glm-4.7                       2026-01-30  19   95.0%    77/88   87.5%  5m18s  21.0k  $0.76  $0.038
google/gemini-2.5-pro              2025-10-18  18   90.0%    75/89   84.3%   1m1s  11.4k  $1.30  $0.065
openai/gpt-5-mini                  2025-12-19  18   90.0%    69/84   82.1%  1m51s   7.9k  $0.24  $0.012
stepfun/step-3.5-flash             2026-02-15  18   90.0%    76/92   82.6%  5m42s  93.1k  $0.43  $0.021
anthropic/claude-4.5-sonnet        2025-12-19  17   85.0%    59/86   68.6%    38s   7.1k  $0.91  $0.045
deepseek/deepseek-v3.2             2025-12-02  17   85.0%    72/92   78.3%  4m21s  13.4k  $0.08  $0.004
moonshotai/kimi-k2-thinking        2025-11-12  17   85.0%   74/101   73.3%  8m26s  17.9k  $0.65  $0.032
deepseek/deepseek-r1-0528          2025-10-18  16   80.0%    69/99   69.7%  9m14s  21.4k  $0.98  $0.049
openai/gpt-5.2                     2025-12-19  15   75.0%    56/83   67.5%    42s   3.9k  $0.64  $0.032
openai/gpt-oss-120b                2025-10-18  14   70.0%   65/111   58.6%  2m26s  19.4k  $0.12  $0.006
qwen/qwen3.5-35b-a3b               2026-02-26  14   70.0%    63/89   70.8%  3m10s  24.3k  $0.73  $0.037
anthropic/claude-haiku-4.5         2025-12-19  13   65.0%    47/94   50.0%    37s  12.5k  $0.56  $0.028
qwen/qwen3-max                     2025-10-18  13   65.0%   63/106   59.4%   2m9s  13.7k  $0.69  $0.034
qwen/qwen3.5-flash-02-23           2026-02-26  13   65.0%    61/97   62.9%  2m53s  30.3k  $0.19  $0.010
moonshotai/kimi-k2-0905            2025-10-18  10   50.0%   53/100   53.0%   1m9s   7.1k  $0.15  $0.008
openai/o4-mini                     2025-12-20   9   45.0%    35/97   36.1%  3m33s  29.3k  $2.41  $0.121
openai/gpt-oss-20b                 2025-10-18   8   40.0%   45/108   41.7%   4m4s  34.4k  $0.12  $0.006
z-ai/glm-4.6                       2025-10-18   7   35.0%   45/115   39.1%  1m41s   8.2k  $0.17  $0.008
openai/o3-mini                     2025-12-20   5   25.0%   25/100   25.0%  2m38s  21.6k  $1.72  $0.086
minimax/minimax-m2.5               2026-02-14   3   15.0%    24/95   25.3%   2m5s   8.8k  $0.13  $0.007
amazon/nova-pro-v1                 2025-12-20   1    5.0%     8/90    8.9%    11s   4.7k  $0.13  $0.006
microsoft/phi-4                    2025-11-06   1    5.0%   17/102   16.7%    47s   7.5k  $0.01  $0.001
meta-llama/llama-3.3-70b-instruct  2025-12-20   1    5.0%     7/97    7.2%    50s   4.5k  $0.03  $0.001
mistralai/mistral-large            2025-12-20   1    5.0%     9/99    9.1%    58s  10.6k  $0.68  $0.034
baidu/ernie-4.5-21b-a3b-thinking   2025-10-18   0    0.0%    18/99   18.2%  3m22s  22.3k  $0.11  $0.005
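
The derived columns and the row order can be reproduced with a few lines of Python. This is a minimal sketch, not the leaderboard's actual code. It assumes that W counts solved games out of 20, that WIN% is W divided by 20, that ACC% is HIT divided by ATT, and that $/G is COST divided by 20; these readings are inferred from the column names and the row arithmetic. The `Run` class and the helper names are hypothetical. Because COST is shown rounded to cents, a $/G recomputed from it can differ from the table in the last digit (the two $3.53 rows list $0.176 and $0.177).

```python
import re
from dataclasses import dataclass

GAMES_PER_MODEL = 20  # from the caption: "20 games each"


@dataclass
class Run:
    """One table row (hypothetical container, fields named after the columns)."""
    model: str
    wins: int        # W
    hits: int        # HIT
    attempts: int    # ATT
    avg_time: str    # AVG/G, e.g. "1m7s" or "36s"
    cost: float      # COST in dollars for all games


def parse_duration(text: str) -> int:
    """Convert a string such as '1m7s' or '36s' into seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(?:(\d+)s)?", text)
    if not text or match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes or 0) * 60 + int(seconds or 0)


def win_pct(run: Run) -> float:
    return 100 * run.wins / GAMES_PER_MODEL


def acc_pct(run: Run) -> float:
    return 100 * run.hits / run.attempts


def cost_per_game(run: Run) -> float:
    return run.cost / GAMES_PER_MODEL


def sort_key(run: Run):
    # Solve rate descending, then average time ascending, then cost ascending.
    return (-win_pct(run), parse_duration(run.avg_time), run.cost)


# Four rows copied from the table, deliberately out of order.
runs = [
    Run("openai/o3", 20, 80, 82, "1m7s", 1.05),
    Run("anthropic/claude-opus-4.5", 19, 78, 84, "34s", 1.52),
    Run("google/gemini-3-flash-preview", 20, 80, 85, "10s", 0.10),
    Run("openai/gpt-5.3-codex", 20, 80, 84, "36s", 0.50),
]

ranked = sorted(runs, key=sort_key)
for run in ranked:
    print(f"{run.model:<32} {win_pct(run):5.1f}% {acc_pct(run):5.1f}% "
          f"${cost_per_game(run):.3f}")
```

Sorting on a tuple keeps the tie-breaking explicit: anthropic/claude-opus-4.5 has the second-fastest average time of the four sample rows but still ranks last among them, because solve rate is compared first.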