Active

Chess Scout

A data-heavy chess scouting product used to test whether AI models can improve UX, insight quality, report clarity, and code reliability.

Product baseline

V1 turns public Chess.com games into scouting notes: habits, weak spots, repeated patterns, and practical next-game advice.

Current verdict / next step

Run the first V2 comparison against the frozen Chess Scout baseline.

What this case tests

Data-heavy UX, Insight ranking, Report clarity, Backend logic

Linked evidence

Related reports and model runs

Baseline Report, Published draft

Chess Scout V1 baseline

AI helped with structure and UI direction, but product judgement needed human correction.

GPT, Claude, Gemini