A practical way to build and test an AI system is to tie it to an AI benchmark. In most cases, the benchmark is realized as a computer game that a program has to win. Typical examples are OpenAI, chess, StarCraft, RoboCup and the Arcade Learning Environment (ALE).
A more general approach is a general video game challenge. Here the task is for a single agent to play many different games. That is, the same software program is able to play both chess and backgammon.
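The idea of one agent playing several games can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Game` interface, the `RandomAgent` class and the `play` helper are hypothetical names, not part of any benchmark framework, and single-pile Nim and tic-tac-toe stand in for chess and backgammon to keep the example short.

```python
import random
from typing import Hashable, List, Protocol


class Game(Protocol):
    """Hypothetical minimal interface that every game has to expose."""

    def legal_moves(self) -> List[Hashable]: ...
    def apply(self, move: Hashable) -> None: ...
    def is_over(self) -> bool: ...


class Nim:
    """Single-pile Nim: each move removes 1 to 3 stones from the pile."""

    def __init__(self, stones: int = 10) -> None:
        self.stones = stones

    def legal_moves(self) -> List[int]:
        return [n for n in (1, 2, 3) if n <= self.stones]

    def apply(self, move: int) -> None:
        self.stones -= move

    def is_over(self) -> bool:
        return self.stones == 0


class TicTacToe:
    """3x3 tic-tac-toe; cells are numbered 0 to 8, row by row."""

    LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
             (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

    def __init__(self) -> None:
        self.board: List[object] = [None] * 9
        self.player = 0  # index of the player whose turn it is

    def legal_moves(self) -> List[int]:
        return [i for i, cell in enumerate(self.board) if cell is None]

    def apply(self, move: int) -> None:
        self.board[move] = self.player
        self.player = 1 - self.player

    def has_winner(self) -> bool:
        return any(
            self.board[a] is not None
            and self.board[a] == self.board[b] == self.board[c]
            for a, b, c in self.LINES
        )

    def is_over(self) -> bool:
        return self.has_winner() or not self.legal_moves()


class RandomAgent:
    """Game-agnostic agent: it only sees the Game interface, never the rules."""

    def __init__(self, seed: int = 0) -> None:
        self.rng = random.Random(seed)

    def choose(self, game: Game) -> Hashable:
        return self.rng.choice(game.legal_moves())


def play(game: Game, agent: RandomAgent) -> int:
    """Let the agent pick every move until the game ends; return the move count."""
    moves = 0
    while not game.is_over():
        game.apply(agent.choose(game))
        moves += 1
    return moves


if __name__ == "__main__":
    agent = RandomAgent(seed=0)
    for game in (Nim(10), TicTacToe()):
        print(type(game).__name__, "finished after", play(game, agent), "moves")
```

The agent here picks moves at random, so it plays badly. The point of the sketch is only that the same agent code runs unchanged on both games, because it depends on the shared interface and not on game-specific rules.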
- Canaan, Rodrigo, et al. "Leveling the Playing Field - Fairness in AI Versus Human Game Benchmarks." arXiv preprint arXiv:1903.07008 (2019).
- General Video Game Playing in Research: http://www.gvgai.net/papers.php