Submitted: 30 January 2026
Posted: 02 February 2026
Abstract
Keywords:
1. Introduction
1.1. Connect-4 as a Benchmark Game
1.2. From Classical Search to Learning-Based Approaches
1.3. Explainability and Human–AI Interaction
1.4. Emerging Trends and Motivation for This Review
- A structured and focused survey of artificial intelligence research applied to the Connect-4 game, addressing the lack of a consolidated Connect-4-specific review in the existing literature.
- A multidimensional taxonomy that organises existing Connect-4 research into clearly defined categories, including game-theoretical foundations, algorithmic and learning-based approaches, strategic and tactical reasoning, explainability, and computational and formal techniques.
- A taxonomy-driven synthesis of the literature, highlighting how different classes of methods are positioned within the proposed framework and how they address distinct aspects of Connect-4 gameplay.
- An analysis of open challenges and future research directions, derived from gaps identified across the taxonomy, with particular attention to explainability, adaptability, and evaluation practices.
2. Review Protocol
2.1. Data Sources
2.2. Search Strategy
("Connect-4" OR "Connect Four" OR "four-in-a-row" OR "four in a row") OR (("gravity-based game" OR "connection game" OR "alignment game") AND ("general game playing" OR "board game" OR "perfect-information game"))
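The Boolean search string above can be read as a predicate over a record's title or abstract. The following is a minimal sketch of that reading, assuming case-insensitive substring matching; real bibliographic databases apply their own tokenisation and field rules, and the name `matches_query` is hypothetical.

```python
# Illustrative only: substring matching is an assumption, not how any
# particular database evaluates the query.
PRIMARY = ("connect-4", "connect four", "four-in-a-row", "four in a row")
FAMILY = ("gravity-based game", "connection game", "alignment game")
CONTEXT = ("general game playing", "board game", "perfect-information game")


def matches_query(text: str) -> bool:
    """Return True if `text` satisfies the Boolean search string."""
    lowered = text.lower()

    def has_any(terms):
        return any(term in lowered for term in terms)

    # (primary terms) OR ((game-family terms) AND (context terms))
    return has_any(PRIMARY) or (has_any(FAMILY) and has_any(CONTEXT))
```

Under this reading, a record that mentions only a game-family term (for example "connection game") without one of the context terms is not retrieved, which keeps the broader second clause from pulling in unrelated work.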
2.3. Study Selection and Eligibility Criteria
- Title screening: 50 records were excluded as clearly irrelevant, leaving 103 records.
- Abstract screening: 37 records were excluded, leaving 66 records.
- Full-text screening: 17 records were excluded as closely related but outside the review scope, resulting in 49 studies included in the final review.
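The screening funnel can be checked with simple arithmetic. The sketch below takes the per-stage exclusion counts and the final total of 49 as given and works backwards; the initial record count it derives is not stated in the list above and is an inference from those figures.

```python
# Exclusion counts and final total as reported in the screening steps.
excluded = {"title": 50, "abstract": 37, "full_text": 17}
included = 49

# Work backwards from the final total to the record count at each stage.
after_abstract = included + excluded["full_text"]
after_title = after_abstract + excluded["abstract"]
identified = after_title + excluded["title"]

print(identified, after_title, after_abstract, included)
# prints: 153 103 66 49
```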
2.4. Data Extraction and Synthesis
3. Taxonomy and Research Directions
3.1. Game Theoretical Foundations
3.1.1. Optimal Strategies
3.1.2. NP-Completeness and Solved States
3.1.3. Game Theory in Two-Player Zero-Sum Games
3.2. Algorithmic Approaches
3.2.1. Minimax Algorithm
3.2.2. Alpha-Beta Pruning
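As a point of reference for the two classical search methods named above, the following is a minimal sketch of minimax with alpha-beta pruning on a Connect-4 board, written in negamax form. It is not taken from any surveyed paper: the board representation, the function names, the centre-first move ordering, and the choice to score depth cut-offs as draws are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

ROWS, COLS = 6, 7  # standard board size
Board = List[List[int]]  # board[col] is a stack of pieces (1 or -1), bottom first


def new_board(cols: int = COLS) -> Board:
    return [[] for _ in range(cols)]


def legal_moves(board: Board, rows: int = ROWS) -> List[int]:
    # Centre-first ordering (an assumed heuristic) tends to help pruning.
    centre = (len(board) - 1) / 2
    open_cols = [c for c in range(len(board)) if len(board[c]) < rows]
    return sorted(open_cols, key=lambda c: abs(c - centre))


def cell(board: Board, c: int, r: int) -> int:
    if 0 <= c < len(board) and 0 <= r < len(board[c]):
        return board[c][r]
    return 0  # empty or off-board


def is_win(board: Board, col: int) -> bool:
    """Check whether the last piece dropped in `col` completes four in a row."""
    row = len(board[col]) - 1
    player = board[col][row]
    for dc, dr in ((1, 0), (0, 1), (1, 1), (1, -1)):
        count = 1
        for sign in (1, -1):  # walk both ways along the line
            c, r = col + sign * dc, row + sign * dr
            while cell(board, c, r) == player:
                count += 1
                c, r = c + sign * dc, r + sign * dr
        if count >= 4:
            return True
    return False


def alphabeta(board: Board, player: int, depth: int, alpha: int = -1,
              beta: int = 1, rows: int = ROWS) -> Tuple[int, Optional[int]]:
    """Return (value, column) from `player`'s view: 1 win, -1 loss, 0 otherwise.

    Depth cut-offs and full boards are scored 0 (no heuristic evaluation).
    """
    moves = legal_moves(board, rows)
    if depth == 0 or not moves:
        return 0, None
    best_val, best_col = -2, None
    for col in moves:
        board[col].append(player)
        if is_win(board, col):
            val = 1
        else:
            # Negamax: the opponent's value is negated and the window flipped.
            val = -alphabeta(board, -player, depth - 1, -beta, -alpha, rows)[0]
        board[col].pop()
        if val > best_val:
            best_val, best_col = val, col
        alpha = max(alpha, val)
        if alpha >= beta:
            break  # remaining siblings cannot change the result
    return best_val, best_col
```

For example, with three pieces of player `1` on the bottom row of columns 0 to 2, `alphabeta(board, 1, 2)` selects column 3 as an immediate win, while `alphabeta(board, -1, 2)` selects the same column as the only block. Practical solvers add transposition tables, bitboards, and heuristic evaluation, none of which are shown here.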
3.2.3. Reinforcement Learning
3.2.4. Monte Carlo Tree Search
3.2.5. Hybrid Methods
3.3. Strategy and Tactical Play
3.3.1. Tactical Decision Making
3.3.2. Forced Moves and Traps
3.3.3. Endgame Strategies
3.4. AI Explainability
3.4.1. Explaining AI Decisions
3.4.2. Trust in AI for Game Strategies
3.5. Computational Enhancements and Formal Analysis
3.5.1. Parallel Search and Optimisation
3.5.2. Formal Verification of AI Moves
| Ref | Year | Title | Algorithm | ML | SOA | TA | Var | CA | C/M |
|---|---|---|---|---|---|---|---|---|---|
| [57] | 2017 | Data Mining in Adversarial Search - Players Movement Prediction in Connect-4 Games | Random, Minimax, MiniMaxHori | ✓ | ✓ | ✓ | ✗ | ✓ | ✓ |
| [58] | 2021 | Computing Games: Bridging the Gap Between Search and Entertainment | PPNS, PNS, MCPNS, MCTS | ✓ | ✓ | ✓ | ✗ | ✓ | ✗ |
| [59] | 2022 | Scaling Laws for a Multi-Agent Reinforcement Learning Model | AlphaZero | ✓ | ✓ | ✓ | ✗ | ✓ | ✓ |
| [60] | 2025 | Formal Verification of Multi-Thread Minimax Behavior Using mCRL2 in the Connect-4 | LTS, Minimax | ✗ | ✓ | ✓ | ✓ | ✓ | ✓ |
4. Discussion
5. Open Challenges and Future Research Directions
6. Conclusion
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Steele, R.; Larremore, D.B. Misère Connect Four is Solved. ICGA Journal 2025, 47, 118–129.
- Allis, L.V. A knowledge-based approach of connect-four. J. Int. Comput. Games Assoc. 1988, 11, 165.
- Stuart, B.L. Connect 4 as a problem in artificial intelligence and robotics. ACM SIGCSE Bulletin 1994, 26, 41–46.
- Allis, L.V. Searching for solutions in games and artificial intelligence. Ph.D. Thesis, University of Limburg, 1994.
- Scheiermann, J.; Konen, W. AlphaZero-inspired game learning: Faster training by using MCTS only at test time. IEEE Transactions on Games 2022, 15, 637–647.
- Wu, T.R.; Guei, H.; Peng, P.C.; Huang, P.W.; Wei, T.H.; Shih, C.C.; Tsai, Y.J. Minizero: Comparative analysis of alphazero and muzero on go, othello, and atari games. IEEE Transactions on Games, 2024.
- Baier, H.; Kaisers, M. Novelty in Monte Carlo Tree Search. IEEE Transactions on Games, 2025.
- Zhu, Y.; Cui, G.; Liu, A.; Jia, Q.S.; Guan, X.; Zhai, Q.; Guo, Q.; Guo, X. A Reinforcement Learning Embedded Surrogate Lagrangian Relaxation Method for Fast Solving Unit Commitment Problems. IEEE Transactions on Power Systems, 2025.
- Hassija, V.; Chamola, V.; Mahapatra, A.; Singal, A.; Goel, D.; Huang, K.; Scardapane, S.; Spinelli, I.; Mahmud, M.; Hussain, A. Interpreting black-box models: a review on explainable artificial intelligence. Cognitive Computation 2024, 16, 45–74.
- Dunning, R.E.; Fischhoff, B.; Davis, A.L. When do humans heed AI agents’ advice? When should they? Human Factors 2024, 66, 1914–1927.
- Schultz, J.; Adamek, J.; Jusup, M.; Lanctot, M.; Kaisers, M.; Perrin, S.; Hennes, D.; Shar, J.; Lewis, C.; Ruoss, A.; et al. Mastering board games by external and internal planning with language models. arXiv 2024. arXiv:2412.12119.
- Ahmad, Z.; Jehangiri, A.I.; Ala’anzy, M.A.; Othman, M.; Latip, R.; Zaman, S.K.U.; Umar, A.I. Scientific workflows management and scheduling in cloud computing: taxonomy, prospects, and challenges. IEEE Access 2021, 9, 53491–53508.
- Ala’anzy, M.; Othman, M. Load balancing and server consolidation in cloud computing environments: a meta-study. IEEE Access 2019, 7, 141868–141887.
- Sheoran, K.; Dhand, G.; Dabas, M.; Dahiya, N.; Pushparaj, P. Solving connect 4 using optimized minimax and monte carlo tree search. Mili Publications. Advances and Applications in Mathematical Sciences 2022, 21, 3303–3313.
- Taylor, H.; Stella, L. An Evolutionary Framework for Connect-4 as Test-Bed for Comparison of Advanced Minimax, Q-Learning and MCTS. arXiv 2024. arXiv:2405.16595.
- Kuramitsu, H.; Suzuki, K.; Matsuzawa, T. N-Tuple Network Search in Othello Using Genetic Algorithms. Games 2025, 16, 5.
- Cutsinger, J.; Wylie, T. Row Shifting as a Puzzle Mechanic in Generalized Connect Four. In Proceedings of the 2024 IEEE Conference on Games (CoG). IEEE, 2024; pp. 1–4.
- Bagan, G.; Duchêne, E.; Galliot, F.; Gledel, V.; Mikalački, M.; Oijid, N.; Parreau, A.; Stojaković, M. Poset positional games. Discrete Mathematics 2025, 348, 114455.
- Jiralerspong, M.; Sun, B.; Vucetic, D.; Zhang, T.; Bengio, Y.; Gidel, G.; Malkin, N. Expected flow networks in stochastic environments and two-player zero-sum games. arXiv 2023. arXiv:2310.02779.
- Primanita, A.; Khalid, M.N.A.; Iida, H. Characterizing the nature of probability-based proof number search: A case study in the othello and connect four games. Information 2020, 11, 264.
- Touré, A.W. Evaluation of the Use of Minimax Search in Connect-4—How Does the Minimax Search Algorithm Perform in Connect-4 with Increasing Grid Sizes? Applied Mathematics 2023, 14, 419–427.
- Tommy, L.; Hardjianto, M.; Agani, N. The analysis of alpha beta pruning and MTD (f) algorithm to determine the best algorithm to be implemented at connect four prototype. Proceedings of the IOP Conference Series: Materials Science and Engineering 2017, Vol. 190, 012044.
- Weiner, E.M.; Montañez, G.D.; Trujillo, A.; Molavi, A. Hyperparameter Choice as Search Bias in AlphaZero. In Proceedings of the 2021 IEEE International Conference on Systems, Man, and Cybernetics (SMC), 2021; IEEE; pp. 2389–2394.
- Thill, M.; Koch, P.; Konen, W. Reinforcement learning with n-tuples on the game Connect-4. In Proceedings of the Parallel Problem Solving from Nature-PPSN XII: 12th International Conference, Taormina, Italy, September 1-5, 2012; Springer, 2012; Proceedings, Part I 12, pp. 184–194.
- Yuan, C.; Al Forhad, M.A.; Bansal, R.; Sidorova, A.; Albert, M.V. Multi-agent Dual Level Reinforcement Learning of Strategy and Tactics in Competitive Games. Results in Control and Optimization 2024, 16, 100471.
- Bagheri, S.; Thill, M.; Koch, P.; Konen, W. Online adaptable learning rates for the game Connect-4. IEEE Transactions on Computational Intelligence and AI in Games 2014, 8, 33–42.
- Lei, S.; Lee, K.; Li, L.; Park, J. Learning Strategy Representation for Imitation Learning in Multi-Agent Games. Proceedings of the AAAI Conference on Artificial Intelligence 2025, Vol. 39, 18163–18171.
- Willemsen, D.; Baier, H.; Kaisers, M. Value targets in off-policy AlphaZero: a new greedy backup. Neural Computing and Applications 2022, 34, 1801–1814.
- Świechowski, M.; Godlewski, K.; Sawicki, B.; Mańdziuk, J. Monte Carlo tree search: A review of recent modifications and applications. Artificial Intelligence Review 2023, 56, 2497–2562.
- Kadam, P.; Xu, R.; Lieberherr, K. Dual Monte Carlo Tree Search. arXiv 2021, arXiv:2103.11517.
- Ji, J.; Thielscher, M. MCTS with Dynamic Depth Minimax. In Advances in Computer Games; Springer, 2023; pp. 63–75.
- Sironi, C.F.; Winands, M.H. Analysis of the Impact of Randomization of Search-Control Parameters in Monte-Carlo Tree Search. Journal of Artificial Intelligence Research 2021, 72, 717–757.
- Baier, H.; Winands, M.H. MCTS-minimax hybrids. IEEE Transactions on Computational Intelligence and AI in Games 2014, 7, 167–179.
- Baier, H.; Winands, M.H. Monte-carlo tree search and minimax hybrids. In Proceedings of the 2013 IEEE Conference on Computational Inteligence in Games (CIG), 2013; IEEE; pp. 1–8.
- Clausen, C.; Reichhuber, S.; Thomsen, I.; Tomforde, S. Improvements to Increase the Efficiency of the AlphaZero Algorithm: A Case Study in the Game ‘Connect 4’. In Proceedings of the ICAART (2), 2021; pp. 803–811.
- Cleaver, N.; Neshatian, K. Transfer Learning in Monte Carlo Tree Search. In Proceedings of the 2023 IEEE Asia-Pacific Conference on Computer Science and Data Engineering (CSDE), 2023; IEEE; pp. 1–7.
- Jung, J.D.; Hoey, J. Heuristic Knowledge Transfer for General Game Playing. In Proceedings of the 2024 IEEE Conference on Games (CoG), 2024; IEEE; pp. 1–8.
- Fernández-Conde, J.; Cuenca-Jiménez, P.; Cañas, J.M. Hybrid Training Strategies: Improving Performance of Temporal Difference Learning in Board Games. Applied Sciences 2022, 12, 2854.
- Trudeau, A.; Bowling, M. Targeted Search Control in AlphaZero for Effective Policy Improvement. In Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, 2023; pp. 842–850.
- Cotton, D.; Traish, J.; Chaczko, Z. Coevolutionary deep reinforcement learning. In Proceedings of the 2020 IEEE Symposium Series on Computational Intelligence (SSCI), 2020; IEEE; pp. 2600–2607.
- Runarsson, T.P.; Lucas, S.M. On imitating Connect-4 game trajectories using an approximate n-tuple evaluation function. In Proceedings of the 2015 IEEE Conference on Computational Intelligence and Games (CIG), 2015; IEEE; pp. 208–213.
- Ahmed, U.; Chatterjee, K.; Gulwani, S. Automatic generation of alternative starting positions for simple traditional board games. Proceedings of the AAAI Conference on Artificial Intelligence 2015, Vol. 29, 1–10.
- Shashkov, A.; Hemberg, E.; Tulla, M.; O’Reilly, U.M. Adversarial agent-learning for cybersecurity: a comparison of algorithms. The Knowledge Engineering Review 2023, 38, e3.
- Goodman, J.; Perez-Liebana, D.; Lucas, S. Measuring Randomness in Tabletop Games. In Proceedings of the 2024 IEEE Conference on Games (CoG), 2024; IEEE; pp. 1–8.
- Moreno-Calderón, S.; Martínez-Cagigal, V.; Santamaría-Vázquez, E.; Pérez-Velasco, S.; Marcos-Martínez, D.; Hornero, R. Assessing the Potential of Brain-Computer Interface Multiplayer Video Games using c-VEPs: A Pilot Study. In Proceedings of the 2023 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC). IEEE, 2023; pp. 1–4.
- Hill, G.; Kemp, S.M. Connect 4: A novel paradigm to elicit positive and negative insight and search problem solving. Frontiers in psychology 2018, 9, 1755.
- Baier, H.; Winands, M.H. Time management for Monte Carlo tree search. IEEE transactions on computational intelligence and AI in games 2015, 8, 301–314.
- Garapati, S.; Karlapalem, K. Empirical evaluation of idle-time analysis driven improved decision making by always-on agents. Proceedings of the 2018 IEEE 42nd Annual Computer Software and Applications Conference (COMPSAC) 2018, Vol. 2, 141–146.
- Wäldchen, S.; Pokutta, S.; Huber, F. Training characteristic functions with reinforcement learning: XAI-methods play connect four. In Proceedings of the International Conference on Machine Learning. PMLR, 2022; pp. 22457–22474.
- Duan, J.; Wang, S.; Diffenderfer, J.; Sun, L.; Chen, T.; Kailkhura, B.; Xu, K. ReTA: Recursively Thinking Ahead to Improve the Strategic Reasoning of Large Language Models. Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2024, Volume 1, 2232–2246.
- Nasir, J.; Oppliger, P.; Bruno, B.; Dillenbourg, P. Questioning wizard of oz: effects of revealing the wizard behind the robot. In Proceedings of the 2022 31st IEEE international conference on robot and human interactive communication (RO-MAN), 2022; IEEE; pp. 1385–1392.
- Matarese, M.; Cocchella, F.; Rea, F.; Sciutti, A. Ex(plainable) machina: how social-implicit xai affects complex human-robot teaming tasks. In Proceedings of the 2023 IEEE International Conference on Robotics and Automation (ICRA), 2023; IEEE; pp. 11986–11993.
- Matarese, M.; Cocchella, F.; Rea, F.; Sciutti, A. Natural Born Explainees: how users’ personality traits shape the human-robot interaction with explainable robots. In Proceedings of the 2023 32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), 2023; IEEE; pp. 1786–1793.
- Holz, E.M.; Höhne, J.; Staiger-Sälzer, P.; Tangermann, M.; Kübler, A. Brain–computer interface controlled gaming: Evaluation of usability by severely motor restricted end-users. Artificial intelligence in medicine 2013, 59, 111–120.
- Moreno-Calderón, S.; Martínez-Cagigal, V.; Santamaría-Vázquez, E.; Pérez-Velasco, S.; Marcos-Martínez, D.; Hornero, R. Combining brain-computer interfaces and multiplayer video games: An application based on c-VEPs. Frontiers in Human Neuroscience 2023, 17, 1227727.
- Nash, K.; Lea, J.M.; Davies, T.; Yogeeswaran, K. The bionic blues: Robot rejection lowers self-esteem. Computers in human behavior 2018, 78, 59–63.
- Ribeiro, A.C.; Rios, L.M.; Gomes, R.M.; Faria, B.M.; Reis, L.P. Data mining in adversarial search—players movement prediction in connect 4 games. In Proceedings of the 2017 12th Iberian Conference on Information Systems and Technologies (CISTI), 2017; IEEE; pp. 1–6.
- Primanita, A.; Khalid, M.N.A.; Iida, H. Computing games: Bridging the gap between search and entertainment. IEEE Access 2021, 9, 72087–72102.
- Neumann, O.; Gros, C. Scaling laws for a multi-agent reinforcement learning model. arXiv 2022, arXiv:2210.00849.
- Escobar, D.; Insuasti, J. Formal Verification of Multi-Thread Minimax Behavior Using mCRL2 in the Connect 4. Mathematics 2025, 13, 96.








| Ref. | Sub-area | Technique / Focus | Claimed benefit | Key limitation(s) |
|---|---|---|---|---|
| [41] | TD | Preference learning with n-tuple features (imitation of expert trajectories) | Learns after-state evaluation functions aligned with expert play; improved generalisation via DAgger | Depends on distribution of training trajectories; limited opponent modelling |
| [42] | TD | Automatic generation of balanced or skill-adjusted start states | Symbolic search + simulation to create positions that calibrate difficulty and increase variety | Some generated starts favour one player; fairness calibration needed |
| [43] | TD | Alpha-Beta Pruning for adversarial decision-making (cybersecurity testbed) | Prunes large parts of the game tree to improve minimax efficiency | Limited adaptability in dynamic settings; computationally heavy for large state spaces |
| [46] | EG | MCTS & Alpha-Beta for late-game search | Focuses on critical late stages; structured optimisation of final plies | Deep searches costly in real time; poor transfer to 3D variants |
| [47] | EG | Dynamic time allocation for MCTS (STOP, BEHIND, UNST, CLOSE) | Allocates time where most needed; improves decision efficiency | Requires careful tuning; misallocation can waste time or miss key moves |
| [48] | EG | Idle-time analysis (FDS, ADS, OADS) | Uses idle cycles to precompute responses; improves win rates and reaction speed | Assumes deterministic settings; effectiveness in stochastic games untested |
| [44] | FM&T | Stochasticity measurement and control in search (parallel MCTS) | Moderate randomness reduces redundant exploration and speeds convergence | Mostly theoretical; lacks large-scale GPU implementations |
| [45] | FM&T | c-VEP BCI-controlled Connect-4 | Enables non-traditional interaction with high move-selection accuracy | Accuracy drops in multiplayer; calibration overhead and fatigue issues |
| Ref | Year | Title | Algorithm | ML | SOA | TA | Var | CA | C/M |
|---|---|---|---|---|---|---|---|---|---|
| [7] | 2025 | Novelty in Monte Carlo Tree Search | MCTS with Novelty Search | ✗ | ✓ | ✓ | ✓ | ✓ | ✓ |
| [5] | 2022 | AlphaZero-Inspired Game Learning: Faster Training by Using MCTS Only at Test Time | MCTS, TD Learning, n-tuple Networks | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| [14] | 2022 | Solving connect-4 using Optimized Minimax and Monte Carlo Tree Search | Minimax, Alpha-Beta Pruning, Dynamic Programming, MCTS | ✗ | ✓ | ✓ | ✓ | ✓ | ✓ |
| [15] | 2024 | An Evolutionary Framework for Connect-4 as Test-Bed for Comparison of Advanced Minimax, Q-Learning and MCTS | Advanced Minimax, Q-Learning, MCTS | ✓ | ✓ | ✓ | ✗ | ✓ | ✓ |
| [16] | 2025 | N-Tuple Network Search in Othello Using Genetic Algorithms | GA, BRKGA | ✓ | ✓ | ✓ | ✗ | ✓ | ✗ |
| [17] | 2024 | Row Shifting as a Puzzle Mechanic in Generalized Connect-4 | Shift-Tac-Toe, Game Tree Search, NP-Completeness Analysis | ✗ | ✓ | ✓ | ✓ | ✓ | ✓ |
| [18] | 2025 | Poset Positional Games | - | ✗ | ✓ | ✓ | ✓ | ✓ | ✓ |
| [19] | 2023 | Expected Flow Networks in Stochastic Environments and Two-player Zero-sum Games | EFlowNet, AFlowNet | ✓ | ✓ | ✓ | ✗ | ✓ | ✓ |
| [20] | 2020 | Characterizing the Nature of Probability-Based Proof Number Search: A Case Study in the Othello and Connect-4 Games | PPNS, PNS, MCPNS | ✗ | ✓ | ✓ | ✓ | ✓ | ✓ |
| [21] | 2023 | Evaluation of the Use of Minimax Search in Connect-4 | Minimax, Alpha-Beta Pruning | ✗ | ✓ | ✓ | ✓ | ✓ | ✗ |
| [29] | 2023 | Monte Carlo Tree Search: A Review of Recent Modifications and Applications | MCTS | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| [30] | 2021 | Dual Monte Carlo Tree Search | AlphaZero, MPV-MCTS, Dual MCTS | ✓ | ✗ | ✓ | ✗ | ✓ | ✓ |
| [31] | 2023 | MCTS with Dynamic Depth Minimax | Minimax, MCTS | ✗ | ✓ | ✓ | ✓ | ✓ | ✗ |
| [32] | 2021 | Analysis of the Impact of Randomization of Search-Control Parameters in Monte-Carlo Tree Search | MCTS | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ |
| [24] | 2012 | Reinforcement Learning with N-tuples on the Game Connect-4 | TDL | ✓ | ✗ | ✓ | ✗ | ✓ | ✓ |
| [25] | 2024 | Multi-agent Dual Level Reinforcement Learning of Strategy and Tactics in Competitive Games | Multi-agent RL, Minimax | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ |
| [26] | 2014 | Online Adaptable Learning Rates for the Game Connect-4 | TDL, IDBD, TCL | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| [27] | 2025 | Learning Strategy Representation for Imitation Learning in Multi-Agent Games | STRIL Framework | ✓ | ✗ | ✗ | ✗ | ✓ | ✓ |
| [28] | 2022 | Value targets in off-policy AlphaZero: a new greedy backup | A0GB | ✓ | ✗ | ✓ | ✓ | ✓ | ✓ |
| [33] | 2015 | MCTS-Minimax Hybrids | MCTS, Alpha-Beta Pruning, Minimax | ✗ | ✓ | ✓ | ✓ | ✓ | ✗ |
| [34] | 2013 | Monte-Carlo Tree Search and Minimax Hybrids | MCTS, Minimax Hybrid | ✗ | ✓ | ✓ | ✗ | ✓ | ✓ |
| [35] | 2021 | Improvements to Increase the Efficiency of the AlphaZero Algorithm: A Case Study in the Game Connect-4 | MCTS, AlphaZero | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ |
| [36] | 2023 | Transfer Learning in Monte Carlo Tree Search | MCTS | ✓ | ✗ | ✓ | ✗ | ✓ | ✗ |
| [37] | 2024 | Heuristic Knowledge Transfer for General Game Playing | MCTS, AlphaZero | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ |
| [38] | 2022 | Hybrid Training Strategies: Improving Performance of Temporal Difference Learning in Board Games | TDL, Hybrid Training | ✓ | ✗ | ✓ | ✗ | ✓ | ✓ |
| [39] | 2023 | Targeted Search Control in AlphaZero for Effective Policy Improvement | Go-Exploit | ✓ | ✓ | ✗ | ✗ | ✓ | ✓ |
| [40] | 2020 | Coevolutionary Deep Reinforcement Learning | Coevolutionary RL | ✓ | ✓ | ✗ | ✗ | ✓ | ✓ |
| [22] | 2017 | The Analysis of Alpha Beta Pruning and MTD(f) Algorithm to Determine the Best Algorithm to be Implemented at Connect Four Prototype | Alpha Beta Pruning, MTD(f) | ✗ | ✓ | ✓ | ✗ | ✓ | ✓ |
| [23] | 2021 | Hyperparameter Choice as Search Bias in AlphaZero | MCTS, AlphaZero | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| [41] | 2015 | On imitating Connect-4 game trajectories using an approximate n-tuple evaluation function | n-tuple evaluation, Preference Learning | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ |
| [42] | 2015 | Automatic Generation of Alternative Starting Positions for Simple Traditional Board Games | Symbolic Methods, Iterative Simulation, Minimax | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ |
| [43] | 2023 | Adversarial agent-learning for cybersecurity: a comparison of algorithms | A2C, AlphaZero, MCTS, ES, CEM, DRL | ✓ | ✓ | ✓ | ✗ | ✓ | ✓ |
| [46] | 2018 | Connect-4: A Novel Paradigm to Elicit Positive and Negative Insight and Search Problem Solving | ✓ | ✗ | ✗ | ✓ | ✗ | ✓ | ✗ |
| [47] | 2016 | Time Management for Monte Carlo Tree Search | MCTS | ✗ | ✓ | ✓ | ✗ | ✓ | ✗ |
| [48] | 2018 | Empirical Evaluation of Idle-Time Analysis Driven Improved Decision Making by Always-On Agents | AlphaZero, FDS, ADS, OADS | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| [44] | 2024 | Measuring Randomness in Tabletop Games | MCTS | ✗ | ✓ | ✓ | ✗ | ✓ | ✗ |
| [45] | 2023 | Assessing the Potential of Brain-Computer Interface Multiplayer Video Games using c-VEPs: A Pilot Study | c-VEP | ✓ | ✗ | ✓ | ✗ | ✗ | ✗ |
| [49] | 2022 | Training Characteristic Functions with Reinforcement Learning: XAI-methods play Connect-4 | AlphaZero MCTS | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ |
| [50] | 2024 | ReTA: Recursively Thinking Ahead to Improve the Strategic Reasoning of Large Language Models | Minimax, ToT, CoT, ReTA | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| [51] | 2022 | Questioning Wizard of Oz: Effects of Revealing the Wizard behind the Robot | MCTS | ✗ | ✓ | ✓ | ✗ | ✓ | ✓ |
| [52] | 2023 | Ex(plainable) Machina: How Social-Implicit XAI Affects Complex Human-Robot Teaming Tasks | MCTS, Alpha-Beta Pruning, DNN | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ |
| [53] | 2023 | Natural Born Explainees: How Users’ Personality Traits Shape the Human-Robot Interaction with Explainable Robots | AlphaZero, MCTS | ✓ | ✗ | ✓ | ✓ | ✓ | ✗ |
| [54] | 2013 | Brain-computer interface controlled gaming: Evaluation of usability by severely motor restricted end-users | SMR-BCI | ✗ | ✓ | ✓ | ✗ | ✓ | ✗ |
| [55] | 2023 | Combining brain-computer interfaces and multiplayer video games: an application based on C-VEPs | c-VEP | ✓ | ✗ | ✓ | ✗ | ✗ | ✗ |
| [56] | 2018 | The bionic blues: Robot rejection lowers self-esteem | - | ✗ | ✗ | ✓ | ✗ | ✓ | ✗ |
| AI Technique | Primary Role (Main Sec.) | Secondary Roles (Other Sec.) |
|---|---|---|
| Minimax with Alpha-Beta Pruning | 3.2.1 | 3.1, 3.3.1, 3.3.3, 3.2.2, 3.5 |
| Monte Carlo Tree Search (MCTS) | 3.2.4 | 3.1.1, 3.3.3, 3.3.2, 3.2.5, 3.3.1 |
| Reinforcement Learning (TDL, MARL, Q-Learning) | 3.2.3 | 3.1.1, 3.4.1, 3.2.5 |
| Hybrid Models (e.g., MCTS-Minimax, RL-MCTS) | 3.2.5 | 3.2.4, 3.2.3, 3.3 |
| Proof Number Search (PPNS, PNS, MCPNS) | 3.1.3 | 3.5, 3.5.1 |
| Formal Verification (e.g., mCRL2, LTS) | 3.5.2 | 3.2.2, 3.1.3 |
| n-Tuple Networks | 3.2.3 | 3.3.1, 3.1.1 |
| XAI Methods (e.g., DeepShap, ReTA, Saliency, c-VEP) | 3.4 | 3.4.1, 3.4.2, 3.3 |
| Evaluation dimension | Metric / focus | References in this survey |
|---|---|---|
| Game outcome performance | Win / Draw / Loss rates | [7,14,15,21,30,31,33,34,35,38,39,40,46] |
| Search efficiency | Nodes expanded, pruning effectiveness | [17,20,22,33,34,43,58,60] |
| Computational cost | Time per move, runtime, scalability | [14,15,46,47,48,59,60] |
| Learning efficiency | Episodes to convergence, sample efficiency | [24,25,27,28,30,40,59] |
| Learning stability | Reward variance, training robustness | [7,30,37,40] |
| Generalisation | Performance vs unseen agents / domains | [36,37,43,55,57] |
| Strategic / tactical quality | Forced moves, traps, endgame precision | [41,42,43,44,45,46] |
| Formal correctness | Solved status, proofs, complexity | [14,17,18,20,60] |
| Explainability | Saliency, reasoning transparency | [49,50,52,53] |
| Human–AI interaction & trust | Trust, persuasion, usability | [45,51,54,55,56] |
| Non-traditional interaction | BCI, alternative interfaces | [45,55] |
| Theoretical learning frameworks | Equilibrium learning, flow models | [19,32] |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.