3.6 Summary
We shifted gears from standard search problems, in which we simply attempt to find a path from our starting point to some goal, to adversarial search problems, in which opponents may attempt to hinder us from reaching our goal. Two primary algorithms were considered:
- Minimax - Used when our opponent(s) behave optimally; its search can be sped up with \(\alpha\)-\(\beta\) pruning without changing the computed values. Because minimax assumes worst-case play, it selects more conservative actions than expectimax, and so tends to yield favorable results even when the opponent's behavior is unknown (see the sketch after this list).
- Expectimax - Used when we face suboptimal opponent(s); we model their behavior with a probability distribution over the moves we believe they will make and compute the expected value of states under that distribution (see the sketch after this list).
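As a refresher, here is a minimal sketch of minimax with \(\alpha\)-\(\beta\) pruning. The game-state interface used here (`is_terminal`, `utility`, `legal_actions`, `successor`) is hypothetical, standing in for whatever game representation is actually in use.

```python
import math

def minimax_value(state, agent, alpha=-math.inf, beta=math.inf):
    """Return the minimax value of `state`, pruning branches that
    cannot affect the final decision (alpha-beta pruning)."""
    if state.is_terminal():                     # assumed interface
        return state.utility()                  # utility from MAX's perspective

    best = -math.inf if agent == "MAX" else math.inf
    for action in state.legal_actions(agent):   # assumed interface
        child = state.successor(agent, action)  # assumed interface
        next_agent = "MIN" if agent == "MAX" else "MAX"
        v = minimax_value(child, next_agent, alpha, beta)
        if agent == "MAX":
            best = max(best, v)
            alpha = max(alpha, best)
        else:
            best = min(best, v)
            beta = min(beta, best)
        if beta <= alpha:                       # remaining siblings cannot change the result
            break                               # prune
    return best
```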
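Using the same hypothetical interface, expectimax differs only in how it treats opponent nodes: rather than taking a minimum, it averages successor values under an assumed opponent model (a uniform distribution here, purely for illustration).

```python
def expectimax_value(state, agent):
    """Return the expectimax value of `state`: MAX nodes maximize, while
    opponent (chance) nodes average over a believed move distribution."""
    if state.is_terminal():
        return state.utility()

    actions = state.legal_actions(agent)
    if agent == "MAX":
        return max(expectimax_value(state.successor(agent, a), "CHANCE")
                   for a in actions)
    # Chance node: weight each opponent move by the probability we believe
    # the (suboptimal) opponent plays it -- a uniform model is assumed here.
    probs = {a: 1.0 / len(actions) for a in actions}
    return sum(p * expectimax_value(state.successor(agent, a), "MAX")
               for a, p in probs.items())
```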
In most cases, it's too computationally expensive to run the above algorithms all the way down to the terminal nodes of the game tree under consideration, so we introduced evaluation functions, which estimate the value of non-terminal states and allow the search to terminate early. For problems with large branching factors, we described the MCTS and UCT algorithms (a sketch of the UCB1 selection rule used by UCT appears below). Such algorithms are easily parallelizable, allowing a large number of rollouts to take place on modern hardware.
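As a reminder of how UCT decides which part of the tree to expand, below is a sketch of the UCB1 score used during the selection phase. The node fields (`visits`, `total_reward`, `children`) are hypothetical bookkeeping, not a specific library's API.

```python
import math

def ucb1(node, parent_visits, c=math.sqrt(2)):
    """UCB1 score used by UCT to pick which child to descend into:
    average rollout reward (exploitation) plus an exploration bonus
    that shrinks as the node is visited more often."""
    if node.visits == 0:            # unvisited children are explored first
        return math.inf
    exploit = node.total_reward / node.visits
    explore = c * math.sqrt(math.log(parent_visits) / node.visits)
    return exploit + explore

# During selection, descend by repeatedly choosing the child with the
# highest UCB1 score:
#   best_child = max(node.children, key=lambda ch: ucb1(ch, node.visits))
```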
Finally, we considered the problem of general games, where agents' utilities are not necessarily zero-sum.