Teleology (a.k.a – what is the purpose of this post?)
Recently, I finished an artificial intelligence project that involved implementing the Minimax and Alpha-Beta pruning algorithms in Python.
These algorithms are standard and useful ways to optimize decision making for an AI-agent, and they are fairly straightforward to implement.
I haven’t seen any actual working implementations of these using Python yet, however. So I’m posting my code as an example for future programmers to improve & expand upon.
It’s also useful to see a working implementation of abstract algorithms sometimes, when you’re seeking greater intuition about how they work in practice.
My hope is that this post provides you with some of that intuition, should you need it–and that it does so at an accelerated pace.
Ontology (a.k.a – how does this abstract idea __actually work__, in reality?)
Let’s start with Minimax itself.
Assumptions: This code assumes you have already built a game tree relevant to your problem, and now your task is to parse it. If you haven’t yet built a game tree, that will be the first step in this process. I have a previous post about how I did it for my own problem, and you can use that as a starting point. But keep in mind that YMMV.
My implementation looks like this:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
########################## | |
###### MINI-MAX ###### | |
########################## | |
class MiniMax: | |
# print utility value of root node (assuming it is max) | |
# print names of all nodes visited during search | |
def __init__(self, game_tree): | |
self.game_tree = game_tree # GameTree | |
self.root = game_tree.root # GameNode | |
self.currentNode = None # GameNode | |
self.successors = [] # List of GameNodes | |
return | |
def minimax(self, node): | |
# first, find the max value | |
best_val = self.max_value(node) # should be root node of tree | |
# second, find the node which HAS that max value | |
# –> means we need to propagate the values back up the | |
# tree as part of our minimax algorithm | |
successors = self.getSuccessors(node) | |
print "MiniMax: Utility Value of Root Node: = " + str(best_val) | |
# find the node with our best move | |
best_move = None | |
for elem in successors: # —> Need to propagate values up tree for this to work | |
if elem.value == best_val: | |
best_move = elem | |
break | |
# return that best value that we've found | |
return best_move | |
def max_value(self, node): | |
print "MiniMax–>MAX: Visited Node :: " + node.Name | |
if self.isTerminal(node): | |
return self.getUtility(node) | |
infinity = float('inf') | |
max_value = –infinity | |
successors_states = self.getSuccessors(node) | |
for state in successors_states: | |
max_value = max(max_value, self.min_value(state)) | |
return max_value | |
def min_value(self, node): | |
print "MiniMax–>MIN: Visited Node :: " + node.Name | |
if self.isTerminal(node): | |
return self.getUtility(node) | |
infinity = float('inf') | |
min_value = infinity | |
successor_states = self.getSuccessors(node) | |
for state in successor_states: | |
min_value = min(min_value, self.max_value(state)) | |
return min_value | |
# # | |
# UTILITY METHODS # | |
# # | |
# successor states in a game tree are the child nodes… | |
def getSuccessors(self, node): | |
assert node is not None | |
return node.children | |
# return true if the node has NO children (successor states) | |
# return false if the node has children (successor states) | |
def isTerminal(self, node): | |
assert node is not None | |
return len(node.children) == 0 | |
def getUtility(self, node): | |
assert node is not None | |
return node.value |
How-to: To use this code, create a new instance of the Minimax object, and pass in your GameTree object. This code should work on any GameTree object that has fields for: 1) child nodes; 2) value. (That is, unless I made an error, which of course, is very possible)
After the Minimax object is instantiated, run the minimax() function, and you will see a trace of the program’s output, as the algorithm evaluates each node in turn, before choosing the best possible option.
What you’ll notice: Minimax needs to evaluate **every single node** in your tree. For a small tree, that’s okay. But for a huge AI problem with millions of possible states to evaluate (think: Chess, Go, etc.), this isn’t practical.
How we solve: To solve the problem of looking at every single node, we can implement a pruning improvement to Minimax, called Alpha-Beta.
Alpha-Beta Pruning Improvement
Essentially, Alpha-Beta pruning works keeping track of the best/worst values seen as the algorithm traverses the tree.
Then, if ever we get to a node with a child who has a higher/lower value which would disqualify it as an option–we just skip ahead.
Rather than going into a theoretical discussion of WHY Alpha-Beta works, this post is focused on the HOW. For me, it’s easier to see the how and work backwards to why. So here’s the quick and dirty implementation.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
########################## | |
###### MINI-MAX A-B ###### | |
########################## | |
class AlphaBeta: | |
# print utility value of root node (assuming it is max) | |
# print names of all nodes visited during search | |
def __init__(self, game_tree): | |
self.game_tree = game_tree # GameTree | |
self.root = game_tree.root # GameNode | |
return | |
def alpha_beta_search(self, node): | |
infinity = float('inf') | |
best_val = –infinity | |
beta = infinity | |
successors = self.getSuccessors(node) | |
best_state = None | |
for state in successors: | |
value = self.min_value(state, best_val, beta) | |
if value > best_val: | |
best_val = value | |
best_state = state | |
print "AlphaBeta: Utility Value of Root Node: = " + str(best_val) | |
print "AlphaBeta: Best State is: " + best_state.Name | |
return best_state | |
def max_value(self, node, alpha, beta): | |
print "AlphaBeta–>MAX: Visited Node :: " + node.Name | |
if self.isTerminal(node): | |
return self.getUtility(node) | |
infinity = float('inf') | |
value = –infinity | |
successors = self.getSuccessors(node) | |
for state in successors: | |
value = max(value, self.min_value(state, alpha, beta)) | |
if value >= beta: | |
return value | |
alpha = max(alpha, value) | |
return value | |
def min_value(self, node, alpha, beta): | |
print "AlphaBeta–>MIN: Visited Node :: " + node.Name | |
if self.isTerminal(node): | |
return self.getUtility(node) | |
infinity = float('inf') | |
value = infinity | |
successors = self.getSuccessors(node) | |
for state in successors: | |
value = min(value, self.max_value(state, alpha, beta)) | |
if value <= alpha: | |
return value | |
beta = min(beta, value) | |
return value | |
# # | |
# UTILITY METHODS # | |
# # | |
# successor states in a game tree are the child nodes… | |
def getSuccessors(self, node): | |
assert node is not None | |
return node.children | |
# return true if the node has NO children (successor states) | |
# return false if the node has children (successor states) | |
def isTerminal(self, node): | |
assert node is not None | |
return len(node.children) == 0 | |
def getUtility(self, node): | |
assert node is not None | |
return node.value |
How-to: This algorithm works the same as Minimax. Instantiate a new object with your GameTree as an argument, and then call alpha_beta_search().
What you’ll notice: Alpha-Beta pruning will always give us the same result as Minimax (if called on the same input), but it will require evaluating far fewer nodes. Tracing through the code will illustrate why.
Phenomenology (a.k.a – how will I perceive this, myself?)
This isn’t the most robust implementation of either algorithm (in fact it’s deficient in many ways), so I wouldn’t recommend it for industrial use.
However, this code should simply illustrate how each algorithm works, and it will provide output you can trace through and compare against–as long as you are able to construct the GameTree for your problem.
From there, it’s only a matter of time until you’ll understand it intuitively. This is one of those things that took a little while for me to grasp–so hopefully having a clear example will help others get there more quickly. Good luck.