Updates
All updates to the project spec will be listed in this section.- 03/13 : Project 2 released.
- 03/14 : Bug fix for MovesWritable.java! Please get the fresh copy of MovesWritable if you downloaded your version before the 15th!.
Clarifications/Reminders
- Start Early!
- Make sure you read through the project specs before starting.
- This project should be done on either the hive machines or the the computers in 271,273,275, or 277 Soda. These are the machines where we have hadoop installed and tested for your usage.
- It is suggested that you work with one partner for this project. Keep in mind this has to be a different person from the one you worked with on project 1. You may share code with your partner, but ONLY your partner. The submit command will ask you for your partner's name, so only one partner needs to submit. (It would be nice if only one partner submitted).
Goals
The goal of this project is to get you familar with the MapReduce programming model using the Hadoop framework. The first part of the project will give you an opportunity to turn minimax game tree search (an algorithm you're hopefully already familiar with) into a format that is compatible with the mapreduce framework. The second part will require you to run your implementation on a large cluster of Amazon Web Services Elastic Compute Cloud (AWS EC2) servers in order to crunch through a large problem. We hope that by doing this project you will gain an appreciation for the MapReduce programming model, and at the same time pick up a very marketable skill.
Part 1 (Due 3/19 @ 23:59:59)
Background
Connect N
The game we will try to completely solve in this project is a variation of Connect Four. Connect Four is essentially like Tic-Tac-Toe, but players are required to "drop" their piece into a column rather than arbitrary place it. Furthermore, instead of winning by placing three pieces in a row, they must place four. For a better idea of game mechanics, you can play the game here (or on any of a plethora of flash game websites) and read more about it on Wikipedia.
For the purpose of this project, we are going to be solving smaller versions of the complete game. Our game board will only be a maximum of 5 by 5 (instead of the standard 6 by 7). This is in order to limit the amount of time and space your code will require to run. In fact, for the first part we recommend you solve a game board of 3 by 3, with the win condition being three, instead of four, pieces in a row. With a smaller case, it should also be easier for you to verify the correctness of your output. Of course, you can use larger parameters if you have enough time, and space on your hard drive.
Minimax Game Tree Traversal
The actual algorithm we want you to implement consists of two main parts: 1) Constructing a full game tree for connect four, and 2) Solving the game by running minimax to label each game state with a win, loss or tie value.
In the first part, you will begin from an empty board, generate all possible subsequent game board configurations and link each of the new states to the original parent. This process will be repeated for each of the generated states until you reach a game state that is a winning state for either one of the players.
In the second part you will walk through the tree you have generated from the bottom up and propagate up the win/loss/tie value according to minimax. Essentially, there are three different scenarios (as detailed in the diagram below) and in each case you simply have to choose the optimal option to update your value. We will record the win/loss/tie state, and the number of moves it will take to get to that state from the current state.
Getting started
Copy the files in the directory ~cs61c/proj/02
to your proj2
directory, by entering the
following command:
$ mkdir ~/proj2 $ cp -r ~cs61c/proj/02/* ~/proj2
The files you will need to modify and submit are:
PossibleMoves.java
contains the code for the second MapReduce job that produces the full game tree. The first run of this job receives as input the output produced by the MapReduce job inInitFirst.java
.SolveMoves.java
contains the code for the third MapReduce job that traverses each layer of the full game tree and performs minimax selection to score each gamestate in the tree.
You are free to define and implement additional helper functions, but if you want them to be run when we grade your project, you must include them in the above classes. Changes you make to any other files will be overwritten when we grade your project.
The rest of the files are part of the framework and should not be modified. However, you will need to look at them as well. The particularly interesting/useful classes are bolded.
Makefile
defines the configurations for the run commands including the size of the game board and how many pieces in a row are required to winProj2.java
containsmain
, which sets up the four MapReduce jobs to be runInitFirst.java
The first MapReduce job that creates the initial input (empty game board) toPossibleMoves.java
FinalMoves.java
contains the code that cleans up the output from the final run ofSolveMoves.java
to create the table of mappings from game state to moves until win/loss/tie.TUI.py
contains a TUI (Text User Interface) for trying the game out with your solved solution! You run the TUI by typingpython TUI.py WIDTH HEIGHT CONNECTION_TO_WIN
, and place pieces on the board by typing the column where you want the piece dropped. (Zero indexed).Proj2Util.java
contains the hasher and unhasher (the hash is invertible to allow for this), which you need to generate keys for most map reduce outputs, and also contains a game finished check that lets you know if either side won the match or not.MovesWritable.java
is aWritable
class that can be used to store win/loss/tie state, the number of moves to get to that state and an array of all the parent game states associated a particular game state.
Your Job
Task A (Due 3/19): Your task will be to implement the two MapReduce jobs.
The first mapreduce job you have to write is in PossibleMoves.java
. This job is called once per
level in the full game tree, with each call to map getting as input the output of the previous map phase. The first
run of the job gets the output from the InitFirst
mapreduce job (this basically gives it an empty board
representation). Therefore, your map function should take the game state represented by the key (a game state that
has been hashed into an int
using the hash
method in the Proj2Util
class). The
reducer for this job sets up the actual game tree by emitting a MovesWritable
object associated with
every passed in child gamestate. The MovesWritable
object will at this point encapsulate all the
possible parents and a win/loss/tie value for this state. The parents is simply a combined list of all corresponding
values for the provided key. The win/loss/tie value needs to be determined by you by scanning through the game state
representation. If there are connectWin
pieces in a row somewhere in the game board, you should
determine this and set the correct win value (0b01
for a O
win, and 0b10
for
a X
win). If this is the lastRound
and you don't find a win, then you should set the
status to a tie (0b11
). If none of these conditions exist, then the value for this state is undecided
(0b00
).
The second mapreduce job you have to write is in SolveMoves.java
. This job is called once per level
in the almost full game tree (not the bottommost level), working bottom up instead of top down like PossibleMoves
.
The first run of the job gets the output from PossilbeMoves
exclusively, while all of the remaining levels of it
gets the output from PossibleMoves
from the layer below as well as the output from SolveMoves
from
the layer below. Therefore, your mapper should receive the key of the current board state, and the value of a MovesWritable class,
which contains the board state, moves to end, and parents of the current board state. From here, your mapper should pass each instances
of the parent state as output key, and the value of the current board as the output value. That means that your reducer will receive all
of the parents which contain the values from their children, so you can generate the next level up. You should use minimax to
determine which value should be used, with the help of the diagram above, and you should also generate the parents of this state.
It is okay if you don't generate the parents perfectly, as long as you AT LEAST generate all of the parents. (As in you can generate the
impossible parents here, as long as you can filter them out properly). The filtering should be aided by the outputs of PossibleMoves
which should have distinctly different values from the outputs of SolveMoves
. Think about what's different
about the two, and write the code accordingly! Remember, if you have impossible board states in your final output, you will be deducted points!
Here are some tips and reminders:
- Look at
Proj2.java
to understand how each of the MapReduce jobs will interact with each other. We will be calling each of the two mapreduce jobs you write multiple times in a loop. They should be able to handle this. - Use the
MovesWritable
class to link together game states and hold win/loss/tie data about them.- The
value
field in this class actually holds two different pieces of information. The two low order bits are used to signify the game state status. The bits are0b01
if the corresponding state will lead to a win forO
,0b10
if it will lead to a win forX
,0b11
if it will lead to a tie, or0b00
if the state has not yet been decided. - The remaining 6 bits of the byte represent the unisgned number of moves it would take to reach the state signified by the first two bits.
- The
moves
field holds a list of hashed game states that can lead to this game state. - The
counter
field is the length of the moves field. It is used to help us when restoring aMovesWritable
instance from the output context.
- The
- The
FinalMoves
mapreduce jobs runs at the end, after all your code has executed to clean up any leftover intermediate key value pairs from the output ofSolveMoves
. - The first run of
PossibleMoves
will need the empty board representation to work correctly. TheInitFirst
mapreduce job is provided to give the correct input format. Take a look at it to see exactly what the beginning key-value pair provided to your code will be. - The
Proj2Util
class is extremely inportant as it provides the hashing function. They allow us to compress a game state to fit in a 4 byte integer. While you don't have to fully understand how it works, be sure to understand the input and output formats for both thehasher
andunHasher
functions. Essentially, the string representation should be as long as the area of the gameboard with which we are working. The slots of each column are grouped together, with the bottom slot of the first column being the left most character, and the top slot of the last column being the right most character.
Debugging and Testing
The provided Makefile
gives you the ability to easily compile and run your code using variable
arguments.
$ make // Compiles your code $ make proj2 // Runs your code with the default 3 by 3 board and connect 3 $ make proj2 WIDTH=n HEIGHT=m CONNECT=i // Make sure to actually type the all-caps letters, //and generate your output with board width of n, board height of m, and connect to win of i. $ make proj2 WIDTH=4 HEIGHT=5 CONNECT=3 // For example, this would run a four by five board, with connect 3.
The provided TUI.py
will help you test your output, as you can use it to play against your AI, or see how the AI plays optimal moves on both sides.
$ python TUI.py 4 5 3 // This runs the TUI.py with your output, with board width of 4, board height of 5, and connect 3 to win.
In the ref_out
directory, we have also provided you with the correct output we expect for two different game types, 2 by 2 Connect 2 and 3 by 3 Connect 3.
You can use this to verify whether or not your program is working for at least the small case.
While you are working on the project, we encourage you to keep your code under version control. However, if you are using a hosted repository (i.e. GitHub, BitBucket, etc), please make sure that it is private, or else it will be viewed as a violation of academic integrity.
If you mistakenly break parts of the skeleton, you can always grab a new copy of the file by running the following command, but BE VERY CAREFUL not to overwrite all your work.
$ cp ~cs61c/proj/02/<NAME OF FILE> ~/proj2
Before you submit, make sure you test your code on the Hive machines. We will be grading your code there. IF YOUR PROGRAM FAILS TO COMPILE, YOU WILL AUTOMATICALLY GET A ZERO FOR THAT PORTION OF THE ASSIGNMENT There is no excuse not to test your code.
How your project will be graded:
- First, it will be graded on the correctness of the produced solution. This will be done by comparing your output file containing the solved values with the file that the staff solution outputs. Make sure that your file does not have any game states that cannot exist. (For example a board with moves played after a win state, or a board containing illegal moves like floating pieces or repeated turns by a particular side.)
- Second, we will be comparing the performance of your code against the staff solution. This includes both run
time and disk space usage.
- As a reference, running
make proj2 WIDTH=4 HEIGHT=5 CONNECT=3
, should take somewhere between a system time of a minute and a minute and a half. - The above command should also produce an output somewhere between 15-20 MB. You can verify this by
running
du -h
, and checking to see that yourall_data
directory is no larger thatn 20MB.
- As a reference, running
Submission
The full proj2-1 is due Wednesday (3/19/2014), to account for the exam on the previous Wednesday. To submit proj2-1, enter in
the following. You should be turning in PossibleMoves.java
, and SolveMoves.java
.
$ cd ~/proj2 $ submit proj2-1