Project 3.4: Battleship


Due 11/27 at 11:59pm

DRAFT -- MORE TEXT TO BE ADDED ON 11/15

Introduction

In this checkpoint, you will design more agents for the CS188 variant of Battleship.  Now, the ships will be moving, and so your agents will have to model the passage of time, in addition to the observation of sensor readings.

The code for this project contains the following files, available as a zip archive.  The code is unchanged from checkpoint 3.3, so you may continue with your previous download if you would like.

Battleship
battleship.py The main code for the game of Battleship. You should familiarize yourself with the general outline of this code, which you will be interacting with for the rest of Project 3.
tutorial.py Tutorial script -- start here!
sensorDistributions.py Plug-in for the GUI interface. You can ignore this file.
graphicsUtils.py Graphics utilities. You can ignore this file.
util.py Tools used in battleship. You may be familiar with some of these by now, and they will save you a lot of time.
Agents
battleshipAgent.py This is where all your code will go. Don't ignore this file.

 

What to submit: You will fill in portions of battleshipAgent.py (only) during the assignment, and submit it.  The only other file you should submit is a readme in .txt or .pdf format in which you answer any written questions and in which you should clearly note your partner's name and login (if you are working in a pair).

Evaluation: Your code will be autograded for technical correctness. Please do not change the names of any provided functions or classes within the code, or you will wreak havoc on the autograder. Your answers to discussion questions will also be graded.

Academic Dishonesty: We will be checking your code against other submissions in the class for logical redundancy. If you copy someone else's code and submit it with minor changes, we will know. These cheat detectors are quite hard to fool, so please don't try. We trust you all to submit your own work only; please don't let us down. Instead, contact the course staff if you are having trouble.

Battleship and BNs

In this cs188 version game of battleship, the goal is to locate and bomb the battleship(s) hiding on the grid.  Unlike in classic battleship, however, you aren't guessing at random.  You can obtain information about the ships' whereabouts by dropping sensors at grid positions.  These sensors return readings, which are correlated with the distance to the closest ship.  You only have one bomb per ship, and bombing, successful or not, ends the game.  There are two tasks of interest in this game.  First, we would like to compute probability distributions over ship locations given sensor readings.  This is the inference problem.  Second, we want to decide, given our current beliefs about ship locations, whether to sense or bomb, and where.  This is the decision problem.  In this checkpoint, you will only address the inference problem, but with the added complexity that the ships move over time.

You can probably figure out the rules just by playing, but here's the official version:

At any time, you can sense or bomb.  If you wish to sense, left-click a grid location, and you will reveal the sensor reading at that location.  You cannot "re-sense" the same location.  The intuition is that time is frozen and your observation, though noisy, will not change until the world changes (which it will not do until the next checkpoint).  The observation you get from a sensor will be one of RED, ORANGE, YELLOW, or GREEN, which roughly indicates how close the closest ship is to the sensor location.  The exact sensor model is given in sensorDistributions.py.  You lose one point per sensor revealed.

Once you are ready to bomb, click the BOMB button. It will turn red to indicate that you are in bombing mode.  All you can do now is drop as many bombs as there are ships on the board.  Once all your bombs are used up, you will see which were hits and which were misses.  All ships in any squares you bomb are hit (you can hit multiple ships with a single bomb if you're lucky).  You gain 50 points for each ship you hit.

In this checkpoint, you have the ability to advance time by clicking on the TIME+1 button.  The main reason to advance time (from a decision point of view) is that you can get new readings at each location each time step.  Of course if the sensors are deterministic, there is much less reason to allow time to pass.  However, your code should correctly track the ships across time steps in any case.

To get started with battleship, let's play a few games with time.  Run battleship from the command line:

  python battleship.py -w -q -m circle

Click the time button to make the ship(s) move.  The new option -m triggers motion, and takes the arguments basic, center, and circle.  Each of these choices creates a different current patttern: for example, circle causes each ship to move in a clockwise circle.  You can make the motion noisy with the -n option, which takes a float argument representing the probability that the ship will go in an unexpected direction (i.e. not what the current specifies for the ship's position).

Question 1 (6 points)  When time can elapse (either -m is not basic or -n is not zero), the inference will default to the ExactDynamicInferenceModule in battleshipAgent.py.  This agent blissfully ignores observations and the passage of time (for now).  You will replace it with an HMM-like inference module which runs the forward algorithm.  Technically, you will be working with a dynamic Bayes' net, not a simple HMM, because some time steps may have multiple observations, while others may have none.  Unlike the inference module from the first checkpoint, you dynamic inference module will maintain, or track, a belief state over the possible ship tuples.  We have already given you code which initializes the belief state to the (uniform) prior over tuples.  You should fill in observe(), which updates these beliefs based on a single sensor observation (location, reading), and you should fill in elapseTime() which updates beliefs for a time step passing.  These methods will alter the agent's belief state in the appropriate ways.

You can test your implementation of observe() before elapseTime() is written by never clicking on TIME+1.  In this case, your agent's beliefs should be just as in 3.3 (indeed, you may find that the incremental updates are more intuitive and faster than the inference you wrote for 3.3).  You can also test elapseTime() by itself by using the center motion type.  The ships, and therefore your agent's beliefs, will always drift towards the center of the board.  Make sure you code works on the multiple ship case (try two ships -- more will be slow).

Some things to try once your inference algorithm is working: First, try sensing until you have a peaked distribution of where the ships are.  Then, try elapsing time.  Watch how the beliefs spread out if noise (-n) is set to, say, 0.25 and motion is set to center.  In many applications, you would get only one sensor reading per time step.  Watch how readings tend to sharpen beliefs, while time tends to flatten them.

Try putting 3 or more ships on the board.  Think about why the inference is so slow.  To address this issue, we will write an approximate inference procedure next, using particle filtering.

Question 2 (6 points) You will now fill in the stub of the class ApproximateDynamicInferenceModule in battleshipAgent.py, which you can select from the command line using -i approximate.  The general outline is similar to the exact agent, except that the belief state is represented by a list of particles (samples) of ship tuple locations.  At initialization, you should start your beliefs off with numParticles samples from the ship tuple prior.  In particle filtering, each particle is a possible value for the hidden state (here, the ship position tuple) and each particle has equal weight after every update.  You will then update this sample list during elapseTime() and observe().  Both updates will start with the current samples, and end with a new list of samples according to the particle filtering algorithm.  Elapsing time will require that you sample a next state for each particle.  Observation will require that you resample from your particles, where each particle is weighted by the observation's likelihood given the state represented by that particle.  Note that you will be maintaining a list of samples internally, but you will still be returning a distribution (in the form of a dictionary or Counter) in getShipTupleDistribution().  This returned distribution may be quite sparse if you don't have many samples.  Check out util.py, where several sampling methods have been provided for you.  Make sure your code works with both noisy and deterministic sensors.  Note that with deterministic sensors, it is possible that all of your particles will be given zero weight during the observation update if none of them are consistent with your observation.  In this case, which usually means you don't have enough samples, you should reinitialize with a prior sample.

Try various numbers of samples with one, two, or more ships.  Think about whether the approximate inference is faster, and why or why not.  Also, test how many samples you need to use for one, two, or three ships to get reasonable approximations.  With one ship and thousands of samples, the predictions should be quite good.