Project 5: Dynamic Battleship


Due 11/19 at 11:59pm

Updated with minor bug fixes (util.py and sonar.py), 11/13 @ 8am

Introduction

In this checkpoint, you will design more agents for the CS188 variant of Battleship.  Now, the ships will be moving, and so your agents will have to model the passage of time, in addition to the observation of sensor readings.

The Battleship code is unchanged from project 4. This project does, however, include a Pacman variant related to the course contest. Note: The Pacman code is only a contest preview; it doesn't require any additional work on your part. The code for this project contains the following files, available as a zip archive.

Battleship
battleship.py The main code for the game of Battleship. You should familiarize yourself with the general outline of this code, which you will be interacting with for Project 4 and 5.
tutorial.py Tutorial script -- start here!
sensorDistributions.py Plug-in for the GUI interface. You can ignore this file.
gui.py The graphical user interface for Battleship. You can ignore this file.
graphicsUtils.py Graphics utilities. You can ignore this file.
util.py Tools used in battleship. You may be familiar with some of these by now, and they will save you a lot of time.
battleshipAgent.py This is where all your code will go. Don't ignore this file.
Sonar Pacman
sonar.py The main file that runs Sonar Pacman games.
game.py The logic behind how the Pacman world works.
sonarAgents.py Agents for Sonar Pacman.
graphicsDisplay.py Graphics for Pacman
layout.py Code for reading layout files and storing their contents
distanceCalculator.py Code for computing shortest paths through a maze

 

What to submit: You will fill in portions of battleshipAgent.py (only) during the assignment, and submit it. 

Evaluation: Your code will be autograded for technical correctness. Please do not change the names of any provided functions or classes within the code, or you will wreak havoc on the autograder.

Academic Dishonesty: We will be checking your code against other submissions in the class for logical redundancy. If you copy someone else's code and submit it with minor changes, we will know. These cheat detectors are quite hard to fool, so please don't try. We trust you all to submit your own work only; please don't let us down. Instead, contact the course staff if you are having trouble.

Battleship and Bayes Nets and Time

In this checkpoint, you will maintain beliefs over ship positions as they move around on the board. You will only address the inference problem of tracking the ships; decisions about when and where to bomb will be left to the human player.

You can probably figure out the rules just by playing, but here's the official version:

At any time, you can sense or bomb.  If you wish to sense, left-click a grid location, and you will reveal the sensor reading at that location.  You cannot "re-sense" the same location.  The intuition is that time is frozen and your observation, though noisy, will not change until the world changes (which it will not do until you advance time).  The observation you get from a sensor will be one of RED, ORANGE, YELLOW, or GREEN, which roughly indicates how close the closest ship is to the sensor location.  The exact sensor model is given in sensorDistributions.py.  You lose one point per sensor revealed.

Once you are ready to bomb, click the BOMB button. It will turn red to indicate that you are in bombing mode.  All you can do now is drop as many bombs as there are ships on the board.  Once all your bombs are used up, you will see which were hits and which were misses.  All ships in any squares you bomb are hit (you can hit multiple ships with a single bomb if you're lucky).  You gain 50 points for each ship you hit.

In this checkpoint, you have the ability to advance time by clicking on the TIME+1 button.  The main reason to advance time (from a decision point of view) is that you can get new readings at each location each time step.  Of course if the sensors are deterministic, there is much less reason to allow time to pass.  However, your code should correctly track the ships across time steps in any case.

To get started with dynamic battleship, let's play a few games with time.  Run battleship from the command line:

python battleship.py -w -q -m circle

Click the time button to make the ship(s) move.  The new option -m triggers motion, and takes the arguments basic, center, and circle.  Each of these choices creates a different current patttern: for example, circle causes each ship to move in a clockwise circle.  You can make the motion noisy with the -n option, which takes a float argument representing the probability that the ship will go in an unexpected direction (i.e. not what the current specifies for the ship's position).

 

Question 1 (6 points)  When time can elapse (either -m is not basic or -n is not zero), the inference will default to the ExactDynamicInferenceModule in battleshipAgent.py.  This agent blissfully ignores observations and the passage of time (for now).  You will replace it with an HMM-like inference module which runs the forward algorithm.  Technically, you will be working with a dynamic Bayes' net, not a simple HMM, because some time steps may have multiple observations, while others may have none.  Unlike the inference module from the first checkpoint, you dynamic inference module will maintain, or track, a belief state over the possible ship tuples.  We have already given you code which initializes the belief state to the (uniform) prior over tuples.  You should fill in observe(), which updates these beliefs based on a single sensor observation (location, reading), and you should fill in elapseTime() which updates beliefs for a time step passing.  These methods will alter the agent's belief state in the appropriate ways.

You can test your implementation of observe() before elapseTime() is written by never clicking on TIME+1.  In this case, your agent's beliefs should be just as in 3.3 (indeed, you may find that the incremental updates are more intuitive and faster than the inference you wrote for 3.3).  You can also test elapseTime() by itself by using the center motion type. 

python battleship.py -w -m center
The ships, and therefore your agent's beliefs, will always drift towards the center of the board.  Make sure you code works on the multiple ship case (try two ships -- more will be slow).

Some things to try once your inference algorithm is working: First, try sensing until you have a peaked distribution of where the ships are.  Then, try elapsing time.  Watch how the beliefs spread out if noise (-n) is set to, say, 0.25 and motion is set to center.  In many applications, you would get only one sensor reading per time step.  Watch how readings tend to sharpen beliefs, while time tends to flatten them.

Try putting 3 or more ships on the board.  Think about why the inference is so slow.  To address this issue, we will write an approximate inference procedure next, using particle filtering.

 

Question 2 (6 points) You will now fill in the stub of the class ApproximateDynamicInferenceModule in battleshipAgent.py, which you can select from the command line:

python battleship.py -w -i approximate -m circle

The general outline is similar to the exact agent, except that the belief state is represented by a list of particles (samples) of ship tuple locations.  At initialization, you should start your beliefs off with numParticles samples from the ship tuple prior.  In particle filtering, each particle is a possible value for the hidden state (here, the ship position tuple) and each particle has equal weight after every update.  You will then update this sample list during elapseTime() and observe().  Both updates will start with the current samples, and end with a new list of samples according to the particle filtering algorithm.  Elapsing time will require that you sample a next state for each particle.  Observation will require that you resample from your particles, where each particle is weighted by the observation's likelihood given the state represented by that particle.  Note that you will be maintaining a list of samples internally, but you will still be returning a distribution (in the form of a dictionary or Counter) in getShipTupleDistribution().  This returned distribution may be quite sparse if you don't have many samples. 

Check out util.py, where several sampling methods have been provided for you.  Make sure your code works with both noisy and deterministic sensors.  Note: with deterministic sensors, it is possible that all of your particles will be given zero weight during the observation update if none of them are consistent with your observation.  In this case, which usually means you don't have enough samples, you should reinitialize with a prior sample.

Try various numbers of samples with one, two, or more ships.  Think about whether the approximate inference is faster, and why or why not.  Also, test how many samples you need to use for one, two, or three ships to get reasonable approximations.  With one ship and thousands of samples, the predictions should be quite good.

 

Pacman Contest Preview (0 points) Pacman spends his life running from ghosts, but things were not always so. Legend has it that many years ago, Pacman's great grandfather, Grandpac, learned to hunt ghosts for sport. However, he was blinded by his power and could only track ghosts by their banging and clanging.

Sonar.py defines a variant where Pacman (aka Grandpac) is invincible, and he tries to hunt down all the ghosts as quickly as possible. As observations, he receives a noisy distance reading for each remaining ghost. As a human player, these distances are quite difficult to interpret. You should try anyway:

python sonar.py
If you want to see the ghosts, you can turn on cheating, which lets you see the ghosts:
python sonar.py --showGhosts
You can also use the inference code from Battleship to track ghosts. We have set up sonar.py to use your ExactDynamicInferenceModule, which you can invoke via:
python sonar.py -p TrackingKeyboardAgent
Note: This dependency on Battleship code is really a hack, so please accept our apologies if it doesn't work. If you refer to anything too battleship-specific in your implementation, this will probably not work as expected. You will not be penalized if this doesn't work for you.

We've also implemented a greedy agent for you that uses your Battleship inference code to track ghosts, and moves toward the nearest ghost at each step. To watch Grandpac hunt all by himself, try:

python sonar.py -p GreedyTrackingAgent --layout bigHunt
The course contest, currently in beta release, will provide a similar format: your opponents will be hidden from you unless you are close enough to detect them. We may provide a similar sonar signal to the defensive agents (ghosts). Details of the contest will be finalized within the next week.

Have fun!