- Start Early!
- This project should be done on the hive machines. Parts of your program may not work on other computers.
- This project is to be completed individually.
- Make sure you read through the project spec before starting.
This project will expose you to C and RISC-V in greater depth. The first part gives you the opportunity to sharpen the C programming skills you have learned in the past few weeks, while the second part dives deeper into coding. Part of this project will also serve as the basis for the upcoming project 2. In particular, part 1's goals are:
- To give you practice applying algorithms in C.
- To give you more experience doing memory management in C.
- To expose you to helpful C testing tools like CUnit and Valgrind.
- For you to have fun.
Cameras traditionally capture a two-dimensional projection of a scene. However, depth information is important for many real-world applications, including robotic navigation, face recognition, gesture or pose recognition, 3D scanning, and self-driving cars. The human visual system can perceive depth by comparing the images captured by our two eyes. This is called stereo vision. In this project we will experiment with a simple computer vision/image processing technique called "shape from stereo" that, much like humans do, computes depth information from stereo images (images taken from two locations that are displaced by a certain amount).
Humans can tell how far away an object is by comparing the position of the object in the left eye's image with its position in the right eye's image. If an object is really close to you, your left eye will see the object further to the right, whereas your right eye will see the object to the left. A similar effect will occur with cameras that are offset with respect to each other, as seen below.
The above illustration shows 3 objects at different depths and positions. It also shows the positions at which the objects are projected into the images of camera 1 and camera 2. As you can see, the closest object (green) is displaced the most (6 pixels) from image 1 to image 2. The furthest object (red) is displaced the least (3 pixels). We can therefore assign a displacement value to each pixel in image 1. The array of displacements is called a displacement map. The figure shows the displacement map corresponding to image 1.
Your task will be to find the displacement map using a simple block matching algorithm. Since images are two-dimensional, we need to explain how images are represented before describing the algorithm.
Below is a classic example of left-right stereo images and the displacement map shown as an image.
Part 1 (Due 9/18 @ 23:59:59)
In this project, we will attempt to simulate depth perception on the computer by writing a program that can distinguish far-away objects from close-by ones.
First, create a Project 1 GitHub Classroom repository. Make sure to add
https://github.com/61c-teach/fa18-proj1-starter.git as the remote, as you did for the homework and lab repositories.
The files you will need to modify and submit are:
calc_depth.c: Creates a depth map out of two images. You will be implementing the
make_qtree.c: Creates a quadtree representation from a depth map. You will be implementing the
You are free to define and implement additional helper functions, but if you want them to be run when we grade your project, you must include them in
make_qtree.c. Changes you make to any other files will be overwritten when we grade your project.
The rest of the files are part of the framework. It may be helpful to look at all the other files.
Makefile: Defines all the compilation commands.
depth_map.c: Loads bitmap images and calls calc_depth() to calculate the depth map.
calc_depth.h: Defines the signature for the calc_depth() function you will implement.
make_qtree.h: Defines the signature for the depth_to_quad() and homogenous() functions you will implement.
quadtree.h: Defines the qNode struct, as well as quadtree.c function headers.
quadtree.c: Calls depth_to_quad() and free_qtree().
utils.h: Defines Image struct and utility function signatures.
utils.c: Defines bitmap loading, printing, and saving functions.
test/: Contains the files necessary for testing. images/ holds input images, output/ will contain output files created by the program, and expected/ has the correct output of the tests. cunit contains code to help with unit testing.
Task A (Due 9/18)
Your first task will be to implement the depth map generator. This function takes two input images (unsigned char *left and unsigned char *right), which represent what you see with your left and right eye, and generates a depth map using the output buffer we allocate for you (unsigned char *depth_map).
Generating a depth map
In order to achieve depth perception, we will be creating a depth map. The depth map will be a new image (same size as left and right) where each "pixel" is a value from 0 to 255 inclusive, representing how far away the object at that pixel is. In this case, 0 means infinity, while 255 means as close as can be. Consider the following illustration:
The first step is to take a small patch (here 5x5) around the green pixel. This patch is represented by the red rectangle. We call this patch a feature. To find the displacement, we would like to find the corresponding feature position in the other image. We can do that by comparing similarly sized features in the other image and choosing the one that is the most similar. Of course, comparing against all possible patches would be very time consuming. We are going to assume that there's a limit to how much a feature can be displaced -- this defines a search space which is represented by the large green rectangle (here 11x11). Notice that, even though our images are named
right, our search space extends in both the left/right and the up/down directions. Since we search over a region, if the "left image" is actually the right and the "right image" is actually the left, proper distance maps should still be generated.
The feature (a corner of a white box) was found at the position indicated by the blue square in the right image.
We'll say that two features are similar if they have a small Squared Euclidean Distance. If we're comparing two features, A and B, that have a width of W and height of H, their Squared Euclidean Distance is given by:
(Note that this is always a non-negative number.)
For example, given two sets of two 2×2 images below:
← Squared Euclidean distance is (1-1)^2 + (5-5)^2 + (4-4)^2 + (6-6)^2 = 0 →
← Squared Euclidean distance is (1-3)^2 + (5-5)^2 + (4-4)^2 + (6-6)^2 = 4 →
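The comparison above can be sketched in C. This is only an illustration: the helper name patch_squared_distance is hypothetical (it is not part of the starter code), and it assumes each patch has already been copied into its own contiguous array, whereas a feature inside a real image would need the image width as a row stride.

```c
#include <stddef.h>

/* Hypothetical helper (not from the starter code): squared Euclidean
 * distance between two equally sized patches, each stored as a flat
 * array of width * height pixels. */
unsigned int patch_squared_distance(const unsigned char *a,
                                    const unsigned char *b,
                                    size_t width, size_t height)
{
    unsigned int sum = 0;
    for (size_t i = 0; i < width * height; i++) {
        /* Widen to int first so the subtraction can go negative. */
        int diff = (int)a[i] - (int)b[i];
        sum += (unsigned int)(diff * diff);
    }
    return sum;
}
```

Called on the two pairs of 2×2 patches from the example (pixel values 1, 5, 4, 6 against 1, 5, 4, 6, and then against 3, 5, 4, 6), this returns 0 and 4 respectively.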
Once we find the feature in the right image that's most similar, we check how far away from the original feature it is, and that tells us how close by or far away the object is.
We define the variables feature_width and feature_height, which result in feature patches of size (2 × feature_width + 1) × (2 × feature_height + 1). In the previous example, feature_width = feature_height = 2, which gives a 5×5 patch. We also define the variable maximum_displacement, which limits the search space. In the previous example, maximum_displacement = 3, which results in searching over (2 × maximum_displacement + 1)^2 patches in the second image to compare with.
In order for our results to fit within the range of an unsigned char, we output the normalized displacement between the left feature and the corresponding right feature, rather than the absolute displacement. The normalized displacement is given by:
This function is implemented for you in calc_depth.c.
In the case of the above example,
dx=2 are the vertical and horizontal displacement of the green pixel. This formula will guarantee that we have a value that fits in an
unsigned char, so the normalized displacement is
255 × sqrt(1 + 2^2)/sqrt(2 × 3^2) = 134, truncated to an integer.
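Written symbolically, the arithmetic above appears to follow the pattern below. This is a reconstruction from the worked numbers, not a quotation of the starter code: dx and dy are taken to be the horizontal and vertical offsets of the best match, and the floor brackets stand for the truncation to an integer. The implementation provided in calc_depth.c remains the authoritative definition.

```latex
\[
\text{normalized displacement} =
\left\lfloor
255 \times
\frac{\sqrt{dx^{2} + dy^{2}}}
     {\sqrt{2 \times \text{maximum\_displacement}^{2}}}
\right\rfloor
\]
```

The largest possible offset, dx = dy = maximum_displacement, makes the fraction equal to 1, which is why the result never exceeds 255.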
We will be working with 8-bit grayscale bitmap images. In this file format, each pixel takes on a value between 0 and 255 inclusive,
where 0 = black, 255 = white, and values in between correspond to various shades of gray. Together, the pixels
form a 2D matrix with
image_height rows and image_width columns.
Since each pixel may be one of 256 values, we can represent an image in memory using an array of
unsigned char of size
image_width * image_height. We store the 2D image in a 1D array such that each row of the image occupies a contiguous
part of the array. The pixels are stored from top-to-bottom and left-to-right (see illustration below):
We can refer to individual pixels of the array by specifying the row and column it's located in. Recall that in a matrix,
rows and columns are numbered from the top left. We will follow this numbering scheme as well, so the leftmost red square
is at row 0, column 0, and the rightmost blue square is at row 2, column 1. In this project, we will also refer to an element's
column # as its x position, and its row # as its y position. We can also call the # of columns of the image
as its image_width, and the # of rows of the image as its image_height. Thus, the image above has a width of 2, height of 3,
and the element at x=1, y=2 is the rightmost blue square.
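The row-major layout described above translates into a one-line index calculation. The accessor below is a hypothetical helper written for illustration, not a function from utils.c.

```c
/* Hypothetical accessor (not from the starter code): returns the pixel
 * at column x, row y of an image stored row by row in a 1D array. */
unsigned char pixel_at(const unsigned char *image, int image_width,
                       int x, int y)
{
    /* Skip y full rows, then move x pixels into that row. */
    return image[y * image_width + x];
}
```

For an image of width 2 and height 3, the element at x=1, y=2 is found at index 2 × 2 + 1 = 5, the last entry of the array.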
Edit the function in
calc_depth.c so that it generates a depth map and stores it in
unsigned char *depth_map, which points to a pre-allocated buffer of size
image_height. Two images,
right are provided. They are also of size
feature_height parameters are described in the Generating a depth map section.
Here are some implementation details and tips:
- A feature is a box of width
2 × feature_width + 1 and height
2 × feature_height + 1, with the original position of the pixel at its center.
- You may not assume
maximum_displacement. They may all be different (e.g. your feature box may be a rectangle).
- Pixels on the edge of the image, whose left-image features don't fit inside the image, should have a normalized displacement of 0 (infinite).
- If maximum_displacement is 0, the whole image would have a normalized displacement of 0.
- Your algorithm should not consider right-image features that lie partially outside the image area. However, if the left-image feature of a pixel is fully within the image area, you should always be able to assign a normalized displacement to that pixel.
- The source pixels always come from
unsigned char *left, whereas the
unsigned char *right image is always the one that is scanned for nearby features.
- You may not assume that
unsigned char *depth_map has been filled with zeros.
- You may not store global variables that persist between multiple calls to
- The Squared Euclidean Distance should be calculated according to the formula:
- After finding the matching feature in the right image with the smallest Squared Euclidean Distance, the normalized displacement of the pixel is given by the formula:
- Ties in the Euclidean Distance should be won by the one with the smallest resulting normalized displacement.
- Some test cases are provided by
make check. These are not all of the tests that we will be grading your project on.
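The edge rules in the list above amount to a bounds check on the feature box. A minimal sketch, assuming a hypothetical helper named feature_fits that is not part of the starter code:

```c
#include <stdbool.h>

/* Hypothetical helper (not from the starter code): true if the feature
 * box centered at (x, y) lies entirely inside the image. The box spans
 * feature_width pixels to each side and feature_height pixels above
 * and below the center. */
bool feature_fits(int x, int y, int feature_width, int feature_height,
                  int image_width, int image_height)
{
    return x - feature_width >= 0 &&
           x + feature_width < image_width &&
           y - feature_height >= 0 &&
           y + feature_height < image_height;
}
```

The same test would apply to both images: a left-image pixel whose box fails it gets a normalized displacement of 0, and a right-image candidate whose box fails it is skipped.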
make to compile the
calc_depth. Your code must compile with the given
$ make
gcc -Wall -g -std=c99 -o .... ....
You can now run
./depth_map to see the options that the program takes.
$ ./depth_map
USAGE: ./depth_map [options]
REQUIRED
  -l [LEFT_IMAGE]      The left image
  -r [RIGHT_IMAGE]     The right image
  -w [WIDTH_PIXELS]    The width of the smallest feature
  -h [HEIGHT_PIXELS]   The height of the smallest feature
  -t [MAX_DISPLACE]    The threshold for maximum displacement
OPTIONAL
  -o [OUTPUT_IMAGE]    Draw output to this file
  -v                   Print the output to stdout as readable bytes
The -o option will let you visualize your depth map as a BMP image that you can open in your file browser. In these images, blue regions are far away and red regions are close by.
A testing framework and a few sample tests are provided for you. These tests are not a guarantee of the correctness of your code. Your code will be graded on its correctness, not whether or not it passes these tests. You can run the testing framework with
make check. For calc_depth, the output images and bytes will be written to
test/output/TEST_NAME-output.txt. For quadtree, the printed output is written to
You can use your own 8-bit grayscale BMP images to test your code. There are helper functions in
utils.c to generate BMP files from
unsigned char arrays. Alternatively, you can generate them with an image editing program like Photoshop.
$ ./depth_map -l test/images/quilt2-left.bmp -r test/images/quilt2-right.bmp -h 0 -w 0 -t 1 -o test/output/quilt2-output.bmp -v
00 00 00 00 ff 00 00 00 b4
- Try to think about how to break the problem into smaller pieces before starting. How do you select the closest portion of a picture? How do you evaluate how close two feature spaces are? How do you determine the distance between two pixels?
- If you get stuck try testing a portion with the CUnit tests (see the testing section below).
- Step through the unit tests in cgdb rather than a whole program.
- If you find yourself stuck or not passing tests, double-check what the spec asks you to do. It is impossible to write effective tests if you get the expected results wrong.
Task B (Due 9/18)
Your second task will be to implement quadtree compression. This function takes a depth map (
unsigned char *depth), and generates a recursive data structure called a quad tree.
The depth maps that we create in this project are just 2D arrays of
unsigned char. When we interpret each value as a square pixel, we can output a rectangular image. We used bitmaps in this project, but it would be incredibly space inefficient if every image on the internet were stored this way, since bitmaps store the value of every pixel separately. Instead, there are many ways to compress images (ways to store the same image information with a smaller filesize). In task B, you will be asked to implement one type of compression using a data structure called a quadtree.
A quadtree is similar to a binary tree, except each node must have either 0 children or 4 children. When applied to a square bitmap image whose width and height can be expressed as 2^N, the root node of the tree represents the entire image. Each node of the tree represents a square sub-region within the image. We say that a square region is homogeneous if its pixels all have the same value. If a square region is not homogeneous, then we divide the region into four quadrants, each of which is represented by a child of the quadtree parent node. If the square region is homogeneous, then the quadtree node has no children and instead has a value equal to the color of the pixels in that region.
We continue checking for homogeneity of the image sections represented by each node until all quadtree nodes contain only pixels of a single grayscale value. Each leaf node in the quadtree is associated with a square section of the image and a particular grayscale value. Any non-leaf node will have a value of -1 (outside the grayscale range) associated with it, and should have 4 child nodes.
We will number each child node (1-4) clockwise from the top-left square, and also label it with its ordinal direction (NW, NE, SE, SW). When parsing through nodes, we will use this order: 1: NW, 2: NE, 3: SE, 4: SW.
Given a quadtree, we can choose to only keep the leaf nodes and use this to represent the original image. Because the leaf nodes should contain every color value represented, the non-leaf nodes are not needed to reconstruct the image. This compression technique works well if an image has large regions with the same grayscale value (artificial images), versus images with lots of random noise (real images). Depth maps are a relatively good input, since we get large regions with similar depths.
Your task is to write the depth_to_quad(), homogenous() and free_qtree() functions located in make_qtree.c.
The first function, depth_to_quad() takes an array of unsigned char, converts it into a quadtree, and returns a
pointer to the root qNode in the tree. Keep in mind that local variables don't last after your function returns, so you must use
dynamic memory allocation in the function. Since memory allocation could fail, you need to check whether the pointer
malloc() is valid or not. If it is NULL, you should call allocation_failed() (defined in utils.h).
Your representation should use a tree of qNodes, all of which either have 0 or 4 children. The declaration of the struct qNode is in quadtree.h.
The second function, homogenous(), takes in the depth_map as well as a region of the image (top left coordinates, width, and height). If every pixel in that region has the same grayscale value, then homogenous() should return that value. Otherwise, if the region is non-homogeneous, it should return -1.
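The homogeneity test described above can be sketched as a scan that compares every pixel in the region with the first one. The function name region_value and its parameter list are assumptions made for this illustration; the actual signature you must implement is the one declared in make_qtree.h.

```c
/* Hypothetical sketch (names and parameters are assumptions, not the
 * signature from make_qtree.h): returns the shared gray value of the
 * region whose top-left corner is (x, y), or -1 if the pixels differ. */
int region_value(const unsigned char *depth_map, int map_width,
                 int x, int y, int region_width, int region_height)
{
    unsigned char first = depth_map[y * map_width + x];
    for (int row = y; row < y + region_height; row++) {
        for (int col = x; col < x + region_width; col++) {
            if (depth_map[row * map_width + col] != first) {
                return -1;   /* found a pixel that breaks homogeneity */
            }
        }
    }
    return first;
}
```

Returning an int rather than an unsigned char is what lets -1 sit outside the 0 to 255 grayscale range.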
The third function, free_qtree(), should take in the root of a qtree and free all the memory associated with that tree. Since any node is itself the root of a subtree, the root passed in just needs to be a malloced node, not necessarily the root of the original tree.
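Freeing a tree like this is naturally recursive: release the four subtrees first, then the node itself. The sketch below uses a stand-in struct called DemoNode, because the real qNode declaration lives in quadtree.h and is not reproduced here; every name in it is hypothetical.

```c
#include <stdlib.h>

/* Stand-in for qNode; the real declaration is in quadtree.h. */
typedef struct DemoNode {
    int leaf;                    /* 1 for leaves, 0 otherwise */
    int gray_value;              /* -1 for non-leaf nodes */
    struct DemoNode *child[4];   /* NW, NE, SE, SW */
} DemoNode;

/* Allocates a leaf. Returns NULL on failure so the caller can react;
 * the project itself asks you to call allocation_failed() instead. */
DemoNode *demo_new_leaf(int gray_value)
{
    DemoNode *node = malloc(sizeof *node);
    if (node == NULL) {
        return NULL;
    }
    node->leaf = 1;
    node->gray_value = gray_value;
    for (int i = 0; i < 4; i++) {
        node->child[i] = NULL;
    }
    return node;
}

/* Counts the nodes in a subtree; an empty subtree counts as 0. */
int demo_count_nodes(const DemoNode *node)
{
    if (node == NULL) {
        return 0;
    }
    int total = 1;
    for (int i = 0; i < 4; i++) {
        total += demo_count_nodes(node->child[i]);
    }
    return total;
}

/* Frees a whole subtree, children before parent. NULL is a no-op. */
void demo_free_tree(DemoNode *node)
{
    if (node == NULL) {
        return;
    }
    for (int i = 0; i < 4; i++) {
        demo_free_tree(node->child[i]);
    }
    free(node);
}
```

Freeing the children before the parent matters: once a node has been freed, its child pointers can no longer be read safely.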
Here are a few key points regarding your quadtree representation:
- Leaves should have the boolean value leaf set to
true, while all other nodes should have it as false.
- The gray_value of leaves should be set to their grayscale value, but non-leaves should take on the value -1.
- The x and y variables should hold the top-left pixel coordinate of the image section the node represents.
- We only require that your code works with images that have widths that are powers of two. This means that all qNode sizes will also be powers of two and the smallest qNode size will be one pixel.
- The four child nodes are marked with ordinal directions (NW, NE, SE, SW), which you should match closely to the corresponding sections of the image.
- Some test cases are provided by
make check. These are not all of the tests that we will be grading your project on.
- Don't worry about NULL images or images of size zero; we won't test for these (but you're welcome to check for them anyway and return NULL).
- Your final code needs to have no memory leaks. Make sure your free_qtree() frees the entire subtree associated with a root.
- You may not assume that the pointer passed in to free_qtree() is not NULL.
The following example illustrates these points:
You can compile your code for task B with the following command:
$ make quadtree
Running the program with no arguments will print out the quadtree and compressed representation for a few arrays defined in print_basic(). You can also pass in the name of a grayscale bitmap image, and the code will compress the image and print out the quadtree and compressed representations (although we have not included any tests). Note that for any images whose dimensions are not square powers of two, the program will grab a square section at the approximate center of the image.
- Try segmenting the problem. First construct the quadtree and then try including compression.
- If you find yourself stuck, add CUnit tests (see testing).
- Run valgrind to make sure your code has no memory leaks. You should figure out how to do this from task C.
Task C (Due 9/18)
The final task is a small exercise intended to teach you how to use valgrind. From lecture and lab you should have seen that memory that is allocated and not freed results in a memory leak. One particularly useful piece of software for detecting memory leaks is valgrind, and it is already installed on the hive. If you learn how to use valgrind, you can quickly detect many memory errors that occur (not just memory leaks). Unfortunately, many students do not realize how powerful valgrind can be, so this exercise is intended to help you learn to use valgrind and to approach documentation in general. In the depth_map program there exists exactly 1 memory leak in the starter code. Your task is to find the memory leak and develop a solution. When you find the solution, you will edit leak_fix.py with the location of the leak and the line of C code needed to fix it. For example, if there were a file called example.c that had a memory leak that could be solved by freeing the variable weezy right before line 15, then you would fill in the Python file to contain:
filename = "example.c"
linenum = 15
line = "free (weezy);"
We will insert the line you specified in the file you specified via a script. You only need to supply a working line number (not any particular line number). The line you insert should be a valid line of C code. This exercise is not meant to be difficult but is intended to get you to explore learning how to use testing software from documentation. It is of course possible to brute force but it will defeat the purpose and most likely take longer. Because of these goals we will have the following rules:
- You may not share any information about the file to check, the line number, or the variable that needs to be freed.
- You may not post any information about the proper way to use valgrind to find the memory leak. Half this task is about learning how to read documentation.
Debugging and Testing
Your code is compiled with the -g flag, so you can use CGDB to help debug your program. While adding in print statements can be helpful, CGDB can make debugging a lot quicker, especially for memory access-related issues. While you are working on the project, we encourage you to keep your code under version control via your github classroom account.
In addition, we have included a few functions to help make development and debugging easier:
- print_image(const unsigned char *data, int width, int height): This function takes in an array of pixels and prints their values in hex to standard output.
- save_image(char *filename, const unsigned char *data, int width, int height): This function takes in an array of pixels and saves them to a new bmp file at a specified filename.
- print_qtree(qNode *qtree_root): This function takes in a qNode and prints out the quadtree.
- print_compressed(qNode *qtree_root): This function takes in a qNode and prints out the compressed representation of the quadtree.
The test cases we provide you are not all the test cases we will test your code with. You are highly encouraged to write your own tests before you submit. Feel free to add additional tests into the skeleton code, but do not make any modifications to function signatures or struct declarations. This can lead to your code failing to compile during grading.
make check will run the test cases against your code. You will see results like this:
$ make check
Running: ./depth_map -l test/images/quilt1-left.bmp -r test/images/quilt1-right.bmp -h 0 -w 0 -t 1 -o test/output/quilt1-output.bmp -v
Wrong output. Check test/output/quilt1-output.txt and test/expected/quilt1-expected.txt
...
You can open up
test/output/quilt1-output.bmp with an image viewer to see what kind of depth map your algorithm produced. You can open up
test/output/quilt1-output.txt with a text editor to see the actual values it produced. The expected values will be in
For further testing you can optionally write cunit tests, whose documentation can be found here. A starter framework has been provided showing sample tests on an example helper function square_euclidean_distance (). These CUnit tests are not graded and do not test much at all. Instead these are provided to give you a starting framework to run these tests. CUnit is already installed on the hive. You can run the CUnit tests with:
$ make run-unit-tests
The result is a breakdown of which tests you pass. For example, initially it will look like:
Restoring Your Work
Before you turn in your project you should ensure that if you checkout the skeleton code and only swap in the files you are allowed to submit that the project will work as intended. You can use the
git checkout command to do so, but be careful about the arguments you pass in as this will overwrite files. (Git will usually make sure you've committed changes before running the checkout though.)
$ git fetch starter
$ git checkout starter/master [files]
Before you submit, make sure you test your code on the Hive machines. We will be grading your code there. IF YOUR PROGRAM FAILS TO COMPILE, YOU WILL AUTOMATICALLY GET A ZERO FOR THAT PORTION (i.e., if depth_map works but quadtree doesn't compile, you will receive points for depth_map but none for quadtree). There is no excuse not to test your code.
Submitting is a two step process. You will need to submit on both the hive machine through glookup (where we will actually grade the submission) and tag your submission on github in case any issues arise. The full project 1-1 is due Tuesday 9/18 at 23:59.
To submit the full proj1-1 through glookup, enter in the following on the hive machine. You should be turning in calc_depth.c, make_qtree.c, and leak_fix.py.
$ cd ~/fa18-proj1-[YOUR USERNAME]
$ submit proj1-1
To tag the commit of your submission on github run the following commands.
$ cd ~/fa18-proj1-[YOUR USERNAME]
$ git add -u # should add all modified tracked files in proj1 directory
$ git commit -m "Project 1-1 submission" # or any other commit msg
$ git tag -f "proj1-1-sub" # The tag MUST be "proj1-1-sub". Failure to do so will result in a loss of credit.
$ git push origin master --tags # Note the --tags must be included to push tags to github