CS61C Fall 2018 Project 1: C/RISC-V

TA: Nick Riasanovsky
Part 1: Due 09/18 @ 23:59:59

Clarifications/Reminders

Goals

This project will expose you to C and RISC-V in greater depth. The first part gives you the opportunity to sharpen the C programming skills you have learned over the past few weeks, while the second part dives deeper into coding. Part of this project will also serve as the basis for the upcoming Project 2. In particular, Part 1's goals are:

Background

Cameras traditionally capture a two-dimensional projection of a scene. However, depth information is important for many real-world applications, including robotic navigation, face recognition, gesture or pose recognition, 3D scanning, and self-driving cars. The human visual system can perceive depth by comparing the images captured by our two eyes. This is called stereo vision. In this project we will experiment with a simple computer vision/image processing technique called "shape from stereo" that, much like humans do, computes depth information from stereo images (images taken from two locations that are displaced by a certain amount).

Depth Perception

Humans can tell how far away an object is by comparing the position of the object in the left eye's image with its position in the right eye's image. If an object is really close to you, your left eye will see the object further to the right, whereas your right eye will see it further to the left. A similar effect occurs with cameras that are offset with respect to each other, as seen below.

The above illustration shows 3 objects at different depths and positions. It also shows the positions at which the objects are projected onto the images in camera 1 and camera 2. As you can see, the closest object (green) is displaced the most (6 pixels) from image 1 to image 2. The furthest object (red) is displaced the least (3 pixels). We can therefore assign a displacement value to each pixel in image 1. The array of displacements is called a displacement map. The figure shows the displacement map corresponding to image 1.

Your task will be to find the displacement map using a simple block matching algorithm. Since images are two-dimensional, we need to explain how images are represented before describing the algorithm.

Below is a classic example of left-right stereo images and the displacement map shown as an image.

Part 1 (Due 9/18 @ 23:59:59)

Objective

In this project, we will attempt to simulate depth perception on the computer by writing a program that can distinguish faraway objects from nearby ones.

Getting started

First, create a Project 1 GitHub Classroom repository. Make sure to add https://github.com/61c-teach/fa18-proj1-starter.git as the remote, as you did for the homework and lab repositories.

The files you will need to modify and submit are:

You are free to define and implement additional helper functions, but if you want them to be run when we grade your project, you must include them in calc_depth.c or make_qtree.c. Changes you make to any other files will be overwritten when we grade your project.

The rest of the files are part of the framework. It may be helpful to look at all the other files.

Task A (Due 9/18)

Your first task will be to implement the depth map generator. This function takes two input images (unsigned char *left and unsigned char *right), which represent what you see with your left and right eye, and generates a depth map using the output buffer we allocate for you (unsigned char *depth_map).

Generating a depth map

In order to achieve depth perception, we will be creating a depth map. The depth map will be a new image (same size as left and right) where each "pixel" is a value from 0 to 255 inclusive, representing how far away the object at that pixel is. In this case, 0 means infinity, while 255 means as close as can be. Consider the following illustration:

The first step is to take a small patch (here 5x5) around the green pixel. This patch is represented by the red rectangle. We call this patch a feature. To find the displacement, we would like to find the corresponding feature position in the other image. We can do that by comparing similarly sized features in the other image and choosing the one that is the most similar. Of course, comparing against all possible patches would be very time consuming. We are going to assume that there's a limit to how much a feature can be displaced -- this defines a search space which is represented by the large green rectangle (here 11x11). Notice that, even though our images are named left and right, our search space extends in both the left/right and the up/down directions. Since we search over a region, if the "left image" is actually the right and the "right image" is actually the left, proper distance maps should still be generated.

The feature (a corner of a white box) was found at the position indicated by the blue square in the right image.

We'll say that two features are similar if they have a small Squared Euclidean Distance. If we're comparing two features, A and B, that have a width of W and height of H, their Squared Euclidean Distance is given by:

$$d^2(A, B) = \sum_{x=0}^{W-1} \sum_{y=0}^{H-1} \left(A(x, y) - B(x, y)\right)^2$$

(Note that this is never negative; it is 0 exactly when the two features are identical.)

For example, given two sets of two 2×2 images below:

← Squared Euclidean distance is (1-1)² + (5-5)² + (4-4)² + (6-6)² = 0 →

← Squared Euclidean distance is (1-3)² + (5-5)² + (4-4)² + (6-6)² = 4 →
(Source: http://cybertron.cg.tu-berlin.de/pdci08/imageflight/descriptors.html)
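
The distance computation can be sketched in C as follows. This is only an illustration: patch_distance is a hypothetical name that does not come from the starter code, and the sketch assumes both patches are already stored as contiguous row-major arrays of the same size.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Squared Euclidean distance between two width x height patches stored
 * in row-major order. patch_distance is a hypothetical helper used for
 * illustration only; it is not part of the starter code.
 */
unsigned int patch_distance(const unsigned char *a, const unsigned char *b,
                            size_t width, size_t height) {
    assert(a != NULL && b != NULL);
    unsigned int sum = 0;
    for (size_t i = 0; i < width * height; i++) {
        /* Widen to int before subtracting so the difference may be negative. */
        int diff = (int)a[i] - (int)b[i];
        sum += (unsigned int)(diff * diff);
    }
    return sum;
}
```

Applied to the two examples above, the first pair of patches (1, 5, 4, 6 against 1, 5, 4, 6) gives 0 and the second pair (1, 5, 4, 6 against 3, 5, 4, 6) gives 4. Each term is at most 255², so an unsigned int accumulator is comfortably large enough for small features.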

Once we find the feature in the right image that's most similar, we check how far away from the original feature it is, and that tells us how close by or far away the object is.

Definitions (Inputs)

We define these variables to your function:

We define the variables feature_width and feature_height, which result in feature patches of size (2 × feature_width + 1) × (2 × feature_height + 1). In the previous example, feature_width = feature_height = 2, which gives a 5×5 patch. We also define the variable maximum_displacement, which limits the search space. In the previous example, maximum_displacement = 3, which results in searching over (2 × maximum_displacement + 1)² patches in the second image.
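
To make the roles of these parameters concrete, here is a rough sketch of the search for a single pixel. It is not the required implementation. The names best_match and struct match are hypothetical, and the sketch makes assumptions that this section does not specify: the feature around (x, y) lies fully inside the left image, candidates whose feature would leave the right image are skipped, and ties keep the first candidate found.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical result type: offset of the best match from (x, y). */
struct match { int dx; int dy; };

/*
 * Block matching for ONE pixel (x, y) of the left image.
 * Assumptions (not taken from the project spec): out-of-image
 * candidates are skipped and ties keep the first candidate found.
 */
struct match best_match(const unsigned char *left, const unsigned char *right,
                        int image_width, int image_height, int x, int y,
                        int feature_width, int feature_height,
                        int maximum_displacement) {
    assert(x - feature_width >= 0 && x + feature_width < image_width);
    assert(y - feature_height >= 0 && y + feature_height < image_height);
    struct match best = {0, 0};
    unsigned int best_dist = UINT_MAX;

    for (int dy = -maximum_displacement; dy <= maximum_displacement; dy++) {
        for (int dx = -maximum_displacement; dx <= maximum_displacement; dx++) {
            int cx = x + dx, cy = y + dy; /* centre of the candidate feature */
            if (cx - feature_width < 0 || cx + feature_width >= image_width ||
                cy - feature_height < 0 || cy + feature_height >= image_height)
                continue;

            /* Squared Euclidean distance between the two features. */
            unsigned int dist = 0;
            for (int fy = -feature_height; fy <= feature_height; fy++) {
                for (int fx = -feature_width; fx <= feature_width; fx++) {
                    int l = left[(y + fy) * image_width + (x + fx)];
                    int r = right[(cy + fy) * image_width + (cx + fx)];
                    dist += (unsigned int)((l - r) * (l - r));
                }
            }
            if (dist < best_dist) {
                best_dist = dist;
                best.dx = dx;
                best.dy = dy;
            }
        }
    }
    return best;
}
```

The two outer loops visit the (2 × maximum_displacement + 1)² candidate positions, and the two inner loops visit the (2 × feature_width + 1) × (2 × feature_height + 1) pixels of each feature.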

Definitions (Output)

In order for our results to fit within the range of an unsigned char, we output the normalized displacement between the left feature and the corresponding right feature, rather than the absolute displacement. The normalized displacement is given by:

$$\text{normalized displacement} = 255 \times \frac{\sqrt{dx^2 + dy^2}}{\sqrt{2 \times \text{maximum\_displacement}^2}}$$

This function is implemented for you in calc_depth.c.

In the case of the above example, dy = 1 and dx = 2 are the vertical and horizontal displacements of the green pixel, and maximum_displacement = 3, so the normalized displacement is 255 × sqrt(1² + 2²)/sqrt(2 × 3²) = 134, truncated to an integer. Because neither dx nor dy can exceed maximum_displacement, this formula guarantees a value that fits in an unsigned char.
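
For intuition, the worked example can be reproduced with integer arithmetic alone. The sketch below is not the function shipped in calc_depth.c (use that one in your solution). scaled_displacement is a hypothetical name, and the integer square root is a substitute technique chosen here so that the truncation is exact and no floating point is involved.

```c
#include <assert.h>

/*
 * Integer-only sketch of the normalized displacement.
 * scaled_displacement is a hypothetical helper, not the function in
 * calc_depth.c. It uses the identity
 *   floor(255 * sqrt(q)) == isqrt(floor(255 * 255 * q)).
 */
unsigned char scaled_displacement(int dx, int dy, int maximum_displacement) {
    assert(maximum_displacement > 0);
    unsigned long squared = (unsigned long)(dx * dx + dy * dy);
    unsigned long limit = 2UL * maximum_displacement * maximum_displacement;
    assert(squared <= limit); /* |dx| and |dy| must not exceed the maximum */
    unsigned long target = 255UL * 255UL * squared / limit;

    /* Largest r with r * r <= target; r is at most 255 for valid input. */
    unsigned long r = 0;
    while ((r + 1) * (r + 1) <= target)
        r++;
    return (unsigned char)r;
}
```

With dx = 2, dy = 1 and maximum_displacement = 3 this returns 134, matching the example. The largest possible displacement (dx = dy = maximum_displacement) maps to 255 and no displacement maps to 0.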

Bitmap Images

We will be working with 8-bit grayscale bitmap images. In this file format, each pixel takes on a value between 0 and 255 inclusive, where 0 = black, 255 = white, and values in between correspond to various shades of gray. Together, the pixels form a 2D matrix with image_height rows and image_width columns.

Since each pixel may be one of 256 values, we can represent an image in memory using an array of unsigned char of size image_width * image_height. We store the 2D image in a 1D array such that each row of the image occupies a contiguous part of the array. The pixels are stored from top-to-bottom and left-to-right (see illustration below):


(Source: http://cnx.org/content/m32159/1.4/rowMajor.png)

We can refer to an individual pixel of the array by specifying the row and column it's located in. Recall that in a matrix, rows and columns are numbered from the top left. We will follow this numbering scheme as well, so the leftmost red square is at row 0, column 0, and the rightmost blue square is at row 2, column 1. In this project, we will also refer to an element's column # as its x position, and its row # as its y position. We call the # of columns of the image its image_width, and the # of rows of the image its image_height. Thus, the image above has a width of 2 and a height of 3, and the element at x=1 and y=2 is the rightmost blue square.
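
In code, the row-major layout means the pixel at column x and row y lives at index y * image_width + x. A minimal sketch (pixel_at is a hypothetical helper, not part of the starter code):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Read the pixel at column x, row y of a row-major image.
 * pixel_at is a hypothetical helper for illustration.
 */
unsigned char pixel_at(const unsigned char *image, size_t image_width,
                       size_t x, size_t y) {
    assert(image != NULL && x < image_width);
    /* Each full row above row y contributes image_width elements. */
    return image[y * image_width + x];
}
```

For the 2-wide, 3-tall image above, the element at x=1, y=2 is therefore at index 2 * 2 + 1 = 5, the last element of the array.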

Your task

Edit the function in calc_depth.c so that it generates a depth map and stores it in unsigned char *depth_map, which points to a pre-allocated buffer of size image_width × image_height. Two images, left and right are provided. They are also of size image_width × image_height. The feature_width and feature_height parameters are described in the Generating a depth map section.

Here are some implementation details and tips:

Usage

Use make to compile calc_depth. Your code must compile with the given Makefile:

$ make
gcc -Wall -g -std=c99 -o  ....
    ....

You can now run ./depth_map to see the options that the program takes.

$ ./depth_map
USAGE: ./depth_map [options]

REQUIRED
    -l [LEFT_IMAGE]       The left image
    -r [RIGHT_IMAGE]      The right image
    -w [WIDTH_PIXELS]     The width of the smallest feature
    -h [HEIGHT_PIXELS]    The height of the smallest feature
    -t [MAX_DISPLACE]     The threshold for maximum displacement

OPTIONAL
    -o [OUTPUT_IMAGE]     Draw output to this file
    -v                    Print the output to stdout as readable bytes

The -o option will let you visualize your depth map as a BMP image that you can open in your file browser. In these images, blue regions are far away and red regions are close by.

A testing framework and a few sample tests are provided for you. These tests are not a guarantee of the correctness of your code. Your code will be graded on its correctness, not whether or not it passes these tests. You can run the testing framework with make check. For calc_depth, the output images and bytes will be written to test/output/TEST_NAME-output.bmp and test/output/TEST_NAME-output.txt. For quadtree, the printed output is written to test/output/TEST_NAME-output.txt.

You can use your own 8-bit grayscale BMP images to test your code. There are helper functions in utils.c to generate BMP files from unsigned char arrays. Alternatively, you can generate them with an image editing program like Photoshop.

$ ./depth_map -l test/images/quilt2-left.bmp -r test/images/quilt2-right.bmp -h 0 -w 0 -t 1 -o test/output/quilt2-output.bmp -v
00 00 00
00 ff 00
00 00 b4

Advice

Task B (Due 9/18)

Your second task will be to implement quadtree compression. This function takes a depth map (unsigned char *depth) and generates a recursive data structure called a quadtree.

Quadtree Compression

The depth maps that we create in this project are just 2D arrays of unsigned char. When we interpret each value as a square pixel, we can output a rectangular image. We used bitmaps in this project, but it would be incredibly space inefficient if every image on the internet were stored this way, since bitmaps store the value of every pixel separately. Instead, there are many ways to compress images (ways to store the same image information with a smaller filesize). In task B, you will be asked to implement one type of compression using a data structure called a quadtree.

A quadtree is similar to a binary tree, except each node must have either 0 children or 4 children. When applied to a square bitmap image whose width and height can be expressed as 2^N, the root node of the tree represents the entire image. Each node of the tree represents a square sub-region within the image. We say that a square region is homogeneous if its pixels all have the same value. If a square region is not homogeneous, then we divide the region into four quadrants, each of which is represented by a child of the quadtree parent node. If the square region is homogeneous, then the quadtree node has no children and instead has a value equal to the color of the pixels in that region.

We continue checking for homogeneity of the image sections represented by each node until all quadtree nodes contain only pixels of a single grayscale value. Each leaf node in the quadtree is associated with a square section of the image and a particular grayscale value. Any non-leaf node will have a value of -1 (outside the grayscale range) associated with it, and should have 4 child nodes.


(Source: http://en.wikipedia.org/wiki/File:Quad_tree_bitmap.svg)

We will be numbering each child node created (1-4) clockwise from the top left square, as well as labeling it with its ordinal direction (NW, NE, SE, SW). When parsing through nodes, we will use this order: 1: NW, 2: NE, 3: SE, 4: SW.
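
The clockwise ordering translates directly into coordinate offsets. The sketch below uses hypothetical names (enum quadrant, struct corner, child_corner); the real declarations live in quadtree.h. It also numbers the quadrants 0-3, as is natural for C array indices, rather than 1-4.

```c
#include <assert.h>

/* Hypothetical names; the real declarations live in quadtree.h. */
enum quadrant { NW = 0, NE = 1, SE = 2, SW = 3 };
struct corner { int x; int y; };

/*
 * Top-left pixel of one child of a square region whose top-left pixel
 * is (x, y) and whose side length is size (a power of two, at least 2).
 */
struct corner child_corner(int x, int y, int size, enum quadrant q) {
    assert(size >= 2 && size % 2 == 0);
    int half = size / 2;
    struct corner c = { x, y };
    if (q == NE || q == SE) c.x += half; /* east quadrants sit to the right */
    if (q == SE || q == SW) c.y += half; /* south quadrants sit below */
    return c;
}
```

For a 4×4 region at (0, 0), the children start at (0, 0), (2, 0), (2, 2) and (0, 2) for NW, NE, SE and SW respectively. Recall that y grows downward, since rows are numbered from the top.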

Given a quadtree, we can choose to only keep the leaf nodes and use this to represent the original image. Because the leaf nodes should contain every color value represented, the non-leaf nodes are not needed to reconstruct the image. This compression technique works well if an image has large regions with the same grayscale value (artificial images), versus images with lots of random noise (real images). Depth maps are a relatively good input, since we get large regions with similar depths.

Your Job

Your task is to write the depth_to_quad(), homogenous() and free_qtree() functions located in make_qtree.c. The first function, depth_to_quad() takes an array of unsigned char, converts it into a quadtree, and returns a pointer to the root qNode in the tree. Keep in mind that local variables don't last after your function returns, so you must use dynamic memory allocation in the function. Since memory allocation could fail, you need to check whether the pointer returned by malloc() is valid or not. If it is NULL, you should call allocation_failed() (defined in utils.h).

Your representation should use a tree of qNodes, all of which either have 0 or 4 children. The declaration of the struct qNode is in quadtree.h.

The second function, homogenous(), takes in the depth_map as well as a region of the image (top left coordinates, width, and height). If every pixel in that region has the same grayscale value, then homogenous() should return that value. Otherwise, if the section is non-homogeneous, it should return -1.
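
As a rough illustration of that check, consider the sketch below. The name region_value and its parameter list are hypothetical; follow the signature declared in the starter code for the real homogenous().

```c
#include <assert.h>

/*
 * Sketch of a homogeneity check over a rectangular region of a
 * row-major depth map. region_value and its parameters are hypothetical.
 */
int region_value(const unsigned char *depth_map, int map_width,
                 int x, int y, int region_width, int region_height) {
    assert(region_width > 0 && region_height > 0 && x >= 0 && y >= 0);
    int first = depth_map[y * map_width + x];
    for (int row = y; row < y + region_height; row++) {
        for (int col = x; col < x + region_width; col++) {
            if (depth_map[row * map_width + col] != first)
                return -1; /* found two different values */
        }
    }
    return first; /* every pixel matched the first one */
}
```

Returning as soon as a mismatch is found avoids scanning the rest of the region. A 1×1 region is always homogeneous.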

The third function, free_qtree(), should take in the root of a qtree and should free all the memory associated with that tree. Since any node is itself the root of a subtree, the root passed in just needs to be a malloced node, not necessarily the root of the original tree.

Here are a few key points regarding your quadtree representation:

  1. Leaves should have the boolean value leaf set to true, while all other nodes should have it as false.
  2. The gray_value of leaves should be set to their grayscale value, but non-leaves should take on the value -1.
  3. The x and y variables should hold the top-left pixel coordinate of the image section the node represents.
  4. We only require that your code works with images that have widths that are powers of two. This means that all qNode sizes will also be powers of two and the smallest qNode size will be one pixel.
  5. The four child nodes are marked with ordinal directions (NW, NE, SE, SW), which you should match closely to the corresponding sections of the image.
  6. Some test cases are provided by make check. These are not all of the tests that we will be grading your project on.
  7. Don't worry about NULL images or images of size zero; we won't test for these (but you're welcome to check for them anyway and return NULL).
  8. Your final code needs to have no memory leaks. Make sure your free_qtree() frees the entire subtree associated with a root.
  9. You may not assume that the pointer passed in to free_qtree() is not NULL.

The following example illustrates these points:

Turning a matrix into a quadtree.
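
The recursive structure can also be sketched in code. This is a simplified illustration, not the required solution: struct sketch_node, build and free_tree are hypothetical names, the node layout is only a guess at the fields described above (the real struct qNode is declared in quadtree.h), and free_tree returns a node count purely so the example is easy to check.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node layout; the real struct qNode is in quadtree.h. */
struct sketch_node {
    int leaf;                     /* 1 for leaves, 0 otherwise */
    int size, x, y;               /* side length and top-left pixel */
    int gray_value;               /* 0..255 for leaves, -1 otherwise */
    struct sketch_node *child[4]; /* NW, NE, SE, SW */
};

/* Returns the shared value of the section, or -1 if values differ. */
static int section_value(const unsigned char *map, int map_width,
                         int x, int y, int size) {
    int first = map[y * map_width + x];
    for (int row = y; row < y + size; row++)
        for (int col = x; col < x + size; col++)
            if (map[row * map_width + col] != first) return -1;
    return first;
}

/* Recursively builds the subtree for the size x size section at (x, y). */
struct sketch_node *build(const unsigned char *map, int map_width,
                          int x, int y, int size) {
    assert(size >= 1);
    struct sketch_node *node = malloc(sizeof *node);
    if (node == NULL) abort(); /* the project calls allocation_failed() here */
    node->size = size; node->x = x; node->y = y;
    node->gray_value = section_value(map, map_width, x, y, size);
    node->leaf = (node->gray_value != -1);
    int half = size / 2;
    /* Offsets of NW, NE, SE, SW relative to (x, y). */
    const int off_x[4] = {0, half, half, 0};
    const int off_y[4] = {0, 0, half, half};
    for (int i = 0; i < 4; i++)
        node->child[i] = node->leaf ? NULL
            : build(map, map_width, x + off_x[i], y + off_y[i], half);
    return node;
}

/* Frees a subtree and returns how many nodes it released (0 for NULL). */
int free_tree(struct sketch_node *root) {
    if (root == NULL) return 0;
    int freed = 1;
    for (int i = 0; i < 4; i++)
        freed += free_tree(root->child[i]);
    free(root);
    return freed;
}
```

The recursion always terminates because a 1×1 section is homogeneous. For a 4×4 map that is all zeros except for one pixel in the bottom-right corner, build() produces 9 nodes: the root, its four children, and the four single-pixel children of the SE quadrant.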

You can compile your code for task B with the following command:

$ make quadtree

Running the program with no arguments will print out the quadtree and compressed representation for a few arrays defined in print_basic(). You can also pass in the name of a grayscale bitmap image, and the code will compress the image and print out the quadtree and compressed representations (although we have not included any tests). Note that for any images whose dimensions are not square powers of two, the program will grab a square section at the approximate center of the image.

Advice

Task C (Due 9/18)

The final task is a small exercise intended to teach you how to use valgrind. From lecture and lab, you should have seen that memory that is allocated and never freed results in a memory leak. One particularly useful piece of software for detecting memory leaks is valgrind, and it is already installed on the hive. If you learn how to use valgrind, you can quickly detect many kinds of memory errors (not just memory leaks). Unfortunately, many students do not realize how powerful valgrind can be, so this exercise is intended to help you learn to use valgrind and to approach documentation in general.

In the depth_map program, there is exactly 1 memory leak in the starter code. Your task is to find the memory leak and develop a solution. When you find the solution, you will edit leak_fix.py with the location of the leak and the line of C code needed to fix it. For example, if there were a file called example.c with a memory leak that could be fixed by freeing the variable weezy right before line 15, then you would fill in the Python file to contain:

filename = "example.c"
linenum = 15
line = "free (weezy);"

We will insert the line you specified in the file you specified via a script. You only need to supply a working line number (not any particular line number). The line you insert should be a valid line of C code. This exercise is not meant to be difficult; it is intended to get you to practice learning testing software from its documentation. It is of course possible to brute force the answer, but that defeats the purpose and will most likely take longer. Because of these goals, we will have the following rules:

Debugging and Testing

Your code is compiled with the -g flag, so you can use CGDB to help debug your program. While adding print statements can be helpful, CGDB can make debugging a lot quicker, especially for memory access-related issues. While you are working on the project, we encourage you to keep your code under version control via your GitHub Classroom account.

In addition, we have included a few functions to help make development and debugging easier:

The test cases we provide are not all the test cases we will test your code with. You are highly encouraged to write your own tests before you submit. Feel free to add additional tests to the skeleton code, but do not make any modifications to function signatures or struct declarations. Doing so can lead to your code failing to compile during grading.

Running make check will run the test cases against your code. You will see results like this:

$ make check
Running: ./depth_map -l test/images/quilt1-left.bmp -r test/images/quilt1-right.bmp -h 0 -w 0 -t 1 -o test/output/quilt1-output.bmp -v
Wrong output. Check test/output/quilt1-output.txt and test/expected/quilt1-expected.txt
...

You can open up test/output/quilt1-output.bmp with an image viewer to see what kind of depth map your algorithm produced. You can open up test/output/quilt1-output.txt with a text editor to see the actual values it produced. The expected values will be in test/expected/quilt1-expected.txt.

CUnit

For further testing, you can optionally write CUnit tests, whose documentation can be found here. A starter framework has been provided showing sample tests on an example helper function, square_euclidean_distance(). These CUnit tests are not graded and do not test much at all. Instead, they are provided to give you a starting framework for running such tests. CUnit is already installed on the hive. You can run the CUnit tests with:

$ make run-unit-tests

The result is a breakdown of which tests you pass. For example, initially it will look like:

Restoring Your Work

Before you turn in your project, you should ensure that the project works as intended if you check out the skeleton code and swap in only the files you are allowed to submit. You can use the git checkout command to do so, but be careful about the arguments you pass in, as this will overwrite files. (Git will usually make sure you've committed changes before running the checkout, though.)

$ git fetch starter
$ git checkout starter/master [files]

Before you submit, make sure you test your code on the Hive machines. We will be grading your code there. IF YOUR PROGRAM FAILS TO COMPILE, YOU WILL AUTOMATICALLY GET A ZERO FOR THAT PORTION (i.e., if depth_map works but quadtree doesn't compile, you will receive points for depth_map but none for quadtree). There is no excuse not to test your code.

Submission

Submitting is a two-step process. You will need to both submit on the hive machine through glookup (where we will actually grade the submission) and tag your submission on GitHub in case any issues arise. The full project 1-1 is due Tuesday 9/18 at 23:59.

To submit the full proj1-1 through glookup, enter in the following on the hive machine. You should be turning in calc_depth.c, make_qtree.c, and leak_fix.py.

$ cd ~/fa18-proj1-[YOUR USERNAME]
$ submit proj1-1

To tag the commit of your submission on GitHub, run the following commands.

$ cd ~/fa18-proj1-[YOUR USERNAME]
$ git add -u # stages all modified tracked files (new, untracked files are not added)
$ git commit -m "Project 1-1 submission" # or any other commit msg
$ git tag -f "proj1-1-sub" # The tag MUST be "proj1-1-sub". Failure to do so will result in a loss of credit.
$ git push origin master --tags # Note the --tags must be included to push tags to github

Part 2 COMING SOON