This morning I wrote the initial implementation of a raytracer to satisfy the requirements of assignment 3, roughly following the guidelines given in the "Note on Raytracer Design" from the resources page of the class website.

My goal to start was to rough out the different classes that would be needed and put in just enough details to get past The Black Screen of Doom.

First, just get the BSOD

I started by writing a Scene class that took the basic arguments needed to establish the scene:

  1. the eye position
  2. the four corners of the image plane
  3. the output image dimensions in pixels
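
Given those, the Scene can shoot a ray through any point on the image plane. Here's a minimal sketch of the idea, assuming numpy and sample coordinates (u, v) in [0, 1] x [0, 1] (the function and argument names are mine, not mandated by the assignment):

	import numpy as np

	def ray_through(eye, UL, UR, LL, LR, u, v):
	    # bilinearly interpolate the four corners to get the point on the
	    # image plane, then shoot a ray from the eye through that point
	    p = (1 - v) * ((1 - u) * LL + u * LR) + v * ((1 - u) * UL + u * UR)
	    return eye, p - eye   # the ray is eye + t * (p - eye), t >= 0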

Next, I wrote the Sampler and Film classes that would generate and store samples. For now, the sampler just generates a single sample in the center of each pixel. The Film aggregates samples, storing them in buckets (one bucket for each pixel). When it's time to write out the file, the Film averages all the samples in each pixel's bucket to produce that pixel's output color.
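
Roughly, the two classes might look like this (a sketch; the exact interfaces are my own choices):

	class Sampler:
	    # generates one sample per pixel, at the pixel's center
	    def __init__(self, width, height):
	        self.width, self.height = width, height

	    def samples(self):
	        for y in range(self.height):
	            for x in range(self.width):
	                # (u, v) in [0, 1]^2, offset to the pixel center; whether
	                # v runs up or down the image is a bookkeeping choice
	                yield x, y, (x + 0.5) / self.width, (y + 0.5) / self.height

	class Film:
	    # buckets samples per pixel, averages each bucket on output
	    def __init__(self, width, height):
	        self.buckets = [[[] for _ in range(width)] for _ in range(height)]

	    def add_sample(self, x, y, color):
	        self.buckets[y][x].append(color)

	    def pixel_color(self, x, y):
	        bucket = self.buckets[y][x]
	        return sum(bucket) / len(bucket)   # average of numpy color arrays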

At this point, I have everything I need to generate my first image! Suppose I specify an output image of 100 x 100 pixels. All 10,000 samples hit nothing (I'm not even tracing the rays yet), so this is what I get: BSOD

Sweet! I have achieved The Black Screen of Doom in the correct dimensions.

Now, get something to look at

I next implement the Sphere class. The sphere is defined by just two parameters: its center and its radius. I also implement a method to test whether a sample ray hits the sphere (check Shirley for how to do this efficiently).
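
For a ray s + t*d, this boils down to solving a quadratic; here's a sketch using the discriminant test Shirley describes (bail out before the square root when the ray can't possibly hit):

	import numpy as np

	class Sphere:
	    def __init__(self, center, radius):
	        self.center = np.array(center, float)
	        self.radius = radius

	    def intersect(self, s, d):
	        # returns the smallest t >= 0 where s + t*d hits the sphere, or None
	        ec = s - self.center
	        a = np.dot(d, d)
	        b = 2.0 * np.dot(d, ec)
	        c = np.dot(ec, ec) - self.radius ** 2
	        disc = b * b - 4 * a * c
	        if disc < 0:
	            return None               # ray misses entirely
	        sq = np.sqrt(disc)
	        t = (-b - sq) / (2 * a)       # nearer root first
	        if t < 0:
	            t = (-b + sq) / (2 * a)   # we may be inside the sphere
	        return t if t >= 0 else None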

Also, I write my RayTracer class that checks each sample ray to see if it hits any surfaces in my scene. If it hits something, it sets that sample to be red. Otherwise, it sets it to be black.

We've got everything we need now to generate a much more interesting image. Suppose I set up my scene with the following parameters:

	eye = ( 0,  0,  0)
	UL  = (-1,  1, -1)
	UR  = ( 1,  1, -1)
	LR  = ( 1, -1, -1)
	LL  = (-1, -1, -1)
	output resolution = 100 x 100
	

And add the following surfaces to it:

	sphere
	  radius = 1
	  center = (0, 0, -2)
	
When I render the scene, this is what I get: redcircle

Oh baby, look at that! Familiar, eh? This image doesn't seem like much, but getting it means that we have tons of stuff working. We _know_ we're correctly generating rays through the image plane, intersecting them with the sphere, and writing the averaged samples out to the image.

Not too shabby for a couple of hours' work, right? Now I should test the heck out of the code I've written so I won't have to worry about these parts in the future. For example, what happens if I change the output resolution to 200 x 100, but nothing else? This: redcircle wide

Does this output make sense?

What if I change the image plane so that it's tilted with respect to the eye? Say,

	eye = ( 0,  0,  0)
	UL  = (-1,  1,  0)
	UR  = ( 1,  1, -2)
	LR  = ( 1, -1, -2)
	LL  = (-1, -1,  0)
	

Here, the image plane is closer to the eye on the left and farther away on the right. Here's the output: redcircle tilt

Does this output make sense?

As you can see, even though I only have a very basic raytracer so far, I can still do a lot of testing, and I can start exploring how changing the eye position, image plane, and output resolution affect my final image.

Lights, Shading, and Shadows

Now that I have the basic structure of the raytracer laid down, I can start fleshing it out a bit.

First, I write a class to represent directional lights. The class constructor just takes the direction and intensity of the light, and the class has one method that, given a point in space, returns the light vector. Because a directional light points in the same direction everywhere, this method just returns a constant value, but you can think about how it would change for other types of lights.
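
A sketch (I store the direction the light travels, so the vector back toward the light is its negation; that's a convention choice, not the only one):

	import numpy as np

	class DirectionalLight:
	    def __init__(self, direction, intensity):
	        self.direction = np.array(direction, float)   # the way the light travels
	        self.intensity = np.array(intensity, float)

	    def light_vector(self, point):
	        # constant everywhere -- this is what makes the light "directional";
	        # a point light would instead return something like (light_pos - point)
	        return -self.direction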

Next, I need to change the shading of my spheres to take advantage of lights. This means implementing a shading model more or less identical to the one in assignment 2. It looks something like

(given a ray that hits a sphere with surface properties ka, kd, ks, and specular exponent sp, determine the color of that ray)

	color = black
	color += ka
	for each light in the scene:
	    color += kd term
	    color += ks term
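
Spelled out in Python, that loop might look like this sketch (assuming ka, kd, ks are numpy arrays and sp a float, as in the scene dumps later, and that n, l, and view are unit vectors):

	import numpy as np

	def shade(point, n, view, surface, lights):
	    # n: unit surface normal; view: unit vector from point toward the eye
	    color = surface.ka.copy()                     # ambient term
	    for light in lights:
	        l = light.light_vector(point)
	        # diffuse: kd * I * max(0, n . l)
	        color += surface.kd * light.intensity * max(0.0, np.dot(n, l))
	        # specular: ks * I * max(0, r . view)^sp, where r reflects l about n
	        r = 2 * np.dot(n, l) * n - l
	        color += surface.ks * light.intensity * max(0.0, np.dot(r, view)) ** surface.sp
	    return color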

At this point, I should be able to reproduce something like the final output from assignment 2, e.g.: phong

Now I'd like to add something to the Phong model that raytracing makes easy: shadows. Implementing them is a snap. Just before I shade the Phong model due to a particular light, I send a ray from the point on the surface I'm shading toward the light. If the ray hits anything, I skip that light's contribution. This shadows the surface at that point.

If you're not careful about the "anything," numerical error may cause you to think that an object is shadowing itself. You can avoid this by adding a "shadow bias," which is just a minimum t value for shadow ray intersections, or you can skip checking for shadows caused by the surface you're currently shading.
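
As a sketch, the bias version is just one extra test on t (the epsilon here is a guess; scale it to your scene):

	SHADOW_BIAS = 1e-4   # minimum t for a shadow-ray hit to count

	def in_shadow(point, light, surfaces):
	    l = light.light_vector(point)
	    for surface in surfaces:
	        t = surface.intersect(point, l)
	        if t is not None and t > SHADOW_BIAS:
	            return True    # something sits between the point and the light
	    return False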

Now we can toss in another light and a couple more spheres to get scenes like this: shadow

Triangles and Reflections

The next thing I'll add to my raytracer is triangle intersection. The book provides a pretty efficient method to do this. The Triangle class should expose the same methods as the Sphere class we originally made.

You'll also probably notice that debugging the intersection code is harder than it was for the sphere--this is why people always seem to raytrace spheres first. Triangles have some advantages, though. For example, is it necessary to compute the normal vector for each point on the triangle that gets hit by a ray?
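
For reference, here's a sketch using the Moller-Trumbore formulation (similar in cost to the book's Cramer's-rule solution; note that the normal gets computed exactly once, up front, because a flat triangle has the same normal everywhere):

	import numpy as np

	class Triangle:
	    def __init__(self, v1, v2, v3):
	        self.v1, self.v2, self.v3 = (np.array(v, float) for v in (v1, v2, v3))
	        # one normal serves the whole triangle -- no per-hit computation
	        n = np.cross(self.v2 - self.v1, self.v3 - self.v1)
	        self.normal = n / np.linalg.norm(n)

	    def intersect(self, s, d):
	        # returns t where s + t*d hits the triangle, or None
	        e1, e2 = self.v2 - self.v1, self.v3 - self.v1
	        pvec = np.cross(d, e2)
	        det = np.dot(e1, pvec)
	        if abs(det) < 1e-9:
	            return None                   # ray is parallel to the triangle
	        inv_det = 1.0 / det
	        tvec = s - self.v1
	        u = np.dot(tvec, pvec) * inv_det
	        if u < 0 or u > 1:
	            return None                   # outside the triangle
	        qvec = np.cross(tvec, e1)
	        v = np.dot(d, qvec) * inv_det
	        if v < 0 or u + v > 1:
	            return None                   # outside the triangle
	        t = np.dot(e2, qvec) * inv_det
	        return t if t >= 0 else None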

I test my triangle code by adding one to the scene:

	Triangle with v1=[  5.   5. -17.] v2=[  1.   4. -20.] v3=[  6.  -1. -20.]
	    ka=[ 0.1  0.1  0.1]
	    kd=[ 0.1  0.1  0.1]
	    ks=[ 1.  1.  1.]
	    kr=[ 0.  0.  0.]
	    sp=50.0

Resulting in this image: triangle

Adding reflection to the scene is almost as easy as adding shadows was. First, I need to add a material property (I call it "kr" for "reflection") that describes how to attenuate the color that comes from the reflected ray. Next, I need to compute the reflected ray (check Shirley for the formula).

To figure out the color of the reflected ray, I can just reuse my trace routine. The simplest way to do this is to include a recursive call in my trace function, so that's how I'll implement it. But what if a reflected ray hits something that is itself reflective, and so on forever? I need to add another argument to my trace function that keeps track of the recursion depth, and set a limit at which I bail out. I'll start by setting this limit to 1, so reflective materials trace just one bounce. If I make my new triangle reflective, I can check that everything looks okay.
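
Here's a sketch of that recursion. closest_intersection is a hypothetical helper that returns the nearest (point, normal, surface) hit or None, shade is the Phong routine from before, and the reflection formula r = d - 2(d.n)n is the one Shirley gives:

	import numpy as np

	MAX_DEPTH = 1   # one bounce to start

	def trace(s, d, surfaces, lights, depth=0):
	    hit = closest_intersection(s, d, surfaces)    # hypothetical helper
	    if hit is None:
	        return np.zeros(3)                        # background: black
	    point, n, surface = hit
	    view = -d / np.linalg.norm(d)
	    color = shade(point, n, view, surface, lights)
	    if depth < MAX_DEPTH and np.any(surface.kr > 0):
	        r = d - 2 * np.dot(d, n) * n              # reflect d about the normal
	        # nudge the origin along the normal to avoid re-hitting the surface
	        color += surface.kr * trace(point + 1e-4 * n, r, surfaces, lights, depth + 1)
	    return color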

The complete scene description is now:

	Scene configuration:
	        eye=[ 0.  0.  0.]
	        LL=[-1. -1. -3.]
	        LR=[ 1. -1. -3.]
	        UR=[ 1.  1. -3.]
	        UL=[-1.  1. -3.]
	        pixelsX=500
	        pixelsY=500

	Lights:
	    Directional light of intensity [ 1.  1.  1.] in direction [ 0.57735027 -0.57735027 -0.57735027]
	    Directional light of intensity [ 0.  0.  1.] in direction [ 0.57735027  0.57735027 -0.57735027]

	Surfaces:
	    Sphere with center=[  0.   0. -20.] radius=3.0
	        ka=[ 0.1  0.1  0.1]
	        kd=[ 1.  0.  1.]
	        ks=[ 1.  1.  1.]
	        kr=[ 0.  0.  0.]
	        sp=50.0

	    Sphere with center=[ -2.   2. -15.] radius=1.0
	        ka=[ 0.1  0.1  0.1]
	        kd=[ 1.  1.  0.]
	        ks=[ 1.  1.  1.]
	        kr=[ 0.  0.  0.]
	        sp=50.0

	    Sphere with center=[ -2.  -2. -15.] radius=1.0
	        ka=[ 0.1  0.1  0.1]
	        kd=[ 0.  1.  1.]
	        ks=[ 1.  1.  1.]
	        kr=[ 0.  0.  0.]
	        sp=50.0

	    Triangle with v1=[  5.   5. -17.] v2=[  1.   4. -20.] v3=[  6.  -1. -20.]
	        ka=[ 0.1  0.1  0.1]
	        kd=[ 0.1  0.1  0.1]
	        ks=[ 1.  1.  1.]
	        kr=[ 1.  1.  1.]
	        sp=50.0

Now that we've got basic reflection working, it's probably worth trying a deeper recursion. We can have lots of fun with it. Here's a scene with several reflective spheres and a recursion depth limit of 5:

Bounce my little rays, BOUNCE!

Here's the complete scene spec:

	Scene configuration:
	        eye=[ 0.  0.  0.]
	        LL=[-1. -1. -3.]
	        LR=[ 1. -1. -3.]
	        UR=[ 1.  1. -3.]
	        UL=[-1.  1. -3.]
	        pixelsX=1000
	        pixelsY=1000

	Lights:
	    Directional light of intensity [ 1.  1.  1.] in direction [ 0.57735027 -0.57735027 -0.57735027]
	    Directional light of intensity [ 1.  1.  1.] in direction [-0.57735027  0.57735027  0.57735027]

	Surfaces:
	    Sphere with center=[  0.   0. -17.] radius=2.0
	        ka=[ 0.1  0.1  0.1]
	        kd=[ 1.  0.  0.]
	        ks=[ 1.  1.  1.]
	        kr=[ 0.9  0.9  0.9]
	        sp=50.0

	    Sphere with center=[  0.   4. -17.] radius=1.5
	        ka=[ 0.1  0.1  0.1]
	        kd=[ 0.  1.  0.]
	        ks=[ 1.  1.  1.]
	        kr=[ 0.9  0.9  0.9]
	        sp=50.0

	    Sphere with center=[  0.  -4. -17.] radius=1.5
	        ka=[ 0.1  0.1  0.1]
	        kd=[ 0.  0.  1.]
	        ks=[ 1.  1.  1.]
	        kr=[ 0.9  0.9  0.9]
	        sp=50.0

	    Sphere with center=[  4.   0. -17.] radius=1.5
	        ka=[ 0.1  0.1  0.1]
	        kd=[ 1.  1.  0.]
	        ks=[ 1.  1.  1.]
	        kr=[ 0.9  0.9  0.9]
	        sp=50.0

	    Sphere with center=[ -4.   0. -17.] radius=1.5
	        ka=[ 0.1  0.1  0.1]
	        kd=[ 0.  1.  1.]
	        ks=[ 1.  1.  1.]
	        kr=[ 0.9  0.9  0.9]
	        sp=50.0

Transformations

One of the requirements for the assignment is to support arbitrary ellipsoids in the scene. While it's possible to intersect rays with ellipsoids directly, it's much more of a pain than intersecting a ray with a sphere. What to do?

If you think about an axis-aligned ellipsoid, you can see that it's just a sphere that's been stretched in x, y, or z. So I'm going to think of axis-aligned ellipsoids as spheres that have had some linear transformation applied to them.

For example, an ellipsoid centered at the origin with x, y, z radii of (2, 3, 1) could be represented as a unit sphere centered at the origin transformed by the matrix

[ 2 0 0 0 ]
[ 0 3 0 0 ]
[ 0 0 1 0 ]
[ 0 0 0 1 ]

Furthermore, I can think of an ellipsoid with the same shape but centered at (3, 4, 5) as a unit sphere centered at the origin transformed by the matrix

[ 2 0 0 3 ]
[ 0 3 0 4 ]
[ 0 0 1 5 ]
[ 0 0 0 1 ]

I could even go on and apply a rotation--in homogeneous coordinates, the transformation still stays linear (it's all just one matrix). Suppose I have an ellipsoid that is a sphere scaled by the matrix S, rotated by the matrix R, and translated by the matrix T. Then I can think about the overall transformation as

M = T * R * S

Great, but how does this help me? You can think of these matrices as taking you from the simple, happy space where the shape is a unit sphere at the origin into the world space of the scene. So,

M * (Unit sphere space) = World space

It also stands to reason, because this transform is invertible, that

M⁻¹ * (World space) = Unit sphere space

I could find M⁻¹ either using a matrix inversion library or just observing that

M⁻¹ = S⁻¹ * R⁻¹ * T⁻¹

and building it by hand.

Great, but why do I care? The key thing is this: If I take a ray, and apply a linear transformation to it, another ray will pop out the other side. So, if I have a ray and I want to know if it will hit the ellipsoid in world space, why not just move that ray into unit sphere space and perform the intersection there? To be clear, the answers to the two questions ARE EXACTLY THE SAME:

  1. For what values of t, if any, does my ray s + td hit this crazy ellipsoid?
  2. For what values of t, if any, does my transformed ray (M⁻¹)s + t(M⁻¹)d hit the unit sphere centered at the origin?

Sweet.

So now, I can just subclass Sphere to make my Ellipsoid class, and call the parent's intersection routine after transforming the ray. As Shirley notes in 10.8, you have to be a bit careful about how the normals transform.
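
The subclass ends up being tiny. Here's a sketch building on the Sphere sketch from earlier (M is the 4x4 matrix T * R * S; the inverse-transpose rule for normals is the one from Shirley 10.8):

	import numpy as np

	class Ellipsoid(Sphere):
	    # a unit sphere at the origin, viewed through the transformation M
	    def __init__(self, M):
	        super().__init__(center=(0.0, 0.0, 0.0), radius=1.0)
	        self.M = np.array(M, float)
	        self.M_inv = np.linalg.inv(self.M)     # world space -> unit sphere space
	        self.N = self.M_inv[:3, :3].T          # inverse transpose, for normals

	    def _to_sphere_space(self, v, w):
	        # homogeneous transform: w=1 for points, w=0 for directions
	        return self.M_inv.dot(np.append(v, w))[:3]

	    def intersect(self, s, d):
	        # identical t values to hitting the world-space ellipsoid directly
	        return super().intersect(self._to_sphere_space(s, 1.0),
	                                 self._to_sphere_space(d, 0.0))

	    def normal_at(self, p_sphere):
	        # in unit sphere space the normal at p is just p itself; push it back
	        # to world space with the inverse transpose and renormalize
	        n = self.N.dot(p_sphere)
	        return n / np.linalg.norm(n)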

Once I've added ellipsoids, I can create more gratuitous scenes such as:

Here's the scene spec:

	Scene configuration:
	        eye=[ 0.  0.  0.]
	        LL=[-1. -1. -3.]
	        LR=[ 1. -1. -3.]
	        UR=[ 1.  1. -3.]
	        UL=[-1.  1. -3.]
	        pixelsX=1000
	        pixelsY=1000

	Lights:
	    Directional light of intensity [ 1.  1.  1.] in direction [ 0.57735027 -0.57735027 -0.57735027]
	    Directional light of intensity [ 1.  1.  1.] in direction [-0.57735027  0.57735027  0.57735027]

	Surfaces:
	    Ellipsoid with center=[  0.   0. -17.] rx=4.0 ry=2.0 rz=2.0
	                rotx=0.0 roty=0.0 rotz=0.0
	        ka=[ 0.1  0.1  0.1]
	        kd=[ 1.  0.  0.]
	        ks=[ 1.  1.  1.]
	        kr=[ 0.9  0.9  0.9]
	        sp=50.0

	    Ellipsoid with center=[ -2.   4. -17.] rx=0.5 ry=1.5 rz=1.0
	                rotx=0.0 roty=-45 rotz=45.0
	        ka=[ 0.1  0.1  0.1]
	        kd=[ 0.  1.  0.]
	        ks=[ 1.  1.  1.]
	        kr=[ 0.9  0.9  0.9]
	        sp=50.0

	    Ellipsoid with center=[ -2.  -4. -17.] rx=0.5 ry=1.5 rz=1.0
	                rotx=0.0 roty=-45 rotz=-45.0
	        ka=[ 0.1  0.1  0.1]
	        kd=[ 0.  0.  1.]
	        ks=[ 1.  1.  1.]
	        kr=[ 0.9  0.9  0.9]
	        sp=50.0

	    Ellipsoid with center=[  2.   4. -17.] rx=0.5 ry=1.5 rz=1.0
	                rotx=0.0 roty=45 rotz=135.0
	        ka=[ 0.1  0.1  0.1]
	        kd=[ 1.  1.  0.]
	        ks=[ 1.  1.  1.]
	        kr=[ 0.9  0.9  0.9]
	        sp=50.0

	    Ellipsoid with center=[  2.  -4. -17.] rx=0.5 ry=1.5 rz=1.0
	                rotx=0.0 roty=45 rotz=-135.0
	        ka=[ 0.1  0.1  0.1]
	        kd=[ 0.  1.  1.]
	        ks=[ 1.  1.  1.]
	        kr=[ 0.9  0.9  0.9]
	        sp=50.0

Intersection Acceleration

The last thing I want to do with my raytracer is accelerate ray-object intersections. My basic strategy is to build a bounding-box hierarchy as a tree that will let me reduce the number of intersection tests I make for each ray.

The strategy is described in detail here.

It basically amounts to using a kd-tree heuristic to merge bounding boxes in a sane manner. I could be even more aggressive and choose my bounding boxes based on the probability that a ray will hit them, but this is a simpler implementation.
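
Roughly, construction looks like this sketch: a median split along the node's longest axis, stopping at small leaves. I'm assuming each surface provides a bounding_box() method returning (lo, hi) numpy corners (see the next paragraph for how to compute those):

	import numpy as np

	class BVHNode:
	    def __init__(self, surfaces):
	        boxes = [s.bounding_box() for s in surfaces]
	        self.lo = np.min([b[0] for b in boxes], axis=0)
	        self.hi = np.max([b[1] for b in boxes], axis=0)
	        if len(surfaces) <= 2:
	            self.leaf, self.children = surfaces, None     # small enough: stop
	        else:
	            axis = int(np.argmax(self.hi - self.lo))      # split the longest axis
	            surfaces = sorted(surfaces,
	                              key=lambda s: s.bounding_box()[0][axis])
	            mid = len(surfaces) // 2                      # median split
	            self.leaf = None
	            self.children = (BVHNode(surfaces[:mid]), BVHNode(surfaces[mid:]))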

In order to get this approach started, though, I need to know how to compute axis-aligned bounding boxes for each of my surface primitives. For a triangle, it's easy--I just take the maximum and minimum x, y, and z values over all three vertices to define the two corners of the box. For a sphere, I can use the center and radius to determine how big the box should be. For ellipsoids, I need to be more careful--the bounding box of the untransformed sphere will not, in general, be axis-aligned after I've applied a rotation to it. But I can still easily pick a (not necessarily optimal) bounding box by transforming the eight corners of the original sphere's bounding box and taking the maximum and minimum x, y, and z values of the results.
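
Sketches of all three (written as free functions here; they could equally be the bounding_box() methods the tree build assumes):

	import numpy as np

	def triangle_bbox(tri):
	    # componentwise min/max over the three vertices
	    vs = np.array([tri.v1, tri.v2, tri.v3])
	    return vs.min(axis=0), vs.max(axis=0)

	def sphere_bbox(sph):
	    # center +/- radius along every axis
	    return sph.center - sph.radius, sph.center + sph.radius

	def ellipsoid_bbox(ell):
	    # transform the 8 corners of the unit sphere's box by M, then take
	    # the min/max of the results -- valid, though not necessarily tight
	    corners = np.array([ell.M.dot([x, y, z, 1.0])[:3]
	                        for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)])
	    return corners.min(axis=0), corners.max(axis=0)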

The final problem we want to solve efficiently is intersecting rays with these axis-aligned bounding boxes. There are a lot of ways to do this, but it's important to realize (as Smits' notes mention) that we don't care _where_ a ray hits the AABB, just whether it hits it. This means we can save a lot of time. An efficient algorithm like the slabs method described in Smits' notes is a good choice, but a slower, more traditional intersection method would do just fine as well.
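
Here's a sketch of the slabs test, plus the tree walk that replaces the old for loop over every surface (using the BVHNode fields from the build sketch above; note the box test only ever answers hit or miss):

	import numpy as np

	def hits_box(s, d, lo, hi):
	    # slabs test: does the ray s + t*d (t >= 0) hit the box at all?
	    t_near, t_far = 0.0, np.inf
	    for axis in range(3):
	        if d[axis] == 0.0:
	            if not (lo[axis] <= s[axis] <= hi[axis]):
	                return False              # parallel to and outside this slab
	        else:
	            t0 = (lo[axis] - s[axis]) / d[axis]
	            t1 = (hi[axis] - s[axis]) / d[axis]
	            if t0 > t1:
	                t0, t1 = t1, t0
	            t_near, t_far = max(t_near, t0), min(t_far, t1)
	            if t_near > t_far:
	                return False              # slab intervals don't overlap
	    return True

	def closest_hit(node, s, d):
	    # returns (t, surface) for the nearest hit under this node, or None
	    if not hits_box(s, d, node.lo, node.hi):
	        return None                       # prune the whole subtree
	    candidates = []
	    if node.leaf is not None:
	        for surf in node.leaf:
	            t = surf.intersect(s, d)
	            if t is not None:
	                candidates.append((t, surf))
	    else:
	        for child in node.children:
	            hit = closest_hit(child, s, d)
	            if hit is not None:
	                candidates.append(hit)
	    return min(candidates, key=lambda h: h[0]) if candidates else None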

Once I have built this bounding box hierarchy, I can use it to answer the question "what surface will this ray hit?" where before I was using a for loop over all the surfaces in the scene. This makes it much easier to do scenes like this one, where I've visualized the RGB color cube with a bunch of spheres:

You could extend this idea even further. For example, if you have a car model in your scene that's represented by 10,000 triangles, you'd certainly want to use a bounding box hierarchy to represent it. But what if you wanted 5 cars in your scene, each one transformed differently? In that case, it makes sense to add a surface primitive that is itself a bounding box hierarchy to which an arbitrary transformation can be applied, so that you could instance it without wasting memory.