I want to write down what our group learned in our final project for the Computational Photography course. We wanted to explore how to capture a 360° environment map in the real world and use it to perform image-based lighting in Maya. We tried two approaches: mirror ball unwrapping and panorama stitching. I will mainly talk about the panorama-based approach since that is what I worked on.

I mainly followed the slides and problem sets from MIT's 6.815 Digital and Computational Photography course to implement the auto-stitching part. Auto-stitching produces high-quality corresponding feature point pairs between two images in four stages: corner detection, descriptor creation, correspondence search, and RANSAC.

In the first stage, we use the Harris Corner Detector to produce the feature points, which involves computing structure tensors, calculating the corner responses, and finding the local maxima of those responses. We also need a way to describe the neighborhood of a feature point, called a "descriptor", to help us measure the similarity of two feature points. We used a simple patch descriptor: a $k \times k$ patch around the feature point, Gaussian-blurred and with its values normalized. After computing the descriptors, we used the L2 distance to determine how "close" any two descriptors are and kept the descriptor pairs whose distance is under a threshold. We also used the second-best test to filter out ambiguous matches: a match is rejected when its best candidate is not clearly closer than the second-best one. Finally, even after the second-best test, our descriptor pairs usually still contain some outliers. We used RANSAC to filter out these remaining outliers.
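
The correspondence search described above can be sketched roughly as follows. This is a minimal illustration, not our project code: the function names (`normalize`, `l2`, `match`), the zero-mean/unit-variance normalization, and the `max_dist` and `ratio` values are all assumptions made for the example, and the Gaussian blur step is left out.

```python
import math

def normalize(patch):
    """Flatten a k x k patch and normalize it (assumed: zero mean, unit std)."""
    values = [v for row in patch for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var) or 1.0  # avoid dividing by zero on flat patches
    return [(v - mean) / std for v in values]

def l2(a, b):
    """Euclidean (L2) distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match(desc_a, desc_b, max_dist=1.0, ratio=0.7):
    """Return (i, j) index pairs that pass the threshold and second-best tests.

    max_dist and ratio are illustrative values, not tuned ones.
    """
    matches = []
    for i, d in enumerate(desc_a):
        # Rank every candidate in the other image by distance to d.
        ranked = sorted((l2(d, e), j) for j, e in enumerate(desc_b))
        if not ranked:
            continue
        best_dist, best_j = ranked[0]
        # Threshold test: the best candidate must be close enough.
        if best_dist > max_dist:
            continue
        # Second-best test: reject when the runner-up is almost as close,
        # because the match is then ambiguous.
        if len(ranked) > 1 and best_dist >= ratio * ranked[1][0]:
            continue
        matches.append((i, best_j))
    return matches
```

Because the patches are normalized first, two patches that differ only by a brightness offset or contrast scale end up with the same descriptor, which is the point of the normalization step.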

RANSAC is really powerful for eliminating outliers in descriptor matches, as demonstrated in the image below. The green lines show the inlier matches, while the red lines show the outlier matches. The blue lines show the matches selected by RANSAC to calculate the final rotation matrix $R$ between the two images.
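
To make the RANSAC loop concrete, here is a minimal sketch under a deliberately simplified model. Instead of the rotation matrix $R$ we estimated in the project, it fits a plain 2D translation, so that a single match is a minimal sample and no linear algebra library is needed. The function name `ransac_translation` and the `iterations`, `tol`, and `seed` parameters are hypothetical.

```python
import random

def ransac_translation(matches, iterations=200, tol=2.0, seed=0):
    """Fit a 2D translation to point matches while ignoring outliers.

    matches: list of ((x1, y1), (x2, y2)) pairs.
    Returns ((tx, ty), inlier_indices).
    NOTE: translation is a stand-in model for illustration only.
    """
    if not matches:
        raise ValueError("need at least one match")
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    best_inliers = []
    for _ in range(iterations):
        # Minimal sample: one match fully determines a translation.
        (x1, y1), (x2, y2) = rng.choice(matches)
        tx, ty = x2 - x1, y2 - y1
        # Count the matches that agree with this hypothesis within tol pixels.
        inliers = [
            i for i, ((ax, ay), (bx, by)) in enumerate(matches)
            if (ax + tx - bx) ** 2 + (ay + ty - by) ** 2 <= tol ** 2
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit on all inliers of the best hypothesis (mean displacement).
    n = len(best_inliers)
    tx = sum(matches[i][1][0] - matches[i][0][0] for i in best_inliers) / n
    ty = sum(matches[i][1][1] - matches[i][0][1] for i in best_inliers) / n
    return (tx, ty), best_inliers
```

The structure carries over to the rotation case: only the minimal sample size, the model-fitting step, and the error measure change, while the sample, score, keep-the-best loop stays the same.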

Once we could stitch two images together with auto-stitching, we still needed to figure out how to compose a full panorama from $N$ images. We can either stitch the images pair by pair locally or use a global optimization approach. Due to time constraints, we chose the local approach. We found that if you first stitch all pitch angles of a single yaw, then stitch the results across the different yaw angles, you tend to get much less ghosting in the output image.
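
The stitching order described above can be sketched as follows. The function `stitch_panorama` and its `stitch_pair` argument are hypothetical names: `stitch_pair` stands for whatever routine merges two images, and the sketch assumes each shot is tagged with the yaw and pitch at which it was taken.

```python
from collections import defaultdict
from functools import reduce

def stitch_panorama(shots, stitch_pair):
    """Stitch pitch-first within each yaw, then across yaws.

    shots: list of (yaw, pitch, image) tuples.
    stitch_pair: callable taking two images and returning the merged image
                 (a placeholder for the pairwise auto-stitching step).
    """
    if not shots:
        raise ValueError("need at least one shot")
    # Group the shots into vertical columns, one per yaw angle.
    by_yaw = defaultdict(list)
    for yaw, pitch, image in shots:
        by_yaw[yaw].append((pitch, image))
    columns = []
    for yaw in sorted(by_yaw):
        # Within a column, stitch neighbouring pitch angles in order.
        column = [img for _, img in sorted(by_yaw[yaw], key=lambda s: s[0])]
        columns.append(reduce(stitch_pair, column))
    # Finally, stitch the finished columns together in yaw order.
    return reduce(stitch_pair, columns)
```

Passing a toy `stitch_pair` such as `lambda a, b: f"({a}+{b})"` with string "images" makes the resulting order easy to inspect.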

This is how we automatically generated the panorama image above from a series of individual images. Compared to the mirror ball unwrapping approach, the panorama-based approach can produce environment maps with much higher resolution. However, with the panorama-based approach, the areas around the north and south poles cause a lot of problems. For example, if the sky is a single color, the corner detector won't be able to find any corners there and the whole pipeline will fail. To generate the final image combining a real-world photo with a virtual scene, we had to use some tricks (such as adding a hat to the mirror ball) to hide the holes around the north and south poles.

We further compared these two methods in this table:

| Method | Mirror Ball | Panorama |
| --- | --- | --- |