S. Mann, 2022 March 17

Featured on Instructables.com

Panoramic imaging using algebraic projective geometry (i.e. combining multiple, possibly differently exposed, images) was first proposed and implemented by Mann, and published in 1993 [1][2].

Broadly there are two kinds of panoramas (two categories under which pictures can be perfectly combined together):
Type I panorama: the camera's center-of-projection remains at a fixed location; if there are no moving elements in the picture, all pictures will be in the same orbit of the projective group of coordinate transformations. This type of panorama is often used to capture a full 360 degrees, as in the selfie below, in which I appear twice (once at the left edge of the picture at 0 degrees and once at the right edge at 360 degrees):

For this to work perfectly, nothing in the image should move. I didn't do a very good job of staying still while whipping the camera around in a circle, so I have a different facial expression at 0 degrees than at 360 degrees. But that movement can sometimes be used artistically, e.g. by deliberately violating the no-movement rule, as below, where people moved around so they'd appear in the photo multiple times, with a different facial expression and a different pose each time they appear:

Type II panorama: the subject matter is planar. An approximate example is shown below:

The (somewhat) flat storefronts are depicted in an approximately true and accurate way. All pictures of static planar subject matter are in the same orbit of the projective group of coordinate transformations. However, objects that violate this assumption, either by moving, or by being away from this planar surface, are not rendered in a true and accurate way. This effect can of course be used creatively, artistically, or the like.
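The "same orbit of the projective group" idea above can be sketched in a few lines of code. This is my own minimal illustration (not Mann's implementation), assuming numpy; the homography values are made up. Two pictures of a static planar scene are related by a 3x3 homography acting on homogeneous pixel coordinates, and these homographies form a group: composing or inverting them gives another homography, which is why all such pictures lie in one orbit.

```python
import numpy as np

def apply_homography(H, points):
    """Map Nx2 pixel coordinates through a 3x3 homography H."""
    pts_h = np.hstack([points, np.ones((len(points), 1))])  # to homogeneous
    mapped = pts_h @ H.T
    return mapped[:, :2] / mapped[:, 2:3]                   # back to pixels

# An illustrative (made-up) homography relating two views of a plane:
H1 = np.array([[1.10, 0.02,  5.0],
               [0.01, 0.95, -3.0],
               [1e-4, 2e-4,  1.0]])
H2 = np.linalg.inv(H1)   # the inverse is also a homography (group structure)

pts = np.array([[10.0, 20.0], [100.0, 50.0]])
round_trip = apply_homography(H2, apply_homography(H1, pts))
print(np.allclose(round_trip, pts))  # True: mapping there and back
```

Warping each frame through its estimated homography into a common coordinate system is what lets the frames be "stitched" seamlessly.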

The easiest way to make a Type II panorama is to move the camera parallel to a planar surface, e.g. as a "drive-by shooting" (a photo/video shoot from a moving vehicle, e.g. if you're a passenger in a car or train or bus), or using a wagon or bicycle or track or rail to move the camera in a controlled way.

Alternatively you can capture a planar subject in motion and just hold the camera still, as the object moves.

Type I versus Type II panoramas: The difference between the Type I and Type II panorama is depicted in the diagram below:

(from Chapter 6 of Intelligent Image Processing, Wiley 2001, author S. Mann)

The Type I panorama:

In the Type I panorama, the camera's center-of-projection remains stationary and the camera turns to face in a variety of different directions, giving rise to a very simple mathematical relationship between the images. Below is a simplified depiction of this concept using, for simplicity, the situation of a one-dimensional picture of two-dimensional subject matter:
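That "very simple mathematical relationship" can be written out for the 1-D case. The sketch below is my own illustration with made-up parameters, not from the text: under a pure rotation about the center of projection, image coordinate x maps to x' = (a*x + b)/(c*x + 1), and composing two such maps gives another of the same form, with parameters that compose like the 2x2 matrices [[a, b], [c, 1]].

```python
import numpy as np

def projective_1d(a, b, c):
    """1-D projective coordinate transformation x -> (a*x + b)/(c*x + 1)."""
    return lambda x: (a * x + b) / (c * x + 1)

f = projective_1d(1.2,  0.3, 0.05)   # illustrative (made-up) parameters
g = projective_1d(0.9, -0.1, 0.02)

# Parameters of the composition g∘f, via the corresponding 2x2 matrices:
Mf = np.array([[1.2,  0.3], [0.05, 1.0]])
Mg = np.array([[0.9, -0.1], [0.02, 1.0]])
M = Mg @ Mf
h = projective_1d(M[0, 0] / M[1, 1], M[0, 1] / M[1, 1], M[1, 0] / M[1, 1])

x = 2.0
print(abs(g(f(x)) - h(x)) < 1e-12)   # True: composition stays in the group
```

The full 2-D case works the same way with 3x3 matrices instead of 2x2.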

Here are some examples of Type I panoramas:

In the last example, note the effects of motion. The camera is panning from right to left (West to East), from the roof of a building on Dundas Street West, facing the AGO (Art Gallery of Ontario). The Eastbound streetcar exhibits prograde ("forward" with respect to the camera) motion, moving with the camera, and thus appears dilated (stretched). The Westbound streetcar exhibits retrograde motion, moving against the direction of the camera, and thus appears contracted (squashed). While in reality both streetcars are the same length, the Eastbound streetcar occupies more image space, and the Westbound less, than they would if stationary.

The Type II panorama:

In the Type II panorama, the subject matter is planar, ideally. In practice there may be some depth, but it is best if the primary subject matter of interest is largely planar (flat). Interesting things can happen, of course, to things that are outside that flat plane, as well as to things that are moving. Here are some examples of Type II panoramas:

Note that the effects of prograde or retrograde motion may still be present. Here for example (below), we see the effects of retrograde motion on a passing streetcar:

We might also hold the camera still and let the relative motion arise from the moving streetcar, or let there be a combination of self motion (camera motion, i.e. egomotion) and subject-matter motion. Here's a chance to have some fun:

The Type III panorama:

The Type III panorama combines Type I and Type II portions of the image. Here are some examples of Type III panoramas. In the first example, I'm a passenger in a contractor's van, moving along at a constant speed, scanning the street; then, when the van stops at a red light, I swing the camera around to show the vehicle interior, thus making a Type III selfie:

In the last example, I'm walking along a drugstore shelf, and then I stop and pan the camera around. The left side of the image is Type I and the right side is Type II.

Here are some additional examples (link).

Lab assignment:

To understand VideoOrbits, read the textbook, Chapter 6[4] and review the lecture notes and videos, as well as research papers and links, e.g. http://wearcam.org/orbits/.

Part A: estimation of image shift (translation). Construct two images that are merely shifted (translated) versions of each other. You can do this by cropping a large image into two smaller overlapping regions. Devise a simple way to estimate the translation (shift). You can use the approach outlined in class, for example. Alternatively, you can consider Fourier cross-correlation, phase-only filtering, or the like, as outlined in the textbook [4].
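As one concrete way to approach Part A, here is a sketch using Fourier phase correlation (one of the options mentioned above). It is only an illustration: the random "image", crop offsets, and shift values are made up, and your own images and method may differ.

```python
import numpy as np

# Stand-in for a large image (Part A says to crop a large image):
rng = np.random.default_rng(0)
big = rng.random((128, 128))

# Two 64x64 overlapping crops; img2 is img1 shifted by (dy, dx) = (5, 12).
img1 = big[10:74, 10:74]
img2 = big[15:79, 22:86]

# Phase correlation: the inverse FFT of the normalized cross-power
# spectrum has a sharp peak at the translation between the two images.
F1 = np.fft.fft2(img1)
F2 = np.fft.fft2(img2)
R = F1 * np.conj(F2)
R /= np.abs(R) + 1e-12          # keep phase only (phase-only filtering)
corr = np.fft.ifft2(R).real
dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
print(dy, dx)                   # estimated shift (wraps modulo image size)
```

Note that the peak location wraps around: a shift larger than half the image size shows up as a negative shift (image size minus the peak index).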

Part B: Construct some Type III panoramas. Explain your findings, and try to show examples that illustrate your understanding of the differences and similarities between Type I and Type II panoramas.

Post your results on the Type III Panorama Instructable.

Bibliographic reference citations:

[1] Mann was the first to propose and implement an algorithm to estimate a camera's response function from a plurality of differently exposed images of the same subject matter. He was also the first to propose and implement an algorithm to automatically extend dynamic range in an image by combining multiple differently exposed pictures of the same subject matter. High-dynamic-range imaging (HDR): "The first report of digitally combining multiple pictures of the same scene to improve dynamic range appears to be Mann." (Robertson et al.)

[2] 1993: Mann was the first to produce an algorithm for automatically combining multiple pictures of the same subject matter, using algebraic projective geometry, to "stitch together" images using automatically estimated perspective correction. This is called the "Video Orbits" algorithm.

[3] Video orbits

[4] Intelligent Image Processing, Wiley, 2001.

@book{mann2001,
   author = "Steve Mann",
   title = "Intelligent Image Processing",
   publisher = "John Wiley and Sons",
   pages = "384",
   month = "November 2",
   year = "2001",
   note = "ISBN: 0-471-40637-6",
}