This article was written as an ASCII text document by Steve Mann and Charles
Wyckoff in 1991, before HTML existed.
Here it is encapsulated in an
HTML document for World-Wide Web compatibility.
For the original text, see xr.txt
---------------------------------------------------
Extended Reality
Steve Mann and Charles Wyckoff, MIT 4-405, 1991.

XR is most widely known as eXtended Response photographic film [Wyckoff 1962], a color film designed for black-and-white photography, in which the extra dimension of color, rather than conveying color perception, is used to extend our ability to see extreme light and shade in black-and-white. A goal of this film is to extend vision beyond what a camera can normally see. We now generalize this concept toward what we call eXtended Reality.

A simple form of X-Reality is generalized real-time X-Response that works to interactively extend human senses about 30 to 60 times a second, i.e. in approximately 16 to 33 milliseconds, rather than the hours it takes to shoot XR film and then run the C-22 chemical development processing to see the results. In this new sense of XR we can see the world through a special viewer made from a VR (Virtual Reality) head-mounted display connected to special television cameras. The cameras rapidly capture different exposures, and an electric system calculates the true amount of light present at each pixel (picture element) location. Rather than display in pseudocolor as XR film does, the calculated true values are displayed, so that a person wearing an X-Reality system has true extended vision. Each pixel operates as if it were a light meter, measuring a true quantity of light dependent only on the spectral response of the sensor.

XR can also be used beyond extending the dynamic range of human vision in visible light, to extended spectra of infrared, ultraviolet, and other kinds of sensing such as radio waves and sound waves. Waves may be measured by various computational means, such as by way of a lock-in amplifier, which performs computations by way of electric circuits, including multipliers and electric wave filters. Consider, for example, the PAR 124A lock-in amplifier.
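The computation inside such a lock-in amplifier can be sketched numerically. This is a minimal digital illustration of the idea only: multiply the input by a reference wave, then low-pass filter (here, a simple average) to recover the component at the reference frequency. The actual instrument does this with analog multipliers and electric wave filters; all numbers below are illustrative.

```python
import numpy as np

# Sketch of a phase-sensitive detection: multiply by a reference wave,
# then low-pass filter (here, averaging) to pull a weak periodic signal
# out of much stronger noise.
fs = 10000.0                        # sample rate, Hz (illustrative)
t = np.arange(0.0, 10.0, 1.0 / fs)  # ten seconds of samples
f_ref = 100.0                       # reference frequency, Hz

signal = 0.05 * np.sin(2 * np.pi * f_ref * t)   # weak signal of interest
rng = np.random.default_rng(0)
noisy = signal + rng.normal(0.0, 1.0, t.size)   # buried in much stronger noise

# Multiply by in-phase and quadrature references, then average (low-pass).
i_out = 2 * np.mean(noisy * np.sin(2 * np.pi * f_ref * t))
q_out = 2 * np.mean(noisy * np.cos(2 * np.pi * f_ref * t))
amplitude = np.hypot(i_out, q_out)  # recovered amplitude, approximately 0.05
```

The noise, being uncorrelated with the reference, averages toward zero, while the in-phase component of the signal survives; this is the computation that Doppler radar and sonar embody at baseband.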
The PAR 124A has three PSD (Phase-Sensitive Detector) settings: (1) Low Drift; (2) Normal; and (3) High Dynamic Range. Doppler radar and Doppler sonar embody this kind of computation, and in this sense we can attain a form of High Dynamic Range vision in the frequency ranges of radio waves and sound waves, as a form of XR. High Dynamic Range sensing in general can be done by measuring any quantity at a variety of different sensitivities simultaneously, so that when extremely strong and weak signals are intermingled in various ways, the true nature of the signal can be analyzed, understood, recorded, and reproduced in various ways. What that means is that XR can simultaneously extend both our dynamic range response and our spectral response, i.e. simultaneously sense very weak and very strong signals across widely varying spectra.

Many in the lab have a fascination with Nessie and like to do underwater observations in Scotland's Loch Ness with new ways of combining sonar and photography [Edgerton and Wyckoff, 1978]. Our sonar technologies help probe the depths of Loch Ness and other bodies of water [Edgerton, Wyckoff, and Rines, 1979]. XR can help extend our sensory intelligence on land, in air, or in water. We developed new marine radar technologies for seeing growlers (small iceberg fragments) in a marine environment, greatly improving the safety of navigating vessels [Mann and Haykin, 1991]. For many years members of our lab have been collaborating with Jacques Cousteau on underwater photography and sonar [Edgerton and Cousteau, 1959], and we have a long history of designing underwater cameras and light sources, and new ways of seeing and exploring underwater: studying marine life and erosion, and exploring shipwrecks and underwater ruins.

XR is for more than just helping us see. We're making XR into something that is cybernetic.
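Before turning to what we mean by cybernetic, the High Dynamic Range sensing principle described above, measuring the same quantity at several sensitivities at once and trusting only readings that are neither saturated nor lost near the noise floor, can be sketched numerically. This is a minimal illustration under an assumed linear-response model normalized to [0, 1]; the function name, thresholds, and numbers are ours:

```python
import numpy as np

def estimate_signal(readings, sensitivities, floor=0.05, sat=0.95):
    """Combine readings of the same quantity taken at several sensitivities,
    assuming a linear response normalized to [0, 1].  Readings that are
    saturated (> sat) or lost near the noise floor (< floor) are ignored;
    the rest are scaled back by their sensitivity and averaged."""
    num = np.zeros_like(readings[0], dtype=float)
    den = np.zeros_like(readings[0], dtype=float)
    for reading, s in zip(readings, sensitivities):
        valid = (reading > floor) & (reading < sat)
        num += np.where(valid, reading / s, 0.0)
        den += valid
    return num / np.maximum(den, 1)

# Two pixels, one very bright and one very dim, each "photographed"
# at a short and a long exposure (exposure time acting as the sensitivity):
true_light = np.array([10.0, 0.2])
short = np.clip(true_light * 0.05, 0.0, 1.0)  # bright pixel OK, dim pixel lost
long_ = np.clip(true_light * 0.5, 0.0, 1.0)   # bright pixel saturated, dim pixel OK
recovered = estimate_signal([short, long_], [0.05, 0.5])  # recovers [10.0, 0.2]
```

The same combination applies whether the "sensitivities" are camera exposure times, amplifier gains, or the sensitivity settings of a lock-in amplifier.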
Norbert Wiener defines "cybernetic" in terms of feedback loops [Wiener 1948], and the word originates from the Greek "kybernetes", meaning "helmsman", i.e. one who steers a vessel upon the water. Cybernetic XR allows us to experience and interact with computational media in a closed-loop feedback system. Consider, for example, a small recreational vessel, i.e. a vessel that responds quickly, upon which is a Doppler sonar apparatus having an output to a VR (Virtual Reality) display. As we navigate the vessel, we quickly see, upon the head-mounted display, the effects of our navigation. This is one form of XR in the sense that we wish to see it expand.

SWIM (Sequential Wave Imprinting Machine), from the 1970s [Mann 1985, 1986], is another example of Cybernetic XR, as it allows us not only to see, but to directly interact with, radio waves and sound waves in air, water, or solid matter. When applied to sensors like radar and sonar, SWIM is a meta sensor, in the sense that it senses the sensing of sensors and their means of sensing.

As a form of research, teaching, recreation, or play, consider a small vessel: either one that a person can ride upon, or an even smaller vessel, too small to hold a person but large enough to house a cathode ray oscillograph and perhaps other small additional items, within a long narrow trough lined with rubber sheets to keep the water from leaking out and also to absorb stray sound waves. The oscillograph is moved back and forth upon the vessel, while a waveform, the output of a PSD (Phase Sensitive Detector), is presented to the vertical deflection plates of the cathode ray tube. A Doppler radar or sonar aimed at the vessel, in which the baseband output of the radar or sonar deflects the spot on the tube, will give rise to what we call Cybernetic XR, i.e.
an ability of a person to interactively explore a "world" in which electromagnetic radio waves or sound waves are interactively visible. This is distinct from the real-time exploration of virtual worlds [Sutherland 1968], both in the sense that the lag or delay between the computationally generated XR content and the movement of the vessel is very close to zero, and in the sense that with SWIM the interaction is part of a feedback loop. Rather than just looking at some virtual thing, what we really want to look at is something that changes due to our actions. Computation is effectively instantaneous because the electric circuits performing the computation are instantaneous, especially if the deflection plates are driven directly, in which case the bandwidth is extremely high. A cathode ray tube running as vector graphics in this way responds within millionths of a second.

In the above example, the spatial scale is reduced 2x due to the path doubling of the wave travelling to the vessel and back. Were that considered a problem, we merely place one antenna or sonar transducer upon the vessel, to move with it, and another at a stationary point, such that the two move toward and away from one another, thus removing the path doubling.

Unlike VR, which is typically a solo experience, XR is a shared experience if, for example, we have the vessel in a darkened room such as Strobe Lab. As our eyes grow accustomed to the dark, one person can move the vessel and its associated oscillograph (and possibly an antenna or sonar transducer) back and forth, while a number of people can see the SWIM waves. And to be truly a shared experience, consider a somewhat larger-scale SWIM made from a very long linear array of lamps, each lamp illuminating when a particular voltage is reached. This is done by way of a digital computer that switches the lamps on and off in accordance with a voltage or calculated signal level.
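The effect of path doubling on the spatial scale can be sketched numerically. This is an assumed model, for illustration only: with the transducer stationary and the vessel reflecting, the phase accumulates over the round trip, so the PSD output pattern repeats every half wavelength; mounting one transducer on the vessel restores the full wavelength. The helper function and constants are ours:

```python
import numpy as np

wavelength = 0.1                   # illustrative: a ~3 GHz radar in air, metres
k = 2 * np.pi / wavelength
x = np.linspace(0.0, 0.2, 2001)    # vessel positions along its track, metres

psd_reflected = np.cos(2 * k * x)  # stationary transducer, reflecting vessel
psd_one_way = np.cos(k * x)        # one transducer on the vessel, one stationary

def spatial_period(pattern, positions):
    """Estimate the repeat distance from positive-going zero crossings."""
    crossings = np.nonzero((pattern[:-1] < 0) & (pattern[1:] >= 0))[0]
    return float(np.mean(np.diff(positions[crossings])))

half = spatial_period(psd_reflected, x)  # about wavelength / 2 = 0.05 m
full = spatial_period(psd_one_way, x)    # about wavelength = 0.1 m
```

Halving the spatial scale is not necessarily a defect; it simply means the fringes a viewer sees are twice as fine as the wavelength would suggest.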
In this way much larger groups of people can participate, on land or in water, since their body movements will affect the SWIM. At the pool, for example, we have a shared computational experience in which none of the players need wear a head-mounted display, and thus there is no need for the thousands of volts required to run a cathode-ray tube upon each player's head in the water. If the SWIM is large enough, it can be seen underwater, or at least in the water, providing a shared immersive XR experience.

This experience is made into a game by suggesting a particular goal, such as maintaining position on the SWIM. Thus we can ask a swimmer to try to swim along lines of constant Doppler shift, i.e. to swim in such a way that the spot on the SWIM does not move. With a group of swimmers the game becomes a collaborative game of sorts, because, for example, two swimmers can move opposite each other, perpendicular to lines of constant Doppler, and cancel out each other's Doppler shift.

SWIM is often a one-dimensional display, so in some ways we can regard this game as a one-dimensional video game. If we create a complex-valued PSD and display the result on the Argand plane, i.e. as an "X-Y" oscillograph, we turn the experience into a two-dimensional video game. A large-screen oscillograph is made from two moving-coil loudspeakers, each driving a very small mirror, so as to deflect a laser beam in each of the two axes, thus providing a large-screen flying spot visible from anywhere in the pool. To move the spot counter-clockwise, a player advances toward the sonar transducer, or, for clockwise, retreats. To move the spot outward, a player spreads their body to present a greater sonar cross-section; to move the spot inward, the player collapses their body to a lesser sonar cross-section. The oscillograph can be replaced with any kind of display that accepts two analog input voltages.
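The behaviour of the two-dimensional spot can be sketched with a simple assumed model of the complex-valued PSD output: phase set by the round-trip range to the player, magnitude set by the player's sonar cross-section. The function name and constants are ours, for illustration:

```python
import numpy as np

wavelength = 0.03                  # illustrative: ~50 kHz sonar in water, metres
k = 2 * np.pi / wavelength

def psd_spot(r, cross_section):
    """Complex PSD output: phase from the round-trip range r to the player,
    magnitude from the player's sonar cross-section (echo strength)."""
    return cross_section * np.exp(-2j * k * r)

# A player advancing toward the transducer (range decreasing) makes the
# spot rotate counter-clockwise on the X-Y display:
r = np.linspace(1.0, 0.99, 50)     # player advances 1 cm
spots = psd_spot(r, 1.0)
turn = np.angle(spots[1] / spots[0])  # positive angle: counter-clockwise step

# Spreading the body (greater cross-section) moves the spot outward,
# collapsing it moves the spot inward:
outer = abs(psd_spot(1.0, 2.0))
inner = abs(psd_spot(1.0, 0.5))
```

The real and imaginary parts of this spot are exactly the two analog voltages that drive the X and Y axes of the oscillograph, or the mirrors of the laser flying-spot display.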
For example, when a video game that has two game-port inputs is connected to the real and imaginary outputs of the PSD, any of a variety of games can be played using the sonar as input to the video game. With a sufficiently large television display, the game can be seen from anywhere in the pool.

Consider now a vessel navigating toward a growler, in which the operator of the vessel has such a SWIM affixed to the mast. Let us suppose that the purpose is educational or recreational rather than practical, i.e. that the growler is approached at very slow speed, so as not to present any safety hazard. Suppose that an additional "player", so to speak, boards the growler (with appropriate safety precautions) while holding a paddle, so as to be able to affect the growler's trajectory in the water, while viewing the mast of the vessel from the different perspective afforded by the growler. In this way, as a form of education or fun activity, both parties are able to navigate toward and around each other, both watching the mast of the vessel, and thus both seeing the waveforms as the computed Doppler baseband output of the lock-in amplifier / PSD. In this way the two players learn a great deal about sound wave propagation in water, or radio wave propagation in air, as the case may be, due to their own navigational forces. Indeed a plurality of vessels or growlers or other floating objects can be boarded by various individuals, each seeing a common display, or each wearing an eyeglass upon which is displayed a common computational electrical signal.

Thus XR is not merely about being able, for example, to see growlers and navigate around them, or to see a dynamic range of a hundred million to one; rather, it is a truly magical experience in which we can interact with physics itself at the speed of light or the speed of sound, rather than at the much slower speed at which a conventional raster-graphics computer screen is updated.
XR extends our senses, meta senses, shared senses, and shared meta senses, and will ultimately extend our intelligence. We believe that XR should be broadened toward this kind of Cybernetic XR, as a form of extended intelligence and extended collective intelligence. Thus we shall use the term XR in this broader sense going forward. XR goes beyond what we can experience in a VR display, to include any kind of sensing (technology) + sensory (human) interaction with reality. XR is any combination of a virtual environment with reality, where the virtual environment is responsive to a real or complex-valued output from reality, by way of real-time computation. It is generally assumed that the virtual environment remains aligned with, or strongly coupled to, some aspect of reality: e.g. it may simply be an extended-response display of reality itself, or it may be a fun or playful interpretation of reality.

Dedicated in memory of Harold Edgerton. We also thank John Benedict for welcoming our crazy experiments at the MIT pool, and everyone at Strobe Lab.