What is a monocular cue that helps to determine the distance of a distant object? Motion parallax

Learning to See

RON BRINKMANN, in The Art and Science of Digital Compositing (Second Edition), 2008

Motion Parallax

Motion parallax refers to the fact that objects moving at a constant speed across the frame will appear to move a greater amount if they are closer to the observer (or camera) than if they were at a greater distance. This holds whether it is the object itself that moves or the observer/camera that moves relative to the object. The effect arises because a given physical displacement spans a larger fraction of the camera's field of view the closer the object is. An example is shown in Figure 2.29. An object 100 m away may move 20 m in a certain direction and cross only 25% of the field of view, yet the same 20 m displacement in an object only 40 m away will carry it completely out of frame.
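To make the geometry concrete, here is a minimal Python sketch of the example above. The 45-degree horizontal field of view is an assumed value (not given in the text) chosen so the numbers match the figure's scenario, and the object is assumed to start at the centre of frame.

```python
import math

def fov_fraction(lateral_move_m, depth_m, fov_deg):
    """Angular displacement of an object, as a fraction of the camera's field of view."""
    angle_deg = math.degrees(math.atan2(lateral_move_m, depth_m))
    return angle_deg / fov_deg

FOV_DEG = 45.0  # assumed horizontal field of view
for depth in (100.0, 40.0):
    frac = fov_fraction(20.0, depth, FOV_DEG)
    # Starting at frame centre, the object exits once it crosses half the FOV.
    status = "exits the frame" if frac > 0.5 else "stays in frame"
    print(f"20 m move at {depth:.0f} m -> {frac:.0%} of FOV ({status})")
```

Run as-is, this prints roughly 25% of the field of view for the object at 100 m and 59% (out of frame) for the object at 40 m, matching the example in the text.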


Figure 2.29. Motion parallax diagram.

Read full chapter

URL: https://www.sciencedirect.com/science/article/pii/B978012370638600002X

Space Perception

Colin Ware, in Information Visualization (Fourth Edition), 2021

Perceiving Patterns in 3D Trajectories

A common problem in geospatial visualization is to understand the path of a particle, animal, or vehicle through space. A simple line rendering provides only 2D information and is therefore unsuitable. Using motion parallax or stereoscopic viewing will help, as will periodic drop lines to a ground plane. In addition, rendering the trajectory as a tube or box adds perspective and shape-from-shading cues, especially if rings are drawn around the tube at periodic intervals. A further advantage of a box trajectory is that it can also convey roll information. Fig. 7.48 shows the trajectory of a humpback whale carrying out a bubble-net feeding maneuver (Ware, Arsenault, Plumlee, & Wiley, 2006).


Figure 7.48. The trajectory of a humpback whale bubble-net feeding is shown using an extruded box.

[G7.20] To represent 3D trajectories, consider using shaded tubes or box extrusions, with periodic bands to provide orientation cues. Apply motion parallax and stereoscopic viewing, if possible.
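As an illustration of the drop-line technique in the guideline above, here is a minimal matplotlib sketch; the spiral path and all names are hypothetical, and a shaded tube or box extrusion would need a full surface mesh on top of this.

```python
import numpy as np
import matplotlib.pyplot as plt

# A made-up spiral standing in for a tracked 3D trajectory.
t = np.linspace(0, 4 * np.pi, 200)
x, y, z = np.cos(t), np.sin(t), 0.1 * t

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.plot(x, y, z, lw=2)                      # the trajectory itself
for i in range(0, len(t), 20):              # periodic drop lines to the ground plane
    ax.plot([x[i], x[i]], [y[i], y[i]], [0.0, z[i]], color="gray", lw=0.8)
ax.set_zlim(0.0, float(z.max()))
plt.show()
```

Rotating the view interactively supplies the motion parallax cue that the guideline recommends.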

Computer simulations used in science and engineering often produce 3D tensor fields. In a 3D vector field, each point in space carries the attributes of direction and speed; in a tensor field, each point can have many more attributes, representing phenomena such as shear or twist. The visualization of these fields is a specialized topic that we will not delve into here; however, tensor fields can be visualized by adding attributes such as shape and color to extruded trajectories. A good starting place for work in this area is Delmarcelle and Hesselink (1993).

Read full chapter

URL: https://www.sciencedirect.com/science/article/pii/B9780128128756000074

Getting the Information: Visual Space and Time

Colin Ware, in Visual Thinking for Information Design (Second Edition), 2022

Abstract

We begin with an introduction to depth cues, the means whereby we process distances from our viewpoint. The different kinds of depth cues are described, including linear perspective, occlusion, stereoscopic depth, and motion parallax. Incorporating depth cues can enable us to design visualizations that seem three dimensional. However, depth cues are not an all-or-nothing design choice: we can choose to use linear perspective, or not, and we can choose to use stereoscopic viewing, or not. Cognitive task requirements should determine which cues to incorporate in a design. In a 3D environment, our viewpoint determines how information is processed by the brain, or whether it can be seen at all. The picture-plane dimensions of visual space are very different from the depth dimension in terms of how information is processed by the brain. The idea of 2.5D design is introduced as a way of specifically taking the structure of visual space into account, introducing elements of 3D judiciously to enhance a mostly 2D information display. Ultimately, design decisions should stem from the goal of creating visualizations that are efficient cognitive tools. To this end, we start the discussion of interactivity in visualization with the typical costs associated with getting knowledge, including eye movements, visual processing, and the time to navigate an information space by means of mouse clicks, walking, or zooming.

Read full chapter

URL: https://www.sciencedirect.com/science/article/pii/B9780128235676000057

Visual Development: Infant

R.N. Aslin, in International Encyclopedia of the Social & Behavioral Sciences, 2001

6 Depth and Binocular Rivalry

The relative distance (depth) of objects can be appreciated using three different sources of information: motion, retinal disparity, and pictorial cues. A rapidly approaching (looming) stimulus elicits a blink response in one-month-olds, and motion parallax (more rapid image speed for near than for far objects) enables three-month-olds to discriminate small differences in object distance. Thus, sensitivity to depth from motion is present in very early infancy and does not require the use of both eyes (Kellman and Arterberry 1998).

Retinal disparity refers to the subtle differences in the images projected to the two retinas from an object at near (less than five meters) viewing distances. FPL and VEP studies have demonstrated that sensitivity to retinal disparity does not emerge until three to four months after birth. Moreover, the smallest retinal disparity that is just discriminable by infants improves very rapidly between three and five months of age, progressing from no sensitivity to nearly adult values (less than one minute of arc) in this age range (Birch et al. 1985).

During this same age range, infants become sensitive to binocular rivalry: the perceptual conflict induced by presenting grossly different images to the two retinas (e.g., horizontal stripes in one eye and vertical stripes in the other). Binocular rivalry occurs when the discrepant retinal images cannot be fused into a single percept. Prior to three months of age, infants appear to have a much greater tolerance for fusing discrepant images than adults, perhaps because of their poor acuity and contrast sensitivity (Birch et al. 1985).

In adults, failure to align both foveas onto a stimulus typically leads to binocular rivalry and prevents stereopsis (the appreciation of depth from retinal disparity). Some individuals, including some infants, have an ocular misalignment (strabismus) that eliminates fusion and stereopsis. If uncorrected in infancy, this misalignment can result in a permanent loss of the capacity for stereopsis, even if the eyes are surgically realigned in childhood (Banks et al. 1975). Thus, there is a sensitive period during which a normally developing neural mechanism for stereopsis, present by four months of age, can be permanently disabled by subsequent abnormal binocular experience (strabismus). These same processes were earlier demonstrated in the visual cortex of cats and monkeys (see Neural Plasticity in Visual Cortex), and subsequently confirmed behaviorally.

Pictorial cues to depth are contained in flat (two-dimensional) representations of actual (three-dimensional) scenes. These cues include shading, occlusion, and linear perspective (e.g., receding railroad tracks that converge in the picture plane). A preferential reaching technique was developed by Albert Yonas to determine, under monocular viewing conditions, whether infants perceive the depth information in pictures. At approximately seven months of age, infants under these testing conditions become sensitive to a variety of pictorial cues to depth, as indicated by their reliable reaching for the apparently nearer picture (Yonas and Granrud 1985).

Read full chapter

URL: https://www.sciencedirect.com/science/article/pii/B0080430767036135

Space Perception

Colin Ware, in Information Visualization (Third Edition), 2013

Judging Relative Positions of Objects in Space

Judging the relative positions of objects is a complex task, performed very differently depending on the overall scale and the context. When very fine depth judgments are made in the near vicinity of the viewer, such as are needed to thread a needle, stereopsis is the strongest single cue. Stereoscopic depth perception is a superacuity and is optimally useful for objects held at about arm's length. For these fine tasks, motion parallax is not very important, as evidenced by the fact that people hold their heads still when threading needles.

In larger environments, stereoscopic depth perception has a minimal role for objects at distances beyond 30 m. Instead, when we are judging the overall layout of objects in a larger environment, known object size, motion parallax, linear perspective, cast shadows, and texture gradients all contribute to our understanding, depending on the exact spatial arrangement.

Gibson (1986) noted that much of size constancy can be explained by a referencing operation with respect to a textured ground plane. The sizes of objects that rest on a uniformly textured ground plane can be obtained by reference to the texture element size. Objects slightly above the ground plane can be related to the ground plane through the shadows they cast. In artificial environments, a very strong artificial reference can be provided by dropping a vertical line to the ground plane.

Because 3D environments can be so diverse and used for so many different purposes, no specific additional guidelines are given here relating to judgments of object position in 3D. The optimal mix is a complex design problem, not something for simple guidelines. All of the depth cues we have been discussing can be applied and should be considered in a design solution.

Read full chapter

URL: https://www.sciencedirect.com/science/article/pii/B9780123814647000077

Interface to the Virtual World — Input

William R. Sherman, Alan B. Craig, in Understanding Virtual Reality, 2003

The Head

The head is tracked in almost every VR system, although not always the full 6-DOF. Most VR systems need to know something about the user's head orientation and/or location to properly render and display the world. Whether location or orientation information is required depends on the type of display being used.

Head-based displays require head orientation to be tracked. As users rotate their heads, the scenery must adapt and be appropriately rendered in accordance with the direction of view, or the users will not be physically immersed. Location tracking, while not essential, enhances the immersive quality of these VR experiences. It helps provide the sense of motion parallax (the perception that an object's position has changed because it is viewed from a different point). This cue is very important for objects that are near the viewer. Some VR experiences avoid the need for location tracking by encouraging or requiring the user to continuously move (virtually) through the environment; this movement through space also provides depth information from motion parallax. There may be other interactions that benefit from tracking head location, so applications that lack head location tracking might be harder to use.

Stationary VR visual displays, such as a computer monitor or a projection screen, must determine the relative position between the eyes of the user and the screen. Since the screen is stationary, its position does not need to be tracked. A good approximation of eye position (the bridge of the nose) can be made from the head location data. For the display of monoscopic images, this information is enough, but for the proper display of stereoscopic images, the system must have the location of each eye to render the appropriate views. Unless a separate tracker is located near each eye, the system will need information about head orientation as well as head location to calculate the location of each eye.

Hand-based VR displays, those that can be held in the hand, similar to PalmPilots and Gameboys, are like stationary displays in that the location of the user's head is more important than its orientation. Again, the relative position between the screen and the eyes must be known. Since the display is also mobile, both the head and display must be tracked to determine the viewing vector.

Read full chapter

URL: https://www.sciencedirect.com/science/article/pii/B9781558603530500045

Input

William R. Sherman, Alan B. Craig, in Understanding Virtual Reality (Second Edition), 2018

The Head

The head is tracked in almost every VR system, although not always the full 6-DOF. A typical VR system needs to know something about the user’s head orientation and/or location to properly render the world from the user’s perspective. Whether the minimally required information is location or orientation depends on the type of display. Perspective rendering, of course, is based on where the sensory organs are (the eyes, the ears, the nose, etc.), which can be calculated as offsets from the coordinate system assigned to the head when the orientation is known.
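A minimal sketch of that offset computation, assuming the head pose is reported as a position plus a 3×3 rotation matrix; the function name, axis convention, and default interpupillary distance are illustrative, not from the book.

```python
import numpy as np

def eye_positions(head_pos, head_rot, ipd=0.064):
    """Estimate world-space eye positions from a tracked head pose.

    head_pos: (3,) tracked head position; head_rot: (3, 3) head orientation matrix;
    ipd: interpupillary distance in metres (0.064 m is a commonly cited average).
    """
    half = ipd / 2.0
    # Eyes sit half the IPD to either side along the head's local x axis; a
    # forward/down offset from the tracker mount point could be added the same way.
    left = head_pos + head_rot @ np.array([-half, 0.0, 0.0])
    right = head_pos + head_rot @ np.array([half, 0.0, 0.0])
    return left, right
```

Without the orientation matrix, only a single approximate view position (e.g., the bridge of the nose) can be derived, which suffices for monoscopic but not stereoscopic rendering.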

HBDs require tracking at least the head’s orientation because, as users rotate their heads, the scenery must adapt and be appropriately rendered in accordance with the direction of view, or the users will not be physically immersed. Location tracking, while not always essential, enhances the immersive quality of these VR experiences. Location tracking helps provide motion parallax (the perception of an object’s position in three-dimensional space based on sensing from different positions) as the user moves their head. The motion cue is especially important for objects that are near the viewer. Some VR experiences avoid the need for location tracking by encouraging or requiring the user to continuously travel (virtually) through the environment. This movement through space also provides spatial information from motion parallax. Some interface interactions benefit from tracking head location, and as a consequence, applications that lack head location tracking can be harder to use.

Stationary VR visual displays, such as a computer monitor, a projection screen, or a tiled display, require the relative position between the eyes of the user and the screen to calculate perspective rendering. For the display of monoscopic images, the bridge of the nose can be used to approximate the view position, but for the proper display of stereoscopic images, the system must have the location of each eye to render the appropriate view for each eye. This is where tracking head orientation is important for stationary displays, unless a separate tracker is located near each eye. Some systems strike a compromise whereby the system presumes all viewers maintain a horizontal gaze, and thus renders stereoscopic images that can be viewed from a large region within the display, based on a point of view most suitable for the group. The compromise slightly degrades the view provided to the tracked user, which results in small discontinuities at the borders between viewing surfaces [Febretti et al. 2014].

Hand-based VR displays, those that can be held in the hand, such as a smartphone or tablet (held away from the face), are like stationary displays in that the location of the user’s head is more important than its orientation. Ideally, for proper perspective rendering, the relative position between the screen and the eyes is required. However, this is often compromised, especially in Magic Lens-style AR displays which presume the eyes are a certain distance directly in front of the screen.

Read full chapter

URL: https://www.sciencedirect.com/science/article/pii/B9780128009659000040

The human visual system

David R. Bull, Fan Zhang, in Intelligent Image and Video Compression (Second Edition), 2021

2.7.2 Depth cues

There is no doubt that depth assessment is a dominant factor in our ability to interpret a scene; it also serves to increase our sense of engagement in displayed image and video content. Stereopsis, created through binocular human vision, is often credited as being the dominant depth cue, and has indeed been exploited, with varying degrees of success, in creating new entertainment formats in recent years. It is however only one of the many depth cues used by humans – and arguably it is not the strongest. A list of depth cues used in the human visual system is given below:

Our model of the 3D world: Top-down familiarity with our environment enables us to make relative judgments about depth.

Motion parallax: As an observer moves laterally, nearer objects move more quickly than distant ones (see the sketch after this list).

Motion: Rigid objects change size as they move away from or toward the observer.

Perspective: Parallel lines will converge at infinity – allowing us to assess relative depths of oriented planes.

Occlusion: If one object partially blocks the view of another it appears closer.

Stereopsis: Humans have two eyes and binocular disparity information, obtained from the different projections of an object onto each retina, enables us to judge depth.

Lighting, shading, and shadows: The way that light falls on an object or scene tells us a lot about depth and orientation. See Fig. 2.25.


Figure 2.25. Pits and bumps – deceptive depth from lighting.

Elevation: We perceive objects that are closer to the horizon as further away.

Texture gradients: As objects recede into the distance, consistent texture detail will appear to be finer-scale and will eventually disappear due to limits on visual acuity.

Accommodation: When we focus on an object, our ciliary muscles either relax (flattening the lens for distant objects) or contract (thickening the lens for closer objects). This provides an oculomotor cue for depth perception.

Convergence: For nearer objects, our eyes will converge as they focus. This stretches the extraocular muscles, giving rise to depth perception.
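To make the motion parallax entry above concrete, here is a small sketch under assumed simplified geometry (names illustrative): for an observer translating laterally at speed v, a static object at perpendicular distance d sweeps across the view at angular speed w ≈ v/d, so its distance can be recovered as d = v/w.

```python
import math

def depth_from_parallax(observer_speed_mps, angular_speed_deg_s):
    """Distance of a static object from lateral motion parallax (d = v / w).

    Assumes the observer translates laterally and the object is roughly
    abeam (at closest approach), where the angular drift w = v / d holds.
    """
    w = math.radians(angular_speed_deg_s)
    return observer_speed_mps / w

# A tree drifting past at 2 deg/s, seen from a car moving at 15 m/s,
# is roughly 430 m away; one drifting at 20 deg/s is only about 43 m away.
print(f"{depth_from_parallax(15.0, 2.0):.0f} m")
```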

It should be noted that certain depth cues can be confusing and can cause interesting illusions. For example, we have an in-built model that tells us that light comes from above (i.e., the sun is above the horizon), and our expectation of shadows reflects this top-down knowledge. Fig. 2.25 shows exactly this. The top diagram clearly shows alternating rows of bumps and pits, starting with a row of bumps at the top. The bottom diagram is similar except that it starts with a row of pits at the top. In reality, the only difference between the two diagrams is that the bottom one is the top one rotated by 180 degrees. Try this by turning the page upside down. This provides an excellent example of how hard-wired certain visual cues are and how they can lead to interesting illusions. Another example of an illusion driven by top-down processes is the hollow mask. In Fig. 2.26, we see what appears to be a normal picture of Albert Einstein; in reality it is a photograph of a concave mask. Our visual system is so highly tuned to faces that, even when our disparity cues conflict with it, we cannot help but see a convex face. The right picture shows that even when the mask is rotated to an angle where the features are distorted and it is clearly concave, it still looks like a convex face!


Figure 2.26. The hollow mask illusion.

Read full chapter

URL: https://www.sciencedirect.com/science/article/pii/B9780128203538000116

Visual Cues

Virginio Cantoni, ... Bertrand Zavidovique, in 3C Vision, 2011

Space Extraction from Egomotion

In the case of a mobile camera, space sensing and reconstruction by stereo or motion are even more closely related, whether conceptually or technically. Indeed, both approaches can exploit the camera displacement to recover given parameterized surfaces such as planes. Extensive research was carried out over the last 20 years on 3D motion and structure-from-motion estimation; see [149–154] among the precursors. Irani et al. [154] exemplify methods exploiting the above-mentioned parallax generated by motion (motion parallax, affine motion parallax, plane and parallax).

These methods mostly exploit the fact that depth discontinuities make it possible to separate camera rotation from translation. For instance, in approaches such as plane and parallax, knowing the 2D motion of an image region where variations in depth are not significant permits us to eliminate the effects of camera rotation. The translation is then recovered from the residual motion parallax.

A single example is now detailed to illustrate the peculiar duality between volume and dynamics sensed from egomotion. In [155], motion vectors of given lengths and directions are proven to lie at particular loci in the image. The location and form of these loci depend solely on the 3D motion parameters. Considering an optical-flow velocity field, equal vectors lie on conic sections. This result is valid for stereo disparity too. In [156], it is proven that the disparity is constant along a line of a rectified stereo pair and, over a horizontal plane, varies linearly as a function of depth. Then, in the 2D histogram of disparity values versus line index, the so-called v-disparity frame, a straight line of modes corresponds to a road, building, or obstacle plane. The computation was later generalized to the other image coordinate and to vertical planes, using the u-disparity. Note that both studies [155,156] foster isovalue curves, of velocity or disparity. The second promotes an extraction procedure in a novel projection space built on a set of line-disparity histograms. Bouchafa and Zavidovique [157] argue that such a process is general, because any move of a camera results in an apparent shift of pixels among images: disparity for a stereo pair, velocity for an image sequence. They transpose the v-disparity to motion by designing the so-called c-velocity frame. This frame likewise leads to voting schemes [158], which additionally render detection robust against optical-flow imprecision.

The camera model is again the classical pinhole model. The case considered is that of an onboard sensor moving on a smooth road (avoiding rapid bumps that would shift the FOE).

The egomotion obeys the same motion model for all still objects (Figure 2.78), defined by three translations T and three rotations Ω. Conversely, mobile obstacles stand out because they do not conform to this dominant model.


Figure 2.78. Camera and motion models.

Under such assumptions, the following classical equations hold (e.g., [159]):

(2.58) $u_t = \dfrac{-f\,T_X + x\,T_Z}{Z}, \qquad u_r = \dfrac{xy}{f}\,\Omega_X - \left(\dfrac{x^2}{f} + f\right)\Omega_Y + y\,\Omega_Z$
$v_t = \dfrac{-f\,T_Y + y\,T_Z}{Z}, \qquad v_r = \left(\dfrac{y^2}{f} + f\right)\Omega_X - \dfrac{xy}{f}\,\Omega_Y - x\,\Omega_Z$

where $w = [u, v]^T = [u_t + u_r,\; v_t + v_r]^T$ stands for the 2D velocity vector of the pixel $p(x, y)$ under focal length $f$.
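The reconstructed equations transcribe directly into code. The following sketch (with hypothetical names) evaluates the velocity of a pixel given its depth, the camera translation, and the camera rotation:

```python
def motion_field(x, y, Z, T, Omega, f):
    """2D velocity (u, v) of pixel (x, y) at depth Z under egomotion (T, Omega), per Eq. (2.58)."""
    TX, TY, TZ = T
    OX, OY, OZ = Omega
    ut = (-f * TX + x * TZ) / Z                              # translational part
    vt = (-f * TY + y * TZ) / Z
    ur = (x * y / f) * OX - (x * x / f + f) * OY + y * OZ    # rotational part
    vr = (y * y / f + f) * OX - (x * y / f) * OY - x * OZ
    return ut + ur, vt + vr
```

With Omega = (0, 0, 0) and T = (0, 0, TZ), this reduces to Eq. (2.59) below.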

With no loss of generality, assume a straight translational move $T = [0, 0, T_Z]^T$ of the camera in the Z direction; $[u, v]^T$ becomes:

(2.59) $u = \dfrac{T_Z}{Z}\,x, \qquad v = \dfrac{T_Z}{Z}\,y$

Suppose now the camera is observing a planar surface of equation:

(2.60) $n^T P = d$

with $n = [n_x, n_y, n_z]^T$ the unit vector normal to the plane and $d$ the distance from the plane to the origin. According to the notations of Figure 2.78, the corresponding motion field is given by:

(2.61) $u = \dfrac{1}{fd}\left(a_1 x^2 + a_2 xy + a_3 fx + a_4 fy + a_5 f^2\right), \qquad v = \dfrac{1}{fd}\left(a_1 xy + a_2 y^2 + a_6 fy + a_7 fx + a_8 f^2\right)$

where:

(2.62) $a_1 = -d\omega_y + T_Z n_x, \quad a_2 = d\omega_x + T_Z n_y, \quad a_3 = T_Z n_z - T_X n_x, \quad a_4 = d\omega_z - T_X n_y,$
$a_5 = -d\omega_y - T_X n_z, \quad a_6 = T_Z n_z - T_Y n_y, \quad a_7 = -d\omega_z - T_Y n_x, \quad a_8 = d\omega_x - T_Y n_z$
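As a check on the coefficients, here is a sketch (hypothetical names) that evaluates Eqs. (2.61)-(2.62) for an arbitrary plane:

```python
def planar_motion_field(x, y, n, dist, T, omega, f):
    """(u, v) over the plane n.P = dist, per Eqs. (2.61)-(2.62)."""
    nx, ny, nz = n
    TX, TY, TZ = T
    wx, wy, wz = omega
    a1 = -dist * wy + TZ * nx
    a2 = dist * wx + TZ * ny
    a3 = TZ * nz - TX * nx
    a4 = dist * wz - TX * ny
    a5 = -dist * wy - TX * nz
    a6 = TZ * nz - TY * ny
    a7 = -dist * wz - TY * nx
    a8 = dist * wx - TY * nz
    u = (a1 * x**2 + a2 * x * y + a3 * f * x + a4 * f * y + a5 * f**2) / (f * dist)
    v = (a1 * x * y + a2 * y**2 + a6 * f * y + a7 * f * x + a8 * f**2) / (f * dist)
    return u, v
```

For the road case n = [0, 1, 0], T = [0, 0, TZ], omega = (0, 0, 0), this reduces to u = TZ·xy/(f·d) and v = TZ·y²/(f·d), i.e., case (a) of Eq. (2.63) below.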

In an urban environment, for instance, four relevant cases of a moving plane (see Table 2.1) can be considered for scene reconstruction: obstacles, the road, and buildings.

Table 2.1. Four Moving Planes of Interest

Horizontal (road): n = [0, 1, 0]^T, T = [0, 0, T_Z]^T, distance d
Lateral (buildings): n = [1, 0, 0]^T, T = [0, 0, T_Z]^T, distance d″
Frontal 1 (fleeing obstacle): n = [0, 0, 1]^T, T = [0, 0, T_Z′]^T, distance d′
Frontal 2 (crossing obstacle): n = [0, 0, 1]^T, T = [T_X, 0, 0]^T, distance d′

The corresponding motion fields, after (2.61), are:

(2.63) $\text{(a)}\; u = \dfrac{T_Z}{fd}\,xy,\ v = \dfrac{T_Z}{fd}\,y^2; \quad \text{(b)}\; u = \dfrac{T_Z}{fd''}\,x^2,\ v = \dfrac{T_Z}{fd''}\,xy; \quad \text{(c)}\; u = \dfrac{T_Z'}{d'}\,x,\ v = \dfrac{T_Z'}{d'}\,y; \quad \text{(d)}\; u = -\dfrac{T_X}{d'},\ v = 0$

Let $w_o$, $w_r$, and $w_b$ be, respectively, the moduli of the apparent velocity of an obstacle point, a road point, and a building point. It follows that:

(2.64) $|w_r| = \left|\dfrac{T_Z}{f d_r}\right|\sqrt{y^4 + x^2 y^2}, \quad |w_b| = \left|\dfrac{T_Z}{f d_b}\right|\sqrt{x^4 + x^2 y^2}, \quad |w_o| = \left|\dfrac{T_X}{d_o}\right| \text{ or } |w_o| = \left|\dfrac{T_Z'}{d_o}\right|\sqrt{x^2 + y^2}$

Each type of w leads to the corresponding expression of c and the related isovelocity curve. For instance, in the case of the road plane:

(2.65) $c = \left|\dfrac{w}{K}\right| = \sqrt{y^2\,(y^2 + x^2)}, \quad \text{where } K = \left|\dfrac{T_Z}{fd}\right|$

The final formula proves that c, constant along isovelocity curves by definition, is proportional to w there, just as the disparity is proportional to the line index v. The image of a plane can thus be extracted as the set of pixels verifying this property, abstracted into the constancy of K. To that aim, a first cumulative process in the c-velocity frame (c, w) exhibits the straight lines (or, equivalently, the parabolas in the (√c, w) frame, used for both homogeneity and computational-complexity reasons), and a second one automatically extracts K. See Figure 2.79 for example results. The reader can refer to [160] for an analysis of the perturbations and uncertainty computations bound to this process.
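A highly simplified sketch of the cumulative step (illustrative names; a plain least-squares slope stands in for the Hough/K-means machinery of [158]):

```python
import numpy as np

def road_c_velocity(xs, ys, speeds, num_bins=64):
    """Vote in the (c, |w|) frame for the road-plane model and estimate K.

    xs, ys: pixel coordinates with the image centre at the origin;
    speeds: optical-flow magnitudes |w| at those pixels.
    """
    c = np.sqrt(ys**2 * (xs**2 + ys**2))      # Eq. (2.65) for the road plane
    hist, c_edges, w_edges = np.histogram2d(c, speeds, bins=num_bins)
    # Road-plane pixels accumulate along the straight line |w| = K * c;
    # estimate its slope through the origin by least squares.
    K = float(np.sum(c * speeds) / np.sum(c * c))
    return hist, K
```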


Figure 2.79. Results obtained from a database of the French project “Love” (Logiciel d’Observation des Vulnérables). (A) Left: optical flow. Right: resulting vertical-plane detection. Planes get a label according to K-means clustering, in the same color as that of the corresponding parabola. (B) Resulting c-velocity for the building model. Each vote is normalized by the number of points in each c-curve. (C) Example of a 1D Hough transform on the c-velocity space for detecting parabolas. For each (c, w) cell, a P-value is accumulated. Classes of the histogram are split by K-means or any other clustering. (D) Results of parabola extraction using the 1D Hough transform followed by K-means clustering (four classes). In white, the discarded points: they probably belong to another plane model. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this book.)

Read full chapter

URL: https://www.sciencedirect.com/science/article/pii/B9780123852205000024

Russell Watkins, Hamid Jahankhani, in Strategy, Leadership, and AI in the Cyber Ecosystem, 2021

4 Holographic reality

Whilst research is ongoing and widespread in the field of augmented reality, there is a dearth of information regarding holographic reality and the potential challenges it poses in the areas of ethical and legal protection for consumers. Initial hype around the arrival of 5G and the potential of holographic calls has faded as other closer-to-market technologies such as virtual reality have taken the limelight. With holographic technologies in their infancy, developers are quietly getting on with what will become a new disruptive area for consumers and businesses (Stoyanchev, 2019).

The nirvana for technology engineers is to build a true 3-D holographic display; however, due to pixel density requirements, motion parallax, and data bandwidth limitations (Suzuki et al., 2012; Verma, 2018), this has not yet been realised. Whilst this technology is not yet available, other novel methods of holographic delivery are in flight. The first holographic smartphone was released in 2018, utilising specific light refraction and LED stacking techniques to produce 3-D images (Byford, 2017). A lack of holographic content and mixed reviews of the technology have meant that this revolutionary device has not reached mainstream adoption; however, several demonstrations of holographic projections have captured the imagination of the general public.

One of the biggest challenges for holographic phone calls is the requirement for low-latency communication systems with low jitter and bandwidth requirements not seen before in consumer products. The ITU Internet 2030 project (ITU, 2019) highlights data bandwidth as the biggest challenge for holographic communications. The focus is on striking the correct balance between hologram quality and resolution. In-development codecs and compression algorithms will help to alleviate bandwidth challenges; however, compression can bring its own security challenges. Compressing data can open up the contents to side-channel attacks such as BREACH and CRIME (Gluck et al., 2013) and will require more sophisticated hardware (with a possible impact on battery life) for mobile devices, degrading the experience of the end user.

In an article titled ‘The Dawn of Cyber Politicians’ (Geiger and Bienaime, 2017), the spectre of sinister uses for holographic projections is raised. The article reports on the use of this technology by a French politician to give the illusion of appearing on stage in six different locations in France at once. Whilst the technology used was not a true hologram, the effect was the same in that the politician appeared to be present. The technique used is commonly referred to as ‘Pepper's Ghost’ (Patterson and Zetie, 2017). Attendees reported that after the initial element of surprise, the hologram was seen as the real person. “You have the feeling to see the real guy. You just forgot that it's just a hologram” was the comment of one such attendee. This supports Suzuki et al.’s research findings, which pointed to the human mind's inability to distinguish between differing points on the virtuality continuum.

A holographic interaction in the future could be via disparate mechanisms—mobile device, fixed office or home-based device, or some other method not yet on the horizon of development. In each instance, there will be an initiation of communication by a primary party, synchronisation of the communication between the two parties, and authorisation at the secondary party if one to one. If one-to-many communications are initialised, then this would likely require authorisation from each party.

Holographic interaction adds many challenges to the traditional threat landscape of electronic communication. For true two-way communication to be realised, a sea of sensors will be required to monitor sound, video, and quality. Depending on the circumstances, permissions granted, and environment, each holographic avatar will likely require a 360-degree view of the other avatar and its surroundings. Without this, a true holographic end-to-end call will be nothing more than an enhanced video call or a one-way holographic experience.

Read full chapter

URL: https://www.sciencedirect.com/science/article/pii/B9780128214428000185

What is a monocular cue that helps to determine the distance of a distant object?

Monocular motion parallax. This one's a mind-blower: monocular motion parallax happens when you move your head and objects that are farther away appear to move at a different speed than those closer to you.

Is parallax a monocular cue?

Motion parallax is a monocular depth cue that arises from the relative motion of objects at different distances, created when an observer translates laterally.

What are the monocular cues to distance?

Monocular cues include relative size (distant objects subtend smaller visual angles than near objects), texture gradient, occlusion, linear perspective, contrast differences, and motion parallax.

What is the motion parallax?

Motion parallax refers to the fact that objects moving at a constant speed across the frame will appear to move a greater amount if they are closer to an observer (or camera) than they would if they were at a greater distance.