Marginal Notes on Notes on Gesture

Motion capture. Captured motion.

It is no coincidence that in his essay "Notes on Gesture" Giorgio Agamben only provides the reader one concrete exemplar of what actually constitutes a gesture, and that is gait. Recall that Muybridge and Marey became godfathers of not only the art of cinema but also the science of biomechanics, the relation becoming more apparent over the course of the twentieth century insofar as both serve to capture motion. Or, more specifically, as they both serve to capture gesture: walking and gait have become as important to the processes of consumption as they have to those of production.

Gait provided the basis for some of Muybridge and Marey's early cinematic works, and it is also the foundational human movement that has driven most innovations in biomechanical measurement during the past century, from stroboscopic photography to force plate analysis to high-speed videography. As Francesco Careri suggests, walking is the "first aesthetic act" of humans in that it assumes a "symbolic form" shaping our very being in the world and our relationships to landscape and architecture. Gait is integral to this symbolic form and thus integral to our built environment, both real and virtual. While Careri argues convincingly that the built environment of humans emerges from nomadic walking peoples, eventually it comes to mark the character of the sedentary city in both material and immaterial fashion: the polis and the walking subject enter biunivocal relations of naming the other. Walking is not simply an aesthetic act, then, but a political one as well.

Gabriel Orozco, digitally manipulated photographic print - Courtesy of Gabriel Orozco

And while Agamben devotes his attention to cinema for the remainder of the essay, perhaps we ought to follow the twin genealogies created by Muybridge and Marey to consider parallel developments in biomechanics as well. Extending an argument from Deleuze's book on cinema, Agamben suggests that "the element of cinema is gesture and not image." If Agamben and Deleuze are correct, then the reason gesture has been obscured in cinematic analysis appears to be simple, as it is literally a matter of appearances. Until recently, cinematic scenes were always shot from a single perspective at a time, from a single camera, and many of these single shots (perhaps from different cameras) were edited together to form a final filmic image — with the audience member, as Benjamin points out, assuming the position of the camera and the gaze of the director.

With this flattening of the perspectival gaze to the two-dimensional surface it appears that the image constitutes the foundational element of cinema, but this is due to the technical limitations of the input device rather than to any truth of the form itself — if we can consider "cinema" to be an assemblage of bodies and technologies that produces the final filmic image. Given such an input, one can never see all sides of a volume from a single point in Euclidean space — and gesture is volumetric.

What technical vision wants is to see the subject from all directions at once — in other words, to become omnidirectional or omnipresent (and here we can explain the "replacement" for an idea of God, in a technocratic sense of becoming-secular). Following Agamben and Deleuze, this is because technical vision wants to represent gesture rather than simple image.

The goal of omnidirectionality had been accomplished to some degree in biomechanics with motion capture technology, an apparatus that features multiple simultaneous camera angles synthesized together to identify the position of markers located on key anthropometric sites of the body. In doing so, it became possible to create volumetric models of gesture for the purposes of measurement, analysis and optimization.

But omnidirectionality has truly taken off with videogames, which took the practical fruits of biomechanical research and made them profitable for the industry of integrated spectacle. Financial gain may now accrue by capturing and expropriating the gestures of athletes and actors to create identity-constructs that are tried on like well-made Armani suits. While playing these games, users reduce their own gestures to a programmed and nearly pure electromagnetic impulse, almost unrecognizable in comparison to the movements taking place on the screen.

Motion Capture Collage - Courtesy EA Sports

And since it is the integrated spectacle we are describing it is no surprise that innovations in the videogame medium were fedbackforward into cinema, as with the bullet time effects in The Matrix. It is perhaps most impressive, then, that Deleuze recognized cinema's gestural character without ever having seen Trinity levitate to raise holy hell on two units of simulated police.

Intensities and Null Values

MJ - Bullet Time - Courtesy MJ to the Max

The green or blue screen used in chroma key photography (once simply the flat planar backdrop for the local TV weatherman, but increasingly entire 3-D environments) would seem to provide an ideal instance of smooth space. In fact, it should be considered a prototype for striation.

Smooth spaces, according to Deleuze and Guattari, are characterized by intensities; these are variable and affective and may function in the service of navigation for the nomad. With the chroma key screen there are no intensities, but rather a void of solid blue or green. In addition, lights are strategically placed to completely eliminate shadows or hot spots (glare) that may interfere with the null of the solid blue/green and provide a bearing of intensity.

When the chroma key photography set becomes a volumetric environment, it still does not function as an intensity, but rather as a nothing in which bodies and objects become solely referential to one another or to the camera. How does one move through green space? Where is the line? What intensities provide guidance?

No, this is a striated space, or perhaps better a proto-striated space. It only makes sense in the context of the computer that overlays a grid on top of the entire greenspace. The body is plotted in three dimensions and, more importantly, is transposed to some fantasy space, a virtual space. The greenscreen, then, as a nexus between the real and virtual. At that nexus, any smoothing of space and time (e.g., bullet time) may be possible, but it is fundamentally predicated upon the striation of this proto-striated space.

High-Speed Photography and Time Dilation

A few notes comparing two of Eadweard Muybridge's offspring, bullet time photography and the high-speed photo finish system, more than a century after the godfather of biomechanics kickstarted a new science.

The camera

Muybridge pioneered the technological visioning of human movement by having a single fixed-location camera take a motion and stroboscopically break it into individual segments for analysis.


With bullet time photography we take many photos at once to dilate a moment of action/time and create a fluid movement of the "camera" during that dilated moment. In other words, we have multiple cameras shooting from multiple points to create a "virtual camera" that moves on any line that the photographer desires. Although the "virtual camera" is moving, the spirit of Muybridge's technique remains the same. Crucially, however, computer software interpolates between the photographic data points to re-create fluid movement (of the camera) and reconstitute the moving object (though relatively static compared to the camera), thus dilating time.
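To make the interpolation step concrete, here is a minimal sketch in Python. It assumes that each still is a small grid of grayscale values and that the software simply cross-fades between the stills of two neighbouring cameras. Actual bullet time pipelines rely on far more sophisticated, motion-aware interpolation; the names `blend` and `virtual_camera_sweep` are hypothetical and serve only to show how frames that no camera ever recorded come to exist.

```python
def blend(frame_a, frame_b, t):
    """Cross-fade two equally sized grayscale frames.

    t = 0 returns frame_a, t = 1 returns frame_b.
    """
    return [
        [(1 - t) * a + t * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(frame_a, frame_b)
    ]


def virtual_camera_sweep(stills, steps_between):
    """Return the recorded stills plus synthetic in-between frames.

    stills: frames from neighbouring physical cameras, in rig order
            (an assumption made for this illustration).
    steps_between: number of interpolated frames inserted between
                   each pair of neighbouring cameras.
    """
    if not stills:
        return []
    sweep = []
    for frame_a, frame_b in zip(stills, stills[1:]):
        sweep.append(frame_a)
        for k in range(1, steps_between + 1):
            # These frames were never photographed; they are computed.
            sweep.append(blend(frame_a, frame_b, k / (steps_between + 1)))
    sweep.append(stills[-1])
    return sweep
```

With two one-pixel stills and three steps between them, `virtual_camera_sweep([[[0.0]], [[4.0]]], 3)` yields five frames, of which only the first and last correspond to a physical camera.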

In the sprint photo finish, on the other hand, Muybridge's technique is exactly the same, except accelerated by a camera taking 2,000 photos per second. Instead of an act of interpolation uniting the discrete images (data points), as with bullet time, the computer software removes all images not required to determine the exact moment that a body crosses the finish line: one camera shooting from a single point.


In the case of bullet time photography, time is dilated by the "movement" of the virtual camera, as the sub-component cameras fire sequentially or simultaneously. Through interpolation, we have a particular form of produced spectacle in which we create that which does not occur. Becoming is controlled by a software algorithm.

Photo Finish

In the case of the photo finish, time is dilated with the assistance of the graduated clock ruler at the bottom of the layered image. By eliminating all images save the ones in which a runner crosses the finish line, we effect the erasure of that which did occur. Motion is arrested.
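The arithmetic of this arrest can be sketched briefly. At the 2,000 photos per second mentioned earlier, consecutive images are 1/2000 of a second (0.5 milliseconds) apart. The Python sketch below assumes, purely for illustration, that each image has already been reduced to the position of a runner's leading edge in metres. The function name and the sample data are hypothetical, and an actual photo finish system works on the images themselves.

```python
FRAMES_PER_SECOND = 2000  # capture rate cited in the text


def finish_time(leading_edge_positions, finish_line):
    """Return (frame_index, seconds) for the first frame at or past the line.

    leading_edge_positions: one position per frame, in capture order
                            (hypothetical pre-processed data, in metres).
    Returns None if the runner never reaches the line in these frames.
    """
    for index, position in enumerate(leading_edge_positions):
        if position >= finish_line:
            # Only this frame is retained in the result; the rest are ignored.
            return index, index / FRAMES_PER_SECOND
    return None


# Hypothetical sample: the runner advances 5 mm per frame.
positions = [99.990, 99.995, 100.000, 100.005]
result = finish_time(positions, 100.0)
```

Here `result` is `(2, 0.001)`: of the four frames, only the third matters, and everything that happened before and after it drops out of the record.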

Edge detection

MJ - Bullet Time - Courtesy MJ to the Max

Bullet time usually requires the concurrent use of chroma key (greenscreen) techniques in order to construct its spectacular outcome. At that point, colour serves as the means of edge detection such that contours may be traced and the individual subject separated from its environment. Contra Benjamin, it is not so much the aura of the actor's individual performance that is lost, but rather the entire lived spatial environment that is forcibly removed by fiat of computer software.
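A minimal sketch of colour used in this way, in Python. It assumes that pixels are (red, green, blue) triples in the range 0 to 255, that the image is a rectangular grid, and that a pixel counts as backdrop when its green channel clearly dominates the other two. The threshold and the function names are hypothetical; production keyers use more careful colour models.

```python
def is_backdrop(pixel, dominance=60):
    """True when green exceeds both red and blue by at least `dominance`."""
    red, green, blue = pixel
    return green - max(red, blue) >= dominance


def subject_mask(image):
    """Grid of booleans: True for the subject, False for the green void."""
    return [[not is_backdrop(pixel) for pixel in row] for row in image]


def contour(mask):
    """Subject cells (row, column) that touch the backdrop or the frame edge."""
    edge = []
    for r, row in enumerate(mask):
        for c, inside in enumerate(row):
            if not inside:
                continue
            neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            for nr, nc in neighbours:
                outside_frame = not (0 <= nr < len(mask) and 0 <= nc < len(row))
                if outside_frame or not mask[nr][nc]:
                    edge.append((r, c))
                    break
    return edge


# Hypothetical 3x3 frame: one non-green pixel surrounded by green.
GREEN = (0, 255, 0)
BODY = (200, 150, 120)
frame = [
    [GREEN, GREEN, GREEN],
    [GREEN, BODY, GREEN],
    [GREEN, GREEN, GREEN],
]
mask = subject_mask(frame)
outline = contour(mask)
```

In this toy frame the mask is true only at the centre, and `outline` is `[(1, 1)]`. Everything that tests as green is simply not there as far as the software is concerned.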

The edge detection of the photo finish system is more powerful and insidious in that it doesn't require a special background from which to isolate objects and trace their contours relative to the finish line. Although no background is being substituted in producing the final representational output, the ability to detect edges in spite of this becomes all the more impressive.