Marginal Notes on Notes on Gesture

Motion capture. Captured motion.

It is no coincidence that in his essay "Notes on Gesture" Giorgio Agamben only provides the reader one concrete exemplar of what actually constitutes a gesture, and that is gait. Recall that Muybridge and Marey became godfathers of not only the art of cinema but also the science of biomechanics, the relation becoming more apparent over the course of the twentieth century insofar as both serve to capture motion. Or, more specifically, as they both serve to capture gesture: walking and gait have become as important to the processes of consumption as they have to those of production.

It is gait that provided the basis for some of Muybridge and Marey's early cinematic works, and it is also the foundational human movement that has driven most innovations in biomechanical measurement during the past century, from stroboscopic photography to force plate analysis to high-speed videography. As Francesco Careri suggests, walking is the "first aesthetic act" of humans in that it assumes a "symbolic form" shaping our very being in the world and our relationships to landscape and architecture. Gait is integral to this symbolic form and thus integral to our built environment both real and virtual. While Careri argues convincingly that the built environment of humans emerges from nomadic walking peoples, eventually it comes to mark the character of the sedentary city in both material and immaterial fashion: the polis and the walking subject enter biunivocal relations of naming the other. Walking is not simply an aesthetic act, then, but a political one as well.

Gabriel Orozco, digitally manipulated photographic print - Courtesy of Gabriel Orozco

And while Agamben devotes his attention to cinema for the remainder of the essay, perhaps we ought to follow the twin genealogies created by Muybridge and Marey to consider parallel developments in biomechanics as well. Extending an argument from Deleuze's book on cinema, Agamben suggests that "the element of cinema is gesture and not image." If Agamben and Deleuze are correct, then the reason gesture has been obscured in cinematic analysis appears to be simple, as it is literally a matter of appearances. Until recently, cinematic scenes were always shot from a single perspective at a time, from a single camera, and many of these single shots (perhaps from different cameras) were edited together to form a final filmic image — with the audience member, as Benjamin points out, assuming the position of the camera and the gaze of the director.

With this flattening of the perspectival gaze to the two-dimensional surface it appears that the image constitutes the foundational element of cinema, but this is due to the technical limitations of the input device rather than to any truth of the form itself — if we can consider "cinema" to be an assemblage of bodies and technologies that produces the final filmic image. Given such an input, one can never see all sides of a volume from a single point in Euclidean space — and gesture is volumetric.

What technical vision wants is to see the subject from all directions at once — in other words, to become omnidirectional or omnipresent (and here we can explain the "replacement" for an idea of God, in a technocratic sense of becoming-secular). Following Agamben and Deleuze, this is because technical vision wants to represent gesture rather than simple image.

The goal of omnidirectionality had been accomplished to some degree in biomechanics with motion capture technology, an apparatus that features multiple simultaneous camera angles synthesized together to identify the position of markers located on key anthropometric sites of the body. In doing so, it became possible to create volumetric models of gesture for the purposes of measurement, analysis and optimization.
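The synthesis of camera angles described above can be sketched in a few lines. This is a minimal illustration, not the method of any particular motion capture system: it assumes only two cameras, each reduced to a position and a direction pointing at the marker, and it takes the midpoint of the shortest segment between the two rays as the marker's position. The function names and the sample coordinates are hypothetical.

```python
def dot(u, v):
    """Dot product of two 3D vectors given as tuples."""
    return sum(a * b for a, b in zip(u, v))


def triangulate(p1, d1, p2, d2, eps=1e-12):
    """Midpoint of the shortest segment joining two camera rays.

    Each ray starts at a camera position p and points along direction d,
    toward where that camera saw the marker.
    """
    w0 = tuple(a - b for a, b in zip(p1, p2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w0), dot(d2, w0)
    denom = a * c - b * b
    if abs(denom) < eps:
        raise ValueError("rays are parallel; the marker cannot be located")
    t = (b * e - c * d) / denom  # distance parameter along ray 1
    s = (a * e - b * d) / denom  # distance parameter along ray 2
    q1 = tuple(p + t * k for p, k in zip(p1, d1))
    q2 = tuple(p + s * k for p, k in zip(p2, d2))
    return tuple((u + v) / 2 for u, v in zip(q1, q2))


# Hypothetical setup: two cameras ten units apart, both sighting one marker.
marker = triangulate(
    (0.0, 0.0, 0.0), (1.0, 2.0, 3.0),
    (10.0, 0.0, 0.0), (-9.0, 2.0, 3.0),
)
```

Neither camera alone can say how far along its ray the marker sits; the volume only appears once the two views are combined, and working systems add many more cameras to the same principle.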

But omnidirectionality has truly taken off with videogames, which took the practical fruits of biomechanical research and made them profitable for the industry of integrated spectacle. Financial gain may now accrue by capturing and expropriating the gestures of athletes and actors to create identity-constructs that are tried on like well-made Armani suits. While playing these games, users reduce their own gestures to a programmed and nearly pure electromagnetic impulse, almost unrecognizable in comparison to the movements taking place on the screen.

Motion Capture Collage - Courtesy EA Sports

And since it is the integrated spectacle we are describing it is no surprise that innovations in the videogame medium were fedbackforward into cinema, as with the bullet time effects in The Matrix. It is perhaps most impressive, then, that Deleuze recognized cinema's gestural character without ever having seen Trinity levitate to raise holy hell on two units of simulated police.

In the age of the integrated spectacle (cf. Agamben), few of the static two-dimensional images that are presented to us in the course of everyday life — magazine ads, billboards, posters, direct mailings, and the like — are in fact truly depthless artefacts. Rather, they are the result of careful processes in which part-objects have been layered on top of one another, grouped together, and transformed in various ways before being flattened out to the final "static" image.

Generally speaking, these part-objects may be either textual elements or other image elements, that is, the fundamental building blocks of Flusser's line and surface thinking. The graphic design software that facilitates the creation of this final flattened image retains within the file all of the meta-information about each of these part-objects in terms of position, understood as the x-y coordinates of the grid plane and the z-index of the layer. In other words, the file contains the relations that existed between each part-object before flattening took place.
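A rough sketch may make the distinction between the layered file and the flattened image concrete. It assumes a toy document in which every part-object is a rectangle on a small grid; the class name, field names, and sample layers are all hypothetical and do not describe any real design software's file format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartObject:
    name: str    # hypothetical label, e.g. "headline"
    kind: str    # "text" or "image"
    x: int       # left edge on the grid plane
    y: int       # top edge on the grid plane
    z: int       # layer index; higher values sit on top
    width: int
    height: int


def flatten(layers, canvas_w, canvas_h):
    """Paint layers bottom to top; each cell keeps only the topmost name."""
    canvas = [[None] * canvas_w for _ in range(canvas_h)]
    for part in sorted(layers, key=lambda p: p.z):
        for row in range(max(part.y, 0), min(part.y + part.height, canvas_h)):
            for col in range(max(part.x, 0), min(part.x + part.width, canvas_w)):
                canvas[row][col] = part.name
    return canvas


# The "file": positions and layer order are all still recorded here.
layers = [
    PartObject("background", "image", 0, 0, 0, 4, 3),
    PartObject("photo", "image", 1, 0, 1, 2, 2),
    PartObject("headline", "text", 2, 1, 2, 2, 1),
]

# The "final image": a single surface with no record of what lay beneath.
flat = flatten(layers, 4, 3)
```

The list of layers can always regenerate the flat canvas, but the canvas cannot regenerate the list: whatever a lower layer contributed beneath an upper one is gone from the surface.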

wii would like to play - we don't have tickets, courtesy of HomeShop

But a skilled and experienced designer doesn't need the original file to understand the relations that created the final image. Simply by assessing the visual outcome in the context of embodied memory, one is able to unlayer and reconstitute that which has been stripped of its depth in its rendering-spectacular.

The complexity of the spectacular apparatus increases as we move from the processed image into the realm of cinema and television and literally introduce motion to the process. Chion identifies new building blocks that are added to the image and text within the two-dimensional frame, most importantly the audio elements of speech and field sound captured during recording, and the music and sound effects added in post-production. To the moving image we also add the graphic overlay, a visual element that may be static or animated and which is visually distinct from the images that have been captured by the camera during filming. These overlays are increasingly connected to external (relational) databases in the specific example of television, as with statistics during a sports broadcast or with the latest quotes on a news channel stock market ticker.

Nonetheless, the experienced director or video editor may similarly be able to quickly apprehend after the fact the layers and corresponding relations that produced the final cinematic outcome. In doing so, we may already understand that the layer is not a two-dimensional phenomenon, as Chion's inclusion of audio and acoustic space illustrates.

Global Village Basketball 2009 - courtesy of marcef33

Now consider those works that find smooth passage through categorical barriers identified variously as interventions, conceptual pieces, participation-oriented performances or community-based art projects. Three such examples, different though interrelated, might include Global Village Basketball, HomeShop, and wii would like to play // we don't have tickets. While these works were "framed" with more or less well-defined spatiotemporal parameters, they are most definitely of the realm of the volumetric and hence introduce new complexities to the apparatus.

Of course, with such events there is no "file" to which we have recourse for determining the layers and relations between the part-subjects that comprised their contextual fabric. As Massumi points out, they are ontogenetic. But, as with the processed static and moving video images described earlier, is it possible to unlayer the volumetric interactions of the intervention after the fact? Can we assess the audiovisual outcomes in the context of embodied memory and perhaps in the process identify new building blocks for the becoming-social each work facilitated, such as gesture, tango, translation, risk and exchange?

(a work-in-process between elaine w. ho and sean smith towards "unlayering the relational: microaesthetics and micropolitics," a text for the mediamodes art and technology conference in new york)

a [sic] patient

or, perhaps a freudian typo…

Garrick Barr, CEO of Synergy Sports Technology, a company that provides a real-time video-indexing statistical engine and online retrieval for professional sports teams and whose products support the Dynamic DNA feature in NBA Live:

"So we have 11 generic play types. In '98 when I designed the first report, I had to sort of examine and figure out, if you will, the oncology of the sport so that we could log it accurately and consistently to satisfy professionals, and having been one I was in a pretty good position to try to do that."

Secure Volumes and Docile Identities

"Two very beautiful naked girls are crouched facing each other. They touch each other sensually, they kiss each other's breasts lightly, with the tip of the tongue. They are enclosed in a kind of cylinder of transparent plastic. Even someone who is not a professional voyeur is tempted to circle the cylinder in order to see the girls from behind, in profile, from the other side. The next temptation is to approach the cylinder, which stands on a little column and is only a few inches in diameter, in order to look down from above: But the girls are no longer there. This was one of the many works displayed in New York by the School of Holography" (Umberto Eco, 1975, p.3).

2008 Olympic Ticket

Umberto Eco cleverly juxtaposes desire and reality in the opening lines of his essay "Travels in Hyperreality" and its observation of the hologram. Naturally, the objects of our desire assume greater value to us the more they approach a real that has purportedly been denied to us. Or, perhaps more correctly, when they return in a hyperrealized form from a "real" that was always already there for us to seize. And so much the better if these two writhing nymphs can lift themselves off the surface of the page or screen (not unlike those silicone or elastomeric gel sex dolls designed for intercourse) to build an edifice of technological titillation on the abject foundation of absent sensuality.

The process of creating such a hologram is in fact a double process as well as a process of doubling. A beam of coherent light (typically from a laser) is split such that one beam illuminates the object, from which some of the reflected light falls on the recording medium. The other light beam resulting from the split, known as the reference beam, also illuminates the recording medium such that an interference pattern occurs between the two beams, which forms the hologram itself. Once this hologram is illuminated with a beam of light identical to the original reference beam, it becomes visible to the human eye as a represented image seemingly within the surface of inscription.
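The doubling described above has a compact standard expression in terms of complex wave amplitudes. As a hedged sketch, let O stand for the light arriving from the object and R for the reference beam, both measured at the recording medium; the symbols are notation introduced here for illustration.

```latex
I = |O + R|^2 = |O|^2 + |R|^2 + O R^{*} + O^{*} R

R \cdot I = R\left(|O|^2 + |R|^2\right) + |R|^2\, O + R^2\, O^{*}
```

The cross terms in the first line are the recorded interference fringes. The second line assumes, as an idealization, that the developed medium transmits light in proportion to I: re-illuminating it with R yields, among other terms, a scaled copy of O, which is why the object seems to reappear when the reference beam returns.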

But to create Eco's beautiful naked girls requires a second process. By recording a hologram of a hologram we may create an image "in front" of the photographic plane and produce the sorts of three-dimensional projections that induce such awe in the society of spectators, with their desires and realities. As Eco points out, these second-generation holograms are no mere child's play, as they have serious applications in astronomy, medicine, manufacturing and art.

Of course in credit cards, Olympic tickets, or NBA merchandise the "lesser" first-generation hologram also eludes mere play, having become a serious marker of value and authenticity (one of many in what we might refer to as a security assemblage). The hologram provides a sufficiently complex technology of mass-produced inscription that fashions a volumetric projection of a three-dimensional figure in the non-space created on a two-dimensional plane (credit card, ticket, authentic replica jersey tag). In representing the "authentic" it also serves to assure the identity of the owner.

Is it so difficult then to entertain the notion that the superstar identity-vehicle within an NBA videogame, a three-dimensional or volumetric construct within the non-space created on the two-dimensional plane of the screen (and its offer, rooted in desire and hyperrealism, of prosthetic talent or surrogate style), might also stand as an assurance of identity?

In case this wasn't clear from the outset of videogames, it becomes even more certain in the age of online multiplayer gaming communities (e.g., PlayStation Network, Xbox Live). Those who wish to participate in these online communities must gain passage to the space and its identity-vehicles by following two steps: first, by paying the toll of a subscription fee, and second, by guaranteeing identity through the financial means of payment.

FirstnameLastname (the unasked-for original gift) to SocialInsuranceNumber to BankAccountNumber to CreditCardNumber to OnlineGamingCommunityID to SportsVideogameIdentityVehicle, each link in the modulating chain of identification a unique number in a relational database table.
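The chain can be sketched as a single self-referencing table in which every link receives its own number and points back to the link that vouches for it. This is only an illustration under stated assumptions: the table name `identity_link`, its columns, and the `trace_back` helper are hypothetical, and no real gaming network's schema is being described.

```python
import sqlite3

# The links named in the essay, in order of issue.
CHAIN = [
    "FirstnameLastname",
    "SocialInsuranceNumber",
    "BankAccountNumber",
    "CreditCardNumber",
    "OnlineGamingCommunityID",
    "SportsVideogameIdentityVehicle",
]

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE identity_link (
        id        INTEGER PRIMARY KEY,
        kind      TEXT NOT NULL,
        parent_id INTEGER REFERENCES identity_link(id)
    )
    """
)

# Each new link is guaranteed by the one issued before it.
parent_id = None
for kind in CHAIN:
    cur = conn.execute(
        "INSERT INTO identity_link (kind, parent_id) VALUES (?, ?)",
        (kind, parent_id),
    )
    parent_id = cur.lastrowid


def trace_back(conn, link_id):
    """Follow parent_id references from one link back to the first."""
    rows = conn.execute(
        """
        WITH RECURSIVE chain(id, kind, parent_id, depth) AS (
            SELECT id, kind, parent_id, 0 FROM identity_link WHERE id = ?
            UNION ALL
            SELECT l.id, l.kind, l.parent_id, c.depth + 1
            FROM identity_link AS l
            JOIN chain AS c ON l.id = c.parent_id
        )
        SELECT kind FROM chain ORDER BY depth
        """,
        (link_id,),
    ).fetchall()
    return [kind for (kind,) in rows]


avatar_id = parent_id  # the last link inserted: the videogame identity-vehicle
path = trace_back(conn, avatar_id)
```

Starting from the avatar, one query walks every link back to the given name, which is the sense in which the identity-vehicle on the screen also stands as an assurance of identity.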

We reiterate: if the function of power in disciplinary societies served to produce docile bodies, its correlate in the societies of control is to produce docile identities, which may also include docile bodies.


In America, Jean Baudrillard suggested that the mirror phase had "given way" to the video phase and the contemporary era of the screen image. But have we not changed again, reverted to the mirror or at least mutated into a new hybrid of mirror and video?

There is a model for what we are attempting to describe here: the two-way mirror so adored by psychological practice. As children we play in these special rooms while the medical gaze and its recording devices sit quietly behind the silvered glass. Eventually, we learn of the duplicity, not unlike those moments in which we discover the fictions that are Santa Claus, the Tooth Fairy, the Easter Bunny, and Michael Jackson's whiteness. Whenever in the same situation again, we are subconsciously aware of the mirror and wonder what lurks on the other side.

The regime of the screen intensifies, both in quantity and quality. The sheer number of screens increases beyond even that which Baudrillard could imagine. There is a viral proliferation of the screenal, vectoring beyond home (television) and work (computer) to infect every public space (monitor, jumbotron, electronic billboard, arcade game, etc.), and even the very flows of human movement themselves (laptop, PDA, cellphone).

But the nature of the input interface has changed as well, "democratized," a contagion of interactivity to match the proliferation of the screenal. Now we are all "creators," all able to see ourselves extended into the data networks of the ludic-virtual. In other words, all complicit in the creation of a new mirror — a slightly kaleidoscopic mirror, mind you — but one that captivates us like Narcissus long beyond that mirror phase of childhood.

Like the two-way sort used in psychology, however, this new era of the interactive is at once mirror and screen, at once opportunity for enclosed self-contemplation and open performance. For we all know what lurks behind the silvering of this new mirror and that is the gaze: sometimes manifest as benevolent glance and sometimes as cold, clinical, unblinking stare. Always performance.

Narcissus never suspected that Echo was swimming below the surface of the pool, but we know better.

* * *

There is a certain congruency here with videogames that allow one to toggle between first- and third-person perspectives. Vilém Flusser discusses the difference between line and surface and its implications for perception and thought, but, writing before the videogame revolution, neglects to consider the volumetric. All three appear in the planar form, but since Flusser distinguishes between line and surface, or text and image, it seems important to understand that the videogame is also of an altogether different character, for one actually enters its non-space to control the avatar during one's play.

This is not the same as a three-dimensional setting being reduced to the two-dimensional planar surface through perspectival optics, as with photography, film or television. In that case, one's vision identifies strictly with the point-of-view of the camera and one must imagine the depth of field that is represented on the surface. With the contemporary videogame, on the other hand, there is literally a three-dimensional non-space that has been mathematically modeled "behind" the screen. While the screen thus appears as a site of reduction, this is not due to the nature of cognitive engagement with this non-space, for we are continually monitoring multiple points-of-view as our bodily expertise increases in these ludic environments.

Admittedly, an entire history of static versus scrolling versus spatial gameplay environments needs to be told here, but suffice it to say that the emergence of the ludic subject from the primordial digital ooze of the surface to become volume is the most significant challenge to perception and thought since the invention of photography.

The split of the two-way silvering between mirror and screen is perhaps one way to understand this challenge to perception and thought, manifest in the ludic environment as the ability to instantaneously switch between the subject and object, between the I/je/我 and the one/on/??? pronoun positions.

What of the you/tu/你?

This was the opening in which wii would like to play // we don't have tickets found its niche. In "sprinting" the videogame 100-metre dash against a local, embodied competitor there was an explicit engagement with the you/tu/你 at the nexus of the I/je/我 and one/on/??? positions. No, people didn't actually run, but yes, they did flail their arms, breathe heavily, and perhaps even shed a bead of sweat. No, people didn't face each other, but yes, through a Japanese interface both Chinese and English engaged amicably, not in translation but rather as a mediation.

And yes, in the process a temporary we/nous/我们 was established: a micropolitics of the social body that first began with a politics of the moving and sensing animal body.

(a work-in-process between elaine w. ho and sean smith towards "17 days in beijing: screen of consciousness on the micropolitical," a text for public issue 40)