In ‘Analysing Musical Multimedia’ Nicholas Cook provides an extremely clear general theory of how media engage with one another in multimedia works (from opera through to commercials). Judging the book solely by its cover and thinking I had found a first-year textbook, I almost placed it back on the library shelf. Thankfully, I checked the contents:

Chapter 1 – Synaesthesia and Similarity
Chapter 2 – Multimedia as Metaphor
Chapter 3 – Models of Multimedia
and so on, and decided it might be worth a shot.

Cook’s thoughts on synaesthesia were extremely refreshing: ‘Synaesthesia provides some hints as to what multimedia is: but, perhaps more importantly, it supplies an illuminating model of what multimedia is not.’ Cook believes the most useful way to study multimedia is through the differences and interactions between its elements, rather than through the eyes of synaesthesia, a phenomenon based on similarity, duplication and translation.

Writing about Prometheus, the fifth symphony of Skriabin (an apparent synaesthete), which includes a part for a tastiera per luce, or colour keyboard, Cook says:

    The luce part literally does add little; for while the slower part has no discernible relationship to what is heard, the faster part simply duplicates information that is already present in the music. In neither case is there a substantial degree of perceptual interaction between what is seen and what is heard—which means that, in a significant sense, Prometheus does not belong to the history of multimedia at all. And to say this is to suggest there is a definite limit to what the phenomenon of synaesthesia can tell us about multimedia, because synaesthesia consists precisely of the duplication of information across different sensory modes. To demand something other than duplication is to go beyond the bounds of synaesthetic correspondence.

Taken out of context, the paragraph makes Cook’s attack seem more severe than it is, but an interesting point remains. Synaesthesia is an amazing phenomenon and has metaphorical relevance, but it also has its limits: its subjective nature among synaesthetes and its complete adherence to similarity. We need to be armed with a wider approach to cross-media works. Cook sees synaesthesia as potentially an ‘enabling condition for multimedia, but not a sufficient one.’

    To analyse music is also to be committed to the idea that we perceive notes in terms of the relationships between them: we perceive each note as influencing, and being influenced by other notes—or at any rate, if we do not, it is hard to see what we could be analysing. In a nutshell, we analyse the interaction between the elements of the music: that is what analysing music means. And exactly the same applies to multimedia. To analyse something as multimedia is to be committed to the idea that there is some kind of perceptual interaction between its various individual components, such as music, speech, moving images, and so on: for without such interaction there is nothing to analyse.

The rest of the book is based on this idea of interaction and the idea of multimedia as metaphor. Cook sees in media interaction the potential for emergence (the result of putting media together being more than the sum of its parts) and sees the metaphor model as invoking ‘similarity not as an end, but as a means.’ Cook talks about semiotics, motion and gesture (something I might elaborate on later with my reading of The Sonic Self) and arrives at a basic model for the analysis of multimedia.

For a full understanding, read the book, but I’ll explain simply and inadequately. We start at the top and decide whether the relationship between two media is consistent or coherent. A colour organ or keyboard would be a simple and perfect example of a consistent relationship (a relationship he sees as quite rare in current multimedia), whereas coherence allows for differential elaboration. The next distinction is a fuzzy one, which Cook acknowledges; he also acknowledges the potential for media forms to shift between classifications. These quotes might help:

    Conformance begins with originary meaning, whether located within one medium or diffused between all; contest, on the other hand, ends in meaning. And as the association of conformant models with synaesthetic and metaphysical speculation demonstrates, conformance tends towards the static and the essentialized, whereas contest is intrinsically dynamic and contextual.
    The term ‘contest’ is intended to emphasize the sense in which different media are, so to speak, vying for the same terrain, each attempting to impose its own characteristics upon the other. One might develop the analogy by saying that each medium strives to deconstruct the other, and so create space for itself.
    The mid-point between these two extremes is represented by the third model of multimedia, complementation, which Figure 3.1 represents in negative terms as that which exhibits neither consistency nor contradiction… complementation is readily associated with the successive phases of multimedia production. The classical Hollywood film, for instance, was in general virtually complete before it was passed on to the composer for scoring: the composer’s job was understood as one of complementing….
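The way I read the quotes above, the model amounts to a small decision procedure: a consistent relationship gives conformance, a contradictory one gives contest, and whatever is neither gives complementation. A minimal sketch of that reading follows. The names `Model` and `classify` and the two boolean inputs are my own assumptions for illustration, not Cook's terminology, and the real judgements are far fuzzier than a pair of booleans.

```python
from enum import Enum


class Model(Enum):
    """The three models of multimedia named in the quotes above."""
    CONFORMANCE = "conformance"
    COMPLEMENTATION = "complementation"
    CONTEST = "contest"


def classify(consistent: bool, contradictory: bool) -> Model:
    """Hypothetical two-step reading of the model.

    Step 1: is the relationship between the media consistent?
    Step 2: if not, do the media contradict one another?
    Complementation is defined negatively, as neither of the two.
    """
    if consistent:
        # The consistency question is asked first, so it takes priority.
        return Model.CONFORMANCE
    if contradictory:
        return Model.CONTEST
    return Model.COMPLEMENTATION
```

On this reading a colour organ would be `classify(True, False)`, and the classical Hollywood score described in the last quote would be `classify(False, False)`.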

The main point I wanted to get to was Cook’s ideas on contestation. The possible repercussion is that works such as mine, generative AV works (dynamic, with each medium imposing its characteristics on the other), form an AV relationship of contestation rather than harmony. Cook would be useful in supporting the argument that such works are linked through metaphor, motion, gesture (and illusion) rather than similarity or some sort of natural harmony.

    Considering arts techniques from the broad perspective of the present, I observed that the best “computer art” did not compare well with lacework from Belgium made a century ago. But the computer possessed a unique capability of making very complex pattern flow. One could plan exacting and explicit patterns of action and distinctive motions as intricate as lace, but in a way no Belgian lace maker would ever imagine. – John Whitney, 1980.

‘Arabesque’ (1975) is reportedly John Whitney’s first foray into computer graphics. Until this film, Whitney used the converted mechanism of a World War II M-5 anti-aircraft gun: essentially a twelve-foot-high analog computer of amazing complexity, in which design templates were placed on three different layers of rotating tables and photographed by multiple-axis rotating cameras.

In Digital Harmony (1980), the book that describes his life’s work, his hypothesis –

    …assumes the existence of a new foundation for a new art. It assumes a broader context in which Pythagorean laws of harmony operate. These laws operate in a graphic context parallel to the established context of music. In other words, the hypothesis assumes that the attractive and repulsive forces of harmony’s consonant/dissonant patterns function outside the dominion of music.

Whitney acknowledges that ‘Music does not need images any more than paintings need sound’ but sees in computing ‘a visual medium which is more malleable and swifter than musical airwaves. That medium is light itself.’

The book often communicates personal opinion rather than rigorous argument, but Whitney makes some original and interesting points. He does not really seem to be pursuing visualisation or a tightly fused AV form. His search is instead for abstract graphics with the fluidity, expressiveness and structural qualities of music. Whitney begins the book by highlighting the inherent spatial and visual qualities of music and damning early ‘visual music’ inventions:

    Most people visualize music as two-dimensional, with time represented by the horizontal lines and pitch by vertically arrayed symbols, as is the convention on paper. But the perception of music is not two dimensional. The ears reside at the center of a spherical domain. We hear from all around. We hear music as patterns of ups and downs, to and fro in a distinctly three-dimensional space – a space within.

    The eye, more outwardly oriented, perceives objects and events outside at the point where our eyes focus. Yet the eye enjoys design equally as well as the ear. The mind’s eye shares with the ear any inward experience of architectonic spatial constructions and would perceive them with the same pleasure, were they to exist.

    The fact is, however, that these interior fluid visual edifices hardly exist. Anyone can visualize an architectural fantasy of music dancing in the head, but manifesting in reality is another matter! Each century since Leonardo, a vision, grand and obscure as its myth, compelled one or two inventors to struggle with the pathetic inadequacies of the color organ. Twentieth-century abstract art has been a training ground for visual response to musical experience, but in the mind’s eye, architecture in motion lies at the root of our enjoyment of music. Many people, with closed eyes at a concert, are “watching” the music, but after all these centuries, there still exists no universally acceptable visual equivalent to music! It should exist and it will soon.

Whitney also documents his own and others’ failed attempts at experimental, film-based work:

    Pointing their cameras at the world, all those “symphonists” inadvertently recorded the stasis of the world, even as they filmed its busiest moments – its winds and storms and birds and water and city traffic. Those films are not symphonies, I thought, poetry perhaps, but not liquid architecture, not music.

    …wherever I pointed my camera, I failed to discover that special quality of any material possessing the controllable visual fluidity that I desired … pointing my camera anywhere resulted in recording images of somewhere. If the camera’s record is unclear, blurred by the smear of too fast panning or being out of focus, the sense of somewhere as place is simply flattened. The spatial content of an image is flattened. The eye resists the attempt to domesticate abstraction. This sort of deception hardly satisfies the eye, because the sense of being (or seeing) somewhere is so strong. The eye is the natural master of pattern recognition. The eye demands satisfaction by invoking in us strong feelings of puzzlement.

He also makes the important point that “No abstraction in my camera had the generative potential, the capability to propagate fluid patterns or especially, the liquid variability of the intervallic families of music tones.”

This is where the computer comes into play and Whitney’s argument gets interesting. Whitney sees a parallel between musical tones and generative animation, and treats music as an abstract and generative form in itself:

    There is no such thing as the harmonic organization of musical tone in nature. Occasionally a stone may ring like a bell, birds pattern “song,” but there are few natural bells, fewer natural flues where the winds sound organ tones. Even the whistle of the wind is eerie and non-musical. Patterning of musical tones is a man-made reality of the aural world, universally accepted as such, but nowhere looked upon as an abstraction that has been extracted (or abstracted) out of the natural environment, nowhere regarded as a manifestation of the environment.

Whitney, in deciding that music is not an abstracted picture of anything, allows for his second level of pure abstraction and generation. He focuses on three qualities applicable to both forms:

    A benchmark was reached when I began to apprehend the relationship of the three terms: differential, resonance and harmony. First, motion becomes pattern if objects move differentially. Second, a resolution to order in patterns of motion occurs at points of resonance. And third, this resolution at resonant events, especially at the whole number ratios, characterizes the differential resonant phenomena of visual harmony.

    What I knew about music confirmed for me that emotion derives from the force-fields of musical structuring in tension and motion. Structured motion begets emotion. This, now, is true in a visual world, as it is a truism of music.
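To make the three terms concrete for myself, here is a minimal sketch of one possible reading, not Whitney's own algorithm. It assumes a hypothetical rule in which point k on a circle travels k times as fast as point 1 (the "differential" part), and then counts how many distinct positions the points occupy at a given moment. The function names `phases` and `cluster_count` are mine.

```python
from fractions import Fraction


def phases(n_points, t):
    """Phase of each point, as a fraction of a full turn, at time t.

    Hypothetical rule: point k travels k times as fast as point 1,
    so the points move differentially rather than in lockstep.
    Time t is measured in turns of the slowest point.
    """
    return [(k * t) % 1 for k in range(1, n_points + 1)]


def cluster_count(n_points, t):
    """Number of distinct positions the points occupy at time t."""
    return len(set(phases(n_points, t)))
```

With twelve points, `cluster_count(12, Fraction(1, 2))` gives 2 and `cluster_count(12, Fraction(1, 3))` gives 3: at simple whole-number fractions of the cycle the points collapse into a few clusters, which is how I picture the "resolution to order" at points of resonance. At an awkward moment such as `Fraction(7, 97)` all twelve points sit in different places.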

Digital Harmony, the documentation of a life’s work, is the most comprehensive study of generative animation and its musical potential that I have found yet. It provides some useful counterpoints when compared to Chion’s deconstruction of audiovisual relations. A simple reading of Chion would state that audio is predominately temporal while vision is predominately spatial, but Whitney’s musical ‘liquid architecture’ metaphor is a wonderful one. Regardless, I’m starting to side with Chion’s idea of ‘audiovisual illusion’, and perhaps, through a lifetime of work and focus, Whitney has merely become a better magician.

This is not to say Whitney is wasting his time. Magic is an art form. Nor does this devalue his ideas of visual consonance, dissonance, harmony and disharmony. A work where consonance and dissonance are linked between audio and visual, temporally and structurally, without doubt creates moments of audiovisual resonance. These ideas are particularly interesting with regard to my choice of song and visual aesthetic.

Michel Chion’s ‘Audio-Vision: Sound on Screen’ provides a rare theoretical framework for studying the audiovisual relationship. Chion forges terms such as synchresis, spatial magnetization, acousmatic sound, reduced listening, rendered sound and acousmêtre, and highlights the lack of theory regarding sound, which he attributes to “something about sound that bypasses and surprises us, no matter what we do.”

Chion begins by pointing out that there is no “natural and pre-existing harmony between image and sound” and makes a point with definite conceptual implications for the fused AV practitioner:

    Visual and auditory perception are of much more disparate natures than one might think. The reason we are only dimly aware of this is that these two perceptions mutually influence each other in the audiovisual contract, lending each other their respective properties by contamination and projection.

The reassociation of image and sound is the foundation stone upon which film sound is built. Using example after example, Chion highlights the ease with which the viewer can be fooled by sound’s ambiguous nature. This reassociation is done for many reasons, often because a simulated or rendered sound seems more real than the original sound. Synchresis explains the phenomenon and simultaneously highlights the perceptual possibility of audiovisual construction while revealing the unfeasibility of a true and unique audiovisual harmony:

    The spontaneous and irresistible mental fusion, completely free of any logic, that happens between a sound and a visual when these occur at exactly the same time.

Excluding the small section on music video, it is important to note that Chion’s observations describe the process of adding sound to image, which is the reverse of the process by which many fused AV producers work. Chion’s notions of the audiovisual illusion and added value are particularly attractive, provided the process works both ways. Imagine the following excerpt with the words sound and image swapped:

    By added value I mean the expressive and informative value with which a sound enriches a given image so as to create the definite impression, in the immediate or remembered experience one has of it, that this information or expression “naturally” comes from what is seen, and is already contained in the image itself.

Chion does later point out the danger of applying musical analogy to film, and explains the relationship between counterpoint and harmony by studying audiovisual dissonance: points in the audiovisual experience at which the audio has no tangible counterpoint to the film. I’ll be thinking about this some more and applying some other theories of consonance and dissonance in fused AV and visual music.

I just received a big box of books in the mail, including Chion’s David Lynch. Lynch is a filmmaker who is definitely worth studying, considering his powerful and strange use of audio and music in film. It will also be interesting to apply John Whitney’s quest for Digital Harmony to Chion’s thoughts on audiovisual disparity.