The Visuality in the Structure of Japanese Language

Takahiko Iimura


In terms of the visuality of the Chinese characters which Japanese also uses (and which it calls "Kanji"), it has been said that these figures are ideographic. But it is my view that it is not the visuality of the characters but the structure of Japanese which differentiates it from Chinese and makes it a unique communication system.

We say in Japanese "I You see" (Watakushi wa Anata o miru) so far as the order of the words is concerned; in English we say "I see You." The difference in the position of the object indicates the priority in communication: in Japanese, the object "you"; in English, the verb "see". In Japanese the subject is linked to the object directly, whereas in English it is necessary to have a predicate in advance of the object. If we take the subject as "I," as in the above sentence, it is in English that the ego must be set up at a distance from the object. This is in opposition to Japanese, where the syntagmatic contiguity of subject and object (unmediated as it were by the predicate) makes for the assumption of a pre-established ego. In English it is the subject that is most strongly emphasized; this is not so in Japanese.

During the past few years I have studied the structural relationships of language and video using English as a base, although quite naturally I have always had Japanese in mind. Video is a unique system for applying this study of comparative linguistics. It is capable of recording image and sound simultaneously. And in the closed-circuit system (which is self-referential), a camera (observer) is fed back by the monitor (observed) so that the image not only refers to the object which is shot but is also able to refer back to the subject - the observer who is shooting. This constitutes a sentence-like structure. In language too what I am concerned with is not a word as object, but a sentence and its structure (as in my example above).

Sergei Eisenstein was the first to study extensively the relation of image (in film) and language. A pioneer in the creation of the "montage theory," he also engaged in research on Chinese characters and analyzed the ideographic elements in detail; his theory of montage, in fact, developed out of this work.

As an example (though not necessarily one of Eisenstein's), the combination of the word SUN and the word MOON makes the word ILLUMINATION. This combining technique Eisenstein adapted in film editing: montage. Of course, Eisenstein's theory is far more complex than this, but his main interest remained in the word as object, as in the above instance, in the individual shot as image. He correlated word to shot so that, in his theory, the shot is treated as a single photographic object just as the word is separated from the context of its sentence.

Thus, in terms of his theory, there is an analogy between word and image (shot); but the theory does not relate them to the larger internal structure, their contexts - that is, the interrelation of word and sentence, and of shot and the observer. Because of this, I find Dziga Vertov, Eisenstein's contemporary, more interesting and closer to our concerns in his theory of the "kino-eye," as displayed in such works as The Man with a Movie Camera. Vertov dealt with the camera eye and the point of view of the observer - "The Man with the Movie Camera" - in relation to the observed.

What I have tried to do in video in terms of the relation of image to language is to include the observer "I" (the subject) as an integral part of the system and sentence. It is the structure of "seeing" involved for both the observer and the observed which is posited by the closed-circuit system of video. In contrast to the case in film, in closed-circuit video (which connects the camera and monitor in a circuit) it is possible to see the observer and the observed at the same time; there is a feedback in the time / space relation. The relation of the observer and the observed is taken on the language level as "I" and "You" - as in the quoted "I see you." The very sentence is adapted into a closed-circuit system in which a camera and monitor mediate between "I" and "You." And the one with the camera is not only the "I" who observes, but also the "You" who is observed.

This means that an image functions as the observer as well as the observed in relation to speech; these are double functions, and speech has an importance equal to image, either working separately (de-synchronized) from or identically (synchronized) to the image. The words "I" and "You" switch roles according to what is on and off of the screen (monitor) and assume each other's position.

This may be seen, for instance, if the picture says "I" and an off-screen voice says "You" or, conversely, if the picture says "You" and an off-screen voice says "I." As these words have different possible combinations with pictures, the roles of observer and observed shift accordingly. It is the direction of the observation which is affected; the direction from subject to object and from object to subject is interrelated. Therefore, an identical image has different implications according to the description of the words; and, of course, an identical speech is also able to indicate different images.

All of these are taped using the feedback control of a closed-circuit system which permits manipulation of image and sound simultaneously at different levels. The relation of the observer and the observed therefore has the structure of a round-trip.

While I was recording the sound in English to show my videotapes to an American audience, Japanese always came into my mind and I could not avoid comparing English with Japanese. What interested me most was that in Japanese the object comes immediately after the subject. When I direct a camera toward the observed "You," what I am looking at through the viewfinder is "You." If I verbalize the situation it approximates the "I You see" of Japanese more closely than it does the English "I see you." It is in Japanese that the object has to be confirmed first before the appearance of the predicate. In other words, the structure of Japanese is closer to the camera-eye than that of English, so that one may say that Japanese is a language oriented to visual objectivity.

Actually, I often recognized, while recording the image into English simultaneously with Japanese, that there were certain delays: first, in the relation of the language to the image, and, second, in the relation of English to Japanese because of the English predicate coming before the object.

Furthermore, it is common practice in informal Japanese - written as well as spoken - to omit the subject. That is even closer to the camera-eye: the eye which never indicates who is the subject. When we say "(I) You see" - (Watakushi wa) Anata o miru - omitting the subject, it is obvious who is the subject in certain cases, but there is also the implication of ambiguous possibilities concerning the identity of the subject. (Japanese is a more speaker / situation oriented language than English, so that little misunderstanding occurs when the subject is not directly referred to, except in some obvious cases and in some less clear-cut.) This appears very much like a camera-eye / image which is realized with no subject, and yet it is fully communicable.

The tapes I made in 1975 and in 1976, Observer / Observed and Observer / Observed / Observer, are developed out of the basic pattern of "I see you who is shooting me" and "I see myself who is shooting you." These compound sentences are set up by a pair of facing cameras and monitors which are fed back by each other; therefore exact transfer of sentences is made possible, and sentences are made switchable according to the image of who is shooting whom. The works are composed in advance with the diagrammed program and then taped.

(Reprinted from Art & Cinema, December, 1978, New York, pp.16-22)