Trans Internet-Zeitschrift für Kulturwissenschaften 15. Nr. Juli 2004

1.2. Signs, Texts, Cultures. Conviviality from a Semiotic Point of View /
Zeichen, Texte, Kulturen. Konvivialität aus semiotischer Perspektive

HerausgeberIn | Editor | Éditeur: Jeff Bernard (Wien)

Buch: Das Verbindende der Kulturen | Book: The Unifying Aspects of Cultures | Livre: Les points communs des cultures

Grundlagen/Fundamentals Teil 1/Part 1:
Teil 2/Part 2:
Moderation / Chair: Astrid Hönigsperger
Teil 3/Part 3:
Teil 4/Part 4:
Nonverbale Zeichen/Non-verbal Signs

Verbal/Nonverbal Interaction
(Cognitive and Functional Viewpoints)

Natalya V. Sukhova (Moscow)


Summary: The article investigates two viewpoints on verbal/nonverbal interaction. Verbal and nonverbal sign systems interact in a communicative act. There is a certain zone of their interaction which can be viewed through cognitive and functional approaches. The cognitive approach deals with the production of verbal and nonverbal signs: semiosis starts on the level of symbolic representation and continues through motor programming. The process consists of: intention - meaning - forms of mental representation (verbal and nonverbal) - concrete means of expression (words and gestural movements). The functional approach concentrates on the interrelation between verbal and nonverbal units in a communicative act.


The study of the interaction between gestures and speech is being carried out from different perspectives; the evolutionary, cognitive and functional ones are the most notable. The necessity of such study is evident, because verbal and nonverbal behavior are associated in motor execution and in perception, and these two systems may relate in several ways that experimental analysis has to distinguish.

We shall consider the cognitive and functional levels of verbal/nonverbal interrelation with reference to the experimental data.


The cognitive approach to gesture-speech interaction

Theoretical considerations

The cognitive approach deals with two major problems: the production and perception of language and gestures. Its aim is to analyze the physiological, psychological and cognitive mechanisms of speech production and speech perception and to study the place of gesture in them.

Firstly, we shall turn to the mechanisms of speech production in general, as they are identical to the production mechanisms which operate when the utterance is accompanied by a gestural phrase.

Psychological and psycholinguistic investigations have shown that the processes of production and perception of any behavioral act, including speech acts, develop in similar stages. This means that the way we produce speech units is practically the same as the way we perceive them; the only difference is the direction of the processes. Such similarity enables researchers to study the production process, which is more difficult to investigate, through perception, and thus to obtain data about it. Hence, there are a number of works devoted to the process of speech production and its different stages.

Their general outline is as follows. There are four large functional blocks or planes in any speech act production: 1) a plane of orientation; 2) a plane of utterance forming; 3) a plane of realization; 4) a plane of control. For quite a long time only the plane of realization was in the forefront. However, it is the mechanisms of orientation and utterance forming which are the most important in the production process.

The plane of orientation starts from an initial motive to speak, which develops into an idea that stimulates speech intention.

The plane of realization triggers first interior and then exterior speech. Thus a mechanism of interior programming of an utterance is activated. The transition from the program to a syntactic/grammatical structure of a sentence is then accomplished through mechanisms of grammatical prognosis. Next there is a search for an appropriate word by certain semantic and phonetic attributes. The following stage is motor prognosis, with phonetic operations and the filling in of a programmed form. The result is realized, sounded exterior speech.

The plane of control is necessary to compare the resulting speech utterance with the initial speech program. If needed, the utterance is corrected.

These planes also operate in the case of a mixed utterance, when the speech utterance is accompanied by a gestural phrase.(1) A gestural accompaniment involves some peculiarities. We shall consider the basic planes and the most important mechanisms of utterance production with gestural involvement.

Firstly, the speech mechanism is triggered by a motive and a communicative intention. Speech intention, or the aim of the utterance, is based on memory, motivation, afferent elements and a starting stimulus. Thus a motive triggers an idea, which develops and takes form, and the whole process then results in an interior/exterior utterance (verbal and nonverbal). The aim of a speech act forms the meaning of the future utterance.

Some scholars argue that speech and gesture function together to convey one and the same meaning (McNeill et al. 1994, Kendon 1997, etc.). However, as studies have shown, there is no semantic redundancy in their co-functioning. Speech and gesture evoke certain conceptual changes in the listener, forming a single cognitive and semantic unit.

Secondly, after setting the aim, the mechanisms of symbolic representation are activated. Both phenomena - speech and gesture - are forms of mental representation. They are an entity on the deep representational level, but then they go different ways. Meanings are not transformed into a gestural unit through linguistic formats; they are transmitted directly and independently (cf. Kendon 1987). The production of a gestural and a linguistic sign is understood as two aspects of one and the same representational process, though they are organized separately from each other. Moreover, the channels of transmission are different.

Thirdly, the process of realization is very complicated to study. However, some attempts have been made. For instance, Schegloff (1986) suggested that there is a "projection space" where a word appears for the first time (the programming stage)(2). It stays there at least "as early as the thrust/acme or perhaps even the onset of the gesture selected or constructed by reference to it" (Schegloff 1986: 278). Hence, a gesture can be finished before the appearance of its affiliated word, or it can coincide with it. However, there are cases when the word takes the lead over the gesture. This study has shown that there is a well-organized program of actions for producing an utterance with gestural accompaniment. Moreover, the semantic prognosis is realized more quickly on the gestural level than on the verbal one; at that moment the words are still at the stage of deep motor programming.

Summing up, the interrelation between the gestural and verbal parts of an utterance starts when the process of aim-setting starts, when one meaningful cognitive entity is formed (see Fig. 1). Then the stage of programming follows: meaning is embodied in an utterance(3); moreover, a gestural phrase sometimes outpaces the verbal one already in the zone of symbolic representation. The motor programming stage is characterized by the forming of a common meaning and its packing into verbal and gestural formats, which come out as speech signals and certain gestural movements. The realized utterance is then corrected against the initial model if necessary.

Fig. 1: Production of utterance with gestural accompaniment / Planes of production


The functional approach to gesture-speech interaction

Theoretical considerations

The functional approach concentrates on joint speech-gesture functioning in a communicative act, on their functional roles. It aims at defining the basis on which a comparative analysis of functional gesture-speech interaction can be carried out.

We shall consider two main points in speech-gesture co-functioning, namely: 1) the elementary level (a level of verbal-nonverbal constituents); 2) the communicative level (a level of their functional roles).

Verbal as well as nonverbal sign systems are grounded on three essential parts: a) elements of a system; b) functioning of these elements within its own system in space and time; c) co-functioning with other systems.

As far as elements are concerned, they have a form; a meaning; syntactic relations (collocation of elements); and pragmatic relations (sign-speaker, sign-listener and sign-situation relations). Verbal elements/signs are quite clear in that respect. Nonverbal signs need some clarification.

The nonverbal system can be roughly viewed as the following structure (see Fig. 2).

Fig. 2: Parts of the nonverbal system, their elements and hierarchical structure

In our work we deal with a kinetic code (gestures and gestural movements) wherein hands, head, facial expressions, body movements and postures (all of them being mostly illustrators) are regarded from the communicative aspect.

It seems important to mention some general characteristics of illustrators, since they are the focus of our further attention.

Illustrators are speech-related gestures serving to illustrate what is being said verbally. They may relate to phrases, content of utterances, melodic contour, loudness, etc. Most frequent here are the gestures of hands.

Origin. Illustrators are "socially learned through imitation" (Ekman/Friesen 1981: 74). They have ethnic, cultural, social peculiarities, which show themselves through different types of illustrative gestures and their usage frequency.

Coding. Practically all illustrators are either iconically or intrinsically coded. Iconic coding means that the nonverbal illustrative act stands for something else, external to the action, but looks in some way like what it means (e.g. a hand gesture picturing a fish of a big size). Intrinsic coding presupposes that the meaning of the nonverbal act is intrinsic to the action, but does not resemble its signified (e.g. a hand gesture imitating the movement of shutting a book with effort - it is one form of being irritated, but not irritation itself).

Gestural structure. The structure of gesture, its minimal unit, is considered to be the most difficult problem to study. The elementary constitution of gestures is based on the notion of the "kin" (Birdwhistell 1952). It should be mentioned that a kin is produced in a similar way to words. Accordingly, there are three phases of gestural "articulation" (see Kreidlin 2001): excursion (the initiation of a gesture, its introduction, when the gesture form is being prepared); realization/production (the production of the gesture, which has a peak or culmination); and recursion (the close of the gesture). An example of a kin is a gaze.

As for the "global" gestural structure, nowadays we can summarize some significant investigations in the field and say that there are:

a) Kinemes (in Birdwhistell's terms), or nonverbal signals of simple structure (for the details see Konetskaya 1997). They are one-component signs (e.g. a nod);
b) Clusters of kinemes, or nonverbal signals of compound structure. There are several components involved (e.g. a gaze and a gesture denoting some entity, as they always go together);
c) Complex of kinemes, or nonverbal signals of complex structure. A complex of gestures (e.g. a gaze, a hand movement and a posture to express some feeling).
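The three-level structure above can be read as a nesting of containers: kinemes combine into clusters, clusters into complexes. A minimal sketch of that reading follows; the class names, attributes and the example kinemes are purely illustrative, not drawn from Birdwhistell's notation.

```python
# Illustrative sketch: the three levels of "global" gestural structure
# modelled as nested containers. All names and examples are hypothetical.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Kineme:
    """A one-component nonverbal signal of simple structure (e.g. a nod)."""
    articulator: str   # e.g. "head", "hand", "gaze"
    form: str          # e.g. "nod", "point", "fixate"

@dataclass
class KinemeCluster:
    """A compound signal: several kinemes that always occur together."""
    kinemes: List[Kineme]

@dataclass
class KinemeComplex:
    """A complex signal: a whole configuration expressing e.g. a feeling."""
    clusters: List[KinemeCluster] = field(default_factory=list)

    def components(self) -> List[Kineme]:
        """Flatten the hierarchy back into its elementary kinemes."""
        return [k for c in self.clusters for k in c.kinemes]

# e.g. a gaze plus a pointing gesture that together denote some entity:
cluster = KinemeCluster([Kineme("gaze", "fixate"), Kineme("hand", "point")])
complex_ = KinemeComplex([cluster])
print(len(complex_.components()))  # prints 2
```

The point of the sketch is only that a complex remains decomposable into its elementary kinemes, which is what makes the elementary and the "global" descriptions compatible.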

The elementary gestural distinctions are not always used, as there are not many studies which need them (see, for example, G. Calbris 1999, and others). Most scholars are interested in the flow of gestures in speech acts. Hence, it is important to distinguish larger gestural units. For instance, it has been suggested that gestural movements are organized into sequences of phrases (Kendon 1987), which, in their turn, form hierarchically higher levels of phrases, such as superphrasal unities and texts, compatible with speech organization.

Position in an utterance. As has been found in the literature, an illustrative gesture can occupy several positions in the utterance. First, the gesture may coincide with hesitational elements in the phrase; secondly, it may develop simultaneously with the speech phrase. The latter position is more common, which is supported (cf. Kendon 1987) by the fact that a gestural phrase usually overlaps with a tone unit (intonation group).

Usage/Functions. To give an outline, there are a number of functional classifications of illustrators. They are based either on the information which gestures convey (Ekman/Friesen 1981, McNeill et al. 1994, Kreidlin 2001) or on the role of illustrators in the dialogue (Bavelas 1994, Fox 1999).

The widespread edited classification from the first group is as follows:

- batons (emphasizing words);
- underliners (emphasizing sentences);
- ideographs (sketching a direction of thought);
- kinetographs (depicting actions);
- pictographs (showing objects);
- rhythmics (depicting the rhythm or tempo of an event);
- spatials (depicting spaces);
- deictics (pointing to objects);
- identificators (showing style and manner of behavior);
- externalizers (reactions to what has been said: hand movements, body movements, etc.).

The second group is represented by the works of Bavelas (1994), who stresses the role of gestures in a communicative act: they are either topic- or listener-oriented. Topic gestures depict some aspect of the topical content of the conversation (e.g. the size of an object); they may shed more light on speech-gesture interaction. Interactive gestures, those oriented towards the listener, are a much smaller group; they provide no information about the topic at hand, though they serve many functions necessary for dialogue (e.g. a gesture marks material which is probably already known to the addressee).

To put it generally, nonverbal illustrative acts are gestures which accompany speech utterances. We have mentioned the structural peculiarities of illustrators and their functional classifications. They have a form and they play various roles in the speech flow. They interact and co-function with the speech utterance.

Thus, the study of verbal/nonverbal interaction on the functional level can be approached by investigating the peculiarities of verbal/nonverbal elements or their communicative value within the communicative act.


Cognitive and functional gesture-speech interaction

We have conducted an experiment aimed at studying different ways of verbal/nonverbal interaction, namely: what are the relations between the prosodic nucleus of an utterance and the gestures (kinetic forms) concomitant with it?

The empirical material gives rise to the following hypothesis: there is a definite interrelation between prosodic and nonverbal units, which may even form prosodio-nonverbal complexes in certain speech acts realizing the intentional possibilities of an utterance.

Thus, the hypothesis predicts a specific relationship between nonverbal and verbal behaviors, and a difference in this relationship depending on the subjects' social and age group.

There are several essential stages in solving the concrete tasks:

a) to analyze how the nucleus of an utterance and a gestural phrase interact to convey the aim of the utterance (the cognitive level of utterance production);
b) to analyze how certain gesture combinations (gestural phrases) collocate with the prosodically prominent nucleus of an intonation group, or an utterance (the communicative/functional level).

Experimental material

We considered two documentaries produced by British film companies(4). The experimental corpus comprised 363 meaningful fragments, of which 59 episodes were chosen for the purposes of this work. The overall recording time of the 59 episodes is 32 minutes.

All participants are representatives of the upper middle class and upper class of English society. Their ages vary from 40 to 70.

Thus, social status, age and sex are relevant and rather decisive characteristics, social status being the most essential.


The analysis consisted of three main parts: auditory, visual and computational. The material was first selected and then processed aurally and visually by the author and informants. The data were entered into elaborated protocols: a database combining prominent visual (kinetic forms) and auditory (nuclei of utterances) data. The next stage was to process the data with computer programs, first Sound Edit 16 (v. 2.0.7) and then Praat v. 4.1.3.

Results and discussion

The following preliminary results can be distinguished.

There are a number of intentions the speaker tends to convey. N.K. Ryabtseva (1994) considers that the subject utters something if he wants:

a) to communicate new information he has;
b) to involve the listener into the changing world;
c) to express his emotional and rational attitude towards the changes in the world and in himself.

The intentional possibilities of the speaker are realized in communicative and pragmatic types of utterance. I.E. Galochkina (1985) suggests four sets of pragmatic types, each of which has a center and a peripheral part. These types are statements, commands, estimations and phatic utterances, with an intermediate zone of interrogations. The core of this system consists of types with a definite pragmatic meaning; at its edges there are types with several pragmatic intentions. The conclusion drawn by the researcher is the most valuable one: prosody conveys only some pragmatic meanings, and the true communicative intention is formed only as a result of the correlation of different elements from different levels. Thus, within a certain pragmatic type there will be various modifications of prosodic meanings, which function as a complex with all other elements in a communicative act.

Our material is characterized by three pragmatic types: statements, estimations and commands. The combination of estimation and statement is the most frequent, occurring in practically 100% of the episodes. The estimation/statement type means that the speaker intends to express his/her attitude towards people, a situation, an action, an event, etc., and at the same time conveys some new information about the subject, which can likewise be people, a situation, an action, an event, etc. The emotional expression of the meaning varies significantly, from positive to negative (kind, joyful, sympathetic, objective, persuasive, subjective, indignant, sarcastic, indifferent, etc.). Thus, the intentions of the utterances which have been elicited may be generally viewed as follows:

a) to tell about (objectively):

- event,
- reaction to it,
- person,
- his/her inner world,
- time context,
- opinions,
- relationships/feelings

b) to describe (subjectively):

- situation,
- person,
- people,
- surroundings/atmosphere,
- actions,
- events,
- feelings/opinions,
- different attitudes towards people, yourself, things

c) to estimate (subjectively):

- person

d) to give one's own opinion (subjectively and emotionally charged)

Prominent nonverbal elements function in all pragmatic types. In the combined statement/estimation type there are gestural phrases(5) which repeat, double and add something to the meaning of the speech utterance. In the pure statement type, gestures repeat what is said verbally. In the pure estimation type, gestures add something to the verbal meaning, strengthen it, or substitute for a verbal phrase with identical meaning. There is only one case of a command, where gestures repeat and add something to the verbal phrase.

Thus, gestural phrases (complexes of gestural movements) function in pragmatic types for certain reasons (the intention of the utterance). To convey the utterance intention they work together with prosodic nuclei and form certain prosodio-nonverbal complexes.

Summing up all the data, we obtain the resulting table (Fig. 3).

Fig. 3: Interrelation between prominent prosodic elements (nuclei) and prominent nonverbal elements (%)

Hand movements are used in 54% of the 59 cases; 93% of the cases involve head movements; 95% show notable changes in facial expression; 75% are characterized by changes in body position and 15% by changes in posture. Hence, the nonverbal elements may be ranked in the following order, starting from the most used: facial expressions, head movements, body movements, hand movements and postures.

The figure of 100% occurs for hand movements in 27% of the 59 cases, meaning that in 27% of the cases the prosodic nuclei are invariably accompanied by changes in hand movements; the corresponding figure is 73% for head movements, 69% for facial expressions, 19% for body movements and 2% for postures. Hence, it is head movements which accompany prosodic nuclei in the greatest number of cases.

It is also important that there are cases of complete coincidence of each prominent prosodic element with each nonverbal element (16 of the 59 cases in all). Of these 16 cases, 38% show full coincidence of nuclei and hand movements; 100% involve head movements and facial expressions; and 50% show changes in body movements.
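The per-channel percentages reported above are simple tallies over the episode protocols. The sketch below shows the arithmetic on a simplified record format (one boolean per nonverbal channel per episode); the sample records are invented for illustration and do not reproduce the study's corpus.

```python
# Hypothetical sketch: tallying how often each nonverbal channel shows
# prominent changes across episodes, as in Fig. 3. Records are invented.
CHANNELS = ["hands", "head", "face", "body", "posture"]

# Each protocol entry: which channels showed prominent changes in one episode.
episodes = [
    {"hands": True,  "head": True, "face": True,  "body": True,  "posture": False},
    {"hands": False, "head": True, "face": True,  "body": False, "posture": False},
    {"hands": True,  "head": True, "face": False, "body": True,  "posture": True},
    {"hands": False, "head": True, "face": True,  "body": True,  "posture": False},
]

def channel_percentages(records):
    """Percentage of episodes in which each channel is active."""
    n = len(records)
    return {ch: round(100 * sum(r[ch] for r in records) / n) for ch in CHANNELS}

pct = channel_percentages(episodes)
# Rank the channels from most to least used, as done in the text:
ranking = sorted(CHANNELS, key=lambda ch: pct[ch], reverse=True)
print(pct)
print(ranking)
```

On these invented records the tally gives head 100%, face 75%, body 75%, hands 50%, posture 25%, and the ranking follows directly from sorting the percentages.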

The next stage of this study concentrates on the concrete correlation between prosodic nuclei (certain tones) and definite gestural phrases. This work is currently in progress.

© Natalya V. Sukhova (Moscow)


(1) People do not always gesture while speaking.

(2) Schegloff considered the gestural stresses on stressed words and the iconic gestures accompanying certain lexical units.

(3) We should underline once again that we are dealing with the utterances concomitant with gestures.

(4) Diana. Princess of Wales - Her Life (1997), Churchill (1997).

(5) These are the preliminary results of pragmatic type - gestural function relations.


Bavelas, Janet B. (1994). "Gestures as part of speech: methodological implications". Research on Language and Social Interaction 27(3): 201-221 [Lawrence Erlbaum Associates, Inc.]

Birdwhistell, Ray L. (1952). Introduction to Kinesics: An Annotation System for Analysis of Body Motion and Gesture. Louisville: University

Calbris, Geneviève (1987). "Geste et motivation". Semiotica 65(1/2): 57-96

Ekman, Paul & Wallace V. Friesen (1981). "The repertoire of nonverbal behavior". In: Sebeok, Thomas A.; Umiker-Sebeok, Jean & Adam Kendon (eds.). Nonverbal Communication, Interaction, and Gesture. Selection from Semiotica (=Approaches to Semiotics 41). The Hague-Paris-New York: Mouton Publishers, 46-105

Fox, Barbara (1999). "Directions in research: language and the body". Research on Language and Social Interaction 32(1-2): 51-59 [London: Lawrence Erlbaum Associates, Inc.]

Galochkina, Irina E. (1985). Rol' intonatsii v formirovanii pragmaticheskikh tipov vyskasyvaniy. Kandidatskaya dissertatsiya. Moscow: Ms., 186p.

Kendon, Adam (1987). "On gesture: its complementary relationship with speech". In: Siegman, Aron W. & Stanley Feldstein (eds.). Nonverbal Behavior and Communication. 2nd ed. London: Lawrence Erlbaum Associates, Publ., 65-97

- (1994). "Introduction" to the Special Issue: "Gesture and Understanding in Interaction". Research on Language and Social Interaction 27(3): 171-173 [London: Lawrence Erlbaum Associates, Inc.]

Konetskaya, Victoria P. (1997). Sotsiologiya kommunikatsii. Uchebnik. Moscow: Mezhdunarodnyi universitet bisnessa i upravleniya "Brat'ya Karich", 302ff.

Kreidlin, Grigory E. (2001). "Kinesika". In: Grigor'yeva, S.A.; Grigor'yev, N.V. & G.E. Kreidlin. Slovar' yazyka russkikh zhestov. Moscow-Vienna: Yazyki russkoi kultury, 166-254

McNeill, David; Cassell, Justine & Karl-Erik McCullough (1994). "Communicative effects of speech-mismatched gestures". Research on Language and Social Interaction 27(3): 223-237 [London: Lawrence Erlbaum Associates, Inc.]

Ryabtseva, Natalya K. (1994). "Kommunikativnyi modus i metarech'". In: Arutyunova, Natalya D. (ed.). Logicheskiy analiz yazyka: Yazyk rechevykh deistviy. Moscow: Nauka, 82-93

Schegloff, Emanuel A. (1986). "On some gestures' relation to talk". In: Atkinson, Maxwell J. & John Heritage (eds.). Structures of Social Action. Studies in Conversation Analysis. Cambridge: Cambridge University Press, 266-296




For quotation purposes:
Natalya V. Sukhova (Moscow): Verbal/Nonverbal Interaction. (Cognitive and Functional Viewpoints). In: TRANS. Internet-Zeitschrift für Kulturwissenschaften. No. 15/2003. WWW:

Webmeister: Peter R. Horn     last change: 2.7.2004    INST