Encoding certainty: On some epistemic modality markers
in English and Polish research articles. The case of MUST / MUSIEC

Krystyna Warchal (University of Silesia, Poland)



1. Introduction

Once believed to be a realm of objectivity and impersonality, with language acting as a transparent tool for the description of facts, transmission of knowledge and identification of problem areas, academic communication is now widely regarded as a highly rhetorical field of discourse, with speakers/writers pursuing goals that go beyond a simple reflection of the world and planning their discourse actions with these goals in mind (see, e.g., Hyland, 2005; Koutsantoni, 2004; Myers, 1989).

Apart from transmitting information about facts and setting up hypotheses, academic authors aim at convincing their audiences that the matter discussed is indeed important to the field, that their point of view is well supported by data, that the analysis is carried out in an objective way and that their knowledge of the field is extensive enough to sanction claims; in other words, they make a bid for acceptance.

It has also been observed that the world of academic discourse is far from homogeneous, that its values, norms and practices are not universal, and that there are often cultural differences in the kinds and hierarchy of goals that communicators set and in the preferred strategies for pursuing them (see, e.g., Clyne, 1987; Duszak, 1997; Fløttum et al., 2006).

This paper addresses the problem of such culture-sensitive interpersonal meanings in academic written discourse by looking into epistemic modality markers in one academic genre, the research article. More specifically, it attempts to discuss the meaning and function of the high-value modal auxiliary MUST and its Polish modal lexical equivalent MUSIEC in two (English and Polish) corpora of research articles, 200 papers each, with a view to identifying possible differences in the way the psychological state of certainty is coded in these two languages.

2.  Epistemic modality markers in English and Polish

Epistemic modality encodes the speaker’s commitment to the expressed proposition and his or her assessment of its probability. While encoding a message concerning a particular state of affairs, the sender expresses certainty, belief or doubt about its actual occurrence, whether in the past, at the moment of speaking or in the future. As Tutak (2003: 63) observes, the problem is not whether the statement concerning a particular state of affairs and the state of affairs in the real world tally, but how the relation between the two is construed by the speaker or writer.

Epistemic modality expresses either the possibility or the necessity that something is, or is not, the case (Palmer, 1979: 41), with epistemic possibility encoding the speaker’s lack of confidence in the proposition expressed and epistemic necessity relaying the speaker’s confidence in the truth of the statement (Coates, 1983: 41, 131), as shown in Examples 1 and 2 respectively.

  1. This may be entirely correct, but how to apply this thinking to the present circumstance with precision is not entirely clear. (LP2001-8)
  2. There must be something wrong with the question. (LS2004-3)

In her cross-linguistic study of epistemic modality in research articles, Vold (2006: 65) identifies epistemic modality markers according to the following criteria:

  1. the marker must explicitly qualify the truth value of a certain propositional content (to the exclusion of such verbs as propose, which, being reporting verbs, contribute to the propositional content and, if they qualify it at all, do so only implicitly);
  2. it must be a lexical or a grammatical unit.

The next two subsections explore the meaning of epistemic MUST and MUSIEC, the modal verbs imparting the highest degree of epistemic necessity and commitment in English and in Polish.

2.1   Epistemic MUST

MUST can express obligation (the essential root meaning), root necessity (weaker root meaning) and epistemic necessity (Coates, 1995: 145f). Epistemic MUST is best paraphrased as ‘the only conclusion is that…’. “It essentially makes a conclusive judgment, usually from evidence of some kind” (Palmer, 1988 [1965]: 122).

Discussing the epistemic meaning of MUST, Coates (1983: 41) suggests that it should be viewed along a cline extending from the subjective core MUST of the ‘I-confidently-infer-that-x’ type to the objective peripheral MUST of the ‘in-the-light-of-what-is-known-it-is-necessarily-the-case-that-x’ type. Coates lists the following features of the core epistemic MUST:

  1. Main predication refers to state or activity in the present or past (must have).
  2. Subject is frequently inanimate . . .
  3. Verb is usually stative.
  4. Speaker expresses confidence in truth of utterance. (Coates, 1983: 42)

As regards the peripheral meaning of MUST, Coates (1983: 43) also adds (rare) instances of

  1. reference to states and activities in the future; and
  2. lack of speaker’s involvement with MUST expressing pure logical necessity.

2.2   Epistemic MUSIEC

According to Słownik języka polskiego (1994), epistemic MUSIEC indicates the probability that the action expressed by the lexical verb has indeed taken place or is taking place. As an epistemic modal operator, it scopes over the subject, modalising the entire proposition and imparting a high degree of confidence in the truth of the proposition at the time of speaking (Ligara, 1997: 96). With regard to epistemic MUSIEC, Ligara observes that:

  1. it occurs in the present and past forms but the tense marking affects the lexical verb (infinitive), not the modal predication (which refers to the moment of speaking); and
  2. the present tense form may combine with present or future time reference of the lexical verb, a distinction signalled by the aspect of the infinitive (perfective for future time reference in relation to the moment of speaking; Ligara, 1997: 100).


3.   Epistemic MUST and MUSIEC in research articles

3.1 The corpus

This paper attempts to investigate the ways in which the epistemic modal verbs MUST and MUSIEC are used in English and Polish research articles to impart the highest degrees of commitment to claims, and in this way to express the psychological state of certainty. The study was conducted on two corpora of texts (English and Polish), each consisting of 200 research articles published in the years 2001-2006 in linguistics-related journals, each journal contributing at most 40 articles of varied length.

The English corpus included electronically available papers published in the following internationally recognised journals: Journal of Pragmatics, Language and Communication, Language Sciences, Lingua, and Linguistics and Philosophy. The total number of words in the corpus was about 2,400,000. On the basis of the affiliation notes of the first two authors it was assumed that the writers had a native-like command of English.

The Polish corpus comprised research papers published in the following journals: Acta Baltico-Slavica, Biuletyn Polskiego Towarzystwa Językoznawczego, Etnolingwistyka, Język a Kultura, Onomastica, Poradnik Językowy, Slavia Meridionalis, and Studia z Filologii Polskiej i Słowiańskiej, all of them included in the 2003 list of Polish scientific journals issued by KBN (Polish Committee for Scientific Research). The total number of words in the corpus was approximately 1,000,000. The first two authors of each article were checked for affiliation at Polish academic institutions. On this basis it was assumed that the articles were written in good-quality academic Polish. Since few Polish journals make their issues available in electronic form, most of the Polish material was scanned with an HP Scanjet G3010, and the scans were manually checked for accuracy, time permitting.

The corpora were searched with Oxford WordSmith Tools 4.0 for Windows for occurrences of MUST and MUSIEC (including inflected forms). The resulting concordance lines were saved as 160-character sequences available for immediate survey. This context was wide enough to eliminate occurrences of the search words in examples and instances of mention (as contrasted with use). This produced a list of 1,662 entries for the English corpus and a list of 278 entries for the Polish corpus. Further analysis was conducted by opening each file separately to remove direct quotations, which produced lists of 1,549 and 256 records for MUST and MUSIEC respectively.
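The search step described above can be approximated in a few lines of Python. This is only an illustrative sketch, not the procedure used in the study (which relied on WordSmith Tools): the function name `kwic`, the two search patterns and the sample sentences are assumptions introduced here, and the Polish stem pattern is deliberately rough (it would also match unrelated words such as "muszla"), so its hits would need the same manual inspection as the original concordance lines.

```python
import re

def kwic(text, pattern, width=160):
    """Return each match of `pattern` with roughly `width` characters of context."""
    half = width // 2
    hits = []
    for m in re.finditer(pattern, text, flags=re.IGNORECASE):
        start = max(0, m.start() - half)
        end = min(len(text), m.end() + half)
        # Flatten line breaks so that each hit reads as one concordance line.
        hits.append(text[start:end].replace("\n", " "))
    return hits

# Hypothetical search patterns (assumptions, not the queries used in the study).
MUST = r"\bmust\b"
MUSIEC = r"\bmu(?:si|sz)\w*"  # rough stem match for inflected forms; over-matches

sample_en = "There must be something wrong with the question. A mustard seed is small."
sample_pl = "Czasowniki te muszą reprezentować predykaty. Nazwa musiała mieć związek z rzeką."

for line in kwic(sample_en, MUST) + kwic(sample_pl, MUSIEC):
    print(line)
```

Examples, instances of mention and direct quotations would still have to be removed by hand from such a list, as was done for the two corpora.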

Next, instances of MUST and MUSIEC were classified by modality type: root or epistemic. From this stage on, the analysis was limited to epistemic uses: firstly, occurrences of MUST and MUSIEC in main and subordinate clauses were identified; secondly, it was determined which instances were attributed to other authorities or specific points of view; and thirdly, the records were classified as epistemic proper, inferred evidential or quotative evidential (the last-mentioned group overlapping with attributions; Examples 3-5 respectively). Further discussion was limited to main clauses only.

  3. The choice of which of the two homonyms not to use and which to use must have been based on both the availability of a synonym for English and the increased usage and therefore increased (primary) unmarked status and importance of English. (JP2003-2)
  4. If and-parentheticals are dysfluencies, then they must be regarded as dysfluencies that arise in the pursuit of relevance. (JP2005-10)
  5. But as Peirce tells us, and as we are reminded by Keane (this volume), icons and indexes in and of themselves ‘assert nothing’ (Peirce, 1955, p. 111); there must be some means of construing these signs as significant. (LC2003-8)

3.2  Epistemic MUST in research articles (*)

Among the 1,549 occurrences of MUST in the English corpus of research articles, only 16% carried the epistemic meaning (Fig. 1), which tallies with the results obtained by Keck and Biber (2004) for the written subcorpus of academic texts (the T2K-SWAL Corpus). In 7% of cases the status of MUST was unclear either because of ambiguity (unresolved by the context) or because of a merger of epistemic and root senses.

Fig. 1: Root and epistemic MUST

More than half of the epistemic uses were found in main clauses (56%), with 44% of epistemic MUST recorded in subordinate clauses (Fig. 2). Among the main clause occurrences, 6% were found to be attributed (3% of all epistemic records; Ex. 6); attributed uses in subordinate clauses amounted to 30% of subordinate clause occurrences (13% of all epistemic records; Ex. 7). Altogether, 16% of all epistemic records of MUST were found to be attributed.

  6. [Instead] of CP being targeted for deletion, as Hiraiwa and Ishihara and others propose, it must be TP which is deleted, parallel to the English structure, since otherwise *t2 would be eliminated and the structure should be grammatical, contrary to fact. (LP2005-10)
  7. Bolkestein (1998, p. 211) comments that topicality and focality must arise in the pragmatic component of a comprehensive linguistic model, though she does not go so far as to say that pragmatic function assignment itself belongs to such a component. (LS2005-4)

Fig. 2:  Epistemic MUST in main and subordinate clauses (attributed and authorial)
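The relation between the within-clause and overall attribution shares reported for epistemic MUST can be checked with a short calculation. The sketch below uses only the rounded percentages given in the text (the raw counts are not reported), and all variable names are introduced here for illustration; because the inputs are rounded, the combined figure is only expected to agree with the reported 16% to within about one percentage point.

```python
# Rounded shares reported in the text for epistemic MUST (English corpus).
main_share = 0.56        # epistemic MUST found in main clauses
sub_share = 0.44         # epistemic MUST found in subordinate clauses
attr_within_main = 0.06  # attributed, as a share of main clause occurrences
attr_within_sub = 0.30   # attributed, as a share of subordinate clause occurrences

# Convert the within-clause shares into shares of all epistemic records.
attr_main_of_all = main_share * attr_within_main  # roughly 3% of all epistemic records
attr_sub_of_all = sub_share * attr_within_sub     # roughly 13% of all epistemic records
attr_total = attr_main_of_all + attr_sub_of_all

print(f"{attr_main_of_all:.1%} + {attr_sub_of_all:.1%} = {attr_total:.1%}")
```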

The analysis of the type of epistemic modal meaning was limited to main clauses, with MUST in subordinate clauses awaiting a separate discussion. More than half of the main clause records of epistemic MUST were identified as inferred evidentials (54%; Ex. 8), 40% as epistemic proper (Ex. 9), and 6% as quotatives, i.e. attributed uses (Ex. 10).

As shown in Fig. 3, modality based on indirect evidence (inferred and quotative evidentials) accounted for 60% of all main clause occurrences of epistemic MUST. By attributing the degree of certainty expressed by the modal verb to an external authority, quotatives absolve the speaker from responsibility for the claim and so reduce his or her involvement. Inferred evidentials, in turn, provide a justification for the speaker’s claim, thereby reducing its authoritativeness: receivers can evaluate the soundness of the claim for themselves.

  8. Consequently, the modelling of such a resource must, in some systematic way, integrate the options into a single system. (LS2005-5)
  9. (We might charitably conclude that this is an infelicity of expression; but there must be an element of doubt about the clarity of Chomsky’s thinking here.) (LS2003-5)
  10. And since subjects constitute themselves by outward objectification in (linguistic) expression, this authentic expression must embody itself in a “distinctive way of speaking and/or writing” which can then be claimed as “a language of one’s own,” the study of which becomes a central field for analysis (Cameron and Kulick, 2003, p. xiii)

Fig. 3: Epistemic proper, inferred evidential and quotative evidential MUST in main clauses

3.3   Epistemic MUSIEC in research articles

Among the 256 occurrences of MUSIEC in the Polish corpus of research articles, only 14% were identified as epistemic (Fig. 4). In 5% of the cases it was not possible to establish the prevailing or most likely type of modality at work; these cases remained ambiguous.

Fig. 4: Root and epistemic MUSIEC

Among the few epistemic records of MUSIEC, almost three fourths were found in main clauses (73%). As for attribution, no such instances were noted among epistemic MUSIEC in either main or subordinate clauses (Fig. 5).

Fig. 5: Epistemic MUSIEC in main and subordinate clauses (attributed and authorial)

As with epistemic MUST, the analysis of the type of epistemic meaning was limited to main clauses. The results resembled those for MUST (Fig. 6), except that quotative evidentials were absent from the Polish corpus (no attributed occurrences of MUSIEC were recorded). Thus more than half of the main clause occurrences of MUSIEC were classified as inferred evidentials (57%), the remaining 43% being identified as epistemic proper (Ex. 11 and 12 respectively):

  11. Tego rodzaju nacechowaną diatezę sygnalizować mogą oczywiście wyłącznie tzw. czasowniki przechodnie, czyli takie, które dopuszczają możliwość ujawnienia przez podmiot drugiego argumentu reprezentowanego przez siebie predykatu. Czasowniki mogące sygnalizować tę diatezę reprezentować zatem muszą predykaty co najmniej dwuargumentowe. (BPTJ2001-6)
    [This kind of marked diathesis can of course be signalled only by so-called transitive verbs, that is verbs which allow the subject to disclose the second argument of the predicate they represent. Thus verbs which can signal this diathesis must be predicates with at least two arguments.]
  12. Liczne skupisko nazw terenowych koło Szczecina (6 nazw) musiało pierwotnie mieć związek z wcześniejszą nazwą miejscową lub rzeczną. (ON2003-4)
    [The big cluster of toponyms in the vicinity of Szczecin must originally have been connected with a former name of a place or a river.]

Fig. 6: Epistemic proper, inferred evidential and quotative evidential MUSIEC in main clauses

4.   Concluding remarks

The first important and immediately observable difference between the two corpora was their size. With the same number of articles representing closely related fields, the English corpus proved almost 2.5 times as big as the Polish one (both included complete papers, with bibliographic references, tables and notes, as well as abstracts, if these were published with the article). It seems that the difference in length between standard English and standard Polish linguistic papers is a stable feature, which cannot be accounted for merely by the presence of several very short contributions in the Polish corpus or several exceptionally long ones in the English one.

Another clearly marked difference is the frequency of MUST and MUSIEC in the English and Polish corpora. In the English corpus, there were 1,549 occurrences of MUST (examples, ‘mention’, and direct quotations omitted), while in the Polish corpus there were only 256 records of MUSIEC, which, allowing for the difference in the number of words in the two corpora, gives a ratio of 2.4:1 for MUST and MUSIEC respectively. These results raise the following questions:

  1. Do Polish academic authors writing in Polish rely on markers of epistemic modality to a lesser extent than English authors?
  2. Are Polish academic authors writing in Polish less inclined to encode high degrees of certainty and commitment to the expressed propositions than English authors?
  3. Do Polish academic authors writing in Polish consistently choose other linguistic means to encode high degrees of certainty and commitment to the expressed propositions than those preferred by English authors?

To answer these questions, one would need a frequency analysis of a variety of epistemic modality markers in English and Polish research articles (Question 1), an analysis of the frequency of those encoding high degrees of certainty and commitment against other epistemic markers (Question 2), and, finally, an analysis of the frequency of MUST and MUSIEC against other high-value markers of epistemic modality (Question 3). These analyses were not undertaken here.

The present study does show, however, that both MUST and MUSIEC are comparatively rarely used epistemically in research articles: 16% of all the English records and 14% of the Polish records were epistemic. Differences were observed with regard to the occurrence of the epistemic modal verbs in main and subordinate clauses: whereas in the English corpus the preference for MUST in main clauses was only weakly marked (56% in main clauses), in the Polish corpus MUSIEC occurred in main clauses in almost three fourths of all cases (73%). Another important difference concerned attribution: 16% of all epistemic records were found to be attributed in English, while no such cases were noted in Polish.

With regard to the type of epistemic meaning in main clauses, the results were similar in both corpora: inferred evidential MUST and MUSIEC appeared most often and with a similar frequency (54% and 57% respectively), followed by epistemic proper (40% and 43% respectively). In the English corpus these were followed by a small number of quotatives (6%), which were absent from the Polish texts. Altogether, indirect evidence was evoked with a similar frequency by English and Polish authors, in 60% and 57% of main clause epistemic MUST and MUSIEC respectively.
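The frequency comparison between the two corpora rests on normalising raw counts by corpus size. A minimal sketch of that normalisation is given below; the helper `per_million` is a name introduced here, and the corpus sizes are the approximate figures given in Section 3.1 ("about 2,400,000" and "approximately 1,000,000" words), so the result is itself approximate. With these rounded sizes the ratio comes out at about 2.5:1, close to the 2.4:1 reported above; the exact value depends on the precise word counts, which are not given.

```python
def per_million(hits, corpus_words):
    """Normalised frequency: occurrences per million running words."""
    return hits / corpus_words * 1_000_000

# Record counts and approximate corpus sizes as reported in the text.
must_rate = per_million(1_549, 2_400_000)    # MUST, English corpus
musiec_rate = per_million(256, 1_000_000)    # MUSIEC, Polish corpus
ratio = must_rate / musiec_rate

print(f"MUST:   {must_rate:.0f} per million words")
print(f"MUSIEC: {musiec_rate:.0f} per million words")
print(f"ratio:  {ratio:.2f}:1")
```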

5.  Interpretation

The results may indicate that, on the one hand, Polish authors are more reluctant to rely on MUSIEC as a marker of certainty and epistemic necessity than English authors are to rely on MUST. On the other hand, MUSIEC, when used, appears to be stronger and slightly more authoritative, in that in the corpus it does not occur with attributions and is markedly more frequent in main clauses than MUST. At the same time, the research shows that in research papers both epistemic MUST and epistemic MUSIEC are used more frequently as vectors of indirect evidentiality than as exponents of proper epistemic meanings, which reduces the speaker’s involvement in and responsibility for the expressed claim.

It seems that the differences and similarities between epistemic MUST and MUSIEC in research articles will be better understood if additional contextual factors are taken into account, such as the type of subject, the type of lexical verb, co-occurrence with selected syntactic features (e.g. the passive voice), occurrence in harmonic combinations (e.g. with of course / oczywiście) and the presence of hedges, i.e. expressions that downtone the force of a statement either by limiting the commitment of the author to the expressed proposition, or by limiting the validity of the proposition (e.g. it seems that / wydaje się, że).




* Epistemic and root uses of MUST in research articles are discussed and compared in Warchal (2008).


