1 Collocation and lexical priming


In this book I want to argue for a new theory of the lexicon, which amounts to a new theory of language. The theory reverses the roles of lexis and grammar, arguing that lexis is complexly and systematically structured and that grammar is an outcome of this lexical structure. The theory grew out of an increasing awareness that traditional views of the vocabulary of English were out of kilter with the facts about lexical items that are routinely being thrown up by corpus investigations of text. What began as an attempt to account for collocation turned into an exploration of grammatical, semantic, sociolinguistic and text-linguistic phenomena. This book is the story of my intellectual journey. Accordingly it begins with my journey’s starting point – the pervasiveness of collocation.

The traditional view of the lexicon and grammar

The classical theory of the word is well reflected in those two central compendia of linguistic scholarship of the eighteenth and nineteenth centuries – the dictionary and the thesaurus. According to such texts, words have pronunciation, grammar(s), meaning(s), etymology and relationships with words of closely related meanings (synonyms, superordinates, co-hyponyms, antonyms). According to the theory underpinning these texts, lexis interacts with phonology through pronunciation, with syntax through the grammatical categories that lexical items belong to, with semantics through the meanings that the lexical items have and with diachronic linguistics through their etymology. In the most extreme versions of the theory, the connection between the word and the other systems has been so weak that it has been possible to argue that grammar is generated first and the words dropped into the grammatical opportunities thereby created (e.g. Chomsky 1957, 1965) or that the semantics is generated first and the lexis merely actualises the semantics (e.g. Pinker 1994).

Theories of lexis that can claim to be more sophisticated, such as systemic-functional linguistics, likewise sometimes represent the relationship between grammar and lexis as if the precise lexical choice was the last choice to be made. Even if one starts from the assumption that lexis is chosen first, or at least earlier, it is easy to assume that it passes through what might be regarded as a grammatical filter which organises and disciplines it. Tagmemics treats lexis as having as much theoretical importance as grammar (Pike and Pike 1982), in that this theory posits three hierarchies – the grammatical hierarchy, the phonological hierarchy and the referential or lexical hierarchy, but such a tripartite division only underlines how separate the levels of lexis and grammar are conceived to be.

The picture I have just sketched is in some respects lacking in light and shade. Construction grammar, for example, does not separate syntax and the lexicon in the manner I have been describing (Fillmore et al. 1988; Goldberg 1995). Even so, its theory still talks of lexical constructions and syntactic constructions, and a key tenet of the grammar is that grammatical patterns have precise meanings that are distinct from those of the lexical items used in the patterns. Chomsky (1995) assigns inter-language variation to the lexicon. Hudson’s Word Grammar (1984), as its name implies, starts from the assumption of a connection between lexical and syntactic description. Likewise Hunston and Francis (2000) identify and describe the close relationships found between lists of specific lexical items and the availability of particular grammatical patterns, and in so doing arrive at much more interesting accounts of grammar than are normal in descriptive grammars (as illustrated in Francis et al. 1996, 1998). A precursor of this work was the Cobuild English Grammar (Sinclair 1990). Nevertheless they continue with a separation of lexis and grammar; indeed, their approach depends upon it. I shall return to these grammars in Chapter 8, along with a discussion of the work of Sinclair, who goes furthest in dissolving the distinction between lexical study and grammatical study and whose work was in several important respects the starting point for the positions presented here.

Collocation and naturalness

The problem with all but the last two theories is that they account only for what is possible in a language and not for what is natural. This book is concerned, in part, with how naturalness is achieved and how an explanation of what is natural might impinge on explanations of what is possible. A key factor in naturalness, much discussed in recent years, is collocation, and this is therefore an appropriate place to start such an explanation. Collocation is, crudely, the property of language whereby two or more words seem to appear frequently in each other’s company (e.g. inevitable consequence). (I shall provide a more careful characterisation below.) Collocations – recurrent combinations of words – are both pervasive and subversive. Their pervasiveness is widely recognised in corpus linguistics; probably all lexical items have collocations (Sinclair 1991; Stubbs 1996). The notion is usually attributed to Firth ([1951]1957), and certainly his discussion of the concept underpins all that has followed on the subject. Interestingly, though, Doyle (2003) draws attention to the fact that the word collocation was being used in linguistic discourse prior to Firth; in this connection he draws attention to a citation from 1940 in the Oxford English Dictionary (1995). This observation is confirmed by inspection of the 1928 edition of Webster’s New International Dictionary, which has the following entry for collocation:

collocation . . . Act of placing, esp. with something else; state of being placed with something else; disposition in place; arrangement.

The choice and collocation of words. Sir W Jones.

… COLLOCATION denotes an arrangement or ordering of objects (esp. words) with reference to each other.

It is improbable that the eighteenth-century amateur linguist Sir William Jones, who is traditionally credited with having set in motion the nineteenth century’s exploration of language change and language families, can also be credited with the late twentieth century’s exploration of lexical relations, though I have not sought out the original from which the quotation is taken to verify that assumption. But the OED’s and Webster’s definitions do suggest that collocation has been slowly maturing as a notion.

As befits a notion that has been developing slowly and whose study has been transformed with the onset of large corpora and sophisticated software, collocation is a word with a number of definitions. Partington (1998) groups these neatly into textual, statistical and psychological definitions. The textual definition is closest to the use of the word exemplified in the Webster definition quoted above: ‘the occurrence of two or more words within a short space of each other in a text’ (Sinclair 1991: 170). This definition (which I should add does not reflect Sinclair’s own use of the term) is not useful and can result in a woolly confusion of single instances of co-occurrence with repeated patterns of co-occurrence. I shall not be using collocation in this way. Whenever I need to refer to the occurrence of two or more words within a short space of each other, I shall talk of ‘lexical co-occurrence’.

The statistical definition of collocation is that it is ‘the relationship a lexical item has with items that appear with greater than random probability in its (textual) context’ (Hoey 1991a: 6–7). This definition, though better, confuses method with goal. It is true that to discover collocations one needs to examine the statistical distribution of words and that those that occur in each other’s company more often than can be accounted for by the mechanisms of random distribution can be said to collocate. But the definition says nothing interesting about the phenomenon; it gives no clues as to why collocation should exist in the first place. For this we need to turn to Partington’s third type of definition – the ‘psychological’ or ‘associative’ definition.

There are two well-known ‘psychological’ definitions, and neither is successful for our purposes, though they are both insightful. The first is that provided by Halliday and Hasan (1976: 287) in their pioneering work on cohesion in English. They refer to collocation as a cohesive device and describe it as ‘a cover term for the kind of cohesion that results from the co-occurrence of lexical items that are in some way or other typically associated with one another, because they tend to occur in similar environments’. Their discussion of collocation as a cohesive device, and the exemplification they provide, makes it clear that they are not talking about the regular co-occurrence of words in close proximity to each other. The association they refer to must therefore be a psychological one, in which words are regularly associated in the mind because of the way they are regularly encountered in similar textual contexts. As a definition, it is hard to operationalise and indeed Hasan (1984) abandons the concept, replacing it with more specific semantic relations (hyponymy, meronymy etc.). It does however place collocation where it belongs – as a property of the mental lexicon. (We shall revisit Halliday and Hasan’s notion in Chapter 6, where it will be found to have a proper place in an account of text after all.)

The second definition comes from Leech (1974: 20), who talks of ‘collocative meaning’ which, he says, ‘consists of the associations a word acquires on account of the meanings of words which tend to occur in its environment’. As couched, this is too general to cover the word in its most common current usage. It also implies that the word acquires connotations as a result of the words that surround it, a position that was formulated by Louw (1993) and taken up by Stubbs (1995, 1996). This position is discussed in Chapter 2 and is not uncontroversial (see Whitsitt 2003). Leech’s definition does however pick up both the statistical reality and the psychological reality and, most valuably, posits a causal connection between the two. Partington (1998: 16), commenting on his definition, notes that ‘it is part of a native speaker’s communicative competence . . . to know what are normal and what are unusual collocations in given circumstances’. I would quarrel with the wording here in that Partington allows for ‘unusual collocations’, but the point he is making is important.

We now have to consider what counts as being in the environment of another word and, more fundamentally still, whether the word or the lemma is the appropriate analytical category in this context. Jones and Sinclair (1974) provided the first influential computational analysis of collocation. Their corpus was only 147,000 words – computers at that time struggled to deal with even that much data – but it was sufficient to allow them to determine that the optimum span for identifying collocation is up to four words on either side of the node word (the node word being the word under investigation and typically shown at the centre of the concordance lines). This finding has not been seriously disputed, though collocational software will often permit a wider span (e.g. ±5). Collocational analysis can be done on lemmas or words. Renouf (1986), Sinclair (1991), Stubbs (1996) and Tognini-Bonelli (2001) have all argued against conflating items sharing a common lemma (e.g. political/politics, break/broke, onion/onions) on the grounds that each word has its own special collocational behaviour. In Hoey (1991a, 1991b) I found it useful to work with lemmas, but for present purposes I concur with these linguists that conflation often disguises collocational patterns. Williams (1998) notes that in the context of molecular biology research papers the collocates of the word gene and those of the word genes are quite distinct, both prior and subsequent to the node word. Doyle (2003) likewise shows that there are few shared collocates between grammatically related forms of lemmas in scientific textbooks; he looks, for example, at amplifier/amplifiers (only three shared collocates), circuit/circuits (only two shared collocates), frequency/frequencies (only one shared collocate) and shift/shifts, where he finds no shared collocates at all. When various forms of a lemma do share collocates (e.g. training and trained share collocation with as a teacher), they can of course be discussed together, but common collocates should never be assumed.
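The comparison of word-forms described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the corpus is taken to be an already tokenised list of words, and a raw frequency threshold (`min_count`) stands in for whatever statistical test a real study would apply. It is not a reconstruction of the method used by Williams or Doyle.

```python
from collections import Counter

def collocates(tokens, node, span=4, min_count=2):
    """Return the words found within `span` tokens of `node` at least `min_count` times.

    The raw-count threshold is an illustrative assumption, not a significance test.
    """
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        # Words up to `span` positions to the left and right of the node word.
        window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
        counts.update(window)
    return {word for word, count in counts.items() if count >= min_count}

def shared_collocates(tokens, form_a, form_b, span=4, min_count=2):
    """Collocates that two word-forms (e.g. 'gene' and 'genes') have in common."""
    return collocates(tokens, form_a, span, min_count) & collocates(tokens, form_b, span, min_count)
```

Because each word-form is treated as its own node, nothing is conflated under a lemma; an empty result from `shared_collocates` corresponds to a pair of forms with no shared collocates.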

So our definition of collocation is that it is a psychological association between words (rather than lemmas) up to four words apart and is evidenced by their occurrence together in corpora more often than is explicable in terms of random distribution. This definition is intended to pick up on the fact that collocation is a psycholinguistic phenomenon, the evidence for which can be found statistically in computer corpora. It does not pick up on the causal relationship identified by Leech, but only because that will be attended to separately.
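The statistical half of this definition can be illustrated with a rough sketch, again assuming a tokenised corpus and hypothetical function names. The baseline used here (the number of window positions multiplied by the collocate’s overall relative frequency) is a deliberately crude stand-in for ‘random distribution’; corpus software normally applies more careful measures, and nothing in the sketch addresses the psychological side of the definition.

```python
from collections import Counter

def cooccurrence_observed_expected(tokens, node, collocate, span=4):
    """Compare how often `collocate` falls within `span` words of `node`
    with how often it would if words were scattered at random.

    Returns (observed, expected). The expected value is a simple
    illustrative baseline, not a full statistical test.
    """
    n = len(tokens)
    if n == 0:
        return 0, 0.0
    freq = Counter(tokens)
    observed = 0
    window_slots = 0
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        # Up to `span` words on either side of the node word.
        window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
        window_slots += len(window)
        observed += window.count(collocate)
    # Each window position holds the collocate with probability freq/n under the baseline.
    expected = window_slots * freq[collocate] / n
    return observed, expected
```

A pair counts as a candidate collocation, on this reading, when the observed figure is clearly larger than the expected one; how much larger is a matter for the chosen statistic.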

The pervasiveness of collocation

The importance of collocation for a theory of the lexicon lies in the fact that at least some sentences (and this puts it cautiously) are made up of interlocking collocations such that they could be said to reproduce, albeit with important variations, stretches of earlier sentences (Hoey 2002). It could be argued that such sentences owe their existence to the collocations they manifest. As evidence of these claims, consider the following two sentences:

In winter Hammerfest is a thirty-hour ride by bus from Oslo, though why anyone would want to go there in winter is a question worth considering.

Through winter, rides between Oslo and Hammerfest use thirty hours up in a bus, though why travellers would select to ride there then might be pondered.

One of these sentences is drawn from Bill Bryson’s travel book Neither Here Nor There (1991) about his trips around Europe and is indeed, in some respects, the first sentence of the book (if we discount a quotation from Bertrand Russell and introductory material). The other is best seen as a translation from Bill Bryson’s English into my altogether less fluent English. I have attempted to maintain the meaning of the original and the sentences share a number of words and lemmas – winter, Hammerfest, thirty, hour(s), ride(s), bus, Oslo, though, why, would, to, there. Yet I assume few readers would hesitate in assigning the first sentence to Bill Bryson and the second to me. The first is natural; the second is clumsy. However, according to the theories of the lexicon that have dominated linguistic thought for the past 200 years there is no reason to regard the naturalness or clumsiness of the sentences as being of any importance. Both sentences are, after all, grammatical. Both use words in reasonably acceptable ways; though the second sentence contains an unfamiliar image of ‘using up’ hours, it draws upon a familiar enough metaphor (Lakoff and Johnson 1980), namely that time is money. Both sentences are textually appropriate as well; there is no apparent reason why either should not begin a text. Both are meaningful. I want however to argue that what distinguishes Bryson’s sentence from my version is that his is made up of normal collocations and mine is made up of what Partington referred to as ‘unusual collocations’.

The naturalness of the first sentence and the clumsiness of the second are not immaterial. They are properties of those sentences and are as much in need of explanation as any other feature of language. There is no reason why linguistic theory should not be as much concerned with naturalness as with creativity, as has been recognised for some years (e.g. McCarthy 1988). Indeed, as I shall argue in Chapters 8 and 9, accounts of creativity in language need to take account of naturalness if they are properly to explain creativity. One of the obvious ways in which the two sentences differ, I am claiming, is in respect of their collocations. In my corpus, the words in and winter occur together 507 times; this means that 1 in 15 instances of winter occurs in the word sequence in winter in my data. And 1 in 6 instances of hour occurs with a number, and there are 35 cases of thirty or 30 occurring with hour(s). The words bus and ride occur in the same environment 53 times (though usually as bus ride), the words ride and hour occur together 12 times and ride and from occur together 121 times. The combination by bus occurs 116 times and the three-word combination by bus from occurs 7 times.

The collocations just listed interlock. So hour collocates with thirty but it also collocates with ride. Likewise ride, in addition to collocating with hour, collocates with by and bus. Bus also collocates with by. Both ride and bus collocate with from.

The same kinds of point can be made about the second clause of Bryson’s sentence. The combination though why occurs 24 times, why anyone would occurs 28 times, why anyone would/should want to occurs 23 times, want to go occurs 355 times and want to go there occurs 15 times.

Diagrammatically the interlocking produces the following:

thirty – hour – ride – by – bus – from

though – why – anyone – would – want – to – go – there

Compare this with the picture for my contrived rewritten version. The combination of through and winter occurs 7 times (as opposed to 507 instances of in winter), rides between occurs once, in a bus occurs 15 times (as opposed to 116 instances of by bus) and use x hours up (where x stands for any number) is not attested at all. The same is true of the second half of the sentence. The combination travellers would occurs 13 times (as opposed to 122 instances of anyone would) and would select occurs 21 times (as opposed to 573 instances of would want). (The latter frequencies are of course affected by the comparative rarity of travellers and select, as opposed to anyone and want, but this does not alter the point and in any case is not true of the earlier combinations.)

It is worth noting that even my rewritten version still makes use of existing collocations; it is hard to construct a meaningful sentence without calling upon them. My version has fewer of them, though, and those it does have are weaker and do not interconnect.
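Counts of the kind cited in this section (how often by bus or by bus from occurs, or what proportion of the instances of winter fall within in winter) can be obtained with very simple means. The following is a minimal sketch, assuming a tokenised corpus and hypothetical function names; it does not reproduce the corpus or software behind the figures above, and it handles only fixed contiguous sequences, not patterns with a variable slot such as use x hours up.

```python
def count_sequence(tokens, sequence):
    """Count contiguous occurrences of a word sequence, e.g. ['by', 'bus', 'from']."""
    seq = list(sequence)
    k = len(seq)
    if k == 0:
        return 0
    return sum(1 for i in range(len(tokens) - k + 1) if tokens[i:i + k] == seq)

def proportion_in_sequence(tokens, word, sequence):
    """Share of all instances of `word` that occur inside `sequence`.

    Assumes `word` appears exactly once in `sequence` (as 'winter' does in 'in winter').
    """
    total = tokens.count(word)
    if total == 0:
        return 0.0
    return count_sequence(tokens, sequence) / total
```

A proportion of roughly 0.067 from `proportion_in_sequence` would correspond to the ‘1 in 15’ style of statement used above.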

Priming as an explanation of collocation

I imagine many readers will not have needed convincing of the pervasiveness of collocation; it has been much noted in the literature and Sinclair (1991), in particular, has teased out some of its less obvious and more interesting properties. The subversiveness of collocation has however rarely been given much attention. The reason that it is subversive of existing descriptions of the lexicon is that the pervasiveness requires explanation and many current theories cannot do this. Butler (2004) argues for a greater awareness in corpus linguistics of the need for a more powerful and cognitively valid theory, while showing that existing theories have an even greater obligation to test and modify their claims against corpus data. A good starting point for a cognitively valid theory would seem to be the pervasiveness of collocation.

As we have seen, any explanation of the pervasiveness of collocation is required to be psychological because collocation is fundamentally a psychological concept. What has to be accounted for is the recurrent co-occurrence of words. If they were stored in our minds separately or in sets, the kinds of collocational naturalness displayed in the Bryson sentence would be inexplicable. The most appropriate psychological concept would seem to be that of priming, albeit tweaked slightly. As discussed in the psycholinguistic literature (e.g. Neely 1977, 1991; Anderson 1983), the notion of semantic priming is used to discuss the way a ‘priming’ word may provoke a particular ‘target’ word. For example, a listener, previously given the word body, will recognise the word heart more quickly than if they had previously been given an unrelated word such as trick; in this sense, body primes the listener for heart. This has an obvious connection with word association games. The word body sets up a word association with heart, which the word trick does not (at least for me). The focus in psycholinguistic discussion is on the relationship between the prime and the target, rather than on the priming item per se. In the discussion that follows, however, priming is seen as a property of the word and what is primed to occur is seen as shedding light upon the priming item rather than the other way round.

We can only account for collocation if we assume that every word is mentally primed for collocational use. As a word is acquired through encounters with it in speech and writing, it becomes cumulatively loaded with the contexts and co-texts in which it is encountered, and our knowledge of it includes the fact that it co-occurs with certain other words in certain kinds of context. The same applies to word sequences built out of these words; these too become loaded with the contexts and co-texts in which they occur. I refer to this property as nesting, where the product of a priming becomes itself primed in ways that do not apply to the individual words making up the combination. Nesting simplifies the memory’s task (Krishnamurty, personal communication; see also Krishnamurty 2003). Necessarily the priming of word sequences is normally a second phase in the priming; occasionally, of course, a child acquires the primings of a combination first and the primings of the individual words later (e.g. all gone). There is no difference in principle between acquiring the word (or word sequence) and acquiring the knowledge of its collocations, though presumably recognition of the word must notionally precede recognition of recurrent features, in that the word has to have occurred twice (at least) for the latter process to begin. Chomsky (1986) distinguishes the study of linguistic data, which he terms ‘E-Language’ (externalised language), from ‘I-Language’ (internalised language), the language found in the brains of speakers. Lexical priming is intended as a bridge between the two perspectives.

The notion of priming is entirely compatible with Giddens’ (1979) discussion of the relationship between human agency and social structure, where each individual action reproduces the structure and the structure shapes the individual action; indeed, Giddens applies his theory to language. Priming in the fullest form, as described in this book, might be seen as the explication of Giddens’ claims. Stubbs (1996: 56) notes, preparatory to a discussion of Giddens’ work: ‘Speakers are free, but only within constraints. Individual speakers intend to communicate with one another in the process of moment to moment interaction. The reproduction of the system is the unintended product of their routine behaviour’. The crucial phrase here is ‘only within constraints’. The notion of priming completes the circle begun here by Stubbs. Priming leads to a speaker unintentionally reproducing some aspect of the language, and that aspect, thereby reproduced, in turn primes the hearer. It is not necessary to assume, though, that what is reproduced is a system as usually understood. Indeed, as we shall see in subsequent chapters, priming can be seen as reversing the traditional relationship between grammar as systematic and lexis as loosely organised, amounting to an argument for lexis as systematic and grammar as more loosely organised. This position is similar to that of Hanks (1996, 2004). My argument here also follows a similar line to that of Hopper (1988, 1998), who argues that grammar is the output of what he calls ‘routines’, collocational groupings, the repeated use of which results in the creation of a grammar for each individual. He terms this process ‘emergent grammar’ and importantly every speaker’s grammar is different because every speaker’s experience and knowledge of routines is different; Hopper also makes use of the notion of priming, though as a less central notion.

Some of the properties of priming

In this section some of the characteristics of priming are considered. Necessarily, since we have so far only considered collocation, these characteristics are formulated in terms of their application to collocation, but, importantly, the claims here are made for all types of priming as discussed in the remainder of the book.

Priming need not be a permanent feature of the word or word sequence; in principle, indeed, it never is. Every time we use a word, and every time we encounter it anew, the experience either reinforces the priming by confirming an existing association between the word and its co-texts and contexts, or it weakens the priming, if the encounter introduces the word in an unfamiliar context or co-text or if we have chosen in our own use of it to override its current priming. It follows that the priming of a word or word sequence is liable to shift in the course of an individual’s lifetime, and if it does so, and to the extent that it does so, the word or word sequence shifts slightly in meaning and/or function for that individual. This may be referred to as a drift in the priming. Drifts in the priming of a word, occurring for a number of members of a particular community at the same time, provide a mechanism for temporary or permanent language change. Again, Stubbs (1996: 45), drawing on Halliday’s (1991, 1992) analogy between linguistic systems and weather systems, puts it well: ‘Each day’s weather affects the climate, however infinitesimally, either maintaining the status quo or helping to tip the balance towards climatic change’. It will be observed that I have referred to contexts as well as co-texts. This is because it is demonstrable that collocations are limited in principle to particular domains and genres, and even these are fluid. Baker (forthcoming) warns:

approaches that focus on different discourses need to acknowledge that the concept of discourses as discrete and separate entities is problematic. Discourses are constantly changing, interacting, merging, reproducing and splitting off from each other. Therefore a corpus-based analysis of any discourse must be aware that it can only provide static snap-shots that give the appearance of stability but are bound to the context of the data set.

An example of contextual limitation is the collocation of recent and research, which is largely limited to academic writing and news reports of research. Re-expressed in terms of priming, research is primed in the minds of academic language users to occur with recent in such contexts and no others. The words are not primed to occur in recipes, legal documentation or casual conversation, for example. In short, collocational priming is sensitive to the contexts (textual, generic, social) in which the lexical item is encountered, and it is part of our knowledge of a lexical item that it is used in certain combinations in certain kinds of text. This is not a new idea, though it may be expressed here in unfamiliar terms.

Firth referred in 1951 to ‘more restricted technical or personal collocations’. The only difference between his ‘restricted technical collocations’ and domain-specific primings (apart from the psycholinguistic focus of the latter) is that I would argue that the latter are the norm, rather than the exception. Firth’s notion of ‘personal collocations’ is still closer to that of priming, in that it is an inherent quality of lexical priming that it is personal in the first place and can be modified by the language user’s own chosen behaviour in the second place. Firth comments on personal collocations that:

The study of the usual collocations of a particular literary form or genre or of a particular author makes possible a clearly defined and precisely stated contribution to what I have termed the spectrum of descriptive linguistics, which handles and states the meaning by dispersing it in a range of techniques working at a series of levels.

(Firth [1951]1957: 195)

The position I am advocating here is also related to that of reading theorists such as Smith (1985), who talks of the importance to the learner-reader of their having experience of a word in a variety of contexts – intertextual, extratextual, intratextual. These contexts are important in that without them the word will not be appropriately primed. This said, it does not follow that priming may only occur in specific domains and/or genres. It does however follow that we should be wary of over-generalising claims about primings. I shall return to this point on several occasions.

Primings nest and combine. For example, winter collocates with in, producing the phrase in winter. But this phrase has its own collocations, which are separate from those of its components. So in winter collocates with a number of forms of the word BE (i.e. is, was, are, were, etc.), which as far as I am aware neither in nor winter do. This then is an instance of a nesting that might be represented as follows:

in   winter

  BE

Or to take a more complex example, the word word collocates with say, say a word in turn collocates with against, and say a word against collocates with won’t. (We shall return to this example in subsequent chapters.) In this way, lexical items (Sinclair 1999, 2004) and bundles (Biber et al. 1999) are created.

Primings may crack, and one of the causes of cracking is education. If, for example, a word is primed for someone as collocating with a particular other word and a teacher tells that person that it is incorrectly primed (e.g. you and was) the result is a potential crack in the priming. Cracks can be mended either by rejecting the original priming or by rejecting the attack on the priming. Better still, they can be healed by assigning the original priming to one context (e.g. family) and the later priming to another context (e.g. the classroom, science, public speaking). Not all cracks get healed and the result can be uncertainty about the priming, a codification of the crack, leading to long-term linguistic insecurity. We will return to this issue in Chapter 10.

As the possibility of cracking suggests, one of the implications of lexical priming is that each individual’s experiences of language, and the primings that arise out of these experiences, are unique. Since our experience of language suggests that communication takes place, there must be harmonising principles at work to ensure that each individual’s primings do not differ too greatly from those of others. Education is one of these, but there are others as important, including the property of self-reflexivity. The harmonising principles are discussed in the final chapter after we have reviewed the full range of semantic and grammatical facets of priming.

The notion of priming as here outlined assumes that the mind has a mental concordance of every word it has encountered, a concordance that has been richly glossed for social, physical, discoursal, generic and interpersonal context. This mental concordance is accessible and can be processed in much the same way that a computer concordance is, so that all kinds of patterns, including collocational patterns, are available for use. It simultaneously serves as a part, at least, of our knowledge base.
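For readers unfamiliar with the computer concordance that serves as the analogy here, the following is a minimal sketch of how one might be built, assuming a tokenised corpus and a hypothetical function name. It shows only the keyword-in-context display; the rich glossing for social, physical, discoursal, generic and interpersonal context attributed above to the mental concordance is not modelled, and no claim is made that the mind works in this way.

```python
def concordance(tokens, node, span=4):
    """Return keyword-in-context lines for `node` as (left, node, right) tuples.

    `left` and `right` hold up to `span` words on either side of the node word,
    which is how concordance lines are conventionally centred.
    """
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - span):i])
            right = " ".join(tokens[i + 1:i + 1 + span])
            lines.append((left, tok, right))
    return lines
```

Sorting or counting the left and right contexts returned by such a function is one simple way in which recurrent patterns, including collocational ones, become visible.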

Primings can be receptive or productive. Productive primings occur when a word or word sequence is repeatedly encountered in discourses and genres in which we are ourselves expected (or aspire) to participate and when the speakers or writers are those whom we like or wish to emulate. Receptive primings occur when a word or word sequence is encountered in contexts in which there is no probability, or even possibility, of our ever being an active participant – party political broadcasts, interviews with film stars, eighteenth-century novels – or where the speaker or writer is someone we dislike or have no empathy with – drunken football supporters, racists, but also sometimes stern teachers and people of a different age group.

Although productive primings are more interesting, receptive primings have their importance too. It is as a result of these that we recognise allusion, quotation and pastiche, and indeed just as collocation requires priming as an explanation, so do these recognised literary properties. Our ability (sometimes) to recognise plagiarism may possibly arise from the same mental concordance. A person’s encounter with lexical items in the plagiarised text, I would hypothesise, sometimes results in the new instances of the items being stored near to the items from the original and a consequent recognition of the similarity/identity of the two texts (though other factors come into play as well, such as incongruities of style).

The existence of allusion in the above list may also suggest that our mental concordance is tagged for the importance of the text in which a word or word sequence is encountered. Thus the claimed greatness of a literary work or the centrality of a religious text may ensure that an encounter with a word in such writings has a bigger impact on the priming than a similar encounter with the word in a less valued work. The same may be true of words encountered in conversation; words spoken by a close friend are likely to affect our primings more directly than those spoken by someone to whom we are indifferent.

Primings can be transitory or (semi-)permanent. Speakers or writers may combine certain words repeatedly in a discourse and this repeated combination may become part of the cohesion of the text. The listener or reader will grow to expect these words together in the text in question, but unless subsequent texts reinforce the combination it will not become part of the permanent priming of either of the words. Emmott (1997) discusses priming in these terms where a reader is primed to construct a frame which permits them to process more effectively the text they are reading.

Priming as an explanation of other linguistic features

In the above discussion, I have talked as if words and word sequences are primed for collocation only and all the examples I have so far given have played along with this assumption. However, once we accept that collocation can only be accounted for in terms of priming, the possibility opens up that priming will explain other features of the language. Indeed it is the argument of this book that priming is the driving force behind language use, language structure and language change. I shall therefore conclude this chapter with a statement of the hypotheses that the remainder of the book will be concerned with exploring.

Priming hypotheses

Every word is primed for use in discourse as a result of the cumulative effects of an individual’s encounters with the word. If one of the effects of the initial priming is that regular word sequences are constructed, these are also in turn primed. More specifically:

  1. Every word is primed to occur with particular other words; these are its collocates.
  2. Every word is primed to occur with particular semantic sets; these are its semantic associations.
  3. Every word is primed to occur in association with particular pragmatic functions; these are its pragmatic associations.
  4. Every word is primed to occur in (or avoid) certain grammatical positions, and to occur in (or avoid) certain grammatical functions; these are its colligations.
  5. Co-hyponyms and synonyms differ with respect to their collocations, semantic associations and colligations.
  6. When a word is polysemous, the collocations, semantic associations and colligations of one sense of the word differ from those of its other senses.
  7. Every word is primed for use in one or more grammatical roles; these are its grammatical categories.
  8. Every word is primed to participate in, or avoid, particular types of cohesive relation in a discourse; these are its textual collocations.
  9. Every word is primed to occur in particular semantic relations in the discourse; these are its textual semantic associations.
  10. Every word is primed to occur in, or avoid, certain positions within the discourse; these are its textual colligations.

Very importantly, all these claims are in the first place constrained by domain and/or genre. They are claims about the way language is acquired and used in specific situations. This is because we prime words or word sequences, as already remarked, in a range of social contexts and the priming, I argue, takes account of who is speaking or writing, what is spoken or written about and what genre is being participated in, though the last of these constraints is probably later in developing than the other two. One reason why some of the features described in this book have been given only limited attention is that traditionally descriptions of language have treated the language as monolithic. Even corpus linguists have characteristically worked with general corpora. But certain kinds of feature only become apparent when one looks at more specialised data.

Returning to the list of claims, the first has already been argued for and will not be further discussed in this book. Claims 2 and 3 are explored in Chapter 2, claim 2 in some detail. Claims 4, 5 and 6 are discussed in Chapters 3, 4 and 5 respectively, with claim 7 being given briefer attention in Chapter 8. The textual claims (8, 9, 10) are explained in Chapters 6 and 7, with preliminary supportive evidence. Chapters 8 and 9 consider the implications of lexical priming for discussions of creativity, and the final chapter considers some of its implications for L1 and L2 learning.

Primings can be studied from two perspectives. We can study their operation from the perspective of the primed word or word sequence. Thus we might, for instance, look at all the primings associated with the word consequence. Alternatively, we can observe their operation in combination. So we might look at all the primings that contribute to the production of a sentence such as the one cited earlier from Bill Bryson’s Neither Here Nor There. In each of the following chapters I shall do both (and indeed the word consequence and the Bill Bryson sentence will both be examined), though the weighting will differ as the chapters progress. Thus, initially most of the attention will be on the individual word, but in later chapters, where the focus is on textual priming, we will be more concerned with the ways primings combine.

The status of the corpus as evidence of priming

I have talked of the language user as having a mental concordance and of the possibility that they process this concordance in ways not unrelated (though much superior) to those used in corpus linguistic work. However, it does not automatically follow that exploration of the nature of priming can be achieved through the study of computer corpora. A corpus, whether general – like the British National Corpus or the Bank of English – or specialised – such as the Guardian corpus used in this work – represents no one’s experience of the language. Not even the editor of the Guardian reads all the Guardian, I suppose, and certainly only God (and corpus linguists) could eavesdrop on all the many different conversations included in the British National Corpus. On the other hand, the personal ‘corpus’ that provides a language user with their lexical primings is by definition irretrievable, unstudiable and unique. We have therefore a problem: we have a posited feature of language acquisition and use, one of whose characteristics is that it is differently actualised for every language user.

If my analogy between the mental concordance and the computer concordance is correct, the computer corpus cannot tell us what primings are present for any language user, but it can indicate the kinds of data a language user might encounter in the course of being primed. It can suggest the ways in which priming might occur and the kinds of feature for which words or word sequences might be primed. In other words, it can serve as a kind of laboratory in which we can test for the validity of claims made about priming. If in subsequent chapters I sometimes write as if words or word sequences have priming independently of individual speakers, this should be regarded as no more than a convenient shorthand. Words are never primed per se; they are only primed for someone (and, as we shall see in Chapter 8, it is not only, or even primarily, the word that is primed). All that a corpus can do is indicate that certain primings are likely to be shared by a large number of speakers, and only in that sense is priming independent of the individual. As already noted, in the final chapter I shall return to the issue of how it comes to be that primings are shared.

2 Lexical priming and meaning

What collocation will not account for: semantic association

If lexical priming only operated with regard to collocations, it would be an anomalous but not especially interesting characteristic of language. It would have nothing to say about linguistic creativity and be of little or no theoretical importance. However, a glance at the Bill Bryson sentence shows that there is more to say about the way it has been constructed than can be accounted for in terms of collocation alone. Take the word hour in the word sequence thirty-hour ride. For most speakers it is likely to be primed to collocate with ride, but there is no evidence in my corpus of its being likely to collocate with thirty. On the basis of my corpus evidence, hour is likely to be primed for many speakers of English to collocate with half, an, one, two, three, four and twenty four, but thirty only occurs once in my data. It is not to the point to argue that a larger corpus might show it to reach the threshold of collocability, both because it will always be possible to find a number that has not yet been shown to collocate and because it is nonsense to suppose that any user of the language would feel they were breaking new linguistic ground if they used a number with hour that they had never heard anyone else use. More subtly, the same goes for the collocation of hour with ride in that it also collocates with drive, flight and journey. Listing such collocates is theoretically trivial and unrevealing about the possibilities of its occurring with other ‘journeying’ words such as meander, slog or odyssey, for example. If however we assume that the priming is operating at a more abstract level, we can say that for most speakers of English the word hour is likely to be primed for semantic association with NUMBER and JOURNEY. Thus thirty-hour ride belongs to a pattern that (in my corpus) also includes:

half-hour drive
four-hour flight
two-hour trip
three-hour journey
two-hour hop
three-hour slog

The relative banality of this observation supports the argument that this is the way the word hour is typically primed for native speakers; the claim is that when we formulate what we want to say, primings like the above shape the wording we use. But whereas collocation can only account, by definition, for the routine, the notion of semantic association can account for some aspects of creativity. There is no instance of a 27-hour meander in my corpus but if it ever occurred the semantic associations just described would account for it without in any way detracting from its novelty. If the distances between planets or stars were being discussed, other time units would be used – week, light year – and in principle expressions such as 27-week flight or 150 light-year odyssey could be created, still on the basis of the same semantic associations.
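
The difference between a list of collocates and a semantic association can be sketched in a few lines of code. What follows is an illustration only, under loudly stated assumptions: the `NUMBER` and `JOURNEY` sets are small, hand-picked stand-ins for open-ended semantic sets, the `classify` function and its labels are hypothetical, and nothing here reproduces the procedure by which the corpus data were analysed.

```python
import re

# Hypothetical, hand-picked members of the two semantic sets (assumption).
NUMBER = {"half", "one", "two", "three", "four", "five", "twelve", "thirty"}
JOURNEY = {"ride", "drive", "flight", "trip", "journey", "hop", "slog",
           "meander", "odyssey"}

# Word sequences listed in this chapter as occurring in the author's corpus.
LISTED = {"half-hour drive", "four-hour flight", "two-hour trip",
          "three-hour journey", "two-hour hop", "three-hour slog"}

def classify(phrase):
    """Say whether a phrase is listed, fits NUMBER-hour-JOURNEY, or neither."""
    phrase = phrase.lower()
    if phrase in LISTED:
        return "listed"
    match = re.fullmatch(r"(\w+)-hour (\w+)", phrase)
    if match:
        number, journey = match.groups()
        # Digits count as NUMBER too, so 27 needs no entry of its own.
        if (number in NUMBER or number.isdigit()) and journey in JOURNEY:
            return "fits the association"
    return "outside the association"

for phrase in ["two-hour hop", "27-hour meander", "two-hour meeting"]:
    print(phrase, "->", classify(phrase))
```

A lookup in `LISTED` alone corresponds to an account in terms of collocation: it can only recognise what has already been seen. The second test corresponds to the more abstract priming, and it is this test that lets 27-hour meander through while still excluding a phrase whose final word is not a JOURNEY word.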

As noted in Chapter 1, primings nest. Here, as a first step, we note that the NUMBER-hour-JOURNEY (or NUMBER-TIME-JOURNEY) combination collocates with a. The resultant word sequence may in turn form an association with VEHICLE:

a three-hour car ride
a 12-hour bus ride
a five-hour coach ride
a two-hour ferry ride
a half-hour train ride
a two-hour ride by four-wheel drive vehicle

Note that the word sequence a 27-hour meander by sledge is as readily explained by this combination of primings as are the attested examples listed above. Note, too, that such a word sequence (were it ever to occur) would be impossible to explain by collocation alone. We would have to assume that although hour was primed to occur with two and bus, a totally unconnected range of factors led to its occurrence with 27 or sledge. It is more elegant to assume that words and word sequences are also primed for semantic association. Of course, for particular speakers, because of particular communicational needs or because of particular linguistic experiences, a particular word may be primed to occur with another without there being corpus evidence to support the priming. Nevertheless, despite the unpredictability of priming and the uncertain status of corpus evidence with regard to its presence or absence, there will always be co-occurrences that cannot be accounted for in terms of collocation. Semantic association is a necessary generalisation and appears to reflect a regular kind of lexical priming. It is probable, but not theoretically necessary, that collocations are primed first and that the semantic commonality between collocates produces the more abstract priming, whether as a result of self-reflection or because of encounters with co-occurrences that share the semantic feature(s) of the already recognised collocates. The primings move outwards from specific words to the semantic set, and in so doing permit creative choices to be made that in themselves reinforce the more general priming. (However, the possibility must be allowed for that the semantic associations of a word are primed first, with the collocates arising from a person repeatedly making the same selection from the semantic set.)

If the claim that primings move out from collocations to semantic associations is correct, it does not follow that a corpus will necessarily reflect the collocations that led to the semantic association, since these may differ from person to person. Names are the obvious instance of this. In my childhood, the word nanny collocated with Hoey as in Nanny Hoey and the word nan with Robinson (as in Nan Robinson). As a consequence of this, I am now primed to recognise Nanny or Nan as titles primed to occur with surnames when reference is being made to a grandmother. No corpus will ever reflect my personal primings, though, and every other adult who uses the titles, or understands them, will have been differently primed (apart from minor points of overlap, where children’s literature makes use of these words). An instance in the Bryson sentence of the way names get primed is in the following semantic association:


My corpus includes the sentences:

  1. Ntobeye is a two-hour ride by four-wheel drive vehicle from the vast refugee camp at Ngara.
  2. The village is a four-hour drive from London.
  3. Pamuzindo is an hour’s drive from Harare.

Substitute Hammerfest for Ntobeye in sentence 1, thirty for two, bus for four-wheel drive vehicle and Oslo for the vast refugee camp at Ngara and the first half of Bryson’s sentence appears.

Though a large corpus will attest a fair number of sentences containing mentions of Oslo, it would be a corpus of enormous magnitude that would contain more than a handful of references to Hammerfest. Clearly only northern Norwegians are likely to have the word primed for any collocations, and these will presumably be with Norwegian words as a rule. Although it is possible, even probable, that Bill Bryson researched his journey to this small town, we do not have to assume that his sentence was the product of collocational primings arrived at as a result of his researches. It would be sufficient that as a child he heard parents and fellow town-people talk of how far Des Moines, the small town where he lived, was from the nearest big city, when, for example, they were asked where exactly it was that they lived. The point here is that while there may indeed be semantic associations that on the basis of corpus evidence do not have corresponding collocations, these general semantic associations (i.e. associations primed for many speakers of the language) may be based on local collocations (i.e. collocations primed for only a few speakers) of the kind that the average corpus is unlikely to detect.

Good examples of the operation of semantic association, which also illustrate the way primings nest, are provided by Baker (forthcoming) and Bastow (2003), though they do not use the terms outlined here. Baker notes that when daylight collocates with broad, it is usually in the context of in. The word sequence in broad daylight is then further primed to occur with ‘something bad happening, usually connected to crime or violence’. Examples from his data include:

    . . . having been abducted and then stabbed in broad daylight . . .
    . . . was snatched off a bus in broad daylight . . .
    . . . a ‘Mirista’ who was captured in broad daylight . . .

Reporting on a study of US defence speeches, Bastow notes firstly that men and women collocate with each other in this domain. The resulting binomial word sequence men and women then collocates with young, or, to put it in the terms of this book, in the domain of US defence the typical speech writer is primed to collocate men and women with young. (I hope, however, that I may be permitted the shorter and more convenient formulation hereafter without compromising my original claim.) The word sequence young men and women then has a semantic association with COMPLIMENTS:

    bright young men and women
    very capable young men and women
    dedicated young men and women
    finest young men and women
    high-quality young men and women
    the finest of young men and women
    outstanding young men and women
    special-gifted, serious-minded young men and women
    superb young men and women
    talented young men and women
    But as impressive as those young men and women are
    fit, well-adjusted young men and women
    (data from Bastow 2003)

Bastow’s data also provide support for the point made in the previous chapter that priming occurs, in principle, within specific domains and/or genres. As he himself notes, the behaviour of the word sequence men and women in a general corpus is rather different, having, for example, a collocation with between, which does not manifest itself in his US defence corpus.

Constant/variable patterns in semantic association

Although familiarity with concordances may blunt awareness of the fact, it can hardly be missed that there is a great measure of parallelism in data such as that provided above. Such parallelism is not, though, only a property of concordances; it is also a property of spoken and written discourse. Winter (1974, 1979) noted that one of the basic relations in text was that of the matching relation, where two clauses in a text are matched for similarity or difference. Examples of matching relations are contrast relations, for example:

  4. Seven or eight were arrested, but I was the only one charged

and compatibility relations, for example:

  5. My husband was furious, so was I.

Matching relations are, according to Winter, characteristically established by the repetition of key clausal elements and the replacement of others, which can be talked of in terms of patterns of constant and variable (Hoey 1983). An example is the sentence you read four sentences ago, namely Such parallelism is not, though, only a property of concordances; it is also a property of spoken and written discourse. This can be represented in tabular fashion as shown in Table 2.1. Tables of this kind can however be used to represent corpus data with equal facility, as shown in Table 2.2.

Table 2.1 The parallelism of constant/variable in a matching clause relationship

            Such parallelism   is   not only        a property of   concordances
            it                 is   also            a property of   spoken and written discourse
Constant:   such parallelism   is                   a property of   multiple utterances/sentences
Variable:   –                  –    [correlative]   –               whether or not coherent

Table 2.2 The parallelism of constant/variable in a sample of concordance data for hour

            half     -hour   drive
            four     -hour   flight
            two      -hour   trip
            three    -hour   journey
            two      -hour   hop
Constant:   number   hour    (type of) journey
Variable:   which    –       which type

It is not an accident that clause relations and concordance output can be represented in similar ways. Semantic association, when it occurs in a precisely repeated textual context, functions as a kind of intertextual matching. The speaker/writer, primed to associate a word or word sequence with a particular semantic context, recognises the similarities between what they want to say at a particular point and what they have heard or read at other times, and (re)produces the priming. Thus the person responsible for writing a US defence speech (in Bastow’s data), recognising that what they want to say about US troops is much the same as what others have previously said, comes up with an utterance that is (partly) in a relation of matching compatibility with those earlier utterances (Hoey 1983).

In the same way that data illustrating semantic association can be regarded as forming textual relations, so text can be regarded as generating data for semantic associations. We can interpret matching relations of compatibility or contrast in text either as textual exploitations of existing semantic associations or as creations of ‘nonce’ primings for a brief textual moment. In the former case, the writer/speaker makes use of the priming of a word or word sequence by drawing on that priming twice in quick succession and thereby making it visible and available for interpretation (e.g. as contrast). In the latter case, the juxtaposition is not licensed by the primings available for the writer/speaker (or, more accurately, not by the primings for semantic association – it is likely that other primings are conformed to), but the presentation of the juxtaposition creates for the reader/listener a temporary priming such that the matching is interpreted in terms of that priming.

Darnton (2001) notes that the repetition of single words in writing for children is of limited efficacy in encouraging the development of reading skills, but that the use of repetition as part of matching relations is of an altogether greater value. New words are more readily understood by children because of the repeated context in which they occur. We can interpret this as indicating that successful stories for children encourage them to make use of their priming experience, to take a step in abstraction from collocational priming to semantic associational priming or to rediscover semantic associations in writing that have previously only been encountered in speech.
Many of the ‘nonce’ primings created by the matching relations become permanent for the child (and subsequent adult), despite their original temporary status. For children brought up in English or American homes and exposed to traditional folk tales, ‘I’ll huff and I’ll puff’ and ‘What big teeth you’ve got!’ represent permanent primings for huff, puff and teeth.

Semantic prosody, semantic preference and semantic association

The concept of semantic association is not my own, although the label is. It grows out of two different concepts, sometimes confused with each other (including by me, e.g. Hoey 1997b). It was early noticed that the collocations of a word or word sequence often group in interesting ways. Sinclair (1991: 112) notes that ‘many uses of words and phrases show a tendency to occur in a certain semantic environment. For example, the verb happen is associated with unpleasant things – accidents and the like’. So when John Travolta’s character in Pulp Fiction utters the words ‘Shit happens’, he is summing up an important characteristic of happens, that it often occurs with bad things, especially when the subject is fleshed out. For example:

  6. . . . until food runs out or some other disaster happens
  7. . . . and as a result of his action something unpleasant happens to him.

This phenomenon was labelled semantic prosody (Louw 1993), by analogy with Firth’s view of the sound system as prosodically organised. Firth (1957) argued that when we pronounce a word such as /1p/ our mouth is already shaping the [1] sound even as it makes the [] sound. There is a sense then in which the [1] sound has spread over its neighbours, a fact which conventional phonetic script representations of the sound system disguise. He therefore favoured a phonetic description that did not treat each sound as a discrete element to be combined with other discrete elements but recognised that certain features are spread over the conventionally recognised units. In the same way, according to Louw, certain features of a word’s meaning are to be found already present in its surrounds. Its influence is spread around so that it affects and limits the choices available to the user, a fact which conventional representations in thesauri and dictionaries disguise and which most grammars ignore. Stubbs (1995) illustrates semantic prosody with the item cause, which, to an even greater extent than happen, carries bad news around with it; cancers are ‘caused’ much more frequently than cures. It is worth quoting his discussion in a little detail since his example is insightful. He states the prosody, gives sample collocates supporting the prosodic statement and then provides sentence illustrations:

    1 CAUSE: A cause is something that makes something happen. To cause something means to make it happen.
    1a Most frequently, 90%, the circumstances are unpleasant. Typically, what is caused is: an accident, cancer, concern, damage, death, disease, pain, a problem, problems, trouble.
    1b The circumstances can include a wide range of unpleasant things, mostly expressed as abstract nouns, such as: alarm, anger, anxiety, chaos, commotion, confusion, crisis, delay, difficulty, distress, embarrassment, errors, explosion, harm, loss, inconvenience, nuisance, suspicion, uneasiness.
    1c Frequently, the unpleasant collocates are medical: Aids, blood, cancer, death, deaths, disease, heart, illness, injury, pain, suffering, symptoms, stress, virus . . .
    1g . . . typical examples are:
    • the rush hour causes problems for London’s transport
    • dryness can cause trouble if plants are neglected
    • considerable damage has been caused to buildings
    • I didn’t see anything to cause immediate concern
    • some clumsy movement might have caused the accident
    (Stubbs 1995: 247)

The term semantic prosody, however, is inappropriate as a way of describing the processes operating in the Bryson example and in Baker’s and Bastow’s data on a number of counts. In the first place, Louw and Stubbs both seem to limit it to positive and negative effects. Thus happens and cause have in these terms negative prosody. But while Bastow’s example of young men and women’s association with what I have labelled COMPLIMENT is easily interpreted as positive prosody, the examples of NUMBER and JOURNEY do not admit of such interpretation. In several papers, I sought to extend the term to cover such cases (e.g. Hoey 1997a, 1997b, 1998), but I now recognise that this was unhelpful and would ask readers of those papers to interpret my references to semantic prosody as references to semantic association.

In the second place, the prosodic claim has come under attack. Although he is comfortable with the notions of collocation and colligation (see Chapter 3), Whitsitt (2003) challenges the claim made for semantic prosody that words are coloured by their characteristic surroundings, arguing that this position is flawed both philosophically and in terms of its ability to cope with readily available counter-evidence, particularly where language functions metaphorically. His arguments are convincing, though perhaps the difference is not as great as Whitsitt thinks between saying that cause (for example) is negative because its collocates are characteristically negative (a position which Whitsitt correctly identifies as unsustainable) and saying that because the co-texts of cause are characteristically negative we may interpret negatively those co-texts that are on the face of it neutral. O’Halloran and Coffin (2004), for example, present evidence that a succession of negative co-texts for a word encountered in a particular newspaper text permits a negative reading of an otherwise apparently neutral co-text.

The third reason for not persisting with the term is one of clarity; quite simply, the term has another, rather different sense. Louw (1993) ascribes his use of the term to personal communication from Sinclair, but Sinclair himself (1999) uses the term to refer to the meaningful outcome of the complex of collocational and other choices made across a stretch of language. For the phenomenon I am talking about in this chapter he uses the term semantic preference.

The terms semantic preference and semantic association may be seen as interchangeable. My reason for not using Sinclair’s term is that one of the central features of priming is that it leads to a psychological preference on the part of the language user; to talk of both the user and the word having preferences would on occasion lead to confusion. Accordingly, the term that is used here, as will already be apparent, is the bland but transparent one of semantic association. The change of term does not represent a difference of position between Sinclair and myself.

The definition of semantic association that we have arrived at is that it exists when a word or word sequence is associated in the mind of a language user with a semantic set or class, some members of which are also collocates for that user.

The semantic associations of consequence

At the end of Chapter 1, I noted that one can look at linguistic phenomena either as they apply in a particular piece of text or as they apply in the use of a particular word. I want now to consider in a little detail the operation of semantic associations in connection with a particular word. Stubbs (1995) looked, as we have seen, at the verb cause; to complement this, let us look at what is caused, i.e. at the noun consequence, in its meaning of ‘result’ (as opposed to ‘significance’). From a concordance of 1,817 lines drawn substantially from the Guardian corpus but supplemented by data from the Bank of English, I found 456 instances of consequence premodified by an adjective and sought to classify the adjectives according to their semantic similarities.

The first and largest of the semantic associations of consequence identified in my corpus was a class of adjectives that alluded to the underlying logic of the process that consequence was describing; this semantic association comprised 59 per cent of all premodifying adjectives. Examples are:

  8. Whatever his decision, it will be seen as a logical consequence of a steady decline in influence.
  9. . . . it is the ineluctable consequence of having been in power for ever . . .
  10. Mr Haughey’s support for liberal reform is a direct consequence of the election of President Mary Robinson last November.
  11. What is certain is that the results of Milosevic’s experiment will be under intense scrutiny in Moscow with the probable consequence that a suitable scion of the Romanov family is crowned Tsar by the Patriarch of All the Russians.
  12. . . . disability is not a natural and inevitable consequence of old age.

Unsurprisingly, perhaps, given the size of the group, the ‘logic’ adjectives can, with a little ingenuity, be further divided into three sub-classes (rather as cause has a semantic association of ‘disease’ which is a sub-class of the association of ‘bad things’ noted by Stubbs). The distinction between the sub-classes is not watertight but has, I hope, value all the same. The first sub-class refers to necessity (unavoidable, inevitable, inexorable, inescapable, and so on); instances 8 and 9 illustrate this sub-class. The second sub-class refers to the directness or the stages of the logical process (direct, ultimate, long-term, immediate, knock-on and so on); instance 10 exemplifies this. The third sub-class is concerned with the naturalness or expectedness of the process (likely, predictable, possible, probable, natural and the like); this use is illustrated in instance 11. A coupling of members of two of these sub-classes can be seen in instance 12.

The ‘logical’ association accounts for a clear majority of the adjectives premodifying consequence in my sample. The next largest semantic association exemplifies the insight originally expressed by Louw in that it is made up of adjectives that evaluate negatively the consequence to which they are attached. This association accounts for 15 per cent of the cases examined and includes such items as awful, dire, appalling, sad and regrettable. (One ‘logic’ adjective, inexorable, was also included in the count because its connotations are so negative.) Examples are:

  13. The doleful consequence is that modern British society has been in- tensely politicised.
  14. Yet another disastrous consequence, Smallweed assumed, of having the Tories in power for so long.
  15. The Mecca tragedy was the grisly consequence of a deep antagonism.
  16. . . . the affair was the latest ludicrous consequence of the 1983 pro-life amendment.
    The third semantic association suggested by the data is that consequence associates with adjectives expressing a view as to the seriousness of the consequence. This category accounted for 11 per cent of the adjectives examined and included items such as serious, important, significant and modest. Examples are:
  17. One important consequence of this obsessive militarism has been a silent and undiscussed brain drain.
  18. The most serious consequence of this crime has been the effect on the children.
  19. . . . not every significant consequence of an action which is known to an individual will be equally important to the morality of the action.
  20. The most prominent consequence of this is that Americans shoot each other in industrial quantities.
    Adjectives referring to the seriousness of a consequence overwhelmingly outweigh those (like modest) claiming that the consequence was of no great importance.
    The final association observed in the data concerned adjectives that referred to the UNEXPECTEDNESS of the consequence; these account for 6 per cent of the instances considered. Examples are unintended, odd and strange. This association complements the third sub-category of the first group, in that consequences that conform to the logic of the situation are expected ones. Interestingly, though, adjectives referring to the unexpectedness outnumber those referring to the expectedness in a ratio of 2:1. Typical occurrences are the following:
  21. But this ascent from gut hatred to a plateau of sweet reason has had an unforeseen consequence.
  22. But that vastness, and the sheer sparseness of matter, has another curious consequence: extraterrestials are rare.
  23. Yet that very process brought its own surprising consequence.
  24. String theory – the idea that all the bits that add up to matter are just different modes of vibrations of infinitesimal bits of string – has an odd consequence.
    When all the semantic associations of consequence are grouped together, they account for 90 per cent of all adjectives that premodify the word. What this means is that if the corpus reflects an individual’s experience of reading the Guardian (or perhaps other) newspaper, then the word consequence will be primed for the reader in such a way that they will expect it to occur with such associations. It may also mean that writers for the Guardian are productively primed to use consequence with these associations.

    Pragmatic association

    Just as a word or word sequence may be primed for semantic associations, so it may be primed pragmatically as well. Pragmatic association occurs when a word or word sequence is associated with a set of features that all serve the same or similar pragmatic functions (e.g. indicating vagueness, uncertainty). The boundaries between pragmatic association and semantic association are not going to be clear cut, because priming occurs without reference to theoretical distinctions of this sort. An example, though, of the operation of pragmatic association is that the word sixty, in addition to being typically primed for semantic association with UNITS OF TIME, UNITS OF DISTANCE and AGE, is typically associated with expressions of VAGUENESS. Thus I attest in my data:
    about sixty, over sixty, around sixty, more than sixty, an average of sixty, some sixty, almost sixty, nearly sixty, fifty to sixty, between fifty and sixty, fifty or sixty, up to sixty, maybe sixty, getting on for sixty, a good sixty, sixty or more, sixty-odd, sixty-some, sixty plus, sixty or so
    In writing, the pragmatic association just illustrated only applies to the literal form of sixty, not to the numeral form 60. They are therefore differently primed, despite being apparently no more than alternative orthographic versions of the same word. In speech, the distinction of course does not operate; my spoken corpus is too small to allow me to make confident claims about the operation of the above pragmatic association in conversation, but I would predict that most speakers are primed to use sixty with VAGUENESS in a range of types of spoken discourse (but not perhaps courtroom discourse or parliamentary debates). Where exactness is needed, I predict that it would often be made explicit, i.e. exactly sixty, but that this would not be true of, say, sixty-four. My spoken data, meagre as they are, partly support these predictions, in that I have four instances of sixty-four, sixty-nine etc., none of which are marked for VAGUENESS, and four instances of sixty, two of which are definitely marked for VAGUENESS:
  25. It’s fifty, sixty or more.
  26. There are maybe fifty, sixty, I’ve lost count of the number.
    A third is arguably marked for VAGUENESS:
  27. Sent out more than fifty. I did sixty and Caroline copied even more.
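    The marking of sixty for VAGUENESS discussed above can be checked mechanically in a rough way. The following is only a minimal sketch: the marker lists are a hand-picked subset of the attested wordings (an assumption made for illustration, not a complete inventory), and the function name is hypothetical. It flags instances 25 and 26 but not instance 27, which is a reminder that ‘arguably marked’ cases still need a human reader.

```python
import re

# Hand-picked subset of the attested wordings; an assumption for this sketch,
# not a complete inventory of vagueness markers.
BEFORE = ["about", "over", "around", "more than", "some", "almost", "nearly",
          "up to", "maybe", "fifty to", "fifty or", "fifty and", "fifty,"]
AFTER = [" or more", " or so", " plus", "-odd", "-some"]

before = "|".join(re.escape(m) for m in BEFORE)
after = "|".join(re.escape(m) for m in AFTER)

# First branch: marker + "sixty", where "sixty" is not part of "sixty-four" etc.
# Second branch: "sixty" + a following marker such as "or so" or "-odd".
VAGUE_SIXTY = re.compile(
    rf"(?:\b(?:{before})\s+sixty(?![\w-]))|(?:\bsixty(?:{after})\b)",
    re.IGNORECASE,
)


def is_marked_for_vagueness(utterance: str) -> bool:
    """Return True if 'sixty' co-occurs with one of the listed markers."""
    return bool(VAGUE_SIXTY.search(utterance))


if __name__ == "__main__":
    for line in [
        "It's fifty, sixty or more.",
        "There are maybe fifty, sixty, I've lost count of the number.",
        "Sent out more than fifty. I did sixty and Caroline copied even more.",
    ]:
        print(is_marked_for_vagueness(line), "|", line)
```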
    The word reason provides us with another instance of priming for pragmatic association. The relation of affirmation-denial has been shown (Winter 1979; Williames 1985; Hoey 1983, 2001) to be a pivotal feature of writer/reader and speaker/listener relationships, and reason is typically primed to associate with acts of DENIAL – denial either that something is a reason, or that the reason is known, or that the reason matters. Just as LOGIC could be divided into three sub-categories of semantic association for consequence, so DENIAL can probably be sub-divided along the lines just mentioned. However, here I group them together. Examples include:
  28. Mahathir sees no reason to tinker with success
  29. Unless you have any reason to suspect a murder, I’d . . .
  30. But there was no reason on God’s earth why I . . .
  31. There is no reason to suppose that our stay here . . .
  32. Really I see no reason why I should be obliged to . . .
  33. They’d have no reason to come to the surface
  34. That’s not the reason why . . .
  35. There’s no medical reason why a baby needs to change
    as well as a number of idiomatic word sequences such as: for some reason, for no obvious reason, for some unknown reason, for some reason or other, whatever the reason.
    Statistical support for this pragmatic association comes in the form of an analysis I undertook of 7238 instances of reason with postmodification. Of these, 4747 were found to affirm the reason and 2491 either denied the reason or denied knowing what it was. It has been shown (Halliday 1993; Halliday and James 1993) that the ratio of positive to negative clauses in general English is 9:1. Here however we have a ratio of close to 2:1. What this means is that when speakers or writers use reason, they are typically primed to use it as part of a pre-emptive move in their dealings with their audience. The listener/reader may, in the particular textual context, have an expectation that the speaker/writer will answer the question ‘Why?’ (see e.g. Hoey 1983, 2001). Using a construction with reason of the kind considered above, the speaker/writer can override such an expectation.
    The issues raised by these data for sixty and reason are similar to those we have already considered. We have the same posited relationship between collocation and association and, to some extent, the same possibility of intertextual matching, though there are grounds for seeing this as a potential point of contrast with semantic association. Pragmatic association can be studied from two directions. On the one hand we can look at the operation of pragmatic factors on data of particular kinds, as do Partington and Morley (2002), Partington (2003, 2004), Garcia and Drescher (2003) and Pinna (2003). This work is throwing up valuable evidence for the discourse considerations under which pragmatic associations operate. On the other hand, pragmatic markers can be looked at as items in their own right with their own priming; see for example the work of Marín-Arrese (2003) and her colleagues in Spain.
It is quite possible that this will prove a readier route to the study of pragmatic associations, since it is not yet clear how often pragmatic associations are attached to individual words as opposed to word sequences and clauses that may themselves be the product of various collocational, semantic associational and (as we shall see) colligational primings.
    In Chapter 1, I showed how the expression won’t say a word against is built out of collocational nesting. That discussion necessarily simplified the picture somewhat. Firstly, say belongs to the semantic set COMMUNICATIVE INTERCHANGE, other members of which include hear and spoke. Secondly, and more pertinently in this context, the nested combination COMMUNICATIVE INTERCHANGE a word against is characteristically primed to have pragmatic association with both denial and hypotheticality – for example:
  36. And I would never say a word against him.
  37. . . . it is difficult to find anyone prepared to say a word against him.
  38. They looked rough, but Esther would not hear a word against them.
    Preliminary evidence suggests that this is the norm and that nested combinations of words are more likely to be primed for pragmatic association than individual words.
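    The arithmetic behind the affirmation/denial figures reported above for reason can be set out explicitly. This is a minimal worked sketch using only the counts given in the text; the variable names are illustrative, and the 9:1 baseline is the figure attributed above to Halliday and James.

```python
# Counts reported in the text for 'reason' with postmodification.
affirm = 4747   # instances that affirm the reason
deny = 2491     # instances that deny the reason or deny knowing it

total = affirm + deny                 # 7238, the size of the sample
ratio = affirm / deny                 # affirmations per denial, close to 2:1
deny_share = deny / total             # proportion of denials in the sample

# A 9:1 ratio of positive to negative clauses implies that one clause in
# ten is negative in general English.
baseline_deny_share = 1 / (9 + 1)

if __name__ == "__main__":
    print(f"total instances:        {total}")
    print(f"affirm : deny ratio:    {ratio:.2f} : 1")
    print(f"share of denials:       {deny_share:.0%}")
    print(f"baseline negative rate: {baseline_deny_share:.0%}")
```

    On these figures roughly one instance in three involves denial, against a baseline of one clause in ten.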
    The role of intuition in identifying semantic and pragmatic association

    I have been arguing that for Guardian readers and writers, consequence is primed to participate in a number of semantic associations and that sixty is primed to participate in a particular pragmatic association, in the same way that both words participate in a number of collocations. But it may be objected that I have relied unduly on intuition with regard to what is included or excluded in the lists. There is no avoiding the fact that I have used intuition and that my intuition, like everybody else’s, has no privileged status. Another analyst might have categorised the semantic and pragmatic associations differently. However, the relationship between intuition and priming is a complex one. On the positive side, it is implicit in the notion of priming that every language user’s experience of the language(s) they use is unique to them. It follows in principle that the semantic and pragmatic associations recognised by one user might differ in detail from those of another. (However, for reasons explained in Chapter 10, the differences will usually be less great than the principle might admit.) Intuition, then, may sometimes usefully reflect and give us insights into priming differences. On the negative side, intuition may sometimes be distorted by the needs of the researcher (Labov 1975) or by the education of the informant, who may have attempted, quite possibly unsuccessfully, to modify their primings. Given the former kind of distortion, the case for a particular semantic or pragmatic association cannot rest on anyone’s say-so, but given the latter kind of distortion, the case cannot rest on the judgements of informants either. There is, however, one good source of considered intuitions that is not geared to providing the answer one wants to hear and that, oddly enough, is a dictionary prepared without the use of corpora.
(Corpus-based dictionaries are too linguistically aware to act as independent support of a corpus linguist’s intuitions!) Lexicographers describe each word separately and are not obliged to group them in sets (whatever they do in practice). If therefore their definitions highlight common semantic features in a set of words, these might be seen as supportive of the analyst’s intuitions regarding a posited association.
    For this purpose, I take the Collins Dictionary of the English Language (CDEL) (1979), both because it was probably the last major dictionary produced without major input from the computer and because its editor, Patrick Hanks, is a highly respected lexicographer (he went on to edit, with John Sinclair, the first corpus-based advanced learners’ dictionary – the Collins COBUILD English Language Dictionary, 1987). The definitions for unavoidable, inevitable, inexorable and inescapable, all included in my intuitively identified semantic set of LOGIC (NECESSITY), are as follows in the CDEL, albeit much abridged here:
    1. unavoidable unable to be avoided; inevitable
    2. inevitable unavoidable; sure to happen; certain
    3. inexorable not able to be moved by entreaty or persuasion; relentless
    4. inescapable incapable of being escaped or avoided
      The definitions of unavoidable, inevitable and inescapable (1, 2 and 4) show their relatedness of meaning in that they all utilise the lexeme AVOID. The word inexorable on the other hand is more subtly related to its neighbours: the scenario underlying its definition is that X is unable to prevent Y doing Z, which is a special case of the scenario underlying the other definitions – X is unable to avoid Z.
      In the same way, consider the definitions for direct, ultimate, long-term and immediate, all included in my intuited semantic set of LOGIC (DIRECT) (there is no definition given in the CDEL for knock-on in the sense used with consequence):
    5. direct without delay or evasion; straightforward; without turning aside; uninterrupted; shortest; straight; without intervening persons or agencies, immediate
    6. ultimate conclusive in a series or process; final; the highest or most significant; elemental, fundamental, basic or essential
    7. long-term lasting, staying, or extending over a long time
    8. immediate taking place or accomplished without delay; closest or most direct in effect or relationship; having no intervening medium; direct in effect; contiguous in space, time, or relationship

    Although the connections among these words are less obvious than in the previous set, they are still there to be found. To begin with, the definition of immediate includes ‘direct’ as partial synonym and the definitions of both words include the word sequence ‘without delay’; long-term on the other hand specifies delay – ‘extending over a long time’. The definition for ultimate emphasises ‘conclusive in a series . . . final’, while that for immediate includes ‘closest’ and ‘having no intervening medium’, both definitions being compatible with the definition of ultimate if the series referred to is a series of two. A definition of knock-on, if we had one, might likewise include reference to stages in a series. Note, too, the connection with the first sub-category in the first definition: ‘evasion’ is a type of avoidance.
    Obviously reference to dictionary definitions does not eliminate the intuitive character of the allocation of items to categories, both because the definitions are themselves the product of intuitions (albeit well-informed ones) and because the very process of relating definitions that I have been demonstrating can be used to justify more than one grouping. But it does suggest that reference to semantic (or pragmatic) association is not fatally flawed and that these categories of priming can be operated with a degree of reliability.

    The (lack of) grammatical flexibility of semantic association

    If we look at the data used to illustrate pragmatic association, it is apparent that there is no tight syntactic relationship between the marker of vagueness and sixty or between markers of denial and reason. However, the data for semantic association suggest a different kind of relationship; both the closely parallel structures found in connection with the Bill Bryson sentence and the data for consequence point to semantic association often operating under tight constraints. But should we limit the description of semantic associations to particular grammatical positions? Although most of the cases of semantic association operating in the Bill Bryson sentence do pivot about single words or specified word combinations and are dependent on grammatical positioning, there are a couple of cases where the associations seem less dependent on a fixed word ordering and do not manifest the kinds of parallelism apparent in the examples given in the opening section of this chapter.
    As examples of semantic association not being dependent on a particular grammatical ordering, consider the following data. The word sequence by bus occurs 110 times in my corpus. Of these 65 (59 per cent) are associated with location, but the location is not always provided in the same structural form:
  39. We were taken by bus to another camp
  40. One of my staff was going home by bus
  41. Railway passengers were ferried by bus between Radlett and St Albans
  42. A traveller who had reached the border town of Myawadi by bus from Rangoon yesterday . . .
  43. While crossing Quito, Ecuador, by bus, I noted the following message . . .
    Ten (9 per cent) are associated with measurement of time, again with some variation in the wording of the measurement dominating:
  44. Once she had got to Bassi, four hours away by bus . . .
  45. This is unlikely to help the half hour she has to spend each morning getting to work by bus
  46. . . . the Czech Republic, who set off by bus from Prague on Sunday afternoon with the intention of arriving in time for last night’s opening ceremony
    Both these associations are of course reflected in Bill Bryson’s sentence. Perhaps unexpectedly, only one instance of by bus occurs with a measurement of distance (except in so far as time is used indirectly to measure distance) and, again, there is no measurement of distance in the Bryson sentence.
    Sticking with the same sentence, in winter seems to have a semantic association with ‘timeless truths’ as opposed to reports of specific events. It is instructive as a way of demonstrating this to compare the distribution of in winter in my corpus with that of similar wordings such as in the winter, during the winter and that winter (see Table 2.3). Percentages refer to the proportion of instances of the word sequence in question. I have emboldened those percentages that seem to indicate likely strong semantic associations.

Table 2.3 Distribution of temporal expressions of winter across propositions of different generality

                   in winter    in the winter    during the winter    that winter
                   (226)        (331)            (203)                (26)
specific event     29 (13%)     179 (54%)        130 (64%)            26 (100%)
timeless truth     197 (87%)    152 (46%)        73 (36%)             –

    At first sight, one might seek to attribute this distribution to the absence of a marker of definiteness in the word sequence in winter and its presence in the other three temporal expressions in the table. However, that would be to misread the table. In the first place, the distribution shows that the word sequence in winter can be used in the reporting of a specific event. In the second place, it shows that two of the other temporal expressions are regularly used in timeless truths. Indeed, in the winter, which only differs from in winter in its inclusion of the definite article, occurs with timeless truths nearly half the time. Markers of definiteness correlate with the reporting of specific events; they do not provide grounds for denying that in winter is primed for Guardian readers and writers to occur with ‘TIMELESS’ TRUTH.
    In the instances just discussed the evidence appears to suggest that semantic associations are not grammatically restricted when they are primed. We should not, however, be in too much of a rush to assume grammatical freedom for semantic associations. If the case for the relationship between collocation and semantic association has been correctly made, we would expect to find that the properties of collocation are reflected in semantic association. Closer inspection of the data for representation of location and measurement of time in association with by bus shows in fact that though there is variation in the representation of this semantic association, certain structures dominate.
For location it is the to PLACE and from PLACE structures; for measurement of time it is the NUMBER hours (away) from structure.
    With regard to grammatical restriction, we find that though there are indeed some collocates that appear to be primed so that they have a degree of positional freedom, many are tied to one position and one grammatical relationship. Consider, for example, the following data generated by WordSmith (Scott 1999), given in Tables 2.4 and 2.5, which list the top 20 collocates of consequence and of consequences occurring immediately prior to the node word in order of decreasing frequency. The analyses were derived from 1,764 instances of consequence and 3,611 instances of consequences. WordSmith does not identify collocates of one or two letters’ length; thus and in are missing from the first list, for example, although inspection of the concordance shows both words to be very frequent collocates. (Also, not all of the items identified by WordSmith as collocates would necessarily be so identified if other measures were used.)

Table 2.4 The collocates of consequence identified by WordSmith as occurring immediately prior to the node word

Table 2.5 The collocates of consequences identified by WordSmith as occurring immediately prior to the node word

    The first column in each table indicates the collocates identified by the program. The second column gives the total number of occurrences of the putative collocate in the environment of 5 words prior and subsequent to the node word. The third and fourth columns, fairly obviously, indicate the broad distribution of the collocate on either side of the node word. The remaining columns indicate the exact positions prior to the node word in which the collocate occurs. It will be seen that some of the collocates are highly position-specific. Thus in Table 2.4, inevitable, the most common lexical collocation of consequence, occurs 85 times in the data, and 82 of these are immediately prior to the node word (conventionally indicated in such tables as being to the left of the node word, that being the spatial position in the concordance from which these data were derived); there are no instances whatsoever of its occurrence subsequent to the node word (shown as RIGHT in the table), despite the apparent plausibility of an utterance such as the consequence was inevitable. Similarly, 47 out of 52 instances of direct and all 11 occurrences of immediate occur immediately prior to consequence, and 87 out of 115 instances of one and 28 out of 32 instances of another occur either one or two places prior, with immediately prior position being the heavily preferred option. Most strikingly of all, in the same table, logical occurs 15 times with consequence and all 15 occurrences are in the same position vis-à-vis the node word. The same general pattern will be seen to repeat itself for a number of other items in Table 2.4.
    The same goes for consequences, as can be seen in Table 2.5. Of 156 instances of serious, for example, 143 occur prior to consequences and of these 122 occur immediately prior. Similarly, the 130 occurrences of political in the environment of consequences include 112 before the node, and 87 of these occur as (part of) the word sequence political consequences. Again, the pattern repeats itself through Table 2.5. It will be noted, by the way, that these lists confirm the conclusions reached by Renouf, Sinclair, Stubbs and others, discussed in Chapter 1, that the collocational behaviours of grammatically different instances of a lemma may overlap very little; we might therefore expect to see the same lack of overlap in semantic association.
    As a way of addressing the possible grammatical restrictions on the primings of semantic association, and given the above data for the collocations of consequence (Table 2.4), let us look again at the distribution of key collocates associated with the four main semantic associations of consequence, discussed above. For example, inevitable is the most common collocate belonging to the semantic set of LOGIC (NECESSITY), with which consequence has a semantic association.
If inevitable is restricted in position as a collocate, is there a similar restriction on the semantic association? And do cognate forms of inevitable (inevitably, inevitability) collocate with consequence, in which case is there a similar freedom for the semantic association of LOGIC (NECESSITY)? The question then is whether the association is primed to occur in structures as various as:
  47. Inevitably the consequence was that . . .
  48. The consequence was inevitable.
  49. The inevitability of this consequence was . . .
  (all fabricated)
    Investigation suggests that such fabrications do not have their match in actual performance. I examined 1817 examples of consequence, looking for the four associations occurring in grammatical positions other than that of the modifying adjective to consequence. In particular I looked for them in the structures represented by the examples given above (47–9). However, despite repeatedly re-sorting the concordance in order to highlight different possibilities and despite the intuitive naturalness of fabricated examples such as 47, the four major semantic associations described above as being formed by consequence were virtually never found in structures other than premodifying adjective. For instance, there was only one case of the structure represented by example 47 that contained one of the recognised associations, namely:
  50. Unfortunately the consequence of this is that the stigma stays with me . . .
    There were likewise very few examples of the combination of association and grammatical structure represented by fabricated example 48. Indeed the three examples below are all there were in 1817 lines:
  51. …a hazardous consequence is perhaps unavoidable.
  52. The second consequence is more relevant to the newspapers themselves.
  53. The consequence was evident in the state of housing, schools, hospitals.
    Not one example occurred in my data of the possibility represented by example 49, despite the intuitive naturalness of fabrications such as:
  54. The severity of the consequence caught them all by surprise.
  55. The importance of this consequence was that . . .

The armchair linguist in me may smile sweetly on these examples and pronounce them good, but such structures involving the word consequence do not occur in my data and I have therefore to say that the Guardian reader is not primed to expect them in the newspaper (or, I suspect, elsewhere – but that needs to be investigated). In short, the semantic associations of consequence appear to be grammatically tied. As this book develops, we shall be looking at other examples of semantic association and will find that consequence is not odd in this respect and nor do the examples of such associations previously cited in the literature seem to challenge the claim that grammatical conditions have to be met before a semantic association can operate.

If a word’s semantic associations indeed depend upon certain grammatical conditions being met, it follows that a word may be primed to operate under certain grammatical conditions directly without there being an implication of semantic association; that possibility is explored in Chapter 3.
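
The positional counts described earlier for Tables 2.4 and 2.5 (total occurrences within five words of the node, the split between left and right, and the count immediately prior to the node) can be approximated in a few lines. What follows is a minimal sketch under stated assumptions: it is not WordSmith’s own procedure, the function names are hypothetical, the input is assumed to be an already tokenised, lower-cased text, and the toy sentence in the usage example is fabricated.

```python
from collections import Counter, defaultdict


def positional_profile(tokens, node, span=5):
    """For every word within `span` tokens of `node`, count how often it
    occurs at each offset (negative = before the node, positive = after)."""
    profile = defaultdict(Counter)
    for i, token in enumerate(tokens):
        if token != node:
            continue
        for offset in range(-span, span + 1):
            j = i + offset
            if offset == 0 or j < 0 or j >= len(tokens):
                continue
            profile[tokens[j]][offset] += 1
    return profile


def summarise(profile, collocate):
    """Collapse one collocate's offsets into total, left, right and
    immediately-prior (L1) counts."""
    counts = profile[collocate]
    left = sum(n for offset, n in counts.items() if offset < 0)
    right = sum(n for offset, n in counts.items() if offset > 0)
    return {"total": left + right, "left": left, "right": right, "L1": counts[-1]}


if __name__ == "__main__":
    # Fabricated toy text, for illustration only.
    toy = "the inevitable consequence was that a direct consequence followed".split()
    profile = positional_profile(toy, "consequence")
    print(summarise(profile, "inevitable"))
    print(summarise(profile, "direct"))
```

A collocate that is tied to one position shows up as a total concentrated almost entirely in the L1 count, which is the pattern reported above for inevitable and logical.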


3 Lexical priming and grammar

Revisiting the Bill Bryson sentence and its ‘translation’

We saw in the previous chapter that when a word or word sequence is primed for semantic association the priming may involve grammatical constraints. In this chapter I want to consider the claim that a word or word sequence may be primed to occur in (or avoid) certain grammatical environments irrespective of its priming for semantic association or collocation (though semantic association and collocation will not of course be entirely absent from my description).

We saw in Chapter 2 that in winter has a semantic association with ‘timeless truth’. Timeless truths have a tendency to be reported in the present tense, so it seems worthwhile to ask whether in winter is primed to occur in clauses using the present tense, again comparing in winter with in the winter, during the winter and that winter.

Because they represent distinct choices for the user, I have treated present perfect and modal auxiliaries separately; they have not therefore been included in the figures for the present (or past) tenses. Past perfect uses, of which there were very few, are however incorporated in the figures for the past tense. By pure (and highly convenient) coincidence, there were exactly 305 instances of clauses in the present tense and 305 in the past tense, which means that the distribution of tense in clauses referencing winter is exactly that predicted by Halliday and James (1993) for the tense system as a whole.

The first and most obvious comment is that the tenses distribute differently for the four word sequences. Table 3.1 gives the raw figures for tenses associated with the word sequences. To explore the significance of these figures, we can choose to look at the distribution of the tenses across the word sequences or the distribution of the word sequences across the tenses. Tables 3.2 and 3.3 represent each of these possibilities. Thus Table 3.2 shows the proportion of instances of each word sequence that occurs with a particular tense choice, and Table 3.3 represents the proportion of occurrences of each tense that occurs with the various word sequences. I have emboldened those percentages that seem to deserve comment.

Table 3.1 Distribution of tense (and other) choices in clauses containing winter prepositional phrases

                   in winter    in the winter    during the winter    that winter
Present tense      133          111              61
Past tense         40           165              74                   26
Present perfect    5            5                19

Table 3.2 Percentage of each word sequence occurring with each tense (or other verbal) choice

                   in winter    in the winter    during the winter    that winter
Present tense      59%          34%              30%
Past tense         18%          50%              36%                  100%
Present perfect    2%           2%               9%

Table 3.3 Percentage of each tense (or other verbal) choice occurring with each word sequence

                   in winter    in the winter    during the winter    that winter
Present tense      44%          36%              20%
Past tense         13%          54%              24%                  9%
Present perfect    17%          17%              66%
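
The relationship between the raw counts in Table 3.1 and the percentages in Tables 3.2 and 3.3 can be made explicit with a short calculation. The sketch below is illustrative only. It assumes that the denominators for Table 3.2 are the totals for each word sequence given in Table 2.3 (226, 331, 203 and 26), which is the reading under which the published percentages are reproduced, and it treats empty cells as zero.

```python
SEQUENCES = ["in winter", "in the winter", "during the winter", "that winter"]

# Total instances of each word sequence, as given in Table 2.3 (assumed here
# to be the denominators used for Table 3.2).
TOTALS = {"in winter": 226, "in the winter": 331,
          "during the winter": 203, "that winter": 26}

# Raw counts from Table 3.1; empty cells are treated as 0.
COUNTS = {
    "present": [133, 111, 61, 0],
    "past": [40, 165, 74, 26],
    "present perfect": [5, 5, 19, 0],
}


def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


# Table 3.2: share of each word sequence that takes a given tense.
by_sequence = {
    tense: [pct(n, TOTALS[seq]) for n, seq in zip(row, SEQUENCES)]
    for tense, row in COUNTS.items()
}

# Table 3.3: share of each tense that goes to a given word sequence.
by_tense = {
    tense: [pct(n, sum(row)) for n in row]
    for tense, row in COUNTS.items()
}

if __name__ == "__main__":
    for tense in COUNTS:
        print(tense, by_sequence[tense], by_tense[tense])
```

Summing the first two present-tense figures in the second mapping gives the combined share of the two sequences beginning with in.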

What Table 3.2 seems to confirm is that in winter is indeed likely to be primed for Guardian readers to occur in clauses with the present tense, with 59 per cent of cases of in winter occurring with this tense. The word sequence in the winter on the other hand is likely to be primed to occur in clauses with the past tense; expectedly, that winter appears to allow of no other possibility, though with different data one might speculate that modal auxiliaries might also occur.

Table 3.3 shows the same data from the perspective of the distribution of the tenses (and other verbal group choices) across the word sequences. It presents the same picture as Table 3.2 but with slightly different emphases. If the present tense is chosen and a winter word sequence is to be used, there is an 80 per cent chance of its beginning with in, with in winter being the most likely option. If the past tense is chosen under the same conditions, in the winter is as likely to occur as all the other options put together. If the present perfect is chosen, it is the word sequence during the winter that is most likely to occur.

What these data indicate is that one’s choice of tense or aspect may be made at the same time as one’s choice of temporal expression. The word winter is primed to occur with in, in the, during and that (as opposed, for example, to within or inside). The nested combinations in winter, in the winter, during the winter and that winter are then primed to occur with the different tense and aspect possibilities. Of course, the priming also works the other way round. So the nesting of winter and present tense is typically primed to produce in winter, while the nesting of winter and have + VERB is typically primed to produce during the winter.

An analysis of the kind just reported throws up other observations. In the course of my investigation, I noticed that when the winter word sequences were in initial position, they occurred with particular types of clause process. Two of these seemed especially to be distributed in a non-random fashion: Material processes and Relational processes. Halliday (1994) says of Material processes that they are ‘processes of “doing” ’. They express the notion that some entity ‘does’ something – which may be ‘done’ to some other entity (p. 102). They have an obligatory actor and an optional goal (the ‘directed at’ entity in the clause). Of Relational processes he says that they ‘are those of being . . . The central meaning of clauses of this type is that something is’ (p. 112). They have either an identifier and an identified or a carrier and an attribute.

Analysed with regard to these two kinds of process, the distribution was as shown in Table 3.4. Excluding that winter from the picture, because there are too few data, the distribution of each word sequence across the processes and of each process across the word sequences is as in Tables 3.5 and 3.6. As previously I have emboldened the percentages that seem worth commenting on. Table 3.5 shows how the winter word sequences are distributed across clauses containing Material, Relational or other processes. Table 3.6 shows how Material and Relational processes are distributed across clauses containing the winter word sequences. Because we are looking only at clause-initial instances of the winter expressions, the data are fewer and any conclusions drawn must be tentative. However, Tables 3.5 and 3.6 suggest that, for readers of the Guardian, in the winter and during the winter are likely to be quite strongly primed to occur with Material process verbs and that in winter will probably be primed to occur with Relational process verbs, though here the priming may be less strong. It will be noticed that in Bill Bryson’s opening sentence to Neither Here Nor There the thematised word sequence in winter occurs in a clause in the present tense that makes use of a Relational process. It is no wonder that it seems so natural; he has instinctively followed his primings.

Table 3.4 Distribution of temporal expressions with winter across Material and Relational processes

| | in winter | in the winter | during the winter | that winter |
|---|---|---|---|---|
| Material process (84) | 21 | 34 | 25 | 4 |
| Relational process (43) | 24 | 13 | 5 | 1 |
| Other processes (12) | 1 | 3 | 5 | 3 (2 as subject) |

Table 3.5 The distribution of the winter word sequences across clauses organised round Material, Relational and other processes

| | in winter (46) | in the winter (50) | during the winter (35) |
|---|---|---|---|
| Material process | 46% | 68% | 71% |
| Relational process | 52% | 26% | 14% |
| Other processes | 2% | 6% | 14% |

Table 3.6 The distribution of Material and Relational processes across clauses containing the winter word sequences

| | in winter (46) | in the winter (50) | during the winter (35) |
|---|---|---|---|
| Material process | 26% | 43% | 31% |
| Relational process | 57% | 31% | 12% |
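
The relationship between the raw counts in Table 3.4 and the percentages in Tables 3.5 and 3.6 can be sketched in a few lines of Python. This is an illustrative reconstruction, not the procedure used for the original analysis: the function names are invented for the example, and the rounding rule (round half up) is an assumption chosen because it reproduces the published figures.

```python
# Raw counts from Table 3.4 (clause-initial winter expressions), leaving out
# "that winter", which the text sets aside because there are too few data.
COUNTS = {
    "Material":   {"in winter": 21, "in the winter": 34, "during the winter": 25},
    "Relational": {"in winter": 24, "in the winter": 13, "during the winter": 5},
    "Other":      {"in winter": 1,  "in the winter": 3,  "during the winter": 5},
}


def pct(part, whole):
    """Whole-number percentage, rounded half up with integer arithmetic."""
    return (200 * part + whole) // (2 * whole)


def by_sequence(counts):
    """Share of each process within each word sequence (cf. Table 3.5)."""
    sequences = list(next(iter(counts.values())))
    totals = {s: sum(row[s] for row in counts.values()) for s in sequences}
    return {process: {s: pct(n, totals[s]) for s, n in row.items()}
            for process, row in counts.items()}


def by_process(counts, processes=("Material", "Relational")):
    """Share of each word sequence within each process (cf. Table 3.6)."""
    return {process: {s: pct(n, sum(counts[process].values()))
                      for s, n in counts[process].items()}
            for process in processes}


if __name__ == "__main__":
    print(by_sequence(COUNTS))
    print(by_process(COUNTS))
```

The two functions divide the same counts by different totals: by_sequence normalises each column, by_process normalises each row, which is why the two tables give different emphases to the same data.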

When I constructed my alternative version of Bill Bryson’s sentence in Chapter 1, I did more than avoid the characteristic collocations and semantic associations of the vocabulary of the sentence. I also, in several cases, avoided the characteristic grammar associated with that vocabulary. An example of this is my use of pondered in the clause ‘though why travellers would select to ride there then might be pondered’. There are 1,057 uses of the lemma PONDER as a verb in my corpus, and only 8 of these are passive. The word pondered is generally comparatively rare, being used 22 times as part of the perfect aspect and 142 times as a past tense verb. The lemma PONDER appears to avoid both the past tense and the passive in my data. While the former may be specific to newspaper English, the latter is hypothesised to be true of many genres and domains. My ‘translation’ of Bryson’s sentence therefore introduced a relatively infrequent word in a highly infrequent grammatical structure (for that word).

Priming for Relational processes might have semantic implications but they are not of the same kind as those we were considering in Chapter 2; the same may be said for tense choice. The preferred avoidance of is, are, was, were, be, am, being, been, get, got + pondered can hardly be handled as a collocational matter. Collocations are not formulated in terms of combinations that are avoided (though they could be) and such an approach would involve the same inelegance of formulation that we considered in Chapter 2 in connection with semantic association. What we have in each of these cases is a kind of grammatical ‘collocation’ (though it is a different phenomenon from collocation between a lexical word and a grammatical word). The label that has been given to such a relationship is ‘colligation’, as briefly mentioned in Chapter 1.

A brief history of the term ‘colligation’

The notion of colligation has its origin in Firth, who introduced it thus:

The statement of meaning at the grammatical level is in terms of word and sentence classes or of similar categories and of the inter-relation of those categories in colligation. Grammatical relations should not be regarded as relations between words as such – between ‘watched’ and ‘him’ in ‘I watched him’ – but between a personal pronoun, first person singular nominative, the past tense of a transitive verb and the third person singular in the oblique or objective form.

(Firth [1951]1957: 13)

This formulation makes colligation hard to distinguish from grammar. However, when Halliday used the notion in his study of the Secret History of the Mongols, he used it in an importantly different way:

The sentence that is set up must be (as a category) larger than the piece, since certain forms which are final to the piece are not final to the sentence. Of the relation between the two we may say so far that: 1, a piece ending in liau or j˘e will normally be final in the sentence; 2, a piece ending in s˘i2, a, heu or san geu2 will normally be non-final in a sentence; 3, a piece ending in lai or kiu may be either final or non-final in a sentence.

(Halliday 1959: 46; cited by Langendoen 1968 as an example of Halliday’s use of colligation)

It will be seen that Halliday is here using ‘colligation’ to mean the relation holding between a word and a grammatical pattern, thus creating a midway relation between grammar and collocation, and this is the sense in which the term will be used in this and subsequent chapters.

The last five years have seen something of a resurgence of interest in colligation (see e.g. Sinclair 1996, 1999, 2004; Hoey 1997a, 1997b; Hunston 2001; Partington 2003). Sinclair’s work and mine developed independently but since we were colleagues for many years at the University of Birmingham and we worked closely together on the Collins COBUILD English Language Dictionary, it is more than possible that I acquired the concept during discussions with him. Or it may simply be an idea whose time has come.

Having said that, the idea was in fact in play before the resuscitation of Firth’s label. Colligation is implicitly illustrated in Sinclair (1991) in a number of places and Francis (1993), with its focus on corpus-driven grammar, also makes use of the concept, if not the word. In Hoey (1993), I show some of the colligational patterns associated with the word reason, again without making use of Firth’s term.

A definition of colligation

The basic idea of colligation is that just as a lexical item may be primed to co-occur with another lexical item, so also it may be primed to occur in or with a particular grammatical function. Alternatively, it may be primed to avoid appearance in or co-occurrence with a particular grammatical function.

So far we have talked about colligation as the grammatical associations a word or word sequence is primed to favour or avoid, but it is significant that Halliday, in the quotation given above, also formulates the colligational relationship in terms of sentential position. This is an important extension. It means that colligation may be interpreted as going beyond traditional grammatical relations and embracing such phenomena as the positioning of a word or word sequence within the sentence or paragraph and even its positioning within the text as a whole, as I shall argue in Chapter 7.

For current purposes, I suggest that colligation can be defined as:

  1. the grammatical company a word or word sequence keeps (or avoids keeping) either within its own group or at a higher rank;
  2. the grammatical functions preferred or avoided by the group in which the word or word sequence participates;
  3. the place in a sequence that a word or word sequence prefers (or avoids).

There are two things to note about this formulation. Firstly, colligational statements can be negative as well as positive. So it is a legitimate colligational statement to say of a particular lexical verb that it does not occur with the primary auxiliaries or that it avoids sentence-final position. We have already noted that PONDER is typically primed to avoid the passive voice. We might add that in winter appears, on the basis admittedly of scant data, to be primed to avoid processes other than Material and Relational (see Table 3.6).

In the remainder of this chapter I shall seek to show how a colligational description might proceed and the kind of insights into the nature of lexical priming that might be gained from this kind of description. With this in mind I turn once again to the long-suffering word consequence whose living conditions will once again be submitted to detailed inspection.
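
One way to make the three-part definition concrete is to record each colligational statement as a small data structure with an explicit polarity. The Python sketch below is purely illustrative: the class and field names are invented here, and the assignment of each example to one clause of the definition is a reading of the text, not a classification the chapter itself makes.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    """The three clauses of the working definition of colligation."""
    COMPANY = "grammatical company kept or avoided"
    FUNCTION = "grammatical function of the containing group"
    POSITION = "place in a sequence"


@dataclass(frozen=True)
class Colligation:
    item: str        # word or word sequence, e.g. "PONDER"
    kind: Kind       # which clause of the definition applies (assumed)
    target: str      # the grammatical pattern or position involved
    positive: bool   # False records an aversion, i.e. a negative colligation

    def describe(self):
        verb = "favour" if self.positive else "avoid"
        return f"{self.item} is typically primed to {verb} {self.target}"


# Statements taken from the discussion so far; the Kind labels are assumptions.
STATEMENTS = [
    Colligation("PONDER", Kind.COMPANY, "the passive voice", positive=False),
    Colligation("in winter", Kind.COMPANY, "Relational process verbs", positive=True),
]

if __name__ == "__main__":
    for statement in STATEMENTS:
        print(statement.describe())
```

Keeping polarity as a separate field reflects the point that negative statements are as legitimate as positive ones.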

A colligational description of consequence

In pursuit of the potential colligations of consequence I examined 1,809 instances of consequence in total, drawn from the Guardian-dominated corpus used in the previous chapter. As before, we have to be careful about assuming that patterns represented in my corpus (or any corpus, come to that) are indicative of primings that any individual may have. I reiterate that the corpus can only indirectly show us the kinds of ways that it is likely that a reader of the Guardian may be primed to use or recognise the word.

Consequence has two meanings, one of which, ‘logical outcome’, we were considering in the previous chapter. The other meaning, ‘importance’, is much rarer, only occurring 169 times in 1,808 lines. The relationship between the two uses of the word will be discussed in Chapter 5. In this chapter it is again the more common use to which I wish to give attention.

Consequence in its more usual sense is undeniably a common word in the language (at least in its written form); we would expect it to appear in every grammatical position and so it does. The question is whether consequence with this meaning is characteristically primed to occur in certain positions rather than others or to avoid certain grammatical contexts in favour of others.

A noun will always be part of some group or other word sequence and that group or word sequence will normally perform some function in a clause. One can therefore look at the distribution of any noun in terms of its occurrence within clause or group. In the next two sections we will examine the distribution of consequence in both clause and group, comparing its distribution with that of other nouns.

The colligations of consequence in the clause

The question I choose to address first is whether consequence may be typically primed to have a preference for (or an aversion to) certain grammatical functions within the clause or whether its use in all grammatical positions is exactly that which we would expect of any noun. To this end 1,615 instances of this use of the word were analysed to see whether they occurred as part of the Subject, as part of the Object, as part of the Complement or as part of a prepositional phrase functioning as Adjunct.

Obviously the raw figures or percentages of occurrence in each grammatical position will by themselves tell us little about the colligational preferences of consequence. We need to compare the grammatical distribution of this noun with that of other apparently similar nouns. In the first sentence of this section there are six singular nouns – question, preference, aversion, clause, use and noun. For purposes of comparison then I have taken four of these (noun and clause are excluded to avoid using linguistic terms, which might be taken to operate at a different degree of abstractness to the other words in the comparison) and examined their grammatical distribution. Three hundred instances were considered of each of the comparison words (except for aversion for which only 203 instances could be found), though the full 1,615 instances of consequence were analysed. Senses that were clearly separable and idiomatic uses that did not retain the word’s primary function were not included in the sample of 300; thus, for example, instances of the X in question and references to preference shares were excluded from the analysis.

My use of the terms Subject, Object, Complement and Adjunct is in line with normal use, except that, unlike Halliday (1994) for example, I distinguish Object from Complement (as do Sinclair 1972; Quirk et al. 1972, 1985; and Biber et al. 1999, 2002). Following these linguists, I define Object as having a different referent from Subject (unless it is filled by one of the self-reflexive pronouns such as himself) and as characteristically following transitive verbs, for example:
    1. I urge you to commute the death sentences that have been passed on them [Object].

Complement on the other hand is defined as having the same referent as Subject (again excepting cases of the self-reflexive pronouns) and typically follows the verb BE and other equative verbs such as BECOME and SEEM, for example:

    2. Both are guilty of the vilest crimes [Complement].

My analysis treated it as immaterial at this stage whether consequence appears as head of the nominal group or as part of the pre- or postmodification. So both 3 and 4 were picked up as examples of consequence functioning as Subject:

    3. A consequence of writing biography, even of the interim sort that I have just produced, [Subject] is preoccupation with the topic.
    4. . . . the danger of impregnation as a consequence of a split condom [Subject] was vastly less than the chance of picking up a sexually transmitted disease . . .

Instances of consequence occurring within a subordinate clause were treated as belonging to the Subject, Object, Complement or Adjunct of that clause unless the clause was itself postmodifying. Anything that did not fit the four basic grammatical categories was simply analysed as other (though the exclusions in fact mask another important colligation of consequence as we shall later see). Given these definitions, you might like to speculate how you might expect the word consequence to be distributed across the different grammatical functions.

As anyone who attempts the grammatical analysis of authentic data knows, one encounters rather more cases where a correct analysis is problematic than one might anticipate on the basis of conveniently simple, made-up examples. It is not always possible to distinguish postmodification, particularly of an adjective, from a prepositional phrase functioning as Adjunct; Adjuncts and postmodifying prepositional phrases are not quite as neatly separable as one might imagine. Particles following a verb are another area where existing criteria do not always let one arrive at an intuitively satisfying analysis.

Table 3.7 A comparison of the grammatical distribution of consequence in the clause with that of four other nouns

| | Part of Subject | Part of Object | Part of Complement | Part of Adjunct | Other |
|---|---|---|---|---|---|
| consequence | 24% (383) | 4% (62) | 24% (395) | 43% (701) | 5% (74) |
| question | 26% (79) | 27% (82) | 20% (60) | 22% (66) | 4% (13) |
| preference | 21% (63) | 38% (113) | 7% (21) | 30% (90) | 4% (13) |
| aversion | 23% (47) | 38% (77) | 8% (16) | 22% (45) | 8% (17) |
| use | 22% (67) | 34% (103) | 6% (17) | 36% (107) | 2% (6) |

Nevertheless, the analyses of the five words were largely unproblematic and the results are to be found in Table 3.7. As previously, I have emboldened those results worthy of attention. You will see that consequence is quite strikingly different from the other words in the table in its distribution among the grammatical functions. Only in the case of Subject is the distribution of consequence the same as that for the other nouns in our sample. For all the other clausal functions, there are positive or negative colligations for which the word is likely to be primed, and these deserve attention.
      1. There is a clear negative colligation between consequence and the grammatical function of Object. The other nouns occur as part of Object between a quarter and well over a third of the time (27 to 38 per cent in Table 3.7). Consequence on the other hand occurs within Object in less than 1 in 20 cases.
      2. To compensate, there is a positive colligation between consequence and the Complement function. Only one of the other nouns – question – comes close to the frequency found for consequence. The others occur within Complement four times less often than consequence.
      3. There is also a positive colligation between consequence and the function of Adjunct, consequence occurring here nearly half the time. The other nouns in our sample occur between around a quarter and a third of the time.
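
The three observations above amount to asking, for each clause function, whether the share for consequence falls outside the range spanned by the comparison nouns. A minimal Python sketch of that comparison follows. It works from the raw counts in Table 3.7; the function names and the 'outside the range' criterion are assumptions made for illustration, and the criterion is a crude heuristic, not a significance test.

```python
FUNCTIONS = ("Subject", "Object", "Complement", "Adjunct", "Other")

# Raw counts from Table 3.7, in the order given by FUNCTIONS.
TABLE_3_7 = {
    "consequence": (383, 62, 395, 701, 74),
    "question":    (79, 82, 60, 66, 13),
    "preference":  (63, 113, 21, 90, 13),
    "aversion":    (47, 77, 16, 45, 17),
    "use":         (67, 103, 17, 107, 6),
}


def shares(counts):
    """Percentage share of each function; the total is the sum of the counts."""
    total = sum(counts)
    return [100 * n / total for n in counts]


def compare(target, table):
    """Label each function by where the target noun falls relative to the others."""
    target_shares = shares(table[target])
    others = [shares(c) for noun, c in table.items() if noun != target]
    labels = {}
    for i, function in enumerate(FUNCTIONS):
        low = min(o[i] for o in others)
        high = max(o[i] for o in others)
        if target_shares[i] > high:
            labels[function] = "positive"      # more frequent than any comparison noun
        elif target_shares[i] < low:
            labels[function] = "negative"      # less frequent than any comparison noun
        else:
            labels[function] = "within range"
    return labels


if __name__ == "__main__":
    for function, label in compare("consequence", TABLE_3_7).items():
        print(f"{function}: {label}")
```

With only four comparison nouns the range is a fragile baseline, so the labels are best read as prompts for closer inspection of the data.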

Whether or not these colligational findings were in line with your expectations, they are amenable in part to explanation in terms of communicative need. We are inclined as speakers and writers to characterise states of affairs as having been caused by something else. It is obvious, then, that we would have regular need of using consequence in the Complement, since that is one of the normal ways available to us for expressing a characterisation. The colligational preference for the Adjunct function is explained by the prevalence of the two phrases as a consequence and in consequence. Nevertheless, these are explanations after the event; after all, for example, one might have imagined that there would have been sufficient instances of have a/the consequence to undermine the negative colligation of consequence as regards Object.

There are complex issues here concerning the status of the grammatical categories I am using here and elsewhere. I am of course claiming that they do not exist independently of the primings that give rise to them. And yet any colligational statement of the kind I have been making both depends upon and appears to affirm the validity of pre-existing grammatical categories. In fact, though, the functions of Object and Complement are dependent on the verb choices that are made. Each verb is separately primed to be followed by certain, often fairly specific, nominal groups. We saw in Chapter 2 how language users might generalise out of the specifics of the primed collocates to a more general and in some respects more abstract category, which in turn would permit them to make creative choices that were still compatible with the general priming. In the same way, the language user may generalise out of the primed words and word combinations to create a ‘grammatical’ category that will permit them to make unexpected choices while conforming to the generalised priming. The grouping of primings leads to a degree of patterning and to linguistic creativity.

Crucially, though, the extent to which this happens and the ways in which it happens may vary from language user to language user. There is not, I claim, a single grammar to the language (indeed there is not a single language), but a multiplicity of overlapping grammars that are the product of the attempt to generalise out of primed collocations. However, those primed collocations are the result of others’ utterances, and the source utterances will have been tempered by the grammars that the speakers or writers had created for themselves; furthermore, few priming utterances will be entirely unaffected by the harmonising effects of education, social pressure and the media, and many will be hugely affected. So the degree of overlap between users’ grammars is normally substantial. The categories used here and throughout this book are assumed to have some priming reality for most (but not necessarily all) users of the language, though relatively few of them would recognise the terminology used (often unsatisfactorily and incompletely) to label their primed categories. We will return to this issue later in the chapter.

Characteristic colligational primings of consequence within the nominal group

We have seen that consequence is colligationally quite distinct from (at least some) other nouns, both in terms of its grammatical preferences and its grammatical aversions at the level of the clause. How about at the rank of the group or phrase? There are in principle three grammatical possibilities here: consequence could occur as head of the nominal group in which it appears, as premodifier or as part of the postmodification, for example:
    5. He says consequence of the fires is that pressure throughout the field will fall [consequence as head].
    6. … and consequence modelling and risk estimates and risk contours can be produced [consequence as premodifier].
    7. If the talk of bad blood as a consequence of the B&H quarter final washout fiasco is true [consequence as part of the postmodification] . . .

Again, you are invited to speculate whether consequence occurs with higher or lower frequency than normal in any of these three positions. To answer this question, I undertook an analysis of all the nominal groups within which the word appeared in the 1,615 citations out of my corpus. As before, its syntactic behaviour was compared with that of the four nouns question, preference, aversion and use (300 of each, except for aversion, of which, as already noted, there were only 203). It would have been interesting to see whether the patterns that emerged were likely to be affected by the grammatical function of the noun within the clause, but the samples of the words used for comparison would have become unrepresentative at the level of the individual functions.

Table 3.8 shows that, as before, consequence is clearly different in its distribution from the other nouns in our sample, though just as question was the only noun to come close to consequence in its colligational preference for the Complement function, so also here question differs less from consequence than do the other nouns. In the first place, it will be noted that while all the nouns occur most frequently as heads of their own nominal groups, in the case of consequence the tendency is so overwhelming as to effectively rule out any other grammatical position in the group. Even question occurs proportionally four times as often as consequence in postmodification and the other nouns all occur in postmodification much more frequently – between an eighth and a quarter of the time. What this tells us is that consequences are never used to narrow down other noun-heads – they are always the centre of attention. Secondly, all the nouns except question show a small tendency to occur as premodification. Consequence shows no such tendency at all. (My intuition is that your intuitions were more reliable on this matter than on clause functions, but then, perhaps, my intuition is untrustworthy.) Since all the nouns occur more often in head function than as (part of the) pre- or postmodification, it is probably better not to formulate the colligational association for consequence in positive terms but in negative terms. Thus the posited typical primings are as follows:

Table 3.8 A comparison of the grammatical distribution of consequence in the nominal group with that of four other nouns

| | Head of nominal group | Part of the postmodification of the nominal group | Premodifier of nominal group |
|---|---|---|---|
| consequence | 98% (1,588) | 2% (26) | 0.06% (1) |
| question | 92% (275) | 8% (25) | – |
| preference | 84% (253) | 13% (39) | 3% (8) |
| aversion | 82% (167) | 12% (24) | 6% (12) |
| use | 75% (226) | 24% (72) | 1% (2) |

      1. consequence is typically primed to colligate negatively with premodification;
      2. consequence is typically primed to colligate negatively with postmodification.

The characteristic primings of consequence with respect to Theme

Examination of our 1,615 examples of consequence reveals that 43 per cent of these (698, to be exact) were found to be part of the Theme of the clauses in which they appeared and 518 of them were sentence-initial as well. Theme is defined by Halliday (1994: 37) as ‘the element which serves as the point of departure of the message; it is that with which the clause is concerned’ and here it is operationalised as any textual material in a clause up to and including the Subject, where the Subject precedes the main verb of the clause. In those cases where the Subject follows the main verb, Theme is taken to be any textual material preceding the main verb. In this I broadly follow Davies (1988) and Berry (1989).

Characteristic examples of consequence being used as part of Theme are the following:
    8. In consequence, the draw proved all important.
    9. One consequence of this will be less flexibility in the choice of text.
    10. You showed utter irresponsibility and as a consequence a human life was lost.
    11. If nations disintegrate, the consequence will not be a further shift towards integration.
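
The operational definition of Theme used in this section can be expressed as a short procedure. The Python sketch below assumes that each clause has already been analysed by hand into an ordered list of (text, function) constituents containing exactly one constituent labelled "Verb"; that representation, the label names and the function name are all assumptions made for the illustration.

```python
def theme(clause):
    """Return the constituents that make up the Theme of one analysed clause.

    If the Subject precedes the main verb, Theme is everything up to and
    including the Subject; otherwise it is everything preceding the main verb.
    """
    functions = [function for _, function in clause]
    verb_at = functions.index("Verb")          # assumes one main verb per clause
    if "Subject" in functions[:verb_at]:
        end = functions.index("Subject") + 1   # include the Subject itself
    else:
        end = verb_at                          # Subject follows the verb
    return [text for text, _ in clause[:end]]


# Examples 8 and 9 above, segmented by hand for this illustration.
EXAMPLE_8 = [("In consequence", "Adjunct"), ("the draw", "Subject"),
             ("proved", "Verb"), ("all important", "Complement")]
EXAMPLE_9 = [("One consequence of this", "Subject"), ("will be", "Verb"),
             ("less flexibility in the choice of text", "Complement")]
# An invented clause (not from the corpus) in which the Subject follows the verb.
INVENTED = [("Then", "Adjunct"), ("came", "Verb"), ("the consequence", "Subject")]

if __name__ == "__main__":
    for clause in (EXAMPLE_8, EXAMPLE_9, INVENTED):
        print(theme(clause))
```

On this definition a thematised Adjunct such as In consequence is always part of a Theme that also contains the Subject, which is why Adjunct and Subject Themes are counted by the function of the constituent containing consequence.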

The Subject Themes are not of special interest. Table 3.7 shows that there is nothing unusual about the frequency of occurrence of consequence as Subject. However, the high proportion of thematised Adjuncts containing consequence is perhaps more surprising. Almost half the thematised instances of consequence are Adjuncts: 45 per cent of the sentence-initial Themes are Adjuncts, and 48 per cent of non-sentence-initial Themes. (The remainder are all Subjects.)

Just because it seems surprising does not mean that it is surprising. In order to get a sense of how unusual this proportion of thematised instances of consequence in Adjuncts might be, I analysed the Themes of three Guardian features from my corpus, one dealing with insurance matters, one drawn from the regular series ‘Face to Faith’ and one a book review. Because the phrases as a consequence or in consequence might be considered nested combinations in which the noun-ness of consequence has been to some extent suppressed (like order in the word sequence in order to in the previous sentence), I categorised the initial Theme of each clause in these three features grammatically and calculated the proportions of Adjuncts to Subjects for both clause-initial and sentence-initial position. In this calculation, fronted prepositional phrases and conjuncts like then, thus and however were all treated as thematic Adjuncts. The conjunctions (and, or and but) were not included. The results are given in Table 3.9. These figures suggest that in a normal distribution one in four clauses characteristically begins with an Adjunct or conjunct, and that supports the view that there is a higher preponderance of consequence in thematised Adjuncts than would occur on the basis of normal distribution. However, we are looking at a noun and there is quite a lot of evidence, despite my remarks above, that consequence can be treated as retaining its noun-ness in the phrases as a consequence and in consequence. Consider the following examples from my corpus, for instance:
    12. But as a consequence of past neglect, this ‘recovery’ is different.
    13. As a direct consequence of the Nazi-Soviet pact, the leaders of the PCF refrained from organising armed resistance.
    14. . . . treatable conditions which sufferers were otherwise likely to put up with as simply the inevitable consequence of old age.
    15. In consequence of that article Marconi and GEC-Marconi brought proceedings for libel against the Guardian and the author of the article.
    16. She’s the nearest thing to Garland that’s still going and in grateful consequence most of the good seats at her concerts will be filled by impeccably turned out male couples.
    17. In a perverse consequence of industrial decline, more than 1,000 jobs will be generated in the next century in order to decommission the plant.

Table 3.9 The proportions of initial Themes in sentence-initial and non-sentence-initial clauses

| | Subject | Adjunct | Other clausal functions or no clausal function |
|---|---|---|---|
| Sentence-initial clauses (240) | 60% | 35% | 5% |
| Non-sentence-initial clauses (151) | 88% | 9% | 3% |
| All clauses (391) | 71% | 25% | 4% |

It will be seen that both putative idioms show some degree of freedom; we are able to confirm on the basis of the examples listed above that consequence still retains the properties of a noun. As a consequence can have consequence postmodified (always by a prepositional phrase beginning with of in my data) (e.g. 12, 13, 14). It can also be premodified with an adjective, always in conjunction with the postmodification just mentioned (13, 14). In one instance the indefinite article is even replaced by the definite article (14). A similar freedom is available to in consequence. We find postmodification (17) and adjectival premodification (16, 17). We even have one example of the introduction of the indefinite article, along with both the previous features (17). None of these features denies either phrase idiomatic status; what they do, however, is show that even the strongest primings can be overridden. Intuition might lead one to expect that the collocations of consequence with as a and in would be so powerful that here at least there would be no freedom to do differently. But as must be abundantly apparent by now, all primings, whether grammatical or idiom-creating, are matters of probability not requirement. (Sinclair 1991 shows clearly enough the ways in which idioms may be varied.)

What these examples also show is that consequence retains the qualities of a noun within those idioms. It is even possible to break away from the idioms altogether and still use consequence in an adjunct, as is shown by the following unusual example from my corpus:
    18. But even in that show, which I saw at Norwich, and by consequence in this, there was much achievement, albeit traditionally anecdotal.

It is partly because the nominal primings for consequence interfere with those for as a consequence and in consequence that the variations demonstrated in examples 12–18 are possible.

The persistent noun-ness of consequence means that the figures in Table 3.9 are only partially relevant. Accordingly I also considered the first noun of each sentence within the Theme and categorised it according to whether it was found within Subject or Adjunct (or elsewhere). I did not look at non-sentence-initial clauses. Pronouns and existential there were not counted, which explains the high proportion of sentences with no noun in Theme. The results are presented in Table 3.10 in two forms, firstly with the figures included for other clausal functions (and for no grammatical function, in instances where one does not have a complete clause), and then without. It will be seen that the results are comparable to those in Table 3.9.

Table 3.10 Grammatical distribution of first noun in the sentences of three Guardian features

| | Subject | Adjunct | Other clausal functions | No clausal function or no noun in Theme |
|---|---|---|---|---|
| 1st noun in Theme of sentence (240) | 50% (121) | 19% (45) | 3% (7) | 28% (67) |
| 1st noun in Theme of sentence excluding 3rd and 4th categories | 72% | 28% | – | – |

By either of the measures used, on a normal distribution we could have expected Adjuncts to have comprised between a quarter and a third of instances of consequence in Theme. So the near 50:50 division of instances of thematised consequence into Subjects and Adjuncts suggests that consequence is typically primed to occur as part of a thematised Adjunct. As we shall see in Chapter 7, the colligation of consequence with Theme is the tip of an iceberg of textual positioning that a word may be primed for. In anticipation of the discussion in Chapter 7, I will label the thematic colligation of consequence textual colligation.

I have laboured the analytical processes in this section because I want to make it clear that the kind of corpus investigation necessary to establish plausible primings needs to be cautious and thorough. Elsewhere in this book, because of the desire to produce an accessible text and because of the exigencies of space, I have not always shown the background analyses in such detail; their absence from the text should not be taken as evidence of their non-existence.

In the next two sections we will look more closely at the two dominant Adjunct forms involved in the colligation we have just established.
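
The baseline comparison made in this section can be rechecked with a little arithmetic. The Python sketch below pools the two rows of Table 3.9, weighting each by its number of clauses, to recover the 'All clauses' row, and then sets the observed Adjunct shares for thematised consequence (45 and 48 per cent) against the corresponding baselines. The names are invented for the example, and because the published percentages are already rounded the pooled values are only approximate.

```python
# Percentages and clause counts from Table 3.9.
TABLE_3_9 = {
    "sentence-initial":     {"clauses": 240, "Subject": 60, "Adjunct": 35},
    "non-sentence-initial": {"clauses": 151, "Subject": 88, "Adjunct": 9},
}

# Share of thematised instances of consequence that are Adjuncts, as reported in the text.
OBSERVED_ADJUNCT = {"sentence-initial": 45, "non-sentence-initial": 48}


def pooled(table, function):
    """Clause-weighted average percentage for one function across both rows."""
    total = sum(row["clauses"] for row in table.values())
    weighted = sum(row["clauses"] * row[function] for row in table.values())
    return weighted / total


def ratio_to_baseline(table, observed):
    """Observed Adjunct share divided by the baseline share, per clause position."""
    return {position: observed[position] / table[position]["Adjunct"]
            for position in observed}


if __name__ == "__main__":
    print(round(pooled(TABLE_3_9, "Subject")), round(pooled(TABLE_3_9, "Adjunct")))
    for position, ratio in ratio_to_baseline(TABLE_3_9, OBSERVED_ADJUNCT).items():
        print(f"{position}: {ratio:.1f} times the baseline")
```

The ratios make visible that the departure from the baseline is far larger in non-sentence-initial clauses than in sentence-initial ones, though the caveat about the noun-ness of consequence applies to both.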

Typical colligational primings of in consequence

The first thing to note about in consequence is that it has positive priming for use in Theme rather than Rheme (‘Rheme’ being here defined as anything occurring after the Subject in the sentence). Out of 216 examples found in my data, 58 per cent are unambiguously in Theme (as in examples 8, 15, 16 and 17, though of these only example 8 of course is entirely characteristic). A further 14 cases occur post-Subject but pre-verb and would be regarded as Theme by Berry (1989) and some other linguists, as in 19 below; some of these occur in first position in clauses with ellipsis of the Subject, as in 20:
    19. Everyone, in consequence, was on their very best behaviour.
    20. We were late getting the flock sheared this year and, in consequence, paid the price for our inevitable delay . . .
      With these 14 cases included, the proportion of instances of in consequence in Theme rises to 65 per cent.

      Connected with this textual colligational priming is a simple positional one. In consequence is apparently primed to take first position within Theme and consequently first position in clause or sentence. So 41 per cent of all instances of in consequence occur in the very first position in the sentence and a further 17 per cent take first position in a non-initial clause.

      The next colligational point to make is that in consequence is rarely used with postmodification in thematic position. Of the 126 uncontroversial instances, only 9 are postmodified (as in 15 and 17). Even if the definition of Theme is extended to include the extra 14 cases, there are still only 10 postmodified cases, a mere 8 per cent. This, then, is another case of negative colligational priming. In this particular structure and in this particular position the noun appears to have an aversion to being postmodified. Postmodification is more common when in consequence is used in the Rheme (there are 22 cases) but even under such circumstances, postmodified cases account for only 1 in 4. As we shall see when we look at as a consequence in its clause final position, this is not an aversion that consequence forms in all situations.

      The typical colligational primings of in consequence are then:
      1. it has a strong textual colligation with Theme;
      2. it has a strong textual colligation with first position in the sentence (and clause);
      3. it has an aversion to being postmodified in any position, but especially in initial position.
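
      The Theme percentages quoted above follow directly from the raw counts given in the text (216 instances in all, 126 unambiguously in Theme, and a further 14 that count as Theme on the broader definition). The following is only a minimal sketch of that arithmetic in Python; the function name share and the constant names are illustrative assumptions, not part of the original analysis.

```python
def share(count: int, total: int) -> int:
    """Return count as a whole-number percentage of total."""
    return round(100 * count / total)


# Counts taken from the text; the names are hypothetical labels.
TOTAL_INSTANCES = 216   # all instances of "in consequence" in the data
CLEAR_THEME = 126       # unambiguously in Theme
EXTENDED_THEME = 14     # post-Subject, pre-verb cases (Theme on the broader definition)

narrow = share(CLEAR_THEME, TOTAL_INSTANCES)
broad = share(CLEAR_THEME + EXTENDED_THEME, TOTAL_INSTANCES)

print(f"Theme, narrow definition: {narrow} per cent")
print(f"Theme, broad definition:  {broad} per cent")
```

      Keeping the counts rather than only the percentages makes it easy to see how much the result depends on which definition of Theme is adopted.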

      Typical colligations of as a consequence

      As far as thematisation and sentence position are concerned, the colligational picture for as a consequence is very similar to that for in consequence. It is used in thematic position almost 50 per cent of the time (49 per cent to be exact), and occurs in first position in the sentence a third of the time (34 per cent). Again, like in consequence, in initial position as a consequence tends to occur without any form of (pre- or post-) modification, though the avoidance of postmodification is less marked than for in consequence. In 194 instances of as a consequence as thematised Adjunct, 77 per cent (149) of the instances of consequence are not postmodified. Characteristic examples are:
    21. As a consequence the future of current affairs in prime time is in jeopardy.
    22. As a consequence, it created fear and suspicion amongst a people to whom such feelings hardly seemed to come naturally.
    23. As a consequence, debt is common.
      Example 10 (see p. 50 and below) illustrates the structure in a non-initial clause.

      The situation regarding postmodification is however strikingly reversed when as a consequence appears at the end of the clause. There are 188 instances of this happening in the data, and of these over two thirds (69 per cent) are postmodified. Typical examples are:
    24. . . . there is no doubt that there are large areas of the countryside which have been preserved as a consequence of hunting.
    25. Another club veteran lamented the decline in grassland flowers as a consequence of the fertilisers, the drainage and all the other techniques that go with ‘clean’ farming.
      The explanation for this sharp difference in the characteristic priming of as a consequence depending on whether it occurs at the beginning or end of the clause appears to be that writers have a choice. They can either put the cause first and the consequence second, as in example 10 (repeated here):
      10 You showed utter irresponsibility [CAUSE] and as a consequence a human life was lost [CONSEQUENCE].
      Or, they can put the consequence first (in the preceding clause) and the cause in the postmodification of consequence, as in 26:
    26. . . . there is no doubt that there are large areas of the countryside which have been preserved [CONSEQUENCE] as a consequence of hunting [CAUSE].
      Needless to say, this is not a free choice. The structure used is partly governed by whatever has been said in the previous sentence. The fact that there is a choice, however, provides evidence for questioning the assignment of as a consequence to the class of conjuncts. (A similar argument applies for in consequence.) It also means that the option that a user has of thematising the cause is not usually taken up when this combination is used and the option of giving end-weight to the fact that one’s proposition is a consequence is likewise avoided. The phrase as a consequence is, in short, textually primed; textual priming will be returned to in Chapters 6 and 7.

      The typical colligational primings of as a consequence are therefore:
      1. the phrase has a strong association with Theme;
      2. it has a strong association with first position in the sentence (and clause);
      3. it has an aversion to being postmodified in initial position;
      4. it has, on the other hand, a strong tendency to be postmodified in final position.

      Colligational primings of consequence when Subject

      Although we have seen that the use of consequence as part of the Subject is not colligationally interesting, there are still a good number of such instances. The word is primed neither to avoid nor favour the Subject compared with other abstract nouns, but this is itself a priming, and since there are as many instances of consequence occurring as part of the Subject of a clause in my data as there are of its occurring as part of a thematised Adjunct, they deserve closer attention in case there are other, more distinctive, primings associated with them. There is then a basic choice available to the speaker or writer between thematising consequence in a prepositional phrase and thematising it as part of the Subject. We look now at the latter option with a view to seeing whether there are any colligational primings associated with consequence in this position.

      One of the basic choices we make whenever we use a nominal group is between definiteness and indefiniteness. A natural first step in looking at consequence as Subject is therefore to consider the patterns of definiteness associated with it. As before, it is important to consider the behaviour of consequence in comparison with other abstract nouns. Failure to do so may lead to wrong conclusions being drawn. Indeed in my earlier paper on the colligations of consequence (Hoey 1996) I drew incorrect conclusions about its colligations with respect to definiteness, simply because I neglected to check whether the associations I was finding existed for other abstract nouns. (Forgive me, for I have sinned.)

      For the purposes of finding whether it is likely that writers and readers of the Guardian will form colligational primings for consequence associated with (in)definiteness, I have used the same set of abstract nouns as before, i.e. question, preference, aversion and use.
      To make the comparison exact, I have eliminated instances where the noun is not head of the nominal group functioning as Subject; this explains the discrepancy between the totals here and in previous tables.

      Table 3.11 The distribution of markers of definiteness and indefiniteness for consequence and four other abstract nouns

                       Definite      Indefinite
      consequence      249 (67%)     125 (33%)
      question         72 (92%)      6 (8%)
      preference       40 (78%)      11 (22%)
      aversion         26 (76%)      8 (24%)
      use              33 (67%)      16 (33%)

      Table 3.12 A comparison of consequence and use in respect of indefiniteness markers

                       a(n)          another       one           every
      consequence      28 (22%)      20 (16%)      76 (61%)      1
      use              33 (67%)      16 (33%)      –             –

      When consequence is head of a nominal group serving as Subject it appears to colligate with non-specific deictics more often than do all the other abstract nouns apart from use, as can be seen in Table 3.11. The ratio between definite and indefinite subjects with consequence is 2:1, as opposed to 3:1 for preference and aversion and 12:1 for question. (There is no difference, though, between consequence and use in this respect.) This affinity for non-specific deictics is of course all of a piece with the indefiniteness of as a consequence. It implies the frequent importance of thematising the unexpectedness of the consequence to be described in the Rheme. Although Table 3.11 suggests that consequence and use are likely to be primed in identical ways as regards indefiniteness, closer attention to the data suggests significant likely differences in the priming. Looking more closely at the way the indefiniteness of consequence is realised, we find that there are further differences between consequence and use. Table 3.12 shows that the most common marker of indefiniteness with use is the indefinite article a, with another as second choice. Strikingly, for consequence the most common marker of indefiniteness is one with the indefinite article trailing some way behind.

      Another colligation can be detected if we look more closely at how definiteness is realised in the Subject nominal groups. There are three main ways in which a nominal group may be made definite: with the definite article, with a possessive expression and with a determiner. The nouns divide into two quite distinct groups in this respect. Consequence and question overwhelmingly favour the as a marker of definiteness, while preference and aversion favour a possessive construction; use falls somewhere between the two.
      (Because of low frequency, the figures are of course untrustworthy as more than rough guides to the colligational preferences of the comparator nouns.) Consequence occurs with a possessive construction less than any of the other nouns. Put more precisely, consequence is likely to be primed negatively for occurrence with possessive constructions (see Table 3.13).

      Table 3.13 A comparison of consequence and four other nouns in respect of definiteness markers

                       the           Possessive    this, that
      consequence      247 (99%)     2 (1%)        –
      question         67 (96%)      3 (4%)        –
      preference       10 (25%)      28 (70%)      2 (5%)
      aversion         7 (27%)       19 (73%)      –
      use              21 (64%)      12 (36%)      –

      When consequence appears as head of the nominal group functioning as Subject, the clause it is part of follows quite predictable lines. This can perhaps be best indicated in a series of diagrams. Figure 3.1 shows the verb choices associated with consequence as Subject. In the calculations, only lexical BE has been counted, not auxiliary BE in progressive and perfect constructions. As can be seen, consequence is strongly primed for collocation with BE.

      Consequence as head of nominal group functioning as Subject: 368
        BE: 347 (94%)
        Other verb: 21 (6%)
      Figure 3.1 A map of the priming of verb choice of consequence as head of a nominal group functioning as Subject

      The resulting structures are quite predictable as Figure 3.2 shows: 79 per cent of all instances of consequence as Subject conform to one of two structures: consequence + BE + that clause and consequence + BE + nominal group, with the latter frequently being a nominalisation of a clause. The figures for nominalisations here are very much on the cautious side, with only clear-cut cases being included, usually with residual clausal elements attached. Thus 27 was counted but 28 was not:
    27. But the consequence could be the retention of large numbers of alter- native syllabuses in the subject.
    28. The consequence has been an austerity drive.
      Even where the priming is apparently weaker, with the consequence + BE + to clause, there is further priming to be identified in that this structure favours postmodification of consequence (29 out of 36 instances, i.e. 81 per cent). If the more common consequence + BE + that clause is chosen, the proportion of instances of postmodified consequence drops to 77 out of 191 (i.e. 40 per cent).

      Consequence as head of nominal group functioning as Subject: 368
        BE: 347 (94%)
          + BE + clause: 192 (55% of BE)
            + that clause: 152 (80% of BE + clause; 41% of all instances)
            + to clause: 36
          + BE + other: 155 (45% of BE)
            + nominal group: 141 (91% of BE + other; 38% of whole set)
              + clause nominalisation: 61
              + other NG: 79
            + adjectival group: 14
      Figure 3.2 A map of the key colligational primings of consequence as head of a nominal group functioning as Subject
      Colligational nesting

      I have commented in a number of places on the phenomenon of nesting whereby a combination of words will have priming separate from (though built up from) the primings of the individual words. It will be apparent from our consideration of consequence that nesting does not only take the form of the building of word sequences and lexical items. It can also be the case that when a word or word sequence combines with a particular colligational priming (positive or negative), this nesting in turn has further primings, which may be of any kind – collocational, semantic associational or colligational. So Figure 3.2 shows that the nesting of consequence with Subject is primed to collocate with the lemma BE and then this nesting is further primed to colligate with that clauses. (The objection that consequence was not found to be primed to occur as Subject either positively or negatively will be addressed in Chapter 8.)

      The property of the nesting of primings is an important one in that it allows us to go some way beyond certain kinds of grammatical description. In particular it helps us to explain the existence of grammatical structures in apparent free variation. To illustrate this, let us return to the word reason, which was discussed in Chapter 2 in connection with pragmatic association. In my corpus and for most speakers reason is positively primed to colligate with postmodification, and this colligational priming takes five unequally favoured forms. These are:
      reason + clause (without connector) (1,006 instances in my data)
    29. One reason events have moved so fast is that there is a powerful new player on the international scene.
      reason + that clause (129 instances in my data)
    30. The only reason that there has not been a serious accident is the provision of hustle alarms on the trains . . .
      reason + why clause (1,531 instances in my data)
    31. But the reason why they are limiting the number of children remains a matter of dispute.
      reason + for + non-finite clause or nominal group (often a nominalisation) (2,608 instances in my data)
    32. Her reason for opposing it relies on the fact that women of every ethnic group are mainly at risk from men of their own ethnic group.
      reason + to + non-finite clause (2,005 instances in my data)
    33. But the main reason to doubt Mr Yeltsin’s summit strategy is domestic, rather than foreign.
      On the face of it, there would appear to be considerable freedom of choice among these structures. Sentence 33 is capable of being rewritten as sentences 34, 35, 36 and 37 without obvious violation of naturalness:
    34. But the main reason one might doubt Mr Yeltsin’s summit strategy is domestic, rather than foreign.
    35. But the main reason that one might doubt Mr Yeltsin’s summit strategy is domestic, rather than foreign.
    36. But the main reason why one might doubt Mr Yeltsin’s summit strategy is domestic, rather than foreign.
    37. But the main reason for doubting Mr Yeltsin’s summit strategy is domestic, rather than foreign.
      The question then is: is each of the five nested primings illustrated above itself primed for a different textual purpose? Note the specificity of the question. I am not exploring whether different kinds of postmodifying clause have different functions in general; I am investigating whether different kinds of postmodifying clause (or nominal group) are primed for different purposes in the specific circumstance of their occurring with reason.

      Given that we have seen that consequence has variable distribution across the functions of Subject, Object and Complement and that there are colligational choices that are dependent on which function is chosen, it makes sense to investigate whether the different nestings behave differently as regards clause function. We also noted in Chapter 2 that reason has a pragmatic association with DENIAL and I referred there to the 2:1 ratio between affirmations and denials of reason, in contrast with the general ratio of 9:1 of positive and negative sentences, which we would expect to map closely onto affirmations and denials (though the map is not exact). It would seem worthwhile to check whether the nestings connect with this priming. With this in mind all instances of reason + postmodification were examined for clausal function (a small number that occurred in non-clausal contexts were excluded both from the count of instances and from the analysis). The clauses in which they appeared were then examined to see whether they were affirming a reason or denying (or denying knowledge of) a reason. Examples 29, 30, 32 and 33 above are all affirming; 31 on the other hand denies knowledge of a reason.

      Other instances of denial (in clauses that respectively use reason for + nominal group, reason to + non-finite clause and reason why + clause in Complement function) are:
    38. . . . there’s no reason for the anxiety.
    39. There was no reason not to inform me beforehand.
    40. There must have been a good reason, somewhere at the screenplay level, why people like Jeff Bridges, Tommy Lee Jones, Suzy Amis (so memorable in The Ballad of Little Jo) and Forest Whitaker decided to appear in Blown Away. But . . .

It will be noticed from the last that denial of (knowledge of) reason need not involve the use of the recognised negative markers.

The results of the analysis are shown in Table 3.14. Where the ratio of affirmation/denial is markedly skewed from 9:1, we can assume that we are looking at evidence of priming of the nested combination; such cases have been highlighted. In some cases it was not clear whether what was being denied was the reason or something else in the clauses. The figure in brackets represents the total of instances that would result from including such doubtful cases; a few instances of other problems of allocation are included here as well. It is perhaps of interest that all doubtful cases occur when reason + postmodification occurs as Subject. These figures are not included in subsequent calculations.

Table 3.14 The distribution across Subject, Object and Complement of reason + postmodification in clauses that affirm or deny (knowledge of) the reason

                        Subject reason        Complement reason     Object reason
                        affirmed   denied     affirmed   denied     affirmed   denied
reason + clause         698        17 (38)    210        42         14         4
reason + that clause    77         –          40         9          –          3
reason + for x          1,091      36 (49)    610        392        305        161
reason + why clause     7          10 (17)    594        629        61         223
reason + to V           22         3          286        536        732        426

Inspected closely, the table shows that when reason is used as (part of) Subject, it is primed for affirmation (1,895:66, an affirmation-denial ratio of 29:1), with affirmation over three times as common as might be predicted on the basis of the positive-negative clause ratio. When reason is used as (part of) Complement on the other hand, it is primed for denial (1,740:1,608, not far off a 50:50 ratio). (With Object, reason is more weakly primed for denial.) From the pragmatic perspective, the choice of affirming a reason would seem to invite the simultaneous choice of Subject. The choice of rejecting a reason (or saying that it is unknown or is unimportant) invites use of the Complement (or Object).
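
The comparison with the 9:1 baseline can be made explicit with a little arithmetic. What follows is only a sketch in Python, assuming the totals quoted in the text for Subject (1,895:66) and Complement (1,740:1,608); the function and variable names are illustrative and do not come from the original analysis.

```python
BASELINE = 9.0  # general ratio of positive to negative sentences (9:1)


def affirmation_denial_ratio(affirmed: int, denied: int) -> float:
    """Return the number of affirmations per denial."""
    return affirmed / denied


# Totals as quoted in the text for reason + postmodification.
subject = affirmation_denial_ratio(1895, 66)
complement = affirmation_denial_ratio(1740, 1608)

# How far each function departs from the 9:1 baseline.
print(f"Subject:    {subject:.1f}:1 ({subject / BASELINE:.1f} times the baseline)")
print(f"Complement: {complement:.2f}:1 ({complement / BASELINE:.2f} times the baseline)")
```

On these figures the Subject ratio is a little over three times the baseline, while the Complement ratio falls to roughly one affirmation per denial.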

The nesting of reason and Subject function and of reason and Complement (or Object) function may permit the priming just described, but it offers us no clues as to why particular kinds of postmodification might have been chosen. Indeed I have no evidence that the nesting of reason and Subject or Complement would not operate equally well without the presence of postmodification. However, if we look at the rows of Table 3.14, rather than at the columns as above, we find that the different postmodifying structures with which reason appears also distribute themselves differently between affirmation and denial. The nesting of reason + clause without connector is apparently primed for affirmation (in an affirmation-denial ratio of 15:1). No other nesting approaches this ratio; the relatively infrequent reason + that clause is the closest, with an affirmation-denial ratio of 10:1, which is too close to the norm of 9:1 to be of any interest. On the other hand, the nesting of the reason + why clause seems to be primed for denial, irrespective of the grammatical function to which it is being put. Indeed, the priming of the reason + why clause for denial appears to override the priming of reason + Subject. It will be noted, though, that the conflict is resolved by simple avoidance of Subject function when a reason + why clause is being used.


In this chapter we have witnessed (in what has probably seemed exhausting detail) the way a word’s patterns of use are characteristically controlled by its colligations and the way these patterns of use, through nesting, are in turn primed for particular purposes. Not every one of these colligations will occur in every domain and genre and not every speaker/writer will be primed for these colligations in newspaper text, but every domain and genre will have its own characteristic colligations (which may well overlap with the ones found for newspaper text) and every speaker/writer will be primed in some way for the domains and genres with which they are familiar. In Chapter 8 I shall argue that colligation, with appropriate modification to take account of morphology and phonology, can be used to construct grammars rather different from those we are accustomed to considering. The fundamental claim made in the first three chapters of this book has been that the semantic and grammatical relationships a word or word sequence participates in are particular to that word or word sequence and do not derive from prior self-standing semantic and grammatical systems, though they do contribute to the posterior creation of those systems.


4 Lexical priming and lexical relations


I have been arguing that words can be primed for collocation, semantic association and colligation and that the notions of priming and nesting permit, in principle, the formulation of quite complex representations of naturalness without jeopardising our ability to account for creativity in language. We have seen, however, that it may sometimes not be the word or word sequence that is being primed but the semantic set, created by the operation of abstraction from a variety of individual primings. For example, in Chapter 2, I implicitly assumed that semantic sets might themselves participate in lexical primings, when we considered the combination SMALL PLACE is a NUMBER-TIME-JOURNEY-(by VEHICLE)-from LARGER PLACE. This gives rise to the possibility that our focus on the way that words or word sequences form semantic associations may be insufficiently generalised and that instead of formulating semantic association as ‘item has a semantic association with semantic set Y, represented by items a, b and c’ we should formulate the association thus: ‘semantic set X, represented by items p, q and r, has a semantic association with semantic set Y, represented by items a, b and c’. Such a reformulation would imply that the SMALL PLACE is a NUMBER-TIME-JOURNEY-(by VEHICLE)-from LARGER PLACE combination was the norm and that therefore the starting point for much priming description should be the semantic set. This would be in line with (though still different from) work congruent with the position proposed in this book, such as Pattern Grammar (Hunston and Francis 2000), the schema-based approach of Michael Barlow (e.g. Barlow 2000) and construction grammar (Goldberg 1995). It would mean that priming as so far presented was only the tip of the iceberg and indeed insufficiently generalised.

If, however, priming description were to centre on the semantic set, it would need to be the case that members of a semantic set should share the great majority of primings. Although some of the instances of semantic association we have considered have involved sets with uncertain memberships (e.g. LOGIC (NECESSITY)), others, like JOURNEY and NAMED PLACE, have drawn their memberships from the hyponyms of a particular superordinate. (The hyponym-superordinate relation is the relationship of instance to general, illustrated in the relationship of spaniel, poodle, Alsatian [co-hyponyms] to dog [superordinate].) Since co-hyponyms are usually readily identifiable, it seems sensible to start by exploring whether they share the great majority of primings. If they do share their primings, then we will need to reformulate our claims about semantic association along the lines suggested above.


A suitable example of semantic association, for our purposes, is that formed by the lemma train, analysed in considerable detail by Campanelli and Channell (1994) (cited by Stubbs 1996). Train is primed to collocate with as a in newspaper data and the nested combination train* as a (where train* stands for train, trains, trained and training) is typically primed to associate with SKILLED ROLE OR OCCUPATION. My corpus has 292 instances of train* as a, and of these 262 are followed by an occupation or related role. Examples from my corpus are:

TRAIN* as a teacher (25)
TRAIN* as a doctor (12)
TRAIN* as a nurse (11)
TRAIN* as a lawyer (11)
TRAIN* as a painter (8)
TRAIN* as a dancer (7)
TRAIN* as a barrister (5)
TRAIN* as a chef (5)

All the above are clear collocates, but my data also include such words or word sequences as cobbler, concentration camp guard and Braille shorthand typist. For most users of the language, Braille shorthand typist will not be primed as a collocation of TRAIN as a (though for someone who is blind or someone who works for the Royal Society of the Blind it may indeed be so primed), but it is likely to be explicable in terms of their priming of TRAIN as a as having a semantic association with SKILLED ROLE OR OCCUPATION. Put the other way round, the semantic set SKILLED ROLE OR OCCUPATION can be said to be primed to have a collocation with the word sequence TRAIN as a. The question then is: do the members of this set share other primings?
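
A count of this kind can be approximated mechanically before the results are classified by hand. The sketch below, in Python, is an illustration only and not the procedure used for the figures above: the function name following_words and the four sample sentences are hypothetical, the pattern also accepts as an (an assumption), and it captures just the single word after the article, so a sequence such as Braille shorthand typist would still need manual inspection.

```python
import re
from collections import Counter

# TRAIN* covers train, trains, trained and training; "an?" also admits "as an".
PATTERN = re.compile(r"\btrain(?:s|ed|ing)?\s+as\s+an?\s+(\w+)", re.IGNORECASE)


def following_words(sentences):
    """Count the word that immediately follows each TRAIN* as a(n)."""
    counts = Counter()
    for sentence in sentences:
        for match in PATTERN.finditer(sentence):
            counts[match.group(1).lower()] += 1
    return counts


# Invented sentences standing in for corpus lines.
sample = [
    "She trained as a teacher before moving abroad.",
    "He is training as a chef.",
    "They train as a team every week.",
    "Her brother trained as a teacher too.",
]

counts = following_words(sample)
print(counts.most_common())
```

The third sample sentence shows why the raw count is only a first step: team follows the pattern but is not an occupation, which is the kind of case a purely mechanical count cannot separate from the rest.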

For the purposes of answering this question, I took, as my sample of hyponyms of SKILLED ROLE OR OCCUPATION, the words accountant, actor, actress, architect and carpenter. The words actor and actress were chosen to see whether hyponyms differing only in terms of gender would differ in any other way. (My data are not new enough to reflect the recent change in the use of actor from male-specific to gender-neutral; this is an interesting case of priming drift, presumably given a conscious push at the beginning.) I looked at 1,045 instances of accountant, 3,194 instances of actor, 1,710 instances of actress, 2,020 instances of architect and 245 instances of carpenter.

One might reasonably have predicted that SKILLED ROLE OR OCCUPATION words like architect and accountant would share many collocates; employ(ed), work(ed) and good seem reasonable candidates, for example. Yet WordSmith’s (Scott 1999) collocation calculation facility throws up very few shared collocates. The words actor and actress both collocate with director, best, film, singer and former (the last a telling reminder of the transitory nature of the acting profession for many). Otherwise there is little that is shared. Architect shares the collocate Sir with actor, perhaps reflecting the relative frequency with which architects and actors are honoured in the UK as opposed to accountants or carpenters (or researchers in English language). The other major lexical collocates of architect are designed, new and chief, none of which it shares with the others in the list. The main collocates of accountant are chartered, year, pounds and said, the last of which it shares with carpenter. It shares with actor and actress the collocate former. The major collocates of carpenter are aged, father and son, which do not relate to the job in the way that those of its fellow hyponyms do.

All of this suggests that the various hyponyms of SKILLED ROLE OR OCCUPATION are typically primed quite differently from each other, at least as far as collocation is concerned. However, we cannot draw any large conclusions from this. After all, walk collocates with minute and minutes, which flight does not, and ride collocates with taxi, which walk does not. Both ride and walk collocate with bus, but with ride, bus usually precedes it or, as in the Bill Bryson example, appears in the word sequence by bus. With walk, on the other hand, bus almost always follows walk and walk is never connected to the word sequence by bus. Yet, as we saw in Chapter 2, all these JOURNEY words are usually primed to participate in the patterns NUMBER-TIME-VEHICLE-JOURNEY and NUMBER-TIME-JOURNEY-by-VEHICLE: the real test of whether co-hyponyms behave the same way will be with respect to primings for colligation and semantic association.

There are a number of basic colligations we would expect our chosen set of co-hyponyms to share. As countable concrete nouns sharing a common superordinate, we might expect them all to take definite and indefinite articles (e.g. the architect, a local architect). We might expect them all to take classifiers (e.g. the ornamentarian architect) and possessives (not just another developer’s architect). We might expect them also to be themselves possessors, either as possessive determiner or as a postmodifying of-phrase (the architect’s brief, the skills of an architect). We might expect them to occur in parentheses (Sir Robert Smirke, the architect of the British Museum) and apposition (the Viennese architect Adolf Loos). Despite the obviousness of these expectations, investigation of the colligations of accountant, actor, actress, architect and carpenter shows that they differ grammatically among themselves. In other words, despite their being co-hyponyms they are each primed in their own way. Table 4.1 picks up each of the constructions just mentioned and shows how distinctively these features are distributed across the five co-hyponyms. Figures in bold suggest a positive priming; underlined figures suggest a negative priming.

Table 4.1 The colligations of the co-hyponyms accountant, actor, actress, architect and carpenter (adapted from Hoey 2000)

                              accountant   actor   actress   architect   carpenter
Indefinite article            26%          22%     18%       16%         42%
‘Possessor’ construction, i.e. ’s or of noun phrase (NP)
‘Possessed’ construction      10%          1%      0%        5%          2%

To begin with, the word with least absolute frequency, carpenter, is apparently quite strongly primed to occur with an indefinite article or in a parenthesis; both constructions are illustrated in this sentence:

  1. Her father, a carpenter, became a permanent invalid when she was three . . .
    Given that parenthesis is a relatively rare construction even in the parenthesis-rich waters of newspaper English, the fact that one in four instances of carpenter participates in such a structure is striking. It also appears to be primed to occur in possessive constructions, for example,
  2. The carpenter’s benches were well-lit by rooflights.
    A possible explanation for this is that carpenters have distinctive tools and equipment that they use in their work, unlike, say, accountants or actors.

    On the other hand, accountant is strongly primed to occur with a classifier and less strongly to occur with a possessive construction. One in four instances of accountant occur with a classifier in my data and one in ten accountants are possessed! Although not in itself a high percentage, 10 per cent is twice as frequent proportionally as the co-hyponym next most likely to occur with a possessive, architect. Both uses are illustrated in 3:
  3. Perhaps the chef had put his back out, or had been called to task by his turf accountant.

For the writers of newspaper text, it would appear that actress is primed to occur in apposition (the Czech actress Anny Ondra). This is one priming that is unlikely to move from receptive to productive for the majority of Guardian readers and is a particularly clear instance of the way primings are constrained by the social/generic context. Like carpenter, actress is primed to occur in possessive constructions about a sixth of the time. However, the table disguises a difference between the two co-hyponyms. Whereas carpenter appears on a roughly 50–50 basis in ’s constructions and of constructions, actress avoids the ’s construction, appearing almost exclusively in the postmodifying of construction. Both the positive primings of actress are illustrated in 4:

  4. Digs were notoriously bad in London and in desperation the mother of the actress Fay Compton founded a hostel called The Theatre Girls’ Home in Greek Street, Soho.

The only hyponym not to be strongly primed to favour or avoid one of the grammatical patterns mentioned is architect though, as I shall argue in Chapter 8, this is not to say that it is not primed at all colligationally. However, the word is distinguished from its fellow co-hyponyms all the same in that, as Table 4.2 shows, it is alone in being frequently used as a metaphor (He was the main architect of the peace plan), with actor the only other word with any record of metaphorical use. We will return to the metaphorical use of architect briefly and in passing in Chapter 8, where the more general issue of creativity is handled.

Cumulatively, the evidence seems to suggest that co-hyponyms do not in fact share a good proportion of their primings, and this in turn suggests that it would be mistaken to expect the majority of semantic association statements to be formulated in terms of one semantic set having a semantic association with another set. The particularity of our account of priming appears to be justified. The collocational and colligational behaviour of the co-hyponyms we considered are too variable in their characteristic priming for them to routinely allow generalisation in terms of the priming of a whole semantic set. This does not, however, mean that co-hyponyms never so group. Apart from the NAMED PLACE BE x hours JOURNEY from BIGGER NAMED PLACE example that triggered this investigation, we have also seen that ‘occupation’ co-hyponyms group for the purposes of collocating with TRAIN as a. Normally, though, the evidence suggests that we should continue to articulate such statements in terms of the more specific nested combination’s priming with the semantic set rather than the other way round.

Table 4.2 The distribution of metaphorical uses across the co-hyponyms accountant, actor, actress, architect and carpenter (adapted from Hoey 2000)

            accountant  actor  actress  architect  carpenter
Metaphor    0%          5%     0%       23%        1%

Synonymy

In any semantic set, there may be members, often co-hyponyms, so close in meaning that we label them ‘synonyms’ (or ‘similonyms’: Bawcom 2003). They seem to have a psychological reality (Cruse 1986) and in continuous text they have long been noted as a cohesive device (Halliday and Hasan 1976; Hoey 1983), sometimes defended or criticised under the label ‘elegant variation’. In particular genres and domains, such as the traditional liturgy of the Anglican Church, synonyms are coupled together in a regular way and therefore of course share the same context, for example, trouble and adversity, the anguish and the grief, dear and precious (all examples taken from The Treasury of Devotion: Carter [1869] 1957). It cannot therefore be assumed that because non-synonymous co-hyponyms do not share all their collocations, colligations and semantic associations, synonyms such as beneath/under, distribute/hand round and consequence/result will also typically not share such primings. Indeed the attractive prospect arises that perhaps the existence of characteristically shared primings will provide the conditions for a trustworthy definition of synonyms.

The question then is: do synonyms share all their primings for most users? If they do, we would need to argue that because of their similarity of use synonyms get primed the same way and that the primings then transfer from the individual items to the small semantic set which contains them, presumably with some tidying up of discrepant primings in the process.

To explore this, I shall return to the description of consequence again, because so many of the semantic associations and colligations have already been described.

There is an example from the ‘logic’ association of consequence that provides tentative support for the view that synonyms may share primings for users. Among the one-off items that occurred with consequence in the ‘logic’ association was the item knock-on. This item only occurs once with consequence:

  5. . . . with the knock-on consequence of lower benefit upratings and public expenditure savings.

[Figure 4.1: diagram linking consequence to the LOGIC set (knock-on, inevitable, logical, …) and knock-on to the OUTCOME set (effect, impact, …, consequence)]

Figure 4.1 The semantic associations of consequence and knock-on

This of course further illustrates the fact that not all manifestations of semantic association are also common collocates of the word with which they occur. The important feature of this word, however, is that it is primed for many users to occur with other items with a similar meaning to consequence. In particular, it occurs with effect and effects very frequently. Out of 280 examples of knock-on in its non-sporting sense, 251 accompany one of these words; a further 9 accompany items like benefits (i.e. positive consequences) and impact (‘an effect or influence’ – Macmillan Essential Dictionary 2003).

What this means is that knock-on is strongly primed for semantic association with ‘logical outcome’, which its combination with consequence illustrates, though not prototypically. In other words, we have a situation that can be represented diagrammatically as shown in Figure 4.1, where unbroken lines represent instances of the association that also qualify as collocations and broken lines represent instances of the association that do not.

Such an interweaving of two semantic associations does not however involve an extension of the notion of semantic association. We have as yet no evidence that the semantic sets ‘logic’ and ‘outcome’ are in a direct prosodic relation with each other, only that two of the associations around consequence and knock-on intersect. To examine whether semantic sets interact, we must consider another item with broadly the same meaning as consequence (candidate items include result, outcome and effect). If it can be shown that such an item shares the associations of consequence, then we can say that we have evidence of the situation described earlier, viz: ‘semantic category X, represented by items p, q and r, has a semantic association with semantic category Y, represented by items a, b and c’.

The item I chose to investigate for this purpose was result. Result is a near synonym of consequence and for most language users they share a number of collocations (e.g. direct, inevitable, likely, one and as a). They also, for most language users, share the colligational primings:

    The result/consequence was that
    This was a result/consequence of

In a concordance of 15,952 lines (result is greatly more common than consequence) there were 14,307 nominal uses of the word, of which 1,758 were immediately preceded by an adjective. As in Chapter 2, I sought to group the adjectives according to semantic similarity, using the categories established for consequence as a starting-point. I found that there were points of similarity but also of difference in the associations of the two words. Of the four associations associated with consequence, only two appeared to operate for result. The first of these was the LOGIC association: 37 per cent of the adjectives associated with result commented on the logic of the process, for example:

  6. It was as a direct result of Britten hearing Vishnevskaya that he conceived the idea of having a soprano soloist singing with the choir in the Latin settings of the liturgies.
  7. The end result was world domination.
  8. The immediate result of their collaboration was the hit single Jumping Jack Flash.

We can say therefore that result and consequence share the association of LOGIC, though it is a less dominant association for the former item (37 per cent as against 59 per cent).

The second association to be shared by both items was the minor association of UNEXPECTEDNESS, although, for result, it is more minor still, accounting for only 4 per cent of the adjectives accompanying the noun. Still, UNEXPECTEDNESS is slightly over twice as likely to be encoded as EXPECTEDNESS, a situation similar to that pertaining to consequence.

That exhausts the common associations of result and consequence, despite their obvious similarity of meaning and apparent similarity of contexts of use (and as we shall see below, there are differences between the two items even in the shared association of LOGIC). The other associations identified for consequence are missing. To begin with, while we found that 15 per cent of adjectives accompanying consequence were negative in tone, for result the proportion has halved (8 per cent). More significantly, the proportion of positive adjectives has risen from a stingy 3 per cent accompanying consequence to 22 per cent accompanying result. Put another way, the ratio of negative to positive adjectives for consequence is 5:1; for result it is 2:5. So result has a positive association, not a negative one. The picture is the same for the other association linked with consequence. SERIOUSNESS accounts for 11 per cent of instances of premodified consequence but for only 2 per cent of such cases of result. In their place, result has other minor associations: ACCURACY (4 per cent, as opposed to 0.7 per cent for consequence) and SAMENESS/DIFFERENCE (5 per cent, as opposed to no occurrences for consequence).

Even the area of closest similarity hides difference. While both result and consequence have strong associations with LOGIC, they differ considerably with regard to which sub-categories they favour.
For consequence much the most common of these is INEVITABILITY, accounting for almost half of the LOGIC association, a fact also reflected in the fact that the most common adjectival premodifying collocate of consequence is inevitable. In the order of INEVITABILITY, (IN)DIRECTNESS and (UN)EXPECTEDNESS, the three sub-categories occur with consequence in the proportions 5:3:2. The proportions for result are quite different. By far and away the most common sub-category of the LOGIC association is the one reporting on the directness of the logical process being described or the stages involved in it; again this is reflected in the fact that the two most common adjectival collocates of result are direct and end. The (IN)DIRECTNESS sub-category accounts for 78 per cent of all LOGIC adjectives occurring with result. The proportions of the three sub-categories for result, in the same order as before, are 1:8:1. Thus even where there is a shared association, at a greater delicacy it is found to be only partly shared.

It would seem then that extension of the notion of semantic association to cover synonymous relations should proceed cautiously. The evidence thus far is that synonyms are primed differently.

The same pattern reveals itself with colligation. As well as sharing some colligations, consequence and result differ in important respects in their use in the Guardian (and, therefore, presumably for many language users). We saw in Chapter 3 that consequence favours indefiniteness compared with other abstract nouns. If we now compare consequence with result, we see that result colligates strongly with definiteness (see Table 4.3).

It would seem that if a language user wants to talk of an outcome that is both positive and definite, they will typically be primed to choose result.
If on the other hand they want to talk of a negative, indefinite outcome, consequence is likely to feel the more natural choice.

It is important, however, not to overstate the position with regard to the typical primings of consequence for indefiniteness and result for definiteness. It is true that, proportionally, result is far more likely to be definite. In terms of absolute numbers, though, because of the much greater frequency of result in newspaper writing (and, one suspects, in speech and many other types of writing), there are at least as many instances of indefinite result in my corpus as there are of indefinite consequence (214 against 125). Secondly, though consequence is proportionally far more likely to co-occur with indefinite markers than is the case for the other abstract nouns examined, it is still in absolute terms more likely to occur with the definite article (or other markers of definiteness) than with indefinite markers.

Table 4.3 The distribution of markers of (in)definiteness for consequence and result

              Definite       Indefinite
consequence   249 (67%)      125 (33%)
result        3,508 (94%)    214 (6%)

Table 4.4 The distribution of markers of indefiniteness across consequence and result

              a          another    one         every   any
consequence   28 (22%)   20 (16%)   76 (61%)    1       –
result        76 (37%)   12 (6%)    105 (51%)   1       11 (5%)

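
The percentages in Table 4.3 can be re-derived from the raw counts. The following is a minimal sketch in Python that uses only the figures reported in the table; the names TABLE_4_3 and definiteness_shares are illustrative assumptions and form no part of the original analysis.

```python
# Raw counts reported in Table 4.3: word -> (definite, indefinite).
TABLE_4_3 = {
    "consequence": (249, 125),
    "result": (3508, 214),
}


def definiteness_shares(definite, indefinite):
    """Return the definite and indefinite shares as whole percentages."""
    total = definite + indefinite
    return round(100 * definite / total), round(100 * indefinite / total)


# Hypothetical helper structure: word -> (definite %, indefinite %).
shares = {word: definiteness_shares(*counts) for word, counts in TABLE_4_3.items()}

for word, (pct_def, pct_indef) in shares.items():
    print(f"{word}: {pct_def}% definite, {pct_indef}% indefinite")
```

Keeping the raw counts alongside the percentages makes visible what the proportions alone hide: result is indefinite in only a small share of its uses, yet its raw number of indefinite instances is not smaller than that of consequence.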
There is no contradiction in the above points. Colligations, collocations and semantic associations may be weak or strong, and their strength is measured against the frequency of the choice in the language as a whole. (I am aware of statistical problems with this formulation, which arise both from the fact that an underlying assumption of lexical priming is that there is no single, monolithic ‘language’ and from the fact that the multiplicity of factors that affect the possibility of the choice include the varying colligational and semantic association factors that I am here trying to describe and distinguish, but the point can at least be made validly with regard to near-synonyms and perhaps co-hyponyms.) The explanation therefore for the raw figures lies in the fact that indefiniteness is rarer in text than definiteness. The colligations of result reflect that rareness; those for consequence to some extent challenge it.

Because of the overall greater frequency of result in my data, there are sufficient instances of its use with indefinite markers to permit a further comparison between the two synonyms, with respect to the markers of indefiniteness that the two words occur with. As before, we are only concerned with consequence and result in Subject function. Obviously the numbers for any and every would shoot up with reason in Complement and Object functions and for with consequence in Adjunct function (see Table 4.4).

At first sight, the synonyms do not differ greatly in their priming for indefinite markers, when used as Subject. They are alike in favouring one as the most common marker of indefiniteness, then a, followed by another, with any and every as the least common. But this disguises several interesting differences. Firstly, one is almost three times as likely to occur with consequence as the next most frequent option, a. Compare this with the frequencies for result, where one is barely more than a third more likely to occur than a.
Secondly, another is nearly as frequent as a with consequence, but occurs only one sixth as often as a with result. Finally, any does not occur with consequence in its outcome sense, a point we shall return to in the next chapter when we consider polysemy. Figures here, however, are not large and any conclusions drawn can only be tentative pointers to possible patterns of difference. And of course it must once again be reiterated that a corpus cannot determine what the primings of any individual will be; it can only suggest the kinds of primings that might occur (and, in the case of a text-type-specific corpus such as mine, indicate the receptive primings that a regular user of such texts might receive). A corpus can only tell us what the primings of an individual would be if that corpus was their exact and only linguistic experience.

Table 4.5 The distribution of markers of definiteness across consequence and result

              the            Possessive   this/that   None
consequence   247 (99%)      2 (1%)       –           –
result        3,278 (93%)    97 (3%)      118 (3%)    13

It was noted above that consequence in absolute terms occurred with a marker of definiteness two thirds of the time. This means that there is sufficient data to permit a comparison of the two near-synonyms in respect of their co-occurrence with the definite article, possessive constructions and the determiners this and that when they are serving as (part of the) Subject. As can be seen from Table 4.5, there are no examples in my data of this consequence or that consequence as Subject, whereas this result and that result do occur in Subject function, albeit relatively rarely. We can therefore postulate that consequence is also colligationally primed to avoid the demonstratives. What this means is that we appear to be primed never to choose to characterise an earlier statement as being a consequence, whereas we are comfortable about characterising a previous statement as a result. The reasons for this are not immediately apparent, but I would suggest that consequences may be unintended or unexpected and therefore unpredictable, whereas results may be expected and planned for (scientific results, football results, election results). It is easier to recognise a planned-for outcome and it may be more natural to want to discuss such outcomes, whereas if an outcome is unexpected, it will not be recognised by a reader or listener as being an outcome until this is pointed out to them. If this explanation holds water, it is evidence for an interlocking of textual decision and lexical decision, such that the combination has a direct but subtle effect on the grammatical patterns in which it can appropriately (as opposed to acceptably) appear.

Cumulatively, with all caveats and cautions in place, the evidence suggests that synonyms are typically not identically primed.
There are indeed shared primings, and in so far as there are, they reflect the close similarity of sense. But they also differ in important ways, the differences marking variations in use and context and providing a reason for the existence of the synonyms in the first place. We are arriving at a position where even small semantic sets comprising words with near identical meanings do not behave as sets often enough to warrant starting a description of priming with such sets. Words are individually primed – this is a central premise of priming theory – and it would seem that they remain individually primed.

Synonymous expressions sharing a word

We are left with one last possibility to explore. Sometimes we have expressions with the same meaning that share lexis but differ in their construction. An example of such a pair of synonymous expressions is round the world and around the world. Since they share the words the world and the morpheme round, the issue is no longer one of whether we should abstract from particular primings to semantic sets. Instead the issue is whether the primings of the nested combination of round the world and the primings of the nesting of round with a- in prior position and the world in subsequent position are the same or different. If they are alike, we will have found the limiting case for distinguishing primings. If they are not alike, we will be looking at priming differences arising from a single sound/single letter morpheme (a-). (The possibility of morphological priming is discussed in Chapter 8.)

On the face of it, we would expect around the world and round the world to be in free distribution. Intuition suggests they mean the same thing and it is easy to find examples which are closely parallel:

  9. . . . getting communities in Britain and round the world to really participate in their own development.
  10. . . . as long as the green Heineken logo continues to appear on bars in Britain and around the world.
  11. Towns and cities around the world notched up record-breaking temperatures last week.
  12. Cities round the world try to market themselves by presenting their best features while glossing over the worst.
  13. She seems to have spent a long time travelling round the world, . . .
  14. And there is certainly nothing new about people travelling around the world, then returning to Britain.

It will be noticed that in each of these pairs the word sequences are not just being used in very similar ways; they actually co-occur with the same lexis. The existence of such parallel expressions suggests that they are primed for some users in similar ways. Of course, it could be that for such co-occurrences one speaker has round the world (and not around the world) primed while a second speaker is primed to use around the world (and not round the world). A corpus, after all, merges the primings of many different writers. However, I have no evidence to suggest that this is the case in this instance and shall operate on the assumption that the corpus is giving us a reasonable approximation of the way these word sequences are typically primed for most language users.

The synonymous expressions round the world and around the world may be similarly primed, but they differ in frequency in my data. There are 448 of the former and 1,798 of the latter, making around the world almost exactly four times as frequent as round the world. Furthermore, the figure for round the world is heavily distorted by the presence of 112 instances of the name Whitbread Round The World Race. All but one of these were removed, leaving the figures for the two expressions as 336 and 1,798 respectively with around the world being slightly more than five times as common in my data as round the world. (Four other instances with Whitbread were retained because they differed in what followed world – namely fleet (2) and yacht race (2).)

Table 4.6 Frequency of potential collocates for round the world and around the world

                                round the world   around the world   Ratio
                                (448)             (1,798)
all a/round the world           22                46                 1:2
from a/round the world          5                 99                 1:20
halfway a/round the world       28                6                  9:2
markets a/round the world       5                 30                 1:6
people a/round the world        3                 34                 1:11
race a/round the world          2                 14                 1:7
SAIL a/round the world          28                14                 2:1
TRAVEL a/round the world        10                12                 5:6
a/round the world for           4                 24                 1:6
a/round the world in            26                59                 4:9
a/round the world in 80         18                9                  2:1
a/round the world in x days     17                9                  2:1

Table 4.6 supports the initial impression that the synonymous expressions may be similarly primed for most users, showing as it does that they share a number of collocations. With the exceptions of people and race immediately prior to round and of for immediately after world, all the items listed in the table reach the threshold of recognition as collocates in WordSmith for both word sequences; in the case of these three potential collocates, the spread across the two expressions mirrors their overall relative frequency. Thus far, they behave as we originally predicted synonyms might. However, a glance at the table reveals a whole series of differences in terms of relative frequency. Cases where the distribution differs markedly have been emboldened. (The counts for SAIL and TRAVEL are for the lemma, rather than for the individual word forms.)

The table suggests that there are differences in the strength of priming of the two expressions as regards their collocates. So around the world is strongly primed in the Guardian to occur with from and people, while round the world is primed to occur with halfway and the lemma SAIL. The raw figures of occurrence for the lemma TRAVEL are more or less the same, but proportionally the collocation is much stronger for round the world.
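
The contrast between raw and proportional figures can be made concrete by dividing each collocate count by the overall frequency of the expression it occurs with. The sketch below is an illustration only: it takes a handful of counts from Table 4.6 together with the totals given in the table header (448 and 1,798), and the names COLLOCATES and relative_strength are assumptions introduced here rather than measures used in the original study.

```python
# Counts taken from Table 4.6:
# collocate -> (with "round the world", with "around the world").
COLLOCATES = {
    "from": (5, 99),
    "halfway": (28, 6),
    "people": (3, 34),
    "SAIL": (28, 14),
    "TRAVEL": (10, 12),
}

# Overall frequencies as given in the header of Table 4.6.
TOTAL_ROUND = 448
TOTAL_AROUND = 1798


def relative_strength(round_count, around_count):
    """Rate of a collocate with 'round' divided by its rate with 'around'.

    Values above 1 mean the collocate is proportionally commoner with
    'round the world'; values below 1 favour 'around the world'.
    """
    return (round_count / TOTAL_ROUND) / (around_count / TOTAL_AROUND)


strength = {word: relative_strength(*counts) for word, counts in COLLOCATES.items()}

# List the collocates from most 'around'-leaning to most 'round'-leaning.
for word, value in sorted(strength.items(), key=lambda item: item[1]):
    print(f"{word:8s} {value:6.2f}")
```

On this rough measure TRAVEL, whose raw counts are nearly equal (10 against 12), comes out proportionally more than three times as common with round the world, which is the point made in the text. Using the adjusted total of 336 for round the world would make the contrast sharper still.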
Curiously, given that the English title of Jules Verne’s book (and of the subsequent films) is Around the World in Eighty Days, it is 80 rather than eighty that is the primed collocate for both word sequences and when 80 days are being talked about, it is round the world that is the more common expression. Indeed it occurs twice as often as around the world with 80 days, despite being five times less likely to occur in general. (Was the original inappropriately translated, I wonder?)

Further inspection of the data shows that the expected semantic associations for our synonymous word sequences follow the same lines as the collocations. So both word sequences have a semantic association with in + NUMBER, but round picks up five cases and around four. Examples are:

  15. . . . in their quest to sail round the world in 77 days.
  16. . . . Enza New Zealand’s record attempt to sail around the world in under 79 days.

Around the world has a semantic association with in + MEASUREMENT OF TIME, with four instances of units other than days; there is no evidence of such an association for round the world. An example with both NUMBER and MEASUREMENT OF TIME associations in operation is the following:

  17. …a satellite link to take visitors ‘around the world in 8 minutes’.
    A second association that the word sequences share is that of JOURNEY, as might have been predicted from the collocations we considered above. Examples are 15 and 16 above and 18–20 below, which have been chosen to illustrate non-collocational instances of JOURNEY:
  18. He trudged around the world in his subject’s footsteps.
  19. The idea is deceptively simple: bum round the world, go to football matches . . .
  20. . . . is it really necessary to slog halfway round the world to watch seabirds killing one another?

In my data there are 142 instances of JOURNEY + around the world (and around the world + JOURNEY) and 139 instances of JOURNEY + round the world (and round the world + JOURNEY). This suggests that once again the two expressions share the same priming but differ in the strength of that priming. Of the two, round the world is much more strongly primed, with 41 per cent of all instances of round the world conforming to this semantic association as opposed to a lowly 8 per cent of instances of around the world. (The proportion of round the world would have been still higher, had I not disallowed the 111 instances of Whitbread Round The World Race.) The difference between the synonymous expressions suggests an explanation as to why round the world occurs with 80 days twice as often as does around the world, despite the title of Jules Verne’s book in English. Nevertheless, in raw figures there are still slightly more instances with around the world.

A minor semantic association that appears to belong only to round the world is that of MEASUREMENT, as in:

  21. Removing and disposing of the tape later was itself a problem, incidentally, since it is calculated that the Israelis bought enough to go seven times round the world.

This occurs six times (as opposed to once with around the world). Not important in itself, perhaps, but in conjunction with the previous semantic association and the collocations of SAIL and TRAVEL it points to a significant tendency that separates the otherwise synonymous expressions. Round the world is more literal with 171 occurrences (51 per cent) referring to the act of circling the globe, as opposed to 187 uses (10 per cent) of around the world.

What we are really looking at with these data is a partial limitation on the uses of round the world. The expressions are synonymous as regards circling the globe and occur with almost exactly the same frequency. But around the world is clearly being used extensively in other ways. The most obvious of these is contexts where around the world means something like ‘all over the world’, there being no suggestion of direction, ordering or movement. Examples are:

  22. Inside their house is filled with curios from around the world.
  23. Fuji Garuji had caused the building of 20 pagodas around the world, often located in places of great historical significance and beauty.

There are 1,611 instances of around the world with this looser sense, compared with 165 instances of round the world. So around the world is ten times more likely than round the world to be chosen to express the scale, scope or spread of a phenomenon. Crudely, if you are being literal, you are likely to be primed to go for round; if you’re being vague, your priming is probably to go for around. But – and this needs underlining – we are talking about biases, we are not talking about either expression monopolising a sense.

The pattern, perhaps predictably in the light of our other comparisons in this chapter, is the same for the characteristic colligations of the two prepositional phrases. In this context a number of features were examined. In the first place I looked at the behaviour of both word sequences as postmodification of a noun head. Where it was unclear whether the word sequence was postmodifying or not, I erred on the side of caution and included the case in my count whenever the postmodification alternative made sense, even if it seemed the less likely interpretation. (Of course, the existence of such an endemic ambiguity, which seems never to trouble writers and readers, suggests that the problem lies in the grammar we are using.) On the other hand, all cases of the common construction there is/are x a/round the world were treated as not postmodifying; they will be discussed separately below.

Looking then at the use of round the world and around the world as postmodification, we immediately find that there is a big difference between the two word sequences in terms of their likelihood of occurring as postmodification. Close to half (49 per cent) of all instances of around the world are postmodifying as opposed to just over a fifth (21 per cent) of all instances of round the world. It is around the world, in other words, that is primed to occur as a postmodifier.

Whichever word sequence is chosen, it is likely to occur with a plural noun head. Put the other way round, both word sequences seem to be primed to avoid singular noun heads, though the priming is again slightly stronger for around the world. Only 13 per cent are singular with around the world, as opposed to 21 per cent with round the world. The same pattern occurs with indefiniteness. Both word sequences typically occur in indefinite nominal groups. Again putting the point negatively, they seem both to be primed to avoid the definite article (see Table 4.7). On the basis of its crude frequency in the language, we might have expected the to have predominated but it is in fact much the rarer option for both word sequences; indeed possessives occur in each case as often as the definite article and demonstratives.

Table 4.7 Distribution of markers of (in)definiteness between round the world and around the world

                                    round the world   around the world
                                    (336 instances)   (1,798 instances)
Definiteness (excl. possessives)    7                 51
Possessives                         9                 59

So far the evidence for seeing around the world and round the world as free variants or, alternatively, as clearly distinct like consequence and result is ambivalent.
There are shared collocations, semantic associations and colligations, on the one hand; on the other, there are differences of weighting of priming and there is a cluster of collocations and an association that are effectively only associated with one of the expressions. The question, then, is: are there any colligations that clearly belong to one of the expressions and not to the other? The answer is that there is one such case.

The colligation that distinguishes the two expressions is the clearest evidence we have that they are not in free variation. The word sequence round the world occurs 29 times as a premodifier, as in:

  24. . . . make you feel so guilty that you sign up as sponsor for a charity round the world sack race.
  25. Students’ round the world scam costs BT dear [headline]
  26. . . . Ffyona Campbell, 27-year-old round the world walker who completed her 11-year, 19,586 mile trek on Saturday and now plans to raise a family.

Around the world occurs only once in such a grammatical role.

In conclusion, it would seem as if the synonymous word sequences we have been considering are primed similarly but distribute themselves differently across the lexical, semantic and grammatical terrain. Thus both expressions collocate with halfway and markets, but one of them is far more strongly primed than the other for such collocates. Both expressions can be used vaguely or to describe the circumference of the earth, but one is favoured for the first use and the other (proportionally) for the second. Both expressions can occur as postmodification or as premodification, but one occurs much more often as postmodification and the other is used almost exclusively when premodification is needed. The situation is similar therefore to the one we considered for consequence and result. The shared meaning means that there is overlap in the primings, but ultimately it is the difference in (the weighting of) the primings that justifies the existence of the alternatives. The morpheme a-, which is all that distinguishes the word sequences, is as significant a difference as any other that we have considered.

All the evidence in this chapter supports the view that primings are distinctive to the word. Tucker (1996) holds a similar position with respect to antonyms. He shows how antonymous items such as like and dislike share some structures (e.g. I like/dislike dressing for dinner) but, crucially, differ in others (e.g. I like/*dislike to dress for dinner). Likewise, Krishnamurty (2002) shows that antonyms have quite distinct collocational profiles. The assumption in this book has of course been that what is primed is the word, not the meaning of the word, and while semantic sets may, through abstraction from parallel primings, be themselves primed, the discovery that semantic sets, whether or not they make use of synonymy, co-hyponymy or antonymy, share only a limited range of collocations, semantic associations and colligations is simply confirmatory of that original assumption.

In the light of the above discussion, we may hypothesise that synonyms differ in respect of the way they are primed for collocations, colligations, semantic associations and pragmatic associations and the differences in these primings represent differences in the uses to which we put our synonyms.

But if we accept that it is the word (or word sequence or syllable) that is primed, not the sense, a new question comes into view. All my discussion and examples have glossed over the fact that, for example, consequence can mean ‘importance’ as well as ‘result’ or that reason can mean ‘logical faculty’ as well as ‘explanation’. We need to ask what happens with polysemous (or, more rarely, homonymous) items. Do the same primings apply, irrespective of the use to which a word is being put? And if they do not, how are they kept apart? That will be the subject of the next chapter.


5 Lexical priming and polysemy

Polysemy and definition

Sinclair (1987), commenting on the development of the Collins COBUILD English Language Dictionary, notes that each meaning of a word can be associated with a specific collocation or pattern. Much subsequent lexicographical practice has indeed been informed by this observation.

Sinclair’s observation is positively formulated. The point he is making is that a distinctive colligational or collocational pattern indicates a separate use of the word. As formulated, though, it would be possible in principle for two polysemous uses of a word each to have their own distinctive patterns for, say, 30 per cent of the time. Such a distribution would be clear-cut grounds for a lexicographer to make allowance for the polysemy thereby demonstrated. It would also reflect itself in the characteristic priming of the language user, with each polysemous use being primed for the patterns in question with which it was associated. It would still, however, leave undescribed the 40 per cent of cases which fell into neither set of characteristic patterns. For these 40 per cent of cases, presumably ambiguity would be an ever-present possibility.

Experience suggests (and Sinclair’s work elsewhere supports the view) that ambiguity in language in use (as opposed to decontextualised and fabricated examples) is a rarity. This suggests that Sinclair’s claim can be couched contrastively, such that the patterns of one use of a polysemous word always distinguish it from those of the other uses of the word. We are, I want to suggest, primed to recognise these contrastive patterns and to reproduce them. More precisely, I shall argue in this chapter that the collocations, semantic associations and colligations a word is primed for will systematically differentiate its polysemous senses and that ambiguity (or humour) will always result from our use of a word in ways not in accordance with these primings. If this position convinces, the meanings of a word will have to be interpreted as the outcome of its primings, not the object of the primings.

The drinking problem hypotheses

In the film Airplane, we are told of a pilot who is no longer permitted to fly because he has a ‘drinking problem’. The next shot shows him spilling a non-alcoholic drink all over himself; his problem is in fact that he misses his mouth when he tries to drink. The joke depends on the order of the words. If we had been told he had a ‘problem drinking’ it would have been sad rather than funny; on the other hand, if he had been described as having ‘a problem with drinking’ the joke would be back in place again. In other words, although the collocation between drinking and problem is the same in each case, there is only one grammatical combination that can mean that someone has a problem getting liquid into their mouth or throat. It is not, I hypothesise, an accident that on the one hand we often need to talk about alcoholism and have a number of ways of doing so and on the other rarely need to talk about the physiological disorder and have only one way of doing so. The more common meaning of alcoholism in effect drives the rarer meaning into a grammatical corner. This observation leads to the following hypotheses, which I somewhat whimsically have termed the ‘drinking problem’ hypotheses (Hoey 1997b, 2004b):

  1. Where it can be shown that a common sense of a polysemous word is primed to favour certain collocations, semantic associations and/or colligations, the rarer sense of that word will be primed to avoid those collocations, semantic associations and colligations. The more common use of the word will make use of the collocations, semantic associations and colligations of the rarer word but, proportionally, less frequently.
  2. Where two senses of a word are approximately as common as each other, they will both avoid each other’s collocations, semantic associations and/or colligations.
  3. Where either (1) or (2) do not apply, the effect will be humour, ambiguity (momentary or permanent), or a new meaning combining the two senses.
Drinking problem hypothesis 1 – the effect of the primings of a common sense on a rare sense: consequence

We begin by looking at hypothesis 1, and for this purpose I have taken two words that are polysemous but have one use that is far more common than the other(s). The first of these is that of consequence (discussed here for the last time, you will no doubt be relieved to learn) and the second is that of reason. The more common use of consequence occurs 91 per cent of the time in my data, meaning that the two uses of this polysemous word occur in a 10:1 ratio. The situation with reason is still more extreme, with the more common use accounting for 96 per cent of all occurrences of the word. The ratio of common to less common use of this particular word is therefore 24:1. This means that both words are appropriate for testing the first drinking problem hypothesis, which of course predicts that the sense of these polysemous words that respectively occurs only 9 or 4 per cent of the time will be primed to avoid the collocations, semantic associations and colligations associated with the much commoner sense.
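The two ratios quoted above follow from the percentages by simple division. A minimal Python sketch of the arithmetic is given here; the function name `common_to_rare_ratio` is purely illustrative and does not come from the text.

```python
def common_to_rare_ratio(common_pct: float) -> float:
    """Return the number of common-sense uses per rarer-sense use."""
    rare_pct = 100.0 - common_pct
    return common_pct / rare_pct

# Percentages as reported for the more common sense of each word.
consequence_ratio = common_to_rare_ratio(91)  # 91/9, roughly 10:1
reason_ratio = common_to_rare_ratio(96)       # 96/4, i.e. 24:1

print(round(consequence_ratio, 1))
print(reason_ratio)
```

The 10:1 figure for consequence is thus a rounding of 91:9, while the 24:1 figure for reason is exact for a 96:4 split.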
Consequence (result) versus consequence (importance)

Consequence has two senses – ‘result’ and ‘importance’, as noted at the end of Chapter 4. To explore these, I examined 1,809 instances of consequence in total, drawn from the Guardian-dominated corpus used in the previous chapter. The ‘importance’ meaning is much the rarer, only occurring with certainty 169 times in 1,809 lines. (One line was ambiguous between the two senses and one was counted as an instance of the ‘importance’ sense, though it was possible to read it the other way.) Furthermore, the ‘importance’ sense occurs in only one regular grammatical structure of consequence (though a number of other structures occur on single occasions) and that is the one I have contrived to use in the first clause of this sentence. We will give the ‘importance’ use a little attention first. The first colligational statement to make about consequence in its ‘importance’ sense is a negative one: there are next to no examples of it functioning as the noun head of a nominal group anywhere other than in a prepositional phrase.

Only five examples occur in a non-prepositional phrase position, and three of these follow the verb HAVE, for example:
    1. Booth’s predicament would have little consequence had it not added a further molehill to the mountain of trouble and doubt established before.
All five instances of non-prepositional consequence (importance) occur in Object function, which we saw in Chapter 3 was the function that consequence (result) avoided. There are therefore, self-evidently, no instances of consequence (importance) functioning as noun head in a nominal group serving as Subject in a clause. What this actually means is that we never formulate our clauses with consequence (importance) as their topic. This is a fact about the word, not about the word’s sense. The word’s closest synonym, importance, occurs quite naturally as topic. In the first 100 instances of the word importance extracted from my corpus, 15 occur as head of a nominal group functioning as Subject in the clause.

A second colligational fact about consequence (importance) is that it seems never to occur with a specific deictic. I can only attest one instance in my data and that is the ambiguous case already mentioned. This occurs in an article about a legal dispute over pensions with the European Union and quotes the word sequence ‘the consequence thereof for other benefits’ from a Treaty of Rome directive. Without access to the original it is impossible to determine whether ‘effect’ or ‘importance’ is meant – neither fits entirely comfortably. I will return to this example at the end of my discussion of consequence.

Consequence (importance) does occur with non-specific deictics, but only with a restricted subset of them. In the language taken as a whole the full range of choices available to the user are as shown in Table 5.1.

Table 5.1 Items functioning as non-specific deictics (adapted from Halliday 1994: 182)

                                                                  Unmarked
Total     Positive                   each, every
          Negative                   neither (not either)         no (not any)
Partial   Selective, Non-selective   one, another, a(n), either   some, any

Out of this range of possibilities, consequence (importance) occurs quite frequently with all the unmarked non-specific deictics (i.e. the final column, emboldened in the table) and, apart from just two examples of co-occurrence with a, with none of the others. We saw in Chapter 3 that consequence (result), by contrast, avoids any; it also avoids some and no. This is an example of the drinking problem hypotheses in operation, since, as we saw, consequence avoids combination with any despite its colligational preference for indefiniteness. The reason for this, we can now see, would appear to be that the other sense of consequence, where it means ‘importance’, has a collocational preference for occurring with any. This turns out to be part of a systematic pattern of distinguishing preferences and avoidance for the two senses of consequence.

It is worth stepping aside from our argument for a moment to make two points. Firstly, the possibility of making a colligational statement of the kind we have just been observing also serves to provide cautious authentication of the grammatical categories/classification used in its formulation. Thus in this instance Halliday’s apparently complex grammatical classification tidies up the contrast between the two senses of consequence and will be seen to do the same for reason below. It does not however demonstrate that such categories have a prior or independent status from the colligations that they enable us to report.
Indeed it will be argued in the next chapter that all grammatical systems bring together, and generalise out of, a multitude of colligational likelihoods, in the same way that colligational statements bring together and abstract from collocational ones.

The second aside I want to make at this juncture is a point made already in Chapter 3 but worthy of reiteration – namely that colligational primings are not grammatical rules, whatever importance they may be proved to have in the formulation and validation of such rules. This means that there is no such thing as a counterexample to a colligational statement. It is quite possible to encounter a sentence that is an exception to one or more colligational primings. For example, we saw above that consequence (importance) rarely occurs other than as part of a prepositional group (five cases out of 169 examples) and that when it does it occurs with HAVE on three out of five occasions. We also noted that it virtually never occurs with any deictic other than unmarked ones (two cases out of 169 examples). Yet here is a sentence, not conspicuously unnatural, that manages to use consequence (importance) as head of a nominal group, with a singular deictic and following a verb other than HAVE, in other words running contrary to all but one of my colligational claims:
    2. They long to do something, to run a town, to enjoy a decent small consequence.
Importantly, though, it still conforms to one colligational priming – the one that says that if it occurs outside a prepositional phrase it will occur as Object.

Turning now to a pragmatic association of consequence (importance), we find that if it is used with of to postmodify another noun, which happens 55 per cent of the time (93 instances), it may be used to affirm or deny the importance of the event or entity referred to in the clause, but it is much more likely to be denying it:
    3. Some were people of no great consequence.
    4. Shareholders have a right to expect that nothing of consequence is missing from the prospectus.
    5. This means there will be only one league match of consequence this weekend.
      Denial of importance is, in fact, three times as common as affirmation. The most common way in which importance is denied is through the inclusion of a negative unmarked deictic (as in 3 above) or a negative noun head (as in 4). Negativity is also sometimes attached to the verb:
    6. Joseph B. Vasquez’s Hangin’ With The Homeboys (Cannons, Haymarket etc., 15) hardly takes itself seriously enough to be considered a black movie of great consequence.
      A more indirect expression of denial takes the form of the use of the deictic any, for example:
    7. . . . one of the few remaining corners of any consequence to be found on the race tracks of the world.
Here the implication is that almost all other corners are of no importance.

This denial is even more pronounced when of consequence is used as an Adjunct, which it is in 64 cases in my data. Only 9 of these affirm importance. Almost all of the rest deny it within the prepositional phrase, for example:
    8. The minister’s cut off date, it adds, is of little consequence.
    9. The money she paid was of no great consequence.
    10. But my foibles are of no consequence.
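As a quick check on the Adjunct counts just given, the following Python sketch works out how many Adjunct uses do not affirm importance and what share of the total they represent. The variable names are illustrative only, and the text notes that almost all, rather than all, of these non-affirming cases express the denial within the prepositional phrase itself.

```python
adjunct_total = 64      # instances of "of consequence" used as Adjunct
adjunct_affirming = 9   # of these, the instances that affirm importance

# Everything that is not an affirmation.
adjunct_not_affirming = adjunct_total - adjunct_affirming
share_not_affirming = 100 * adjunct_not_affirming / adjunct_total

print(adjunct_not_affirming)        # 55
print(round(share_not_affirming))   # 86
```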
What all this means is that consequence (importance) is primed pragmatically for denial, this occurring in the data examined 79 per cent of the time. There are however no instances in my data of consequence (result) being denied. It is perhaps worth adding that there is no obvious reason why this should be so. After all, the related word reason, for example, is, as we have seen, frequently denied: That’s not the reason why . . . And result, which we have discussed as a near-synonym of consequence, can be denied (though it is admittedly not a common option):
    11. This Japanese achievement is not the result of working longer hours.
This pragmatic association links up interestingly with two of the colligational primings of consequence (importance). The first of these is that consequence (importance) is primed never to occur as Subject (and therefore of course it never occurs as Theme). The second is that, when used as Adjunct (i.e. not as postmodification to some other noun), the phrase of consequence is likewise primed to avoid Theme position, at least in newspaper English. There are in fact only five instances of this use of the phrase in Theme position in the whole of my data (and one of these is genuinely ambiguous between the two uses of consequence, as I shall show at the end of this section). The explanation for both colligational phenomena may lie in the fact that, as an implication of the pragmatic association with denial, consequence (importance) is much more likely to be used to describe the unimportance of something than its importance.

When of consequence is used as postmodification, the incidence of occurrence in Theme increases but is still fairly low, only occurring there a quarter of the time. In Hoey (1996), on the basis of a much smaller set of data, randomly selected from the Bank of English, I hypothesised that the reason why of consequence did not occur in Theme as Adjunct was that we rarely need to thematise the unimportance of something (whereas the importance of something is frequently thematised). It is therefore interesting to note that, of the ten positive Adjuncts found in the larger data set, half are thematised, for example:
    12. Of more consequence may be two proposals aimed at giving spectators a full ration.
    13. Of even greater consequence, the participation of adolescents in society was of special interest in the latter part of the war.
Given that none of the 55 negative cases are thematised, the results suggest that this explanation holds water. But of course the other explanation, and the one in focus here, is that this is an instance of the operation of drinking problem hypothesis 2 (see p. 82). According to this hypothesis the reason consequence (importance) avoids thematised position may be to avoid potential confusion with consequence (result), which we have seen occurs frequently in Theme.
A summary of the priming differences between consequence (importance) and consequence (result)

Table 5.2 couples what we have learnt about consequence (importance) with what we learnt in previous chapters about consequence (result), and shows that on a whole range of characteristic primings, the two uses of consequence systematically differ, thereby supporting the first drinking problem hypothesis. On the basis of the evidence summarised in Table 5.2 it would be safe to argue that Guardian readers are likely to be receptively primed in such a way as to distinguish the two senses along the lines suggested. They may of course not distinguish the two senses productively this way. For many users of the language the productive priming of consequence (importance) will simply be that it collocates in prepositional phrases with of; the other features may not occur in their primings. Indeed for some users there may be no contrast in primings at all, for the simple reason that either consequence (importance) does not occur in their speech, being reserved only for the reception of the writing of others, or it is not included in their vocabulary, either productively or receptively. This will not however prevent them from avoiding the collocations, semantic associations and colligations of consequence (importance) when using consequence (result), because all the instances they will have read (or heard) of consequence (result) will have themselves avoided such features.

Table 5.2 The contrasting collocations, semantic associations, pragmatic associations, colligations and textual colligations of the two uses of consequence

                                                consequence (result)   consequence (importance)
Collocation with any
Collocation with of
Colligation with subject and complement         Positive               Negative
Semantic association with LOGIC
Semantic association with NEGATIVE EVALUATION
Pragmatic association with DENIAL
Textual colligation with theme                  Positive               Negative
Drinking problem hypothesis 1 – the effect of the primings of a common sense on rare senses: reason

It may be felt that I chose myself an easy option with consequence. After all, it is overwhelmingly used only with of and it is quite possible that for many users of the language this is its one and only priming. I therefore now turn to another polysemous item – reason. This word has a number of senses. The first and much the most common sense is the one considered in the last chapter when we looked at the different postmodifying options available for reason and the primings that each nested combination had. The other senses are various but two dominate – ‘logic’ and ‘rationality’ – illustrated by the following two sentences (both authentic, though they look fabricated!):
    14. When they’re older, you can use reason.
    15. His ego has finally taken over his reason.
Between them, reason (logic), reason (rationality) and an assortment of idiomatic uses account for 703 occurrences of reason in my data, just over 5 per cent of the 13,556 instances of reason I examined. All the remainder, needless to say, apart from 48 instances of reason as a verb, were cases of reason (cause). According to the first drinking problem hypothesis, the rarer senses of reason should avoid the primings of the common sense. Since all the rarer senses have to avoid the same collocations, colligations, and semantic and pragmatic associations, I have for the time being treated these senses together.
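The counts in this paragraph can be reconciled with a few lines of arithmetic. The Python sketch below (variable names are illustrative) subtracts the rarer senses and the verb uses from the total; the remainder, 12,805, is the figure used as the base for the percentages reported for reason (cause) later in the chapter.

```python
total_instances = 13_556  # all instances of reason examined
rarer_senses = 703        # logic, rationality and idiomatic uses
verb_uses = 48            # reason used as a verb

# What is left over is reason (cause).
cause_instances = total_instances - rarer_senses - verb_uses
rarer_share = 100 * rarer_senses / total_instances

print(cause_instances)        # 12805
print(round(rarer_share, 1))  # 5.2
```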
Some key primings of reason (cause)

In this section, we will look at some of the characteristic primings of reason (cause). From these we will generate a series of possible primings that we would expect reason (rationality, logic) to avoid.

The first set of such primings to be considered relate to the use of reason (cause) as head noun in a nominal group functioning as Subject. The choice of Subject in itself is about average for nouns. This therefore is not a priming special to reason (cause), though it is still a priming, for reasons that will be discussed in Chapter 8. Once the Subject has been selected, however, a number of strong primings come into operation. These are found in Figure 5.1 which provides a great deal of information about the way that primings shape and, to some extent, restrict the choices we make when we use reason (cause). However, my intention here is to focus on the effects that these primings have on reason (rationality, logic).

reason as head of Subject: 3,496 (out of 12,805) (27%)
  unmodified: 1,649 (13%, 47%)
    the reason: 496 (4%, 30%)
      the reason + BE: 428 (432) (3%, 86%)
        the reason + BE + clause: 209 (2%, 49%)
        the reason + BE + nominal group: 103 (1%, 24%)
        the reason + BE + adjective: 117 (121) (1%, 27%)
          the reason + BE + simple: 53 (0.4%, 45%)
    the reason (+ postmodification): 1,153 (9%, 70%)
      for this (x): 123 (1%, 11%)
  modified: 1,847 (14%, 53%)
    numbering: 927 (7%, 50%)
    epithet: 840 (7%, 45%)
      main: 263 (2%, 31%)
      real: 133 (1%, 16%)

Figure 5.1 Clause patterns associated with reason (cause) as head of a nominal group in Subject function

A number of key primings of reason (cause) are apparent from our analysis. In the first place, we note that the triggers a range of further primings. Our first point of comparison must therefore be with regard to the deictics:
      1. On the basis of the drinking problem hypothesis, it is predicted that reason (rational faculty, logic) will avoid the in any circumstance where the further primings of reason (cause) might be expected.
Secondly I note that reason is often thematised. In addition to the 3,496 cases in the figure, a further 994 prepositional phrase constructions are thematised, making reason Theme 35 per cent of the time. Although this is not an exceptional proportion, it indicates that reason (cause) has no aversion to Theme and may indeed be weakly primed for occurrence in Theme. Given the absolute frequency of reason (cause) in my newspaper data, this means that it occurs with this textual function a great number of times.
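The paired percentages attached to each count in Figure 5.1 appear to give, first, the share of all 12,805 instances of reason (cause) and, second, the share of the next node up in the figure. That reading is an inference from the numbers rather than something stated in the text. The Python sketch below, with an illustrative helper named `shares`, checks it against two nodes and also reproduces the 35 per cent Theme figure.

```python
BASE = 12_805  # all instances of reason (cause)

def shares(count: int, parent: int) -> tuple[int, int]:
    """Return (per cent of all instances, per cent of the parent node)."""
    return round(100 * count / BASE), round(100 * count / parent)

subject = 3_496      # reason as head of Subject
unmodified = 1_649   # unmodified cases among those Subjects
the_reason = 496     # "the reason" among the unmodified cases

print(shares(unmodified, subject))     # (13, 47)
print(shares(the_reason, unmodified))  # (4, 30)

# Theme: Subject cases plus 994 thematised prepositional phrases.
theme_share = 100 * (subject + 994) / BASE
print(round(theme_share))              # 35
```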
      2. On the basis of the drinking problem hypothesis it is predicted that reason (rationality, logic) will have an aversion to Theme or occur in Theme under distinctly different conditions, i.e. not with the or with any of the prepositions associated with reason (cause).

Thirdly, I draw attention to the priming of reason (cause) for combinations with BE. As already noted, the diagram understates the frequency of these, but even as it stands, it is clear that we need to look at the way reason (rationality, logic) combines with BE (or otherwise).
      3. On the basis of the drinking problem hypothesis it is predicted that reason (rationality, logic) will avoid Subject with BE.
We need now to look at the patterns that are associated with the use of reason (cause) as Complement. In my data there are 2,114 instances of reason (cause) as the head of a nominal group functioning as Object but there are 3,620 cases of its occurring as head of a nominal group in Complement function. While these figures do not suggest any aversion to Object, they certainly support the view that, in the Guardian, reason is strongly primed for Complement function, and the vast majority of these occur with BE.

As with the Subject, once the function of Complement has been selected, further primings come into view (see Figure 5.2). The figures for which as Subject do not discriminate between relative clause and question use. The figures for that as Subject, interestingly, include only five instances of the relative clause use.

It would appear that reason (cause) in Complement function strongly favours there and pronouns (apart from the personal pronouns she, he, I, you and they, of which there are only 38 in my data):
      4. On the basis of the drinking problem hypothesis, it is predicted that reason (rationality, logic) will avoid the PRONOUN/there + BE + reason structure. We would expect that the pronouns this and that would be particularly avoided.
reason as head of Complement: 3,620 (out of 12,805) (28%)
  there as Subject: 1,904 (15%, 53%)
    denial of reason: 1,307 (10%, 69%)
  pronoun as Subject: 908 (7%, 25%)
    which/what: 162 (1%, 15%)
    it: 129 (1%, 14%)
    demonstrative pronouns: 562 (4%, 62%)
      this: 236 (<2%, 42%)
      that: 325 (>2%, 58%)

Figure 5.2 Some key primings of reason (cause) functioning as Complement

We also note the strong association with denial with there, which both contributes to and reflects the association of Complement with denial noted in the previous chapter. The drinking problem hypothesis would accordingly lead us to predict that reason (rationality, logic) will avoid co-occurrence with denial. Two of the most common forms of denial with there and reason (cause) take the forms there is/was no reason and there isn’t/wasn’t any reason:
      5. On the basis of the drinking problem hypothesis, it is predicted that reason (rationality, logic) will not co-occur with the unspecific deictics no and any.
Having examined reason (cause) functioning as (part of) Subject and Complement, we now turn to consider the primings associated with another of its colligations, namely its association with prepositional phrases. Table 5.3 shows the distribution of reason (cause) across the range of prepositions. The figures are for all prepositional phrases in which reason is head of the nominal group preceded by the preposition. No care has been taken to eliminate the effects of verb choice (e.g. the effect of turned on turned it into a reason for . . . ), since such effects are largely overridden by the quantity of data considered.

Table 5.3 Occurrences of prepositions preceding nominal groups with reason (cause) as head

As can be seen, the prepositions it is particularly primed to occur with are for and as. Examples of the uses of reason (cause) with for and as are the following:
    16. The first ‘political’ Chancellor for ages; and hailed by political pundits, on appointment, for precisely that broad brush reason.
    17. For the same reason, all signs to Belgrade have been blotted out.
    18. For this reason, the backers need to make the first approach.
    19. However, broken blood vessels were given as the reason for his disappointing performance.
    20. But when things go wrong, commentators who have nothing more constructive to offer pick on T-shirts and stubble chins as a reason.

These data would lead us therefore to expect reason (rationality, logic) to avoid such prepositions.
      6. On the basis of the drinking problem hypothesis, it is predicted that reason (rationality, logic) will avoid occurring after for and as.
      In subsequent sections we will test each of these hypotheses (though, for reasons of clarity of exposition, I will take prediction (e) out of order).
Prediction (a): reason, the and the other specific deictics

We predicted above that reason (rationality, logic) would avoid co-occurrence with the. Given, however, that the deictics proved a fruitful source of comparison between the two senses of consequence, it makes sense to broaden our comparison of the two senses of reason to see how they compare across all the specific deictics. Table 5.4 presents the frequency of occurrences of reason (cause) with the different specific deictics available for combination with it.
Table 5.4 A count of the instances of reason (cause) occurring with the different specific deictics (classification adapted from Halliday 1994: 181)

                Determinative                           Interrogative
Demonstrative   this 426, that 152, the 4,503           which(ever) 4, what(ever) 123
Possessive      my 32, your 7, our 7, his 45, her 12,   whose(ver) 4
                its 8, their 31, one’s 1, X’s 21

Table 5.5 A count of the instances of reason (rationality, logic) occurring with the different specific deictics (classification adapted from Halliday 1994: 181)

                Determinative                           Interrogative
Demonstrative   this 0, that 0, the 4                   which(ever) 0, what(ever) 0
Possessive      my 7, your 1, our 0, his 10, her 1,     whose(ver) 1
                its 3, their 9, one’s 1, X’s 1
Interpreting these statistics is not as straightforward as it might seem since all the deictics are common words in the language. For the moment we will simply note that reason (cause) appears to be colligationally primed in the Guardian with demonstrative determinatives; 39 per cent of all instances of reason (cause) occur with one of them (usually the).

If we now compare the distribution of instances of reason (rationality, logic) across the same set of grammatical possibilities, we find almost the mirror opposite of the distribution found for reason (cause) (see Table 5.5). It will be immediately noticed that whereas reason (cause) had an apparent positive priming for demonstrative determinatives, reason (rationality, logic) occurs hardly at all in such a context. Only four instances of reason (rationality, logic), constituting a tiny 0.6 per cent of all such instances, occur with the and there are no instances at all with the other two demonstrative determinatives. Prediction (a) has therefore been fully confirmed.

It might be argued that this is because the meaning of reason does not permit such choices. However, imagination, with which reason (rationality, logic) is often coupled, occurs quite comfortably with the. The following example from my data is a prime piece of evidence in support of my position:
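The two proportions contrasted here can be recomputed from the demonstrative counts in Tables 5.4 and 5.5. The Python sketch below assumes that the denominators are the 12,805 instances of reason (cause) and the 703 instances of the rarer senses reported earlier in the chapter; the variable names are illustrative.

```python
cause_total = 12_805  # assumed base: all instances of reason (cause)
rarer_total = 703     # assumed base: all instances of the rarer senses

# Demonstrative determinative counts from Tables 5.4 and 5.5.
cause_demonstratives = {"this": 426, "that": 152, "the": 4_503}
rarer_demonstratives = {"this": 0, "that": 0, "the": 4}

cause_share = 100 * sum(cause_demonstratives.values()) / cause_total
rarer_share = 100 * sum(rarer_demonstratives.values()) / rarer_total

print(f"{cause_share:.1f}")  # 39.7
print(f"{rarer_share:.1f}")  # 0.6
```

On these assumptions the sketch gives 39.7 and 0.6 per cent, in line with the 39 and 0.6 per cent quoted in the text.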
    21. Natural selection enriches and disciplines the imagination by the reasoning faculty.
      Apart from illustrating imagination with the, it shows a periphrastic expression (the reasoning faculty) that seems to have been used with no other purpose than to avoid invading the territory of reason (= cause). It also shows that there is sometimes a need to use the definite article with the rarer sense of reason.

      Broadly speaking, then, we have evidence that where reason (= cause) is characteristically positively primed for a colligation with a class of specific deictics, we have avoidance by reason (= rationality, logic) of that class. With this in mind, it is productive to look at the four exceptions in Table 5.5, where the priming of reason (= rationality, logic) to avoid the definite article was overridden. After all, these represent a challenge to prediction (a), albeit not a powerful one:
    22. In the age of the New Man, the New Sense, the New Reason, a return to decent values, I was a social leper, a cultural dinosaur, a sex junkie . . .
    23. Instead of being equally shared between its two rulers, the Reason and the Imagination, it falls alternately under the sole and absolute dominion of each.
    24. And by conscripting the unconscious, magnetizers claimed they could restore those parts that the conscious Reason or Will wouldn’t reach.
    25. This name led me to wonder whether the plant which Shakespeare knew by this name (‘The insane root which takes the reason prisoner’) was this common hedgerow native . . .
      Let us deal with the quotation from Shakespeare first. Two explanations offer themselves. The first is that there has been a drift in the primings associated with reason (= rationality, logic) over the centuries; colligation and collocation can be presumed to be subject to the same possibilities of change as grammar and lexis have traditionally been recognised to be. The other is that Shakespeare, known to be highly creative with other aspects of the language, is here being so with colligation and collocation. Preliminary discussions with linguists holding medieval corpora suggest that the latter may be the better explanation.

      Turning now to the other three instances, one of the first things of note is that they show reason (= rationality, logic) with a capital letter. There are 31 instances of reason (= rationality, logic) in my data with a capital letter, excluding those cases where it is part of a title or the initial word in a sentence or headline. This means that over 4 per cent of uses of reason in its rarer senses are capitalised, a huge proportion compared with most words (other than names). This in turn means that it is primed among Guardian writers for capitalisation and this priming allows it to override the need to avoid the. There is no equivalent tendency for reason (= cause) to be capitalised. If we discard a handful of cases where the first couple of words of an article have been capitalised, titles and a handful of sentence-initial cases, there are only nine instances of reason (= cause) that are capitalised in my corpus – less than 0.1 per cent of cases. This is incidentally a good example of a priming that cannot be assumed to belong to all users of the language, and of course it is only possible in written genres.

      The other characteristic that the first three instances share is that they all involve coordination or listing. This turns out to be a highly characteristic feature of reason (= rationality, logic).
Re-examination of the full set of data for reason (= logic, logical faculty) shows this to be an important priming in my data. Table 5.6 shows the various patterns of coupling with other nouns that were found featuring reason (= rationality, logic), together with their absolute frequency in my data and the percentage of instances of reason (= rationality, logic) that occur in each pattern. It shows that over a third of instances of reason (= rationality, logic) occur in some kind of coupling or listing structure. The table shows that reason (= rationality, logic) is primed in the Guardian and elsewhere to be coupled with other abstract nouns. (Indeed, as four of the examples indicate, the coupling may occur inside larger lists.) This is not reflected in the data for reason (= cause). Indeed I could find no instances of its occurring at all (though with 12,805 instances it is not possible to be categorical, even after a slow and careful inspection). We will see below that this is not the only case where usage of the more common sense appears to be affected by the characteristic primings of the rarer sense.

      Table 5.6 Patterns of coordination and listing associated with reason (= rationality, logic)

      Pattern | Example
      h and h | I always put my faith in reason and kindness
      h (n)or h | He does not believe in freedom of will, the effectiveness of his literary works, in reason or revolution
      h over h | The Association of Metropolitan Authorities said the new inspection system was a ‘triumph of political dogma over reason’
      h (rather) than h | This sound account of his preferences – for the original textures of a church, for poetry that succeeded when he responded to instinct rather than reason, for endlessly alive chintzes, and for proto-Socialism – has been updated
      h, h, h | Clarity, elegance, reason, perfectionism: these, as John Willett has reminded us, were the guiding principles of Brecht’s theatre
      hq and hq | Here was a subject in which all her obsessions met: the nature of faith and the mechanics of reason, the darkness of enlightenment, the old debate of nature–nurture given flesh, the cruelty of certainty
      Other patterns | She uses reason as well as power to achieve her objectives
      [Frequency and percentage columns are largely lost in this copy; the surviving figures read: 15535; 1%; 1%]
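      The coupling patterns listed in Table 5.6 lend themselves to a rough automatic first pass over concordance lines. The sketch below is an illustration only, written in Python: the regular expressions and the function name coupling_pattern are assumptions made for this example, they cover just the four two-word patterns, and they do not check that the coordinated word is actually a noun, so a line such as "he gave a reason and left" would be wrongly accepted. Sense classification and final counts would still need manual inspection.

```python
import re

# Hypothetical patterns keyed by the row labels of Table 5.6.
# Each looks for a single word coordinated with "reason" on either side.
COUPLING_PATTERNS = [
    ("h and h", re.compile(r"\b(?:reason and \w+|\w+ and reason)\b", re.I)),
    ("h (n)or h", re.compile(r"\b(?:reason n?or \w+|\w+ n?or reason)\b", re.I)),
    ("h over h", re.compile(r"\b(?:reason over \w+|\w+ over reason)\b", re.I)),
    (
        "h (rather) than h",
        re.compile(
            r"\b(?:reason (?:rather )?than \w+|\w+ (?:rather )?than reason)\b",
            re.I,
        ),
    ),
]


def coupling_pattern(sentence: str):
    """Return the first matching pattern label, or None if none matches."""
    for label, pattern in COUPLING_PATTERNS:
        if pattern.search(sentence):
            return label
    return None


for line in [
    "I always put my faith in reason and kindness",
    "a triumph of political dogma over reason",
    "Reason is a political entity",
]:
    print(coupling_pattern(line), "<-", line)
```

      The patterns are tried in table order, so a line containing more than one coupling is reported under the first label that matches.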
      Prediction (e): reason, no, any and the other unspecific deictics

      Encouraged by the results reported in the previous section, I am choosing to go out of order in terms of our earlier predictions and we therefore turn immediately to the distribution of unspecific deictics across the senses of reason, two of which – no and any – were predicted to differentiate the common sense from the rarer sense(s). Table 5.7 shows the frequency of occurrence of reason (= cause) with the range of possibilities identified by Halliday (1994). The table shows strong priming for unmarked and partial unspecific deictics (with the exception of the either set) and much weaker priming for total positive unspecific deictics.

      Table 5.7 A count of the instances of reason (= cause) occurring with unspecific deictics (classification adapted from Halliday 1994: 182)

       | Singular | Unmarked
      Total, positive | each 1, every 281 |
      Total, negative | neither 1, not either 0 | no 2,311, not any 159
      Partial (selective and non-selective) | one 1,059, another 277, a(n) 1,002, either 0 | some 553, any 68

      In addition to our earlier prediction, we might expect, on the basis of our consideration of the specific deictics, that reason (= rationality, logic) will avoid the primed areas but possibly occur with either or neither. Halliday does not include or and nor in his table, presumably because they are coordinators. In the counts that follow, it is therefore assumed that an instance of reason preceded by either of these items partakes of the deixis attached to the noun with which it is coordinated. (As it happens, one of the occurrences uses nor in place of neither – see example 33, p. 98 – so this instance is included in my count.)

      As Table 5.8 shows, reason (= rationality, logic) avoids all but a tiny handful of the unspecific deictics. Once again, then, reason (= rationality, logic) is seen to avoid the primings of its bigger neighbour. There are just four occurrences with no, three with neither, one with nor and three with any.

      Table 5.8 A count of the instances of reason (= rationality, logic) occurring with unspecific deictics (classification adapted from Halliday 1994: 182)

       | Singular | Unmarked
      Total, positive | each 0, every 0 |
      Total, negative | neither 3, not either 0, nor 1 | no 4, not any 0
      Partial (selective and non-selective) | one 0, another 0, a(n) 0, either 0 | some 0, any 3

      It might seem that the explanation this time lies simply in the fact that reason (= rationality, logic) is an uncountable noun and therefore has no need to occur with unspecific deictics such as a and one, but that is to look down the telescope the wrong way. It is, I would argue, an uncountable noun because it has to avoid the primings of reason (= cause). Consider for a moment the list of examples given in Table 5.6 (all drawn, of course, from my data). If we examine the nouns with which reason is coupled on each occasion, we find that they can almost all occur in the plural or with an indefinite article. To demonstrate this, I repeat the examples in Table 5.9 in the first column and in the second and third columns I provide an example of the noun with which it is coupled being used in the plural and with a(n) respectively. I have tried as far as feasible to illustrate the plural or indefinite use with an instance close in meaning to that used in coordination with reason.

      Table 5.9 A comparison of reason (= rationality, logic) with the nouns with which it is coupled in terms of their ability to appear in the plural or with an unspecific deictic

      Coupling | Plural use | Use with a(n)
      Reason and kindness | . . . one still hears stories of kindnesses to students | . . . it would be a kindness to draw a veil over them
      Reason or revolution | . . . there were no other socialist revolutions | There will be a revolution by the big clubs
      Dogma over reason | Gone are the old certitudes and dogmas of class | PRP is nothing more than a dogma . . .
      Instinct rather than reason | These instincts don’t die immediately with the child . . . | ‘There may be an instinct which says “I want to go hunter-gathering over the hills” ’ he said
      Clarity, elegance, reason, perfectionism | Such clarities appear noticeably absent in 1991 / He was suddenly a warrior, entitled to bear the weapons of history and to display their deadly elegances | Geography seems to lend it a clarity and objectivity it does not possess / She has an elegance, a simplicity, a purity of character / A perfectionism never before evident in your life now afflicts you
      The nature of faith and the mechanics of reason | Must these faiths inevitably be in conflict with one another? | The enemy that matters is consumerism, a faith in itself, the green equivalent of Lucifer
      Reason as well as power | Just as bad is an uncontrolled judiciary with unlimited powers | Central Office will disband the association, a power some regard as legally questionable

      A glance at the table shows that the only word coupled with reason that is not attested in my data as having both plural and indefinite article uses is perfectionism. While intuition suggests that it is more than probable that a much larger corpus would still have thrown up no plural uses, it is perhaps worth saying that the word only occurs 38 times in the 100 million words of my data and occurs with the indefinite article even so. All the other nouns have both plural and indefinite uses. Our word reason is the exception therefore in having neither, and it is my argument that this is a direct consequence of the need to keep its colligations apart from those of reason (= cause).

      The general picture, then, is that reason (= rationality, logic) avoids unspecific deictics and reason (= cause) favours them (apart from those associated with alternatives), and the non-count nature of the former is argued to be a consequence of this situation and not the explanation. But, as before, there are a handful of exceptions – four occurrences of reason (= rationality, logic) with no, four with neither/nor and three with any.
Let us look at these challenges to the drinking problem hypothesis.

      We can quickly dismiss the instances with no (and, as it happens, neither); these are listed below, along with the solitary example with nor:
    26. There is no rhyme or reason here.
    27. There was no rhyme or reason to it.
    28. Waiting lists vary greatly, with no apparent rhyme or reason 
    29. Sadly there is no rhyme nor reason in simple boxing terms to the diatribe.
    30. . . . there is neither rhyme nor reason about the numbering.
    31. It is just that they seem to be doled out with neither rhyme nor reason.
    32. Neither rhyme nor reason in contemporary poetry [the title to an article].
    33. ‘nor wit nor reason can my passion hide’.
      All of them, including, perhaps disappointingly, the instances of neither, are versions of a familiar idiom, and reason, though it retains some of its meaning of ‘logic’, is not functioning in any way independently of the other elements of the wording. Since the idiom also accounts for the instances with neither, we do not need to adduce the drinking problem hypothesis to account for the occurrences of reason (= rationality, logic) with neither when reason (= cause) avoids it. Sinclair (1996, 2004) uses such examples to argue for working with lexical items (which may be many words long), rather than with words, a position we shall return to in Chapter 8.

      We are left once again with a quotation from Shakespeare (from Twelfth Night). This time, his use needs no special pleading in that it of course complies with the rarer reason’s priming for coupling (as, in fossilised form, does rhyme or reason).

      The three occurrences with any are more interesting:
    34. If there is any intelligence or reason, Red Star will play in Zagreb . . .
    35. And along with their arrogance, their fees swelled beyond any reason.
    36. There is also considerable speculation as to whether the Iraqi leader shows any reason at all.
      Example 34 illustrates yet again the coupling tendency of reason (= rationality, logic), though a residual ambiguity persists for this reader. Examples 35 and 36, however, are genuinely ambiguous and therefore support the third drinking problem hypothesis.

      Example 35 is, I think, irresolvably ambiguous in that it can mean in context two equally justifiable things. The ambiguity of example 36 can be resolved, but it is only when it is set in its fuller context that it becomes unambiguous:
    37. For George Bush to put the problems down to ‘the misguided actions of one man’ ignores both the facts and the roots of wider feelings. However, the focus on Saddam does not stop at substituting his personal reasoning for the collective experience of Arab peoples. There is also considerable speculation as to whether the Iraqi leader shows any reason at all. From the time Kuwait was invaded the sanity of Saddam began to be questioned. At first it was unclear whether the invective was meant metaphorically or literally . . . (© Guardian Newspapers Limited 1991)
      I will examine these cases in the final section of this chapter when I look at all the ambiguous cases together that have been thrown up by the analysis.
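      The contrast between Tables 5.7 and 5.8 can also be expressed as a single proportion for each sense. The following Python sketch is offered as an illustration and is not part of the original analysis. It assumes that the cells of each table are mutually exclusive (for instance, that an instance counted under not any is not also counted under any), and the names UNSPECIFIC, unspecific_total, unspecific_share and attested are invented for the example.

```python
# Counts transcribed from Tables 5.7 and 5.8 ("nor" is listed only in Table 5.8).
UNSPECIFIC = {
    "cause": {
        "each": 1, "every": 281, "neither": 1, "not either": 0,
        "no": 2311, "not any": 159, "one": 1059, "another": 277,
        "a(n)": 1002, "either": 0, "some": 553, "any": 68,
    },
    "rationality": {
        "each": 0, "every": 0, "neither": 3, "not either": 0, "nor": 1,
        "no": 4, "not any": 0, "one": 0, "another": 0,
        "a(n)": 0, "either": 0, "some": 0, "any": 3,
    },
}

# Total instances of each sense, as reported in the chapter.
TOTAL_INSTANCES = {"cause": 12805, "rationality": 703}


def unspecific_total(sense: str) -> int:
    """Sum of all table cells for one sense (assumes no double counting)."""
    return sum(UNSPECIFIC[sense].values())


def unspecific_share(sense: str) -> float:
    """Percentage of a sense's instances occurring with an unspecific deictic."""
    return 100 * unspecific_total(sense) / TOTAL_INSTANCES[sense]


def attested(sense: str) -> list:
    """Deictics that occur at least once with the given sense."""
    return sorted(d for d, n in UNSPECIFIC[sense].items() if n > 0)


for sense in UNSPECIFIC:
    print(sense, unspecific_total(sense), f"{unspecific_share(sense):.1f}%")
print(attested("rationality"))
```

      Under the stated assumption, well over two-fifths of the instances of the commoner sense occur with an unspecific deictic, against eleven instances in all for the rarer sense, all of which are the exceptions discussed above.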
      Prediction (b): reason, possessive deictics and Theme

      The best place to start an exploration of prediction (b), which was that reason (= rationality, logic) would avoid serving as Theme except under conditions that were clearly distinct from those of reason (= cause), would seem to be an area where the rarer sense(s) appear to be uninhibited by the behaviour of the commoner sense. Tables 5.4 and 5.5 showed one such area where reason (= rationality, logic) holds its own against its powerful neighbour, and that is in respect of its combination with the possessives. Just under 5 per cent of instances occur with a possessive, which, as a proportion of the total number of instances, is a higher proportion than that achieved by reason (= cause) (just over 1 per cent). Even here, though, closer inspection shows that reason (= cause) has the whip hand, and Theme is, as predicted, the place where this is demonstrated. It is instructive to look at the distribution of possessives for the different senses of reason (see Table 5.10). (For the purposes of calculation, identification of first position takes no account of subordinators or coordinators nor of auxiliaries in interrogatives.)

      As can be seen, reason (= rationality, logic) is typically primed to avoid combining with a possessive determinative in sentence-initial position, i.e. as part of Theme. Not one case of POSSESSIVE DETERMINATIVE + reason (= rationality, logic) occurs in such a position. On the other hand, in this combination reason (= cause) appears to have no aversion to being sentence Theme.

      Table 5.10 The distribution of possessive pronoun + reason across the sentence

       | reason (= cause), 1st position in sentence | reason (= rationality, logic), 1st position in sentence | reason (= cause), other positions | reason (= rationality, logic), other positions
      my | 24 | 0 | 8 | 7
      your | 3 | 0 | 4 | 1
      our | 5 | 0 | 2 | 0
      its | 2 | 0 | 6 | 3
      his | 14 | 0 | 31 | 10
      her | 7 | 0 | 5 | 1
      their | 2 | 0 | 29 | 9
      If one inspects the possessive combinations with respect to their grammatical function, one finds that POSSESSIVE DETERMINATIVE + reason (= cause) occurs frequently as Subject. Of the 57 thematised cases, 44 have this function. Furthermore, of the instances in non-sentence-initial position, 21 are also head of a nominal group functioning as Subject, where they serve as clausal Theme. This means that 46 per cent of instances of POSSESSIVE DETERMINATIVE + reason (= cause) are part of either sentence or clause Theme. By contrast only four (13 per cent) of the non-initial instances of reason (= rationality, logic) are (part of) Subject, and one of these is not the head of the nominal group in which it appears and is therefore immediately distinguished from reason (= cause). What we have, then, is evidence of the rare sense (or senses) of the word reason struggling to avoid encroaching on the Theme primings of the word’s most common sense.

      As with previous exceptions, it is instructive to examine the three remaining instances of reason (= rationality, logic) that defy the general pattern and occur as Theme for their clauses. They are:
    38. I am distraught with you and my thoughts and reason are confused.
    39. I do not want to be a dictator but my reason and conscience told me I had to submit the proposals.
    40. Men, her reason told her, would be shocked.
      Example 40 is an instance of reason (= rationality, logic) invading the colligational primings of reason (= cause). The presence of told, however, with which reason (= rationality, logic) collocates and with which reason (= cause) never occurs, counterbalances the effect of this. The other two make use again of the characteristic priming of the rarer sense for coupling with another abstract noun, and one of them also utilises the collocation with told just mentioned.

      Table 5.11 A comparison of the frequency of co-occurrence of possessives with reason and reasons

       | my | your | our | his | her | its | their | one’s | X’s
      Possessives with reason (= cause) (out of 12,805 instances) | 32 | 7 | 7 | 45 | 12 | 8 | 31 | 1 | 21
      Possessives with reasons (= causes) (out of 8,259 instances) | 19 | 17 | (count missing) | 103 | 31 | 33 | 101 | 0 | 44
      In other words, what we are seeing is that when reason (= rationality, logic) is used in an area where reason (= cause) is also used, the way that other primings of the former are used marks its occurrences out clearly, thereby eliminating any possibility of ambiguity. Furthermore, the commoner sense does not make as much use of deictics from that area as might be predicted on the basis of the frequency of the words in the language and the behaviour of the plural. As we have seen, the proportion of instances of reason (= cause) making use of the possessive is just over 1 per cent. That this is significantly lower than might have been expected is suggested by a comparison with the plural reasons. The word reasons is only used to mean ‘causes’, there appearing to be no plural instance of reason (= rationality, logic) in my data, and this means that there are no pressures on it with regard to the possessives. Table 5.11 shows the raw frequencies of possessives occurring with the singular reason and the plural reasons.

      In total there are 356 instances of possessives occurring with reasons (amounting to a little over 4 per cent of cases) as opposed to 164 instances with reason (= cause). In other words, once reason (= rationality, logic) is removed from the equation, the frequency increases more than threefold proportionally and falls in line with the proportion of instances occurring with the rarer sense(s), as we saw above. This further supports the claim made in the first drinking problem hypothesis that we make use of the common sense of the word in such a way as to avoid unnecessary conflict with the primings we have for the rare sense of the word.
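      The proportions used in this section can be retraced from Table 5.10 and from the totals quoted in the text. The Python sketch below is a worked check and not the procedure actually followed. Two points are assumptions: the 46 per cent figure is taken to be calculated over the 142 instances of reason (= cause) in Table 5.10 (which is what the reported percentage implies), and all names in the code are invented for the example.

```python
# Table 5.10, one tuple per possessive:
# (cause 1st position, rationality 1st position,
#  cause other positions, rationality other positions)
TABLE_5_10 = {
    "my": (24, 0, 8, 7), "your": (3, 0, 4, 1), "our": (5, 0, 2, 0),
    "its": (2, 0, 6, 3), "his": (14, 0, 31, 10), "her": (7, 0, 5, 1),
    "their": (2, 0, 29, 9),
}


def column_total(index: int) -> int:
    """Sum one column of Table 5.10 across all possessives."""
    return sum(row[index] for row in TABLE_5_10.values())


cause_first = column_total(0)
rationality_first = column_total(1)
cause_other = column_total(2)
rationality_other = column_total(3)

# Subject counts quoted in the text: 44 sentence-initial and 21 non-initial
# for the commoner sense, 4 non-initial for the rarer sense.
cause_theme_share = 100 * (44 + 21) / (cause_first + cause_other)
rationality_subject_share = 100 * 4 / rationality_other

# Possessive rates for singular reason (= cause) and plural reasons.
singular_rate = 100 * 164 / 12805
plural_rate = 100 * 356 / 8259
rate_ratio = plural_rate / singular_rate

print(cause_first, rationality_first)
print(round(cause_theme_share), round(rationality_subject_share))
print(f"{singular_rate:.2f}% vs {plural_rate:.2f}% (ratio {rate_ratio:.1f})")
```

      The check reproduces the 57 thematised cases, the absence of sentence-initial cases for the rarer sense, and the 46 and 13 per cent figures; the plural’s possessive rate comes out at between three and four times that of the singular.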
      Prediction (c): reason and Subject with BE

      We have just seen that POSSESSIVE DETERMINATIVE + reason (= rationality, logic) avoids Subject function. Prediction (c) was that, given that reason (= cause) is primed to colligate as Subject with BE and the other equatives (SEEM, BECOME, GROW), there would be no instances of reason (= rationality, logic) + BE etc. where reason was the head of its own nominal group.

      Investigation of my data revealed that out of the 703 instances of reason (= rationality, logic), just 10 challenge this prediction. We can therefore broadly confirm that the rarer sense of reason does indeed avoid combination with BE. The ten cases, however, represent a challenge, albeit not a large one, to the prediction and therefore need to be examined more closely, as we have done for previous sets of exceptions. One of these we have already seen in our consideration of prediction 2 – example 37 – and that needs no further discussion. The remaining nine instances are as follows:
    41. Human reason is a fallible guide, they murmur . . .
    42. But what comes across is the idea that reason is always dangerous . . .
    43. If sweet reason and turning a blind eye are not enough and you feel you have to use some form of punishment, do so without excessive anger or physical force.
    44. . . . Shaffer comprehends the revenge motive while suggesting that reason is valueless unless informed by feeling.
    45. . . . nobody has ever suggested that reason is what gets things done in the Middle East.
    46. ‘Reason is a political entity, and never more so than when its claim is to have transcended politics’ [quotation from Stanley Fish].
    47. Sadly, balance and reason have been rare on both sides.
    48. Lust and reason are enemies.
    49. ‘Beware of popular revolts when reason becomes helpless,’ he said . . .
      Three of these illustrate the coupling priming – 43, 47 and 48 – though 43 is untypical in having a coupling with a non-finite clause. This leaves six challenges to account for. Of these, two manifest characteristic collocations of reason (= rationality, logic). Human is a collocate of reason (= rationality, logic), the combination occurring in my data nine times (1 per cent of all occurrences), and sweet is a strong collocate, occurring 24 times with the rarer sense (3 per cent of all occurrences). Neither epithet occurs with reason (= cause) on any occasion.

      We can eliminate one more of the challenges by reference to semantic association. The rarer sense(s) of reason can be shown to have a semantic association with EMOTION with 46 occurrences of the association occurring in my data. (This is incidentally an instance of a semantic association that is not syntactically tied.) Other examples are:
    50. John Monie comes from that cerebral, calculating school of Australian coaching whose motto might be ‘Reason before Emotion’.
    51. And it is true that the cult-minded people I’ve known as an adult mistrust their reason with a passion I find terrifying.
    52. Jill Barton, of the Surrey Wildlife Trust, spends much of her life trying to reinstate reason into a county gone mad for roads.
      This semantic association can be seen in example 44 (feeling of course being the EMOTION component), and, like the coupling and the collocates told, human and sweet, has the effect of counterbalancing the intrusion of reason (= rationality, logic) into the territory of reason (= cause).

      We are finally left with three challenges to the prediction – examples 42, 46 and 49. Two of these are in quotation marks, which to some extent separates them from their immediate textual surroundings. Also, all three have of course no determiner, so they are in that respect in conformity with the characteristic priming for reason (= rationality, logic). Neither factor is sufficient to fully counterbalance the effects of choosing the reason (= rationality, logic) + BE etc. combination. (After all, with good reason and by reason of also lack a determiner and reason (= cause) can begin a quotation.) However, given the frequency of BE in the language, I would argue that three instances (0.1 per cent of the data) do not constitute a refutation of prediction (c).
      Prediction (d): there/PRONOUN BE reason

      The fourth prediction, made on the basis of the common primings of reason (= cause) in Complement function, was that the rarer sense(s) of reason would avoid the PRONOUN/there + BE + reason structure. This proves to be uncomplicatedly confirmed. There are only seven challenges to the prediction, and these are quickly distinguished from the uses associated with reason (= cause). Five have already been given as examples 21, 22, 24, 25 and 30 in our consideration of the occurrences of reason (= rationality, logic) with the deictics, and need no further discussion here. The remaining two challenges are:
    53. Just two doors away, in Committee Room 9, there was more clarity and reason . . .
    54. Fourth, they are anti-reason.
      One is a coupling; the other a nonce word using reason as one of its components. The fourth prediction is accordingly confirmed.
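      Candidate instances of the PRONOUN/there + BE + reason structure could be retrieved from a corpus with a very simple filter before being classified by hand. The Python sketch below is one hypothetical way of doing this and is not the retrieval method used for the analysis: the pronoun list, the list of BE forms, the four-word window and the function name has_pronoun_be_reason are all assumptions, and the filter makes no attempt to distinguish the two senses of reason.

```python
import re

# Illustrative, deliberately small word lists (assumptions, not exhaustive).
PRONOUNS = {"there", "it", "this", "that", "he", "she", "they", "we", "you", "i"}
BE_FORMS = {"am", "is", "are", "was", "were", "be", "been", "being"}


def has_pronoun_be_reason(sentence: str, window: int = 4) -> bool:
    """True if a pronoun or 'there' is directly followed by a form of BE
    and 'reason' occurs within the next `window` words."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    for i in range(len(tokens) - 2):
        if tokens[i] in PRONOUNS and tokens[i + 1] in BE_FORMS:
            if "reason" in tokens[i + 2 : i + 2 + window]:
                return True
    return False


for line in [
    "Just two doors away, in Committee Room 9, there was more clarity and reason",
    "Fourth, they are anti-reason.",
    "Lust and reason are enemies.",
]:
    print(has_pronoun_be_reason(line), "<-", line)
```

      Because the tokeniser splits on hyphens, the nonce word anti-reason is caught by the filter, which is why both of the remaining challenges (examples 53 and 54) would be retrieved for inspection.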
      Prediction (f): reason with for and as

      Prediction (e) was of course handled out of sequence in order to treat the deictics together. We are therefore left with our sixth and final prediction that reason (= rationality, logic) will avoid being preceded by for and as. As far as as is concerned, the data for reason (= rationality, logic) are clear-cut – reason (= rationality, logic) does not occur with preceding as, not even with sentence subordinator as. Examination of the data with regard to for, however, reveals seven challenges to the prediction. Two of these are straightforwardly instances of coupling:
    55. ‘It should also raise its voice for reason and dialogue in the face of the mounting intolerance and racism in our republic,’ he said.
    56. . . . he will have struck early for decisiveness and reason in policy making.
      A further two are also instances of coupling, but in one case the coordination is with a word sequence rather than a single word and in the other the coupling takes one of the minority forms. In the latter case the presence of for is accounted for by the fact that the verb substituting is typically primed to collocate with for.
    57. Applauded for his reason and willingness to compromise, Meyer was modernising ossified union structures . . .
    58. The writer seeks to draw attention away from his lack of cogent argument by substituting ridicule for reason.
      The remaining three, like 58, are explicable in terms of a competing collocational priming for for by a variety of words:
    59. Dr Kohl took a tough line on trade union wage demands and called for reason among striking workers. [collocation of called with for]
    60. His image is of a character who stands for reason . . . [collocation of stands with for]
    61. They can only hope that the doctors will eventually be grateful for cerebral sweet reason . . . [collocation of grateful with for]
      The last of the above examples, of course, also features the characteristic collocation of reason (= rationality, logic) with sweet. We can therefore safely conclude that once again a prediction generated by the first drinking problem hypothesis has been confirmed.

      More generally, we have found that all the predictions generated by drinking problem hypothesis 1 have been confirmed. It would seem to be the case that when polysemous words are primed in our minds, the rarer senses get primed in such a way as to ensure that their primings do not overlap with those of the most common sense.
      Drinking problem hypothesis 2

      We saw above that even where one sense of a word is overwhelmingly the most common, it still prefers proportionally to avoid the primings associated with the rarer senses; examples of this were the relative avoidance by reason (= cause) of possessive pronouns and capitalisation. For this reason (to use a by now all too familiar expression), I do not intend to linger long on drinking problem hypothesis 2. Much of the detail of the argument would be of the same kind as already presented. I shall simply look, without consideration of exceptions or detailed statistics, at a word with two senses that are clearly distinct and approximately as common as each other in my data. The word I have chosen is immunity, defined in Macmillan English Dictionary for Advanced Learners as follows:
      immunity
      1. [C/U] a situation in which someone is not affected by something such as a law because they have a special job or position: + from immunity from prosecution ♦ grant sb immunity They would be granted immunity if they gave evidence in court. DIPLOMATIC IMMUNITY
      2. [singular/U] the protection that someone’s body gives them against a particular disease + to It is possible to develop an immunity to many illnesses.
      The virtue of this polysemous word is that its separate senses are quite distinct (though not unconnected). There are therefore no problems in deciding which sense is intended in any instance. As we shall see in the next section, this is by no means always the case with polysemous senses.

      There are 574 instances of ‘legal’ immunity and 102 of ‘medical’ immunity in my corpus, which is a better ratio than that obtained for either consequence or reason but still leaves the ‘legal’ sense six times as common as the ‘medical’ sense. However, the figures mask the existence of a number of multi-word lexical items which do not require analysis. These are public interest immunity certificate, which occurs exactly 100 times, with public immunity certificate and interest immunity certificate accounting for another 62 cases between them. With these word sequences removed, the ‘legal’ sense drops to 412, leaving us with a ratio of very close to 4:1 in my data. Add to this the fact that the proportions are certainly distorted by the media’s predilection for stories about political intrigue and we can feel comfortable that the spread permits investigation of the second drinking problem hypothesis.

      Analysis of these data produces the set of contrasts shown in Tables 5.12, 5.13 and 5.14, which show the collocations, semantic associations and colligations of each sense compared with their (non-)occurrence in connection with the other sense.

      Table 5.12 shows the collocations of the two senses. Lemmata are capitalised. As can be immediately seen, the collocations associated with the one sense are avoided by the other.

      The semantic associations for both senses of immunity are given in Table 5.13.
      Table 5.12 A comparison of the collocates of the two senses of immunity

      ‘Legal’ immunity | ‘Medical’ immunity
      collocates with certificate (12 instances) (122 if omitted multi-word items are included) | no instances
      collocates with prosecution/prosecutor (67 instances) | no instances
      collocates with parliamentary (55 instances) | no instances
      collocates with LIFT (27 instances) | no instances
      collocates with legal (15 instances) | no instances
      collocates with CHARGE (15 instances) | no instances
      collocates with committee (11 instances) | no instances
      collocates with CLAIM (10 instances) | no instances
      collocates with diplomatic (10 instances) | no instances
      collocates with sovereign (7 instances) | no instances
      collocates with Crown (6 instances) | no instances
      no instances | collocates with NATURAL (11 instances)
      no instances | collocates with infection (8 instances)
      1 instance | collocates with ACQUIRE (5 instances)

      Table 5.13 A comparison of the semantic associations of the two senses of immunity

      Semantic association with | ‘Legal’ immunity | ‘Medical’ immunity
      GIVE | 71 instances | 6 instances
      TAKE AWAY | 115 instances | 5 instances
      SEEK | 38 instances | 2 instances
      CRIME | 13 instances | no instances
      GET | 2 instances | 16 instances
      INCREASE | no instances | 8 instances

      Examples of the semantic association of ‘legal’ immunity with GIVE, TAKE AWAY and SEEK are the following:
    62. . . . his lawyer tells me his client has been granted immunity from prison because of his age.
    63. Yesterday they offered virtual immunity from prosecution to those involved in minor acts during the riots if they came forward to give evidence.
    64. They will be stripped of their immunity when they give up their seats.
    65. . . . there will be a third attempt to cancel my MP’s immunity from prosecution.
    66. . . . investigators cannot call him in unless his parliamentary immunity is lifted.
    SEEK
    67. . . . a sovereign borrower can claim legal immunity or ignore a foreign court’s judgement by pleading ‘force majeure’ (superior power).
    68. As Bill Clinton seeks immunity from the latest charges against him . . .
      Examples of ‘medical’ immunity’s semantic association with GET and INCREASE are:
    69. You get an infection, you develop immunity, and you either get well or you die.
    70. This type of immunity can be naturally acquired in two ways . . .
    INCREASE
    71. . . . most flu sufferers build up an immunity as they recover.
    72. . . . a booster within six to 12 months increases immunity for up to 10 years.
The colligations for both senses of immunity are given in Table 5.14.

Table 5.14 A comparison of the colligations of the two senses of immunity

                                                    ‘Legal’ immunity    ‘Medical’ immunity
colligates with from + NG                           124 instances       2 instances
colligates with possessive                          112 instances       12 instances
colligates with classifiers                         116 instances       16 instances
colligates as noun modifier to a noun head          34 instances        2 instances
colligates with to + NG                             6 instances         25 instances

It will be seen that the two senses of immunity do indeed avoid each other’s primings (in the Guardian, at least). While this is not surprising in the case of the collocations, it being difficult to imagine contexts in which the ‘medical’ immunity could be parliamentary or the ‘legal’ immunity natural, it is not obvious in the case of the semantic associations and colligations. There is no self-evident reason why ‘medical’ immunity might not be given, for example by means of a course of treatment, nor any reason why such immunity might not be possessed, for example by a lucky patient. Similarly there is no reason why ‘legal’ immunity should not be ‘created’. Most strikingly, the very presence of the occasional from and to following the ‘medical’ and ‘legal’ senses, respectively, is evidence enough that there is no reason for believing that there are grammatical obstacles occurring in such a position with these senses. And yet the two senses for the most part do not stray into each other’s colligations.

Furthermore, the collocations, colligations and semantic associations listed account for a significant proportion of the data for each sense. So the colligations listed in Table 5.14, for example, account for almost two-thirds (271) of instances of ‘legal’ immunity. In total, 377 of the instances of ‘legal’ immunity in my data have one or more of the primings listed for it. In other words, 92 per cent of occurrences of ‘legal’ immunity are immediately distinguished from the other sense of immunity. Even ‘medical’ immunity, for which far fewer primings have been identified because of the relative sparsity of data, has 52 per cent of its instances covered by at least one of its primings. Drinking problem hypothesis 2 is therefore affirmed.
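The proportions quoted in this section can be re-derived from the raw counts. The following is a minimal sketch of that arithmetic only; the variable names are illustrative assumptions, and the counts are simply those reported in the text.

```python
# Counts for the two senses of "immunity" as reported in the text.
legal_total = 574            # all instances of the 'legal' sense
medical_total = 102          # all instances of the 'medical' sense
multiword_items = 100 + 62   # "public interest immunity certificate" and its two variants

# 'Legal' instances left once the multi-word items are set aside.
legal_analysed = legal_total - multiword_items

# Ratio of 'legal' to 'medical' before and after the adjustment.
raw_ratio = legal_total / medical_total
adjusted_ratio = legal_analysed / medical_total

# Share of analysed 'legal' instances covered by the listed colligations (271)
# and by at least one listed priming of any kind (377).
colligation_share = 271 / legal_analysed
primed_share = 377 / legal_analysed

print(f"legal instances analysed: {legal_analysed}")
print(f"raw ratio: {raw_ratio:.2f}:1, adjusted ratio: {adjusted_ratio:.2f}:1")
print(f"colligation coverage: {colligation_share:.1%}")
print(f"any-priming coverage: {primed_share:.1%}")
```

On these counts the adjusted ratio comes out very close to 4:1, the colligations cover just under two-thirds of the analysed ‘legal’ instances, and the overall coverage rounds to 92 per cent.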
Polysemy versus vagueness

I could be accused of having made my life easier in the previous section by choosing a word that has two distinct senses and for which ambiguity is a rarity. What happens, though, when a word has two senses that are closer than the legal and medical senses of immunity? Does the second drinking problem hypothesis still operate? The answer is yes and no. My analysis has depended upon my having been able to allocate every example to one or other sense. In fact, though, senses may blur. The word tea has two (or three) distinct senses – the drink, the leaves from which the drink is made and the meal at which light refreshment is taken. The first two can for some purposes be treated as a single sense, though there are clear differences in their use. The third, though, is on the face of it quite different and it is possible to partake of the meal tea and not drink any.

Nevertheless, in a range of contexts it is impossible to be sure whether what is being described is the meal tea or the sharing of a pot of tea. An example is the following:
    73. . . . the cost of an hour gossiping over tea can never approach the expense of a visit to a restaurant or wine bar.
A restaurant is somewhere where you would go for a meal, like the meal tea; on the other hand a wine bar is somewhere where you would go for a drink, like the drink tea.

Rather than allocate such instances to one or other sense, it seems better to say that such usages draw on both senses. This is not a contradiction of the drinking problem hypotheses, but a recognition that vagueness is sometimes sought after rather than avoided in language.

As a fuller instance of this, let us look at a word we have already considered in some detail that has two senses that are distinct but are much closer than immunity, namely reason. When I began my examination of the differences in the typical primings of reason (= cause) and reason (= rationality, logic), I noted that I was conflating in the rarer ‘sense’ two closely related but largely distinct senses – reason (= rationality) and reason (= logic). It is time to separate these senses and see what happens.

Rationality is part of the apparatus of the human being, the human’s ability to reason. Logic, on the other hand, is abstract, belonging to no one in particular. There are 455 instances of reason (= logic) in my data and 175 instances of reason (= rationality).

If we look at the characteristic primings of the two senses, we find that reason (= rationality) has the following typical primings, at least as regards newspaper data:
      • it colligates with possessives (17 per cent);
      • it has a semantic association with LOSS (12 per cent), with the collocation lost, which occurs 8 times, accounting for 40 per cent of the occurrences of the semantic association;
      • it collocates with sweet (24 occurrences), human (7 occurrences), voices of (8 occurrences), voice of (34 occurrences), sleep of (8 occurrences) and told (6 occurrences); in total these collocations, which with one exception occur singly, account for 51 per cent of instances of reason (= rationality).

Reason (= logic), on the other hand, never occurs with either possessives or LOSS. It also never collocates with sweet, human, voice(s), sleep or told, though it does occur with says. I therefore infer that reason (= logic) avoids the primings of reason (= rationality).

The primings of reason (= logic) are the following:
      • it has a colligational preference for coordination (or other coupling) with abstract nouns or word sequences, occurring in such combinations 217 times (48 per cent of all occurrences of the logic sense);
      • it has a colligational aversion to any deictic – if we discount instances of rhyme (n)or reason, which might justifiably have been treated separately along with within reason and stands to reason, there is only one instance of a deictic and that is from the quotation from Shakespeare already discussed (example 33).
It is not as clear as it was the other way round that reason (= rationality) avoids the primings of reason (= logic). I have already noted that reason (= rationality) has a positive colligation for the possessive deictics and it mops up all the other deictics associated with reason (= rationality, logic). However, as we have seen, there are relatively few of these. We cannot therefore argue that reason (= rationality) is avoiding the territory occupied by reason (= logic). Likewise, reason (= rationality) couples with another noun (abstract or otherwise) moderately rarely; there are 24 instances of such combinations (14 per cent of all occurrences of the rationality sense). This cannot however be seen as avoidance, though it certainly indicates a typical difference of priming for the two senses.

The truth is that reason (= logic) is recognisable as such when, and only when, one of its regular primings is apparent. Still more so, reason (= rationality) is only recognisable as such when one of its primings is present. The fact is that I recognise the ‘rationality’ meaning in these instances exactly because they are accompanied by possessives. If the primings are absent, the distinction between reason (= logic) and reason (= rationality) disappears. An instance of a merged sense might be:

    74. Galileo remains a seminal masterwork about the eternal struggle of reason against superstition and self-betrayal.
      As with the tea example, nothing is gained by a spurious attempt to resolve the vagueness. It simply does not matter whether a reader interprets reason here as rationality or logic. The distinction has been erased.
Drinking problem hypothesis 3 – ambiguity

Another example of the blurring of senses is the word sequence without good reason, which I have treated in my analyses above as an instance of reason (= cause), partly because there is an equivalent expression, without good cause. However, the use of reason here, with no deictic, can be thought to draw a little on the ‘logic’ sense of the word. The equivalent expression without good cause might be thought, similarly, to draw upon the sense of cause as a set of positions which one strongly supports (as in fights for the cause). The point, predictably enough, is that we are all primed slightly differently and may to different degrees draw on our primings to arrive at likely interpretations of what we hear.

There is a fundamental difference between vagueness and ambiguity. Vagueness occurs when it is unclear what precise meaning can be assigned. Ambiguity, on the other hand, occurs when there are two or more entirely precise meanings assignable to a sentence, and they are different meanings. The third drinking problem hypothesis refers to this sense of ambiguity when it proposes that, whenever one sense of a word invades the primings of another, the effect will be humour, ambiguity (momentary or permanent) or a new meaning combining the two senses.

It will be remembered that there was a genuine, albeit momentary, ambiguity in the sentence included in example 37, repeated here for convenience as 75:

    75. For George Bush to put the problems down to ‘the misguided actions of one man’ ignores both the facts and the roots of wider feelings. However, the focus on Saddam does not stop at substituting his personal reasoning for the collective experience of Arab peoples. There is also considerable speculation as to whether the Iraqi leader shows any reason at all. From the time Kuwait was invaded the sanity of Saddam began to be questioned. At first it was unclear whether the invective was meant metaphorically or literally . . .
    (© Guardian Newspapers Limited 1991)

The problem here is simple. In the first place, show collocates with reason (= cause), though the same vagueness that applied to without good reason applies here equally. Secondly, reason (= cause) is primed to occur with any, as we have seen. On the other hand, reason (= rationality) has, as we have seen, a preference for the Object function. The cohesive ties between reasoning and reason and between reason and sanity appear to resolve the ambiguity in favour of reason (= rationality), but, without these textual factors, we would presumably feel unable to resolve the ambiguity.

The textual dimension turns out to be important in the following two ambiguous cases of consequence:
    76. But Article 7.1(a) of this directive which allows members to retain equality in state pensions also refers to ‘the consequence thereof for other benefits’.
    77. His reading of Voltaire’s Philosophical Dictionary made him briefly an atheist (Boyer beat religion back into him, to lifelong effect), but of more lasting consequence was his discovery of twenty-one sonnets by a Wiltshire clergyman called William Lisle Bowles (1762–1850). The sonnets struck Coleridge with the force of revelation . . .

The ambiguity of the first arises in part from the fact that it has been recontextualised and we have no access to the source text. In other words, the absence of information on cohesion and other textual influences means that we are denied crucial, potentially disambiguating, information. The ambiguity also arises from the fact that priming is posited to be genre and domain specific. The quotation is from legal language and it is more than possible that a corpus of such language would reveal a law-specific priming of consequence (= result) for collocation with thereof; my newspaper-dominated corpus, however, attests only the single instance. Out of context, the only priming that the example displays is the presence of the, which strongly points to the meaning being that of ‘result’. Otherwise the grammatical pattern in which it appears is characteristic of neither sense of consequence. The use of for is a weak priming of consequence (= importance), with 4 per cent of instances of this sense occurring with for. This priming is reinforced slightly by the presence of a parallel weak priming for consequence’s synonym importance. On the other hand, less than 1 per cent of instances of consequence (= result) are postmodified by for. Both senses make sense in the context. In the end, though, the presence of the, the guessed-at collocation of thereof and the simple improbability of a legal document needing to refer to the importance of anything make this reader resolve the ambiguity in favour of ‘result’.

The second instance is perhaps the more interesting and the more genuinely ambiguous. The presence of of before consequence points apparently unambiguously to the ‘importance’ sense. The fact that it is thematised, however, runs counter to the typical priming of consequence (= importance), this being associated typically with consequence (= result), though we saw earlier that thematisation of consequence (= importance) is associated with affirmation of importance rather than denial, and this instance is being affirmed. Out of context, then, there is no ambiguity. However, the sentences before and afterwards are full of the language of cause-effect. Here is the example again in the full paragraph in which it appears:

    78. The destruction of the Bastille in 1789 drew from Coleridge impassioned verse in praise of freedom and ‘glad Liberty’, and the fevered excitement inspired throughout Europe by the early days of the French Revolution provided the intellectual climate in which his radical conscience began to form. His reading of Voltaire’s Philosophical Dictionary made him briefly an atheist (Boyer beat religion back into him, to lifelong effect), but of more lasting consequence was his discovery of twenty-one sonnets by a Wiltshire clergyman called William Lisle Bowles (1762–1850). The sonnets struck Coleridge with the force of revelation, seeming to him, in their natural use of language and heartfelt expression of personal feeling, unlike anything he had ever read. He made over 40 transcriptions of the sonnets ‘as the best presents I could offer to those who had in any way won my regard’, and in his own poetic experiments of the next few years found an important model in the work of the now-forgotten Wiltshire poet.

As can be seen, the whole paragraph (which incidentally comes not from the 96 million word Guardian chunk of the corpus but from the British National Corpus supplement) is concerned with different effects on Coleridge, with all the words and word sequences marked in bold signalling a different effect or affect. The textual dimension appears to forcibly support the interpretation of consequence here as ‘result’. But then again the description of the model in the last sentence as important counters this somewhat, since it could be interpreted as cohesive with consequence, both arguably referring to the importance of the discovery of Bowles’ work to the development of Coleridge as a poet.

The truth is that this use, unlike example 75 above, is permanently ambiguous, and whereas the textual dimension resolved the ambiguity in the reason example, here it creates the ambiguity. In short, the ambiguity in this instance arises out of a conflict between a local priming and a textual pattern. Does this mean that we have reached the limits of what priming can account for? Do we need to posit two pressures on the language user – the pressure of the lexical priming and the pressure of the textual imperative? In preparation for that question, the next two chapters concern the textual dimension to lexical priming.

6 Lexical priming and text: two claims

Some claims about textual priming

Lexical priming has so far been talked about in terms of sentence-internal features, though references to priming for Theme have referred to textual colligation and have thereby hinted at a textual dimension. The discussion of ambiguity at the end of the previous chapter, however, showed that a theory of lexical priming needs to address the issue of how text organisation might be affected (or created) by lexical priming, and how conversely it might affect it. Ideally, text should refer here to both speech and writing, but what follows is limited for reasons of space and corpus construction to written text only. I shall argue in Chapter 8 that there are textual and pragmatic matters that stand separate from lexical priming, but we should not be in a rush to dismiss discourse issues as independent of and unconnected with the matters we have been considering in this book. In this chapter I shall argue that lexis is in fact intimately bound up with decisions of discourse organisation.

As a way of arguing this position, I ask the reader to consider once again the clause that begins Bill Bryson’s Neither Here Nor There:

  1. In winter Hammerfest is a thirty-hour ride by bus from Oslo . . .
If all you knew was that this was the first clause of a text, I think you would have certain, fairly clear expectations about the way it might develop. You would expect there to be further discussion of Hammerfest but you would not expect the text to be about Oslo or winter, though you would be unsurprised if the ride was discussed. You might expect some explanation as to where Hammerfest is and why it takes so long to get there. These expectations would become near certainties if you were told (as you already have been) that the clause in question in fact begins a travel book.

The first kind of expectation is associated with the cohesion of the text, the second with its semantic organisation, and they would normally be discussed as text-linguistic features. I have myself discussed both features in such terms; Hoey (1991a), for example, deals with the properties of cohesion and in Hoey (1983, 2001) I consider some of the features of text organisation. Here, however, I want to suggest that your expectations are part of your priming of the vocabulary of the clause. Your experience of the word sequence in winter has led you not to expect cohesive chains of lexical items referring to winter and still less to anticipate that a text beginning with this word sequence will be about winter. Your experience of place names in Subject function in travel writing however has strongly predisposed you to expect a chain of lexical references to the place, with the place as topic. Likewise your experience of place names has probably been such as to expect mention of location and transport. (Or maybe not – as I have repeatedly emphasised, priming is personal and varies from language user to language user.) In short, the lexis in Bill Bryson’s sentence – in all sentences – is textually primed. Just as many features of our personality are apparently latent in our genes from the moment of our birth, so also many of the features of a text – its organization, its cohesion, its chunking – are latent in the lexical items we select. (The analogy is of course inexact, in that no one, I assume, ever composes, still less utters, a text, knowing in advance what the whole text will look like, but then one cannot predict from a child’s genes how he or she will turn out.) At the end of this and the next chapter, we will look at how, and to what extent, Bill Bryson conforms to our expectations.

I want to argue that words may be textually primed in three ways:
    • Words (or nested combinations) may be primed positively or negatively to participate in cohesive chains of different and distinctive types (textual collocation).
    • Words (or nested combinations) may be primed to occur (or to avoid occurring) in specific types of semantic relation, e.g. contrast, time sequence, exemplification (textual semantic association).
    • Words (or nested combinations) may be primed to occur (or to avoid occurring) at the beginning or end of independently recognised discourse units, e.g. the sentence, the paragraph, the speech turn (textual colligation). (We have considered positioning within the sentence in earlier chapters and here greater attention is given to the supra-sentential aspects of textual colligation.)

Throughout this book I have been at pains to stress that priming is genre and domain specific in the first instance, though there are many primings that apply across generic and domain boundaries. Predictably, the specificity of the priming is at its greatest when the priming relates to discourse properties, since these have been shown to vary greatly according to genre and text purpose (e.g. Swales 1990; Bhatia 1993).

The claims are formulated in such a way as to permit of the possibility of a lexical item having not only a positive or negative priming but also neutral priming with regard to each of the named features. This distinguishes these potential primings from those so far mentioned. I would hypothesise that all words are primed for one or more collocations, semantic associations and colligations, even if these are on the face of it unremarkable; the ubiquity of colligation in particular is discussed in Chapter 8. I do not however hypothesise that the three kinds of textual priming listed above are properties of all words (or their nested combinations). Of course, if an overwhelming majority of words in English were to prove to be not primed in any of the ways mentioned, we would have to regard it as refutation of the claims I am making, or at the very least as limiting their applicability and interest value.

Claim 1: words (or nested combinations) may be primed positively or negatively to participate in cohesive chains (textual collocation)

As was noted in Chapter 1, collocation is usually regarded in corpus linguistics as a local phenomenon characteristically operating within a short distance of the word in focus (usually three words before and after) and yet the term has also been used in text linguistics (by Halliday and Hasan 1976) to describe the relationship between lexical items in a text that helps to create cohesion in the text – relationships such as bee – honey, door – window, candle – flame and mountaineering – peaks (all examples drawn from Halliday and Hasan 1976: 286–7). I want to suggest that the two apparently contradictory uses of the term can be reconciled if we see the cohesive links a word forms in a spoken or written text as its ‘textual collocations’. Scott (1999), in his help file, distinguishes ‘coherence collocates’ from ‘neighbourhood collocates’, a distinction that seems to make a similar point.

If Halliday and Hasan’s examples seem plausible (as they do to me at least), it is because the lexical items they cite are primed to co-occur not in a span of three or four words on either side but in the larger textual environment. So just as ride is primed to collocate with bus, so it is also primed to occur with travel and, importantly, with other occurrences of ride across sentences/utterances in a discourse – and, interestingly and by happy chance, ride is one of the words considered by Halliday and Hasan (1976: 287), and the examples just given of textual collocation are drawn from their discussion. In a particular text, if the writer goes along with their textual collocational priming, the result will be a text that contains cohesive ties between ride and other instances of ride, travel and bus. Textual collocation is therefore what lexis is primed for and the effect of the activation of this priming is textual cohesion. The textual collocation of a word with itself, which results in cohesion by repetition, and the textual collocation of a word with its proform, which results in cohesion by reference, are simply special (albeit numerically extremely common) cases of the more general phenomenon.

Cohesion can be seen as either an integral factor in the creation of coherence in a text (e.g. Halliday and Hasan 1976) or as an epiphenomenon of that coherence (e.g. Morgan and Sellner 1980), and both positions have some merit. It is not, however, necessary to decide between the two perspectives from the point of view of lexical priming. It is sufficient to note that cohesion is a recognisable phenomenon in a text and has been shown to correlate in interesting ways with coherence (e.g. Hasan 1984; Hoey 1991a, 1991b; Parsons 1995), and to recognise that part of our knowledge of a word is a knowledge of the ways in which it is capable of forming cohesive relations.

There are two claims here. The first is that one of the things we know about the words we use is that we expect them to participate in cohesion, or not. Put more formally, words (or nested combinations) may be primed to participate in cohesive chains or links, or to avoid them. The second claim is that we know what kind of cohesion to expect. Words are primed to participate in cohesive chains or links in quite predictable ways. A chain is here defined as three or more items linked by textual collocation, as here defined, including repetition. A link occurs when just two items are so connected.

A cohesive chain may be made up of simple repetition (i.e. repetition with no variation other than that allowed for by grammatical regularity); an example is that of the word claim in the previous paragraph. (Though that is not to say that claim cannot form other kinds of cohesive links.) Alternatively, it may be made up of simple repetition, co-referential expressions and pro-forms. Consider the following paragraphs:
  2. With a spare hour on my hands before lunch in Lebanon this week, I revisited the joys of my childhood, crunched my way across the old Beirut marshalling yards and climbed aboard a wonderful 19th-century rack-and-pinion railway locomotive. Although scarred by bullets, the green paint on the wonderful old Swiss loco still reflects the glories of steam and the Ottoman empire.

     For it was the Ottomans who decided to adorn their jewel of Beirut with the latest state-of-the-art locomotive, a train which once carried the German Kaiser up the mountains above the city where, at a small station called Sofar, the Christian community begged for his protection from the Muslims. ‘We are a minority,’ they cried, to which the Kaiser bellowed: ‘Then become Muslims!’

     (The first two paragraphs of ‘The irresistible romance of a steam train scarred with the bullet holes of battle’ by Robert Fisk, The Independent, Saturday 12 February 2005, p. 37)

A chain is begun with a wonderful 19th-century rack-and-pinion railway locomotive that continues with the wonderful old Swiss loco, the latest state-of-the-art locomotive and train. The longer a chain is, the more it appears to be related to the topic (though it needs to interrelate with other chains as well) (Hasan 1984).

A cohesive link is the same except that there are only two members in the chain. An example from the passage is the Ottoman Empire and the Ottomans, where the cohesion takes the form of simple repetition (Hoey 1991a). (There are no further mentions of the Ottomans in Robert Fisk’s text.) Cohesive links tend to be less closely associated with the topic of the text. Examples of words that form neither cohesive links nor chains in the passage (or in the remainder of the text) are hour and crunched.
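The distinction between a chain (three or more linked items) and a link (exactly two) can be made concrete with a small sketch. It assumes that an analyst has already decided which expressions are tied by textual collocation and has given each set a label; the function name, the labels and the data structure are hypothetical and serve only to illustrate the definitions as applied to the Fisk passage.

```python
from collections import defaultdict

def classify_cohesion(mentions):
    """Group analyst-labelled mentions and classify each group by size.

    `mentions` is a list of (label, expression) pairs, where the label marks
    expressions the analyst judges to be tied by textual collocation.
    Three or more items form a chain, exactly two form a link, and a single
    item takes part in no cohesive relation.
    """
    groups = defaultdict(list)
    for label, expression in mentions:
        groups[label].append(expression)

    classification = {}
    for label, items in groups.items():
        if len(items) >= 3:
            classification[label] = "chain"
        elif len(items) == 2:
            classification[label] = "link"
        else:
            classification[label] = "none"
    return classification

# Hand-labelled mentions from the two paragraphs quoted above.
fisk_mentions = [
    ("locomotive", "a wonderful 19th-century rack-and-pinion railway locomotive"),
    ("locomotive", "the wonderful old Swiss loco"),
    ("locomotive", "the latest state-of-the-art locomotive"),
    ("locomotive", "a train"),
    ("ottoman", "the Ottoman empire"),
    ("ottoman", "the Ottomans"),
    ("hour", "a spare hour"),
    ("crunched", "crunched"),
]

for label, kind in classify_cohesion(fisk_mentions).items():
    print(label, kind)
```

The sketch does none of the analytical work of deciding what counts as a cohesive tie; it only applies the size threshold once those decisions have been made.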
    Positive priming for cohesionAs noted above, the priming I am positing operates at two levels. At the first level, a word is primed to participate in cohesive chains or to avoid them. At the second, the type of textual collocation that the word is primed for is specified. Starting with the first level, all the following items are examples of words typically primed to participate in cohesive chains (or in cohesive ties) in news- paper writing under specifiable conditions: armyBlairgayplanetpoliticalyear. The list is intended to indicate the range of words that are primed for cohesion. So we have common nouns, names and descriptive adjectives. As will be apparent, the words listed are, in their major senses, non-evaluative and with clear denotations. Though it may be the case that few words that are evaluative or have weak denotations (e.g. ridiculousmakeaction) are primed for cohesion except in special domains and genres, it certainly is not the case that non- evaluative and readily-defined words are always so primed. Words such as thirty and hour are not primed for cohesion in travel writing; the semantic set PLACE NAMES, on the other hand, in this particular kind of writing is so primed, but onlystrongly so when it is in Subject function – an instance of cohesive priming operating on a nested combination of PLACE NAME  SUBJECT. I have noted elsewhere in this book that such priming always starts for each of us as the priming of an individual name, say Hemel Hempstead or Cromer. From individual instances it is hypothesised that the priming is transferred to the set.To show the way that cohesive priming works and the limitations of its operation, let us look more closely at the first of the instances in the list given above. The word army is typically primed for participation in cohesion. A sample of 65 different texts was examined which contained the word army. 
Only in- stances with a lower case ‘a’ were considered, since it could not be assumed that the behaviour of Army would be the same. Titles were also not considered, and the statistics that follow for this and other words do not include cohesion between title and the body of the text. For reasons that will be explained below, the combination army of was excluded, too. Although concordances were usedfor the basis of the selection, great care was taken to ensure that each instance taken was drawn from a different text. (Otherwise the statistics would of course have been grossly distorted; a single text with 13 instances of army would for example have contained a fifth of the data.)Of the 65 texts examined, 23 used the instance of army within a cohesive chain (which often of course contained further instances of the same word) and 15 used it in a cohesive link, which means that 58 per cent of the texts that contained the word army used it cohesively. This suggests that a reader used to reading the Guardian might become primed to expect it to participate in cohes- ive relations.The observation just made started from the point of view of the text. If we focus on the word, the picture is still clearer. The proportion of instances of army that are cohesive is as already noted very much higher than the proportion of texts containing the word – a text using army cohesively will usually contain two or more instances of the word, whereas with only a few exceptions a text that does not utilise army cohesively will only contain one instance of the word. Looked at from the point of view of the word, 81 per cent of all instances of army examined were found to be being used cohesively.The first thing to note is that, here as previously, priming operates differentlydepending on the nesting. Instances of non-cohesive army are almost one and a half times more likely to make use of the colligational priming army  NOUN than are instances of cohesive army. 
Perhaps more important, and certainly statistically more secure, is the fact that when army makes use of the collocation army of, its tendency to be cohesive drops dramatically, and, as already noted, all instances of army of were excluded from the analysis described above. The reason is that this collocation is itself primed for metaphor: 63 out of 108 in my data are metaphorically employed (the battered army of home-owners, an army of general practitioners, Britain’s huge army of investment salesmen). Of the remaining 45 non-metaphorical instances, 15 are the word sequence army of occupation and 10 utilise the semantic association army of NUMBER. Interestingly, these particular combinations conform to the original generalisation about the positive priming of army for cohesion. So what we have is (for most users, in the context of newspaper writing) a positive priming of army for cohesion, except where army is followed by of, in which case there is a negative priming for cohesion, unless what follows of is occupation or NUMBER in which case once again we have a positive priming for cohesion. Such complexity is expected to be routinely the case with priming (and not only in connection with textual priming).
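The nested priming just described for army can be read as an ordered set of conditions, with the more specific contexts overriding the general tendency. The following Python sketch is only an illustration of that reading. The function name `army_cohesion_priming`, the token-list input and the crude digit test standing in for the semantic set NUMBER are all assumptions introduced here; they are not part of the analysis reported above.

```python
def army_cohesion_priming(tokens, index):
    """Return 'positive' or 'negative' for the instance of 'army' at tokens[index].

    Hypothetical helper: it encodes only the three nested conditions stated in
    the text and ignores every other priming that might be in play.
    """
    following = tokens[index + 1:index + 3]  # up to two words after 'army'

    # General case: army (not followed by 'of') is positively primed for cohesion.
    if not following or following[0].lower() != "of":
        return "positive"

    # 'army of' + occupation or NUMBER restores the positive priming.
    if len(following) == 2:
        head = following[1].lower()
        # Assumption: a string of digits stands in for the semantic set NUMBER.
        if head == "occupation" or head.replace(",", "").isdigit():
            return "positive"

    # Otherwise 'army of' is primed for metaphor and against cohesion.
    return "negative"


examples = {
    "the army advanced": army_cohesion_priming("the army advanced".split(), 1),
    "an army of general practitioners": army_cohesion_priming(
        "an army of general practitioners".split(), 1
    ),
    "the army of occupation": army_cohesion_priming(
        "the army of occupation".split(), 1
    ),
}
```

The order of the tests matters: the most general priming is checked first, and each nested combination is consulted only when the less specific one applies.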
Negative priming for cohesion

In our consideration of army, our starting point has been that it is positively primed, unless otherwise affected. Some words are on the other hand negatively primed (e.g. asinine, blink, crossroads, elusive, particularly, wobble).

All the lexical items in this list appear to be primed to avoid occurring in cohesive chains, though they are no less frequent in my corpus than the items in the previous list. Indeed it is worth dwelling on that fact. One might have predicted that a word’s infrequency in the language would make it less available for participation in cohesive chains, but frequency does not seem to be an important factor. As I noted in Hoey (2004a), however, it is much harder to establish that something does not occur and it is a painfully slow process to move from each concordance line into the original text to check for possible cohesion. The items in the above list must therefore be regarded only as provisional.

As before, the list is indicative of the range of the words primed to avoid cohesion and again I present the evidence for the first only. There were 32 instances of asinine in my corpus and of the texts in which they occurred, all but one used it in neither a cohesive chain nor a cohesive link. The solitary exception makes use of near-synonyms (the chain being stupid – asinine – inanities – stupidity – imbecilities – inanity) but not of repetitions. It is perhaps worth adding that this negative priming for cohesion is not entirely explicable in terms of its being an evaluative adjective; so, after all, is racist and that is positively primed for cohesion.

As we saw in Chapter 2, in our discussion of sixty and 60, when it comes to priming, orthography cannot be ignored. The word crossroads does not appear to favour appearing in cohesive chains and so is included in list 2; the word Crossroads, on the other hand, is the title of a former British soap opera, and, given the British love of soap opera, chains freely.
Priming for cohesive links or short chains

Some words do not participate in cohesive chains but occur quite frequently in cohesive links. This points to the fact that there are words which are primed to form only brief chains or to participate in cohesive links without chaining. Instances are the following: ago, option, reason, sixty.

With these words, the pattern is for them not to be part of long cohesive chains. However, they are primed to occur in cohesive links or in short cohesive chains (one option … another option … a third option; sixty … 35 … five). These chains characteristically are localised within the larger text.

Once again I will illustrate the claim with the first item on the list. I examined a sample of 40 texts containing one or more instances of ago with respect to whether ago formed cohesive chains. I found that only 2 texts out of the 40 (5 per cent) contained chains of ago, though a much higher proportion used ago for a single link – 16 (40 per cent) contained such a link. The remaining 22 texts (55 per cent) only had non-cohesive instances of the word. From the lexical perspective, examination of 50 instances of ago revealed that only 6 of them occurred in cohesive chains. The rest divided evenly between forming a single link and forming no cohesive relations whatsoever. Although examination of 40 texts is a time-consuming business, the data cannot be considered sufficient for confident generalisation, but the evidence we have allows us to claim tentatively that ago is negatively primed for cohesive chains but positively for cohesive links.
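The two perspectives on ago (texts versus instances) can be set side by side. The sketch below simply recomputes percentages from the counts reported above. The dictionary layout and the function name `shares` are illustrative choices made here, and the split of the 44 remaining instances into 22 and 22 is an inference from the statement that they "divided evenly".

```python
def shares(counts):
    """Convert raw counts to whole-number percentages of their total."""
    total = sum(counts.values())
    return {label: round(100 * n / total) for label, n in counts.items()}


# Counts taken from the text: 40 texts containing ago.
ago_texts = {"chain": 2, "single link": 16, "non-cohesive": 22}

# 50 instances of ago: 6 in chains; the remaining 44 are read here as 22 and 22.
ago_instances = {"chain": 6, "single link": 22, "non-cohesive": 22}

text_shares = shares(ago_texts)
instance_shares = shares(ago_instances)
```

On these figures chains account for 5 per cent of texts and 12 per cent of instances, while single links account for 40 and 44 per cent respectively, which fits the tentative claim that ago avoids chains but accepts links.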
Priming for type of cohesion

I mentioned earlier that the nature of the cohesive chains for each lexical item varies, and we have just seen an example of this. Whereas option is almost always repeated without variation, sixty is normally part of a chain containing co-hyponyms (i.e. other numbers). This is normal and a part of our priming with regard to cohesion. Each word that is primed positively for cohesion is also primed, I want to argue, for a particular type of cohesion. Thus in a particular genre or domain a word may not be primed to appear in chains made up of simple repetition but it may on the other hand be primed to occur in cohesive chains made up of its hyponyms.

The word gay will serve as a useful example. Fifty texts were examined that contained one or more instances of the word gay (in its sexual sense) and 36 of these (72 per cent) participated in cohesive chains or links (25 chains, 11 links). From the perspective of the word, there were 131 instances of gay in the 50 texts, and 116 of these were cohesive, meaning that 89 per cent of instances of gay in my newspaper data were cohesive. (In passing it is worth noting that the situation was strikingly different with the non-sexual sense of gay, of which there were no cohesive instances. Textual priming, therefore, like other kinds of priming, contributes to the distinction between polysemous uses of a word.)

The point here, though, is not that gay (in the sexual sense) is primed for cohesion. That point, after all, has already been made with army. The point here is that gay overwhelmingly tends to occur in chains made up of simple repetition; i.e. gay is repeated as gay, which is then repeated as gay and so forth. Of the 25 chains, 20 are either exclusively or predominantly made up of reiterations of the word gay. The synonym homosexual (or the noun homosexuality) occurs in 18 of the chains, but in only 5 do its occurrences outnumber those of gay. Other synonyms virtually never occur.
These results are in line with those of Baker (2005) who shows how gay and homosexual belong to separate discourses.

Ago, which we saw was primed for links and very short chains, is another instance of a word that is primed only for simple repetition, the only exceptions being arguable links with yesterday (= a day ago) and last year (? = a year ago).

On the other hand, the word planet functions rather differently. Like army and gay, planet favours cohesive chains. As with army, the picture is complicated by the non-participation in cohesive chains of specific combinations such as on the planet (which almost always references the all-embracing nature of a generalisation) and this planet (which usually refers to Earth). Even with these included in the statistics, however, planet forms cohesive chains in 47 per cent of texts in which it appears. Crucially, though, while it does sometimes form chains of simple repetition, it appears just as frequently in chains with its hyponyms (e.g. planet – Uranus – Saturn – planets – Pluto). So planet is primed, for Guardian readers at least, for hyponymy (and meronymy).

More briefly, other words are primed to occur in chains where pronouns are common. Blair is an instance of this (e.g. Mr Blair – his – Mr Blair’s – Tony Blair – Mr Blair’s – his). Blair, like Hammerfest, is an instance of a claim that applies to a semantic set PERSONAL NAME.

I hope that these points are obvious. There is no hyponym to Blair and everyone knows he is male and that therefore the masculine pronoun will apply. Likewise, if the domain is the solar system, everybody knows that planet is bound to occur in a cohesive chain with hyponyms. But that very obviousness is evidence of the correctness of the claim that the cohesive properties of the word are built into the word itself. This is why Emmott (1989, 1997) and Sinclair (1993, 2004) can argue that cohesion is prospective.
Sinclair overstates the position in claiming that the whole of the previous text is encapsulated in the sentence currently being read and Emmott’s position is closer to my own (and indeed has been influential upon it) in that she finds a psychological explanation for the operation of cohesion, but the fundamental insight of both is sound when they claim that we do not constantly refer back to the previous text, as the literature on cohesion would have us believe. The reason why we do not refer back, I would argue, is that we are primed to expect cohesion of particular types for particular words and therefore anticipate its occurrence in advance of its appearance.
Claim 2: every lexical item (or combination of lexical items) may have a positive or negative preference for occurring as part of a specific type of semantic relation

Claim 1 looked strange but it was in fact merely an extension of the notion of collocation to take account of a word’s long recognised ability to collocate across sentences, though collocation here was importantly extended to include collocation with itself; in other words, lexical repetition. The second claim is similarly strange at first sight but is in fact simply an extension of the notion of semantic association. Indeed, as we shall see, in some of its manifestations it is indistinguishable from the kinds of semantic association we were looking at in Chapter 2. The claim is that every lexical item (or combination of lexical items) may be positively or negatively primed for occurring as part of a specific type of semantic or pragmatic relation or in a specific textual pattern (e.g. contrast, comparison, time sequence, cause-effect, exemplification, problem-solution). Such semantic relations or discourse patterns may be textual, i.e. the relations between clauses or parts of clauses or between larger chunks of text, or they may reflect and incorporate relations between a speaker and a listener of the kind described in conversational analysis (e.g. Schegloff 1972; Schegloff and Sacks 1973) or in discourse analysis (Sinclair and Coulthard 1975), where the relation between a speaker or writer’s utterance is in focus. (They may also reflect the interaction between writer and reader.)

I start with sixty, whose cohesive priming was briefly referred to on page 120, and which we have looked at in previous chapters.
Examination of the textual contexts of 100 instances of the word in my corpus revealed that 41 occurred in a contrast relation, 16 participated in a non-contrastive comparison relation and 37 occurred within the problem component of a problem-solution pattern (Winter 1974; Hoey 1979, 1983, 1993, 2001; Jordan 1980, 1984). This left only 21 instances not accounted for. (The figures fail to add to 100 because a clause can be in more than one textual relation.) So the evidence supports the claim that in newspaper writing sixty is strongly primed for use in contrast relations and as the problem component of problem-solution patterns and weakly primed for use in non-contrastive comparison relations.

We find a very similar situation with ago, the cohesive properties of which were also considered above. Inspection of 100 instances of thematised ago showed that it was strongly primed for occurrence in contrast when it is part of Theme, with 55 appearing as part of a contrast relation and a further 16 appearing in some kind of comparison relation. (The proportions rise still further if instances of not long ago and as long ago as are discounted.)

As with other kinds of priming, it may be nested combinations that are primed, rather than individual words. So, for example, the combination of The ADJECTIVE + side in sentence-initial position has primings that appear to be more distinct than those found for uses without the adjective or the initial positioning. I examined 137 instances of this combination (excluding all national, county and continental adjectives, which were suspected to have primings of their own). Eleven were in titles, and these were discounted, since titles have a different relationship to the rest of the text from that of normal clauses. This left 126 instances available for analysis. Of these, 58 (46 per cent) were part of a contrast relation and 19 (15 per cent) were part of a close parallelism.
(Nine of these were in both relations.)

I am perhaps alone in making the claim that lexis is systematically primed for textual semantic association (Hoey 2004a) but I am certainly not the first to make the claim for individual lexical items. McCarthy (1998), for example, notes that got is associated with problem (an important element of problem-solution patterns). Hunston (2001) likewise gives several valuable examples of textual semantic association (though of course she does not use the term). She notes, for example, that the combination may not be is associated with contrast between ideal and more achievable. Similarly, she notes that feted as is associated with contrast.
A return to the Bill Bryson sentence

As this book has developed, I have sought to relate, wherever possible, the different aspects of priming to the sentence of Bill Bryson’s with which this book began (more or less). With this in mind we turn now once again to his sentence to see how the claims made in this chapter might apply to it. So far we have only looked at the Bill Bryson sentence in splendid isolation; indeed one of the reasons for choosing the first sentence of a book to illustrate our claims was that it could stand on its own. However, we clearly cannot say much about its textual primings without quoting something of what follows. Space and copyright considerations prevent quotation of great swathes of Bill Bryson’s text, but it may be helpful to quote the first two paragraphs of the book and the first sentence of the third. I have numbered the sentences for convenience of reference:
(1) In winter Hammerfest is a thirty-hour ride by bus from Oslo, though why anyone would want to go there in winter is a question worth considering. (2) It is on the edge of the world, the northernmost town in Europe, as far from London as London is from Tunis, a place of dark and brutal winters, where the sun sinks into the Arctic Ocean in November and does not rise again for ten weeks.

(3) I wanted to see the Northern Lights. (4) Also, I had long harboured a half-formed urge to experience what life was like in such a remote and forbidding place. (5) Sitting at home in England with a glass of whisky and a book of maps, this had seemed a capital idea. (6) But now as I picked my way through the grey, late-December slush of Oslo I was beginning to have my doubts.

(7) Things had not started well. (8) I had overslept at the hotel, missing breakfast, and had to leap into my clothes . . .
In respect of priming, names are no different from any other words, except of course that it is a much more common experience to encounter a new – and therefore unprimed – name than it is to encounter an entirely new lexical item. Once a name is encountered a few times, though, it becomes primed in exactly the same way as any other word in the language. When you first read Bill Bryson’s sentence, Hammerfest was (I assume) unprimed for you. By now, it is perhaps heavily primed for association with winter, bus and Oslo! It is not therefore odd to start our consideration of textual priming in the Bill Bryson sentence by looking at the place-names. Hammerfest and Oslo share the same cohesive priming in that they are typically primed to collocate textually in travel writing with repetitions of their name, though the nesting of PLACE and subject greatly strengthens the priming. (Claims about the priming of Hammerfest are based on the behaviour of other place names in a corpus of travel writing made up of travel magazines; 500 instances were analysed in their textual contexts.) Repetitions of Hammerfest do occur in accordance with this priming but not within the passage quoted. They in fact occur 16 pages later after a number of amusing digressions, and the town gets the next chapter to itself. Oslo forms a desultory chain, with four further mentions in Chapter 1, one of them in sentence 6.

As place names, Hammerfest and Oslo are also primed for repetition by pro-forms it and there. The pro-form there linking with Hammerfest occurs in the second half of sentence 1 and it occurs in sentence 2. As regards the cohesion of place names in travel writing Bill Bryson therefore conforms to the typical priming. (In newspaper writing, other than travel writing, the propensity of place names to form cohesive chains and links appears, albeit without detailed study, to be much reduced.
All primings are in principle genre and domain specific, as has been remarked in several places – this is particularly marked of textual primings.)

The word winter in sentence 1 is only weakly primed for cohesion in newspaper text. (My travel corpus did not permit its study in travel writing.) Only 10 per cent of the texts examined (9 out of 90) showed any cohesive tendency and of these only 3 showed winter in cohesive chains (3 per cent). Looking at the same data from the lexical perspective, I found that 26 per cent of instances of winter were cohesive (exactly 100 were examined), split evenly between cohesive chains and cohesive links. The chains were all short. When winter does participate in cohesion in my data it is primed for simple repetition and avoids pronouns. This conforms to Bill Bryson’s usage, though the chain he produces in his first chapter is slightly longer than some (but this is likely to be where the priming for travel writing is inclined to differ most from that for newspaper writing).

However, the proportion of cohesive links rises dramatically if antonymous links are taken into account. Of the 90 texts considered, 24 (27 per cent) contained an antonymous relation between winter and summer, usually across sentence boundaries, and of these only 3 were already cohesive because of the repetition of winter. Bryson’s text partly matches this expectation. There are a number of references to summer in the first chapter, the last occurring only three sentences before a resumed mention of the Hammerfest trip.
Whether they would be read as cohesive – or as antonymous – is however less certain.

The words bus, ride and hour are all negatively primed for cohesion in both travel writing and newspapers; clearly, this would not be the case for texts in the domain of transport where the priming for bus and possibly the other items would presumably be positive.

Turning now to the second claim, the word Hammerfest is, as a member of the set of place-names, primed in travel writing for two types of textual semantic association, namely location and characterisation. These may not look like textual relations but they represent questions that a reader may ask of (part of) a text. So location answers the question:
    Where is X?
and characterisation answers the question:
    What is it like?
Neither feature has been much handled in text-linguistic terms, though Sutherland (1985) provides a useful preliminary text-linguistic account.

Both features may occur within the sentence or across sentences. An example of their presence within a single sentence (when of course they become temporarily indistinguishable from ordinary semantic association) is example 3:
  3. Lying in one of the most untouched pockets of tropical paradise in the Caribbean, with a coastline composed of a multitude of idyllic coves and harbours, it is no accident that Antigua is one of the world’s most popular honeymoon destinations.
The Bill Bryson text, however, illustrates the second option. It will be seen that both location and characterisation are provided in sentence 2. However, the characterisation is not expected to be this brief in the genre of travel writing, given the nesting of place, Theme and text-initial position. A fuller characterisation is needed and the beginning of it arrives at the very end of the chapter:
  4. We approached Hammerfest from above, on a winding coast road, and when at last it pivoted into view it looked simply wonderful – a fairyland of golden lights stretching up into the hills and around an expansive bay. I had pictured it in my mind as a village – a few houses around a small harbour, a church perhaps, a general store, a bar if I was lucky – but this was a little city. A golden little city. Things were looking up.

This long distance relation is quite typical of the interaction between writer and reader that contributes to the creation and interpretation of a text; the fulfilment of a reader’s expectation may be deferred endlessly. So if the word murdered occurs in Chapter 1 of a detective story, it will typically be primed textually such that the reader will expect an answer to the question ‘Who did it?’ The answer may be – indeed is extremely likely to be – deferred until near the end of the book, but the question arising from the priming of murdered will remain in the reader’s mind. In this respect, textual semantic association is quite unlike local semantic association of the kind described in Chapter 2.

There is another important respect in which textual primings such as textual semantic association differ from local primings. The primings for collocation, semantic association and colligation previously noted have helped distinguish the version Bill Bryson wrote from the relatively unnatural version that I offered in Chapter 1. When it comes to the textual primings discussed in this chapter, however, the differences between the natural and unnatural versions to some extent disappear. Both can, as first sentences, begin texts that utilise the textual collocations (cohesion) and textual semantic associations with which their shared lexis is primed, and their relative naturalness or unnaturalness is unaffected. So substituting my unnatural version for Bill Bryson’s original barely affects the coherence or naturalness of the opening of the book, the only change dictated by the substitution being the replacement of the pronoun at the beginning of sentence 2 with the full name of the town to be visited:

(1) Through winter, rides between Oslo and Hammerfest use thirty hours up in a bus, though why travellers would select to ride there then might be pondered. (2) Hammerfest is on the edge of the world, the northernmost town in Europe, as far from London as London is from Tunis, a place of dark and brutal winters, where the sun sinks into the Arctic Ocean in November and does not rise again for ten weeks.

On the other hand, overriding of the textual primings of the lexis in the first sentence may produce unnatural-sounding text, despite the naturalness of the sentence used. Consider, for example, the following:

(1) In winter Hammerfest is a thirty-hour ride by bus from Oslo, though why anyone would want to go there in winter is a question worth considering. (2) Bus rides always raise interesting questions about winter travel.

(3) I wanted to see a bus. (4) Also, I had long harboured a half-formed urge to experience what life was like on a bus ride. (5) Sitting at home in England with a glass of whisky and a book of maps, this had seemed a capital idea. (6) But now as I picked my way through the grey, late-December slush of Oslo I was beginning to have my doubts.

(7) Things had not started well. (8) I had overslept at the hotel, missing breakfast, and had to leap into my clothes . . .

I would not claim that this text is incoherent, but I think it is unnatural in much the same way and for much the same reasons that my version of Bill Bryson’s first sentence was unnatural. The textual primings of much of the lexis have been overridden. The priming of the place names Hammerfest and Oslo for cohesion by repetition and pro-form has not been followed, nor have the negative primings of bus, ride and hour for cohesion. The textual semantic associations of PLACE NAME have likewise been set aside. Consequently, the reader struggles to make sense of the otherwise reasonably natural second, third and fourth sentences. The other sentences, including the first, are unaltered.

The naturalness or unnaturalness of both sentences and texts depends on whether speakers or writers conform to or override the primings of the lexis they use. This has two implications. Firstly, it supports the view that lexical priming underpins linguistic choices from the syllable to the discourse. Secondly, it suggests that textual choices and local clausal choices, though driven by the same kinds of priming, are nevertheless partially independent, in that naturalness in the one set of choices is independent of naturalness in the other. This implication will be returned to in Chapters 8 and 9 when we consider creativity in language and the relationship of text and lexis. First, though, we need to consider a third kind of textual priming, which impacts less on questions of naturalness and more on questions of ordering and organisation.


7 Lexical priming and text: a third claim


The notion of textual colligation was first introduced in this book in Chapter 3. Our original definition of colligation included as one of its components ‘the place in a sequence that a word or word sequence prefers (or avoids)’, and this property of colligation was illustrated with consequence, both in connection with the thematisation of the phrases as a consequence and in consequence, and (in Chapter 5) in our consideration of the operation of the drinking problem hypotheses on the polysemous uses of consequence. In this chapter, however, in addition to providing further evidence for thinking that priming for Theme or Rheme is common, the notion of textual colligation will be extended to cover not only positioning within the sentence but positioning within the speaking turn, the paragraph, the conversation and the text, though limitations in the corpus I am working with mean that observations on the speaking turn and the conversation are sadly going to be brief and programmatic only.

Textual claim 3

Two textual claims were made in Chapter 6, with reference to textual collocation and textual semantic association. The third and final textual claim is the following, which will be seen to be an extension of the part of the definition of colligation just quoted: ‘every lexical item (or combination of lexical items) is capable of being primed (positively or negatively) to occur at the beginning or end of an independently recognised “chunk” of text’. When we encounter language in speech or writing, we are aware of the contexts and co-texts in which we encounter it – that has been an underlying assumption throughout this book. The claim I want to make at this point is that just as we are aware that words are typically used as part of Subjects or Adjuncts, so we are also aware of their textual position. So, for example, our awareness includes the knowledge that (as we have already seen) certain words tend to come early on in a sentence, while others tend to favour final position.

The examples in Chapters 3 and 5, with regard to the use of consequence as Theme, relate to priming for sentence-initial position, but a word may equally be primed for other positions in the sentence. As an example of the latter, it is interesting to note that reason (= rationality, logic) has a positive priming for end of sentence position. A massive 24 per cent of cases (154 instances) occur as the very last word of the sentences in which they appear. By contrast, only 817 instances of reason (= cause) are sentence-final, representing a more normal 6 per cent of cases. This also shows that the drinking problem hypotheses apply to textual primings as much as to other kinds of priming.

Another example of a word sequence primed for Rheme is provided by Bastow (2003), who notes that the writers of US defence speeches are primed to place the nominal group our men and women in uniform at the end of clauses (though of course he does not express the insight in the terms I am using). It is worth remarking that Bastow’s claim is actually more precise than saying that our men and women in uniform appears in Rheme; he is specifying a quite specific position for the word sequence – final position. My investigation of this aspect of textual colligation suggests Rheme is too big and crude a category (everything after the Theme) to permit interesting textual colligational claims, and that Bastow’s observation is more characteristic of the precision with which words and word sequences are primed.

This kind of textual colligation is a textual priming, rather than a grammatical priming. After all, the choice of Theme is in part affected by the textual surround. Although Halliday (1994) places Theme-Rheme analysis within his grammatical system, it belongs to his textual metafunction and is better seen in my view as a textual perspective and constraint upon sentence construction. As I shall attempt to show, Theme-Rheme is the tip of an iceberg in respect of our awareness of the textual environment in which we encounter the words we use. Just as a word may be primed to occur (or to avoid occurring) in first or last position in a sentence, so it may also be primed to occur (or avoid occurring) in first or last position in a paragraph, a section or a text. So, for example, consequence is not only primed to favour Theme, it is also primed to avoid paragraph-initial and text-initial position. The plural consequences, on the other hand, which is less strongly primed to occur as Theme, is positively primed to be paragraph-initial, though it shares the aversion of consequence for being text-initial.

With luck, your reaction to these claims will be that what I am saying is self-evident. Obviously writers will sometimes start paragraphs by noting a multiplicity of consequences and then use the next few sentences to spell out what they are. Equally obviously, a single consequence will be linked closely to its cause. If you do react this way, it is, I would argue, because you are primed to use these words in the textual ways I have described. You may then object that consequence and consequences have long been recognised to have special text organising functions (e.g. Winter 1977; Francis 1986, 1994) and therefore do not count as evidence for textual priming, but I hope what follows will convince you that they behave no differently from other words in respect of their priming for textual position.

One problem, though, must be faced from the outset and that is the difficulty of accessing sufficient data, as I mentioned at the beginning of this chapter. In a written corpus of nearly 100 million words, the number of paragraph boundaries and, of course, the number of texts are far fewer. The evidence for the claims I make in this chapter should be seen as suggestive rather than conclusive.

sixty (again)

Of 307 instances of sixty in my data, 208 are thematised, of which an exact 200 are the first word of the sentence. This means that sixty is strongly primed for occurrence as part of Theme.

The priming of sixty for sentence-initial position is not of course a surprise – indeed, given that many object to starting a sentence with a numeral, it is very much as expected. What however might be surprising is the fact that 14 per cent of all the sentence-initial instances of sixty in the newspaper data (9 per cent of all instances) are the first word in the text in which they appear, the first word being defined as either the first word of the title, subtitle or first full sentence. Slightly over a third of these (10) are in combination with years.

It might be objected that this finding is the product of the number of instances of sixty in sentence-initial position in my data and of the shortness of the texts in my newspaper corpus. If after all a word is put into sentence-initial position often enough in a corpus of short texts, it might well follow that it will appear frequently in text-initial position. If the texts in which sentence-initial sixty appears were on average ten sentences long, then there would be a one in ten chance of its being text-initial, without there being any need for an explanation in terms of priming. To check whether this was so, I therefore counted all the sentences in every text where sixty was the first word, and found that these texts averaged 20 sentences in length. There is therefore a one in twenty chance of sentence-initial sixty occurring at the beginning of a text on the basis of random distribution. Given, though, that the actual proportion of instances of sentence-initial sixty that are also text-initial is one in seven, sixty is occurring at the beginning of texts three times as often as it should do on the basis of random distribution.

We saw with army that textual priming for cohesion varies according to the operation or otherwise of other primings. The same is true for textual colligation. One of the common collocations of sixty is with per cent. If this collocation is adopted, the textual priming for text-initial position is overridden. Out of 91 instances of Sixty per cent in sentence-initial position, only three are also text-initial. If these are taken out of the equation, we are left with 26 text-initial instances of sixty occurring within a data bank of 117 sentence-initial instances. This means that nearly a quarter of all sentence-initial instances are text-initial and that sixty occurs text-initially about five times more often than random distribution would predict.
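The arithmetic behind these comparisons can be checked with a short sketch. It assumes, as the argument above does, that under random distribution a sentence-initial word has a one-in-N chance of also being text-initial, where N is the average text length in sentences. The function name `overuse_ratio` is a hypothetical label used for illustration only.

```python
def overuse_ratio(observed_hits, observed_total, avg_units_per_container):
    """Ratio of the observed rate to the rate expected under random distribution.

    Assumption: a sentence-initial item has a 1-in-N chance of also being
    text-initial, where N is the average number of sentences per text.
    """
    observed_rate = observed_hits / observed_total
    expected_rate = 1 / avg_units_per_container
    return observed_rate / expected_rate


# Figure reported in the text: the texts average 20 sentences.
AVG_SENTENCES_PER_TEXT = 20

# "One in seven" sentence-initial instances of sixty are text-initial.
all_sixty = overuse_ratio(1, 7, AVG_SENTENCES_PER_TEXT)

# With "Sixty per cent" removed: 26 text-initial out of 117 sentence-initial.
without_per_cent = overuse_ratio(26, 117, AVG_SENTENCES_PER_TEXT)

print(f"all instances: {all_sixty:.1f} times the random rate")
print(f"excluding 'Sixty per cent': {without_per_cent:.1f} times the random rate")
```

On these figures the first ratio comes out a little under three and the second between four and five, which is in line with the approximate values given in the text.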

The reasons why sixty begins newspaper texts are all related to the goal of newspaper production. Newspapers are more aware of their place in time than any other kind of discourse; a number of articles begin Sixty years ago . . . , a fact we shall return to below. An example is:

  1. Sixty years ago, a dying Elgar went to France to make peace with Delius and hear his music interpreted by a prodigy.

Furthermore, they have a need for particularity; Bell (1991) describes how precise statistics are a characteristic of newspaper writing. If an event affects sixty people, it may be a significant event, for example:

  2. Sixty schools were closed this week in southern Bulgaria as tension mounted between nationalists and ethnic Turks demanding language teaching for their children.

The fact that the choice of sixty is the product of external factors does not constitute a challenge to the notion of textual priming. In the first place, the text-initial priming of sixty does not extend to 60 (nor do many of its other primings – there is no association of 60 with vagueness, for example). So the choice of sixty over 60 is made simultaneously with one of the discoursal choices described above. Secondly, a Guardian writer is not obliged to place sixty in sentence-initial position. They could just as easily have written:

  3. A dying Elgar went to France sixty years ago to make peace with Delius and hear his music interpreted by a prodigy.
  4. Tension has mounted in southern Bulgaria between nationalists and ethnic Turks demanding language teaching for their children and sixty schools have been closed this week.

If these sound less likely as the beginnings of texts, it is only because they no longer contain the appropriate primings for text-initial position.

Thirdly, and importantly from the point of view of critical discourse analysis and sociolinguistics, the claim is that the priming of sixty for journalists (and consequently for their readers) is created in exactly the same way as all other primings. They simply have encountered numerous previous examples of sixty in text-initial position and unthinkingly reproduce the priming in their own writing, in so doing (re)creating and satisfying an expectation in the readers. Journalists do not think of writing articles that focus on the events of fifty-nine or sixty-one years ago, even though decades have no special value as a way of talking about changes in the world and even though the events of 59 or 61 years ago are presumably as interesting as those that happened sixty years ago. Likewise, because they have been primed by exposure to the writings of such journalists, readers would be puzzled to encounter articles beginning ‘Sixty-one years ago . . .’.

I have reiterated throughout this book that claims about priming have to be domain and genre specific and nowhere is this more true than in the area of textual priming. My suggested explanations for the fact that sixty is textually primed are entirely dependent on the nature of newspaper writing and have no application to, say, travel writing or academic texts. We shall see shortly that travel writing has its own primings and the same is likely to be true for academic articles too. Upton and Connor (2004) have shown that the different sections and moves that can be used to describe research papers (e.g. Swales 1990) differ in their expression of stance and Gledhill (2000) has shown how collocations differ in scientific papers, depending on whether the words are found in the abstract, introduction, methodology or conclusion sections. Given these demonstrated differences, it would be unexpected if at least the sections were not marked out in the manner I have described. Intuitively, for example, I would guess that the word recent might be positively primed for text-initial position in academic articles. Word sequences such as recent research, recent advances and recent developments seem familiar as text-opening gambits (though if there is one thing a corpus linguist quickly learns it is that their intuitions almost always simplify the picture or tell outright lies – primings affect recognition, but they seem to have little impact upon intuitions).

An experiment with paragraphing

With these results in mind, it seems worth revisiting earlier work on paragraphing. In the 1960s a series of influential papers were published that saw the paragraph as a structural unit, with a topic sentence that was then restricted and illustrated (Becker 1965, 1966; Christensen 1965, 1966). Support for this position was to be found in an important but neglected book by Robert Longacre (1968), which showed that some Philippine languages had special markers for paragraph boundaries and offered a structural description of the paragraph, based on tagmemic theory. Although Longacre’s evidence is convincing for the languages he describes, the evidence for paragraph structure in English has always seemed suspect (Rodgers 1966; Stern 1976; Hoey 1985), though years of teaching paragraph structure in freshman English classes will have had a priming effect and the structure may be truer today than it was when it was first proposed.

A key paper in the early description of paragraphing was Young and Becker (1966). Never properly published, presumably because it did not chime in with the theoretical interests of the time, but made available as a progress report, the paper described an experiment with paragraphing whereby a short extract from a monograph on American Civil War history was given to a small group of informants in de-paragraphed form and the informants were required to re-paragraph it. What rightly interested Young and Becker was the fact that there was a considerable measure of agreement among their informants as to where the paragraph breaks should come. At the time they interpreted this as evidence of the existence of paragraph structure; I suspect it was the only reasonable interpretation, given the emphasis on sentence structure in linguistics in the 1960s. Nearly 20 years later, however, I argued that the breaks their informants favoured were not made because they were recognising paragraph structural units but because they were responding to their perceptions of the way the text as a whole was structured (Hoey 1985). Either way, Young and Becker’s paper showed that paragraphing was not random.

In the light of evidence that some words are primed for particular paragraph and text positions, it seemed worthwhile to revisit Young and Becker’s experiment to see whether the informants were in fact revealing how they had been primed. I therefore re-conducted the experiment with a larger sample of informants in 1996. The informants in question were 67 first-year undergraduate students, who had not been taught about either priming or paragraphing (at least not at the university). I reported my findings in Hoey (1997c) and here reinterpret these findings in the light of the theory of lexis proposed in this book.

The passage Young and Becker used in their experiment, and which therefore I also used in mine, was the following, taken from Lincoln and His Generals by T. Harry Williams:

  1. Grant was, judged by modern standards, the greatest general
  2. of the Civil War. He was head and shoulders above any general on either
  3. side as an over-all strategist, as a master of what in later wars
  4. would be called global strategy. His Operation Crusher plan, the
  5. product of a mind which had received little formal instruction in the
  6. higher area of war, would have done credit to the most finished
  7. student of a series of modern staff and command schools. He was a
  8. brilliant theatre strategist, as evidenced by the Vicksburg campaign,
  9. which was a classic field and siege operation. He was a better
  10. than average tactician, although, like even the best generals of
  11. both sides, he did not appreciate the destruction that the increasing
  12. firepower of modern armies could visit on troops advancing across
  13. open spaces. Lee is usually ranked as the greatest
  14. Civil War general, but this evaluation has been made without
  15. placing Lee and Grant in the perspective of military
  16. developments since the war. Lee was interested hardly at all
  17. in ‘global’ strategy, and what few suggestions he did make to
  18. his government about operations in other theatres than his own
  19. indicate that he had little aptitude for grand planning.
  20. As a theatre strategist, Lee often demonstrated more brilliance
  21. and apparent originality than Grant, but his most audacious plans were
  22. as much the product of the Confederacy’s inferior military
  23. position as of his own fine mind. In war, the weaker side
  24. has to improvise brilliantly. It must strike quickly, daringly,
  25. and include a dangerous element of risk in its plans. Had Lee
  26. been a Northern general with Northern resources behind him he would
  27. have improvised less and seemed less bold. Had Grant been
  28. a Southern general, he would have fought as Lee did.
  29. Fundamentally Grant was superior to Lee because in a modern
  30. total war he had a modern mind, and Lee did not. Lee
  31. looked to the past in war as the Confederacy did in spirit.
  32. The staffs of the two men illustrate their outlooks. It would
  33. not be accurate to say that Lee’s general staff were
  34. glorified clerks, but the statement would not be too wide
  35. off the mark . . .

As I have noted in papers on Young and Becker’s experiment (Hoey 1985, 1997c), this passage is organised according to two major principles. In the first place it makes use of a matrix (Hoey 1991c, 2001), whereby a set of largely parallel questions are asked of two topics – Grant and Lee. The matrix can be set out as shown in Table 7.1. The sequencing of the text follows the first column down until the penultimate question and then moves to the second column, moving back to the first column for the final question. It would be logical therefore to mark the movement across the columns with a paragraph break. Such a break would occur at line 13. It would also be logical to mark the movement back with a break at line 29.

More subtly, there are two places where the parallelism of the two halves of the passage is not strictly maintained. The first occurs when evidence is provided for Grant’s superiority as a global strategist. It would be possible to mark the deviation from the symmetry either by marking a paragraph at line 4 where the ‘digression’ begins or at line 7 where it ends (but probably not at both). The second occurs when a substantial explanation is embarked on for the apparent greater brilliance of Lee. A break at line 23 would mark this.

Finally, the parallelism is weighted in the direction of Grant with Lee being unfavourably compared with him on several counts. It is therefore surprising that Lee comes off better as a theatre strategist. A paragraph break at line 20 would both mark out the surprising nature of the information and prepare the reader for the detailed explanation that follows.

Table 7.1 A matrix analysis of the Grant/Lee passage

| Question | Grant | Lee |
|---|---|---|
| Who was the greatest general of the Civil War? | Grant was, judged by modern standards, the greatest . . . (lines 1–2) | Lee is usually ranked as the greatest Civil War general . . . (lines 13–16) |
| Who was the greatest global strategist? | He was head and shoulders above any general . . . (lines 2–4) | Lee was interested hardly at all in global strategy . . . (lines 16–19) |
| What evidence have you for saying so? OR Give me an example | His Operation Crusher plan . . . (lines 4–7) | – |
| Who was the greatest theatre strategist? | He was a brilliant theatre strategist . . . He was a better than average tactician . . . (lines 7–13) | As a theatre strategist, Lee often demonstrated more brilliance and apparent originality (lines 20–23) |
| Why was this? | [answered within 7–13] | In war, the weaker side has to improvise brilliantly . . . [lines 23–28] |
| Who was the better and why? | Fundamentally Grant was superior to Lee, because . . . he had a modern mind and Lee did not . . . [lines 29–31] (a single answer spanning both columns) | |

The matrix analysis, it should be emphasised, does not imply that any of these paragraph breaks should be mandatory, but it would be strange if a large group of informants, given the passage to paragraph, did not among them take some account of the factors I have just described.

The second organising principle that shapes the passage is one of an argument containing statements of different level of generality. The argument’s pattern, somewhat crudely represented, is shown in Figure 7.1. Such a pattern would also justify breaks at lines 29 and 32.

[Figure 7.1 A partial representation of the organisation of the passage. Labelled components: Grant (lines 1–13); Lee (lines 13–28); (lines 29–31); (lines 32–35 etc.)]

So much for the organisation of the passage and the places where paragraph breaks would be motivated. How did my informants in fact choose to paragraph the passage? They were asked to paragraph the passage above, which was in exactly the format given here with the lines numbered; indeed the format is identical to that used by Young and Becker. The students were only given five minutes to decide where to make breaks but were free to decide for themselves how many breaks were required. Slightly less than half went for three paragraph breaks with the remainder ranging from one to eight; the totality of their decisions of where to break is given in Table 7.2. The total for each paragraph break is given in terms of the lines in which the sentences begin.

Table 7.2 The distribution of the students’ paragraph break choices

| Line on which sentence starts | Number of informants beginning a paragraph at this point | % of informants making the choice |
|---|---|---|
| 32 (The . . . ) | 13 | 20 |
| 32 (It . . . ) | 5 | 8 |

As can be seen, the students were not unanimous about the appropriate place to break. No single positive choice enjoyed complete support, though the sentence beginning in line 13 came closest in this respect with 94 per cent favouring this as an appropriate place to break. There were, however, several places where the students were unanimous that a break was not desirable – the sentences beginning on lines 9, 24 and 30 were all roundly rejected as break points. Generally, though, the lack of unanimity casts doubt upon claims for the structural status of the paragraph.

In terms of the organisation of the passage described above, possible explanations for the students’ decisions are as follows. The near-unanimity of the break at line 13 marks the place where the passage moves from column 1 to column 2. This move is fundamental to recognition of the parallels that are being created between Grant and Lee in terms of the way they have been evaluated and any decision not to break at line 13 would have important implications for the readers in terms of their ability to discern that parallelism.

Other decisions are less likely to affect fundamentally the readers’ orientation, but nevertheless may subtly tweak it in a number of ways. The 11 people who broke at line 4 are marking out the place where the parallelism between Grant and Lee breaks down and the 22 people who broke at line 7 are indicating the place where the parallelism returns. The 32 who broke at line 20 are highlighting the place in the text where the answer provided by the writer is unexpected, though strictly there is no deviation from the parallelism, while the 32 who broke at line 23 may have been motivated either by the desire to highlight the move from particulars to supportive generalisation or by the wish to mark a deviation from the parallelism in a manner similar to that posited as an explanation for those who broke at line 4. Those who broke at line 29 – almost two thirds of the students – were presumably indicating the shift to summary and, perhaps, the return to the Grant side of the matrix.

These, then, are the textual explanations for the students’ choices. In 1985, when I first considered the Young and Becker experiment, they were the only explanations that I considered necessary. However, there are some problems lurking in the informants’ responses. In two places there is real doubt as to where to break, with line 4 and line 7 sharing the burden of indicating deviation from the parallelism of the underlying matrix and lines 20 and 23 evenly divided. Furthermore, seven informants broke at line 16 where there is no justifiable textual reason for breaking and three informants broke at either line 25 or 27, where again there is no obvious structural reason for a break.

One possible explanation for the students’ failure to agree on where the most appropriate breaks might come was that they had a conflict to resolve between responding to the structure and communicative purpose of the passage and responding to their priming as reflected in the lexical items that begin the sentences of the passage.

With this in mind, I set about examining whether there was any evidence that the words used at the beginning of each of the students’ paragraph breaks were primed to begin (or avoid beginning) a paragraph. My intention in doing so was to identify possible paragraph ‘triggers’ and see whether there was any correlation between the use of such apparent triggers and the choices made by the students. For this reason, I did not use a large body of data to establish whether words were textually primed in the manner described, simply because the test of their priming would come as much from the students’ responses as it would from the corpus. In most cases, the numbers looked at were less than 100 instances, and in a couple of cases, as will be seen, very few indeed. This only matters if we assume that corpora are the only valid evidence for the existence or otherwise of primings. Part of the point of the experiment I am reporting is that priming can be explored in more than one way.

A number of the sentences in the passage have as their first word the names Grant and Lee, and a number of the more popular paragraph breaks co-occur with their use. It made sense therefore to see whether the semantic set SURNAME was primed for use at the beginnings of paragraphs.

My corpus, as already mentioned, is mainly made up of Guardian newspaper text from the years 1991–4, during which period the British prime minister was John Major and the leader of the opposition was Tony Blair. Given that Grant and Lee are in the passage being compared and contrasted, and Major and Blair were in a natural position to be contrasted by virtue of their being electoral opponents, it seemed natural to create two concordances of 100 instances each of the names Major and Blair and examine them for sentence-initial instances. Surprisingly, given their apparent centrality to British political news, only 5 instances of Major and 13 of Blair were actually sentence-initial. This strongly suggested that surnames (or at least these surnames) are negatively primed for this position. Since my objective was to examine the paragraph priming of surnames, I added a further 22 surnames manually by consulting book reviews and interviews, giving me a total sample of 40 single sentence-initial surnames (i.e. without accompanying first names). This is in itself a tiny sample, but it should be remembered that the corpus analysis in this instance was designed to generate hypotheses about priming that could be checked against the students’ paragraphing decisions.

When the 40 surnames that were sentence-initial were examined, it was found that exactly half were also paragraph-initial. So, on admittedly sparse data, it looks as if sentence-initial SURNAME is primed to begin paragraphs. We seem here to be looking at the possibility that when a negative priming (for sentence-initial position) is overridden, a positive priming (for paragraph-initial position) comes into play. As we shall see, this seems to be a quite common kind of nesting in connection with textual colligation, where an overridden negative priming triggers a textual colligational priming.

In the passage that the students paragraphed, a SURNAME begins sentences on lines 1, 13, 16 and 30. Line 1 is automatically the beginning of a paragraph by virtue of beginning the passage (it was also the beginning of a paragraph in the original) and line 13 was the most popular choice made by the students, so the sentences that start on these lines support the view that SURNAME is a trigger for paragraphing. So, in a rather different way, does the sentence that starts on line 16. I remarked above that there appear to be no structural or logical reasons for making a break at this juncture, and yet 11 per cent of the students chose to break there. If SURNAME is primed to begin paragraphs, it may be that these students followed their primings at the expense of their sense of the shape of the passage.

Only the sentence that starts on line 30 provides no support for the hypothesis. This need not worry us. In the first place, it is flanked by sentences which, as we shall see, compete to be a break. Secondly, it is near the end of the extract and there is some evidence to suggest that the students did not in the main choose to make breaks in the last quarter of the text. Thirdly, and most importantly, priming is always a matter of probability rather than requirement.

A very different picture from that found for SURNAME revealed itself when I looked at the word he, which, like the surnames Grant and Lee, begins a number of candidate paragraph breaks in the passage (lines 2, 7, 9). Whereas surnames avoided sentence-initial position, the evidence pointed towards he being positively primed for this position. Of 100 instances consulted, 30 were in very first position in the sentence. On the other hand, this positive priming for being sentence-initial was not accompanied by a positive priming for paragraph-initial position. In fact he occurred in paragraph-initial position in the sample I examined two and a half times less often than would have occurred as a result of random distribution. So here was a very different hypothesis from that we arrived at for SURNAME – the nesting of he with first word in the sentence is negatively primed for beginning a paragraph.

In the passage, there are three places where the pronoun he begins a sentence and two of these (on lines 2 and 9) the students unanimously rejected as paragraph boundaries. So, perhaps predictably (but as before I appeal to the predictability as evidence of the psychological truth of textual colligational priming), names are primed to start paragraphs and pronouns are not. The third, however, on line 7, where a third of the students chose to begin a new paragraph, represents a counter example. We shall return to this example shortly, but for now I note that there are good rhetorical grounds for breaking at this juncture in that, as I noted above, it marks the place where the text returns to the (at this stage invisible) parallelism between the two generals. It also returns to the topic of the ways in which Grant is evaluated.

So far we have hypothesised that the negative sentence-initial priming of SURNAME exists alongside a positive paragraph-initial priming and that the positive sentence-initial priming of he exists alongside a negative paragraph priming. It would be tempting to assume that this reversal of primings might be regular. However, when I examined 100 instances of the word his (which of course begins line 4 of the Civil War passage), I found that, unlike he, his was negatively primed to begin sentences, but that, like he, when it was sentence-initial it was also negatively primed for beginning paragraphs. So his is an instance of a word that is negatively primed for both sentence-initial and paragraph-initial position. Seventeen per cent of the students broke at line 4 where the sentence beginning with his occurs. This is definitely a minority choice, especially as there are reasonable grounds for breaking here, as we have seen.

Another pronoun that appears at the beginning of a potential paragraph break is it, which begins the sentence at line 24. Because of the multitude of uses to which the word it is put, no attempt was made to determine whether the pronominal use, like he, favoured sentence-initial position. Instead, attention was given to whether it offered a plausible opportunity for a new paragraph. To this end I examined 149 sentence-initial instances of it in its anaphoric pronominal use. Only 8 per cent of these (just 12 cases) turned out also to be in paragraph-initial position. Given that one in four might have been expected on the basis of random distribution, we can assume that pronominal it is negatively primed for beginning paragraphs. This is certainly supported by the students who were again unanimous in not starting a paragraph at line 24.

The instance of it at line 24 is not the only case of a sentence in the Civil War passage beginning with it. Line 32 also begins with it, but here its use is anticipatory rather than anaphoric. In this use, it would appear to be positively primed for paragraph initiation (though this judgement is based on examination of only 65 instances), in that 24 of the 65 instances begin paragraphs. Here, the students do not at first sight provide support. A meagre 8 per cent selected the sentence on line 32 as a paragraph boundary. However, 8 per cent starts to look like a large proportion when one realises that this is the final sentence of the passage and that both the desire to avoid one-sentence stragglers and uncertainty about the way the passage might continue would militate against its selection as the beginning of a new paragraph.
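As a rough check on the paragraph-initial figures reported so far, the sketch below compares each observed proportion with the one-in-four baseline mentioned above. The dictionary layout, the function name `direction` and the labels "positive" and "negative" are illustrative assumptions; the counts are those given in the text, and with samples this small the comparison is suggestive only.

```python
# Baseline from the text: one in four sentence-initial items would be
# paragraph-initial under random distribution.
BASELINE = 1 / 4

# (paragraph-initial count, sentence-initial count) as reported in the text.
counts = {
    "SURNAME": (20, 40),           # "exactly half" of 40
    "it (anaphoric)": (12, 149),
    "it (anticipatory)": (24, 65),
}


def direction(par_initial, sent_initial, baseline=BASELINE):
    """Return the observed rate and whether it lies above or below the baseline.

    This is a descriptive comparison only, not a significance test.
    """
    rate = par_initial / sent_initial
    label = "positive" if rate > baseline else "negative"
    return rate, label


for item, (hits, total) in counts.items():
    rate, label = direction(hits, total)
    print(f"{item}: {rate:.0%} paragraph-initial, {label} relative to {BASELINE:.0%}")
```

The labels only record which side of the baseline each proportion falls on; they say nothing about how secure the difference is.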

Instances of SURNAME and pronouns account for the initial words of ten of the sentences of the Civil War passage. We are left with a small number of singly occurring phrases. One of these is as a theatre strategist, which begins a sentence on line 20 of the passage. Working on the assumption that this was an instance of a local semantic association between as a and JOB/WORK ROLE, I looked for instances of this association. Initially I examined 1000 instances of as a and found that, excluding the sentence conjuncts as a consequence, as a result and as an example, there were only 35 instances of as a occurring at the beginning of a sentence. Of these 35 cases, 17 conformed to the semantic association as a JOB/WORK ROLE, and eight of these began paragraphs. Such data may not contradict the claim that as a JOB/WORK ROLE is primed to begin paragraphs, but they can hardly be used as strong evidence in its support. However, the students’ communal judgement is certainly compatible with it; almost exactly half chose to break at line 20.

Another sentence-initial phrase used in the passage is in war, which offers the possibility of a paragraph break at line 23. However, it proved very difficult to investigate the primings of this phrase using a corpus. From 215 instances of in war as a self-standing prepositional phrase (though including instances of in war and (in) peace), there were only 15 examples of the phrase in sentence-initial position – another instance of negative priming for beginning sentences. (There were, however, 56 instances of in war in final position in the sentence and a further 33 that were in final position in their clauses, strongly pointing to the phrase having a textual colligation for end position.)

Of the 15 cases of sentence-initial in war, few as they were, five were also paragraph-initial. As the average length of the paragraphs which they began was five sentences, this incidence of paragraph-initial cases hints at the phrase being positively primed to begin paragraphs; so, very provisionally, one would hypothesise that in war is positively primed to occur sentence-finally and paragraph-initially – but not at the same time! The students’ choices certainly support the view that the phrase is positively primed to begin paragraphs when it is sentence-initial. Almost half made a paragraph break at line 23, even though it is really just an extension of the point being made in the previous sentence – another instance, perhaps, where the posited conflict between the priming and the rhetorical shape of the passage was resolved in favour of the priming.

The words Had X been do not appear to be primed, one way or the other, for paragraph breaks. Examination of 243 instances of Had X been showed that the word sequence’s association with paragraph boundaries was dependent on whether SURNAME or PRONOUN filled the spot marked by X. Of 25 instances of Had SURNAME been (all there were in my corpus), 12 are paragraph-initial. On the other hand, out of 158 instances of Had PRONOUN been (PRONOUN here being limited to the traditional set of you, he, she, we, they, this, that, these and those) only 29 were paragraph-initial.

These results are entirely in line with those for SURNAME and he and it given above. However, in so far as there is any evidence about the priming of Had . . . been on its own, the evidence points towards the word combination having a different kind of priming for the typical Guardian writer (and reader). While it is true that almost half of the instances of Had SURNAME been are paragraph-initial, the paragraphs are characteristically shorter than is the average for my data as a whole. Calculations of paragraph length in my corpus as a whole suggest that (one-sentence paragraphs as always excluded) the average length is four sentences and that figure has been repeatedly borne out when calculating the average length of particular sets of paragraphs. For Had SURNAME been, however, the average length of the paragraphs is exactly three sentences and only 2 paragraphs from the 12 considered are above this level.

The same picture occurs with Had PRONOUN been. At first sight, the average length of the 29 paragraph-initial cases – 3.4 sentences – seems to be only slightly lower than that for paragraph length in the corpus as a whole. But this includes two paragraphs of 10 and 14 sentences in length respectively. With these excluded from the calculation, the average comes to under 2.8 sentences per paragraph. A total of 12 of the paragraphs initiated by Had PRONOUN been are only two sentences in length.
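The effect of the two long paragraphs on the average can be checked with a short calculation. This is a sketch only: it reconstructs the total number of sentences from the rounded mean of 3.4, so the total is approximate, and the helper name is hypothetical.

```python
def mean_without(total_sentences: float, n_paragraphs: int, excluded: list) -> float:
    """Mean paragraph length after dropping the listed paragraph lengths."""
    return (total_sentences - sum(excluded)) / (n_paragraphs - len(excluded))


n_paragraphs = 29
reported_mean = 3.4  # rounded figure from the text, so the total below is approximate
approx_total = reported_mean * n_paragraphs  # roughly 98.6 sentences

# Drop the two outlying paragraphs of 10 and 14 sentences.
trimmed = mean_without(approx_total, n_paragraphs, excluded=[10, 14])
print(f"mean without the two long paragraphs: {trimmed:.2f}")
```

The result falls below 2.8, in line with the figure given, and shows how strongly two outliers can pull a mean calculated over so few paragraphs.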

The data are not large, because sentence-initial cases of non-interrogative Had X been are not numerous. A further 14 paragraph-initial instances of Had X been were however looked at (where the X was either a country’s or a company’s name, an abstraction or a first name) and these, too, were found to be in short paragraphs, averaging 2.6 sentences. (I have made no calculation of the number of one-sentence paragraphs that occur in typical texts in my corpus as a whole, but my impression while undertaking the analyses was that there were also many more one-sentence paragraphs associated with Had X been than I would have expected.) It would therefore seem that where Had X been is paragraph-initial in newspaper text, we are primed to expect the paragraph that follows to be short. So textual colligational priming is not simply a matter of positioning but of length. Since in newspapers hypothetical statements are unlikely to be the focus of attention, it is not difficult to see how this priming might arise.

It will be remembered that the three people who broke at lines 25 and 27 (Had Lee been … , Had Grant been …) had little structural grounds for doing so. Their decision to break in these places is now explicable in terms of the apparent preference of Had SURNAME been for beginning paragraphs, itself driven by the textual colligational priming of SURNAME. In each case the paragraphs they produced were either one or two sentences in length, as would be predicted on the basis of the data we have been considering.

The word fundamentally starts the potential paragraph break at line 29, and instinct (my priming?) would suggest that it is primed positively for beginning paragraphs in this position. Evidence for this, however, is hard to find. Out of 786 instances of fundamentally, only 20 were the first word in their sentence, which of course left me with few data to work on: clearly the word does not like to begin sentences. Of the 20 sentence-initial cases, 6 begin paragraphs and 13 do not, with one instance beginning a one-sentence paragraph. The average length of the paragraphs is five sentences, though the mean is four. This means that, subject to the cautions necessary from the paucity of data, we have evidence for believing that fundamentally begins paragraphs roughly 50 per cent more often than can be accounted for on the basis of random distribution. This putative positive priming is certainly supported by the students, almost two thirds of whom chose to begin a paragraph at line 29.

The word sequence that starts the first sentence on line 32 – The staffs – did not permit investigation in my data, there being only one sentence-initial instance in 100 million words. However, illustrate was more promising, both because it is considerably more frequent in my corpus and because it is a lexical signal of the generalisation-exemplification relation and might therefore be expected to participate in the chunking of text. A sample of 100 instances of illustrate used as a finite verb in main clause constructions with a non-pictorial sense was examined, and it was found that 38 of the instances began paragraphs of more than one sentence. There were in addition 17 single-sentence paragraphs – again, intuitively, a high number. Removing the latter from the calculation, 46 per cent of the instances eligible to begin paragraphs of more than one sentence were indeed paragraph-initial. Since the average length of the paragraphs so begun was 3.7, we can infer that illustrate is one and a half times more likely to begin a paragraph than random distribution would predict. In accordance with this finding, 20 per cent of the students chose to break at line 32, despite the fact that little of the passage remains beyond this point.

We can represent the match (or mismatch) of the students’ choices against the structural grounds for breaking and the textual colligational grounds for breaking, as shown in Table 7.3. I have represented both sets of grounds crudely in terms of positive or negative. Even allowing for the lack of subtlety in this, there is a considerable matching between students’ decisions on where to make a paragraph break and whether there are organisational or colligational grounds or not for such a break, the only mismatch coming near the end of the passage when both shortage of time and uncertainty about how the passage might continue will have affected the pattern of decisions.

Table 7.3 A match of the paragraphing decision of the students with the organisational and lexical factors that might have led to those decisions

| Sentence-initial word or phrase | Line no. | Organisational grounds for breaking | Paragraph-initial colligational priming | % of informants making a paragraph break at this point (67 informants) |
| --- | --- | --- | --- | --- |
| Grant | 1 | Positive | Positive | 100 (by default) |
| As a JOB/WORK ROLE | 20 | Positive | Positive | 49 |
| In war | 23 | Positive | Positive | 49 |
| It (pronoun) | 24 | Negative | Negative | 0 |
| Had SURNAME been | 25 | Negative | Positive | 3 |
| Had SURNAME been | 27 | Negative | Positive | 3 |
| It (anticipatory) | 32b | Negative | Positive | 8 |

One point to note with the results presented in Table 7.3 is that textual colligational priming and structural factors only support each other up to a certain point. There is therefore ample reason here why the students should have been undecided as to how to make their paragraph breaks. Pulled one way by structural factors and another by their lexical primings, it is no wonder that there was no unanimity among them. What from a lexical point of view seem unmotivated breaks (at lines 4 and 7) are justifiable in structural terms. More frequently, what from a structural point of view seem anomalous breaks by a minority (at lines 16, 25, 27 and 32) become explicable in terms of lexical priming.

A further experiment

The decisions that were made by the students are open in themselves to further investigation. I have, for example, posited that there is a tension between the structural desire to mark deviation from and return to the parallelism between Grant and Lee on the one hand and the negative priming of his and he on the other. Why not test whether this is the case by converting one of the pronouns (his) to a surname (Grant’s)? Similarly, I have claimed that the only reason people want to break at line 16 is because the surname is primed for them to begin paragraphs. So why not change Lee to a pronoun? If, as I have claimed, the word sequence as a JOB/WORK ROLE is primed to begin paragraphs, why not move it later into the sentence? (Of course, this will leave us with Lee at the beginning of the sentence, so this will simultaneously have to be converted into a pronoun.) And, if in war is apparently primed for beginning paragraphs, what will happen if it is moved to the end of the sentence?

All of these changes do not affect the meaning of the passage and all result in entirely natural sentences (at least to my intuition – I have not undertaken the necessary colligational analysis to prove that they are, though we saw in our analysis above that the latter two changes are in line with the normal use for such nestings). The result is the following adapted passage. As you will see, the line numberings remain the same and the passage still reads normally. I have emboldened the changes.

  1. Grant was, judged by modern standards, the greatest general
  2. of the Civil War. He was head and shoulders above any general on either
  3. side as an over-all strategist, as a master of what in later wars
  4. would be called global strategy. Grant’s Operation Crusher plan, the
  5. product of a mind which had received little formal instruction in the
  6. higher area of war, would have done credit to the most finished
  7. student of a series of modern staff and command schools. He was a
  8. brilliant theatre strategist, as evidenced by the Vicksburg campaign,
  9. which was a classic field and siege operation. He was a better
  10. than average tactician, although, like even the best generals of
  11. both sides, he did not appreciate the destruction that the increasing
  12. firepower of modern armies could visit on troops advancing across
  13. open spaces. Lee is usually ranked as the greatest
  14. Civil War general, but this evaluation has been made without
  15. placing Lee and Grant in the perspective of military
  16. developments since the war. He was interested hardly at all
  17. in ‘global’ strategy, and what few suggestions he did make to
  18. his government about operations in other theatres than his own
  19. indicate that he had little aptitude for grand planning.
  20. He often demonstrated more brilliance and apparent originality
  21. as a theatre strategist than Grant, but his most audacious plans were
  22. as much the product of the Confederacy’s inferior military
  23. position as of his own fine mind. The weaker side
  24. has to improvise brilliantly in war. It must strike quickly, daringly,
  25. and include a dangerous element of risk in its plans. Had Lee
  26. been a Northern general with Northern resources behind him he would
  27. have improvised less and seemed less bold. Had Grant been
  28. a Southern general, he would have fought as Lee did.
  29. Fundamentally Grant was superior to Lee because in a modern
  30. total war he had a modern mind, and Lee did not. Lee
  31. looked to the past in war as the Confederacy did in spirit.
  32. The staffs of the two men illustrate their outlooks. It would
  33. not be accurate to say that Lee’s general staff were
  34. glorified clerks, but the statement would not be too wide
  35. off the mark . . .

I then gave this modified version (without the emboldening, of course) to 32 native speaker first-year undergraduate students, who had, like the previous group, not been taught about either priming or paragraphing at the university. None of them were party to the previous experiment and the conditions under which they performed the test were the same as with the previous group; the need to ensure no overlap with, or knowledge of, the earlier experiment accounts for the smaller number of informants.

Table 7.4 shows how the second group’s choices compare with those of the first group. I have highlighted the lines where the sentences were modified, since it is with these sentences that we are mainly concerned. In the latter part of the passage where no changes were made, the two groups behave in similar fashion. So we again have an isolated decision to break at line 27, presumably driven by the positive priming of Had SURNAME been. We again have slightly under a third of informants breaking at line 29 and a minority breaking at the first of the sentences on line 32. A couple of informants have chosen this time round to break at line 30, again motivated, one guesses, by the positive priming of SURNAME, and the second sentence on line 32 continues to punch below its lexical priming weight! But generally the pattern of decisions is very similar.

Table 7.4 A comparison of the paragraphing decisions of the two sets of informants on the original and modified passage

| Line | Number of second set of informants choosing this point as a paragraph break | % of second set of informants choosing this point as a paragraph break (32 informants) | % of original set of informants choosing this point as a paragraph break (67 informants) |
| --- | --- | --- | --- |

However, earlier in the text, everything has changed. It was hypothesised that changing one of the pronouns on line 4 to a surname would motivate a change of paragraphing practice and so it proves. The proportion of people choosing to paragraph at line 4 has more than doubled, despite its being structurally more natural to break at the point when the parallelism returns. Correspondingly, the latter break has nearly halved in popularity.

It was also hypothesised that removing the surname at the beginning of line 16 would remove the temptation to break there. This proved more or less to be the case; a solitary informant has inexplicably made a break but the proportion has dropped from 11 to 3 per cent.

Perhaps the most radical change in the ways that the two groups of informants have chosen where to break comes at line 20. Moving as a theatre strategist later into the sentence and replacing Lee by he has resulted in a drop from almost half to less than a quarter of informants starting a new paragraph at this juncture despite the continuing good structural reasons for breaking here. Furthermore, the effect of the apparent unavailability of this breakpoint has driven a large proportion of the informants to break at line 23, the proportion rising from 49 to 59 per cent.

The pressure to keep paragraphs to a length of three or four lines, and the apparent unavailability of the sentence that begins on line 24, make line 23 seem an attractive break. Nevertheless we have to concede that, alone among the hypotheses we were seeking to test, the hypothesis that moving in war to the end of the sentence would result in a drop in its popularity as a paragraph break is the one that has not been supported. This deserved further attention. I therefore looked at 137 instances of sentence-initial The ADJECTIVE side (excluding national, county and continental adjectives). (It will be remembered that this nesting of primings was looked at in connection with textual semantic association, where it was found to be associated with contrast. It hardly needs saying that contrast is built into the accompanying adjective in this instance.)

Eleven of the instances of The ADJECTIVE side were text-initial, usually in the title, and these were excluded, as were one-sentence paragraphs (of which there were 10). I also excluded any cases at the beginning of quoted speech (9), unless there was also paragraph indentation. Instances elsewhere in speech were regarded as non-paragraph-initial and included in the calculations. Once the exclusions were made, I was left with 106 instances of The ADJECTIVE side, of which 47 (44 per cent) began paragraphs; the average length of the paragraphs was 3.4 sentences. Random distribution would have accounted for 29 per cent of the cases. We can therefore conclude that all I had done in moving in war to the end of the sentence was substitute one paragraph-primed word sequence for another.

A final return to the Bill Bryson passage

The evidence from the corpus investigation and the paragraph experiments points to there being a priming for text-initiation and paragraph-initiation, some words being primed positively for these textual positions and others being primed negatively. The Bill Bryson passage quoted on page 124 begins the book Neither Here Nor There. It is therefore appropriate we should return a final time to his words to see whether the first sentence contains any likely lexical primings for text-initiation and whether the beginnings of the next two paragraphs are primed for paragraph-initiation.

A clue to one of the textual colligations operative in the Bill Bryson sentence we have so often examined comes in a sentence in my corpus that parallels it quite closely. I have italicized the places where the sentences vary:

  1. Ntobeye is a two-hour ride by four wheel drive vehicle from the vast refugee camp at Ngara.
The author of this sentence drew on a different PLACE NAME for the subject and a periphrastic phrase is used in place of Oslo, the VEHICLE is no longer a bus and the time of the journey is shorter, but it is otherwise Bill Bryson’s sentence! Hammerfest is much smaller than Oslo and the vastness of the refugee camp couples with my ignorance of Ntobeye to make me assume that Ntobeye is not large. Where two places are named in the same clause, the smaller will have a strong tendency to be thematised and the larger to be at the end of the Rheme; this is a rather specific kind of textual colligation.

Examination of 300 instances of place names, drawn from a small specialised corpus of travel writing, revealed a further textual colligation that ties in with this, namely that in travel texts, PLACE NAME is primed colligationally to appear in the structure PLACE NAME  is  EVALUATION.

One can either see this as a colligation of PLACE NAME or as a colligation of is. From the latter perspective, the word is characteristically is primed to occur in the structure subject-verb-complement and, in combination with this structural choice, has as one of its semantic associations in travel writing the double association of PLACE  EVALUATION. The reason I mention this is that this combination in turn is primed to colligate with text-initial position in travel writing. Text-initial examples from my data include the following:
  2. Madrid is one of the world’s favourite meeting destinations.
  3. At the very heart of Europe, Hungary is a magical land bursting with ancient culture . . .

On the face of it, the Bill Bryson sentence does not conform to this pattern, in that the thirty-hour ride is presumably factual (though the reader may supply an evaluation of such a long journey in winter). However, there are other textual colligational primings at work in the sentence. Re-examination of the same 300 instances of place names reveals that 133 were paragraph-initial (excluding one-sentence paragraphs). Given that the average length of the paragraphs in the data (as always excluding one-sentence paragraphs) was 3.5 sentences, this means that PLACE NAME is 50 per cent more likely to begin a paragraph than could be explained by random distribution. Furthermore, 31 of the place names are text-initial with another 15 appearing in titles. In other words, 15 per cent of all the place names in my mini-corpus of travel writing begin a text. Given that the texts in my corpus averaged 14 sentences in length, this means that PLACE NAME is twice as likely to begin a travel text as would be predictable on the basis of random distribution. On both counts the Bill Bryson sentence is in conformity with the typical priming of PLACE NAME.
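The two comparisons in this paragraph, one at the level of the paragraph and one at the level of the text, can be worked through in a few lines. The sketch assumes that chance would place a sentence first in a unit of n sentences once in every n cases, and that title occurrences count as text-initial, as the 15 per cent figure implies; the variable and function names are illustrative.

```python
def chance_share(avg_units: float) -> float:
    """Chance share of sentences that open a unit averaging avg_units sentences."""
    return 1.0 / avg_units


instances = 300
paragraph_initial = 133
text_initial = 31 + 15  # text-initial instances plus those appearing in titles

# Paragraph level: paragraphs average 3.5 sentences.
para_observed = paragraph_initial / instances
para_ratio = para_observed / chance_share(3.5)

# Text level: texts average 14 sentences.
text_observed = text_initial / instances
text_ratio = text_observed / chance_share(14)

print(f"paragraph-initial: {para_observed:.0%} observed, ratio {para_ratio:.2f}")
print(f"text-initial: {text_observed:.0%} observed, ratio {text_ratio:.2f}")
```

The paragraph-level ratio comes out a little above 1.5 and the text-level ratio a little above 2, which matches the rounded claims of 50 per cent more likely and twice as likely.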

Of the other paragraphs in the extract from Neither Here Nor There, one begins I wanted to and the other begins Things. The latter is an example of a word primed to begin paragraphs. Excluding the usual one-sentence paragraphs, text-initial instances and speech-initiations, I was left in my Guardian corpus with 417 sentence-initial instances of Things. Over 37 per cent of these (156 cases) were paragraph-initial. So Bill Bryson’s usage is in keeping with its priming in newspaper text (though it remains to be investigated whether the priming typically operates in travel writing – my corpus was too small for such an investigation). Incidentally, there were 50 instances in my data of text-initial Things. The texts which it begins would have to be no more than nine sentences in length on average for this not to be evidence of its being positively primed for text-initial use. Though of course Bill Bryson does not make use of this priming, his paragraph is the start of a flashback and could be said to start his tale. This hints at more subtle types of priming than those I have discussed, crudely and at inordinate length, in this chapter.
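The nine-sentence threshold can be reconstructed as a break-even calculation. The sketch below is hedged: it assumes that the 50 text-initial instances are additional to the 417 retained instances and that the other exclusions (one-sentence paragraphs and speech-initiations) can be ignored, neither of which the text states outright.

```python
text_initial = 50
other_sentence_initial = 417  # instances left after text-initial cases were set aside

# Assumption: the two sets do not overlap, so together they approximate
# all sentence-initial instances of "Things".
total = text_initial + other_sentence_initial

observed_share = text_initial / total

# Chance predicts a text-initial share of 1 / average text length, so the
# observed share equals chance when texts average total / text_initial sentences.
break_even_length = total / text_initial

print(f"observed text-initial share: {observed_share:.1%}")
print(f"break-even average text length: {break_even_length:.1f} sentences")
```

Under these assumptions the break-even length falls between nine and ten sentences, so texts averaging more than about nine sentences would make the observed share higher than chance predicts.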

The other paragraph beginning Bryson uses – I wanted to – seems to have no special association with paragraph boundaries at all. As I have repeatedly noted, primings are not rules. It is possible to override them. It is of interest that the likely explanation for the boundary is textual – the lack of cohesion between this and the previous paragraph, apart from the complex repetition involving northern, indicates a new start. We saw at the end of the previous chapter that the textual and the lexical seem to operate in different ways and here we have further evidence of the fact. I return to this matter in the next chapter.

Some conclusions

The evidence presented in this chapter suggests that we have to connect our systems of description of text organisation with our systems of description of lexis. I have argued for many years that text organisation has a lexical perspective (Hoey 1979, 1983, 1991a) but the implication of this and the previous chapter is that there is a hidden colligational signalling that none of us is pedagogically aware of (though in our own writing we probably show daily awareness in the choices we make and avoid). Writing effectively involves using appropriate text and paragraph beginnings. If there is one conclusion I would want to draw from this and the previous chapters it is that corpora are not just important for the study of the minutiae of language – they are central to a proper understanding of discourses as a whole, and that in turn means that there is no aspect of the teaching and learning of a language that can afford to ignore what corpus investigation can reveal. This is a matter I shall return to in the final chapter.

All of the positional claims I have made in this section are of course formulated in terms of the written word, but there is every reason to suppose that similar claims can be made about the beginning and end of speech turns, conversations and the like. Michael McCarthy (personal communication), who has access to one of the best spoken corpora in the world (CANCODE), notes that the is primed negatively to occur at the beginning of speech turns. Conversely, in 51 examples of I know drawn from a small corpus of casual conversation of my own, 26 are either the first words of a turn or within one or two words of the beginning of a turn (e.g. Yeah I know, No no I know), suggesting that I know is typically primed positively for turn beginnings; none are conversation-initial. But work is needed on a much larger body of spoken data.

8 Lexical priming and grammatical creativity

Corpus linguistics versus generative linguistics

Painting in broad-brush strokes, traditional generative grammarians have derived their goals, if not their methods or descriptions, from Chomsky, and for them the distinction of a grammatical sentence from an ungrammatical one has been a central consideration. They have not been interested in probability of occurrence, only in possibility of occurrence. Most of their data have been invented examples and some of these have been hard to envisage in any context. They have, in short, been concerned with the creativity of language. Their models have been designed to account for any sentence, however extraordinary or unlikely, as long as informants have been willing to affirm that the sentence in question is an instance of English.

Still painting with a broad brush, corpus linguists in contrast have derived their goals and methods in part from John Sinclair and his associates and in part from what concordancing software currently makes feasible. These linguists have typically seen their goal as the uncovering of recurrent patterns in the language, usually lexical but increasingly grammatical. They have not been much concerned with the single linguistic instance but with probability of occurrence, and their data have always been authentic. They have been concerned with fluency in language rather than creativity, and corpus models have been designed to account for the normal and the naturally occurring.

Like all broad-brush paintings, this lacks light and shade. My account of the generative tradition allows no place for Fillmore, for example, who has been much concerned with matters of fluency and naturalness (Fillmore et al. 1988), and my account of current corpus linguistic work ignores the work of Carter (2004), for example, whose encounters with unexpected and inventive usages in spoken corpora have compelled him to place the nature and extent of creativity in language under careful and instructive scrutiny. But I stand by the general picture I have swiftly painted.

In Chapter 1, I argued that Bill Bryson’s sentence showed that linguists had to account for naturalness as well as creativity, and in subsequent chapters I have tried to show some of the factors involved in the production/selection of a natural sentence and a natural text. Important as naturalness is, however, no claims about the nature of language can be countenanced that cannot address the issue of how language users are creative in their daily use of language, and in this respect generative grammarians have been perfectly correct in their focus. This and the next chapter accordingly attempt to show how a theory of lexical priming might handle different kinds of creativity.

Types of creativity

There are of course a number of recurrent uses of the term ‘creativity’ and their relationship is not simple. At one end of an imaginary spectrum, there is the Chomskyan use (Chomsky 1957) – creativity as a fundamental property of language. Chomsky’s point, of course, was that we do not recall sentences, we newly create them, and he argued that this ability of any native speaker to newly mint sentences needed to be explained. Although lexical priming has, I hope, been shown to account for much that happens under the guise of newly minting sentences, it would be disingenuous to suggest that creativity in Chomsky’s sense is thereby accounted for.

At the other end of the spectrum, there is the literary sense of creativity – original texts that refresh the language and force us to think and see things in new ways. If linguistics cannot say something interesting about literary language, it is an admission that we have not yet got to the heart of our discipline.

The former type of creativity – Chomsky’s type – is invisible because it is all around us. All but a trivially small percentage of sentences are creative in his sense of the word. The latter type is on the other hand highly visible and highly valued. All but a trivially small percentage of sentences are uncreative in this second sense of the word.

In between these, there is actually another kind of creativity – sentences that make no claim to be literary but which surprise us in some way, either because they draw attention to themselves by their clever wording or because they are momentarily hard to process or make us aware that they are indeed made of language.

In this chapter I want to address the first of these types of creativity – the Chomskyan type. The concern here is to consider whether lexical priming can account not only for what is natural but also for what is possible. To do this, I will need to consider a number of issues – the nature of grammatical categories, the status of the word, issues of inflection and phonological priming, the movement from lexical to grammatical priming and the relationship between lexical and textual choices.

Grammatical categories

In my discussion of colligation and semantic association in this book, I have talked of grammatical functions and grammatical categories as if they were givens in the system. However, Hunston and Francis (2000) have shown that the grammatical functions can be reformulated in terms of grammatical categories. Although I have used grammatical functions such as Subject and Object as quick and understandable ways of talking about regularity of use in the clause, I believe Hunston and Francis’ argument carries weight. So are we left, then, with grammatical categories as the grammatical bedrock without which we cannot have language? Sinclair (1991) seems to argue for this position, but I would like to question the prior existence of even the most basic of grammatical categories such as ‘noun’ and ‘verb’.

The strategy I used in Chapter 3 to establish (some of) the colligations of consequence was to compare the word’s behaviour with that of other abstract nouns. Likewise in Chapter 4, I compared the colligations of hyponyms and synonyms with each other. Suppose, though, I had compared consequence not with aversion, question or use but with taught or if, or I had compared architect not with actor, actress, accountant and carpenter but with has or on. What would we have learnt from such comparisons? Fairly obviously, we would have noted that consequence and architect both collocated with the, one and another in positions immediately prior to the word. We would also have noted that consequence collocated with a and architect with an in this position but that two positions to the left, there was collocation with a and an for each of the nouns. We would likewise have noted that both words collocated with of immediately after the word, though in the case of architect, as we have seen, the collocation is predominantly associated with the metaphorical sense of the word. None of these statements would have been true of taught, if, has or on. We might also have noticed that, when consequence occurred with the and was given first position, it colligated with a finite verb occurring after it. In short, we would have noted that consequence typically operates as a noun.
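The kind of positional comparison imagined here can be illustrated with a small counting routine. This is a sketch under clearly artificial assumptions: the text is an invented toy sample, not corpus data, and `collocates_at` is a hypothetical helper, not a function of any concordancing package.

```python
from collections import Counter


def collocates_at(tokens, node, offset):
    """Count the words found `offset` positions from each occurrence of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        j = i + offset
        if tok == node and 0 <= j < len(tokens):
            counts[tokens[j]] += 1
    return counts


# Invented toy text, for illustration only (not corpus data).
text = (
    "the consequence of the delay was a consequence of the strike "
    "he taught the class and she taught an architect of the plan "
    "an architect of peace met the architect"
)
tokens = text.split()

for node in ("consequence", "architect", "taught"):
    left = collocates_at(tokens, node, -1)   # word immediately before the node
    right = collocates_at(tokens, node, +1)  # word immediately after the node
    print(node, dict(left), dict(right))
```

Even in this toy sample the two nouns share determiners to the left and of to the right while taught shares neither, which is the pattern of shared primings that the label ‘noun’ summarises.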

The statement that consequence typically operates as a noun is, as the above list of collocations and colligations indicates, only shorthand for claims of exactly the kind we have been considering throughout this book. The claim that consequence is a noun is really a claim about its collocations, colligations and semantic associations. Its nominal status is the product of a cluster of collocations and colligations that only become visible when we stop taking it for granted that it is a noun.

The grammatical category we assign to a word, I want to argue, is simply a convenient label we give to the combination of (some of) the word’s most characteristic and genre-independent primings. It is in fact the outcome of other factors, not the starting point for a linguistic description. The nested combination of features that we label ‘noun’ on a particular occasion is, like any other nested combination, capable of being primed for other features, and if the same nested combination occurs for other words, the category ‘noun’ (or ‘verb’ or ‘adjective’) can itself be colligationally primed. Statements of such priming will be, to all intents and purposes, syntactic statements. As shorthand for the nesting I have just described, we can say that the grammatical category a word belongs to is its grammatical priming. So instead of saying that consequence is a noun, we could say that consequence is strongly primed for use as a noun, ‘noun’ being here, as I have indicated, a convenient shorthand for a cluster of other primings. Like all primings, priming for grammatical category is a matter of tendency rather than requirement. So the lexical item winter, used in Bryson’s (and my) sentence as a noun, can also be used as a verb (I’ll winter in Brussels), as can bus.

Consider, too, the following sentence from a recent charity appeal letter from the Intermediate Technology Development Group:

  1. If your supporter number ends in ‘D’, you already Gift Aid your donations.
Although there are interesting things to say about supporter and D, it is Gift Aid that I want to attend to. The capitals are evidence of the word sequence’s dominant grammatical priming as nominal group, since we associate capitals with names and not with verbs unless in sentence-initial position, but in this sentence the dominant priming has been overridden. To my knowledge, I have only received one such letter. If, though, subsequent charity appeals were to use Gift Aid in a similar way, then I (and other careful readers of charity letters) would become primed receptively to expect its use as (part of) a verbal group in the domain and genre of charity appeals. This would be an example of a drift in the priming, discussed in Chapter 1 (see p. 9).

So far my examples have all been nominal. But the claim applies to other grammatical categories as well. Consider the following sentence, part of a Guardian article on Shackleton, originally chosen as an illustration of the ‘verb’ use of winter referred to above:
  2. The expedition returns to England, having rescued the men left to winter on Elephant Island and picked up the party from McMurdo Sound.
    Most of the words in this sentence are attested in my data in a grammatical role different from that used in the above sentence. Without comment, I list the following, all taken from my data; the relevant words are emboldened:
  3. Many happy returns (NOUN)
  4. When he came to . . . (ADVERB)
  5. If you want a share of the prosperity that is there for the having (NOUN)
  6. One of the rescued remarked . . . (NOUN)
  7. the left luggage becomes a Pandora’s box of horrors and possibilities (ADJECTIVE)
  8. In winter, Hammerfest is . . . (of course! NOUN)
  9. He went on and on (ADVERB)
  10. I found it growing there and in more northerly outlets behind a sea wall, heavily picked but with little sign of exploitation (ADJECTIVE)
  11. I’m just trying to up the ante in home entertainment (VERB)
  12. We are on the up and they are on the way down (NOUN)
  13. There’s less to party about (VERB)
  14. . . . trying to sound the depths of voters’ feelings (VERB)
    All the above are arguably polysemous uses. Several of the other words have non-dominant grammatical uses, for example:
  15. . . . in European rather than purely island terms
    where both the parallelism with European and the modification by purely point to adjectival (as opposed to noun modification) use of island, and
  16. I am rooting about for the elephant folios, like Prince Alexei Soltikoff’s lithos of his Indian travels. Not quite elephant is the 1849 David Roberts Egypt and Namibia . . .
    I am not sure what this means but the modification of elephant by not quite suggests another adjectival use.

If we accept that grammatical categories are labels for combinations of primings, we have also to accept that the primings of some words or word sequences will not permit the application of the conventional grammatical labels (or any labels). Sinclair (1991), for example, argues against the treatment of of as a preposition, showing how its collocations and colligations are substantially different from those of other words that we give the label ‘preposition’ to. A similar argument could be made about other, less ubiquitous, words, such as ago, than and far. What we call grammatical categories are best regarded as post-hoc generalisations from the individual instances of lexical primings. Of course this claim is challenged by inflections, and I shall look at these in a later section after refining and reformulating the general claim I have been making for lexical priming.

Word versus lexical item

The position articulated in the earlier part of the book was built up over a number of years. My inaugural lecture in 1994 and a paper in 1996 presented at the 23rd International Systemic-Functional Congress in Sydney (neither offered for publication) were my first attempts at articulating the notion of colligation and much of the matter in the second half of Chapter 4 was also first presented, in cruder fashion than here, in the latter paper. While all this was going on, unbeknown to me until later, Sinclair was giving conference papers and publishing articles that explored related ideas, and his work predates mine by at least a year (Sinclair 1996). As should already be apparent from the first three chapters, Sinclair and I had arrived independently at similar conclusions. This is not as surprising as it might seem. Firstly, my own thinking in the mid-90s was heavily influenced by Stubbs (1995, 1996), who in turn drew heavily on Sinclair’s work. Secondly, as my dedication indicates, I worked alongside Sinclair for 14 years, including on the Collins COBUILD English Language Dictionary; it is almost certain that I first learnt about colligation in this context. Nevertheless, with full allowance made for these factors, it is interesting that our positions are very similar in a number of important respects.

Drawing on collocation, colligation, semantic preference (semantic association in this book) and semantic prosody (which overlaps with the notion of pragmatic association described in Chapter 2), Sinclair (1996, 2004) shows how clauses such as it is not really visible to the naked eye are made up of ‘difficulty’ + ‘visibility’ + preposition + the + naked + eye, noting that this combination is in effect a single choice. He argues that there are very many patterns like this and that they represent not exceptions in the operation of a grammatical system but the norm. He terms the patterns ‘lexical items’ and comments:

    If the model of a lexical item offered . . . turns out to be the only one, and the computational search is successful, then a text will be analysed into a string of units, each statistically independent of those on either side. The major structural categories that have been proposed here – collocation, colligation, semantic preference and semantic prosody – and their interrelationships will be elaborated and will assume a central rather than a peripheral role in language description.
    (Sinclair 2004: 39)

The argument of this book has been that the structural categories Sinclair lists are indeed central and are categories of the lexicon, constructed for each language user. We have seen how they interrelate and shall continue to look at their interrelations in this and subsequent chapters. That there are very many single choices in English of the kind Sinclair describes is implicit in all our discussion. For example, we earlier saw that the words a word against collocate with say or hear, have a semantic association with COMMUNICATIVE INTERCHANGE, have pragmatic association with ‘hypotheticality’ and ‘denial’ and colligation with modal auxiliaries. All of these features produce what is in Sinclair’s terms a single choice, a single lexical item, illustrated in the emboldened part of the following example:
  17. Thatcher wouldn’t hear a word against him

However, close as our positions are, they are not identical. In the first place, there is a textual dimension to my approach which will have become apparent in Chapters 6 and 7. In the second place, central to my position is that words have collocations, colligations, etc. for the individual user and that corpora can only reflect this indirectly. Thirdly, while accepting the insights tied up in Sinclair’s notion of the lexical item, I am less confident that the lexical item can replace the word as an analytical starting point. He is certainly right that there are fewer lexical choices here than words, and this is a challenge to the unthinking adoption of the word as the basis of any linguistic description (such as mine), though this chapter will modify my position. There is, though, no obvious boundary to the posited notion of the ‘lexical item’. The combination ‘hypotheticality’ + ‘modal auxiliary’ + ‘denial’ + ‘production/receipt of communication’ + hear/say + a word against in turn has a semantic association/colligation with ‘human subject’, a slightly weaker but still strong semantic association/colligation with ‘human “prepositional” object’, and a textual colligation with sentence-final position. Are these also part of a single lexical item, a single lexical choice? I would argue that the question is not a fruitful one and that it is better not to rush too quickly to close off the upper boundary of the lexical item, particularly in the light of the kind of evidence presented in Chapters 6 and 7. The notion of priming and the operation of nesting can account in a systematic way for the move from the word to the lexical item, and indeed, from the lexical item to the wider text and (as we shall shortly see) from the syllable to the word. My claim is that priming contextualises theoretically and psychologically Sinclair’s insights about the lexicon.

Word versus phonological string

But the problem of the ‘word’ remains. I have formulated priming in terms of words and yet, as Sinclair shows, words are often subsumed within larger entities. Furthermore there are many languages where word boundary is problematic and where the phenomena I have described in this book will operate either at a unit larger than the English word or, more commonly, smaller. The truth is that I have focused on the word as a convenient starting point for the description of priming, rather than for theoretically grounded reasons. Self-evidently for the child or the foreign language learner in an immersion situation, it will always be sounds or stretches of sound that are primed in the first place. The association of a (stretch of) sound with a sense is itself the result of priming, and therefore the priming of words is, strictly, an instance of nesting. For many speakers, sl is primed to associate with a slippery quality in slip (but also slimy, slope), slip is primed to have a quasi-collocational preference for ery (but also for way and shod), slippery is primed to have a collocational preference for slope (but also for customer) and slippery slope is primed negatively to avoid the Subject function in the clause and positively to end clauses either as prepositional Object or as direct Object. So we move from sound to syntactic position by reference to the same process of priming. All but the first of these involve some nesting.

We should not allow the neat quasi-hierarchical account above to fool us into imagining that the description closely matches the psychological reality, or we will quickly drift back into seeing words as isolates, albeit combining in relatively under-described ways. As mature users of a language our priming presumably moves up and down this hierarchy. We might (at least for the sake of argument) encounter slippery slope first and use this as the starting point of our priming for slippery and slip, which could then be the start of our priming for sl. More importantly, what is primed may be a single sound (e.g. [t] or [d], which are primed to colligate with words themselves primed for use as verbs in English) when these sounds appear at the end of a syllable, in the same kind of way that consequence is primed to colligate with subject + BE + that clause when at the beginning of a clause, or it might be an extended sequence such as [ɪn kəlæbəreɪʃən wɪð] (in collaboration with), where the whole sound sequence is, I suggest, primed for a single sense/function. A phonetic/phonological starting point allows priming to explain wordplay, malapropisms and rhyme.

Priming and grammar

Once we recognise that priming applies in the first place to stretches of sound such as syllables, the natural next step is to recognise that syllables such as ing, to, and ful have their own priming. These too have collocations, colligations and semantic associations. In inflectional languages, an important part of the description of such languages will concern the characteristic primings of key syllables. What we count as grammar is the accumulation and interweaving of the primings of the most common sounds, syllables and words of the language. So grammar is, in such terms, the sum of the collocations, colligations and semantic associations of words like is, was, the and of, syllables like ing, er and ly, and sounds like [t] (at the end of syllables) and [s] and [z] (likewise at the end of syllables). Danks (2003) finds that the processes of word formation are similar in kind to those described in the creation of lexical items; components of words are primed to combine in certain ways, but the priming may be overridden.

From another perspective, what we think of as grammar is the product of the accumulation of all the lexical primings of an individual’s lifetime. As we collect and associate collocational primings, we create semantic associations and colligations (and grammatical category primings). These nest and combine and give rise to an incomplete, inconsistent and leaky, but nevertheless workable, grammatical system (or systems). The two perspectives on grammar just described are, in my view, quite compatible, despite their very different foci. The first attends to how a grammarian’s object of inquiry relates to lexical priming; it will be seen that I do not see lexical priming as rendering grammatical investigation redundant, though it does indicate that some objectives are more achievable than others in this kind of investigation and alters the way one might interpret grammatical claims and data.

The second perspective attends more to the semantic and grammatical systems a speaker builds up in their lifetime. For some (though not necessarily all) speakers, these systems may in self-reflexive fashion be brought to bear on the lexical primings that gave rise to them and some of the primings may be adjusted to accommodate them to the semantic and grammatical systems that the speaker has built/inferred from others. Alternatively a tension may arise between the data and the system. In such circumstances cracks in the priming may occur as a result of conflict between the original priming and the self-reflexivity of the post-hoc systems. Cracks, briefly mentioned in Chapter 1, are returned to in the final chapter.

The claim then is that language acquisition is a matter of stretches of sound stream becoming primed in such a way that they become imbued, by means of nesting, with a rich and complex web of socially embedded, genre-sensitive collocations, semantic associations, colligations and text colligations (Chapters 1, 2, 3, 6). As a second stage, the language user becomes aware of shared primings between related words (Chapter 5) as well as of distinctive primings for different uses of a word. Out of these they will begin to abstract. Semantic associations and colligations are of course themselves abstractions, but the abstractions that I am positing involve a reflexive priming. To take a concrete example, blackmail and bully are close co-hyponyms, with a periphrastic superordinate along the lines of ‘put someone under pressure to do something they weren’t planning to do’.
As we saw in Chapter 4, co-hyponyms typically share some (though not all) of their primings. In this case, both blackmailed and bullied will be primed for many language users to associate semantically with PERSON(S) + BE + (e.g. He was blackmailed). The resultant combination is then primed to colligate with + into + V-ing (e.g. into working for them). Examples of sentences reflecting these primings are:
  18. He was blackmailed into working for them [an authentic example like all the others despite its simplicity].
  19. A local man had been bullied into guiding them through the treacherous, quaking waste.

For these patterns to be primed in the first place, the language user has of course to have encountered blackmailed and bullied in these contexts a moderate number of times. This means that they have also encountered the combination PERSON(S) + BE + blackmailed/bullied + into + V-ING a fair number of times. Thus, it is hypothesised, as a second stage (or possibly at the same time), that the language user becomes primed to associate PERSON(S) + BE + . . . NOMINAL GROUP + into + V-ing with words meaning ‘putting someone under pressure to do something they weren’t planning to do’ and will therefore be prepared both to immediately recognise and perhaps to produce sentences such as:
  20. . . . their parents have been seduced into believing that antibiotics and other dangerously useless but expensive medicines can save them.
  21. . . . yet even this mild-mannered diplomat has been goaded into showing exasperation with the United States and Britain.

Once this pattern has been primed in this way, it becomes available both for further primings and for comparison with other patterns. So just as the language user subconsciously had noted that blackmail and bully share primings, so they also may note that PERSON(S) + blackmailed/bullied + PERSON(S) also colligates with + into + V-ing. They may therefore associate the two patterns and treat them as ‘co-hyponyms’ of some more abstract pattern meaning the same thing. If they do so, they will have created one of the patterns described by Hunston and Francis (2000) or by Fillmore et al. (1988) and Goldberg (1995), and indeed Hunston and Francis categorise the abstract pattern as ‘V n into n’ and list the vocabularies associated with it (p. 117).
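
As a rough computational illustration of the step just described, the following sketch collects the verb forms that turn up in the surface sequence BE + V-ed + into + V-ing. It is only a toy under stated assumptions: the four sentences are trimmed versions of the attested examples quoted above, the regular expression and the function name verbs_sharing_pattern are hypothetical conveniences, and a word ending in -ed is naively taken to stand for the verb slot. It makes no claim about how a language user actually stores or compares primings.

```python
import re
from collections import defaultdict

# Toy "corpus": trimmed versions of the attested examples quoted in the text.
SENTENCES = [
    "He was blackmailed into working for them",
    "A local man had been bullied into guiding them through the waste",
    "their parents have been seduced into believing that antibiotics can save them",
    "this mild-mannered diplomat has been goaded into showing exasperation",
]

# Hypothetical surface pattern (an assumption, not a real tagger):
# a form of BE, a word ending in -ed, the word 'into', a word ending in -ing.
PASSIVE_INTO = re.compile(
    r"\b(?:am|is|are|was|were|be|been|being)\s+(\w+ed)\s+into\s+(\w+ing)\b",
    re.IGNORECASE,
)


def verbs_sharing_pattern(sentences):
    """Map each -ed form found in the pattern to the -ing forms seen after 'into'."""
    seen = defaultdict(list)
    for sentence in sentences:
        for participle, ing_form in PASSIVE_INTO.findall(sentence):
            seen[participle.lower()].append(ing_form.lower())
    return dict(seen)


if __name__ == "__main__":
    for verb, followers in sorted(verbs_sharing_pattern(SENTENCES).items()):
        print(verb, "->", followers)
```

The verbs gathered this way are candidates for the shared, more abstract pattern; a real investigation would of course need a tagged corpus rather than a crude suffix match.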

I would guess that not all language users make it this far but that the majority do. Certainly the ability to cope orally in the language will demand only that the lexis of the language is appropriately primed for the user. However, the patterns described by the linguists just mentioned are fairly basic and fairly meaningful, and a language user who did not connect up their patterns in such a way would soon stumble in the creation of extended monologues such as are characteristically needed in writing.

The process of abstraction need not stop here. The relationship of PERSON(S) + BE + blackmailed/bullied + into + and PERSON(S) + BE + voted/thrown + out of is such as to permit the priming of BE + V-[d], [t], [ɪd] and [n] for co-occurrence with PREPOSITION. At the same time the colligational priming of blackmailed and bullied for occurrence in both the patterns PERSON(S) (choice 1 from the set) + BE + blackmailed/bullied + into + and PERSON(S) (choice 2 from the set) + blackmailed/bullied + PERSON(S) (choice 1 from the set) + into + (and indeed the colligational priming of voted and kicked for occurrence in the similar patterns PERSON(S) (choice 1 from the set) + BE + voted/kicked + out of and PERSON(S) (choice 2 from the set) + voted/kicked + PERSON(S) (choice 1 from the set) + out of) will permit the colligational priming of PERSON(S) (choice 1) + V + PREPOSITION for something that looks like the active/passive distinction. In short, from primings such as these, repeated over many verbs, language users may create for themselves an active/passive distinction and in so doing put themselves on the way to creating a grammar. Notice that the priming has shifted at each stage. Initially I posited a priming for blackmailed and bullied, neither of them particularly common words. As a result of these words’ shared priming, it is possible for the words BE . . . into to become primed. This priming led in my argument to the nested priming of the combination of BE + V + /d/ /t/ and /ɪd/ (and possibly PERSON(S)) to occur with PREPOSITION. At the same time, each of the items in the semantic sets that comprise the verb choices in these patterns was being primed for the pattern PERSON(S) (choice 1) + V + PREPOSITION, which in turn was being primed for what are usually referred to as active and passive voice.

This is of course not the only, or even the most likely, way in which the priming of individual words might lead to the creation of grammatical categories or grammatical relationships. My point is not to argue for a particular sequence – that would be futile in any case since it is inherent in the notion of priming that each language user’s route to abstraction will be unique to them – but to argue that the principle of priming allows for the creation of grammatical abstractions. Lexical priming does not therefore assume the incorrectness of grammatical work, whether that work is very close to the surface as in the linguists mentioned immediately above, a little more abstract as in systemic linguistics or some way below the surface as in the many generative grammars spawned by the Chomskyan revolution. It does, however, assume that the grammars are never complete, because even the most thorough of grammar-creating language users must constantly encounter non-congruent usages produced by those without a fully integrated grammar (or occasionally, perhaps, without any grammar at all). It also assumes that the grammar cannot have central place in a linguistic description; that privilege belongs to the lexicon.

Priming, discourse and text

There is one area where we have seen on several occasions that lexical priming cannot account for all linguistic choices, and that is in the area of discourse. We have seen in several places in the past few chapters a close relationship between the lexical choice and the textual pattern but also in places a tension between the two. The simple fact cannot be escaped that we do not think of a word and then start uttering, drawing as we progress on all the primings at our disposal. Self-evidently, we instead mostly start with a communicative need, whether that need is an expression of sympathy or a brief apology, an attempt to be vague or pass a turn back to our listener, or an extended act of informing and persuading, such as this book.

There seem to be two fundamental dynamic processes involved in the production of spoken or written interaction. One has been the subject of this book. Every lexical choice starts off a series of options and predilections that result in an amazing fluency in any situation in which the speaker has been primed to perform. The other is the discoursal, and this is the process whereby we decide that we shall speak or write and what we want to say. Sinclair (1996, 2004) refers to semantic prosody as the outcome of all the choices a speaker or writer makes; what I am here envisaging is the obverse of this – the initial impulsion to inform, contradict, praise and so on. If the semantic prosody matches the original intention, presumably the speaker/writer is satisfied.

We have therefore to assume that the discoursal impetus and the lexical priming are interconnected but not coterminous. We also have to assume that primings are stored two-way. If sorry is primed in different combinations to occur as part of an expression of sympathy or as an apology, so also feelings of sympathy and apology must be primed to elicit sorry. If and things is primed to indicate vagueness (Channell 1994) or to indicate willingness to pass a turn (Duncan and Fiske 1977), vagueness and turn-ending must be primed to elicit the words and things. If the words In this book, with which I began the first chapter of this book, are primed for text-initial position in an academic monograph, then the need to start an academic monograph must be primed to elicit words such as In this book. What it is that is primed I have no clear notion of; certainly a human need or discourse context is altogether less precise than a stretch of letters or sounds. But there must be some such two-way priming for all the other primings to be available for activation in communication.
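
The idea of two-way storage can be pictured, very crudely, as a pair of look-up tables, one running from word sequences to the communicative needs they are primed for and one running back again. The sketch below is an illustration only: the table contents come from the three examples just given, while the informal need labels and the names WORD_TO_NEEDS, NEED_TO_WORDS and invert are hypothetical. Nothing here is meant as a claim about how primings are actually held in the mind.

```python
from collections import defaultdict

# Forward direction: word sequence -> communicative needs it is primed for.
# The pairings follow the examples in the text; the labels are informal.
WORD_TO_NEEDS = {
    "sorry": ["sympathy", "apology"],
    "and things": ["vagueness", "turn-ending"],
    "In this book": ["starting an academic monograph"],
}


def invert(word_to_needs):
    """Build the reverse direction: communicative need -> word sequences it elicits."""
    need_to_words = defaultdict(list)
    for word, needs in word_to_needs.items():
        for need in needs:
            need_to_words[need].append(word)
    return dict(need_to_words)


# Reverse direction, derived mechanically from the forward table.
NEED_TO_WORDS = invert(WORD_TO_NEEDS)

if __name__ == "__main__":
    print(NEED_TO_WORDS["apology"])
    print(NEED_TO_WORDS["turn-ending"])
```

Deriving one table from the other is a simplification adopted for brevity; the argument in the text requires only that both directions be available, not that they be mirror images.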

In 1991 I proposed the model of language shown in Figure 8.1 to account for the lexical and textual phenomena I had been describing (Hoey 1991a). The claim I was making was that lexis and text were organised rather than structured and that phonology, syntax and interaction were structural systems needed to act as interfaces between phonic substance and lexis, between lexis and text and between text and the extra-textual context. I want now to modify this position, while retaining much of what was there posited. I would argue that encounters with stretches of phonic substance prime us to accept and produce morphological and syllabic combinations that produce our words. Morphology therefore seems to function as an interface between lexis and phonic substance, though the processes of priming are the same. Phonology on the other hand is perhaps better seen as an abstraction from the primings associated with stretches of phonic substance, of a similar theoretical status to syntax, in that our phonology may have inconsistencies but has a regularising function. Similarly, encounters with words and word sequences result in our being primed to produce acceptable sentences and texts, with grammar being a product of these primings with a regularising function. Thus far my present position differs only slightly from that posited in 1991, though clearly the notion now is of phonology and syntax as products rather than interfaces. However, I would now see text as an interface between lexis and the discoursal need in the extra-textual context, with discourse structure being, again, a product rather than an interface (see Figure 8.2).

Figure 8.1 A map of the interlocking of linguistic levels (taken from Hoey 1991a: 213) [diagram not reproduced; surviving labels: phonic substance, extra-textual features]

However we model the relationship (and I am not wedded to this particular representation), it is clear that we need both the impetus of lexical priming and the representation of discourse need. Wang and McCarthy (2004) note that the principles of reduplication and repetition are amenable to being explained in terms of general probabilistic trends (priming in my terms) (Wang, forthcoming), but that the exceptions are explicable only when the larger organisation of the texts in which they appear is taken into account. I suspect that the motivation for overriding a priming may often be a textual pressure (though, as we have seen, these are themselves subject to description in terms of priming).

Figure 8.2 An alternative mapping of linguistic levels [diagram not reproduced; surviving labels: discourse need in extra-textual context, lexical primings, phonic substance, discourse structure, grammatical structure, phonological structure]

Priming prosody

The model presented as Figure 8.2 appears to leave one question glaringly unanswered. If syntax/grammar is not an interface between lexis and text, how do individual primings combine to create text? The answer I shall offer in this section is that of what I term ‘priming prosody’.

Throughout this book, I have been arguing two positions, though at times one or other of them has dominated. The first is that every word is characteristically primed for a range of genre, domain and situationally-specific features, which cumulatively account for, and contribute to, what have traditionally been treated as the syntax, semantics, pragmatics and discoursal features of a language. The second is that this priming is individual for each user of the language and therefore the characteristic primings of a word, as reflected in a corpus, need not be any particular user’s primings. But the alternative to the characteristic primings is not an absence of priming (except of course when a word is being encountered for the first or second time). Taking into account the broadening of the concept of priming in this chapter, we can assume that every user is primed, more or less specifically, to use every familiar word in every familiar situation in particular ways, and that creativity, in the Chomskyan sense, can be explained in terms of the more general primings, just as naturalness can be explained by the more specific primings. There is, however, one step still to be taken and that is to consider how the primings interact. For that we need the notion of what I term priming prosody (referred to in Hoey 2004b, 2004c, as colligational prosody). Priming prosody is not the same as semantic prosody (Sinclair 2004) in that Sinclair’s concept refers to the meaning outcome of the choices made in an utterance. Priming prosody is concerned with the processes of utterance construction rather than utterance construal, though it is assumed that the two concepts are profoundly related. Priming prosody occurs when the collocations, colligations, semantic associations, textual collocations, textual semantic associations and textual colligations of words chosen for a particular utterance harmonise with each other in such a way as to contribute to the construction and coherence of the utterance. This can be illustrated by reference again to the Bill Bryson sentence we have been examining on and off throughout this book. Figure 8.3 lists in columns (some of) the characteristic collocations, colligations (etc.) of (some of) the component words of the first clause of this sentence. Some of the colligations and other associations listed have not been discussed in any detail, but they are all demonstrable. Between the columns will be found lines of various kinds indicating where words in the clause are likely to share a particular priming, based on corpus evidence. These are claimed to be, for most users, the likely priming prosodies implicated in the construction of this clause.

There are four such prosodies indicated in Figure 8.3. Firstly, we have prosody of collocation with BE, indicated by a straight line. Secondly there is prosody of colligation with relational process, indicated with dotted lines. Thirdly we have prosody of semantic association with PLACE NAME, indicated by a line of dashes. And finally, there is the rather specific prosody of more common/less common PLACE NAME, indicated by a thicker line.

I have indicated that the ‘more common place name/less common place name’ prosody is more specific than the others. But it is of course possible, as indicated by previous discussion in this chapter, to have more general prosodies. So, for example, as indicated in Figure 8.3, winter colligates with PREPOSITION, but of course PREPOSITION also colligates with NOUN and winter is primed to occur as NOUN. This therefore is also an instance of priming prosody. The same more general priming prosodies can be shown to operate across the clause (and indeed all clauses). Figure 8.3 therefore represents the priming prosodies that contribute to the naturalness of the clause. A separate figure would be needed to show the full range of prosodies that contribute to its acceptability as a clause of English. Such a figure would be complex but not in principle difficult to create, and it would carry information of the kind handled traditionally by grammars.

Figure 8.3 The priming prosodies that bind the colligations etc. of Bill Bryson’s first clause [diagram not reproduced; columns: winter, Hammerfest, thirty-hour ride, Oslo, each listing collocations, colligations and semantic associations, with connecting lines marking shared primings]

Priming prosody is claimed therefore to integrate the kinds of corpus-driven insights discussed in this book and the more abstract descriptions provided elsewhere. To take just one instance, grammarians have disputed whether the head of a noun phrase is (normally) a noun, the standard position adopted (e.g. Halliday 1994; Biber et al. 1999), or whether the head is really the determiner (the ‘determiner phrase’ hypothesis) (Abney 1987 cited by Rappaport 2001). Such a dispute can be resolved, I suggest, by recognising that each of the members of the class of determiners is primed to occur with words that are primed to function as nouns and that such ‘noun’ words are primed to occur with members of the class of determiners. (The priming prosody thereby created both helps define what counts as a noun and as a determiner in a language.) With the notion of priming prosody, we leave the Chomskyan concept of creativity and turn to the kinds of creativity that draw attention to themselves – the second and third kind of creativity discussed in the first section of this chapter.



9 Lexical priming and other kinds of creativity

Language that surprises

In the previous chapter, we were concerned with the kinds of creativity that go unnoticed – the natural ability of any language user to produce utterances that are novel. It was argued there that lexical priming provided an adequate account of such creativity. But as we noted, there are other kinds of creativity, where language is used in startlingly novel ways, whether by accident or design. One of these kinds of creativity occurs when someone says or writes something that surprises the recipient, whether because of its incongruity, humour, wordplay or simple oddness. Carter (2004) has shown that linguistic creativity of this kind is deeply embedded in the ordinary language practices of non-literary users of English (if the word ‘ordinary’ can be used of a facility that is always special, however often it is used). A theory of the lexicon must have something to say about such language.

It also needs to be able to say something about the special – and specially valued – creativity evident in the writing of literary writers. Bill Bryson’s creativity can be handled along the lines discussed in previous chapters, but what do we do with the language of writers such as Gerard Manley Hopkins, Dylan Thomas and James Joyce – writers some of whose utterances would not be covered by traditional generative grammars? As I said in Chapter 8, if a linguistic approach cannot say something interesting about literary language, there is something wrong with the approach. Accordingly this chapter concerns itself with surprising language.

Unintended creativity

For a short time, the Guardian had as one of its weekly supplements a magazine called The Editor and a regular feature of this magazine, known as ‘Our readers’ reads’, was a selection on the back page of cuttings from local newspapers (and elsewhere) submitted by readers who had been surprised and amused by the way language had been used. These were supplemented by comments from the Guardian which often highlighted and made more amusing (and visible) the original oddity. One such page included the following:

    An interesting cultural exchange from the Stevenage Comet, Jan 11
  2. Saturday, 13th January
     FAREWELL NIGHT
     Live Music
     BACKDOOR MEN
     Featuring Jeff Fuller and the Tucker Sisters plus Friends
     with Special Guest
     EDDIE MARTIN
     A NIGHT TO BE MISSED
     Free entry – 8 pm start
    Put your diaries away. From the Post & Weekly News, Jan 11

Zero tolerance in Bridlington. From the Free Press, Jan 11

All of these are, for different reasons, instances of surprising English or they would not have been included in the column, and all under different circumstances might have been produced deliberately for the purposes of humour or wry comment.

Priming conflict

I want to suggest in this section that just as priming prosody contributes to the apparent naturalness of an utterance, so a lack of prosody may contribute to the apparent unusualness of an utterance. One manifestation of a lack of priming prosody is priming conflict, which occurs when a choice of one priming is overwhelmed by another, more dominant, priming. The result here is either ambiguity or humour.

We saw in Chapter 8 that priming begins with the phonological string. It is also possible (and natural) for orthographic strings to be primed, and it is probable that the two types of priming reinforce each other. In the case of example 1, seen from the former perspective, the phonological string /ræb/ has a range of possible primings, + /aɪ/, + /ɪt/ and + /i:/, and the wrong one has been chosen. Seen from the latter perspective, Rabb has the possible primings i, it (as in Peter Rabbit) and ie. Either way, we have priming conflict. In my corpus, the combination Rabb + i is far more common than Rabb + ie when associating semantically with NAME. There are 323 cases of Rabbi + NAME in my data but only seven cases of Rabbie + NAME, of which three are Rabbie + Burns. The dominant priming of Rabbi + NAME has therefore been chosen, when the less dominant priming of Rabbie + Burns needed to be conformed to. Although the example is offered by the Guardian as humorous, it is worth noting that if the writer had been James Joyce, the collision would have been applauded.
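The comparison of competing primings just described can be sketched computationally. The fragment below is only a minimal illustration and not the procedure used to obtain the counts above: the toy sentences, the function name count_continuations and the use of a following capitalised word as a crude stand-in for NAME are all assumptions introduced for the example.

```python
import re
from collections import Counter

# Hypothetical toy corpus, invented for illustration only.
TOY_CORPUS = [
    "Rabbi Cohen spoke at the meeting.",
    "The speech by Rabbi Levy was broadcast.",
    "Rabbi Stern was interviewed on the radio.",
    "A supper in honour of Rabbie Burns was held.",
]

def count_continuations(sentences, stems):
    """Count how often each stem is directly followed by a capitalised word.

    The capitalised word is an assumed, very rough approximation of NAME.
    """
    counts = Counter()
    for sentence in sentences:
        for stem in stems:
            # \s+ after the stem prevents 'Rabbi' from matching inside 'Rabbie'.
            pattern = r"\b" + re.escape(stem) + r"\s+[A-Z][a-z]+"
            counts[stem] += len(re.findall(pattern, sentence))
    return counts

counts = count_continuations(TOY_CORPUS, ["Rabbi", "Rabbie"])
# The more frequent combination is taken here to be the dominant priming.
dominant = max(counts, key=counts.get)
print(counts, dominant)
```

On a real corpus the same comparison would of course need a more careful definition of NAME than capitalisation alone.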

Note that what is likely to be primed as dominant will vary according to who you are and where you live. For a nationalistic Scot living in Edinburgh, Rabbie + NAME is likely to be the dominant priming. The example, however, comes from Stevenage and one might guess that at least some people of non-Scottish descent in the Stevenage area will not have that priming. For them Rabbi will be the more likely combination (irrespective of their faith). We have here priming conflict where the dominant priming may have inappropriately shouldered out the less dominant, but in this context more appropriate, priming.

Example 2 also manifests priming conflict. The combination to be missed occurs 81 times in my data, of which almost 75 per cent are instances of not to be missed. The semantic set associated with the latter phrase is ‘opportunity’ – opportunity not to be missed, chance not to be missed, CD not to be missed etc. However, there are no instances of a night not to. Instead, there are 16 instances of a night to (excluding references to the film A Night to Remember), 13 of which occur, like not to be missed, as part of a positive evaluation. The absence of a night not to as a priming results in a priming conflict between the desire to reproduce a night to and the need to combine not with to be missed.

The third example is perhaps the most interesting. The semantic association of man with + WEAPON is a strong priming for Guardian writers, there being 78 instances of the combination in my data. Of these 91 per cent refer to a man holding a weapon, suggesting that the comical reading of example 3 is the natural one. This impression is reinforced by three of the remaining seven instances, which do indeed refer to cases where the man is being attacked with the weapon but have clear disambiguating features. These instances are

. . . killing a man with his bare hands

. . . a black man with his truncheon

. . . a man with the assault rifle

The three examples are all disambiguated by the clash between the marker of definiteness that precedes the weapon and the marker of indefiniteness that precedes man. The expectation is that they will either both be definite or both indefinite (or more rarely, definite followed by indefinite).

A fourth instance is disambiguated by the postmodification of knife:

. . . the man with the knife in his throat

which shows the man to be, in a way, holding the knife but not as a weapon.

So how did the sentence come to be produced, given this typical priming of man with + WEAPON? The answer lies in a phenomenon we have encountered several times before, where an exception to the dominant priming exists. When we looked at army in Chapter 6, we saw it had a particular cohesive priming except where it was in combination with of, in which case this priming disappeared, unless again it was an instance of of + occupation or NUMBER, in which case the priming reappeared. It would appear that the same phenomenon is in operation in this instance. The combination HUMAN + with a . . . carving knife is not primed the same way as man with + WEAPON. Of 24 instances in my data, only two refer to the case of a person holding a carving knife. So we have here a clear conflict between two primings, with the dominant priming overwhelming the less frequent one and producing the comic combination. Interestingly, it takes some linguistic skill to avoid the conflict.

So much for the second type of creativity. Clearly I have only scratched the surface of explaining why some sentences surprise us, but I hope I have at least demonstrated that lexical priming has something useful to say about such cases.

Literary creativity

We are left with the third kind of creativity, the kind associated with high literary endeavour. To illustrate the way lexical priming might be used to explain such creativity, I have chosen three types of text: a piece of non-fictional prose by Charles Dickens, part of a poem by Lord Tennyson and the beginning of a poem by Dylan Thomas. The first is immediately recognisable as being by Dickens, despite its being a piece of travel writing, and my aim here is to show how Dickens has created a stylistic effect by overriding a textual collocational priming. The second is an immediately intelligible piece of nineteenth-century verse that regularly crops up in anthologies of English poetry; my aim with this is to show that lexical priming permits an explanation of why one line in it is found by many to be so memorable. The third is a piece of very difficult writing, on the very borders of intelligibility and grammaticality; here my intention is to show how even so the writer makes use of (what are assumed to have been) his lexical primings. The three extracts are intended to cover the gamut of literary creativity from language barely distinguishable from non-literary language to language barely comprehensible.

A passage from Charles Dickens’ Pictures from Italy

The passage from Dickens I want to comment on is a very long paragraph taken from one of his travel books (appropriately enough, given the focus elsewhere in this book on travel writing), namely Pictures from Italy. The punctuation, which does not conform to twenty-first century practice in places, is Dickens’ own:

One day we walked out, a little party of three, to Albano, fourteen miles distant; possessed by a great desire to go there by the ancient Appian way, long since ruined and overgrown. We started at half-past seven in the morning, and within an hour or so were out upon the open Campagna. For twelve miles we went climbing on, over an unbroken succession of mounds, and heaps, and hills, of ruin. Tombs and temples, overthrown and prostrate; small fragments of columns, friezes, pediments; great blocks of granite and marble; mouldering arches, grass-grown and decayed; ruin enough to build a spacious city from; lay strewn about us. Sometimes, loose walls, built up from these fragments by the shepherds, came across our path; sometimes, a ditch between two mounds of broken stones, obstructed our progress; sometimes, the fragments themselves, rolling from beneath our feet, made it a toilsome matter to advance; but it was always ruin. Now, we tracked a piece of the old road, above the ground; now traced it, underneath a grassy covering, as if that were its grave; but all the way was ruin. In the distance, ruined aqueducts went stalking on their course along the plain; and every breath of wind that swept towards us, stirred early flowers and grasses, springing up, spontaneously, on miles of ruin. The unseen larks above us, who alone disturbed the awful silence, had their nests in ruin; and the fierce herdsmen, clad in sheepskins, who now and then scowled out upon us from their sleeping nooks, were housed in ruin. The aspect of the desolate Campagna in one direction, where it was most level, reminded me of an American prairie; but what is the solitude of a region where men have never dwelt, to that of a Desert, where a mighty race have left their footprints in the earth from which they have vanished; where the resting-places of their Dead, have fallen like their Dead; and the broken hour-glass of Time is but a heap of idle dust! Returning, by the road, at sunset! and looking, from the distance, on the course we had taken in the morning, I almost feel (as I had felt when I first saw it, at that hour) as if the sun would never rise again. But looked its last, that night, upon a ruined world.

It would be very possible to devote a full chapter to exploring the way Dickens exploits and overrides his and our (presumed) primings. But, apart from the obvious pressures of space, one problem presents itself in the analysis of this and the next two texts that makes it undesirable to spend too long on them. I have stressed repeatedly throughout this book that primings are not in principle generalised across all text types, genres and domains (though no doubt some are, such as primings for grammatical category). Still less can they be trusted to apply across centuries. Dickens published Pictures from Italy in 1846; his primings and those of his readers can certainly be assumed to be significantly different in respect of travel writing from those of Bill Bryson and his readers and informed by a quite different set of previous reading experiences both of travel writing and of other kinds of text. It will be remembered that the specificity of primings with regard to genre, domain and the like was particularly noted in respect of textual primings. Yet it is the textual priming that Dickens has, I believe, overridden that I want to focus on. Clearly caution is required.

I want to attend to Dickens’ use of ruin. There are (at least) two uses of ruin available to the language user – one is evaluative, where ruin is a way of saying that something (a building, an economy, a plan) is in a bad way; the other is descriptive, where a ruin may be a tourist attraction and deemed beautiful. The first use is negatively primed with regard to cohesive chains. I examined 40 texts containing this use of ruin, and none contained cohesive chains. The second use is also negatively primed for cohesive chains, but they do occasionally occur. I examined 39 texts containing the second sense of ruin (I couldn’t find a fortieth in my corpus), and there were two chains, both in the context of tourism. Dickens has therefore – according to contemporary primings in a different type of writing – overridden the typical priming for ruin and created a substantial chain. More crucially, the chain begins with the descriptive use and quickly turns into a chain of the evaluative sense. The former in my data, as just noted, chains rarely; the latter chains not at all. I repeat the passage with the two senses orthographically marked out, the descriptive being italicised and the evaluative being emboldened. Obviously there is an element of judgement here, but the priming of the two senses informally permits the distinction to be made safely in a number of cases. So, while a ruin in my data can be either evaluative or descriptive, the ruin is used more often in the descriptive sense and ruin (without an article) is always evaluative. The word ruined is also predominantly evaluative.

One day we walked out, a little party of three, to Albano, fourteen miles distant; possessed by a great desire to go there by the ancient Appian way, long since ruined and overgrown. We started at half-past seven in the morning, and within an hour or so were out upon the open Campagna. For twelve miles we went climbing on, over an unbroken succession of mounds, and heaps, and hills, of ruin. Tombs and temples, overthrown and prostrate; small fragments of columns, friezes, pediments; great blocks of granite and marble; mouldering arches, grass-grown and decayed; ruin enough to build a spacious city from; lay strewn about us. Sometimes, loose walls, built up from these fragments by the shepherds, came across our path; sometimes, a ditch between two mounds of broken stones, obstructed our progress; sometimes, the fragments themselves, rolling from beneath our feet, made it a toilsome matter to advance; but it was always ruin. Now, we tracked a piece of the old road, above the ground; now traced it, underneath a grassy covering, as if that were its grave; but all the way was ruin. In the distance, ruined aqueducts went stalking on their course along the plain; and every breath of wind that swept towards us, stirred early flowers and grasses, springing up, spontaneously, on miles of ruin. The unseen larks above us, who alone disturbed the awful silence, had their nests in ruin; and the fierce herdsmen, clad in sheepskins, who now and then scowled out upon us from their sleeping nooks, were housed in ruin. The aspect of the desolate Campagna in one direction, where it was most level, reminded me of an American prairie; but what is the solitude of a region where men have never dwelt, to that of a Desert, where a mighty race have left their footprints in the earth from which they have vanished; where the resting-places of their Dead, have fallen like their Dead; and the broken hour-glass of Time is but a heap of idle dust! Returning, by the road, at sunset! and looking, from the distance, on the course we had taken in the morning, I almost feel (as I had felt when I first saw it, at that hour) as if the sun would never rise again. But looked its last, that night, upon a ruined world.

What Dickens has done here is create a cohesive chain in contradiction of the negative priming of both senses of the word for chaining. Furthermore, a glance at the passage above shows that the evaluative sense dominates; yet this is a travel text and what little chaining does occur with ruin occurs in travel writing with the other sense. This points to the fact that he has mingled the senses in a single chain in defiance of normal cohesive practice. Most lexical items are primed cohesively to avoid cohesive links between their polysemous senses. This, like all primings, can be overridden, as Dickens shows.
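The density of the chain can be made visible with a very simple count. The sketch below is an assumption-laden illustration and not the method used for the analysis above: the helper chain_links is hypothetical, it treats every repetition of ruin, ruins or ruined as a potential link, and it makes no attempt to separate the evaluative from the descriptive sense. The excerpt consists of phrases taken from the passage quoted above.

```python
import re

# Phrases excerpted from the Dickens passage; the selection is illustrative.
EXCERPT = (
    "long since ruined and overgrown. "
    "and heaps, and hills, of ruin. "
    "but it was always ruin. "
    "but all the way was ruin. "
    "ruined aqueducts went stalking on their course along the plain; "
    "were housed in ruin. "
    "upon a ruined world."
)

def chain_links(text, forms=("ruin", "ruins", "ruined")):
    """Return (token_position, form) pairs for every occurrence of the forms.

    Simple repetition is only a rough proxy for a cohesive chain.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    return [(i, tok) for i, tok in enumerate(tokens) if tok in forms]

links = chain_links(EXCERPT)
# Article-less 'ruin' is the form described above as always evaluative.
bare = [form for _, form in links if form == "ruin"]
print(len(links), len(bare))
```

A fuller treatment would need to inspect the word preceding each link (a, the or nothing) before assigning a sense, since that is the distinction the discussion above relies on.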

Dickens’ strategy here is typical of creative writing. In the first place, he has maintained the vast majority of the primings associated with ruin. Each individual sentence uses ruin in a not untypical manner, except perhaps for a couple of cases where one’s priming might have predicted a ruin in place of ruin. The creativity lies in an act of overriding a single, important priming. Secondly, he takes a priming associated with one sense (in this case the tentative weak priming of ruin (= tourist attraction) for cohesive chaining) and uses it with a different sense.

Two lines from Tennyson

The creativity of the novelist usually involves undramatic deviations from the dominant primings (though of course James Joyce shows that this need not be the case). Poetry on the other hand routinely involves more notable deviations. A question that arises here is how memorability is achieved in poetry. It is rare for sentences of prose works to be recalled, though religious texts are an exception. The lines of poems are, however, more frequently remembered. One poem that is often anthologised and is remembered by many older British readers is ‘The Charge of the Light Brigade’ by Lord Tennyson in which he celebrates the crazy heroism of the Light Brigade who rode to their deaths as a result of a mistaken order.

I want to focus on just two lines from the poem:

Theirs is not to reason why
Theirs is but to do and die

So well-known are these lines that when Cecil Woodham-Smith wrote her celebrated book on the military disaster she named it The Reason Why.

The first point is that their + NOUN, when Subject of its clause, is primed to collocate with BE to. The sequence their NOUN BE to, where the noun is head of its own group functioning as Subject, occurs 87 times in every 10,000 instances of BE to. The sequence their NOUN BE occurs only 21 times in every 10,000 instances of BE. The same priming is true of his + NOUN, my + NOUN and so on, leading to the more abstract priming of POSSESSIVE + NOUN for BE to. This combination accounts for 5 per cent of all instances of BE to (whereas the combination POSSESSIVE + NOUN + BE accounts for 1.4 per cent of instances of BE). There are on the other hand no instances of theirs occurring with is to (apart from a single quotation from Tennyson’s poem). Tennyson has therefore complied with the priming of POSSESSIVE + is to but overridden the expectation of a noun.
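The strength of this priming is easier to appreciate if the two pairs of figures are set against each other as ratios. The following lines are a small worked calculation using only the figures reported above; the function name rate_ratio is an assumption introduced for the illustration.

```python
def rate_ratio(rate_a, rate_b):
    """Return how many times larger rate_a is than rate_b."""
    return rate_a / rate_b

# their NOUN BE to: 87 per 10,000 instances of BE to;
# their NOUN BE:    21 per 10,000 instances of BE.
their_ratio = rate_ratio(87 / 10_000, 21 / 10_000)

# POSSESSIVE NOUN before BE to: 5 per cent; before BE: 1.4 per cent.
possessive_ratio = rate_ratio(5.0 / 100, 1.4 / 100)

print(round(their_ratio, 1), round(possessive_ratio, 1))
```

On these figures the possessive pattern is roughly four times as frequent, proportionally, before BE to as before BE in general.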

The second point concerns to reason why. It is notable that Cecil Woodham-Smith’s book was called The Reason Why, not To Reason Why. As we saw in Chapter 5, a why clause often follows reason (= cause). Here however it has been yoked to the verbal use, whose sense is closer to reason (= rationality, logic). Once again, then, we have a blend of conformity to a characteristic priming with a notable overriding of a facet of that priming.

A few words from Dylan Thomas

Dylan Thomas wrote slowly and his texts are typically full of sentences that would be excluded from grammars. Here are the first two lines of Thomas’ poem ‘A grief ago’:

A grief ago,

She who was who I hold, the fats and flower,

You will be disappointed perhaps to learn that I wish to look at the use of just one word in these lines, which however I offer as indicative of how I would approach the rest of the poem. The word is ago, which we have already looked at in Chapter 6. The word sequence a grief ago has been discussed in the literature in connection with its use of grief and I want to place that choice in a larger context. The word ago has the following typical primings, offered for each of the types of priming we have considered in this book, presented here without evidence (but see Hoey 2004a):

  1. ago is primed for collocation with years, weeks and days;
  2. it is primed for semantic association with units of time, e.g. six weeks ago, a minute ago;
  3. it is primed for semantic association with measurement, e.g. a, six, forty;
  4. it is primed for pragmatic association with statements rather than questions or instructions;
  5. it is primed for colligation with Adjunct function;
  6. it is negatively primed for cohesion;
  7. it is primed to appear in contrast relations;
  8. it is primed for paragraph-initial position, when it is sentence-initial;
  9. it is primed for text-initial position, when it is sentence-initial.

Dylan Thomas has in fact conformed to six of these primings and has overridden only three. He has coupled a marker of measurement (a) with ago, incorporated it into a statement, used it in an Adjunct, not employed it in any cohesion (it will be remembered that cohesive links between title and text were systematically excluded from consideration in our exploration of cohesive priming), placed it at the beginning of a verse (which I take to be the nearest equivalent we have to a paragraph) and begun his text with it. The primings he has overridden are the first two (and the semantic association can be seen as a generalisation out of the collocations) and the seventh. In short, even when writers are straining at the limits of what a language is capable of expressing, they make use of more of their primings than they reject. A sentence that overrode all its reader’s primings would only be a sentence by virtue of starting with a capital letter and ending with a full stop; it would not correspond to anything recognisable as an instance of language in use.


10 Some theoretical and practical issues

Cracks in the priming

The notion of priming in this book has largely been discussed impersonally, as if it were simply a property of the language. As shorthand, I have in places talked of words being primed, and only the tell-tale words ‘typically’ and ‘characteristically’ have hinted at the personal and individual history that lies behind the apparent property of the word. In fact, though, as I hope was made clear in Chapter 1 and sporadically elsewhere, priming is what happens to the individual and is the direct result of a set of unique, personal, unrepeatable and humanly-charged experiences. Words come at us both as children and as adults from a plethora of sources. Parents, caretakers, friends, teachers, enemies, strangers (friendly and scary), broadcasters, newspapers, books, cards, letters, fellow pupils or colleagues – all at different times and to different degrees contribute to our primings.

The contexts in which we encounter lexis contribute to the way it is primed for us and we are in turn and as a result primed to use such lexis in these contexts. But we must not be crude about context. We have enough control of the situation in which we speak to use words as if we were in a particular context, thereby contributing to the creation of such a context (or of a different context that references such a context). Likewise, the kind of context we are in may be felt to drift during the course of our talk and the talk we produce and hear will both have contributed to the drift and be affected by it.

Inevitably, not all of the data we receive from such complex sources will provide a single picture, nor will we feel the same way about the sources of the data. The language of an enemy will prime us differently from the language of a friend, simply because we may not wish to emulate the enemy and may wish to show solidarity with the friend. And two friends, or members of our family, may use a particular word differently, resulting in a conflict in the primings.

Many conflicts are unlikely to create communicative problems; a word may after all have a number of collocates and semantic associations, and it is only if the latter directly contradict each other that a problem is likely to arise for the user. A particular point of conflict (a crack) in the primings, however, would seem to occur as a result of the grammar that each individual is building up out of the primings and which is hypothesised (in Chapter 8) to act self-reflexively on the primings, regularising them and generalising from them.

Cracks, briefly mentioned in Chapter 1, occur when conflicting data about the use of a word or word sequence is received and the language user can find no way of resolving the conflict. The cracks created by the self-reflexive grammar are less overt than those from outside, particularly from education (though the former may be affected by education, which often attempts to model the individual’s internalised and personal grammar). As an example of a crack caused by the self-reflexive application of grammar on the primings that gave rise to it, consider the cases of me and you and me and X. For the great majority of speakers of English I is strongly primed to occur in Subject function and me is strongly primed to avoid such a function, and, coupled with similar colligational primings for we, she, he and they, this leads to a grammatical inference about an aspect of the operation of the pronoun system. But for many of these speakers, me will also be primed to collocate with and and you and to have a semantic association with NAME, as in me and my friend Alastair. If therefore there is a need to use one of these expressions in Subject function, there is a conflict between the two primings, which can only be resolved by one overriding the other. If the speaker remains unaware of the conflict, there need be no problem in this, though of course the resolution may make the speaker’s inferred grammar of pronouns leak a little more.

In the instance mentioned in the previous paragraph, the crack can be mended in one of two ways or not resolved at all. One way of mending the crack, and certainly the better, is to reserve me and NAME or me and you for certain social situations and particular types of genre/domain. So in casual conversation with me over a pint of real ale, my son used the word sequence Me and my friend Alastair as Subject; it is doubtful whether he would have used the same kind of expression at the press briefings he used to organise as part of his work. For him clearly there is no longer a crack. The other way of mending the crack is simply to treat one of the pairs of primings as receptive only (see Chapter 1 for discussion of receptive and productive primings). This however may result in tension at home or in the classroom.

Sometimes attempts at mending the cracks merely move the problem to another place. So if the me and you priming is treated as receptive only and is replaced by you and I, there is a risk that the latter will be the result of I being primed to collocate with you + and and not the result of the self-reflexive functioning of the grammar. If the more specific priming is not overridden by the more general colligational priming of I as avoiding all grammatical functions other than Subject, it will give rise to (parts of) utterances such as between you and I.

It will be recalled from Chapter 1 that cracks may also occur as a result of conflict between a speaker’s primings and someone else’s primings; sometimes the conflict is also between the speaker’s primings and the other person’s post-hoc (but strongly believed in) grammatical system. One of the places where this is particularly likely to happen is, as already mentioned, in the educational system, which introduces another form of self-reflexivity. Explicit input from the teacher, in particular the correction of writing and, sometimes, speech in the classroom, often produces conflict with the primings achieved at home. Indeed the conflict regarding you and me described in previous paragraphs is as likely to be provoked by a teacher as by the speakers themselves. Cracks can be resolved either by adjusting the original priming or by rejecting the educational challenge to the priming. Either way, the resolution of such conflicts can be painful. Of course the degree of conflict will vary from person to person. It is not unreasonable to suppose that the child whose home primings least conflict with the school primings will both respond most positively to language in the classroom and will suffer least from the need to resolve the conflict.

Worse than adjusting the original priming or rejecting the new attempt at priming is a permanent uncertainty about the priming, a codification of the crack, leading to long-term linguistic insecurity. Primings are, as was noted in Chapter 1, domain specific. Most cracks can be mended by assigning one set of primings to one domain or social context (e.g. family and friends) and the other set, whether the result of self-reflection or as a result of educational challenge, to a different domain or social context (i.e. education, science, the middle class etc.).

We can combine the notion of cracks in the priming with our recognition in Chapter 8 of the existence of phonological priming to explain how it is that children utter things they have never heard. Take for example the observation that children never hear goed but say it all the same. For the vast majority of speakers of English, the sounds [d] and [t] and the syllable [ɪd] are primed to have a semantic association with the very broad set of (different) words used to report actions, but the word go, although a member of this set, is not primed to collocate with these sounds. This creates a crack, which is worsened by the fact that, in stories, names and pronouns are typically primed to occur in narratives with a nested combination of action verb and [d], [t] or [ɪd]. As a way of mending the crack, the child allows the priming of the more common item [d] to override the negative priming of the less common item (go). This is because they will have much more data on [d], [t] and [ɪd] than on go, and because they are likely to be using GO in a narrative context in conjunction with a name or pronoun (both of which, as just mentioned, are primed in narratives to occur with an action word and [d], [t] or [ɪd]).

Harmonising primings

If the view of language put forward in this and the previous two chapters is correct, there are a number of implications, both for linguistics and for sociology. (We have already alluded briefly to the implications for psychology in Chapter 1.) The first is that grammar is less central to our understanding of the way language works. It is more than likely that many users of a language never construct a complete and coherent grammar out of their primings. Instead they may have bits of grammars, small, self-contained mini-systems that do not connect up but represent partial generalisations from the individual primings (cf. Hopper 1988, 1998).

Secondly, if each person constructs their language out of the primings acquired from a unique set of data, there can be no right or wrong in language (and no absolute distinction between native and non-native speaker, though the latter will have acquired their primings by strikingly different routes). When people claim that something someone else said or wrote is ungrammatical, they are really only claiming that the other person’s utterance is different from that predicted by their primings, and of course, hard as it is for anyone to admit, their primings have no special status. (Of course if there were no one whose primings predicted the utterance, that would be of significance, but demonstrating that this was the case would be almost impossible and in any case such utterances are likely to be rare, if we accept the point above about the dissolution of the distinction between native and non-native speakers.)

Built into the above implication is the assumption that everybody’s language is unique, because all our lexical items are inevitably primed differently as a result of different encounters, spoken and written. We have different parents and different friends, live in different places, read different books, get into different arguments and have different colleagues, and therefore there is next to nothing that is shared in the data on the basis of which words get primed for us. How is it then that speakers from the same linguistic community (however that is defined) are mutually intelligible?

I have already alluded to one of the mechanisms that ensure that the primings of different speakers harmonise – the makeshift grammars we are constructing (note the present tense and progressive aspect). We never in principle finish our grammar(s), though in old age it may be that new data are no longer processed and ossification of vocabulary and its primings may occur.

Self-reflexive harmonising only goes so far as an explanation of the consistency of primings across speakers. Every culture, I should like to suggest, has external harmonising mechanisms, whether these are oral mechanisms such as songs, proverbs, rituals, drama or folk tales, or written mechanisms such as sacred texts or best-sellers. Labov (1972) describes various harmonising mechanisms at work in New York inner-city black culture, including sounding and competitive narration, that appear to have contributed to the development of hip-hop, a contemporary harmonising mechanism for a generation, at least as regards receptive priming. The most important controlling mechanism, however, in the great majority of industrialised and large-scale cultures (and many non-industrialised and smaller cultures) is that of education. Mastery of a subject is mastery of the collocations, colligations and semantic associations of the vocabulary of the discipline, mastery, in fact, of the domain-specific and genre-specific primings, and the job of teachers is (among other things) to prime the learners’ vocabulary appropriately. The examination system in turn is designed (among other things) to verify that the vocabulary of the discipline has been properly utilised – i.e. appropriately primed. Looked at more critically, examinations also seek to ensure that only the examinees whose primings harmonise with those already in positions of influence or power are able to take up positions of influence or power themselves.

A second way in which cultures have sometimes attempted to harmonise their primings is through their shared literary and religious traditions. Where members of a culture share a faith, sacred texts have a harmonising effect. Literary traditions have traditionally been formulated in terms of a literary canon of ‘great writers’, and feminist complaints against the male-centredness of the canon are justified if we see harmonisation as the exercise of power over the language of others. Whereas we all have different experience of the spoken word, in theory we could all have a shared experience of the written word if we all read the same works. If we all read Lord of the Flies, the priming effects on us ought apparently to be similar in that the linguistic data are the same in all cases; actually, however, our previous experience of the lexis encountered will ensure that the priming effects differ in reality, however slightly. In any case, as education has ceased to be designed for an elite in western cultures and as fewer children in many of these cultures are brought up in a faith or at least the same faith, so the harmonising potential of shared traditions has diminished to the point where they have become unimportant. In the English-speaking world it is therefore only in isolated and/or self-contained cultures, and more importantly in sub-cultures within the larger cultures, such as religious denominations or university students of English, that this kind of harmonisation can have any impact.

The third way in which modern cultures harmonise the primings of a linguistic community is through the mass media, which are second only to education in this respect (and possibly have more importance for some speakers). But of course the primings may be different from those promoted by education and they are domain specific. There are also issues here with regard to receptive priming versus productive priming, which apply to the literary canon as well. A newscaster on TV is a member of a linguistic community that the great majority of listeners are unlikely to want to emulate, and the same applies to nineteenth-century novelists (though John Fowles in The French Lieutenant’s Woman sought to turn his receptive primings into productive ones).

Perhaps the least noticeable type of priming comes in the form of dictionaries and grammars. This is why there is always irritation whenever grammarians and lexicographers argue that their function is to describe, not prescribe. Such a posture is seen by those who instinctively recognise the need for harmonisation as a betrayal. The problem of course is that linguistic scientists find it hard to be linguistic legislators (and vice versa). Every time a new dictionary comes out, I am interviewed about it; and journalists always ask about whether certain new words should have been included. Dictionaries enshrine and enable a degree of harmonisation of priming, but they may contribute to cracks too.

The implications for learners of a second language

This book is offered as a contribution to the development of linguistic thought and as an attempt to ground recent corpus work in a theory sympathetic to the findings coming out of such work. I cannot forget, though, that from 1993 to 2004 I was director of a unit dedicated to the teaching of English as a second or other language. My colleagues and my students would be unforgiving if I did not, however briefly and inadequately, reflect upon the linguistic implications of what I have been saying for the language learner and the language teacher. In any case I have always advocated the bridging of theory and practice. So, if this is the theory, what should be the practice?

The first point to note is that the learning of a second language (L2) is necessarily a very different experience from learning a first one (L1) for a whole raft of reasons, all of which will be very familiar from the literature of second language acquisition. In the first place, when the vocabulary of the first language is primed, it is being primed for the first time. When the second language is learnt, however, the primings are necessarily superimposed on the primings of the first language. So where a bilingual dictionary, a course book or a teacher provides a single word translation of an item designated to be learnt in the second language, the learner will, I claim, immediately activate the primings from the first language. The semantic associations and colligations of the new word will be deemed to be the same as, or at least very similar to, those of the L1 equivalent.

The situation will be more complicated where the distinctions between L1 and L2 are erased. The interconnections between the primings of two languages being acquired alongside each other will need separate study. My discussion in the previous paragraph is not intended as an indirect endorsement of monolingual ways of viewing the world but as a representation of the simplest relationship between languages.

The transfer of primings from earlier to later languages is, I suspect, unavoidable, except where the learner learns through immersion and is never tempted by word-for-word translation (a rare set of circumstances indeed). For some purposes, transference of semantic associations and colligations is likely to be a productive and helpful strategy in the earlier stages of language learning, particularly where the cultural practices associated with the two languages are not overly different. In any case, if the use of semantic associations and colligations transferred from L1 results in the occasional alien utterance, this may be better than the paralysis that would presumably come from the inability to string two words together (to use an old metaphor with a meaning closer to the literal than usual). That said, the language teacher has at some point to decide when or whether to crack these primings. Certainly we can assume that a whole new class of false friends is implied by priming – words that mean exactly the same in the two languages but are primed differently for L1 learners as regards collocations, colligations and semantic associations. Certainly, also, we can assume that the practice of learning words in lists will aggravate this situation, in that a list both strips the words of all their primings and asserts their strict parallelism.

Despite the obvious differences that arise from learning a second or subsequent language as opposed to a first language, the distinction between a native and non-native speaker starts to evaporate when we recognise that we are all learners in some areas of our language and beginners in many others. There is not a set of agreed primings that a learner should acquire; priming is, after all, unique to the individual. Furthermore, as just noted, no L1 speaker is primed to deal with every situation they might encounter. My lexis is not appropriately primed to allow me to produce medical discourse, nor would I use the vocabulary of numismatics (coin collecting) appropriately (and in both areas there will be many lexical items for which I have no priming at all). There are in truth vastly more situations in which I am unprimed than there are in which my fluency will bring me the credit of competence. Even in areas where I can claim expertise, there are sub-areas in which my vocabulary is insufficiently primed to impress. (Phonologists and phoneticians may already have noticed this.) So-called native speakers are non-native in many contexts and all speakers are, according to the position I have been putting forward, in a permanent state of learning. (This is not to deny the existence of a critical period in L1 learning in which the principles and products of priming are laid down for the child and used as the basis of all subsequent priming.)

What distinguishes learners (or more accurately, types of learning) is not therefore whether they are native or non-native but how the primings come into existence. When a speaker is surrounded by evidence – all of it good, in marked contradiction of early claims made by Chomsky (e.g. 1965) – the primings get built up inductively at variable speeds. When however the speaker is not so surrounded, other strategies need to be used.

I am not primed in the lexis of numismatics, but I am primed in some of the lexis involved with philately (stamp collecting). Initially, I had no stamp collecting friends and I primed myself by reading an introductory book on stamp collecting. Each chapter described methods of handling and identifying stamps, with a clear glossary. I then bought a regular stamp magazine (Gibbons Stamp Monthly) which contains articles for beginners, again with terms explained. Finally I forced myself to read some arcane articles on printing methods, watermarking and the like. All the while, I collected stamps and checked them against a catalogue which used the terminology I was acquiring in ways that allowed me to put to the test whether I had understood. If I misunderstood, I might miss something of value or assign a stamp incorrectly, so there were practical consequences to misunderstanding. After a couple of years I then joined a local society and put my primings to the test in the production of conversation. Some of those I spoke with were fully primed in the lexis associated with this context while others, like me, were still in the process of acquiring the necessary lexical primings/knowledge, and in such a context the overlap between acquiring primings and acquiring knowledge is very considerable.

The processes I have just described as a learner in an area of my L1 hitherto unknown to me – you will notice that I hesitate to call it specialism, since one person’s specialism may seem bread and butter to another – seem similar to those undergone by many second or subsequent language learners. Such learners seek to acquire the primings associated with those areas of life in which they wish to seem competent and in which the effects of seeming incompetent may be practical as well as social. For each learner, this may vary. If I wish to tour a country, I may want the appropriately primed lexis for food, accommodation and directions, as coursebooks for time immemorial have assumed. I may on the other hand want to read academic texts in the language or engage in conversation about football. Crucially, as any language learner will confirm, the primings we have may serve us in one situation and fail us in another – as indeed is true for the learner of the philatelic lexis. The difference between the two types of learning and learner is one of degree, rather than one of kind.

Apart from the inevitable impact of primings from other languages (usually but not inevitably the first language) and from other contexts, the range of speakers who can prime us when we seek to acquire new primings in another language or another area of a language is radically impoverished compared with the range that will typically be encountered in a situation of immersion. Then, again, the social contexts in which we as learners encounter the language are not only similarly impoverished but are typically experienced in common with other people who have been primed to a similar degree as ourselves and whose own language efforts will, willy-nilly, prime us. Thirdly, the quantity of data from which we will be primed will be markedly less than any immersed learner is likely to encounter.

What are the implications of all this for language learning? It must in this context be recalled that priming is the result of a speaker encountering evidence and generalising from it. For me as a beginner in stamp collecting, the introductory text and beginner sections in the magazines were ways of shortcutting this process. Instead of learning what a perforation gauge was from constant encounters with the word sequence, a definition was provided along with instructions for use. The priming came not from frequent, necessarily unordered encounters but from a single focused and generalising encounter. In the same way, language teaching materials and language teachers can provide essential shortcuts to primings. This can happen in a multitude of ways. Usage notes, drilling exercises, texts or tapes with repeated instances of a word sequence, collocational observations and illustrations or just drawing a class’s attention to a feature all may speed up the process, as may judicious use of corpus-based monolingual dictionaries and corpus-based descriptive grammars. Lexicographers and grammarians exposed to concordances are effectively being given an accelerated priming. (Higgins and Johns 1984 and J. Willis 1998 have incidentally shown how such accelerated priming may be directly useful to learners, without the mediation of a dictionary or grammar.) If the lexicographer or grammarian reflects this ‘instant’ priming in their dictionary or grammatical entry, the entry offers an extremely valuable shortcut to a lexical item’s characteristic collocations and colligations.

If shortcuts to priming are provided, and for many learners the classroom and the teaching materials used in the classroom provide the only context for priming, it is essential that the primings are not unhelpful in the area in which they will be used. While it is not possible to say that any set of primings is correct and another incorrect, it certainly is possible to say that someone’s primings are not in harmony with those of their likely listeners or readers and that they will accordingly sound unnatural to them. The lexical approach (Lewis 1993) can ensure that primings are harmonious with those of most listeners and readers, and Willis (2003) likewise seeks to integrate grammar and vocabulary teaching in a manner that ought to result in naturalness in the learner. Woolard (2004) seeks to teach the collocations of the most common words in the language, inevitably thereby also integrating the grammatical and lexical. Many of the language teaching materials to which I have been exposed on the other hand have provided me with opportunities to acquire unhelpful primings. Unhelpful primings may result from a textbook’s overemphasis on certain features of the language (McEnery 2003), or from its fabricated illustrations of grammatical points. At best, unhelpful primings will result in cracks in the priming when the learner encounters authentic instances of the language away from the teaching context. This may lead to insecurity or distrust of the value of what has been learnt in the classroom. At worst, it may inhibit the development of helpful primings and stunt language growth.

Grammatical notes are a particular kind of shortcut. They may be beneficial if they are rooted in the lexis that gave rise to the grammatical observation but harmful if they are rooted in no characteristic lexis. A grammatical note on the present perfect is likely to be helpful if the pattern is discussed in connection with spoken sequences such as ’ve shown, ’ve found that and ’ve learnt that and written sequences such as have shown that, have found that and have learnt that, but may actively encourage unhelpful priming and result in unnatural output if discussed with no attention to key primings.

Shortcuts may be necessary, but, given the nature of priming for L1 learners, second or subsequent language learners should be exposed to authentic data wherever possible, and the data should both reinforce existing priming (i.e. by overlapping with previously encountered material) and permit new priming to take place. Krashen (1981) must be correct in talking of the need to give learners material at the threshold of their competence; beyond the threshold, priming will not take place – the learner will simply switch off. On the other hand, when the learner is comfortably on this side of the threshold, existing primings will be reinforced but no new ones created.

Authentic data of course come in two forms. If the data are written only, the learner’s primings will initially – and perhaps permanently – be of letter sequences. Effectively, learners have to have their lexis primed twice over, both as letter sequences and as sound sequences. I speak from experience here. I read adequately in several languages in which I have low competence as a speaker, mainly, I would argue, because my primings are only for the written word.

These seem to me the main implications for second language learning, though there are others. Several factors, however, seem in need of investigation. The first concerns the conditions under which priming takes place. I have throughout this book talked as if all encounters are equal, but we can assume that this is not so. The devout reader of the Torah, the New Testament or the Koran will presumably be more powerfully primed than the skim-reader of a newspaper found abandoned on a railway seat. Presumably, likewise, if the encounter is accompanied by some important effect or emotion, it will be weighted. We need to investigate the conditions under which priming takes place. It is hard to believe that all activities and all levels of (in)attention have the same effect. There are obvious implications for the possible effectiveness of different kinds of language activity.

A particular area for investigation is the relative effect on priming of production as opposed to reception. To what extent does an utterance produced by a speaker or writer reinforce (or even contribute to creating) that person’s primings? This needs to be discovered. If as I suspect the act of production is strongly reinforcing, it might follow that the learner needs to speak or write as often as possible.

On the other hand, if production reinforces a priming, what are the implications for the learner who repeatedly makes a mistake (defined as a word or sound sequence without support from any part of the linguistic communities in which the speaker wishes to mix)? Does the speaker likewise prime themselves with the mistake? Perhaps the ‘silent period’ sometimes referred to in the literature is one in which the learner waits to be primed by others (though those others may of course themselves be making mistakes).

A brief conclusion

It is not only in the area of language teaching that further investigation is needed. This book has barely alluded to language change. Lexical priming, given its individual nature and its genre and domain specificity, would seem to offer a dynamic mechanism for change worthy at least of exploration. Nor has the book said much about the early stages of a child’s acquisition of language, though again the implications are obvious. Intonation, too, needs to be addressed. Are certain words or word sequences primed to occur with certain pitches or tones? Or is intonation entirely independent, in which case it may belong with the discourse impulse in the map of levels I offered in Chapter 8?

Even where I have touched on a topic, my coverage has often been sparse and programmatic. Apart from the obvious point that my data in some cases barely permit more than the most tentative of speculations, I have been brief on how colligations lead to grammar and how creativity operates, and I have only touched upon the phonetic and phonological implications of the position I have been advocating, and that from a position of inadequate knowledge.

But books have to end. I have offered in this one a theory designed to build upon the work of corpus linguists but addressing some of the questions that non-corpus-based theoretical and descriptive linguists have found interesting. I have sought to be integrative rather than combative, and I hope that the book may provoke others to develop lexically-driven models.

I referred above to the possibility that priming might contribute to our understanding of language change. When I was a child, I was primed to expect narratives to have the words The End in text-final position, in a line of their own and always with initial capitalisation – a textual colligational priming. Along with probably everybody else, I am no longer so primed. But I am going to override both the temporal context within which my former priming operated and the restriction of the priming in that temporal context to narratives only. As I noted in the previous chapter it is possible to override one or more primings for creative purposes, and, after all, whatever its merits or demerits, this is a creative work.