Old English Glossaries: Creating a Vernacular [1]

Antonette diPaolo Healey

Dictionary of Old English
University of Toronto

CHWP B.36, publ. November 1997. © Editors of CHWP 1997. [First published in CCH Working Papers, 4 (1994) and in Dictionnairique et lexicographie, 3 (1995).]

Keywords: Old English, glossaries, Latin, bilingual dictionaries, information structure, vocabulary, logical order, alphabetical order, class glossary

The response of a reader in this post-modern age when confronting for the first time an Old English glossary is that of surprise. The strangeness and the novelty of what we survey arise from the disjunction between our expectations of what a glossary should be, and what, in fact, exists for this genre in the earliest period of English, between the seventh and the twelfth centuries. A glossary, as we all know, is 'a partial dictionary' (OED2 s.v. glossary1, main sense). We assume, almost without question, that it is an alphabetical word list pertaining to a particular subject matter (such as a glossary of medical terms) or to a specific piece of writing (such as a glossary to the books of the Bible). We also assume that its intent is illumination: to guide the reader from an unknown or opaque term (in either a foreign language or one's own) to a term which is familiar or, at least, comprehensible. As we explore the possibilities of creating Early Dictionary Databases, I wish to test our assumptions about dictionaries against the evidence of the Old English glossaries.

Size of the Glossary Corpus of Old English

The Dictionary of Old English Corpus in Electronic Form consists of three million running words of Old English to which are attached another two million running words of Latin. Old English glossaries constitute only 1% of this Electronic Corpus, 0.3 Mb in a 30 Mb corpus. For purposes of classification, the editors of the Dictionary have identified 143 texts out of a corpus of 3022 as glossaries.[2] The Old English items within each glossary range in number from a single and forlorn Old English gloss in a thicket of Old High German glosses to the more than 2000 Old English glosses in the Corpus Glossary (Wynn 1961-2), the largest glossary in our citation base, containing more than 8000 items, 75% of which are Latin-Latin glosses, and the 4096 items in the Antwerp Glossary (Kindschi 1955), the majority of which are Latin-Old English. These glossaries represent an interesting subset of our corpus, and they are significant in the history of the language: they are the first English attempts at compiling bilingual dictionaries, from Latin to Old English. By the end of the transitional period between Old and Middle English, when the effects of the Norman Conquest are fully reflected in the spellings and vocabulary of English manuscripts, scribes have become so adept at constructing glossaries that at least one trilingual glossary is produced. Written around 1200,[3] the trilingual glossary of Latin, Anglo-Norman, and English in MS Bodley 730 is in no way a distinguished piece of work -- one might, in fact, call it retrograde in its treatment of English.[4] Yet it reflects a continuing interest in the meanings of words and an expansion of focus from one source language, medieval Latin, to two target languages, Anglo-Norman and English. 
It is not until the work of the Tremulous Hand of Worcester in the thirteenth century that we find our first monolingual lexicographer of English, who attempted to define and interpret for Middle English readers the strange vocabulary and archaic spellings of Old English. What is of even more interest for the history of lexicography is the important discovery by Christine Franzen in 1991 that the Tremulous Hand of Worcester produced a glossary in the thirteenth century, which is the earliest known glossary arranged in alphabetical sequence not by Latin, but by English, word (Franzen 1991: 119-24).[5]

Types of Old English Glossaries

Although the Middle Ages would not itself have classified glossaries as "Research Tools", they were intended as aids to learning. The first question then to be asked of these dictionaries is how the information is made accessible. To those of us raised in the West on the Roman alphabet with its well-established sequence of letters, the fact of absolute alphabetization is an everyday and unexamined convention. Therefore, we are mildly surprised to find, when examining these early glossaries for the first time, that what is now the universally-accepted convention for dealing with long lists of words was not used. Lloyd W. Daly has suggested, in his analysis of alphabetization in the classical and medieval periods, that absolute alphabetization, a highly-refined system for filing a large number of words, could only take place if there were the notion of "slips" for ordering the material (Daly 1967: 86).[6] The use of slips or cards seems only to have come about once paper was abundantly and cheaply available. Daly was not able to find any positive evidence from the material culture for the use of slips until after the early Middle Ages, nor is there evidence from linguistic records of a proper word for "slip" in this sense in either Greek or Latin (ibid.: 86-7). Following Daly's lead, I searched our corpus for the Latin terms scheda 'strip of papyrus bark', schedula 'small leaf of paper', and their variant spellings, to discover their Old English equivalents. I found only five words: four of them, carte, tag, gewrit, and ymele, suggest either a piece of paper, or a document, or what is written; one, scrad, from its context as a gloss to Isidore 6.14.8, has the sense '(emended) part (of a text)'. There seems not to be any word in Old English which conveys the notion of "slip". According to the OED, the noun schedule in the sense of '[a] slip or scroll of parchment or paper containing writing [...] a short note' is not attested in English until 1397 (OED2 s.v. schedule, sense 1). This is confirmed by the MED in its entries for cedle and scedle. The noun slip in the sense of '[a] piece of paper or parchment, esp. one which is narrow in proportion to its length' is not attested in English until 1687 (OED2 s.v. slip sb.2, sense 10.a).

Class Glossaries

In the Old English period, there are two basic principles for ordering lists of words. The first is logical order where information is grouped according to various subjects, such as the names of birds, animals, trees, implements, etc. These are known as class glossaries, and although we can often find an individual category without great effort, it is sometimes difficult to search for individual items within a category, especially if it is large. Ælfric's Glossary is a well-known representative of this type. Ælfric issued his Glossary as a supplement to his Latin Grammar; however, it is not, as we would normally expect, a glossary to the Grammar, but rather an independently-compiled vocabulary list to aid the study of Latin (Butler 1981: 19). It is derived from Isidore's Etymologies, and Ælfric follows Isidore in ordering his material according to subject, such as kinds of birds, kinds of fish, names of plants, etc. Here the similarity between Ælfric and his source ends: while Isidore is encyclopedic, drawing upon arcane and fantastical material, Ælfric, in contrast, takes great care in winnowing his material to preserve only that vocabulary which is most useful for the beginning student of Latin (ibid.: 20). The approximately 1300 Latin words and their Old English translations are, for the most part, basic and familiar terms. The following grouping is typical: sanguis blod 'blood'; caro flæsc 'flesh, body'; cutis hyd 'hide'; pellis fell 'skin'; scapula sculdra 'shoulder'; dorsum hrycg 'back'; uenter wamb 'stomach'; brachium earm 'arm' (Zupitza 1880: 298.10). Here there is a mapping of Latin with standard Old English in a word-for-word translation. There are occasional embellishments and elaborations. 
In the section on units within groups, Latin nonna 'nun' is translated as arwyrþe wydewe oððe nunne 'venerable widow or nun' (ibid.: 299.14), a gloss which accurately reflects the status of the two groups likeliest to be found in monastic communities in pre-Conquest England; in the same section Latin infans is translated descriptively and linguistically as an unsprecende cild 'a child who does not speak, a speechless child' (ibid.: 301.7); or again, in the section on wild and domesticated animals, after we meet the pair Latin limax OE snægel 'snail', we meet another land creature with a shell; here, however, Latin testudo 'tortoise' is not rendered by one word but by an image of domestic possession: se þe hæfð hus 'he who has a house' (ibid.: 310.5). Flights of fancy seem to occur only in the depiction of the mythological creatures, the unicorn (ibid.: 308.12) and the griffin (ibid.: 309.3). And once there is a disarming acknowledgement of glossarial failure: in the section on trees Ælfric admits that he cannot find a translation equivalent for Latin cypressus 'cypress': næfð nænne engliscne naman 'it has no English name' (ibid.: 312.10).[7]

The Latin lemmas in Ælfric's Glossary are also of importance. R.L. Thomson noted in 1981 (Thomson 1981: 155) that Ælfric's Glossary displays a number of medieval Latin terms which have not been fully incorporated into the dictionaries. Ælfric's Latin contribution is on various levels: he provides further attestations of words which first appear in Britain between the sixth and tenth centuries; much-needed attestations for about twenty poorly-attested words; antedatings for a number of items in Latham's Revised Medieval Latin Word-List; and twenty-five new words not previously recorded, twelve of which are botanical.[8] Here is a wealth of material which, Thomson contends (158), was unlikely to contain words not in use in Ælfric's time if we presume his goal was to teach his students the basic vocabulary of Latin for their everyday monastic life. Thomson's point, although he does not state this explicitly, is that not all glossaries are the same. While we cannot attribute a record of normal usage to most glossaries, exceptions do occur, and one of them is Ælfric's Glossary. The evidence of the Latin vocabulary seems then to support the evidence of the English vocabulary: for this particular glossary, at least, we do not have a learned vocabulary but rather the vocabulary of ordinary life in the monastery. The number of versions extant, seven from the early medieval period (Cameron 1973: 86 [B.1.9.2]), tends to reinforce this conclusion.[9]

The importance of this class glossary in both the history of English and medieval Latin is without question, and we would certainly choose to include it in an Early Dictionary Database. What then are our resources for making it machine-readable? Unfortunately, the last published edition of the "only complete copy and probably the earliest" manuscript (Ker 1957: No. 362) of Ælfric's Glossary is Julius Zupitza's edition of 1880.[10] Marilyn Butler's edition of the early Middle English version of the Glossary exists only as an unpublished 1981 dissertation. Although there seems to be a growing interest in the problems of Anglo-Saxon glossography, as is demonstrated by the conference held in Brussels in 1986 devoted exclusively to this topic (Derolez 1992), there have been no new editions of Old English glossaries published in almost twenty years (Pheifer 1974, Stracke 1974). For the Electronic Corpus of the Dictionary of Old English we have had to input editions of glossaries which exist mainly as editions published in the last century or the first quarter of this century or as unpublished dissertations. This is not an ideal situation, but reflects the state of scholarship in this particular area of the discipline.

Alphabetical Glossaries

I would like to turn now to another kind of glossary. In addition to ordering lists of words by subject, a second principle for ordering them, and the one which eventually prevailed, is alphabetical. In these Old English glossaries where information is accessed by the alphabet, alphabetical order is limited to the first letter (A-order, where all the words beginning with A are filed together but in alphabetically-random fashion within A); to the first two letters (AB-order where all the words beginning with AB are filed together but in alphabetically-random fashion after the sequence AB); or to the first three letters (ABC-order). Limited alphabetical order has the advantage that in a long sequence of words, we can find an individual item with minimum inconvenience. It is not as efficient as absolute alphabetization, but it is efficient enough. Moreover, any major or minor displacements in the alphabetization scheme may be helpful in suggesting to us an order in which a particular glossary took shape. The Épinal Glossary, written in Anglo-Saxon England at the end of the seventh century (Bischoff 1988: 13), but now in Épinal, France, shows two alphabetical systems: a group of lemmas arranged in A-order according to the first letter of the alphabet; and a second group in AB-order following immediately after each letter (Pheifer 1974: xli-xlii).[11] This glossary clearly belongs to an early stage in the development of fully-alphabetized dictionaries, for here glosses gathered from various sources are only partially assimilated into the new construct. The presence of two alphabetical systems clearly demonstrates its process of accretion. 
Nevertheless, the Épinal Glossary is important not only because it is the earliest Latin-Old English glossary but also because of its influence: an independent copy was made on the continent and is now in Erfurt, Germany, the Erfurt Glossary; and a third version was revised and rearranged in England, and is now at Corpus Christi College, Cambridge, the Corpus Glossary.
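
The three limited schemes described above differ only in how many leading letters are taken into account. As a minimal sketch, assuming a plain list of lemmas, the following Python fragment files a list to a given depth and tests whether a list already observes that depth. The function names and the six Latin lemmas are illustrative assumptions and are not drawn from the Épinal Glossary.

```python
def order_key(word: str, depth: int) -> str:
    """Return the leading letters that a depth-limited scheme sorts on."""
    return word.lower()[:depth]


def file_in_order(lemmas: list[str], depth: int) -> list[str]:
    """File lemmas by their first `depth` letters only.

    Python's sort is stable, so lemmas sharing the same leading letters
    keep the order in which they were gathered, much as a compiler
    copying from successive sources might leave them.
    """
    return sorted(lemmas, key=lambda w: order_key(w, depth))


def follows_order(lemmas: list[str], depth: int) -> bool:
    """True if the list never steps backwards on its first `depth` letters."""
    keys = [order_key(w, depth) for w in lemmas]
    return all(a <= b for a, b in zip(keys, keys[1:]))


# Hypothetical lemmas, chosen only to illustrate the schemes.
lemmas = ["arbor", "aqua", "aratrum", "bos", "abies", "avis"]

a_order = file_in_order(lemmas, 1)   # A-order: first letter only
ab_order = file_in_order(lemmas, 2)  # AB-order: first two letters

print(a_order)
print(ab_order)
print(follows_order(ab_order, 2), follows_order(ab_order, 3))
```

In this sketch the AB-ordered list leaves arbor before aratrum, since both begin with AR; a check at depth 3 therefore fails, which is the kind of displacement that distinguishes limited from absolute alphabetization.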

The Épinal Glossary differs from Ælfric's class glossary not only in form but also in content and purpose. Although it encompasses some ordinary vocabulary, it has a number of rare and restricted words which are difficult because they are seldom encountered in normal use. In its level of difficulty, Épinal is more typical of Old English glossarial material than is Ælfric's Glossary. It is timely, then, to recall here some of the various anxieties associated with these difficult glossaries within a manuscript culture:

One might also add for this particular glossary that the application of reagents by earlier scholars and the subsequent staining of the manuscript (Pheifer 1974: xxii) have impeded rather than aided present scholarly endeavors.

The Épinal Glossary contains some 3200 entries; 970 entries, a little more than 30%, have Old English glosses.[12] The material is arranged in six columns, with the repeating pattern of a column of lemmas, a column of glosses. Unlike Ælfric's Glossary which is based on one source, Épinal draws on a number of sources: the Hermeneumata Glossary, Virgil scholia, the Hisperica Famina, the Vulgate, Gregory's Dialogues, Phocas' Ars Grammatica, the Abstrusa-Abolita glossaries, Aldhelm, Rufinus, Orosius, and, of course, Isidore, among others (Pheifer 1974: xliv-lvii). It is an impressive list and Épinal has been described as an "extract-glossary" (ibid.: liii) because its compiler obviously drew from one source and then another, and some of these sources would have been other glossary collections, known as glossæ collectæ (ibid.: liv).

What types of words did the glossator cull from his many sources? To test the range of the vocabulary in Épinal, I looked at all its Old English glosses, beginning with the letter A, a letter of average length in the Old English alphabet, with about 1500 headwords. Glosses in Épinal beginning with the letter A are filed under 37 different headwords in Fascicle A of the Dictionary of Old English. Even a limited look at the distribution of the material suggests much about the nature of the Épinal Glossary.

Nine words, 24% of the sample, are found only in Épinal and other glossaries. Among these are the expected animal and plant names, such as ācweorna 'squirrel' (EpGl 776 aqueorna) and alorholt 'alder copse' (EpGl 47 alterholt); and ailments of various sorts, such as ampre1 'varicose vein' (EpGl 943 amprae) and angseta 'carbuncle, boil' (EpGl 633 angseta). Also to be expected is the technical vocabulary such as ālgeweorc 'material for producing fire, tinder' (EpGl 416 algiuueorc) and a verb depicting a handicraft, āsēowan 'to sew' (EpGl 660 asiuuid). What was surprising, though, were the two verbs of appropriation and violence restricted in use to the glossaries: āgnettan 'to usurp' (EpGl 968 agnaettae), derived from the noun āgnett 'usury, interest', and ārǣfsan 'to cut off, cut short, destroy' (EpGl 370 araepsid) -- no doubt both verbs reflect the tenor of their sources. And, unfortunately, there was one puzzle we could not solve in Épinal where the Latin lemma frixus 'roasted, fried' is clear enough but we do not know where the Old English figen comes from, and it is possibly a corrupt form.

There are another ten words (26%) whose use is mainly restricted to glossaries, glosses and poetry. Among these is the botanical term āte 'oat; wild oat' (EpGl 460 atae); the technical terms āwel 'hook', glossing harpago 'hook, talon' (EpGl 30 auuel uel clauuo), and ām which is perhaps a 'branding iron' (EpGl 183 haam); and the legal term ānweald 'absolute power' glossing monarchia 'monarchy' (EpGl 483 anuuald).[13] Most interestingly, fully 40% of the vocabulary of the letter A in the Épinal Glossary is part of the general vocabulary of Old English, found in a range of texts and in the main senses of well-attested words of ten or more occurrences. Among these are the common animal and plant names, such as apa 'monkey or ape' (EpGl 692 apa); alor 'alder tree' (EpGl 36 alaer); apuldor 'apple tree' (EpGl 497 apuldur); the adjective ānhende 'one-handed' (EpGl 487 anhendi); and those verbs of debilitation: āseolcan 'to become sluggish, indolent' (EpGl 391 asolcaen); āslacian 'to slacken, become weak' (EpGl 350 aslacudae); āswindan 'to grow weak, waste away' (EpGl 914 asuundnan).[14] Yet it is the first group of words I spoke of, those terms limited exclusively to the glossaries, particularly the unsolved puzzles, which confer on Épinal the reputation as a hard-word glossary.

* * * * *

Henry Sweet, that astute but often irascible lexicographer of Old English, once opined that the process of translation from Latin to Old English gave rise to "a certain number of words which are contrary to the genius of the language, some of them being positive monstrosities" (Sweet 1896: viii). He denigrated "these unnatural words" because they often created "unmeaning compounds" (ibid.: viii). While his charge may have some validity for those Old English words formed by translating the component elements of the Latin in a mindless and mechanical way (such as prædicere 'to predict' being rendered by OE beforan cweþan), it is certainly not true of the glossaries we have looked at in this analysis. The glossary produced by Ælfric is what we would expect from the most gifted prose stylist in the Old English period. It is an educational tool which keeps its audience firmly in mind, deploying language which is both clear and standard. My analysis of the Old English vocabulary beginning with the letter A in the Épinal Glossary suggests that even the most restricted vocabulary, those words limited in use to the glossaries themselves, shows natural patterns of word-formation, for many of the compounds found there are shaped in an intelligent manner from native elements. We do not find in either glossary the "unnatural", "unmeaning" language so scorned by Henry Sweet.

I would like to conclude with a comment on methodology. I would have written this paper more laboriously and much more tentatively were it not for the Dictionary of Old English Corpus in Electronic Form. This database has allowed me to access at least one copy of each Old English text and whatever Latin was attached to it in a manuscript. My searches on the Latin were not limited to the glossaries themselves but also encompassed the interlinear glosses (such as the interlinear glosses to the Psalter) which comprise 24% of our corpus. I was able, therefore, to place the evidence concerning particular words in the glossaries against the evidence of the full corpus, both Old English and Latin, in order to make judgements about ordinary and learned vocabulary, frequent and rare occurrence, standard and non-standard use. Even more significantly, electronic searches have freed us from the logical order of the class glossaries and the alphabetical tyranny of most word lists. At the end of the twentieth century, we have reconfigured these medieval research tools into our own idiom so that they again speak -- this time to us -- with the authority and presence which they possessed a thousand years ago.
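
The freedom from a fixed order described above can be illustrated with a small inverted index. This is only a sketch under stated assumptions: it does not represent how the Dictionary of Old English Corpus is actually stored or searched, the function name build_index is hypothetical, and the handful of Latin-Old English pairs is taken from the examples from Ælfric's Glossary quoted earlier in this paper.

```python
from collections import defaultdict

# Latin lemma / Old English gloss pairs quoted earlier from Ælfric's Glossary.
entries = [
    ("sanguis", "blod"),
    ("caro", "flæsc"),
    ("cutis", "hyd"),
    ("pellis", "fell"),
    ("nonna", "arwyrþe wydewe oððe nunne"),
    ("infans", "unsprecende cild"),
    ("testudo", "se þe hæfð hus"),
]


def build_index(pairs):
    """Map every Old English word form to the Latin lemmas it helps gloss."""
    index = defaultdict(list)
    for lemma, gloss in pairs:
        for word in gloss.split():
            if lemma not in index[word]:
                index[word].append(lemma)
    return index


index = build_index(entries)

# Lookup now starts from the Old English side, whatever order the
# manuscript imposes on its Latin lemmas.
print(index.get("hus", []))
print(index.get("nunne", []))
```

Because the index is keyed on word forms rather than on position, the same structure would serve a class glossary and an alphabetical one alike.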


[1] I wish to thank Ian and David McDougall for help on particular points but especially for bringing Lloyd W. Daly's monograph and Tom McArthur's book to my attention. I wish also to thank the Social Sciences and Humanities Research Council of Canada and the National Endowment for the Humanities for their support of the research of the Dictionary of Old English project.

[2] These figures are derived from the Directory of the Electronic Corpus. Later figures in this paper are drawn from the Electronic Corpus itself.

[3] Ker 1957: No. 317 is followed in the assignment of dates.

[4] Ker 1957: No. 317 states that "[t]he orthography of the English glosses is throughout extremely confused".

[5] In the fragment extant today, only the sequence n-o-p-q is found (Franzen 1991: 196-7, App. 1, MS G).

[6] For a different perspective, see McArthur (McArthur 1986: 77) who suggests it is the printing press which led to the dominance of alphabetical order.

[7] Butler (Butler 1981: 22) also treats the glosses to Latin testudo and cypressus but with different emphasis.

[8] This information is extracted from Thomson's list of vocabulary items, pp. 156-60.

[9] Butler (Butler 1981: 19) notes that the two sets of later medieval excerpts and five transcripts from the sixteenth and seventeenth centuries posit the existence of four other eleventh-century manuscripts otherwise unknown.

[10] A new edition of this text is in preparation by Ronald Buckalew for the Early English Text Society, but publication is not expected for several years.

[11] Six of the letters, D, E, H, X, Y, Z, do not have this second sequence in AB-order; noticed by Pheifer (xlii).

[12] Pheifer 1974: xxi; the precise figure for the Old English material was drawn from the Directory to the Electronic Corpus.

[13] The other words in this category are: ādwinan, āferian, āgnidan, ambiht, anmōd, āstyntan.

[14] The remaining words in this group are: āflian, amber, anga, ānwillice, āreccan, attor, āwindan, āwyrdan.