SCOTS Project - www.scottishcorpus.ac.uk
Document : 2
Title : Finding the words for it
Author(s): Prof Christian Kay
Copyright holder(s): University of Glasgow: Copyright © 2004 The University of Glasgow. All rights reserved.

Text

[CENSORED: placename] Science Festival 4/4/97

Screen 1

Finding the words for it: past, present and future with the "Historical Thesaurus".
[CENSORED: forename] [CENSORED: surname], [CENSORED: placename]

Abstract: Computers nowadays play a key role in the compilation of dictionaries, providing both a vast array of source materials and flexible ways of retrieving information. The "Historical Thesaurus of English" is a database of the vocabulary of English from its Anglo-Saxon roots to Present Day English. In addition to its interest for historians of the language, this thesaurus presents fascinating insights into the lives of past speakers, which are often revealed through the words they used. Historical lexicography also presents the researcher with particular problems of definition and classification. Such issues will be discussed, and illustrated from thesaurus sections such as "Medicine" and "Humankind".

1. Introduction

One area of the humanities where computers have had an enormous impact is in the compilation of dictionaries. As lexicographers have pointed out down the ages, their craft involves a large amount of painstaking and repetitive work. This is still undoubtedly the case, but both the initial labour and the end-product have been greatly improved by technology. In the initial stages of making a dictionary, we now have access to huge databanks of quotations from texts of all types; this enables us to claim with more justification than in the past that the dictionary is representative of the language. While the dictionary is in progress, we have in databases a flexible means of storing, recalling and manipulating information, so that nowadays lexicographers can often operate from a single work-station.
We can produce and revise a paper dictionary much more easily; a second or third edition is no longer a major publishing event. And we have alternative means of publication, on disk or CD-ROM or increasingly over the Internet.

2. The Historical Thesaurus of English

The dictionary I am principally involved with takes, or will take, advantage of all these electronic aids. It also has problems and characteristics peculiar to itself. These are revealed in its title:

Screen 2
Thes Frontispiece

In the first place, this dictionary is a thesaurus; rather than listing words alphabetically, it groups them according to their meanings, in categories such as Medicine or Humankind or Feelings.

(click on Thesaurus: bit of classification)

In the second place, this dictionary is historical: it contains not only modern English words, but words from the entire recorded history of English, beginning with Old English, the language of the Anglo-Saxons.

(click Historical: list of gin words)

It thus offers the scholar a bird's-eye view of the development of the English vocabulary. At the same time, because words reveal so much about the development of a society and its culture, the thesaurus contains a good deal of information of interest to historians.

3. The Vocabulary

As a result of historical events stretching back 1200 years, the vocabulary of English is enormously large, rich and varied. The original Germanic language of the Anglo-Saxon settlers has been subjected to three main waves of influence, Scandinavian and French as a result of invasion, and Latin as a result of intellectual developments during the Renaissance. There have also been other influences from around the world, not least from other varieties of English, such as American and Australian, during the modern period. This point can be illustrated from virtually any section of the thesaurus. The one I have chosen is the section on gout, from the Medical Category, which has a certain macabre fascination.
Screen 3: Gout
(Click on toe to flash, then noun, adj)

Commentary
fotadl fotcopu fotwaerc lipa
drop only 2 qu, OE + 1559, prob. both gout
gout: Fr. goutte - drop (concretions dropped into bloodstream)
podagre: L > Gk; gout in the feet, generalised to gout anywhere. Common from early ME. Also many adjectival forms.
joint-sickness: Elyot Dictionary, 1545, translating arthritica passio (Castel of Helth)
leaping gout - runneth from one joint to another
arthritis: gen inflammation of joints, spec. gout. Only 3 qu's in all.
Rheumatism v. common from 1688. Classified together - probs of differentiation.
matching adjectives

4. Problems

Work on a Thesaurus has two main problems, which may be simply stated. The first, which is common to all lexicography, is determining the meanings of words. The second, peculiar to thesaurus-makers, is placing the word in an appropriate category. It is axiomatic that every word has to go somewhere.

Screen 4
What does it mean?
Where shall I put it?

For historical lexicographers, these problems are often compounded by the nature of the evidence. Without any native speakers of the language to guide us, but instead relying on often imperfect written sources, it can be extremely difficult to determine a word's meaning.

(click what does it mean: def of folding)

Certain words and definitions will probably haunt me for the rest of my life. Late one night, for instance, when I was struggling to classify agricultural terms (a subject about which I know very little) I came across the word folding, defined as "The action of folding sheep". This I completely misinterpreted, having visions of a demented Medieval shepherd trying to cram sheep into a parcel.

(click on sheep 1)

A little reflection, and consultation of other dictionaries, produced a more sensible meaning.
(click on sheep 2)
(if time: where does it go - classification again)

An equally problematic group of words was early terms for clothing, for example those rather vaguely defined by phrases such as "an outer garment, a cloak or cape, a mantle, robe or pall". What exactly does such a garment look like? Who wore it and when? Painstaking research, of a kind that lexicographers rarely have time to do, may enable us to find out. If not, we are left with a problem of categorisation. For the modern period we can have a broad category of outer garments, with subcategories such as coat, cloak, cape, jacket, since these objects are for us clearly distinguished. For earlier periods, this may well be impossible. In such cases, rather than classify over-specifically, with the risk of subsequently being proved wrong, we retreat to the more general category.

5. Social Information

The examples I have mentioned should already have made the point that our thesaurus contains sociologically interesting information. This is often even more striking in categories with a large number of words. In Humankind, the basic category for woman contained at the last count 105 words. It is sobering to examine these and discover that at least 20% of them are derogatory in meaning.

Screen 6: Woman words
(click on woman in middle)

Commentary if time. Many to do with clothing, animals, size; quite a lot 1/2 quotes
mumps: contemptuous/mock endearment; Mumpsimus - glum person, or to mump/mope
rowen: rough ground: partridge living on it; woman
moll: moll cut-purse, character in 17thc drama
placket: apron, petticoat
modicum: small person (more or less disparagingly)
partlet: proper name of any hen; Chaucer's Dame Pertelote
periwinkle: plant (playful)
uptails: name of a vulgar song
cow: coarse or degraded woman
fusby: no suggestions, 2 quotes
biddy: Irish maidservant (derogatory)

6. Conclusion

Although initially a pencil and paper operation, the HT now makes considerable use of computers.
About 70% of our material is held electronically in an Ingres database, which has recently been redesigned. We expect to publish, somewhere around the millennium, both as a book and on CD. We expect that by then CD technology will have advanced to the point where the OED and our thesaurus can interact on the same disk. This will provide scholars with a resource which previous generations could only dream about.

If you would like to see more of our work meanwhile

(a) we have already published a separate Thesaurus of Old English.

Screen 7
A Thesaurus of Old English
Jane Roberts and Christian Kay with Lynne Grundy
King's College, London, 1995

(b) we are running a demonstration for the rest of the day along at the Royal Scottish Museum. We look forward to seeing you there.

Improvements:
more detailed agriculture classification under "problems"
picture of hen with woman words

The Historical Thesaurus of English
Summary of Classification

Section I: THE EXTERNAL WORLD (largely complete)
1. The Earth
2. Life
3. Sensation & Perception
4. Matter
5. Existence
6. Relative Properties
7. The Supernatural
8. Possession

Section II: THE MIND (in progress)
1. Mental Processes
2. Emotion
3. Good or Bad Opinion
4. Aesthetic Opinion
5. Will
6. Endeavour
7. Language

Section III: SOCIETY (largely complete)
1. Social Groups
2. Habitation
3. War
4. Government
5. Law
6. Education
7. Religion
8. Communication
9. Travel and Transport
10. Work
11. Leisure

Department of English Language
University of [CENSORED: placename]
Revised 1995

This work is protected by copyright. All rights reserved.
The SCOTS Project and the University of Glasgow do not necessarily endorse, support or recommend the views expressed in this document.

Information about document and author:

Text
Text audience
Adults (18+):
Audience size: 1
Text details
Method of composition: Wordprocessed
Word count: 1450
Text publication details
Published:
Text type
Article:
Prepared text (e.g. lecture/talk, sermon, public address/speech):
Prose: nonfiction:
Other: Talk

Author
Author details
Author id: 606
Title: Prof
Forenames: Christian
Surname: Kay
Gender: Female
Decade of birth: 1940
Educational attainment: University
Age left school: 18
Upbringing/religious beliefs: Protestantism
Occupation: Academic
Place of birth: Edinburgh
Region of birth: Midlothian
Birthplace CSD dialect area: midLoth
Country of birth: Scotland
Place of residence: Glasgow
Region of residence: Glasgow
Residence CSD dialect area: Gsw
Country of residence: Scotland
Father's place of birth: Leith
Father's region of birth: Midlothian
Father's birthplace CSD dialect area: midLoth
Father's country of birth: Scotland
Mother's place of birth: Edinburgh
Mother's region of birth: Midlothian
Mother's birthplace CSD dialect area: midLoth
Mother's country of birth: Scotland

Languages:
Language: English
Speak: Yes
Read: Yes
Write: Yes
Understand: Yes
Circumstances: All
Language: Scots
Speak: No
Read: Yes
Write: No
Understand: Yes
Circumstances: Work