
Introduction

The Corpus of Modern Scottish Writing (CMSW) is an electronic corpus of written and printed texts from the period 1700-1945, complementing the Helsinki Corpus of Older Scots (1450-1700) and the Scottish Corpus of Texts and Speech (1945-present day). CMSW comprises over 350 documents, totalling approximately 5.5 million words of text.

CMSW’s documents range from printed novels, to written correspondence, to newspaper and magazine articles, to legal material such as wills and sasines. Our documents have been sourced from partners such as the Mitchell Library in Glasgow, the National Library of Scotland, and University of Glasgow Archive Services.

All our documents are assigned a year group (1700-1750, 1750-1800, 1800-1850, 1850-1900, and 1900-1945). A document’s year group is determined by its date of publication (if printed) or its date of writing (if handwritten); if this date is unknown, we have assigned a year group based on our best estimate. Documents are also classified as belonging to one of nine genre groups (administrative prose, expository prose, personal writing, instructional prose, religious prose, verse/drama, imaginative prose, journalism, and orthoepists). More information is available from the corpus details page.

Background

The development of English in Scotland has long been a controversial topic, to the extent that even the language labels are contested. The Modern Scots period is conventionally dated from 1700 to the present day. It therefore begins with the last stages of the standardisation of written English, as well as the onset of the ‘Vernacular Revival’ in literary Scots that produced writers like Robert Burns.

Language use in Scotland in the modern period can be described as a continuum with Standard English at one end, and social and regional varieties of Broad Scots at the other. Writers vary their performance along that continuum, to a greater or lesser extent, depending on their social background and the context of writing. It is generally thought that out of the interaction between Broad Scots and written Standard English, the hybrid prestige variety of today’s Scottish English emerged.

However, there has been comparatively little study of how this happened, beyond some detailed analysis of the evidence of spelling reformers of the 18th century, mainly in relation to changes in pronunciation of the period. By creating a searchable digital archive of Scottish writing from this key period, we lay the foundations for a new account of language development in Scotland. Initial research using this resource will focus on the vexed issue of spelling variation.

In comparison with the period post-1700, the Older Scots period (1375-1700) is well served, with studies of Anglicisation during this period supported by the Helsinki Corpus of Older Scots (HCOS) and the ongoing development of various sub-corpora of correspondence in Older Scots. The CMSW project has broken new ground by filling the chronological gap between the HCOS and the SCOTS resources, thus making available to scholars and others a complete historical record of a major language variety whose development parallels and interacts with that of Standard English.

Language commentators

Around 1 million words of CMSW consist of material from orthoepists, or language commentators. Researchers can therefore compare the orthoepists’ pronouncements and recommendations on how the language is and should be spoken with the language in actual use at the time, and trace language change with this in mind.

Spelling and variation

CMSW contains documents in Scottish Standard English, documents in different varieties of Scots, and documents which may be described as lying somewhere between Scots and Scottish Standard English. While Scottish Standard English has a standard written form, Scots does not. This means that the corpus contains a wide range of variation in spelling. We hope to offer a means of searching for all of the variant spellings automatically in the future. In the meantime, we recommend the online Dictionary of the Scots Language as an excellent source of possible variants.
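
Until automatic variant searching is available, a researcher working with downloaded text could search for several spellings at once. The following is a minimal sketch in Python, not part of the CMSW site: the function name find_variants, the sample sentence, and the variant list are all hypothetical and chosen purely for illustration; in practice the variants would be gathered by hand from a source such as the Dictionary of the Scots Language.

```python
import re


def find_variants(text, variants):
    """Return every whole-word occurrence of any listed spelling, ignoring case."""
    # Build one alternation pattern; re.escape guards against special characters.
    alternatives = "|".join(re.escape(v) for v in variants)
    pattern = re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)
    return [match.group(0) for match in pattern.finditer(text)]


# Illustrative variant list (an assumption, not taken from the corpus).
variants = ["house", "hoose", "hous"]
sample = "The auld hoose stood by the House on the hill; the housing was new."

print(find_variants(sample, variants))
```

Because the pattern is anchored on word boundaries, "housing" is not matched, while "hoose" and "House" are both found regardless of capitalisation.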

Transcription

While we have made every effort to transcribe the full content of all documents, occasionally words or whole sections are missing or illegible. Where this is the case, we have still included the document, with as much of its text transcribed as possible.

Research focus

The linguistic research undertaken by the CMSW team as part of the project aims primarily to account for the structures of Modern Scots orthography, with a view to enhancing automatic identification of spelling variants. This groundwork will lead to further linguistic analysis, for example of the relationship between orthography and phonology, of the extent to which Modern Scots orthography is lexicalised in different phases within the period, and of how particular orthographies are motivated by stylistic or philological intentions on the part of the author.

People

The website was redesigned in November 2013 by Brian Aitken, Digital Humanities Research Officer.