Register variation explains stylometric authorship analysis

Research output: Contribution to journal › Article › peer-review

Abstract

For centuries, investigations of disputed authorship have shown that people have unique styles of writing. Given sufficient data, it is generally possible to distinguish between the writings of a small group of authors, for example, through the multivariate analysis of the relative frequencies of common function words. There is, however, no accepted explanation for why this type of stylometric analysis is successful. Authorship analysts often argue that authors write in subtly different dialects, but the analysis of individual words is not licensed by standard theories of sociolinguistic variation. Alternatively, stylometric analysis is consistent with standard theories of register variation. In this paper, I argue that stylometric methods work because authors write in subtly different registers. To support this claim, I present the results of parallel stylometric and multidimensional register analyses of a corpus of newspaper articles written by two columnists. I demonstrate that both analyses not only distinguish between these authors but identify the same underlying patterns of linguistic variation. I therefore propose that register variation, as opposed to dialect variation, provides a basis for explaining these differences and for explaining stylometric analyses of authorship more generally.
Original language: English
Journal: Corpus Linguistics and Linguistic Theory
Volume: 0
Issue number: 0
Early online date: 2 Jan 2023
Publication status: E-pub ahead of print - 2 Jan 2023

Keywords

  • forensic linguistics
  • idiolect
  • language variation and change
  • multidimensional analysis
  • variationist sociolinguistics
