The Emotional Content of Children's Writing: A Data-Driven Approach

Yuzhen Dong*, Ya-Ling Hsiao, Nilanjana Banerji, Kate Nation

*Corresponding author for this work

    Research output: Contribution to journal › Article › peer-review

    Abstract

    Emotion is closely associated with language, but we know very little about how children express emotion in their own writing. We used a large-scale, cross-sectional, and data-driven approach to investigate emotional expression via writing in children of different ages, and whether it varies for boys and girls. We first used a lexicon-based bag-of-words approach to identify emotional content in a large corpus of stories (N>100,000) written by 7- to 13-year-old children. Generalized Additive Models were then used to model changes in sentiment across age and gender. Two other machine learning approaches (BERT and TextBlob) validated and extended these analyses, converging on the finding that positive sentiments in children's writing decrease with age. These findings echo reports from previous studies showing a decrease in mood and an increased use of negative emotion words with age. We also found that stories by girls contained more positive sentiments than stories by boys. Our study shows the utility of large-scale data-driven approaches to reveal the content and nature of children's writing. Future experimental work should build on these observations to understand the likely complex relationships between written language and emotion, and how these change over development.
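The lexicon-based bag-of-words step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline (their code is on the OSF page cited in the acknowledgments). The lexicon `TOY_LEXICON`, its valence scores, and the function names `tokenize` and `story_sentiment` are hypothetical and chosen only for illustration. The sketch scores a story as the mean valence of the words that match the lexicon, ignoring word order.

```python
import re
from collections import Counter

# Hypothetical toy lexicon (word -> valence). NOT the lexicon used in the study.
TOY_LEXICON = {
    "happy": 1.0,
    "love": 1.0,
    "fun": 0.5,
    "sad": -1.0,
    "scared": -0.75,
    "angry": -1.0,
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def story_sentiment(text, lexicon=TOY_LEXICON):
    """Return the mean valence of lexicon words in the text.

    Bag-of-words: only word counts matter, not word order.
    Returns 0.0 when no word in the text appears in the lexicon.
    """
    counts = Counter(tokenize(text))
    matched = {word: n for word, n in counts.items() if word in lexicon}
    total = sum(matched.values())
    if total == 0:
        return 0.0
    return sum(lexicon[word] * n for word, n in matched.items()) / total


if __name__ == "__main__":
    print(story_sentiment("The happy dog was sad, then happy again."))
    print(story_sentiment("A plain story about a boat."))
```

Averaging over matched words rather than all words keeps the score comparable across stories of different lengths, which is one plausible design choice among several. Per-story scores of this kind are the sort of outcome that could then be modelled against age and gender.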
    Original language: English
    Article number: e13423
    Number of pages: 20
    Journal: Cognitive Science
    Volume: 48
    Issue number: 3
    Publication status: Published - 18 Mar 2024

    Bibliographical note

    Acknowledgments:
    This research was funded by a British Academy Post-Doctoral Fellowship (PF2/180013) awarded to Ya-Ling Hsiao, a grant from the Nuffield Foundation (EDO/43392) to Kate Nation, and resources made available to Nilanjana Banerji by the Department of Children's Dictionaries and Children's Language Data at Oxford University Press. Data and code associated with this paper are available on the Open Science Framework website (https://osf.io/ywkrj/). The Oxford Children's Language Corpus is a growing database of writing for and by children, developed and maintained by Oxford University Press for the purpose of children's language research.

    Keywords

    • Children's writing
    • Emotion
    • Language production
    • Natural language processing
    • Data-driven approach
    • Sentiment analysis
