Analysis Results
This section presents three chapters of analysis built from a 25-topic LDA model (MALLET, 10,000 iterations) applied to a corpus of 1,100 deduplicated articles from Chronicling America, covering American newspaper coverage of Chinese children between 1880 and 1885. The articles were retrieved using seven keyword queries and then deduplicated to isolate original reporting from reprinted wire copy. After excluding five noise topics (OCR artifacts and advertising boilerplate), the model yields 20 substantive topics grouped into nine thematic categories. A parallel full corpus of 1,525 articles, which retains all reprints, enables a comparison between what journalists originally wrote and what readers across the country actually encountered.
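The topic filtering described above can be illustrated with a small sketch. It is not the project's actual pipeline: the topic IDs marked as noise, the dictionary representation of per-document topic weights, and the decision to renormalize after filtering are all assumptions made for illustration.

```python
# Minimal sketch: drop noise topics from a document's topic weights.
# NOISE_TOPICS holds hypothetical IDs; the source does not list which
# five of the 25 topics were excluded.
NUM_TOPICS = 25
NOISE_TOPICS = {3, 7, 12, 18, 22}

# The 20 topics that remain after the five noise topics are removed.
SUBSTANTIVE_TOPICS = [t for t in range(NUM_TOPICS) if t not in NOISE_TOPICS]


def drop_noise_topics(weights, noise_ids):
    """Remove noise topics and rescale the remaining weights to sum to 1.

    Renormalizing is an assumption; an analysis could equally keep the
    raw weights and simply ignore the noise topics.
    """
    kept = {t: w for t, w in weights.items() if t not in noise_ids}
    total = sum(kept.values())
    if total == 0:
        return {}  # document was entirely noise
    return {t: w / total for t, w in kept.items()}


# Hypothetical document: topic 3 is treated as noise and removed.
example_doc = {0: 0.5, 3: 0.2, 5: 0.3}
filtered = drop_noise_topics(example_doc, NOISE_TOPICS)
```

In this sketch a document made up entirely of noise topics returns an empty mapping, so it can be set aside rather than given misleading weights.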
In This Section
Mapping the Discourse: An overview of the full thematic landscape, covering which topics dominated, how search keywords shaped the corpus, and what the press as a whole was talking about when it talked about Chinese children.
The Reprint Effect: A comparison of what journalists wrote versus what readers encountered. The newspaper reprinting network amplified short, novelty-driven items and suppressed long-form local reporting.
Education at the Center: A close analysis of four education sub-discourses (the Chinese Educational Mission (CEM) recall, public school admission, classroom instruction, and missionary schooling) across time, region, and keyword.