Do men write more about coffee?
Based on data from HathiTrust + NovelTM, BookCorpus, the Blog Authorship Corpus, and TED Talks.
When you read a lot of one genre, the tropes get heavy-handed fast. For me that genre is detective fiction, in particular the Adrian McKinty series and Volker Kutscher’s Babylon Berlin novels. The setup is always the same: a tough, male anti-hero arrives in a new city (fine), drinks whiskey (fine), scores girls (still fine), and reaches for coffee constantly (not fine). The constant coffee drinking felt like an inherently masculine thing, so I checked whether male authors actually use the word coffee more often than female authors.
Turns out my hunch was wrong:
Coffee and tea are female beverages, beer and whiskey male: female authors mention the first pair more often, male authors the second.
The data sources
Before focusing on literature, we use four datasets of varying size and origin to check whether the pattern holds across contexts. For coffee, the answer is clear: female authors use the word roughly 30% more often than male authors. TED Talks are the outlier, which makes sense because they’re spoken word and the smallest dataset.
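To make a figure like "30% more often" concrete, here is a minimal sketch of how such a gap can be computed: count occurrences of the word per author gender, normalize by the total number of tokens, and compare the two rates. The function names, the simple regex tokenizer, and the two toy sentences are assumptions made for illustration; they are not the pipeline or the data behind the numbers above.

```python
import re
from collections import defaultdict

def rate_per_million(docs, word):
    """docs: iterable of (gender, text) pairs.
    Returns {gender: occurrences of `word` per million tokens}."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for gender, text in docs:
        # Crude tokenizer (assumption): lowercase runs of letters and apostrophes.
        tokens = re.findall(r"[a-z']+", text.lower())
        totals[gender] += len(tokens)
        hits[gender] += tokens.count(word)
    return {g: 1_000_000 * hits[g] / totals[g] for g in totals if totals[g]}

def relative_gap(rates, a="female", b="male"):
    """How much more often group `a` uses the word than group `b` (0.3 means 30% more)."""
    return rates[a] / rates[b] - 1.0

# Invented toy input, only to show the shape of the data.
toy_docs = [
    ("female", "She poured the coffee and sat down with her coffee"),
    ("male", "He drank whiskey and then one coffee before he left"),
]

rates = rate_per_million(toy_docs, "coffee")
gap = relative_gap(rates)
print(rates, gap)
```

Normalizing by token count matters because the corpora differ a lot in size and in how much text each gender contributes; raw counts alone would mostly reflect who wrote more.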
Gendered beverages
From here we look at literature only, using the HathiTrust + NovelTM corpus. It covers about 125,000 anglophone novels from 1700 to 2010, with author gender inferred from first names using NomQuamGender.
Beverages split clearly by author gender:
This split has been stable across centuries. The use of the word coffee has been growing since 1820, and female authors stay ahead of male ones throughout. The one switch is wine: female authors overtake male ones around 1970.
Anatomy of the gap
But maybe women write more romance, romance has more coffee scenes, and the gender imbalance is really a genre gap. To check, we can compare male and female authors within the same genre: for about 5,700 of the HathiTrust books we have NovelTM genre tags.
The 12 biggest genres show the gap holds in most of them, and detective novels sit near the top for coffee references (about 33% above the corpus average).
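The genre check amounts to stratifying: compute the female-male gap separately inside each genre instead of over the pooled corpus. A hedged sketch follows, assuming each book has already been reduced to a hypothetical `(genre, gender, coffee_hits, total_tokens)` record; the record layout and the toy numbers are invented for illustration and are not results from the corpus.

```python
from collections import defaultdict

def gap_by_genre(books, a="female", b="male"):
    """books: iterable of (genre, gender, hits, tokens) records.
    Returns {genre: relative gap of group a over group b}."""
    hits = defaultdict(lambda: defaultdict(int))
    tokens = defaultdict(lambda: defaultdict(int))
    for genre, gender, n_hits, n_tokens in books:
        hits[genre][gender] += n_hits
        tokens[genre][gender] += n_tokens

    gaps = {}
    for genre in hits:
        # Skip genres where one group is missing or the baseline rate is zero.
        if not tokens[genre][a] or not tokens[genre][b] or not hits[genre][b]:
            continue
        rate_a = hits[genre][a] / tokens[genre][a]
        rate_b = hits[genre][b] / tokens[genre][b]
        gaps[genre] = rate_a / rate_b - 1.0
    return gaps

# Invented toy records, not corpus data.
toy_books = [
    ("romance", "female", 30, 100_000),
    ("romance", "male", 20, 100_000),
    ("detective", "female", 45, 100_000),
    ("detective", "male", 30, 100_000),
    ("western", "male", 10, 100_000),  # no female books: genre is skipped
]

genre_gaps = gap_by_genre(toy_books)
for genre, gap in sorted(genre_gaps.items()):
    print(f"{genre}: {gap:+.0%}")
```

If the gap were purely a genre effect, the per-genre values would cluster around zero even though the pooled gap is positive.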
One step past genre, we look at what other words show up on pages that mention coffee. For the 40 words with the biggest male-female gap, we check whether female or male authors use each one more often.
Food, setting, and container words lean female (kitchen, breakfast, cream, sugar, cups, butter, milk). Consumption verbs lean male (drank, drinking, drink).
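A co-occurrence comparison of this kind can be sketched as follows: keep only pages that contain the target word, count the other words on those pages per author gender, and rank words by how lopsided their usage is. The scoring used here, a log ratio of add-one-smoothed relative frequencies, is an assumption chosen for the example; the text does not say which gap measure produced the list of 40. The toy pages are invented.

```python
import math
import re
from collections import Counter

TOKEN = re.compile(r"[a-z']+")

def cooccurrence_gaps(pages, target="coffee", top_n=40):
    """pages: iterable of (gender, page_text).
    Returns up to top_n (word, score) pairs, largest |score| first.
    score > 0 leans female, score < 0 leans male."""
    counts = {"female": Counter(), "male": Counter()}
    for gender, text in pages:
        tokens = TOKEN.findall(text.lower())
        # Only pages that actually mention the target word are counted.
        if gender in counts and target in tokens:
            counts[gender].update(t for t in tokens if t != target)

    totals = {g: sum(c.values()) for g, c in counts.items()}
    vocab = set(counts["female"]) | set(counts["male"])
    scores = {}
    for word in vocab:
        # Add-one smoothing keeps words unseen in one group from dividing by zero.
        f = (counts["female"][word] + 1) / (totals["female"] + len(vocab))
        m = (counts["male"][word] + 1) / (totals["male"] + len(vocab))
        scores[word] = math.log(f / m)
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)[:top_n]

# Invented toy pages, not corpus data.
toy_pages = [
    ("female", "coffee with cream in the kitchen"),
    ("male", "he drank coffee"),
    ("male", "no beverages here"),  # ignored: the page never mentions coffee
]

ranked = dict(cooccurrence_gaps(toy_pages))
print(ranked)
```

On a real corpus you would likely also drop stopwords and very rare words before ranking, since smoothed ratios for rare words are noisy.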