Wordle.org via Science / AAAS
This graphic is a visual representation of the frequency of words that appear in contemporary English-language books. The size of each word is proportional to its frequency. The most common words, such as "the" and "a," are omitted.
Want to be famous? Don't pursue a career in the sciences. That's one of the key findings from a new study that tasked computers with picking out cultural trends from about 4 percent of all the words ever printed in books.
The researchers created a software program to analyze words and phrases in a database compiled from Google's controversial project to digitize every book ever written.
The result is a scientific "tool that can be useful in the humanities," study lead author Jean-Baptiste Michel, a postdoctoral researcher at the Department of Psychology and Program for Evolutionary Dynamics at Harvard University, told me.
He and his colleagues have dubbed the approach "culturomics," making an analogy to genomics, the study of genomes. The tool provides insight into topics as diverse as humanity's cultural memory, the adoption of technology, fame, and the effects of censorship and propaganda.
For example, to highlight the effect of censorship, the researchers searched for references to Jewish artist Marc Chagall in English and German books. In both languages, his name rises rapidly in the 1910s, and in English it continues to rise for several decades. In German, however, it all but disappears from 1936 to 1944, under the Nazi regime.
Science / AAAS
This chart shows the usage frequency of "Marc Chagall" in German (red) as compared with English texts (blue). Chagall, a Jewish artist censored by the Nazi regime, virtually disappears from German writings during the Third Reich (the time frame shaded red), even as his fame continued to rise in the English-speaking world.
"The results we have on censorship are kind of remarkable — just the unbelievable extent to which government censorship can utterly obliterate someone from the public discourse," Erez Lieberman Aiden, a study co-author in Harvard's School of Engineering and Applied Sciences, told me.
Michel added that the censorship analysis "is a beautiful example for how this tool can really help advance big questions in the humanities in a way that doesn't negate other techniques that exist in this field."
Google today launched Culturomics, a website that accompanies the study, which was published today in Science. There, users can type in a word or phrase and see how its usage frequency has changed over the past few centuries.
For example, a person who types in the word "pizza" will learn that pizza became popular in the U.S. in 1950. This might prompt a user to look into the history of pizza, whereupon they'd discover that its popularity is linked to the American troops who occupied Italy during World War II and wanted the food once they were back home.
"It is really an amazing way to browse history and to discover interesting facts about the past," Michel said.
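For readers curious what "usage frequency" means in practice, here is a minimal sketch in Python. It is not the researchers' software, which the article does not describe in detail. It assumes that a word's frequency in a given year is the number of times the word appears divided by the total number of words printed that year. The `TOY_BOOKS` data and the `usage_frequency` function are invented for illustration.

```python
from collections import Counter

# Hypothetical toy corpus of (publication year, text) pairs.
# These "books" are made up for illustration; they are not study data.
TOY_BOOKS = [
    (1940, "steak and sausage for supper"),
    (1940, "ice cream after the steak"),
    (1960, "pizza and pasta and more pizza"),
    (1960, "steak or pizza tonight"),
]


def usage_frequency(books, word):
    """Return {year: occurrences of `word` / total words printed that year}."""
    hits = Counter()    # times the word appears, per year
    totals = Counter()  # all words printed, per year
    for year, text in books:
        tokens = text.lower().split()
        totals[year] += len(tokens)
        hits[year] += tokens.count(word.lower())
    return {year: hits[year] / totals[year] for year in sorted(totals)}


pizza_by_year = usage_frequency(TOY_BOOKS, "pizza")
```

Dividing by each year's total word count is what makes years comparable, since far more books are printed in some years than in others. This sketch handles single words only; counting a phrase such as "ice cream" would require counting runs of adjacent words instead.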
Science / AAAS
This chart tracks the frequency of references to various foods between 1800 and 2000. "Steak" and "sausage" (blue and green) are perennial favorites, overtaken first by "ice cream" (red), and later by two Italian imports, "pizza" and "pasta" (purple and yellow). "Hamburger" (cyan) became widespread in the 1930s, and "sushi" (black) is just now making its move.
In their paper, Michel and colleagues tracked trends such as the frequency of use of dates, words, and phrases between 1800 and 2000. Among their findings: We are forgetting the past faster with each passing year, but knowledge about innovations is spreading faster than ever.
When it comes to celebrity, people reach fame younger in life and become more famous than their 19th-century predecessors, but the fame is shorter-lived. However, people who reach fame later in life, such as U.S. presidents, have longer-lasting fame than actors, who peak in their 20s.
Being a fame-seeking scientist, however, "doesn't make sense by any stretch of the imagination," said Aiden. "You end up less famous than everyone else and you have to wait longer than anybody else [to get it], so in that sense, science is probably not the thing to do if your goal is to become famous."
John Roach is a contributing writer for msnbc.com.