
Danny Moloshok / Reuters
Do your Twitter updates betray where you're tweeting from? Scientists say they do.
On the Internet, nobody knows you're a dog, but on Twitter, your tweets likely reveal where you are. Computer scientists report that the microblogging service reflects regional dialects and slang.
In northern California, for example, when something is cool, it's tweeted as "koo," while in southern California, it's "coo," post-doctoral fellow Jacob Eisenstein and his colleagues at Carnegie Mellon University found. The word "something" is tweeted as "sumthin" in most parts of the country, but New Yorkers favor the term "suttin" instead. LOL, the acronym for "laughing out loud," is common on Twitter almost everywhere but Washington, D.C., where the cruder "LLS" takes precedence.
How they did it
For the study, Eisenstein and his co-authors collected a week's worth of Twitter messages in March 2010 and selected geotagged messages from users who wrote at least 20 tweets. That gave them a database of 9,500 users and 380,000 messages.
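For readers who want to picture the selection step, here is a minimal Python sketch. It is not the researchers' code. The record layout (`user`, `text`, `geo`) and the function name `select_users` are hypothetical, and the sketch assumes the 20-tweet threshold is counted over geotagged messages, which the description above leaves open.

```python
from collections import defaultdict

MIN_TWEETS = 20  # threshold reported in the article


def select_users(tweets, min_tweets=MIN_TWEETS):
    """Keep geotagged tweets, then keep only users with enough of them.

    Each tweet is a dict in a hypothetical layout:
    {"user": str, "text": str, "geo": (lat, lon) or None}
    """
    by_user = defaultdict(list)
    for tweet in tweets:
        # Drop messages that carry no location tag.
        if tweet.get("geo") is not None:
            by_user[tweet["user"]].append(tweet)
    # Assumption: the threshold applies to the geotagged messages only.
    return {user: msgs for user, msgs in by_user.items() if len(msgs) >= min_tweets}
```

Calling `select_users(tweets)` returns a dictionary that maps each retained user to that user's geotagged messages.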
They then analyzed the raw text in those messages with a model trained to pick out regional differences such as favored Twitter slang terms ("hella" in Northern California, "wasssup" in New York) as well as sport-team preferences (for example, the Celtics in Boston, the Knicks in New York, the Cavs in Cleveland).
The researchers found that Twitter postings also reflect well-known regionalisms from spoken language, such as Southerners' "y'all" vs. Pittsburghers' "yinz," and the regional split over soda vs. pop vs. Coke.
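To make the idea of regional marker terms concrete, here is a toy Python sketch that guesses a region by counting such terms in a message. It is a deliberately crude keyword vote, not the statistical model used in the study. The term list is hand-picked from the examples quoted in this article, and the names `REGION_TERMS` and `guess_region` are made up for illustration.

```python
import re
from collections import Counter

# Hand-picked from the article's examples; a real model would learn its
# vocabulary from data rather than rely on a fixed list like this one.
REGION_TERMS = {
    "Northern California": {"koo", "hella"},
    "Southern California": {"coo"},
    "New York": {"suttin", "wasssup"},
    "Washington, D.C.": {"lls"},
    "Pittsburgh": {"yinz"},
    "Southern states": {"y'all"},
}


def guess_region(text):
    """Return the region whose marker terms appear most often, or None."""
    tokens = re.findall(r"[a-z']+", text.lower())
    votes = Counter()
    for region, terms in REGION_TERMS.items():
        votes[region] = sum(1 for token in tokens if token in terms)
    region, count = votes.most_common(1)[0]
    # No marker term at all means no guess.
    return region if count > 0 else None
```

A message with no marker terms yields no guess, which is the situation for anyone who does not tweet in slang.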
The model, verified with the geotag information, could predict the location of a microblogger in the U.S. to within 300 miles.
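A figure such as "within 300 miles" implies measuring the distance between a predicted location and the geotagged one. The Python sketch below shows one common way to do that, the haversine great-circle formula. The article does not say how the researchers computed or summarized their error, so the function names, the Earth radius constant and the choice of the median as the summary are all assumptions made for illustration.

```python
import math
import statistics

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, an approximation


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def median_error_miles(predicted, actual):
    """Median distance between predicted and geotagged locations.

    Assumption: the median is used here only as one possible summary.
    """
    errors = [haversine_miles(p, a) for p, a in zip(predicted, actual)]
    return statistics.median(errors)
```

Passing two equal-length lists of `(latitude, longitude)` pairs to `median_error_miles` gives a single error figure in miles that can be compared with the 300-mile mark.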
Eisenstein et al. / CMU
Researchers clustered Twitter users based on the regional terms they included in their tweets. This map shows how tweets were clustered to reflect different characteristic regions, including Northern and Southern California, Chicago, the Lake Erie region, Boston, New York, Washington, Northern vs. Southern states, and Florida.
Evolving language
"The study shows that people continue to develop new ways of using language, regardless of whether they're talking over lunch or exchanging messages on Twitter," Eisenstein told me via e-mail today.
"But we don't know whether the geographical specificity of these new forms are simply the result of random variation propagating through social networks that are geographically local, or whether it represents an inherent need to express our regional and community affiliations using language."
Written language is traditionally more homogenized than spoken language, but Eisenstein theorizes that Twitter is more reflective of regional dialects because tweets are more informal and conversational. "It will be interesting to see what happens. Will 'suttin' remain a word we see primarily in New York City, or will it spread?" Eisenstein mused in a news release sent out today.
Eisenstein is presenting the study Saturday at the Linguistic Society of America annual meeting in Pittsburgh. A copy of the paper is available here.
In addition to Eisenstein, the authors of "A Latent Variable Model for Geographic Lexical Variation" include Brendan O'Connor, Noah A. Smith and Eric P. Xing, all from Carnegie Mellon University. The research was supported in part by funding from Google, the Air Force Office of Scientific Research, the Office of Naval Research, the National Science Foundation and the Alfred P. Sloan Foundation.
John Roach is a contributing writer for msnbc.com.


Gotta have a pretty lousey life to need to twitter people.
Dear Twitter Kettle,
You are black.
Love, Newsvine Pot.
This from someone who goes online to insult people he doesn't know anything about? Hilarious.
Wow kinda scary when you think about it. Wow.
I don't understand why people can't just type words as they are supposed to be spelled.(By the way, it took me less than half a minute to type both of these sentences with all the correct spellings.)
But you forgot to put a space between the period and the bracket...
@btcoates, Thank you for helping to preserve the dying English language. Accolades to you :)
@punk chemist, those are parentheses (), not brackets [], {}
@observer, very true, I had a bit of a brain fart and couldn't think of the correct word. However, I think you missed the irony there; if you're going to praise yourself on a perfectly correct sentence, you should at least get it right.
and technically, parentheses are a type of bracket.
@punk chemist, no, you are just being an argumentative muppet. He has a good point. His leaving the space out is a mistake, rather than a lack of knowledge. The time saving gained from the use of shortcuts is negligble. The reason it is a common occurrence is because many people just can't spell properly, often due to not reading enough, and assume the generalities as they see them more often via electronic communication.
As per usual for the average internet dweller you would rather divert attention from the fact at hand and confuse it with some parallel, but unrelated point. Almost as pathetic as the general population's spelling ability but there it is...
So I guess if I tweet about "something" I reveal that I'm just old?
I wish here has spell check and it is just so terrible. But, thanks for dictionary.com...
I love the First amendment; and I will take Fifth for the worse part.
And for the rest of us who don't write in slang?
Asides from local references and explicit talk about my location, where I live may not be obvious from what I type. And typing like I have a character limit is probably a consequence of texting on a clamshell phone or having a twitter account, rather than giving obvious clues as to your location.
I think it would be a more telling study if it included education, socio-economic and ethnic heritage in the demographic matrix. As a New Yorker I personally do not know anyone or of anyone that says or writes 'wasssup' or 'suttin'.
A tweet is from all walks of life. We have the freedom to express however we choose. It is no matter where we r from.
Spend time finding a cure for cancer or improve our society please.
Unless you are involved in cancer research why don't you quit life and spare the world of the ecological foot print you are causing? Would be more pleasant than reading the rubbish you post.
My favorite tweet is pwain M&Ms. My favorite healthy tweet is yogurt covered waisins.
Most who text use abbreviations. The limited length of tweets would encourage this. As a parent I have to work hard to keep up with my son's abbreviations.
Anong
Why, YOU are doing such a great job!