Culturomics – 500 billion words start a trend

December 18, 2010

My brother-in-law sent me this one from the New York Times (thanks Ade!) and it blew me away. I’m guessing that people already know about the controversial project by Google to digitise every book in the world. If you don’t, it’s easy to find out a bit about it. Just Google it. *sigh*

Now, from that effort, a huge, and I mean monstrously, giganto-huge, database has been made from nearly 5.2 million digitised books. That database is now available to the public for free download and online searching. Before you panic that every book ever written is now available for free (which is what a lot of people fear), take a moment to understand the nature of the database. It consists of the 500 billion words contained in books published between 1500 and 2008 in English, French, Spanish, German, Chinese and Russian. That word-mine is made up of words and short phrases, along with a year-by-year count of how often each one appears. The potential use for this in cultural studies and the humanities is mind-boggling.
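For the curious, here is a minimal sketch of what "words plus a year-by-year count" might look like in practice, and how you could turn raw counts into a trend. Everything in it is an assumption for illustration: the row layout, the `yearly_share` helper, and all the numbers are invented, and it is not the actual format of the Google download.

```python
# Hypothetical rows: (phrase, year, times_seen). The numbers are invented.
rows = [
    ("sasquatch", 1950, 2),
    ("sasquatch", 1975, 40),
    ("sasquatch", 2000, 90),
]

# Hypothetical total number of words printed in each year (also invented).
totals = {1950: 1_000_000, 1975: 2_000_000, 2000: 3_000_000}


def yearly_share(rows, totals, phrase):
    """Return {year: occurrences per million words} for one phrase."""
    share = {}
    for word, year, count in rows:
        if word == phrase:
            # Scale by that year's total so busy publishing years
            # do not look like a surge in popularity.
            share[year] = share.get(year, 0) + count * 1_000_000 / totals[year]
    return share


trend = yearly_share(rows, totals, "sasquatch")
for year in sorted(trend):
    print(year, round(trend[year], 1))
```

Dividing by the yearly total matters because far more books are published now than in 1500, so a raw count on its own would make almost every word look like it is trending upwards.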

“The goal is to give an 8-year-old the ability to browse cultural trends throughout history, as recorded in books,” said Erez Lieberman Aiden, a junior fellow at the Society of Fellows at Harvard. He calls this method of mass, high-speed analysis “culturomics”: the application of high-throughput data collection and analysis to the study of human culture.

There are those who have reservations about the efficacy of the project and those who question the team involved, suggesting that not all the right kinds of experts are represented. But you always get that among academics. They can be a bitchy bunch.

The New York Times article closes with this gem:

The warehouse of words makes it possible to analyze cultural influences statistically in a way previously not possible. Cultural references tend to appear in print much less frequently than everyday words, said Mr. Michel, whose expertise is in applied math and systems biology. An accurate picture needs a huge sample. Checking if “sasquatch” has infiltrated the culture requires a supply of at least a billion words a year, he said.

Read the whole article for a much clearer idea of what’s happening. There are links in the article to the full Science journal paper (available free to everyone, although you have to register) and an online tool to search the Google database for the use of any particular word or phrase over time. I can see myself wasting a lot of time with this.

The website of author Alan Baxter