Culturomics? What is it? Wikipedia gives us a very concise explanation:
Culturomics is a form of computational lexicology that studies human behavior and cultural trends through the quantitative analysis of digitized texts. Researchers data mine large digital archives to investigate cultural phenomena reflected in language and word usage. The term is an American neologism first described in a 2010 Science article called Quantitative Analysis of Culture Using Millions of Digitized Books, co-authored by Harvard researchers Jean-Baptiste Michel and Erez Lieberman Aiden. Michel and Aiden helped create the Google Labs project Google Ngram Viewer which uses n-grams to analyze the Google Book digital library for cultural patterns in language use over time.
[Figure: one graphic of this data mining algorithm]
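To make the n-gram idea a little more concrete, here is a minimal Python sketch of counting how often a phrase appears per year, relative to all n-grams of the same length in that year. It is only an illustration of the general technique, not the Ngram Viewer's actual implementation; the function names (ngrams, relative_frequency) and the tiny corpus format of (year, text) pairs are assumptions made up for this example.

```python
from collections import defaultdict


def ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def relative_frequency(corpus, phrase):
    """Share of same-length n-grams per year that equal `phrase`.

    `corpus` is a hypothetical iterable of (year, text) pairs.
    Tokenization is a naive whitespace split, purely for illustration.
    """
    target = phrase.lower()
    n = len(target.split())
    hits = defaultdict(int)
    totals = defaultdict(int)
    for year, text in corpus:
        grams = ngrams(text.lower().split(), n)
        totals[year] += len(grams)
        hits[year] += grams.count(target)
    # Skip years that are too short to contain any n-gram of this length.
    return {year: hits[year] / totals[year]
            for year in sorted(totals) if totals[year]}
```

Calling relative_frequency(corpus, "steam engine") on a list of (year, text) pairs would give a year-by-year series that could be plotted as a trend line. A real corpus would need proper tokenization and far more text per year.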
This research began with the publication of a paper titled Quantitative Analysis of Culture Using Millions of Digitized Books. The abstract explains things quite clearly:
We constructed a corpus of digitized texts containing about 4% of all books ever printed. Analysis of this corpus enables us to investigate cultural trends quantitatively. We survey the vast terrain of ‘culturomics,’ focusing on linguistic and cultural phenomena that were reflected in the English language between 1800 and 2000. We show how this approach can provide insights about fields as diverse as lexicography, the evolution of grammar, collective memory, the adoption of technology, the pursuit of fame, censorship, and historical epidemiology. Culturomics extends the boundaries of rigorous quantitative inquiry to a wide array of new phenomena spanning the social sciences and the humanities.
But this idea was logically extended to model not only the behavior of ideas but also the actions of those who espouse them: human behavior. After all, if books could be scanned, why not newspapers, magazines, websites, and so on? Just recently, on September 5th, Kalev H. Leetaru published an article titled Culturomics 2.0: Forecasting large-scale human behavior using global news media tone in time and space. Again, the abstract of the article does a good job of explaining the concept:
News is increasingly being produced and consumed online, supplanting print and broadcast to represent nearly half of the news monitored across the world today by Western intelligence agencies. Recent literature has suggested that computational analysis of large text archives can yield novel insights to the functioning of society, including predicting future economic events. Applying tone and geographic analysis to a 30–year worldwide news archive, global news tone is found to have forecasted the revolutions in Tunisia, Egypt, and Libya, including the removal of Egyptian President Mubarak, predicted the stability of Saudi Arabia (at least through May 2011), estimated Osama Bin Laden’s likely hiding place as a 200–kilometer radius in Northern Pakistan that includes Abbotabad, and offered a new look at the world’s cultural affiliations. Along the way, common assertions about the news, such as “news is becoming more negative” and “American news portrays a U.S.–centric view of the world” are found to have merit.
The article goes on to develop this idea further. Culturomics was initially conceived to understand "digested history," as found in books. The article, however, points out that "People take action based on the imperfect information available to them at the time, and the news media captures a snapshot of the real-time public information environment." News sources convey a lot more than just "facts." Research in this area goes back at least to 1977, with the publication of The Many Worlds of the World's Press by George Gerbner and George Marvanyi in the Journal of Communication. The 2011 article, citing this 1977 paper, states, "News contains far more than just factual details; an array of cultural and contextual influences strongly impact how events are framed for an outlet's audience, offering a window into national consciousness."
The aim is to predict social behavior: "A growing body of work has shown that measuring the 'tone' of this realtime consciousness can accurately forecast many broad social behaviors, ranging from box office sales to the stock market itself."
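As a rough illustration of what "measuring tone" can mean in practice, here is a minimal Python sketch that scores text by counting positive and negative words and then averages those scores by month. This is not the method used in the 2011 article; the word lists, the function names (tone, monthly_tone), and the (month, text) input format are all hypothetical, chosen only to show the general shape of a dictionary-based tone measure.

```python
import re

# Hypothetical, deliberately tiny word lists; real tone dictionaries
# contain thousands of entries.
POSITIVE = {"peace", "growth", "agreement", "hope", "stable"}
NEGATIVE = {"riot", "crisis", "protest", "violence", "collapse"}


def tone(text):
    """Return (positive - negative) word count divided by total words.

    The score lies between -1.0 (all negative) and 1.0 (all positive).
    Empty text scores 0.0.
    """
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / len(words)


def monthly_tone(articles):
    """Average the tone of (month, text) pairs for each month."""
    sums = {}
    counts = {}
    for month, text in articles:
        sums[month] = sums.get(month, 0.0) + tone(text)
        counts[month] = counts.get(month, 0) + 1
    return {month: sums[month] / counts[month] for month in sorted(sums)}
```

A sustained drop in the monthly average for one country's coverage is the kind of signal a forecasting approach would look for. Simple word counting like this ignores negation and context, so treat it as a toy.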
The central question the paper asks is the same question this series asks: "Can public tone of the global news data forecast even broader behaviors, such as the stability of nations, the location of terrorist leaders, or even offer new insight on conflict and cooperation among countries, as accurately as it predicts movie sales or stock movements?" We shall find out.