Tuesday, September 13, 2011

Culturomics: Computers That Predict The Future? 2

Culturomics 2.0 - the new way to analyze the world.


In the previous part of this series, we covered some of the basic theory behind Culturomics, from its beginnings in the humanities.  Now it has entered the world of intelligence.

Three Case Studies
The paper by Kalev Leetaru cited in the previous part uses three case studies to validate his computational methods - the Egyptian revolution, the Libyan revolution and the location of Bin Laden.  It also covers the political stability of Saudi Arabia.  Leetaru is a professor at the University of Illinois and a senior research scientist at the Institute for Computing in the Humanities, Arts, and Social Science.  Although this program is portrayed as a "humanities" program, and does have that component in liberal arts academia, it has been transformed into an intelligence tool to help the United States government predict and anticipate events.  This, of course, serves U.S. intelligence interests and, by extension and implication, any covert activities on the part of the CIA.  So this program is no doubt funded, at least in part, by military interests.

IBM's PowerXCell 8i CPU
used in Nautilus
The program uses the famed Nautilus supercomputer, which is run by the National Institute for Computational Sciences (NICS) at the Oak Ridge National Laboratory (ORNL).  The relationship between the military and the American scientific establishment has a long tradition and many threads.  Often there is cross-funding or cooperation between the two, making many scientific research institutions virtually extensions of Defense Department research.1, 2 (see 1972)

The Nautilus supercomputer
This supercomputer is billed as the most energy efficient computer in the world and one of the fastest, ranking in the top 500 supercomputers in the world.  The system, an SGI Altix UV 1000, has 1,024 cores of Intel Nehalem EX processors, 427 terabytes of storage space and 4 terabytes of global shared memory, and can do 8.2 trillion calculations per second (8.2 teraflops).3  How Leetaru used this system is impressive.
Kalev Leetaru of the University of Illinois in Urbana-Champaign combined three massive news archives totaling more than 100 million articles worldwide to explore the global consciousness of the news media. The complete New York Times from 1945 to 2005, the unclassified edition of Summary of World Broadcasts from 1979 to 2010, and an archive of English-language Google News articles spanning 2006 to 2011 were used to capture a cross-section of the U.S. media spanning half a century and the global media over a quarter-century. 
Advanced tonal, geographic, and network analysis methods were used to produce a network 2.4 petabytes in size containing more than 10 billion people, places, things, and activities connected by over 100 trillion relationships, capturing a cross-section of Earth from the news media. A subset of findings from this analysis were then reproduced for this study using more traditional methods and smaller-scale workflows that offer a model for a new class of digital humanities research that explores how the world views itself.
We include a video in which Dr. Leetaru explains "crisis mapping" and its automation.  If you cannot see the embedded video, here is the link: http://youtu.be/v4-eLnqS-SE.

Culturomics 1.0 project
"The observer influences the events he observes by the mere act of observing them or by being there to observe them." - Janov Pelorat, Foundation's Edge
Culturomics 1.0 Upgraded to 2.0
The expansion to newspapers, magazines and websites is what distinguishes the 2.0 upgrade.  It seeks to know not what people thought after reflection, but what large groups of people are thinking in real time.  This system is not intended to size up what an individual thinks; it is intended to understand and predict mass movements.  Like all computer models, it suffers from the butterfly effect: small errors in what goes in grow quickly, so it cannot predict things that will happen in the distant future.  It is not clear from the paper just how far ahead it can calculate.  It is easy to look back and see which movements came onto the stage of history.  But predicting the future through algorithms is, at least at the present time, too deterministic and simplistic a shell for the complexity of reality.
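To make the butterfly effect concrete, here is a minimal Python sketch.  It does not come from Leetaru's paper and has nothing to do with his actual model; it uses the logistic map, a standard textbook example of a chaotic system, and the function name and starting values are our own choices for illustration.  Two runs start almost exactly at the same point, differing by one part in ten billion, and we watch how far apart they end up.

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate the logistic map x -> r * x * (1 - x), starting from x0.

    With r = 4.0 the map is chaotic: tiny differences in the starting
    value roughly double at every step. (Illustrative toy model only.)
    """
    orbit = [x0]
    for _ in range(steps):
        orbit.append(r * orbit[-1] * (1.0 - orbit[-1]))
    return orbit


# Two "forecasts" whose starting points differ by one part in ten billion.
a = logistic_orbit(0.2, 60)
b = logistic_orbit(0.2 + 1e-10, 60)

# Distance between the two runs at every step.
gaps = [abs(x - y) for x, y in zip(a, b)]

print(f"gap at step 0:            {gaps[0]:.1e}")
print(f"largest gap in 60 steps:  {max(gaps):.2f}")
```

The two runs agree closely for the first steps and then bear no resemblance to each other, which is why any model of this kind can only look a short distance ahead.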

Nevertheless, governments all over the world are spending large sums to have a mathematical crystal ball into the future.  Indeed, for some this is every dictator's dream.  For others, hopefully, it is a way for governments to change what they do to keep their people happy.

As usual, our science fiction writers are way ahead.  This entire system resembles the psychohistory of Isaac Asimov's Foundation Trilogy, whose first novel was published in 1951!  We find it interesting to note the description of psychohistory given in Wikipedia:
Psychohistory is a fictional science in Isaac Asimov's Foundation universe which combines history, sociology, and mathematical statistics to make general predictions about the future behavior of very large groups of people, such as the Galactic Empire. It was first introduced in the five short stories (1942–1944) which would later be collected as the 1951 novel Foundation.
Does this sound familiar?  It does to us!  This supercomputer is Asimov's Prime Radiant - the device that stores all the formulas and allows them to be amended and adjusted when needed.

Case 1: Egypt
According to Leetaru, Facebook and other social media helped to organize resistance to Mubarak in Egypt, but they were not the main communication medium for the Egyptian people.
One of the first sites secured by the Army when it entered Cairo was the state television headquarters, and TV programming focused on the lawlessness caused by the protests while highlighting the steps the government was taking to restore peace (Fahim, 2011). Indeed, state television’s coverage of the protests, depicting them as “foreign and violent” or ignoring them altogether, isolated the protesters and helped the regime regain its balance in the early days of the protests (Fahim, et al., 2011). Organizers later conceded that relying on social media alone to get their message out, even in a country as wired as Egypt, was not enough and traditional mainstream news media remains the dominate force in driving public opinion in that country (Fahim, et al., 2011).
The data fed into the computer for this paper did not include social media.  Leetaru explains why:
Citizen media also presents many unique challenges to computational analysis. While some platforms like Twitter do provide programmer interfaces to their content, and blogs are available through several blog aggregators with RSS feeds, other platforms like Facebook actively prevent crawling even for academic study (Warden, 2010). In addition, social media and other localized indicators tend to be in vernacular languages, making use of localized slang or idiomatic expressions, requiring significant translation effort. Social media also show strong geographic disparity, with Twitter users in California and New York producing more content per capita than anywhere else in the United States or even Europe (Signorini, et al., 2011), while questions have been raised as to whether Twitter captures world events as well as it does entertainment and cultural news (Taylor, 2011).
Leetaru sums up what his analysis found about the Egyptian crisis that resulted in the removal of Mubarak:
On 25 January 2011, popular dissent with the Egyptian state culminated in mass protests that continued through President Mubarak’s resignation on 11 February. Figure 2 shows the average tone by month from January 1979 to March 2011 of all 52,438 articles captured by SWB mentioning an Egyptian city anywhere in the article...Only twice in the last 30 years has the global tone about Egypt dropped more than three standard deviations below average: January 1991 (the U.S. aerial bombardment of Iraqi troops in Kuwait) and 1–24 January 2011, ahead of the mass uprising. The only other period of sharp negative moment was March 2003, the launch of the U.S. invasion of neighboring Iraq.
Tone of coverage mentioning Egypt, Summary of World Broadcasts January 1979–March 2011 (January 2011 is 1–24 January). Y axis is Z–scores (standard deviations from mean).

Leetaru explains what the graph shows and how it could have helped government officials understand what was going on in Egypt:
Despite being hailed as a social media revolution, monitoring the tone of only mainstream media around the world would have been enough to suggest the potential for unrest in Egypt. While such a surge in negativity about Egypt would not have automatically indicated that the government would be overthrown, it would at the very least have suggested to policy–makers and intelligence analysts that there was increased potential for unrest.
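The test Leetaru describes is simple to state: turn each month's average tone into a Z-score (standard deviations from the long-run mean) and look for months that fall more than three standard deviations below it.  A minimal Python sketch of that idea follows.  The numbers are invented for illustration, the function names are our own, and the paper's actual tone-scoring pipeline is not reproduced here; we also assume the mean and standard deviation are taken over the whole series, which the passages quoted above do not spell out.

```python
from statistics import mean, pstdev


def tone_z_scores(monthly_tone):
    """Convert {month: average tone} into {month: z-score}.

    Assumption: the baseline mean and (population) standard deviation
    are computed over every month in the series.
    """
    values = list(monthly_tone.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("tone never varies; z-scores are undefined")
    return {month: (tone - mu) / sigma for month, tone in monthly_tone.items()}


def flag_negative_outliers(monthly_tone, threshold=-3.0):
    """Return the months whose tone z-score falls below the threshold."""
    scores = tone_z_scores(monthly_tone)
    return [month for month, score in scores.items() if score < threshold]


# Made-up data: 24 unremarkable months, then one sharply negative month.
sample = {}
for year in (2009, 2010):
    for m in range(1, 13):
        sample[f"{year}-{m:02d}"] = 0.1 * ((m % 3) - 1)
sample["2011-01"] = -2.0

print(flag_negative_outliers(sample))  # ['2011-01']
```

In this toy series the last month scores about 4.8 standard deviations below the mean, while every other month stays within half a standard deviation of it.  As the quote says, a flag like this suggests increased potential for unrest; it does not say what will happen next.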
In the next part of this series we will continue with the examples of Libya and Bin Laden.
