Tuesday, September 27, 2011

FuturICT's Crisis Observatory 5a: Data Mining, Reality Mining & Privacy

Have we inadvertently built a global time bomb, as the devices in our ad hoc global patchwork make autonomous decisions based on their subjective interpretation of the surrounding world?

Indexing the real world using location data for predictive analytics... Sense Networks
Living Earth Simulator
Just how can this Living Earth Simulator be constructed?  What will be the nuts and bolts of such a global system?  The goal will be to measure the "state of the world" in real time.

This will require real-time data mining or reality mining.  Reality mining and data mining are not the same thing, even though we may give that impression.  In 2009, a paper published in the journal Science, entitled Computational Social Science, listed the different ways in which this network could be created and, indeed, is already being created.

  1. Video recording and analysis of the first two years of a child's life - This would make it possible to detect autism and understand its relationship to language.
  2. Examination of group interactions through e-mail data - This would be to answer the following questions: Do work groups reach a stasis with little change, or do they dramatically change over time?  What interaction patterns predict highly productive groups and individuals?  Can the diversity of news and content we receive predict our power or performance?
  3. Examination of face-to-face group interactions over time using sociometers - These are small electronic packages worn like a standard ID badge which capture proximity, location, movement and other facets of individual behavior and collective interactions.
  4. Macro communication patterns - Phone records and call patterns, together with chats collected by Google and Yahoo, would be used to answer questions like: what flow patterns are associated with high performance at the individual and group levels?
  5. Tracking movement - GPS technologies track the movements of people and their physical proximity over time, allowing inference of cognitive relationships such as friendships, and answering questions like: how might a pathogen such as influenza, driven by physical proximity, spread through a population?
  6. Internet - The Internet offers an entirely different channel for understanding what people are saying and how they are connecting, for example by tracing the spread of arguments, rumors and positions in the blogosphere.
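To make item 5 concrete, here is a minimal sketch of how physical proximity might be turned into a guess about relationships.  It is our own illustration, not a method from the paper: the ping format, the 50-metre radius, the names and the function `colocation_counts` are all assumptions.  It simply counts how often two people are seen close together in the same time slot.

```python
from collections import Counter
from itertools import combinations
from math import asin, cos, radians, sin, sqrt


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    earth_radius_m = 6_371_000
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_m * asin(sqrt(a))


def colocation_counts(pings, max_distance_m=50.0):
    """Count, per pair of people, the time slots in which they were near each other.

    pings: iterable of (person, time_slot, lat, lon) tuples (hypothetical format).
    """
    by_slot = {}
    for person, slot, lat, lon in pings:
        by_slot.setdefault(slot, []).append((person, lat, lon))

    counts = Counter()
    for people in by_slot.values():
        for (p1, lat1, lon1), (p2, lat2, lon2) in combinations(people, 2):
            if p1 != p2 and haversine_m(lat1, lon1, lat2, lon2) <= max_distance_m:
                counts[tuple(sorted((p1, p2)))] += 1
    return counts


# Made-up pings: ann and bob stay a few metres apart, cat is far away.
pings = [
    ("ann", 0, 37.7750, -122.4190),
    ("bob", 0, 37.7751, -122.4191),
    ("cat", 0, 37.8000, -122.4000),
    ("ann", 1, 37.7760, -122.4180),
    ("bob", 1, 37.7761, -122.4181),
    ("cat", 1, 37.7900, -122.4100),
]
counts = colocation_counts(pings)
print(counts)
```

Pairs with high counts are candidates for a friendship, or for a path along which influenza could travel.  A real study would need far more care with sampling, noise and consent.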
Dr. Pentland has already acted on this idea by forming a company called Sense Networks.  Cell phones will be the sociometers and, in some areas, already are.  Using cell phones, the company has already experimented in San Francisco with an application called Citysense.  Dr. Pentland describes what the application does.
...it evolves searching to sensing. Citysense passively "senses" the most popular places based on actual real-time activity and displays a live heat map. The application intelligently leverages the inherent wisdom of crowds without any change in existing user behavior, in order to navigate people to the hottest spots in a city. And it's not dependent on having a critical mass of users on the system...The application learns about where each user likes to spend time – and it processes the movements of other users with similar patterns. In its next release, Citysense will not only answer "where is everyone right now" but "where is everyone like me right now." Four friends at dinner discussing where to go next will see four different live maps of hotspots and unexpected activity. Even if they're having dinner in a city they've never visited before.
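The live heat map idea can be sketched in a few lines.  This is only our illustration of the general technique of binning location pings into grid cells and counting them.  It is not Citysense's actual algorithm, and the cell size, the coordinates and the function `heat_map` are assumptions.

```python
from collections import Counter
from math import floor


def heat_map(points, cell_deg=0.01):
    """Count location pings per grid cell.

    points: iterable of (lat, lon) pairs.
    cell_deg: grid cell size in degrees (an arbitrary choice for this sketch).
    """
    grid = Counter()
    for lat, lon in points:
        cell = (floor(lat / cell_deg), floor(lon / cell_deg))
        grid[cell] += 1
    return grid


# Made-up pings: three close together, one elsewhere.
points = [
    (37.7751, -122.4194),
    (37.7752, -122.4191),
    (37.7758, -122.4199),
    (37.8044, -122.2712),
]
grid = heat_map(points)
hottest_cell, hottest_count = grid.most_common(1)[0]
print(hottest_cell, hottest_count)
```

A "where is everyone like me" view could, under the same assumptions, be approximated by feeding the function only the pings of users whose past movements resemble yours.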
Citysense is built on a base technology called Macrosense.  What does it do?
The Macrosense platform cleans, processes and analyzes live incoming data against years of historical learning – and other external sources – to deliver alerts of unusual activity and customer intelligence in real-time. Macrosense databases are uniquely designed to process spatiotemporal data efficiently for rapid decision-making support. Sense Networks has created powerful analytical models that are applied at each step in the data flow. From spatial data cleaning algorithms to identify and correct errors in raw location data streams to machine learning-driven behavioral analytics to measure and predict where people are and where they will go, Macrosense is tuned to transform raw location data into meaningful intelligence. The powerful algorithms continuously learn as new data arrives into the system.
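The phrase "alerts of unusual activity" measured against "historical learning" suggests some form of baseline comparison.  We do not know how Macrosense does this internally.  The sketch below uses a plain z-score test as a stand-in, and the place names, counts, threshold and the function `unusual_activity` are all hypothetical.

```python
from statistics import mean, stdev


def unusual_activity(history, live, threshold=3.0):
    """Flag places whose live count is far from their historical average.

    history: dict mapping place -> list of past visitor counts.
    live: dict mapping place -> current visitor count.
    Returns a dict mapping flagged place -> z-score.
    """
    alerts = {}
    for place, past in history.items():
        if len(past) < 2:
            continue  # not enough history to estimate spread
        sigma = stdev(past)
        if sigma == 0:
            continue  # constant history, z-score undefined
        z = (live.get(place, 0) - mean(past)) / sigma
        if abs(z) > threshold:
            alerts[place] = z
    return alerts


# Made-up counts for two places.
history = {
    "ferry_building": [40, 42, 38, 41, 39],
    "union_square": [100, 95, 105, 98, 102],
}
live = {"ferry_building": 90, "union_square": 101}
alerts = unusual_activity(history, live)
print(alerts)
```

Here only the first place is flagged, because its live count is many standard deviations above its usual level.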
The site has some video maps for how this is formed.  We post them for your perusal.

One can see the power of this system to understand people's habits, movements and other traits.  As computing power increases, so will the predictive and analytic reach of this software.

We include a short video which is obviously oriented towards selling software involved in predictive analytics.  Nevertheless, it is a concise explanation of data mining.  If you cannot see the embedded video, here is the link: http://youtu.be/cIcmd5zfu3c.

One of the approaches that will be used by FuturICT will be the Human Dynamics Laboratory's method, led by Professor Alex Pentland of the MIT Media Lab.  We will let Professor Pentland explain reality mining in his own words.
Reality mining is just like data mining, where you look at the data, where you try to find patterns, make predictions and understand what's going on, except, instead of being applied to text and web pages, things that are already digital, we're trying to find patterns in real life.  We do this by putting sensors on people.  Cell phones are an example of sensors.  People know where they are.  They know who's around them.  So by looking at the data that comes off the cell phone, you can tell a lot about people, where they go, who they hang out with, even whether they're having a good time.
Dr. Pentland continues with how this data is to be interpreted.
It's actually a much more difficult thing, in terms of the interpretation of the data. Both because the data is not just ones and zeros, these analog signals, and because the interpretation is about things we care about, like who are our friends, are we working, etc.  Reality mining uses cell phones, uses communication systems which provides a real sense of social context, so applications can be socially aware.
We include a video of him explaining these things.  If you cannot see the embedded video, here is the link: http://bit.ly/rmpRMi.

We include a 30 minute video on "reality mining," presented by Dr. Pentland.  If you cannot see the embedded video, here is the link: http://youtu.be/AvJyz2PhjX0.

If all of this makes you a bit nervous, you are not alone.  In our final installment in this series, we shall speak of the one thing that might stop all of this - privacy issues.  We are in need of a new deal on privacy.

Lazer, D., Pentland, A., Adamic, L., Aral, S., Barabási, A., Brewer, D., & ... Van Alstyne, M. (2009). Computational Social Science. Science, 323(5915), 721-723.
