Friday, August 26, 2011

Beyond von Neumann: IBM's Cognitive Computers 3b

We continue in this article with details of DARPA's project SyNAPSE.
What makes a cognitive computer different from a regular one?  Do they really go beyond the von Neumann architecture?

Over the last several years, IBM researchers have developed an algorithm called Blue Matter.  They described it in a 2008 paper, titled "Blue Matter: Scaling of N-body simulations to one atom per node," published in the IBM Journal of Research and Development (vol. 52, Issue 1/2, January 2008).  For those of us who are not computer scientists, this explanation may get a bit detailed, but we shall do our best.

The Blue Matter algorithm developed by IBM is a numerical simulation of a molecular system.  The point of this simulation is to understand the details of the structure and dynamics of biomolecules.  The paper states,
Such simulations are used to sample the configurations assumed by the biomolecule at a specified temperature and also to study the evolution of the dynamical system under some specified set of conditions.
The paper continues by detailing the challenges of biomolecular simulation.
Among the many challenges facing the biomolecular simulation community, the one that stresses computer systems to the utmost is increasing the timescales probed by simulation in order to better make contact with physical experiment. Even sampling techniques that do not themselves yield kinetic information can benefit from an increased computation rate for a single trajectory.
The authors go on to explain classical biomolecular simulation, which involves both what is called the "Monte Carlo" method and molecular dynamics.  We shall discuss both methods.

The Monte Carlo method was given its name by John von Neumann, Stanislaw Ulam, and Nicholas Metropolis while they were working on the Manhattan Project in the 1940s.  It is a randomized algorithm used to simulate physical systems where a deterministic algorithm (a calculation in which a given input always produces the same output) will not do.  It has been found useful for simulating systems such as fluids, disordered materials, strongly coupled solids, and cellular structures.
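The idea can be seen in a few lines of code.  What follows is a minimal sketch (our own illustration, not IBM's code) of the Metropolis variant of Monte Carlo: a particle takes random trial moves, and each move is accepted with a probability that depends on the energy change, so that over many steps the positions sampled follow the correct thermal distribution.

```python
import math
import random

def metropolis_sample(energy, x0, beta, step, n_steps, seed=0):
    """Metropolis Monte Carlo: random-walk sampling of exp(-beta * E(x))."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_steps):
        # Propose a random move; accept it with probability
        # min(1, exp(-beta * dE)), so uphill moves are sometimes allowed.
        x_new = x + rng.uniform(-step, step)
        dE = energy(x_new) - energy(x)
        if dE <= 0 or rng.random() < math.exp(-beta * dE):
            x = x_new
        samples.append(x)
    return samples

# For a harmonic potential E(x) = x^2/2 at beta = 1, the Boltzmann
# distribution is a unit Gaussian, so the sampled variance should
# approach 1 as the number of steps grows.
samples = metropolis_sample(lambda x: 0.5 * x * x, 0.0, 1.0, 1.0, 200_000)
var = sum(s * s for s in samples) / len(samples)
print(var)
```

Notice that the answer emerges from randomness: no single step is predictable, yet the statistics of many steps are.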

The second biomolecular simulation approach is called molecular dynamics.  In contrast with the Monte Carlo method, it is deterministic.  The molecular dynamics model was devised by physicists during the 1950s and 1960s, and there is a host of software programs used for biomolecular simulations.  For those who want a visual explanation of these two methods, we supply some videos that may assist.  We should note that the Monte Carlo method is used not only in science but also in finance, to predict the probable effects of business decisions, among other things.  The method is the same.  If you cannot see the embedded video, here is the link: http://bit.ly/pPUB53.

IBM's Blue Matter Algorithm
According to the 2008 paper cited above, the IBM researchers have focused their efforts on the molecular dynamics approach.  This is not to say, however, that they never combine molecular dynamics with the Monte Carlo method.
Classical biomolecular simulation includes both Monte Carlo and molecular dynamics (MD). The focus of our work has been on MD, although the replica exchange or parallel tempering method, which combines MD with Monte Carlo-style moves, has been implemented in Blue Matter as well.
Of course, like all such simulations, Blue Matter faces the N-body problem.  For those of you who are not acquainted with it, we shall explain.  The N-body problem concerns moving bodies and how they interact with one another, whether we are dealing with stars and planets or molecules and atoms.  The dynamics of these interactions is VERY complex.  So complex, in fact, that once three or more bodies are involved, there is no general formula predicting their motion.  Dr. Sverre Aarseth from the Institute of Astronomy in Cambridge explains,
The N-body problem dates back to Newton, who formulated the law of gravitational attraction. It is the problem of describing the motion of N bodies under their mutual influence. For example the Earth moves around the Sun while at the same time the Moon revolves around the Earth. In a nutshell, this constitutes the three-body problem. 
This seemingly simple problem has fascinated mathematicians past and present. Indeed for 3 bodies or more there are no general solutions! The trajectories of the bodies depend on many things, namely their masses, coordinates and velocities at the beginning. In fact the motion can look very different if those initial conditions are modified, even by the slightest change.
He goes on to explain the challenges of computer simulations of dynamic bodies in general.
Because of this we need to perform computer simulations to describe the motion reliably. This is done by numerical integrations in which the accuracy of the computer and the calculations are crucial to the reliability of the N-body code. 
The essential point of such an orbit integration is to advance the solutions by small time intervals. At each point of the simulation the instantaneous particle forces are calculated and the particles take a small step under that force. By the time they have moved, the forces of the particles on each other have changed so they are calculated again and the next step is taken. The motion is essentially cut into very small segments. In this way, the motion of each body can be represented accurately by a continuous curve. The art of scientific computing is to employ an efficient method for obtaining reliable solutions over long times.
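The loop Dr. Aarseth describes (compute the instantaneous forces, take a small step, recompute, repeat) can be sketched in a few lines.  This is our own toy illustration, not any production N-body code: two bodies under Newtonian gravity, with the forces evaluated from scratch at every step because the bodies have moved.

```python
import math

def step(bodies, dt, G=1.0):
    """Advance an N-body system by one small time interval dt.
    The forces are recomputed each step: the bodies move, so the
    forces change, so they must be evaluated again."""
    n = len(bodies)
    acc = [[0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = bodies[j]["pos"][0] - bodies[i]["pos"][0]
            dy = bodies[j]["pos"][1] - bodies[i]["pos"][1]
            r = math.hypot(dx, dy)
            # Newtonian gravity: a = G * m_j / r^2, directed toward body j.
            a = G * bodies[j]["m"] / (r * r)
            acc[i][0] += a * dx / r
            acc[i][1] += a * dy / r
    for b, (ax, ay) in zip(bodies, acc):
        b["vel"][0] += ax * dt
        b["vel"][1] += ay * dt
        b["pos"][0] += b["vel"][0] * dt
        b["pos"][1] += b["vel"][1] * dt

# A light body on a circular orbit around a heavy one (v = sqrt(G*M/r)).
bodies = [
    {"m": 1.0,  "pos": [0.0, 0.0], "vel": [0.0, 0.0]},
    {"m": 1e-6, "pos": [1.0, 0.0], "vel": [0.0, 1.0]},
]
for _ in range(10_000):
    step(bodies, dt=0.001)
r = math.hypot(*bodies[1]["pos"])
print(r)  # the orbital radius stays near 1.0 when dt is small enough
```

The motion is "cut into very small segments," exactly as described; make dt too large and the orbit visibly drifts away from a circle.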
Chaos theory and the butterfly effect enter into these discussions.  In a physics course at the University at Buffalo, a professor writes,
This N-body problem is one of the oldest problems in physics. After Newton solved the 2-body problem exactly, numerous scientists and mathematicians attempted to find exact solutions to the 3-body problem. So far, no non-trivial exact solutions to the 3-body problem have been found. Mathematicians have been able to prove that the problem is non-integrable, and that many 3-body trajectories are chaotic and cannot be computed numerically.
As the 2008 IBM research paper previously cited explains, Blue Matter models follow the general principles of computer simulation of real-life physical phenomena.
Ideally, a simulation would model a protein in an infinite volume of water, but this is not practical. Instead, the usual approach is to use periodic boundary conditions so that the simulation models an infinite array of identical cells that contain the biomolecules under study, along with the water. This is generally preferred over simulating a single finite box in a vacuum (or some dielectric) because it eliminates the interface at the surface of the simulation cell. The choice of simulation cell size is important. If the cell is too large, unnecessary computations will be done; too small, and the interactions between biomolecules in different cells of the periodic array can introduce artifacts.
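Periodic boundary conditions are simpler than they sound.  Here is a minimal sketch (our own, with made-up function names) of the two standard operations: wrapping a particle that leaves the cell back into it, and the "minimum image" rule, under which each pair of particles interacts through the closest periodic copy.

```python
def wrap(x, box):
    """Map a coordinate back into the primary cell [0, box)."""
    return x % box

def minimum_image(dx, box):
    """Separation to the nearest periodic image: an interaction is
    computed with the closest copy, never farther away than box/2."""
    return dx - box * round(dx / box)

box = 10.0
print(wrap(11.5, box))          # 1.5: the particle re-enters on the left
print(minimum_image(9.0, box))  # -1.0: the image one cell over is closer
```

This is why the cell size matters: with minimum-image distances, a biomolecule that is too large relative to the box ends up interacting with its own periodic copies, producing the artifacts the paper warns about.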
Getting more technical, the paper goes on to describe the mathematical techniques used to get around the N-body problem, or at least to reduce its effect.
The most commonly used techniques for handling the long-range interactions with periodic boundary conditions are based on the Ewald summation method and particle-mesh techniques that divide the electrostatic force evaluation into a real-space portion that can be approximated by a potential with a finite-range cutoff and a reciprocal space portion that involves a convolution of the charge distribution with an interaction kernel [6–9]. This convolution is evaluated using a fast Fourier transform (FFT) method in all of the particle-mesh techniques, including the particle-particle particle-mesh Ewald (P3ME) technique [6, 9] used by Blue Matter.
After discussing methods "...for increasing the effective rate at which the kinetics of the systems evolve...", they mention "kinetic acceleration" and "multiple timestepping algorithms."  In the end they conclude that no matter how carefully one splits the electrostatic forces, "...the use of multiple timestepping leads to significantly larger drifts in the total energy for constant particle number, volume and energy (NVE) simulations" than using the "velocity Verlet integrator."
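The velocity Verlet integrator the authors prefer is short enough to show in full.  This is a generic textbook sketch on a harmonic oscillator (our illustration, not Blue Matter's implementation), and it demonstrates the property at stake: the total energy of an NVE simulation fluctuates but does not steadily drift.

```python
import math

def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Velocity Verlet: a time-reversible integrator whose total
    energy shows small bounded fluctuations rather than a drift."""
    a = force(x) / mass
    for _ in range(n_steps):
        x += v * dt + 0.5 * a * dt * dt   # full position step
        a_new = force(x) / mass           # force at the new position
        v += 0.5 * (a + a_new) * dt       # average old and new accelerations
        a = a_new
    return x, v

# Harmonic oscillator F = -k*x; the energy E = v^2/2 + x^2/2 starts at 0.5
# and should stay there to within a tiny fluctuation.
k, m = 1.0, 1.0
x, v = velocity_verlet(1.0, 0.0, lambda x: -k * x, m, dt=0.01, n_steps=100_000)
energy = 0.5 * m * v * v + 0.5 * k * x * x
print(abs(energy - 0.5))  # remains tiny even after 100,000 steps
```

Long-term energy conservation of this kind is precisely the "correctness indicator" the authors invoke for constant-energy simulations, quoted below.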

For those who are not mathematically inclined, all of this is analogous to sampling.  The movement of real molecules is sampled; the more often it is sampled, the more accurate the simulation will be over time.  In between samples, the algorithms must carry on on their own, and it is during this time that they can depart from what is happening in the analog/real world.  Obviously, the more powerful the computer and the more sophisticated the algorithms, the smaller the deviation from the real-world phenomena.  In theory, a perfect digital simulation will never be reached, but it is hoped that in time it will come close enough to the real physical phenomena that no one will be able to tell the difference between the outcomes of the model and the reality.  The authors of the paper allude to this when they say,
Carrying out such a long timescale simulation also potentially exposes correctness issues with the implementation of MD in a particular application. In this context, correctness refers to the degree to which the numerically simulated trajectory is representative of an actual trajectory within the model (potential surface) used. For a constant energy simulation, the size of both the short-term fluctuations and especially the long-term drift in the energy can be used as indicators of correctness or the lack thereof...
In our next installment we shall discuss more of this Blue Matter algorithm.  Stay tuned. 
