
Modeling

Minimalist Description

Models are abstractions of the world that strip away all of the irrelevant features and leave only the parts that play a role in solving the problem at hand. Models can take many forms:
  • Entity Relationship Models: Describe the different relationships between pieces of a system, such as whole/part relationships, "is a" versus "has a" relationships, etc. These are important in modeling software systems among other things.
  • Network Models: Describe a system in terms of a graph. Typically, the relationships (edges) are weighted. For instance, a weight might represent the likelihood that two people are compatible with one another.
  • Mathematical Models: Describe the behavior of a system or a component as an equation or a set of equations. For example, a Lotka-Volterra model that describes the populations of two kinds of bacteria competing for the same food source.
  • Stochastic Models: Describe the statistical dependencies between random variables. For instance, the likelihood of catching the flu, conditioned on the frequency of interaction with large groups and on whether or not you've had a flu shot. The figure at the right is a Probabilistic Graphical Model (PGM) for an association problem, where four observations are associated with three objects. The graph explicitly defines the dependencies between observations and objects. "Loopy Belief Propagation" can then be used to approximate the marginal distributions.
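
To make the "Mathematical Models" item above concrete, here is a minimal sketch of a competitive Lotka-Volterra model for two populations sharing one food source. The growth rates, carrying capacities, competition coefficients, starting populations, and the forward Euler integrator are all illustrative assumptions, not values or methods taken from a real study.

```python
def competition_rates(x1, x2, r1, r2, k1, k2, a12, a21):
    """Competitive Lotka-Volterra right-hand side for two populations."""
    dx1 = r1 * x1 * (1.0 - (x1 + a12 * x2) / k1)
    dx2 = r2 * x2 * (1.0 - (x2 + a21 * x1) / k2)
    return dx1, dx2


def simulate_competition(x1, x2, t_end, dt=0.01,
                         r1=1.0, r2=1.0, k1=1000.0, k2=1000.0,
                         a12=0.5, a21=0.5):
    """Advance both populations to time t_end with forward Euler steps.

    All default parameter values are hypothetical placeholders.
    """
    steps = int(round(t_end / dt))
    for _ in range(steps):
        dx1, dx2 = competition_rates(x1, x2, r1, r2, k1, k2, a12, a21)
        x1 += dt * dx1
        x2 += dt * dx2
    return x1, x2


# Hypothetical starting populations of the two kinds of bacteria.
final_x1, final_x2 = simulate_competition(10.0, 20.0, t_end=60.0)
```

With these placeholder values each competition coefficient is below 1, so the two populations should settle toward a shared coexistence level rather than one driving the other out.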


Our purpose in modeling is to understand systems. A "model" of a system has:

  1. State Variables: The quantities or qualities that describe the system. Together, these describe the system sufficiently to predict the values of the state variables at some future point in time, assuming that no external forces are applied to the system. So if the system is a deep space object, like the Voyager 1 probe moving through interstellar space at roughly 38,000 miles/hour in a particular direction, its current position and velocity (state variables) are sufficient to predict, with confidence, its position and velocity at a future point in time.
  2. Assumptions: Claims as to what matters to predicting the state variables and what doesn't. For the Voyager probe, we might assume that the probe is far enough from the Sun that the Sun's gravity has no significant impact on its path. 
  3. Relationships: Equations or other logical inferences that describe how the state variables evolve. In the case of the Voyager probe, P(t)=P0+tV0; V(t)=V0 may be a sufficient relationship, where P0 and V0 are the current position and velocity. These relationships are consistent with the assumptions. If the probe were nearer to the Sun, we would need different relationships that include gravitational forces.
  4. Parameters: The quantities or settings that govern the (theoretical) relationships. These are different from the state variables because the parameters affect how we choose to model the state variables and not the state variables themselves. For our Voyager model, there are no parameters. But if the probe were in orbit around the Sun, its inclination, eccentricity, etc. and the Sun's mass would be system parameters (we chose a deep space probe to keep it simple).
  5. Controls: The knobs we can turn to affect the state variables. When we described the state variables, we had an escape clause: "assuming that no external forces are applied to the system." Controls allow us to apply an external force to the system. Control theory is the mathematical theory of how to turn the knobs to move the state variables to desired values. If the scientists at the Jet Propulsion Laboratory want to change the Voyager probe's position or velocity, they can fire its thrusters. After 2040, when Voyager 1 runs out of rocket fuel, it won't have any controls. But right now we can still turn that knob to change the probe's state variables (Voyager 2 is expected to run out of rocket fuel in 2034, though both craft are likely to lose electrical power before then, making it impossible to turn that knob).
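
The five components above can be sketched in code for the deep space probe. This is only an illustration under stated assumptions: the names `ProbeState`, `predict`, and `fire_thrusters` are hypothetical, the thruster burn is treated as an instantaneous velocity change, and the numbers are placeholders rather than real Voyager data.

```python
from dataclasses import dataclass
from typing import Tuple

Vector = Tuple[float, float, float]


@dataclass(frozen=True)
class ProbeState:
    """State variables: position (miles) and velocity (miles/hour)."""
    position: Vector
    velocity: Vector


# Assumption: no gravity or other external force acts on the probe.
# Parameters: none are needed under that assumption.


def predict(state: ProbeState, hours: float) -> ProbeState:
    """Relationship: P(t) = P0 + t*V0 and V(t) = V0."""
    position = tuple(p + hours * v
                     for p, v in zip(state.position, state.velocity))
    return ProbeState(position, state.velocity)


def fire_thrusters(state: ProbeState, delta_v: Vector) -> ProbeState:
    """Control: apply a velocity change (idealized as instantaneous)."""
    velocity = tuple(v + dv for v, dv in zip(state.velocity, delta_v))
    return ProbeState(state.position, velocity)


# Placeholder numbers, not real Voyager telemetry.
initial = ProbeState(position=(0.0, 0.0, 0.0), velocity=(1000.0, 0.0, 0.0))
coasting = predict(initial, hours=24.0)
nudged = predict(fire_thrusters(initial, (0.0, 5.0, 0.0)), hours=2.0)
```

Keeping the state immutable means every prediction or control action returns a new state, so the before and after values can be compared side by side.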
All of these things taken together constitute the model of a system. Different kinds of systems call for different kinds of models, as in the minimalist description above. Note that while the model describes the cause-and-effect relations, it does not, in and of itself, spell out their effects. The model contains enough information to determine the effects, but working them out is a separate step: simulation. Sometimes simulation is trivial. In the case of the Voyager space probe, simulation is as simple as substituting a value of t into the equation P(t)=P0+tV0. But if the probe were in orbit, simulation might require a complex numerical solution of a system of differential equations (real planets and real stars have complex, asymmetric gravitational fields that draw orbits slightly away from simple ellipses).
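
To illustrate what a numerical simulation of the orbiting case might look like, here is a minimal sketch that assumes the simplest possible setting: a single point-mass star, motion in a plane, and normalized units in which the gravitational parameter GM equals 1. The velocity Verlet integrator and the step size are choices made for this example only. A realistic simulation would need the asymmetric gravitational fields mentioned above.

```python
import math


def acceleration(x, y, gm=1.0):
    """Point-mass gravitational acceleration toward the origin."""
    r_cubed = (x * x + y * y) ** 1.5
    return -gm * x / r_cubed, -gm * y / r_cubed


def simulate_orbit(x, y, vx, vy, t_end, dt=1e-3, gm=1.0):
    """Integrate the equations of motion with velocity Verlet steps."""
    ax, ay = acceleration(x, y, gm)
    steps = int(round(t_end / dt))
    for _ in range(steps):
        # Half-step velocity update using the current acceleration.
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        # Full-step position update.
        x += dt * vx
        y += dt * vy
        # Second half-step velocity update using the new acceleration.
        ax, ay = acceleration(x, y, gm)
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
    return x, y, vx, vy


# Circular orbit of radius 1 in normalized units; one period is 2*pi.
x, y, vx, vy = simulate_orbit(1.0, 0.0, 0.0, 1.0, t_end=2.0 * math.pi)
radius = math.hypot(x, y)
```

After one full period the probe should be back close to its starting point, which gives a quick sanity check on the integrator.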

Models are important enough that the Voyager probes were launched with some models aboard, in case any intelligent life form ever found one of the probes and wondered what it was or where it came from. That isn't likely for at least 40,000 years, when Voyager 1 will come within about 10 trillion miles of the star AC+79 3888 in Camelopardalis. But if some Camelopardalian finds the "Golden Record" aboard the probe, they will be able to reason about where to find us from a model of the Sun's position with respect to the local pulsars (lower left in the accompanying figure). The other drawings on the "Golden Record" can also be treated as models. In the upper left is a model of the phonograph player that the Camelopardalians would need to understand in order to play the record. In the upper right is a model of the recorded signal they would expect to find if they played the record, and in the lower right is a model of the hydrogen atom that they could use to understand the way we represent math and science.

If models are good enough to communicate with alien life forms, they can help us communicate with clients. The world is much more complicated when we don't have a way to break it down into pieces that fit neatly together. The Natural Philosophers call that process "Reductionism". There are other communication tools, but Reductionism is a powerful one. We can communicate with our clients about their issues by reducing them to modeling questions. If we can use the state variables, assumptions, relationships, parameters, and controls to build a language, we can hold a useful conversation. Our clients can use that language to understand and regulate the flow of our work and maximize its value.



S3 Data Science, copyright 2015