So imagine you're trying to figure out whether someone you like (romantically) returns your interest. Obviously this is a pretty complicated question, with many factors at play, and you also don't just want a yes-no answer: love and life are uncertain, so you want a probability. So you say to yourself, as Romeo once did, "Is there an intuitive mathematical structure I could use to reason probabilistically about this problem?"
As it happens, the answer is "Yes: a probabilistic graphical model." Let's start with a very simple scenario: you're trying to figure out if she likes you based on whether or not she smiles at you in the hallway. (I'm going to use female pronouns here to keep things concrete because I think this topic is tricky and abstract enough already.) But there's a complicating factor: her dog may have died. People don't tend to smile when that happens. We can represent this scenario using the diagram below:
An arrow from one circle to a second circle means that the probability of the second depends on whether the first is true. So the probability that she smiles at you depends on whether she likes you and whether her dog died. But the probability that her dog died does not depend on whether she likes you. Makes sense, right? You can also take an arrow from circle 1 to circle 2 to mean "event 1 affects event 2", and we call circle 1 the "parent" of circle 2. The redder the circle is, the more likely it is; mouse over circles to see probabilities. For example, the model believes there's a 35.8% chance she'll smile at you in the hallway.
What did we actually have to give the model to compute that number? Three things: one for each circle.
| Did Her Dog Die? | Probability |
|---|---|
| Yes | 1% |
| No | 99% |

| Does She Like You? | Probability |
|---|---|
| Yes | 10% |
| No | 90% |

| Did Her Dog Die? | Does She Like You? | Probability She Smiles At You |
|---|---|---|
| Yes | Yes | 50% |
| Yes | No | 10% |
| No | Yes | 90% |
| No | No | 30% |
You'll notice that the third ingredient is more complicated: that's because whether she smiles at you has two parents (circles that point to it), while the other two circles have no parents. In general, we have to specify how likely we think each event is to occur given its parents. Once we've given the model those probabilities, it uses the laws of probability to compute all the other probabilities we're interested in. (If you're interested in the mathematical details, see the first footnote below.) The probability we really care about is how likely she is to like you given that she smiled at you in the hallway, and this model allows us to easily calculate that as well. Click on the "she smiles at you in the hallway" node to set it to 100% (it should turn red): this corresponds to telling the model, "Okay, I know she smiled at me. How does that affect the probability she likes me?" We call this "observing evidence".
As you can see (seriously, please actually click on the circle; I suffered for this visualization) the "She Likes You" circle gets redder when we set the "She Smiles At You In Hallway" node to on -- it goes from 10% to 25%. (Also, the probability that her dog died goes down -- from 1% to 0.4% -- which makes sense: given that she's smiling at you, it's less likely that her dog died.) If we click on the "She Smiles at You" node again, which sets it to 0% -- in other words, we know she didn't smile at you -- then the probability that she likes you drops to 1.6%. Click on the circle a third time to set it back to its original state, where we haven't observed any evidence. (Again, for mathematical details, see the footnotes below.)
Basically, each circle can be in one of three states: we know it happened, we know it didn't happen, or we don't know whether it happened or not, and we're trying to compute the probability that it did given the things we do know. Try playing with the model a bit, and you'll find that it allows us to easily draw other intuitive conclusions. For example, if you set both "Her Dog Died" and "She Smiles at You" to "on", the probability that she likes you goes way up -- to 35.7% -- basically because if she's smiling at you even though her dog died, that's gotta be a good sign. On the other hand, if we set "She Smiles at You" to off, that's a bad sign, but if we then set "Her Dog Died" to on, it becomes a less bad sign: you'll see the probability that she likes you increase from 1.6% to 5.8%. The model is saying, "Don't worry, bro! She didn't ignore you because she doesn't like you -- it's because she's bummed about her dog." I find this reasoning eerily human -- having a probabilistic graphical model is almost like having a friend, which probably explains why I like them.
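If you'd rather poke at these numbers in code than in the visualization, here's a minimal brute-force sketch. All the probabilities come from the tables above except P(she smiles | dog died, doesn't like you) = 10%, which is my assumption: it's the value that makes the marginal come out to the 35.8% reported earlier.

```python
from itertools import product

# Model inputs: two priors and one conditional probability table.
# The (True, False) entry of 10% is an inferred assumption, not a
# value stated directly in the post.
P_DOG = 0.01   # P(her dog died)
P_LIKE = 0.10  # P(she likes you)
P_SMILE = {    # P(she smiles | (dog died, likes you))
    (True, True): 0.50,
    (True, False): 0.10,
    (False, True): 0.90,
    (False, False): 0.30,
}

def joint(dog, like, smile):
    """Probability of one fully specified world."""
    p = (P_DOG if dog else 1 - P_DOG) * (P_LIKE if like else 1 - P_LIKE)
    s = P_SMILE[dog, like]
    return p * (s if smile else 1 - s)

def prob(query, evidence=None):
    """P(query | evidence), by summing over all eight possible worlds."""
    evidence = evidence or {}
    num = den = 0.0
    for dog, like, smile in product([True, False], repeat=3):
        world = {"dog": dog, "like": like, "smile": smile}
        if any(world[k] != v for k, v in evidence.items()):
            continue  # this world contradicts what we observed
        p = joint(dog, like, smile)
        den += p
        if all(world[k] == v for k, v in query.items()):
            num += p
    return num / den

print(prob({"smile": True}))                                # ~0.358
print(prob({"like": True}, {"smile": True}))                # ~0.25
print(prob({"like": True}, {"smile": False}))               # ~0.016
print(prob({"like": True}, {"smile": False, "dog": True}))  # ~0.058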
I think this is enough to digest in one post, so please play with the models and let me know if anything's unclear. If you made it this far, congratulations! This topic is taught in Stanford's CS 228, which has a reputation for difficulty. But this math is so elegant and widely applicable -- in fields from biology to psychology to economics -- that it really ought to be presented to a wider audience. (I'm also planning to deliver this material as a lecture to a bunch of (precocious) high school students, so I would appreciate any feedback.) As a reward for making it this far, here's a more intricate model -- click on things! -- which I'll discuss in more detail in a later post.
All visualizations created by Emma Pierson.
 Using these tables, we can calculate the probability that, for example, she smiles at you in the hallway: it's...
Probability her dog DID die * Probability she DOES like you * Probability she smiles at you given that her dog DID die and she DOES like you +
Probability her dog DIDN'T die * Probability she DOES like you * Probability she smiles at you given that her dog DIDN'T die and she DOES like you +
Probability her dog DID die * Probability she DOESN'T like you * Probability she smiles at you given that her dog DID die and she DOESN'T like you +
Probability her dog DIDN'T die * Probability she DOESN'T like you * Probability she smiles at you given that her dog DIDN'T die and she DOESN'T like you
In other words, you sum over all four possible scenarios of whether she likes you and whether her dog died, figure out how likely each is, and figure out how likely she is to smile at you in each scenario using the tables. So, for example, there's only a 1/1000 chance that she likes you and her dog died (1% * 10%), but there's a 50% chance she'll smile at you in that scenario, so its total contribution to the sum is 1% * 10% * 50% = .05%. When we add up all four entries, we get 35.8%.
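That four-term sum can be checked directly. One caveat: the smile probability in the "dog died, doesn't like you" scenario isn't stated in the post; the 10% below is an assumption inferred from the fact that the total comes out to 35.8%.

```python
# The four scenarios from the sum above, one term per line.
# The 10% in the third term is an inferred assumption (see lead-in).
p = (0.01 * 0.10 * 0.50    # dog DID die,    she DOES like you
   + 0.99 * 0.10 * 0.90    # dog DIDN'T die, she DOES like you
   + 0.01 * 0.90 * 0.10    # dog DID die,    she DOESN'T like you
   + 0.99 * 0.90 * 0.30)   # dog DIDN'T die, she DOESN'T like you
print(round(p, 3))  # 0.358
```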
 Creating this visualization was quite an adventure:
 We can compute the probability that she likes you given that she smiled at you as follows. Let's abbreviate "she likes you" as <3 and "she smiled at you" as :D, so the probability she smiles at you is P(:D). Then the probability that she likes you given that she smiled at you is
P(:D AND <3) / P(:D)
(I only do equations with emoticons.) In other words, it's the probability that she likes you AND that she smiles at you divided by the probability that she merely smiles at you. If you think about it, this makes sense: for example, if there's a 60% chance she smiles at you, and a 20% chance she smiles at you AND she likes you, then if you know you're in the 60% where she smiles at you, there's a 20/60 = 1/3 chance she likes you. Now we need to know how to compute P(:D AND <3) and P(:D). The naive way to do it is just to compute all the probabilities of all possible worlds, and add up the ones we want...
| Did Her Dog Die? | Does She Like You? | Did She Smile At You? | Probability of this world |
|---|---|---|---|
| No (99%) | No (90%) | No (70%) | 99% * 90% * 70% |
| No (99%) | No (90%) | Yes (30%) | 99% * 90% * 30% |
| No (99%) | Yes (10%) | No (10%) | 99% * 10% * 10% |
| No (99%) | Yes (10%) | Yes (90%) | 99% * 10% * 90% |
| Yes (1%) | No (90%) | No (90%) | 1% * 90% * 90% |
| Yes (1%) | No (90%) | Yes (10%) | 1% * 90% * 10% |
| Yes (1%) | Yes (10%) | No (50%) | 1% * 10% * 50% |
| Yes (1%) | Yes (10%) | Yes (50%) | 1% * 10% * 50% |
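Here's that "enumerate the possible worlds and add up the ones we want" idea as a sketch, using the same probability tables as before (where P(she smiles | dog died, doesn't like you) = 10% is an assumption inferred from the post's totals):

```python
from itertools import product

# P(she smiles | (dog died, likes you)); the (True, False) entry
# of 10% is an inferred assumption, not stated in the post.
P_SMILE = {(True, True): 0.50, (True, False): 0.10,
           (False, True): 0.90, (False, False): 0.30}

p_smile = 0.0           # accumulates P(:D)
p_smile_and_like = 0.0  # accumulates P(:D AND <3)
for dog, like, smile in product([True, False], repeat=3):
    # Probability of this fully specified world.
    p = ((0.01 if dog else 0.99) * (0.10 if like else 0.90)
         * (P_SMILE[dog, like] if smile else 1 - P_SMILE[dog, like]))
    if smile:
        p_smile += p
        if like:
            p_smile_and_like += p

# P(<3 | :D) = P(:D AND <3) / P(:D)
print(round(p_smile_and_like / p_smile, 2))  # 0.25
```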
 It's worth stopping to think about why whether her dog died can affect the probability that she likes you given that she didn't smile at you, even though the dog and her feelings for you are clearly causally unrelated. The key is that both her dog and her feelings could explain her not smiling, and if we know the dog died, we worry less about her feelings. We call this phenomenon "explaining away", and we do it all the time: if I hear a noise in the kitchen at midnight, I get freaked out, but if I remember that my friend is staying over, I calm down; my friend has "explained away" the noise.
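Explaining away shows up cleanly in the arithmetic, too. Below is a small sketch comparing P(she likes you | no smile) with P(she likes you | no smile, dog died); as before, the 10% smile probability in the "dog died, doesn't like you" scenario is an assumption chosen to match the post's reported figures.

```python
# Explaining away in numbers: not smiling is less damning once we
# also learn her dog died. The 0.10 smile probability for
# (dog died, doesn't like you) is an inferred assumption.
p_dog, p_like = 0.01, 0.10

# P(likes you | didn't smile): weight each "didn't smile" world.
num = p_dog * p_like * (1 - 0.50) + (1 - p_dog) * p_like * (1 - 0.90)
den = (num
       + p_dog * (1 - p_like) * (1 - 0.10)
       + (1 - p_dog) * (1 - p_like) * (1 - 0.30))
print(round(num / den, 3))   # ~0.016

# P(likes you | didn't smile AND her dog died): only dog-died worlds.
num2 = p_dog * p_like * (1 - 0.50)
den2 = num2 + p_dog * (1 - p_like) * (1 - 0.10)
print(round(num2 / den2, 3))  # ~0.058
```

The dog's death "explains away" the missing smile, so the probability that she likes you climbs back up from about 1.6% to about 5.8%.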