## Project 3.1: Probability

DUE DATE EXTENDED TO 10/25
Due 10/25 at 11:59pm: Submit your assignment in a .pdf or a .txt file. You may submit one assignment per group, as always, though we recommend everyone work through this assignment on their own before meeting with a partner.

Question 1 (2 points). Consider the following 3-sided dice with the given side values.  Assume the dice are all fair and all rolls are independent.

A: 2, 2, 5
B: 1, 4, 4
C: 3, 3, 3

1. What is the expected value of each die?
2. Consider the function better(X, Y), which has value 1 if X > Y and value -1 if X < Y (ties cannot occur between any two of these dice). What are the expected values of better(A, B), better(B, C), and better(C, A)? Why are these sometimes called non-transitive dice?
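
As a way to check hand calculations, the two quantities above can be computed by enumerating all equally likely outcomes. The following is a minimal sketch, not a reference solution: the function names `expected_value` and `expected_better` and the dice `P` and `Q` are hypothetical and deliberately differ from the dice in the question. Scoring a tie as 0 is also an assumption, since the question does not define better(X, Y) for X = Y.

```python
from fractions import Fraction
from itertools import product


def expected_value(die):
    """Mean face value of a fair die given as a list of faces."""
    return Fraction(sum(die), len(die))


def expected_better(x, y):
    """E[better(X, Y)] for two independent fair dice.

    Scores +1 when X > Y and -1 when X < Y.
    Assumption: a tie scores 0 (the question leaves this case undefined).
    """
    total = 0
    for a, b in product(x, y):  # every pair of faces is equally likely
        total += (a > b) - (a < b)
    return Fraction(total, len(x) * len(y))


# Hypothetical dice, NOT the ones from the question.
P = [1, 1, 6]
Q = [2, 2, 2]

print(expected_value(P), expected_value(Q))
print(expected_better(P, Q))
```

Using `Fraction` keeps the results exact, which makes it easier to compare them against fractions worked out by hand.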

Question 2 (2 points). Assume that a joint distribution over two variables, X = {x, ¬x} and Y = {y, ¬y}, is known to have the marginal distributions P(x) = P(¬x) = P(y) = P(¬y) = 0.5. Give a joint distribution satisfying these marginals for each of these conditions:

1. X and Y are independent
2. Observing Y=y increases the belief in X=x, i.e. P(x | y) > P(x)
3. Observing Y=y decreases the belief in X=x, i.e. P(x | y) < P(x)
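
A candidate joint distribution can be checked mechanically against conditions like the ones above. The sketch below is one possible way to do that, under stated assumptions: the dictionary layout, the helper names (`p_x`, `p_y`, `p_x_given_y`, `is_independent`), and the example numbers are all hypothetical, and the example joint intentionally does not have the uniform marginals the question requires.

```python
# Hypothetical joint P(X, Y). True stands for x (or y), False for the negation.
# These numbers are illustrative only and do NOT satisfy the question's marginals.
joint = {
    (True, True): 0.3,
    (True, False): 0.1,
    (False, True): 0.2,
    (False, False): 0.4,
}


def p_x(joint, x=True):
    """Marginal P(X = x), summing out Y."""
    return sum(p for (xv, _), p in joint.items() if xv == x)


def p_y(joint, y=True):
    """Marginal P(Y = y), summing out X."""
    return sum(p for (_, yv), p in joint.items() if yv == y)


def p_x_given_y(joint, x=True, y=True):
    """Conditional P(X = x | Y = y) = P(x, y) / P(y)."""
    return joint[(x, y)] / p_y(joint, y)


def is_independent(joint, tol=1e-9):
    """True if P(x, y) = P(x) P(y) holds for every entry, up to tol."""
    return all(
        abs(p - p_x(joint, xv) * p_y(joint, yv)) <= tol
        for (xv, yv), p in joint.items()
    )


print(p_x(joint), p_y(joint), p_x_given_y(joint), is_independent(joint))
```

Comparing `p_x_given_y(joint)` with `p_x(joint)` shows directly whether observing y raises or lowers the belief in x for a given table.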

Question 3 (2 points). On a day when an assignment is due (A=a), the newsgroup tends to be busy (B=b), and the computer lab tends to be full (C=c).  Consider the following conditional probability tables for the domain, where A = {a, ¬a}, B = {b, ¬b}, C = {c, ¬c}.

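P(A)

| A | P |
|---|---|
| a | 0.20 |
| ¬a | 0.80 |

P(B|A)

| B | A | P |
|---|---|---|
| b | a | 0.90 |
| ¬b | a | 0.10 |
| b | ¬a | 0.40 |
| ¬b | ¬a | 0.60 |

P(C|A)

| C | A | P |
|---|---|---|
| c | a | 0.70 |
| ¬c | a | 0.30 |
| c | ¬a | 0.50 |
| ¬c | ¬a | 0.50 |

1. Construct the joint distribution out of these conditional probability tables assuming B and C are independent given A.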
2. What is the marginal distribution P(B,C)?  Are these two variables absolutely independent in this model?  Justify your answer using the actual probabilities, not your intuitions.
3. What is the posterior distribution over A given that B=b, P(A | B=b)?  What is the posterior distribution over A given that C=c, P(A | C=c)?  What about P(A | B=b, C=c)?  Explain the pattern among these posteriors and why it holds.
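
When B and C are conditionally independent given A, each joint entry factors as P(a, b, c) = P(a) P(b | a) P(c | a). The sketch below shows that construction in code. It is an illustration under assumptions, not the expected answer: the table layout, the name `build_joint`, and all numbers are hypothetical and differ from the tables in the question.

```python
from itertools import product

# Hypothetical CPTs, NOT the ones from the question.
# True stands for a / b / c, False for the negation.
p_a = {True: 0.5, False: 0.5}
# Indexed as p_b_given_a[a][b] and p_c_given_a[a][c].
p_b_given_a = {True: {True: 0.8, False: 0.2}, False: {True: 0.3, False: 0.7}}
p_c_given_a = {True: {True: 0.6, False: 0.4}, False: {True: 0.1, False: 0.9}}


def build_joint(p_a, p_b_given_a, p_c_given_a):
    """Joint P(A, B, C), assuming B and C are independent given A."""
    joint = {}
    for a, b, c in product([True, False], repeat=3):
        joint[(a, b, c)] = p_a[a] * p_b_given_a[a][b] * p_c_given_a[a][c]
    return joint


joint = build_joint(p_a, p_b_given_a, p_c_given_a)

# Sanity check: a valid joint distribution sums to 1.
print(sum(joint.values()))
for assignment, p in joint.items():
    print(assignment, round(p, 4))
```

Marginals such as P(B, C) and posteriors such as P(A | B=b) then follow by summing the relevant joint entries and renormalizing.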

Question 4 (2 points). Sometimes, there is traffic (cars) on the freeway (C=c).  This could either be because of a ball game (B=b) or because of an accident (A=a).  Consider the following joint probability table for the domain, where A = {a, ¬a}, B = {b, ¬b}, C = {c, ¬c}.

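P(A, B, C)

| A | B | C | P |
|---|---|---|---|
| a | b | c | 0.018 |
| a | b | ¬c | 0.002 |
| a | ¬b | c | 0.126 |
| a | ¬b | ¬c | 0.054 |
| ¬a | b | c | 0.064 |
| ¬a | b | ¬c | 0.016 |
| ¬a | ¬b | c | 0.072 |
| ¬a | ¬b | ¬c | 0.648 |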

1. What is the distribution P(A,B)?  Are A and B independent in this model given no evidence?  Justify your answer using the actual probabilities, not your intuitions.
2. What is the marginal distribution over A given no evidence?
3. How does this change if we observe that C=c; what is the posterior distribution P(A | C=c)?  Does this change intuitively make sense?  Why or why not?
4. What is the conditional distribution over A if we then learn there is a ball game, P(A | B=b, C=c)?  Does it make sense that observing B should cause this update to A (called explaining-away)?  Why or why not?
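
Every part of this question reduces to two operations on a full joint table: summing out variables (marginalization) and restricting to entries consistent with the evidence, then renormalizing (conditioning). The following is a rough sketch of both. The helper names `marginal` and `condition`, the index-based variable addressing, and the example joint are all hypothetical, and the numbers are intentionally not those in the table above.

```python
from collections import defaultdict

# Hypothetical joint over (A, B, C), NOT the table from the question.
# Tuple positions: 0 = A, 1 = B, 2 = C; True stands for a / b / c.
joint = {
    (True, True, True): 0.10,
    (True, True, False): 0.05,
    (True, False, True): 0.15,
    (True, False, False): 0.10,
    (False, True, True): 0.20,
    (False, True, False): 0.05,
    (False, False, True): 0.05,
    (False, False, False): 0.30,
}


def marginal(joint, keep):
    """Sum out every variable whose position is not listed in keep."""
    out = defaultdict(float)
    for assignment, p in joint.items():
        out[tuple(assignment[i] for i in keep)] += p
    return dict(out)


def condition(joint, evidence):
    """Keep entries matching evidence ({position: value}) and renormalize."""
    consistent = {
        k: p for k, p in joint.items()
        if all(k[i] == v for i, v in evidence.items())
    }
    z = sum(consistent.values())  # probability of the evidence
    return {k: p / z for k, p in consistent.items()}


prior_a = marginal(joint, [0])                                    # P(A)
post_a_c = marginal(condition(joint, {2: True}), [0])             # P(A | C=c)
post_a_bc = marginal(condition(joint, {1: True, 2: True}), [0])   # P(A | B=b, C=c)
print(prior_a, post_a_c, post_a_bc)
```

Printing the prior and the two posteriors side by side makes it easy to see how each new observation shifts the belief in A.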

Question 5 (2 points). Often we need to carry out reasoning over some pair of variables X, Y conditioned on the value of another variable E.

1. Using the definitions of conditional probabilities, prove the conditionalized version of the product rule: P(x, y | e) = P(x | y, e) P(y | e)
2. Prove the conditionalized version of Bayes' rule: P(y | x, e) = P(x | y, e) P(y | e) / P(x | e)

Question 6 (2 points). Suppose we wish to calculate P(C=c | A=a, B=b).

1. If we have no conditional independence information, which of the following sets of tables are sufficient to calculate P(C=c | A=a, B=b)?
   1. P(A, B), P(C), P(A | C), P(B | C)
   2. P(A, B), P(C), P(A, B | C)
   3. P(A, B, C)
   4. P(C), P(A | C), P(B | C)
   5. P(C | A, B), P(A)
2. Which are sufficient if we know that A and B are conditionally independent given C?