Section Handout #7 Solutions: Probability


In Class

Question 1. Consider the following 3-sided dice with the given side values.  Assume the dice are all fair (each side has probability 1/3) and all rolls are independent.

A: 2, 2, 5
B: 1, 4, 4
C: 3, 3, 3

a. What is the expected value of each die?

Answer: Each die has expected value 3: E[A] = (2+2+5)/3 = 3, E[B] = (1+4+4)/3 = 3, E[C] = (3+3+3)/3 = 3.

b. Consider the indicator function better(X,Y), which has value 1 if X > Y and value -1 if X < Y.  What are the expected values of better(A,B), better(B,C), and better(C,A)?  Why are these sometimes called non-transitive dice?

Answer:
E[better(A,B)] = (1)(5/9) + (-1)(4/9) = 1/9
E[better(B,C)] = (1)(2/3) + (-1)(1/3) = 1/3
E[better(C,A)] = (1)(2/3) + (-1)(1/3) = 1/3

So, in expectation, A is better than B, B is better than C, and C is better than A.  The "better than" relation cycles rather than chaining, which is why these are called non-transitive dice.
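These expectations can be checked by brute force: since each die is fair, all 9 ordered pairs of faces are equally likely, so we can just enumerate them.  A minimal sketch (the function name expected_better is ours, not from the handout):

```python
from itertools import product

def expected_better(X, Y):
    """Exact E[better(X, Y)] for two fair dice given as lists of faces,
    enumerating all equally likely (x, y) outcome pairs."""
    total = 0
    for x, y in product(X, Y):
        total += 1 if x > y else -1  # no ties occur for these three dice
    return total / (len(X) * len(Y))

A, B, C = [2, 2, 5], [1, 4, 4], [3, 3, 3]
print(expected_better(A, B))  # 1/9
print(expected_better(B, C))  # 1/3
print(expected_better(C, A))  # 1/3
```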

Question 2. On a day when an assignment is due (A=a), the newsgroup tends to be busy (B=b), and the computer lab tends to be full (C=c).  Consider the following conditional probability tables for the domain, where A = {a, ¬a}, B = {b, ¬b}, C = {c, ¬c}.

P(A):
 A    P
 a    0.20
 ¬a   0.80

P(B|A):
 B    A    P
 b    a    0.90
 ¬b   a    0.10
 b    ¬a   0.40
 ¬b   ¬a   0.60

P(C|A):
 C    A    P
 c    a    0.70
 ¬c   a    0.30
 c    ¬a   0.50
 ¬c   ¬a   0.50
a. Construct the joint distribution from these conditional probability tables, assuming B and C are independent given A.

Answer: We use the chain rule, P(A,B,C) = P(A) P(B,C|A), and then use the conditional independence of B and C given A to get P(A,B,C) = P(A) P(B|A) P(C|A).

P(B,C|A):
 A    B    C    P
 a    b    c    0.63
 a    b    ¬c   0.27
 a    ¬b   c    0.07
 a    ¬b   ¬c   0.03
 ¬a   b    c    0.20
 ¬a   b    ¬c   0.20
 ¬a   ¬b   c    0.30
 ¬a   ¬b   ¬c   0.30

P(A,B,C):
 A    B    C    P
 a    b    c    0.126
 a    b    ¬c   0.054
 a    ¬b   c    0.014
 a    ¬b   ¬c   0.006
 ¬a   b    c    0.16
 ¬a   b    ¬c   0.16
 ¬a   ¬b   c    0.24
 ¬a   ¬b   ¬c   0.24
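The joint construction above is mechanical, so it is easy to verify in a few lines of Python.  A sketch, using the CPT values from the tables (variable names are ours):

```python
# Build the full joint P(A,B,C) = P(A) * P(B|A) * P(C|A), using the
# conditional independence of B and C given A.
P_A = {'a': 0.20, '¬a': 0.80}
P_B_given_A = {('b', 'a'): 0.90, ('¬b', 'a'): 0.10,
               ('b', '¬a'): 0.40, ('¬b', '¬a'): 0.60}
P_C_given_A = {('c', 'a'): 0.70, ('¬c', 'a'): 0.30,
               ('c', '¬a'): 0.50, ('¬c', '¬a'): 0.50}

joint = {}
for A in ('a', '¬a'):
    for B in ('b', '¬b'):
        for C in ('c', '¬c'):
            joint[(A, B, C)] = P_A[A] * P_B_given_A[(B, A)] * P_C_given_A[(C, A)]

print(round(joint[('a', 'b', 'c')], 3))  # 0.2 * 0.9 * 0.7 ≈ 0.126
```

A quick sanity check is that the eight joint entries sum to 1.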
b. What is the marginal distribution P(B,C)?  Are these two variables absolutely independent in this model?  Justify your answer using the actual probabilities, not your intuitions.

Answer: We marginalize out A by summing the P(A,B,C) table from part a) over both values of A.

P(B,C):
 B    C    P
 b    c    0.286
 b    ¬c   0.214
 ¬b   c    0.254
 ¬b   ¬c   0.246

Marginalizing out C gives P(B=b) = 0.5, and marginalizing out B gives P(C=c) = 0.54.  Since P(B=b, C=c) = 0.286, which is not equal to P(B=b) P(C=c) = 0.27, B and C are not absolutely independent, even though they are conditionally independent given A.
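The marginalization and the independence check can be sketched directly from the joint table in part a) (dictionary names are ours):

```python
# Marginalize A out of P(A,B,C), then compare P(b,c) with P(b) * P(c).
joint = {('a', 'b', 'c'): 0.126, ('a', 'b', '¬c'): 0.054,
         ('a', '¬b', 'c'): 0.014, ('a', '¬b', '¬c'): 0.006,
         ('¬a', 'b', 'c'): 0.16, ('¬a', 'b', '¬c'): 0.16,
         ('¬a', '¬b', 'c'): 0.24, ('¬a', '¬b', '¬c'): 0.24}

P_BC = {}
for (A, B, C), p in joint.items():
    P_BC[(B, C)] = P_BC.get((B, C), 0.0) + p  # sum over values of A

P_b = P_BC[('b', 'c')] + P_BC[('b', '¬c')]   # marginalize out C
P_c = P_BC[('b', 'c')] + P_BC[('¬b', 'c')]   # marginalize out B
print(round(P_BC[('b', 'c')], 3), round(P_b * P_c, 3))  # 0.286 vs 0.27
```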

c. What is the posterior distribution over A given that B=b, P(A | B=b)?  What is the posterior distribution over A given that C=c, P(A | C=c)?  What about P(A | B=b, C=c)?  Explain the pattern among these posteriors and why it holds.

Answer: We use P(A | B=b) = P(A, B=b) / P(B=b), where P(A,B) is obtained by marginalizing out C from the full joint P(A,B,C) from part a).  P(A | C=c) and P(A | B=b, C=c) are computed the same way.

P(A,B):
 A    B    P
 a    b    0.18
 a    ¬b   0.02
 ¬a   b    0.32
 ¬a   ¬b   0.48

P(A | B=b):
 A    P
 a    0.36
 ¬a   0.64

P(A,C):
 A    C    P
 a    c    0.14
 a    ¬c   0.06
 ¬a   c    0.40
 ¬a   ¬c   0.40

P(A | C=c):
 A    P
 a    0.26
 ¬a   0.74

P(A | B=b, C=c):
 A    P
 a    0.44
 ¬a   0.56

The pattern: the prior is P(a) = 0.20; observing B=b raises it to 0.36, observing C=c raises it to 0.26, and observing both raises it to 0.44.  Each observation is evidence for an assignment being due, since both b and c are more likely when A=a.  So each observation alone increases the posterior on a, and because B and C are conditionally independent given A, the two observations contribute separate (non-redundant) evidence, pushing the posterior higher than either observation does by itself.
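All three posteriors can be computed the same way: sum the joint entries consistent with the evidence, then normalize.  A sketch against the joint from part a) (the helper posterior_a is ours):

```python
# Posterior P(A=a | evidence) by summing consistent joint entries and
# normalizing.  None means "not observed".
joint = {('a', 'b', 'c'): 0.126, ('a', 'b', '¬c'): 0.054,
         ('a', '¬b', 'c'): 0.014, ('a', '¬b', '¬c'): 0.006,
         ('¬a', 'b', 'c'): 0.16, ('¬a', 'b', '¬c'): 0.16,
         ('¬a', '¬b', 'c'): 0.24, ('¬a', '¬b', '¬c'): 0.24}

def posterior_a(B=None, C=None):
    unnorm = {'a': 0.0, '¬a': 0.0}
    for (a, b, c), p in joint.items():
        if (B is None or b == B) and (C is None or c == C):
            unnorm[a] += p
    return unnorm['a'] / (unnorm['a'] + unnorm['¬a'])

print(round(posterior_a(B='b'), 2))          # 0.36
print(round(posterior_a(C='c'), 2))          # 0.26
print(round(posterior_a(B='b', C='c'), 2))   # 0.44
```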

Homework

Question 1. Assume that a joint distribution over two variables, X = {x, ¬x} and Y = {y, ¬y}, is known to have the uniform marginal distributions P(x) = P(¬x) = P(y) = P(¬y) = 0.5.  Give joint distributions satisfying these marginals for each of these conditions:

a. X and Y are independent

 X    Y    P
 x    y    0.25
 x    ¬y   0.25
 ¬x   y    0.25
 ¬x   ¬y   0.25

b. Observing Y=y increases the belief in X=x, i.e. P(x | y) > P(x)

 X    Y    P
 x    y    0.50
 x    ¬y   0.00
 ¬x   y    0.00
 ¬x   ¬y   0.50

Here P(x | y) = 0.50 / 0.50 = 1 > 0.5 = P(x).

c. Observing Y=y decreases the belief in X=x, i.e. P(x | y) < P(x)

 X    Y    P
 x    y    0.00
 x    ¬y   0.50
 ¬x   y    0.50
 ¬x   ¬y   0.00

Here P(x | y) = 0 < 0.5 = P(x).
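A quick way to check candidate joints like these is to compute the marginals and P(x | y) mechanically.  A sketch over the three tables above (the dictionary layout is ours):

```python
# For each candidate joint, verify the uniform marginals and compare
# P(x | y) against P(x) = 0.5.
joints = {
    'independent': {('x', 'y'): 0.25, ('x', '¬y'): 0.25,
                    ('¬x', 'y'): 0.25, ('¬x', '¬y'): 0.25},
    'increases':   {('x', 'y'): 0.50, ('x', '¬y'): 0.00,
                    ('¬x', 'y'): 0.00, ('¬x', '¬y'): 0.50},
    'decreases':   {('x', 'y'): 0.00, ('x', '¬y'): 0.50,
                    ('¬x', 'y'): 0.50, ('¬x', '¬y'): 0.00},
}

for name, J in joints.items():
    P_x = J[('x', 'y')] + J[('x', '¬y')]          # marginal over Y
    P_y = J[('x', 'y')] + J[('¬x', 'y')]          # marginal over X
    P_x_given_y = J[('x', 'y')] / P_y             # conditioning on Y=y
    print(name, P_x, P_x_given_y)
```

The printed conditionals are 0.5, 1.0, and 0.0, matching the three required relationships to P(x) = 0.5.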

Question 2. Sometimes, there is traffic (cars) on the freeway (C=c).  This could either be because of a ball game (B=b) or because of an accident (A=a).  Consider the following joint probability table for the domain, where A = {a, ¬a}, B = {b, ¬b}, C = {c, ¬c}.

P(A, B, C):
 A    B    C    P
 a    b    c    0.018
 a    b    ¬c   0.002
 a    ¬b   c    0.126
 a    ¬b   ¬c   0.054
 ¬a   b    c    0.064
 ¬a   b    ¬c   0.016
 ¬a   ¬b   c    0.072
 ¬a   ¬b   ¬c   0.648

a. What is the distribution P(A,B)?  Are A and B independent in this model given no evidence?  Justify your answer using the actual probabilities, not your intuitions.

Answer: We marginalize out C from the joint.

P(A,B):
 A    B    P
 a    b    0.02
 a    ¬b   0.18
 ¬a   b    0.08
 ¬a   ¬b   0.72

We have that P(A=a) = 0.2 and P(B=b) = 0.1, and P(A,B) = P(A) P(B) for every setting of A and B (for example, P(a,b) = 0.02 = 0.2 × 0.1), so A and B are absolutely independent in this model.

b. What is the marginal distribution over A given no evidence?

Answer: Summing the joint over B and C, P(A=a) = 0.2 and P(A=¬a) = 0.8.

c. How does this change if we observe that C=c; what is the posterior distribution P(A | C=c)?  Does this change intuitively make sense?  Why or why not?

Answer: Summing the joint over B, P(A=a, C=c) = 0.018 + 0.126 = 0.144 and P(A=¬a, C=c) = 0.064 + 0.072 = 0.136, so P(C=c) = 0.28.  Normalizing gives P(a | C=c) = 0.144 / 0.28 ≈ 0.51 and P(¬a | C=c) ≈ 0.49.  This makes intuitive sense: traffic is evidence for its causes, so observing C=c raises the probability of an accident from the prior 0.2 to about 0.51.
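The posterior over A after observing traffic can be computed directly from the joint table above: keep the entries consistent with C=c and normalize.  A minimal sketch (variable names are ours):

```python
# Posterior P(A | C=c) from the homework joint: sum entries consistent
# with C=c, then normalize.
joint = {('a', 'b', 'c'): 0.018, ('a', 'b', '¬c'): 0.002,
         ('a', '¬b', 'c'): 0.126, ('a', '¬b', '¬c'): 0.054,
         ('¬a', 'b', 'c'): 0.064, ('¬a', 'b', '¬c'): 0.016,
         ('¬a', '¬b', 'c'): 0.072, ('¬a', '¬b', '¬c'): 0.648}

unnorm = {'a': 0.0, '¬a': 0.0}
for (a, b, c), p in joint.items():
    if c == 'c':          # keep only entries consistent with the evidence
        unnorm[a] += p

Z = unnorm['a'] + unnorm['¬a']          # P(C=c) = 0.28
posterior = {a: p / Z for a, p in unnorm.items()}
print(round(posterior['a'], 3))  # ≈ 0.514, up from the prior P(a) = 0.2
```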