Section Handout #7 Solutions: Probability

In Class

Question 1. Consider the following 3-sided dice with the given side values.  Assume the dice are all fair (each side has probability 1/3) and all rolls are independent.

    A: 2, 2, 5
    B: 1, 4, 4
    C: 3, 3, 3

  1. What is the expected value of each die?

     Answer: Each die has expected value 3: E[A] = (2+2+5)/3 = 3, E[B] = (1+4+4)/3 = 3, and E[C] = (3+3+3)/3 = 3.

  2. Consider the indicator function better(X,Y), which has value 1 if X > Y and value -1 if X < Y.  What are the expected values of better(A,B), better(B,C), and better(C,A)?  Why are these sometimes called non-transitive dice?

     Answer: A loses to B only when A rolls a 2 and B rolls a 4, so P(A > B) = 1 - (2/3)(2/3) = 5/9; B beats C exactly when B rolls a 4; and C beats A exactly when A rolls a 2. Therefore:

       E[better(A,B)] = (1)(5/9) + (-1)(4/9) = 1/9
       E[better(B,C)] = (1)(2/3) + (-1)(1/3) = 1/3
       E[better(C,A)] = (1)(2/3) + (-1)(1/3) = 1/3

     So A is better than B, B is better than C, and C is better than A: "better than" forms a cycle rather than a transitive ordering, which is why such dice are called non-transitive.  (The sketch below verifies these values by enumeration.)
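
Since all sides are equally likely and rolls are independent, each of the nine side pairings for two dice has probability 1/9, so these expectations can be checked by brute-force enumeration. A minimal Python sketch (the DICE table and the expected_better helper are our own names, not part of the handout):

    from itertools import product

    # Side values from the problem; each side of a die is equally likely.
    DICE = {"A": [2, 2, 5], "B": [1, 4, 4], "C": [3, 3, 3]}

    def expected_better(x, y):
        """E[better(X,Y)] by enumerating all equally likely side pairs."""
        # No ties occur for these pairings, but score them 0 to be safe.
        scores = [(1 if i > j else -1 if i < j else 0)
                  for i, j in product(DICE[x], DICE[y])]
        return sum(scores) / len(scores)

    for pair in [("A", "B"), ("B", "C"), ("C", "A")]:
        print(pair, expected_better(*pair))  # 1/9, 1/3, 1/3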

Question 2. On a day when an assignment is due (A=a), the newsgroup tends to be busy (B=b), and the computer lab tends to be full (C=c).  Consider the following conditional probability tables for the domain, where A = {a, ¬a}, B = {b, ¬b}, C = {c, ¬c}.

P(A):

    A     P
    a     0.20
    ¬a    0.80

P(B | A):

    B     A     P
    b     a     0.90
    ¬b    a     0.10
    b     ¬a    0.40
    ¬b    ¬a    0.60

P(C | A):

    C     A     P
    c     a     0.70
    ¬c    a     0.30
    c     ¬a    0.50
    ¬c    ¬a    0.50
  1. Construct the joint distribution out of these conditional probability tables, assuming B and C are independent given A.

     Answer: We use the chain rule, P(A,B,C) = P(A) P(B,C | A), and then use the conditional independence of B and C given A to derive P(A,B,C) = P(A) P(B|A) P(C|A).

     P(B,C | A):

       A     B     C     P
       a     b     c     0.63
       a     b     ¬c    0.27
       a     ¬b    c     0.07
       a     ¬b    ¬c    0.03
       ¬a    b     c     0.20
       ¬a    b     ¬c    0.20
       ¬a    ¬b    c     0.30
       ¬a    ¬b    ¬c    0.30

     P(A,B,C):

       A     B     C     P
       a     b     c     0.126
       a     b     ¬c    0.054
       a     ¬b    c     0.014
       a     ¬b    ¬c    0.006
       ¬a    b     c     0.16
       ¬a    b     ¬c    0.16
       ¬a    ¬b    c     0.24
       ¬a    ¬b    ¬c    0.24
  2. What is the marginal distribution P(B,C)?  Are these two variables absolutely independent in this model?  Justify your answer using the actual probabilities, not your intuitions.

     Answer: We marginalize out A using the P(A,B,C) table from part 1.

     P(B,C):

       B     C     P
       b     c     0.286
       b     ¬c    0.214
       ¬b    c     0.254
       ¬b    ¬c    0.246

     Marginalizing out C gives P(B=b) = 0.5, and marginalizing out B gives P(C=c) = 0.54. We have P(B=b, C=c) = 0.286, which is not equal to P(B=b) P(C=c) = 0.27, so B and C are not absolutely independent; they are only conditionally independent given A.

  3. What is the posterior distribution over A given that B=b, P(A | B=b)?  What is the posterior distribution over A given that C=c, P(A | C=c)?  What about P(A | B=b, C=c)?  Explain the pattern among these posteriors and why it holds.

     Answer: We use P(A | B=b) = P(A, B=b) / P(B=b), where P(A,B) comes from marginalizing out C in the full joint P(A,B,C) from part 1; P(A | C=c) is computed the same way from P(A,C).

     P(A,B):

       A     B     P
       a     b     0.18
       a     ¬b    0.02
       ¬a    b     0.32
       ¬a    ¬b    0.48

     P(A | B=b):

       A     P
       a     0.36
       ¬a    0.64

     P(A,C):

       A     C     P
       a     c     0.14
       a     ¬c    0.06
       ¬a    c     0.40
       ¬a    ¬c    0.40

     P(A | C=c):

       A     P
       a     0.26
       ¬a    0.74

     For both observations together, P(a | b, c) = P(a, b, c) / P(b, c) = 0.126 / 0.286 ≈ 0.44.

     P(A | B=b, C=c):

       A     P
       a     0.44
       ¬a    0.56

     The pattern: the prior is P(A=a) = 0.20; observing the busy newsgroup raises it to 0.36, observing the full lab raises it to 0.26, and observing both raises it to 0.44, more than either observation alone. Each observation is more likely on a day an assignment is due, so each one individually increases our belief in A=a; and since B and C are conditionally independent given A, each supplies a separate piece of evidence, so together they push the posterior higher than either does by itself.  (These computations are reproduced in the sketch below.)
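
All of the computations in this question follow one mechanical recipe: build the joint with the chain rule, sum out the unobserved variables, and normalize. A minimal Python sketch of that recipe, assuming a dictionary-of-tuples layout of our own choosing (the helper posterior_A is not part of the handout):

    from itertools import product

    # CPTs from the problem; True stands for a/b/c, False for ¬a/¬b/¬c.
    P_A = {True: 0.20, False: 0.80}
    P_B_given_A = {(True, True): 0.90, (False, True): 0.10,    # key: (b, a)
                   (True, False): 0.40, (False, False): 0.60}
    P_C_given_A = {(True, True): 0.70, (False, True): 0.30,    # key: (c, a)
                   (True, False): 0.50, (False, False): 0.50}

    # Chain rule plus B and C independent given A: P(a,b,c) = P(a) P(b|a) P(c|a).
    joint = {(a, b, c): P_A[a] * P_B_given_A[(b, a)] * P_C_given_A[(c, a)]
             for a, b, c in product([True, False], repeat=3)}

    def posterior_A(**evidence):
        """P(A | evidence): sum the consistent joint entries, then normalize."""
        weights = {a: sum(p for (ja, jb, jc), p in joint.items()
                          if ja == a
                          and evidence.get("b", jb) == jb
                          and evidence.get("c", jc) == jc)
                   for a in (True, False)}
        z = sum(weights.values())
        return {a: w / z for a, w in weights.items()}

    print(posterior_A(b=True))           # P(a | b) = 0.36
    print(posterior_A(c=True))           # P(a | c) ≈ 0.26
    print(posterior_A(b=True, c=True))   # P(a | b, c) ≈ 0.44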

Homework

Question 1. Assume that a joint distribution over two variables, X = {x, ¬x} and Y = {y, ¬y}, is known to have uniform marginals, P(x) = P(¬x) = P(y) = P(¬y) = 0.5. Give joint distributions satisfying these marginals for each of these conditions:

  1. X and Y are independent.

     Answer:

       X     Y     P
       x     y     0.25
       x     ¬y    0.25
       ¬x    y     0.25
       ¬x    ¬y    0.25

  2. Observing Y=y increases the belief in X=x, i.e. P(x | y) > P(x).

     Answer:

       X     Y     P
       x     y     0.50
       x     ¬y    0.00
       ¬x    y     0.00
       ¬x    ¬y    0.50

     Here P(x | y) = 0.50 / 0.50 = 1 > 0.5 = P(x).

  3. Observing Y=y decreases the belief in X=x, i.e. P(x | y) < P(x).

     Answer:

       X     Y     P
       x     y     0.00
       x     ¬y    0.50
       ¬x    y     0.50
       ¬x    ¬y    0.00

     Here P(x | y) = 0 / 0.50 = 0 < 0.5 = P(x).  (All three joints are checked in the sketch below.)
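
As a quick sanity check, the marginals and conditionals of all three joints can be computed directly. A small Python sketch (the encoding with "~x"/"~y" strings is our own convention):

    # Joint tables from the three answers above, keyed by (X value, Y value).
    joints = {
        "independent": {("x", "y"): 0.25, ("x", "~y"): 0.25,
                        ("~x", "y"): 0.25, ("~x", "~y"): 0.25},
        "increases":   {("x", "y"): 0.50, ("x", "~y"): 0.00,
                        ("~x", "y"): 0.00, ("~x", "~y"): 0.50},
        "decreases":   {("x", "y"): 0.00, ("x", "~y"): 0.50,
                        ("~x", "y"): 0.50, ("~x", "~y"): 0.00},
    }

    for name, j in joints.items():
        p_x = j[("x", "y")] + j[("x", "~y")]   # marginal P(x)
        p_y = j[("x", "y")] + j[("~x", "y")]   # marginal P(y)
        p_x_given_y = j[("x", "y")] / p_y      # conditional P(x | y)
        print(f"{name}: P(x) = {p_x}, P(x | y) = {p_x_given_y}")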

Question 2. Sometimes, there is traffic (cars) on the freeway (C=c).  This could be either because of a ball game (B=b) or because of an accident (A=a).  Consider the following joint probability table for the domain, where A = {a, ¬a}, B = {b, ¬b}, C = {c, ¬c}.

P(A, B, C):

    A     B     C     P
    a     b     c     0.018
    a     b     ¬c    0.002
    a     ¬b    c     0.126
    a     ¬b    ¬c    0.054
    ¬a    b     c     0.064
    ¬a    b     ¬c    0.016
    ¬a    ¬b    c     0.072
    ¬a    ¬b    ¬c    0.648
  1. What is the distribution P(A,B)?  Are A and B independent in this model given no evidence?  Justify your answer using the actual probabilities, not your intuitions.

     Answer: We marginalize out C from the joint.

       A     B     P
       a     b     0.02
       a     ¬b    0.18
       ¬a    b     0.08
       ¬a    ¬b    0.72

     We have P(A=a) = 0.2 and P(B=b) = 0.1, and P(A,B) = P(A) P(B) for every setting of A and B (e.g. P(a, b) = 0.02 = 0.2 × 0.1), so A and B are independent given no evidence.


  2. What is the marginal distribution over A given no evidence?

     Answer: Summing out B and C, P(A=a) = 0.2 and P(A=¬a) = 0.8.

  3. How does this change if we observe that C=c; what is the posterior distribution P(A | C=c)?  Does this change intuitively make sense?  Why or why not?

     Answer: We marginalize out B to get P(A,C), then condition on C=c.

     P(A,C):

       A     C     P
       a     c     0.144
       a     ¬c    0.056
       ¬a    c     0.136
       ¬a    ¬c    0.664

     P(A | C=c):

       A     P
       a     0.514
       ¬a    0.486

     Yes, this makes sense: accidents make traffic more likely, so observing traffic raises our belief in an accident from the prior 0.20 to 0.514.


  4. What is the conditional distribution over A if we then learn there is a ball game, P(A | B=b, C=c)?  Does it make sense that observing B should cause this update to A (called explaining-away)?  Why or why not?

     Answer: From the joint, the unnormalized distribution over A given B=b, C=c is P(a, b, c) = 0.018 and P(¬a, b, c) = 0.064; normalizing by their sum, 0.082, gives

     P(A | B=b, C=c):

       A     P
       a     0.22
       ¬a    0.78

     Yes: the ball game explains the traffic, so an accident is no longer needed to account for it, and the probability of an accident drops from 0.514 back down to 0.22. This is the explaining-away effect.  (The sketch below reproduces these updates.)
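
These updates can be reproduced mechanically from the joint table. A minimal Python sketch, assuming a dictionary-of-tuples encoding of our own choosing (the helper posterior_A is not part of the handout):

    # Joint P(A, B, C) from the problem; True stands for a/b/c, False for ¬a/¬b/¬c.
    joint = {
        (True, True, True):   0.018, (True, True, False):   0.002,
        (True, False, True):  0.126, (True, False, False):  0.054,
        (False, True, True):  0.064, (False, True, False):  0.016,
        (False, False, True): 0.072, (False, False, False): 0.648,
    }

    def posterior_A(b=None, c=None):
        """P(A | evidence): sum the consistent joint entries, then normalize."""
        weights = {a: sum(p for (ja, jb, jc), p in joint.items()
                          if ja == a
                          and (b is None or jb == b)
                          and (c is None or jc == c))
                   for a in (True, False)}
        z = sum(weights.values())
        return {a: w / z for a, w in weights.items()}

    print(posterior_A())                  # prior: P(a) = 0.2
    print(posterior_A(c=True))            # traffic observed: P(a | c) ≈ 0.514
    print(posterior_A(b=True, c=True))    # ball game too: P(a | b, c) ≈ 0.22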