1. Pr[F] = Pr[E]. The position doesn't matter: there are 52 possibilities for the 3rd card, and 13 of them are spades, so Pr[F] = 13/52 = 1/4.

Here's a more careful approach. The experiment is that we choose a random permutation of the 52 cards. There are 52! possible outcomes, and each outcome can be represented as a sequence (w_1,..,w_52) of cards, where w_i is the ith card from the top. All outcomes are equally likely, so we have a uniform distribution. The events E and F are sets of outcomes, namely:

    E = {(w_1,..,w_52) : w_1 is a spade}
    F = {(w_1,..,w_52) : w_3 is a spade}

Consider the function f:E->F given by

    f(w_1,..,w_52) = (w_2,w_3,w_1,w_4,w_5,w_6,..,w_52)

This maps every outcome in E to an outcome in F, because it moves the top card (a spade) into the 3rd position. It also has an inverse, which moves the 3rd card back to the top, so f is a bijection. That means that |E| = |F|, so Pr[F] = |F|/|Omega| = |E|/|Omega| = Pr[E] = 1/4.

Alternatively, we could count the number of outcomes (w_1,..,w_52) in F directly. There are 13 choices for w_3 that make the 3rd card a spade, then 51 choices for w_1 (everything but what we chose for w_3), 50 choices for w_2 (everything but what we chose for w_3 and w_1), 49 choices for w_4, 48 choices for w_5, etc., so in all we find that |F| = 13 * 51!. Hence Pr[F] = |F|/|Omega| = (13 * 51!)/52! = 13/52 = 1/4.

Comment: This illustrates a principle known as the "principle of deferred decisions". Often, it is useful to think of a probabilistic experiment as a sequence of random choices. One way to think of the experiment here is that first the top card is chosen randomly from all 52 cards, then the next card is chosen randomly from the remaining 51 cards, etc. The principle of deferred decisions says that we can instead choose these cards in any other order that is convenient to us, so another way to think of the experiment is that first we choose the 3rd card, then the top card, then the 2nd card, and so on. The latter way of thinking about it is more convenient for this problem, so we use that.
Both orders describe the same experiment, and the principle of deferred decisions says you can use whichever is most convenient. If you like, you can think of the principle of deferred decisions as a "lazy" version of the experiment, where we make a choice only when we need it for the problem being analyzed. (This is "lazy" in the computer science sense.)

2. E[f(X)] is larger:

    E[f(X)] = 1/2 * (0-2)^2 + 1/2 * (1-2)^2 = 1/2 * 4 + 1/2 * 1 = 2.5
    E[X] = 1/2 * 0 + 1/2 * 1 = 1/2
    f(E[X]) = f(1/2) = (1/2 - 2)^2 = 2.25
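
Sanity check: both answers can be verified with a short Python sketch. It is not part of the solution, only one way to double-check the arithmetic with exact fractions. The 8-card deck and the names prob_spade_at, small_deck and dist are hypothetical, chosen for illustration. The distribution of X (0 or 1, each with probability 1/2) and f(x) = (x-2)^2 are read off from the arithmetic in answer 2, not restated from the problem itself.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

# Problem 1: exact count from the solution, |F| = 13 * 51! out of 52! outcomes.
pr_F = Fraction(13 * factorial(51), factorial(52))

def prob_spade_at(position, deck):
    """Exact probability that the card at `position` (0-based) is a spade,
    found by enumerating every ordering of a small deck."""
    # Permute indices so that cards with the same label stay distinct.
    orderings = list(permutations(range(len(deck))))
    hits = sum(1 for order in orderings if deck[order[position]] == "S")
    return Fraction(hits, len(orderings))

# Hypothetical 8-card deck with 2 spades (same 1/4 ratio as 13 of 52).
# 8! = 40320 orderings is small enough to enumerate; 52! is not.
small_deck = ["S", "S"] + ["x"] * 6
pr_top = prob_spade_at(0, small_deck)
pr_third = prob_spade_at(2, small_deck)

# Problem 2: f and the distribution of X are assumptions inferred
# from the arithmetic in the answer.
def f(x):
    return (x - 2) ** 2

dist = {0: Fraction(1, 2), 1: Fraction(1, 2)}
E_X = sum(p * x for x, p in dist.items())
E_fX = sum(p * f(x) for x, p in dist.items())

print(pr_F, pr_top, pr_third)
print(E_fX, f(E_X))
```

The brute-force enumeration treats every position the same way, which is the point of answer 1: the top card and the 3rd card have the same chance of being a spade.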