I’m struggling to teach probability in AP Statistics for a lot of reasons, which I may enumerate and reflect on in a later post. For now, here’s a problem from my textbook’s tests (numbers and exact context changed) that my students found particularly troublesome.

“A grocery store examines its shoppers’ product selections and calculates the following: The probability that a randomly chosen shopper buys milk is 0.34. The probability that the shopper buys bananas is 0.18. The probability that the shopper buys both is 0.11. Make a Venn diagram and a two-way table to represent these probabilities.”

We did several problems like this in class (though not enough, and not explicitly enough) and the students mostly did pretty well. This particular problem, which I gave them on a quiz, caused a nearly full-class freak-out. I got a lot of 2-way tables that looked like this:

| | Milk | Bananas | Total |
|---|---|---|---|
| Buy | .34 | .18 | .52 |
| Don’t Buy | .66 | .82 | 1.48 |
| Total | 1 | 1 | |

instead of this:

| | Buys milk | Does not buy milk | Total |
|---|---|---|---|
| Buys bananas | 0.11 | 0.07 | 0.18 |
| Does not buy bananas | 0.23 | 0.59 | 0.82 |
| Total | 0.34 | 0.66 | 1 |
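For anyone who wants to check the arithmetic, here is a minimal Python sketch of how the four inner cells follow from the three numbers given in the problem. The function name `two_way_table` and the dictionary layout are made up for illustration, not something from the textbook. The only idea it uses is that each "only" cell is a marginal minus the overlap, and the "neither" cell is whatever is left over.

```python
def two_way_table(p_milk, p_bananas, p_both):
    """Return the four joint probabilities of the 2x2 table.

    Keys are (buys_milk, buys_bananas) pairs. This is a hypothetical
    helper written for this post, not a library function.
    """
    cells = {
        (True, True): p_both,                             # milk and bananas
        (True, False): p_milk - p_both,                   # milk, no bananas
        (False, True): p_bananas - p_both,                # bananas, no milk
        (False, False): 1 - p_milk - p_bananas + p_both,  # neither
    }
    # Round only to tidy up floating-point noise such as 0.06999999999999999.
    return {key: round(p, 10) for key, p in cells.items()}


table = two_way_table(p_milk=0.34, p_bananas=0.18, p_both=0.11)

for (milk, bananas), p in table.items():
    print(f"milk={milk!s:<5} bananas={bananas!s:<5} P={p:.2f}")

# Marginals recovered from the cells: each should match the problem statement.
milk_total = table[(True, True)] + table[(True, False)]
bananas_total = table[(True, True)] + table[(False, True)]
grand_total = sum(table.values())
```

The useful check is the last three lines: the milk cells should add back up to 0.34, the bananas cells to 0.18, and all four cells to 1. A table built with milk and bananas as two answers to one question fails that check.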

Some students started making the first table, realized it didn’t follow the normal properties of a 2-way probability table (the totals don’t come out to 1), and freaked out. Even so, without hints and prodding from me they had a VERY hard time figuring out the correct things to go in the headers.

I had never seen this before.

I think the problem comes in analyzing the question to figure out what the *variables* of this scenario are (there should always be two for a two-way table) and what the *outcomes* are. We have talked in class about thinking of a variable as the question we ask, with the outcomes as the possible answers. So in this case the correct interpretation is two different questions, “Did you buy milk?” and “Did you buy bananas?”, each with two possible outcomes.

They instead imagined the question “What did you buy?” with a more open-ended set of possible answers, so they thought milk and bananas belonged on the same row or column. Some realized this caused issues with mutual exclusivity (though not as many as I’d hoped) and ended up making a **one-way table** with the variable “bought” and the answers “milk”, “bananas”, “both”, and “neither”. If you mean “milk and not bananas” when you say “milk”, this at least allows for mutual exclusivity and a sum of 1, but it does lose out on most of the advantages of a 2-way table.
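To make concrete what that one-way table keeps and what it gives up, here is a small Python sketch using the numbers from the problem. The labels "milk only" and "bananas only" are my own spelled-out versions of the students' "milk" and "bananas" answers, and the variable names are just for illustration.

```python
# The three numbers given in the problem.
p_milk, p_bananas, p_both = 0.34, 0.18, 0.11

# Four mutually exclusive answers to the single question "What did you buy?"
one_way = {
    "milk only": p_milk - p_both,
    "bananas only": p_bananas - p_both,
    "both": p_both,
    "neither": 1 - (p_milk + p_bananas - p_both),
}

for answer, p in one_way.items():
    print(f"{answer:>12}: {p:.2f}")

# The four answers cover every shopper exactly once, so they sum to 1.
total = sum(one_way.values())
print(f"{'total':>12}: {total:.2f}")

# P(milk) is no longer an entry in the table; it has to be rebuilt
# by adding two of the rows back together.
p_milk_rebuilt = one_way["milk only"] + one_way["both"]
```

The sum works out, which is why this version feels safe to students. The cost shows up in the last line: the marginal probabilities that a two-way table displays in its totals row and column now have to be reassembled by hand.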

I will test this today by asking the same question with a different scenario as a warmup, replacing “bought bananas” with “were men”. I think this will help them realize that there are actually two DIFFERENT QUESTIONS rather than two different answers to the same question, and it may help them clarify their thinking.