Sunday, July 3, 2022

Information in the Noise

Wow, a non-math nerd posting twice about math. What's that about?

Derek Muller of Veritasium posted a video about the 100 prisoners puzzle (I like "puzzle" in this context better than "riddle" or "problem"). Unlike my earlier post, I have no complaints about this video. Derek is one of the top-tier educational YouTubers, and he did a fantastic job of explaining it. (As before, I'm not going to explain it here; watch his video. Seriously, just watch it.)

So why do I feel the need to comment? I guess I feel I have a small but interesting (to me) tidbit to add.

Derek et al. describe the puzzle's "linked list" solution (my name) as giving a counter-intuitive result, and I guess I have to agree. The numbers are distributed to the boxes randomly, so how could any strategy give a prisoner a better chance of success than random selection? IT'S RANDOM!!!!!!

AN INTUITIVE UNDERSTANDING

And here's my tidbit: it's not as random as it seems. For this puzzle, the numbers are assigned randomly to boxes, without replacement. I.e., you won't find a given number in more than one box, and no number between 1 and 100 is skipped. This is obvious for the setup of the puzzle, but randomizing without replacement puts constraints on the system. Those constraints add information to the noise.

If prisoner number 13 randomly opens box 52, he knows he has a one in 100 chance of seeing his number in that box. He opens it and sees the number 1. He now knows FOR SURE that no other box has the number 1 in it. So his second random choice will have a one in 99 chance of being his number. Each choice gives some information that affects the probability of the next choice. (I.e., the samples are not independent.)

It is these constraints that lead directly to the cycles that are at the heart of the puzzle. And clever people have calculated the probability of having a cycle greater than 50 to be about 0.688. So the "linked list" strategy has ~= 0.312 probability of the prisoners being set free. That's the point of Derek's video.

Let's ruin the puzzle for a moment. Let's assign a random number between 1 and 100 to each box with replacement. It's entirely possible, even probable, that you'll have duplicates (the same number in more than one box) and skips (a number that is not in any box). One effect of this change is that the numbers will no longer necessarily be arranged in cycles. You can have many numbers NOT in a cycle. So the "linked list" solution to the puzzle doesn't improve your chances of survival over pure chance. Getting rid of the "without replacement" constraint removes the information from the noise.

This is how I get an intuitive feeling that you can have a much higher probability of success with the "linked list" solution to the original puzzle - you're taking advantage of the information that's in the noise.

WITH REPLACEMENT

What about my ruined version, where the numbers are assigned to boxes with replacement? To start with, let's calculate the probability that you get a distribution of numbers in boxes that is even possible for the prisoners to win (i.e., every number 1-100 is assigned exactly once). My probability-fu is weak, but I'll try. I think it is (100!)/(100**100) ~= 9.33e-48. Wow, that's a really low probability.

On the off chance that you get a solvable distribution, the probability of success with the linked list solution is ~= 0.312. So the total probability of success for my ruined version, WITH the linked list solution, is ~= 6.4e-43. If instead the prisoners choose their boxes randomly, then it's ~= 7.36e-73.

The prisoners had better appeal to Amnesty International.

No comments: