The Two Envelopes problem

You are the subject of an experiment. You are presented with two closed envelopes, prepared by a group of people I’ll call Team E. One of the envelopes contains twice as much money as the other, but you don’t know the actual dollar amounts.

You must choose one envelope (at random — there’s no other way). Call your chosen envelope “A”, and the other one “B”.

Now you’re given a choice: You can open envelope A and keep the money in it, or you can switch to envelope B. Your goal is to maximize your expected winnings. Should you switch?

This is a well-known logic problem, known as the Two Envelopes problem, or the Two Envelopes paradox.

Argument #1

Let $A be the amount in envelope A. If you keep A, the expected value of your winnings is $A. If you switch to B, there’s a 50% chance of getting $2A, and a 50% chance of getting $0.5A, for expected winnings of ($2A + $0.5A)/2, or $1.25A. Since $1.25A is greater than $A, it is rational to switch.

Suppose you switch. Now in possession of the unopened envelope B, you’re offered the option of switching back to envelope A. Using the same reasoning as before, the expected value of keeping B is $B, and the expected value of switching to A is $1.25B, which is larger, so you should switch back.

And this switching back and forth could continue forever, increasing your expected winnings exponentially.

Argument #2

There are lots of ways to approach the problem that suggest that switching doesn’t change your expected winnings. For example, let S be the smaller of the two dollar amounts. Regardless of whether you keep or switch, your expected winnings are the same: (S + 2S)/2, or 1.5S.
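Argument #2 is easy to check with a quick Monte Carlo sketch. The distribution of S below (uniform on $1–$100) is an arbitrary assumption, just for illustration; any distribution gives the same conclusion:

```python
import random

def play(n_trials, switch):
    """Simulate the game: Team E picks amounts (S, 2S), and the player's
    envelope A is assigned at random.  Returns average winnings."""
    total = 0.0
    for _ in range(n_trials):
        s = random.randint(1, 100)  # smaller amount; assumed distribution
        a, b = (s, 2 * s) if random.random() < 0.5 else (2 * s, s)
        total += b if switch else a
    return total / n_trials

random.seed(1)
print(play(200_000, switch=False))  # both hover around 1.5 * E[S] = 75.75
print(play(200_000, switch=True))
```

Both strategies converge to 1.5 times the average smaller amount, exactly as Argument #2 predicts.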

Argument #3

Stop overthinking it! You have no useful way to distinguish the envelopes, so no strategy could possibly be better or worse than any other strategy. No need to do any calculations.

All these lines of reasoning seem reasonable, but they don’t all give the same answer.

I hope we can agree that Argument #1 is suspicious, at least the “switching back and forth forever” part of it.

There’s probably nothing wrong with Argument #2, but it does not give us any clue as to what might be wrong with #1.

Entire math papers have been written to explain the problems with Argument #1, filled with calculations of conditional and prior probabilities and that sort of thing. But I think the real challenge is twofold:

  • Challenge #1: Explain the flaw in Argument #1, in a way that is convincing and easy to understand, even for someone who is not a mathematician specializing in probability theory.
  • Challenge #2: Explain the flaw in Argument #1, in a way that is robust. It should still work, without major changes, even if the original problem is modified in various ways.

A remarkable thing about the Two Envelopes problem is how resistant it is to Challenge #2. For every solution to one variant of the problem, it seems like there’s a slightly different variant that it doesn’t solve.

My thoughts on Challenge #1

I tried to find an approach to Challenge #1 that works for me.

Consider the following two things:

  1. Whether you’ll gain money by switching from A to B
  2. The absolute dollar amount in A

Is it possible that there’s a correlation between them?

Yes, it’s more than possible. The details depend on Team E’s methodology, but it seems inevitable that there will end up being some correlation. Generally speaking, the larger the dollar amount in A, the lower the probability that you’ll gain money by switching to B.

Switching from A to B will gain you money 50% of the time, but the cases where it does are correlated with the cases where the amount in A is relatively small.
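A quick tally makes this concrete. Team E’s method below (the smaller amount is a random power of 2 up to a cap) is an assumed one; the point is the pattern, not the particular numbers:

```python
import random
from collections import defaultdict

random.seed(0)
seen = defaultdict(int)   # how often each amount shows up in A
wins = defaultdict(int)   # how often switching from that amount gains money
for _ in range(200_000):
    s = 2 ** random.randint(0, 5)  # smaller amount: $1..$32 (assumed method)
    a, b = (s, 2 * s) if random.random() < 0.5 else (2 * s, s)
    seen[a] += 1
    wins[a] += (b > a)

for amount in sorted(seen):
    print(amount, round(wins[amount] / seen[amount], 3))
```

With this particular method the effect is concentrated at the extremes: switching from $1 always gains, switching from $64 never does, and the middle amounts sit at 50%. Overall, switching gains exactly half the time, but the gains cluster at the small amounts.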

I think that’s maybe the best straight-to-the-point resolution of the paradox. But unfortunately, it has a large asterisk attached to it. I’ll get back to that soon. First, I offer another way of thinking that may shed some light on what’s wrong with Argument #1.

A-tokens and B-tokens

Suppose that, in addition to money, you’re also awarded some number of “A-tokens”. If you open envelope A, you always get 1 A-token. If you open envelope B, you get either 2 A-tokens or 0.5 A-tokens, depending on the amount in envelope B relative to the amount in envelope A.

We can also imagine “B-tokens”, awarded to you based on your winnings relative to the amount that turns out to be in envelope B.

If, instead of money, your goal is to maximize the expected number of A-tokens you win, then your best strategy is, in fact, to always open envelope B. That way, you’ll win 1.25 A-tokens on average, versus 1.0 if you always open envelope A.
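A short simulation illustrates the token payoff (the distribution of the smaller amount is again an arbitrary assumption):

```python
import random

def avg_a_tokens(open_b, n_trials=200_000):
    """A-tokens measure winnings relative to the amount in envelope A:
    opening A pays 1 token; opening B pays 2 or 0.5 tokens."""
    total = 0.0
    for _ in range(n_trials):
        s = random.randint(1, 100)  # smaller amount; assumed distribution
        a, b = (s, 2 * s) if random.random() < 0.5 else (2 * s, s)
        total += (b / a) if open_b else 1.0
    return total / n_trials

random.seed(2)
print(avg_a_tokens(open_b=False))  # exactly 1.0
print(avg_a_tokens(open_b=True))   # about 1.25
```

In token units, always opening B really is the better strategy; the trouble only starts when tokens are confused with dollars.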

The reasoning here is basically the same as the reasoning used in Argument #1 (though now it might not make sense to switch back to envelope A, since you know what’s in it). So it seems that Argument #1 is actually trying to maximize A-tokens, instead of money. The rationale for switching back to envelope A then adds to the confusion, by adding B-tokens to the mix.

I won’t pursue this line of thinking further, but it helps me to realize that Argument #1 might be subtly conflating two or three different value systems, and maximizing the wrong one.

Further analysis

It’s not enough that there be some correlation. There has to be a sufficient amount of it. Switching might not improve your expected winnings by the full 25% that Argument #1 suggests, but if it’s more than 0%, the paradox lives. There’s some wiggle room.

Change the game a little. Give the player full knowledge of the algorithm Team E will use to select the amounts. And we’ll consider what happens if the player is allowed to peek inside envelope A, and then make a one-time decision to keep, or switch.

With these advantages, the player ought to be able to legitimately calculate the expected value of keeping vs. switching, and choose the best one.

If there is a limit to the amount of money that Team E can put in an envelope, nothing too weird happens. It can be arranged so that the player should almost always switch, but it will be balanced out by the larger amounts of money at stake in the rare times that he should keep.
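Here’s one way to see the balancing act, using an assumed bounded scheme: the pair ($2^(N-1), $2^N) gets probability proportional to 2^(N-1)/3^N, cut off at N = 10. The particular choice doesn’t matter; exact rational arithmetic shows the frequent small gains from switching being cancelled by one rare large loss at the cap:

```python
from fractions import Fraction

Nmax = 10  # assumed cap on N
w = [Fraction(2 ** (n - 1), 3 ** n) for n in range(1, Nmax + 1)]
z = sum(w)  # normalizing constant for the truncated distribution
pairs = [(w[n - 1] / z, 2 ** (n - 1), 2 ** n) for n in range(1, Nmax + 1)]

# Expected gain from switching, broken down by the amount seen in A.
gain = {}
for p, lo, hi in pairs:
    gain[lo] = gain.get(lo, 0) + p * Fraction(1, 2) * (hi - lo)  # A is smaller: switch wins
    gain[hi] = gain.get(hi, 0) + p * Fraction(1, 2) * (lo - hi)  # A is larger: switch loses
for a in sorted(gain):
    print(a, gain[a])
print(sum(gain.values()))  # 0: the one big loss at the cap cancels all the gains
```

Every amount below the cap favors switching, the cap itself strongly favors keeping, and the total expected gain from blind switching is exactly zero.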

Now, what if Team E has unlimited funds?

Unlimited money game #1

Suppose Team E’s algorithm gives it a 3/4 chance of putting ($1, $2) in the envelopes, a 3/16 chance of ($2, $4), a 3/64 chance of ($4, $8), etc. That is, a 3/2^(2N) chance of ($2^(N-1), $2^N), for N = 1, 2, 3…

If the player peeks in envelope A and sees $1 (which will happen 3/8 of the time), he should switch. If he sees any other amount, there’s a 12/15 chance that envelope B contains $0.5A, and a 3/15 chance it contains $2A. That works out to an expected value of $0.8A, so he should keep.
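As a sanity check, the conditional probabilities above can be computed exactly, for instance with Python’s fractions module (a sketch; the function name is mine):

```python
from fractions import Fraction

# Game #1: pair ($2^(N-1), $2^N) has probability 3/2^(2N).  Given that the
# player peeks and sees A = $2^k for some k >= 1, how likely is it that
# envelope B holds $2A rather than $0.5A?
def cond_probs(k):
    p_small = Fraction(3, 2 ** (2 * (k + 1))) * Fraction(1, 2)  # A smaller, pair N = k+1
    p_large = Fraction(3, 2 ** (2 * k)) * Fraction(1, 2)        # A larger,  pair N = k
    p2 = p_small / (p_small + p_large)  # chance that B = 2A
    return p2, 1 - p2

p_double, p_half = cond_probs(3)  # the split is the same for every k >= 1
ev_factor = p_double * 2 + p_half * Fraction(1, 2)
print(p_double, p_half, ev_factor)  # 1/5 4/5 4/5 -> keep, since 0.8A < A
```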

If the player doesn’t peek in the envelope, his expected winnings will be the same no matter what he does. The expected value of this game is finite (whether or not the player peeks), so we can do the calculations to verify this.

So, there’s no apparent weirdness here, even with no maximum dollar amount.

Unlimited money game #2

This time, Team E’s algorithm gives it a 1/2 chance of ($1, $2), a 1/4 chance of ($2, $4), a 1/8 chance of ($4, $8), a 1/16 chance of ($8, $16), etc.

If the player peeks in envelope A and sees $1 (which will happen 1/4 of the time), he should switch. If he sees any other amount, there’s a 2/3 chance that envelope B contains $0.5A, and a 1/3 chance it contains $2A. That works out to an expected value of $1.0A, so it doesn’t matter if he switches or not. He can do whatever he wants.

So he may as well simplify his strategy, and switch no matter what. In fact, why even peek in the envelope, if you’re always going to do the same thing no matter what’s in it? But wait. If you don’t peek, then by Argument #3 above, all strategies are equally good. Always-switch can’t be better than always-keep. Yet we just showed that always-switch dominates always-keep. It’s sometimes better, and never worse.
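One way to make the dominance concrete: compute, for each amount the player might see in A, the exact conditional value of switching, using game #2’s distribution from above (a sketch with exact rational arithmetic):

```python
from fractions import Fraction

# Game #2: pair ($2^(N-1), $2^N) has probability 1/2^N.
def switch_factor(k):
    """Expected value of switching, in units of A, given A = $2^k."""
    if k == 0:
        return Fraction(2)  # $1 is always the smaller amount: switching doubles it
    p_small = Fraction(1, 2 ** (k + 1)) * Fraction(1, 2)  # A smaller, pair N = k+1
    p_large = Fraction(1, 2 ** k) * Fraction(1, 2)        # A larger,  pair N = k
    p2 = p_small / (p_small + p_large)                    # chance that B = 2A
    return p2 * 2 + (1 - p2) * Fraction(1, 2)

print([str(switch_factor(k)) for k in range(6)])  # ['2', '1', '1', '1', '1', '1']
```

Switching is strictly better at $1 and exactly break-even everywhere else, which is what makes always-switch dominate always-keep.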

So, something weird is going on here. My argument about correlation takes a big hit.

The expected value of this game is infinite, which must be a big part of the problem. Note that the series diverges to infinity in an “orderly” fashion, like 1+1+1+1+… does, where the size of a term is bounded.

I suspect the correlation argument does work provided the expected value is finite, and Team E’s algorithm is well-defined. But all bets are off if that’s not the case.

Unlimited money game #3

This time, Team E’s algorithm gives it a 1/3 chance of ($1, $2), a 2/9 chance of ($2, $4), a 4/27 chance of ($4, $8), an 8/81 chance of ($8, $16), etc. That is, a 2^(N-1)/3^N chance of ($2^(N-1), $2^N), for N = 1, 2, 3…

If the player peeks in envelope A and sees $1 (which will happen 1/6 of the time), he should switch. If he sees any other amount, there’s a 3/5 chance that envelope B contains $0.5A, and a 2/5 chance it contains $2A. That works out to an expected value of $1.1A, so he should switch.
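The same kind of exact check works here, using the probabilities listed above (again a sketch, not anyone’s official code):

```python
from fractions import Fraction

# Game #3: pair ($2^(N-1), $2^N) has probability 2^(N-1)/3^N.  Given that
# the player sees A = $2^k for some k >= 1:
def cond_probs(k):
    p_small = Fraction(2 ** k, 3 ** (k + 1)) * Fraction(1, 2)   # A smaller, pair N = k+1
    p_large = Fraction(2 ** (k - 1), 3 ** k) * Fraction(1, 2)   # A larger,  pair N = k
    p2 = p_small / (p_small + p_large)  # chance that B = 2A
    return p2, 1 - p2

p_double, p_half = cond_probs(4)  # the split is the same for every k >= 1
ev_factor = p_double * 2 + p_half * Fraction(1, 2)
print(ev_factor)  # 11/10: switching looks 10% better at every amount
```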

In other words, he should always switch, no matter what he sees in envelope A. It’s never even a tie. So why even peek? Just switch.

The expected value of this game is infinite. This series diverges in a more “disorderly” fashion, like 1+2+3+4+… does, where the size of a term is unbounded.

Any purported resolution to the Two Envelopes problem has to figure out how to explain this one. Blaming it on infinity is not wrong, but I don’t find it completely satisfying.

Real world considerations

In this post, I’ve ignored mundane issues like how much money can actually fit in an envelope (can you detect that one envelope is fatter?), or the divisibility of money (can an envelope contain half a cent?), or the fact that things like money and computer memory are finite.

However, especially for variants of the problem with no maximum dollar amount, it isn’t necessarily okay to ignore such things.

If you write a computer program to simulate the unlimited-money games discussed here, you will not be able to guarantee that your program will always work. Absurdly small though the probability may be, there will always be a nonzero chance that it will run out of memory and fail to correctly simulate an instance of the game.

And the prize money involved in those theoretical instances where it fails could be so incredibly high that it overwhelms the money involved in all the instances where it succeeds.

So perhaps we’re saved by the fact that infinite tasks are not possible in this universe. But even that’s not a fully satisfying resolution. After all, this is a thought experiment, so couldn’t we just put it in a hypothetical universe where infinite tasks are possible? Or are such universes always logically inconsistent? I don’t know, but I’m going to leave it at that.

[See also my related post: The “guess what number I’m thinking of” problem.]
