Matching Game Perfect Play

narthur · May 14, 2022, 11:42am

Simplified problem statement

Consider the single-player version of Memory, where you try to uncover matching pairs of cards by flipping over two at a time. If they match, remove them – if not, put them back. And assume you have a perfect memory and play perfectly. Given p pairs of cards (n=2p cards total), how many turns does it take to find all the matches? Give the full probability distribution in terms of p.

Original problem statement

Given a simple single-player matching game composed of pairs of cards randomly arranged in a grid of dimensions x,y, define f(x,y) which returns the probability distribution of required turn counts to complete the game, given perfect play and recall. A single turn is taken by flipping two cards, and flipping them back down if they don’t match, or removing them from the grid if they do.

byorgey · May 14, 2022, 12:01pm

Ooh, interesting problem. I know what I will be thinking about while I sit through graduation this morning. If you get a match, do you get a “free turn”? Or is one turn = flipping two cards?

narthur · May 14, 2022, 12:05pm

I think for the purpose of the original puzzle there are no free turns. Of course, feel free to figure it out that way, too!

dreev · May 14, 2022, 6:38pm

Great problem! I have questions:

Is this the kids’ game apparently known by 9 different names? (You should link us to your implementation! It’s really well done!)
Is 2-by-2 the first nontrivial case?
Does griddedness matter? Does anything change if we put the 2-by-2 grid into a 1-by-4 line?
This function is meant to return the whole probability distribution, not just the expected number of turns if you pick the given card?
Sanity check: before any cards have been turned over, the function returns the same thing for all cards, right?
Is the problem to find that function given an arbitrary set of known-so-far cards?

narthur · May 14, 2022, 9:33pm

Yes! And sure.

Yes. f(1,2) and f(2,1) would both produce a result of {1: 1} (100% probability of a solution in a single turn), and I’m tempted to say f(1,1) would be undefined.

Probably not, but I’m not sure. Does f(1,4) === f(2,2)?

Yes. I think using x,y as the variable names was probably misguided, since it implies coordinates. l,h would probably be more appropriate, as they are representing the dimensions of the grid, not a selection on a specific turn.

I think your last two questions may stem from the same misunderstanding.

Given f(2,2), the result would be a list of turn counts, and the probability of a solution being found at each number of turns before any cards are revealed. Something like:

f(2,2) = {
  2: .25
  3: .75
}

Where the sum of the probabilities equals 1. (Also, that particular distribution was a guess, and may very well be incorrect.)

I suspect that your line of questioning hints at some interesting directions to extend the problem, but hopefully my answers clarify the intent of the original problem.

poisson · May 15, 2022, 1:03am

Yes it seems that f(x, y) = f(1, xy)

It feels like the annoying thing is the possibility that you reveal a card, then second reveal a card matching one you already know. Otherwise would only need to track number of cards known and total in the recursion. But i guess this only doubles the number of possible states.

dreev · May 15, 2022, 1:29am

Sounds right, @poisson! So we can make this just be a function of n, the number of cards.

Another sanity check: Playing optimally is straightforward (if you have a perfect memory). Just flip over a card you haven’t seen before, then flip over its match if you’ve seen it. Otherwise, flip over another one you haven’t seen. You’ll either get lucky and that’ll be a match, or you’ll memorize 2 new cards and repeat the process. I guess eventually there may not be any unseen cards, only seen ones. But in that case you know exactly where all the remaining matches are, so just gather them up.

dreev · May 15, 2022, 1:44am

I think the answer for 4 cards is {2: 1/3, 3: 2/3}. I.e., a 1/3 probability of getting maximally lucky and finishing in 2 turns and a 2/3 probability of getting maximally unlucky and taking 3 turns.

That’s because the possible configurations are:

AABB
ABAB
ABBA

It’s without loss of generality (WLOG) that the first card is A. (Whatever it is, just call it A!) That means there are 3 possible place’s for A’s match. You flip over A first and then have a 1/3 chance of hitting its match (the other A). If you do, you’re home free – 2 turns. If you don’t (2/3 chance) then you only wasted 1 move and are home free anyway. Because you just learned everything you need to know. You know where one A is and where one B is. Flip over an unseen card on the next turn and you know its match. That’s turn 2. The last two cards remaining obviously are the other pair, so that’s turn 3.

I don’t know yet how messy this is going to get with 6 cards, let alone n. It should be fun to simulate at least!

narthur · May 15, 2022, 10:11am

Some additional assertions given f(l,h):

n = l*h # number of cards
p = n/2 # number of pairs
min = p # maximally lucky number of turns

If we’re maximally unlucky, we will find all singles before finding a single pair. Which means that ceil(p/2) turns will be spent gathering information, and then p turns will be spent harvesting. So:

max = ceil(p*1.5) # maximally unlucky number of turns

Which holds for f(2,2), where we only have min and max.

min = 2
max = 3

So every distribution should be composed of the following set of turn counts, where p = l*h/2:

[p, p+1, p+2, ... ceil(p*1.5)]

n	p	buckets	count
4	2	`[2,3]`	2
6	3	`[3,4,5]`	3
8	4	`[4,5,6]`	3
10	5	`[5,6,7,8]`	4
12	6	`[6,7,8,9]`	4

So the number of buckets equals:

(ceil(p*1.5) - p) + 1

byorgey · May 15, 2022, 8:09pm

A few observations. First, inspired by @dreev 's listing of possible configurations, consider this similar game: there is a row of n cards laid out face down. You play just like the usual memory game, except that you must flip them over in order from left to right. The only exception is that when the first card you flip over on a turn is a card you have seen before, you may go back and flip over its match instead of flipping over the next card in the row. But otherwise you must always flip over the next card in the row that you haven’t flipped. (If the second card you flip over on a turn is one you’ve seen before, you’ll need another turn to claim it and its match anyway, so it doesn’t matter if you do it right away on your next turn or wait until the very end to claim all the (now known) pairs.)

I claim this game is actually equivalent to the original game. The reason is that you are going to flip over the unknown cards in some order anyway, and if you have no information about the cards before you start, flipping them over in an order of your choosing is no different than being given a specified order at the start of the game.

I find this perspective helpful though, because we can now think about all the possible permutations of cards, and how many turns are needed for any given ordering.

This is actually not true. It’s not maximally unlucky to keep flipping only cards you’ve never seen before, because at least it gives you information you didn’t have before. What’s maximally unlucky is to keep flipping over cards you’ve already seen before, but only as the second card on a given turn, so you still need to use another turn to get that match anyway and you don’t get to save a turn. The shortest example I can come up with that illustrates this is an ordering like ACBCDAEBDE. Notice that the first four turns have A, B, D, E as the first cards flipped up, all of which are the first occurrence of that card. So you spend four turns flipping over AC, BC, DA, and EB, plus an extra three turns to get the pairs of C’s, A’s, and B’s. Then on your last two turns you will flip over the D’s, then the E’s, for a total of 9 turns, which is bigger than ceil(3/2 * 5) = 8. In general I think we can continue this pattern to force a worst case of n-1 turns.

There’s definitely something here involving the parity of the second occurrences of cards. If the second occurrence of a card is in an odd position, you will flip it up first and you get to save a turn. If it is in an even position, you’ll flip it up second and thus need to spend an extra turn to get that pair anyway. Except that when you do save a turn, it means you only flipped up one unknown card so it swaps the parity of all subsequent cards.

byorgey · May 16, 2022, 7:27pm

Some very naive code for simulating the game and computing probabilities is here: matching/Matching.hs at main · byorgey/matching · GitHub It’s naive because it does not take into account various symmetries like the fact that it doesn’t matter which of two matching cards comes first, or how we label the cards; it simply gets all possible permutations of AABBCCDD… and then computes the turns needed for each. Taking symmetries into account will introduce some complexity but should make it much more efficient; I might do that later.

In any case, I’ve computed the following distributions for p=2 through 5 (p is the number of pairs, i.e. n = 2p):

p=2: {2: 1/3, 3: 2/3}
p=3: {3: 1/15, 4: 8/15, 5: 6/15}
p=4: {4: 1/105, 5: 20/105, 6: 70/105, 7: 14/105}
p=5: {5: 1/945, 6: 40/945, 7: 370/945, 8: 504/945, 9: 30/945}

p=6 seems out of reach for my naive code since it would involve computing all 12! = 479 million permutations of AABB…FF.

A few observations/conjectures:

I have not written all the above probabilities in lowest terms, because it seems like all the probabilities can be written with a denominator of (n-1)!!, that is, 1*3*5*...*(n-1).
In particular it seems that the probability of needing exactly p turns is exactly 1/(n-1)!!.
Also, I conjecture that the probability of needing the maximum n-1 turns is (2^p - 2)/(n-1)!!.

The OEIS hasn’t turned up anything useful in terms of the numerators of the probabilities, though I did notice there are several central binomial coefficients involved, namely, 20 and 70, and 504 is 2*252 which is another one. I don’t know yet if this is a coincidence.

Actually, thinking about it a bit more, conjecture 2 is easy to prove: out of the n! = (2p)! possible permutations of cards, the ones which only need exactly p turns are the ones of the form CCBBAADD… where every pair occurs together. The number of such orderings of cards is p! 2^p (we first choose one of p! possible orderings for the pairs, then for each pair choose one of two orders for the cards in the pair). Then the probability is (p! 2^p) / n! = 1/(n-1)!!.

byorgey · May 17, 2022, 2:01am

A bit more progress: matching/Matching.hs at main · byorgey/matching · GitHub now has some updated, more efficient code, with which I was able to compute additional probabilities up through p=8. Here are the additional probabilities, but listing only the numerators; the denominators are all equal to (2p-1)!!.

p=6: 1, 70, 1330, 5894, 3038, 62
p=7: 1, 112, 3780, 38304, 76692, 16120, 126
p=8: 1, 168, 9156, 175364, 930580, 832992, 78510, 254

OEIS did turn up something useful for the numerators in the second column, i.e. the probabilities for needing p+1 turns: 2, 8, 20, 40, 70, 112, 168… seems to be the sequence A007290 - OEIS with formula 2 \binom n 3. Hence we can conjecture that the probability of needing p+1 turns (one more than the minimum of p) is \displaystyle \frac{2 \binom{p+1}{3}}{(2p-1)!!}. I do not yet know how to explain this or whether it can be seen as an instance of a more general formula for the other probabilities.

dreev · May 17, 2022, 7:18am

[edge of seat!]

Nice work on this!

What about expected value? From your numbers above, I’m getting this:

p	E[turns]
2	2.66667
3	4.33333
4	5.92381
5	7.55238
6	9.16248
7	10.7792
8	12.393

For those following along at home, expected value refers to the average number of turns you would take if you played again and again. So with 16 cards (8 pairs) in the deck, a perfect player would take 12.39 turns on average.

(Also for those who don’t know, OEIS is truly a marvel of the modern world.)

byorgey · May 17, 2022, 10:00am

Ah, yes, expected value is interesting to look at too. I get the same numbers as you, and here are the numerators in exact form for p=1 through 8:

1, 8, 65, 622, 7137, 95244, 1456653, 25120890

Again, the denominators are (2p-1)!!.

byorgey · May 17, 2022, 1:21pm

I’m trying to look for patterns in the sequence of probability denominators for p+2, namely,

6, 70, 370, 1330, 3780, 9156

Based on the pattern for p+1, I conjectured that they might be some constant times some offset of \binom p 4 or \binom p 5, but that doesn’t seem to be the case. Factoring them doesn’t seem to reveal much that jumps out at me (9156 even has 109 as a prime factor). Taking successive differences yields

64, 300, 960, 2450, 5376

which are all highly composite (the largest prime factor any of them have is only 7). This seems promising but I haven’t been able to figure out any particular pattern.

Edited to add: these successive differences divided by two are A004256 - OEIS , i.e. they are given by the formula n^2 (n+1) (n+2)^2 / 3. Tantalizing, but still not sure what to do with that.

Next step is probably to try to come up with some kind of recurrence that could be computed via dynamic programming.

poisson · May 17, 2022, 9:14pm

I’m going to post a short summary of what I’m trying to do in the hopes that it will let me stop thinking about this for now and get back to work.

Consider the current state of the system. It is specified by the number m of pairs of cards still in play, the number k of cards we know, and the bit s\in{0,1} which is 1 if we know a pair of numbers. We may denote the probability distribution for the number of turns t to completion by \tilde{P}{\left(t | m, k, s\right)}.

Note immediately that if s=1 we immediately complete that pair, \tilde{P}{\left(t| m, k, 1\right)} = \tilde{P}{\left(t-1| m-1, k-2, 0\right)} \equiv P{\left(t-1|m-1,k-2\right)}.

With this in mind we may immediately write down a recurrence relation for P. The base cases are P{\left(t| 1, k\right)} = \delta_{t,1} and P{\left(t|m,m\right)} = \delta_{t,m}. The recurrence relation, I believe, is

P{\left(t| m, k\right)} = \frac{k}{2m-k} P(t-1| m-1,k-1) + \left(1-\frac{k}{2m-k}\right)\frac{k}{2m-k-1} P(t-2|m-1,k) \\+ (1-\frac{k}{2m-k})\frac{1}{2m-k-1} P(t-1|m-1,k) +(1-\frac{k}{2m-k})(1-\frac{1}{2m-k-1}) P(t-1|m,k+2)

Now we must solve this recurrence.

I was hoping for some way to eliminate a variable because it’s easier to find explicit solutions to recurrences with two variables on the internet. But after wasting a lot of time thinking about it I couldn’t. So it seems my next step (shudder) is going to be to try to compute a three-variable generating function for P. But since it will be the solution to a second-order PDE without constant coefficients, I am not exactly optimistic that this will work.

dreev · May 17, 2022, 10:53pm

Ooh, this looks promising! Potentially Wolfram’s RSolve function can do this…

It took me a while to figure out that this is the Kronecker delta function. I.e, the probability of taking t more turns when there’s only one more pair face down is 1 if t=1 and 0 otherwise. Similarly, the probability of taking t more turns when there are m more pairs in play, all of which you know what they are, is 1 if t=m and 0 otherwise.

byorgey · May 17, 2022, 10:57pm

Nice! I got this to work after a bit of head-scratching. We need a few more base cases, namely P(0|0,k) = 1, and P(t|m,k) = 0 if m > 0 and t \leq 0. (Also, the base case for P(t|m,m) is redundant.) I then found one small bug/typo in your formula: the second coefficient of the last term should be (1 - \frac{k+1}{2m-k-1}) instead of (1 - \frac{1}{2m-k-1}) since it is the probability of picking a second card which is neither the match of one of the k we know, nor the match of the first card we turned up. But after that it agrees exactly with the probabilities I computed previously. Later I’ll switch it from plain recursion to use dynamic programming which should let us quickly compute the value for any reasonably sized values of t, m, and k.

byorgey · May 18, 2022, 11:02am

Using @poisson 's recurrence with dynamic programming, we can now very quickly compute much larger values. For example, the probability of needing exactly 73 turns in a 10x10 grid of 50 pairs is

P(73|50,0) = \frac{27048558261039196039458690722034818353316913726155081113048}{89348548795725575772606474990495142838821626120878245196044921875}

This doesn’t necessarily get us any closer to a general formula but it is fun.

dreev · May 18, 2022, 10:23pm

Just to replicate the progress, if I put the following into Mathematica, I get the same answers as @byorgey is getting.

Take P\left(t\;\middle|\;m,k\right) to be the probability of needing exactly t turns given m pairs still face down and k of those 2m cards already seen.

P\left(0\;\middle|\;0,k\right) = 1
P\left(1\;\middle|\;1,k\right) = 1
P\left(t\;\middle|\;m,k\right) = 0 if t\leq 0 or m \le 1
P\left(t\;\middle|\;m, 2m\right) = 0
P\left(t\;\middle|\;m, 2m-1\right) = 0
P{\left(t\;\middle|\;m, k\right)} = \frac{k}{2m-k} P\left(t-1\;\middle|\;m-1,k-1\right) \\\quad\quad\quad\quad\quad\quad+ \left(1-\frac{k}{2m-k}\right)\frac{k}{2m-k-1} P\left(t-2\;\middle|\;m-1,k\right) \\\quad\quad\quad\quad\quad\quad+ \left(1-\frac{k}{2m-k}\right)\frac{1}{2m-k-1} P\left(t-1\;\middle|\;m-1,k\right) \\\quad\quad\quad\quad\quad\quad+\left(1-\frac{k}{2m-k}\right)\left(1 - \frac{k+1}{2m-k-1}\right) P\left(t-1\;\middle|\;m,k+2\right)

Or in Wolframaic:

p[t_, m_, k_] := p[t, m, k] = Which[
  t == 0 && m == 0, 1, (* no more pairs so no more turns *)
  t == 1 && m == 1, 1, (* 1 more pair so 1 more turn *)
  t <= 0 || m <= 1, 0, (* no other way to have 0 more turns or 1 more pair *)
  k >= 2 m - 1, 0,     (* prevents division by zero *)
  True, k/(2m-k) p[t-1, m-1, k-1]
    + (1-k/(2m-k)) k/(2m-k-1) p[t-2, m-1, k]
    + (1-k/(2m-k)) 1/(2m-k-1) p[t-1, m-1, k]
    + (1-k/(2m-k)) (1-(k+1)/(2m-k-1)) p[t-1, m, k+2]
]

(I get 80.1732 turns in expectation for a 10x10 grid of 50 pairs.)

I don’t think Mathematica can actually solve the recurrence relation.

And, wait, the plot thickens! Or straightens? I made a plot of the expected number of turns as a function of the number of pairs and it turns out it’s pretty much linear. The function is extremely well-approximated by 1.6138329300699403p - 0.5206273290971469. What do we make of that?

PS: I refined that linear regression to try to focus on what the function is converging to in the limit and am getting this: 1.6137060532056546696p - 0.51154898513237434532.

Topic		Replies	Views
Bit-Probing The Secret List, or Cantor vs Kronecker Buzzles	28	525	January 21, 2025
Old Year's Resolutions AKA I challenge you! Or something Akrasia	37	1708	October 7, 2022
Spot-It puzzle via Adam Wolf Buzzles	11	773	November 27, 2019
20th anniversary of The Joel Test for software teams Tech	24	1328	February 5, 2021
🦜 Parrots Buzzles	10	535	December 17, 2023

Matching Game Perfect Play

Simplified problem statement

Original problem statement

Related topics