The Inclusion-Exclusion Principle

This isn’t a puzzle but I needed to get my head around this for other puzzle-related purposes and ended up writing a little essay that I think does a nice job of making the inclusion-exclusion principle more intuitive (and then also I guess proves it) so I wanted to share. Call it a lemma for an upcoming puzzle.

Say we have 3 sets – A, B, and C – and we want to know the size of the union of them. I.e., how many things are in any of sets A, B, and C? Well, if we add the size of set A plus the size of set B plus the size of set C then we’ve overcounted things that are in, for example, both A and B. And we’ve triple-counted things that are in all 3 sets. See the leftmost Venn diagram here (image from Wikipedia):

If we subtract all the pairwise intersections, we’re almost good. Except now we’ve failed to count anything that’s in all 3 sets. See the middle Venn diagram above.

Great, so add back in the intersection of all 3 (right Venn diagram) and we’re golden.

Sanity check interlude: blond/blue noses

In a population of a million people, let’s count how many people have either blond hair or blue eyes or a nose. Half a million have blond hair, a quarter of a million have blue eyes, and all million have a nose. Add those up and we get 1.75 million people. Obviously impossible with only a million total people. For starters, we’ve double-counted these groups:

  1. People with blond hair and a nose (.5M)
  2. People with blue eyes and a nose (.25M)
  3. People with blond hair and blue eyes (call this x, something between .25M and .5M depending on blond/blue overlap)

Furthermore, we’ve triple-counted people with blond hair and blue eyes and a nose – this is x again, the number of blond-and-blue-eyed people (they all have noses, duh).

So we get our final answer like so:

  • 1) Add the individual set sizes: .5M + .25M + 1M
  • 2) Subtract the pairwise overlaps:
    • 2a) The blond+nose people: .5M which includes the blond+blue+nose people (x)
    • 2b) The blue+nose people: .25M which includes the blond+blue+nose people (x) again
    • 2c) The blond+blue people: x (which is of course also the blond+blue+nose people)
  • 3) Notice that we triple-counted x in step 1 and then triple-subtracted it in step 2, so at this point the blond+blue+nose people aren’t counted at all
  • 4) So add back the blond+blue+nose people (x)

Putting that all together: .5M + .25M + 1M - .5M - .25M - x + x = 1M.

Which is to say that all 1 million people have either blond hair or blue eyes or a nose. Sanity confirmed!

That one was obvious but that same procedure would work for nonobvious questions like how many people have blond hair or blue eyes or a unibrow.

Generalize from 3 sets to n

In full generality, for the size of the union of n sets, we want to:

  1. Add up the individual sizes – the \binom{n}{1} = n size-1 subsets
  2. Subtract the sizes of all \binom{n}{2} pairwise intersections
  3. Add back the sizes of all \binom{n}{3} triple-wise intersections
  4. Subtract the sizes of all \binom{n}{4} quadruple-wise intersections

All the way until we add or subtract the \binom{n}{n} = 1 intersection of all n sets.

But why does that work? Consider some element u of the union that happens to be a member of k of the n sets. When we go through that adding and subtracting, how many times, on net, have we counted u? Hopefully one time! Let’s check. The following parallels the numbered steps above:

  1. Since u is a member of k sets, we count it \binom{k}{1} = k times at this stage.
  2. How many of the size-2 intersections include u? All the \binom{k}{2} pairs. Because the only way to get u is to intersect two sets that u is in both of.
  3. Similarly for the size-3 intersections: \binom{k}{3} of these include u so that’s how many times u is counted at this stage.
  4. You guessed it, \binom{k}{4} of the size-4 subsets contain u because only subsets of the k sets containing u, when intersected, will still contain u.

In general then, at stage i, u is counted \binom{k}{i} times. With the alternating signs for the adding and subtracting, here’s the net number of times u is counted:

\sum_{i=1}^{k} (-1)^{i+1} \binom{k}{i}.

Now we’ll use (drumroll) the good old binomial theorem to show that that’s 1. Here’s the binomial theorem in general:

(x+y)^k = \sum_{i=0}^k \binom{k}{i}x^{k-i}y^i.

Now watch what happens when we pick x=1 and y=-1:

(1-1)^k = \sum_{i=0}^k \binom{k}{i} 1^{k-i} (-1)^i.

The first thing to notice is that (1-1)^k is zero. And then we just need to massage that summation a little to make it match the one we care about. Namely, separate out the first term, flip the sign of everything by changing the (-1)^i to (-1)^{i+1}, and, of course the 1^{k-i} is just 1 so that goes away:

\begin{align*} 0 &= \sum_{i=0}^k (-1)^i\binom{k}{i} \\ &= (-1)^0\binom{k}{0} + \sum_{i=1}^k (-1)^i\binom{k}{i} \\ &= 1 - \sum_{i=1}^k (-1)^{i+1}\binom{k}{i}. \end{align*}

Amazing. That summation is exactly the one we want – the number of times our arbitrary u is counted on net – and 1 minus that summation is 0. So the summation is 1!

Final formula

All that was just to show that alternate adding and subtracting of bigger and bigger subset intersections works for counting every element in the union exactly once.

Now we’ll just write that as a single formula for the size of the union of n sets, A_1 through A_n:

\begin{align*} \left| \bigcup_{i=1}^{n} A_i \right| &= \sum_{i=1}^{n} |A_i| \\ &\quad - \sum_{1 \leq i < j \leq n} |A_i \cap A_j| \\ &\quad + \sum_{1 \leq i < j < k \leq n} |A_i \cap A_j \cap A_k| \\ &\quad \cdots \\ &\quad \pm (-1)^{n+1} |A_1 \cap A_2 \cap \cdots \cap A_n|. \end{align*}

Or as a single nested sum:

\begin{align*} \left| \bigcup_{i=1}^{n} A_i \right| = \sum_{j=1}^{n} (-1)^{j+1} \sum_{1 \leq i_1 < i_2 < \cdots < i_j \leq n} \left| A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_j} \right|. \end{align*}

Or even more compactly, instead of first summing over the sizes and then constructing the subsets of each size, we can directly sum over all the nonempty subsets like so:

\begin{align*} \left| \bigcup_{i=1}^{n} A_i \right| = \sum_{\emptyset \neq S \subseteq \{1, \ldots, n\}} (-1)^{|S|+1} \left| \bigcap_{i \in S} A_i \right|. \end{align*}
1 Like

Nice! I also wrote a series of ten blog posts explaining + proving PIE a while back:

Covers some similar things as your essay but there’s a bit of extra perspective.

1 Like

My goodness that is an amazing sequence. That is the kind of math writing I aspire to. The visual proofs / intuition pumps are just gorgeous. And that’s really smart to prove PIE inductively. I’m relieved to see that the tiny subset of your tour de force that I inadvertently reproduced here seems to match up well.

A couple screenshots from @byorgey’s PIE series to spruce up this thread and pique people’s curiosity:

Very much the tip of the iceberg. Thank you so much for sharing this!