One of my favorite competitive programming tasks (IOI 2011). You donāt have to actually write any code ā just come up with a way to do the task. Suitable for a) people familiar with binary, or b) programmers, or c) people who kinda liked math in school.
Problem statement for non-programmers (rephrased)
Jaxon has an N byte file, which he wants to send to Lachlan. They both know how big the file is.
They donāt have internet (b/c Australia) so they are going to use parrots. Each parrot can memorize a single byte. Bytes are just whole numbers from 0 to 255.
However, parrots donāt necessarily arrive in the same order they departed. So if the file is [1,2,3], Jaxon canāt just take three parrots and teach them the numbers 1, 2, 3 ā because the parrots might arrive in the opposite order 3, 2, 1 (or in any other order). Heāll need to be cleverer.
Naturally Jaxon will have to use more parrots than N. Itās alright. We love you anyway, Jaxon. Would be nice to use as few parrots as possible though. Are you listening, Jaxon. Listen to me.
If itās unclear, you can read the full original statement here. I got rid of some extra cases because I donāt like them.
Problem statement for programmers
Write an encoder and decoder for messages (arrays of bytes). Your decoder will be given the length of the original message.
encode(msg: uint8[]): uint8[]
decode(n: int, data: uint8[]): uint8[]
The decoder must be able to correctly decode the message even if the data is randomly shuffled after encoding. That is, decode(length(msg), random_shuffle(encode(msg))) == msg for all msg.
Levels of difficulty
ONE BROWNIE POINT: the file is N \leq 16 bytes long, and Jaxon can use at most 10 \times N parrots. Doesnāt have to be optimal.
TWO BROWNIE POINTS: the file is N \leq 32 bytes long, and Jaxon can use at most 10 \times N parrots. Also doesnāt have to be optimal.
RED VELVET CAKE: come up with the optimal solution that works for any N. Optimal means āminimize the maximum number of parrots youād possibly needā.
Solution for N = 2 (read if you didnāt get the task)
You can always use three parrots to send a two-byte file. Hereās how:
If the file is [A, B], you just teach two parrots A and one parrot B. This works because:
Even if the parrots get shuffled on their way to Lachlan, he can look at which byte is more common and heāll know it was the first byte of the file.
I donāt have a solution specifically for the second level, but the first and the third both seem pretty simple to brute force. (And if you can solve the third, I guess that also solves the second, if not as elegantly.)
For the first:
Use the first 7 bits of each byte for a sequence number, with the actual information only in the last bit of each byte. Using only one bit per byte means that you need 8 \times N parrots, and using a seven bit sequence number allows you to send a message requiring up to 2^7 = 128 pieces (parrots). If each parrot carries one bit, thatās \frac{128}{8} = 16 bytes, precisely what we need for the first level of the problem.
For the third:
With k parrots, you can distinguish between exactly \binom{k + 256 - 1}{k} possible message encodings, a la the classic āballs and urnsā combinatorics problem. Each parrot is a āballā and youāre putting it into one of 256 āurnsā, each labeled by an 8-bit integer, based on which byte the parrot carries. Decide on an order for these encodings, these 256-tuples of integers, say the lexicographical order. (The first element of such a tuple is the number of parrots carrying the byte 000000, the second the number of parrots carrying 000001, the third the number of parrots carrying 000010, etc.) Now you have a bijection between the \binom{k + 255}{k} possible message encodings and the natural numbers from 0 to \binom{k + 255}{k} - 1 ā which will be the decoded message. (Youāre given N when decoding the message, so you can, say, sort the set of length-N arrays of eight-bit integers lexicographically and use that number as an index into that ordering to recover the original array.)
To encode a message of length N, you need to choose a k such that 8^N = 2^{3N} is less than or equal to \binom{k + 255}{k}, so choose the least k such that \frac{log_{2}\binom{k + 255}{k}}{3} \geq N.
Actually implementing the bijection is slightly non-trivial, especially in 2011 when people mostly didnāt have access to bignums in competitions. But even with bignums itās still fun to implement. I didnāt participate in IOI but my classmate was a strong competitive programmer and he didnāt manage to come up with the bijection idea at all, IIRC.
When viewed as a pure math puzzle (as indeed you presented it) this problem is much easier to tackle than when approached as a competitive programing puzzle.
Viewing it as a math problem, I started by thinking āif I can find the number of possible encodings for any given number of parrots, I can put those encodings in a bijection with the natural numbers less than it.ā Thatās why I called this solution ābrute forceāāit enumerates all possible encodings and builds a lookup table matching that list to all possible messages that could be sent. With that as a starting point, all that was left was to identify that this was a combinatorics problem, which made it trivial to calculate that number.
If you start with the notion that you put the possible encodings in a bijection with a subset of the naturals, the rest of the problem is relatively easy. But approaching this from a competitive programing angle, youād probably start from the opposite directionāand that, I imagine, makes it a lot harder.
Ah, I finally got the third one while sitting in the car park waiting for my wife to give blood! So hereās what I came up with:
For N<=16: use the first 4 bits to encode which word (byte) of the file it represents. Then use the next 2 bits to encode which pair of bits in that word weāre sending. Use the final 2 bits to send those two bits. This uses 4 parrots per byte sent, and works for up to 2^4 words.
For N<=32: write the whole file as a bit string (max length 256 bits). Each parrot is a byte representing an index position in the bit string, 0-255. Send a parrot for each 1 in the bit string. On average youāll have to send 128 parrots for a 32 byte file, as half the bits will be 1s.
For arbitrary N: extending the previous method, break the file into successive 256 bit strings. Send 1 parrot for each 1 in the first 256 bits, 2 parrots for each 1 in the 2nd 256 bits, 4 for each in the 3rd, 8 for each in the 4th, and so on. So say I receive 23 parrots with the value 27: I know 23 is 10111 in binary, so there is a 1 in the 27th place in the 1st, 2nd, 3rd and 5th sequence of 256 bits. Parrots used is stochastic, depending on message, so Iām sure itās not optimal!
You are using 6 bits for position. This turns out to be more efficient than using 7 bits for position as @zzq did. In general if you spend X bits on position you can send 2^X things of size (8-X) bits. If you try for all possible X-es, only 6 and 7 bits let you transmit 16 bytes.
N\le32 ā also looks correct.
256 parrots in the worst case but itās still less than 10\times N, so well done. Cleverly uses the fact that the number of parrots doesnāt have to be constant. Thatās kinda cool actually.
Arbitrary N ā even more cleverly uses the same fact (Iām jealous), but yeah, itās not optimal.
(Just judging by the fact that it heavily privileges messages with lots of zeroes.)
If you want a slight (moderate?) hint re/ optimal solution, hereās a hint ā the optimal solution is entirely completely different. ā
Ah, I came up with a path to a solution thatās probably closer to optimal for arbitrary N. It goes something like:
We might be able to use Multisets! I needed to read up in Wikipedia on this, I admit - in programming, Iād think of them as the datatype Bag. Then our message is a Bag of parrots, containing some number of each parrot, where parrots have a value 0-255. So the message is a Multiset, with the cardinality of the Multiset large enough to cover the message space, with the message interpreted as a binary number with all the digits concatenated together. So some number in the range 0-N^2^3, for N digits. We establish a 1-1 mapping from all possible Multisets to the numbers in the range to N^2^3 (just some intuitive ordering of the possible multisets will do), then we can send the Bag of parrots, work out which multiset weāve got, and map back to the number, i.e. to the message. Seems like this should work!
Wikipedia tells me that I need to use the āmultiset numberā to work out how many parrots Iād need for any message, and I can write this down on a bit of paper, but my ability to write it onto this forum post is limited by my knowledge of math notation in markup, so this is left as an exercise for the reader
Right, now Iām going to read your hint, see if it was going in this direction or notā¦