Saturday, July 09, 2011

TopCoder open 2011 Round 3 Part 1: ArtShift

A common question I get is why didn't I participate in this year's TopCoder Open. The thing is that in order to submit problems to be used in the TCO you are not able to continue competing in the algorithm track. I made the scarily pragmatic choice of picking a decent probability of getting at least one problem approved for TCO over the very small probability to advance to the on site rounds.

The plan wasn't working out great until this week in which I finally got assigned a couple of problems for an online round (not qualification). Two of my problems were assigned to be the 250 and the 1000 pointers in this match.

This morning, I opened the 500 and the worry began that maybe the 1000 is too easy. It also seemed like the other writer had issues solving the 250, so maybe it was too hard as well.

250: ArtShift
Link to statement
I didn't really expect such a low success rate in this problem (51%). It turns out there was a good hidden story about corner cases that I was not aware off.

The main idea of this problem is to notice that in the "normal" case in which you have a sequence of n different colors. Let us say {1,2,..., n} Then the solution is:

Max( lcm(x1,x2,...) where x1+x2+...<= n )

Well, that's right. For example, imagine if the colors were {1,2,3,4,5}. Then an art shift that affects all 5 elements is : {1,2,3,4,0} and in total 5 different sequences will be reachable: {1,2,3,4,5}, {2,3,4,5,1} , {3,4,5,1,2}, {4,5,1,2,3} and {5,1,2,3,4}. Note that the shift just rotates the permutation by 4. And it causes a cycle between all five elements. Now, cycle is the great idea here. Imagine an art shift that was like {1,2,0,?,?} Then the cycle is only between the first 3 elements. The first three elements will cycle like this: {1,2,3,?,?}, {2,3,1,?,?}, {3,1,2,?,?}. For the remaining two indexes, we can make them belong to a cycle: An art shift {?,?,?,4,3} will make the last two indexes cycle like this: {?,?,?,4,5}, {?,?,?,5,4}, {?,?,?,4,5}, ... . If we combine the two ideas together we get the shift: {2,3,1, 4,5}, and the nice thing is that for each of the three parts of the first cycle, there will be two ways of the second cycle, in total we have 2*3 = 6 different sequences:
{1,2,3, 4,5}, {1,2,3, 5,4}
{2,3,1, 4,5}, {2,3,1, 5,4}
{3,1,2, 4,5}, {3,1,2, 5,4}

Now note that the order doesn't matter, we just need there to be two different cycles, one with three indexes and one with two indexes: {1,0, 4,2,3} will also allow 6 cycles. But if in a different case, one cycle had length 4 and the other had length 8. Multiply 4*8 would not work, because in fact we need the LCM(4,8) and not the product 4*8. So, for a total of n elements, we need to add the element to one cycle such that the LCM of all the cycle lengths is as large as possible.

Max( lcm(x1,x2,...) where x1+x2+... <= n )

Where lcm(a,b,...) is the Least common multiple of a,b,...

That solves the case with n uniquer colors. And it is also a common problem , so most people in the match probably knew it. But how does it help us in the problem where there are repetitions and only two colors?

The trick is to notice this: Imagine a sequence of all-white colors: WWWWWWWW . Then we cannot really make a cycle of size different than 1. But if we instead had WWWWWWWB then we can actually have a cycle of length 8. It turns out that all that matters is for there to be two different colors in the part of the sequence that we want to make a cycle. Assume that we made a cycle of length 2 out of WWWWWWWB, then the cycle would have to have used the sole black color and all that is left is 6 W: WWWWWW , we cannot do any cycle anymore, so we can only do at most one cycle of size larger than 1 in that case. Now, imagine WWWWWWBB, we can now do two different cycles: WWB and WWWB. Because this time there are two B characters. Since each cycle will need at least one white color and one black color, then the maximum number of cycles is min(count of W colors, count of B colors).

In fact, once we know that we can conclude that the answer is:

Max( lcm(x1,x2,...,xt) where x1+x2+...+xt and t<= min(count W, count B) ) <= n

All that is left to do is to find a quick way to get such a maximimum lcm. It turns out that bruteforce is fast enough. For example, a recurrence that remembers the current lcm value, the minimum unused number, the remaining number of cycles and the maximum cycle number we can use so far.


It turns out the second writer (All legendary soul-net) had issues with this problem because the original problem I mentioned earlier is a very well-known one and it is also well-known that the used cycle lengths would be primes or powers of prime numbers. But in the version with a limit on the number of cycles, this is not true. It also turned out to be the reason many people failed the problem. I really, didn't know about this. I was actually aiming to have a large success count, and thus if I knew about this I would have included a case that needed a composite non-prime power cycle length.

I initially thought that this problem was going to be a 500. The constraints were n<=100. In fact, bruteforce will work with 100 as well, but it needs some cropping.

(Part 2: The 1000 pointer)

No comments :