Saturday, June 11, 2011

SRM 509


The Editorial for SRM 509 has been published. Lately I have been writing many editorials; it appears all the other approved editorial writers have gotten a life. This is probably a good moment to attempt a recap.

I felt the pressure from the beginning of the match. I had reached my maximum rating after two years, and I had been on a very good streak. The streak will eventually end, and this match felt decisive: I could either keep earning rating or once again drop back and delay the trip for another two years.

Div 1 500: PalindromizationDiv1
At first it seemed like a complicated problem. Then I said to myself, "Why? It is just the usual palindrome problem: try each substring in a dp." So I began to write that solution, and eventually noticed it was not so simple. While coding the straightforward version, I could see that it might be necessary to change the letters to other values before trying another change or a removal/addition. So I was doing a 26*26 loop to find the new values, and that is when I thought of changing the dp state to (substring limits, new value of word[a], new value of word[b-1]). I implemented it, and I had bugs. I overcomplicated everything because I couldn't estimate a good bound for the final result, so my INFINITE constant was 64 bits long and I had to use 64-bit numbers everywhere. Of course, simple logic would tell you that the return type of this problem is int, and since the problem setter did not specify what to do if the result overflows that type, the answer was most likely going to fit in an int.

Eventually, I noticed that the recurrence had cycles, and cycles and memoization do not get along. So I added another parameter that could be 00, 01, 10, or 11 (in binary), holding two boolean values (one per side of the word, indicating whether that side had already been assigned). It passed the examples, so I submitted it.

This is of course very wrong. But I really did not suspect it at all.

Div 1 250: LuckyRemainder
After checking the division summary, I thought to myself that it should be possible to score 240+ points on this problem, and that doing so was perhaps necessary to get a good position and to have insurance in case the 500-point problem failed.

For some reason, after reading the problem I thought that you just needed to take each digit and multiply it by 2^(n-1). This came from misreading something. After implementing that solution, I was getting a wrong answer. So I re-read the problem and noticed that I was supposed to multiply by powers of 10. I then went on to code the overcomplicated solution that I ended up submitting:

sum = 0;
for (int i = 0; i < n; i++) {
    // For X[i]:
    for (int k = 0; k <= n-i-1; k++) {
        int y = pow(2, i);
        // There are y = 2^i ways to choose which of the digits to the left of X[i] are kept.

        int z = pow(10, k);
        // We are considering only the subsequences where X[i] is followed by exactly k kept
        // digits, so it contributes X[i]*10^k.

        // There are C(n-i-1, k) ways to pick those k digits out
        // of the n-i-1 digits to the right of X[i].

        sum += (y * z * C[n-i-1][k] * (X[i]-'0')) % 9;
    }
    sum %= 9;
}



It... was... still... failing the example cases. Before I melted down, I was lucky enough to find out that at the end I was just doing (return sum). What is the problem with that? Well, it was supposed to be (return sum % 9).

In fact, the lack of the %9 at the end was what made the original X[i]*pow(2, n-1) solution fail. That solution was correct, not because of how I misread the problem, but because powers of 10 are always 1 modulo 9. If I had actually returned sum%9 in that original solution, I could have gotten around 246 points out of sheer and total luck. Instead I ended up with a very low ~220 points.
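For reference, the simple solution ends up being just a few lines. This is a sketch of my own (assuming X arrives as a string of digits and the required answer is the whole sum taken modulo 9):

#include <string>
using namespace std;

// Sketch of the simple solution: since 10^k = 1 (mod 9), every appearance of a
// digit contributes just the digit itself, and each digit appears in 2^(n-1)
// of the subsequences.
int luckyRemainder(const string &X)
{
    int n = X.size();
    long long p = 1;                          // 2^(n-1) mod 9
    for (int i = 0; i + 1 < n; i++) p = (p * 2) % 9;
    long long digitSum = 0;
    for (char c : X) digitSum += c - '0';
    return (int)(digitSum % 9 * p % 9);
}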

Since I had very low scores for both the 250 and the 500, and many, many people had submitted something for the 500, I was actually in a very low position. I decided to make sure my solutions were correct. I kept triple-checking everything and it all seemed fine (it was not). When there were around 5 minutes left in the match, I decided to try to find challenge cases. When first reading the constraints for the 500, I had thought that duplicate operations with different costs were allowed, so my solution took care of that. But when I tried to craft a case against solutions that did not handle it, it turned out the constraints actually specified that duplicates were not allowed.

Second try: I had also noticed that change operations are uni-directional, and that if I modified my solution to believe change operations work both ways, it still passed the examples. So I tried to craft a case that would make such solutions fail. I ended up creating: "vex", {"change s v 1", "erase x 1", "add s 1"}. (Never miss a chance to sign your challenge cases when it is possible.)

My original solution returned -1 in that case, but if I made that solution consider bidirectional change operations, it returned 3.

... Wait a second. That is when I noticed that 3 is actually the correct result for this case! Not because "change s v" equals "change v s", but because it is possible to first erase x, then add s and change that s to v. That is something I completely failed to handle! And just as I noticed that my solution was wrong, the coding phase ended.

Challenge phase
That test case led me to notice that there are many, many things to get right in the div1 500. It also helped me notice that the example cases are very weak. Before the challenge phase I whispered to the admins that they should probably prepare themselves to watch a challenge phase spectacle.

With a failed 500 and a slow submission for 250, I knew that my last chance was to have a successful challenge, and that with so many submissions for 500 in my room, it was going to be perfectly possible.

For a second, I considered spending my challenge phase going through all the solutions, but I am not that aggressive. So I just opened the solution of a blue coder, though not the lowest-rated blue coder, because that is where aggressive challengers usually begin. After reading the solution a bit I noticed that it only considered the add operation on its own, without combining it with erase or change. So I fed it my challenge case and the challenge was successful.

I tried to challenge more solutions, but I wasn't so lucky; every time I considered challenging someone, the solution had already been challenged. I also noticed many clever things, like using Floyd-Warshall, which is something I should have tried.

Outcome
Thankfully, my successful challenge was good enough to reach the top 140, which was exactly what I needed to reach a 2100 rating. There is a red stripe in my rating graph! This was my dream objective for the first semester of 2011, so I am glad. But it will only increase the pressure in the upcoming matches.

The editorial
After the challenge phase ended I could not really write a blog post explaining the problems, for one because I still didn't have a good way to solve the div1 500. I knew how to solve it using a transitive closure, Floyd-Warshall style, reducing the add, remove, and change operations by taking the other operations in between into account, but boy was that hard to implement and explain. I then had to go do a very time-consuming, unrewarding group assignment.

I was, once again, assigned the editorial. I still had the problem of not knowing an explainable solution for the 500.

One of the many things I thought of while trying to come up with an easy solution for the 500 was Dijkstra. If you think about it, it is never necessary to add a character to the left when the leftmost character no longer equals the character at that position of word, and the same holds for the right side. So you can represent a state as [a][b][wa][wb], where a and b are the limits of a substring and wa, wb are the current characters at those limits. Dijkstra, because the graph is cyclic. In each state you can remove the limit characters, modify them, or add characters (just make sure never to add a character to a side whose letter is already different). I figured there are 2500*26*26 different vertices with around 50 edges each, so Dijkstra was supposed to work. After implementing it, I was disappointed to see that it timed out on exactly one case. Apparently the overhead was too large.

I eventually decided to open the writer's solution in the practice room. It was genius: use an empty letter LAMBDA as a valid letter, so an add operation becomes a change from LAMBDA to 'a' and an erase becomes a change to LAMBDA. This is the solution I explained in the editorial, and it makes everything so much nicer.
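The whole thing then looks roughly like the following sketch. This is my own reconstruction of the idea with made-up names, assuming the word has at most 50 characters and that -1 is expected when no palindrome can be reached; the editorial has the authoritative version:

#include <string>
#include <vector>
#include <sstream>
#include <algorithm>
using namespace std;

const int LAMBDA = 26;                  // the "empty letter"
const long long INF = 1LL << 50;

long long minCostPalindrome(const string &word, const vector<string> &operations)
{
    // cost[x][y]: cheapest way to turn letter x into letter y.
    // "add c w" is an edge LAMBDA -> c, "erase c w" is an edge c -> LAMBDA.
    long long cost[27][27];
    for (int i = 0; i < 27; i++)
        for (int j = 0; j < 27; j++)
            cost[i][j] = (i == j) ? 0 : INF;

    for (int t = 0; t < (int)operations.size(); t++) {
        stringstream ss(operations[t]);
        string kind; ss >> kind;
        if (kind == "add") {
            char c; long long w; ss >> c >> w;
            cost[LAMBDA][c-'a'] = min(cost[LAMBDA][c-'a'], w);
        } else if (kind == "erase") {
            char c; long long w; ss >> c >> w;
            cost[c-'a'][LAMBDA] = min(cost[c-'a'][LAMBDA], w);
        } else {                        // "change a b w"
            char a, b; long long w; ss >> a >> b >> w;
            cost[a-'a'][b-'a'] = min(cost[a-'a'][b-'a'], w);
        }
    }
    // Floyd-Warshall over the 27 letters: a conversion may pass through
    // intermediate letters, including LAMBDA itself.
    for (int k = 0; k < 27; k++)
        for (int i = 0; i < 27; i++)
            for (int j = 0; j < 27; j++)
                cost[i][j] = min(cost[i][j], cost[i][k] + cost[k][j]);

    int n = word.size();
    // dp[i][j]: minimum cost to make word[i..j-1] a palindrome.
    static long long dp[51][51];
    for (int len = 0; len <= n; len++)
        for (int i = 0; i + len <= n; i++) {
            int j = i + len;
            if (len <= 1) { dp[i][j] = 0; continue; }
            int a = word[i]-'a', b = word[j-1]-'a';
            long long best = INF;
            for (int c = 0; c <= LAMBDA; c++) {
                // Turn both ends into letter c (c == LAMBDA erases them both).
                best = min(best, cost[a][c] + cost[b][c] + dp[i+1][j-1]);
                // Pair word[i] with a freshly added c on the opposite side
                // (c == LAMBDA degenerates into simply erasing word[i]).
                best = min(best, cost[a][c] + cost[LAMBDA][c] + dp[i+1][j]);
                // The symmetric option for word[j-1].
                best = min(best, cost[b][c] + cost[LAMBDA][c] + dp[i][j-1]);
            }
            dp[i][j] = best;
        }
    return (dp[0][n] >= INF) ? -1 : dp[0][n];
}

On the author's challenge case ("vex" with change s->v, erase x, add s, each of cost 1), this sketch returns 3: erase x, add s, change it to v.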

Sunday, June 05, 2011

After IPSC

IPSC has ended. Ever since I found this tournament years ago I have not missed it, because it is always a very interesting and fun experience. Since no other important tournament round happened this week, let us talk about it. The solutions booklet is already out, so I won't bother explaining the problems; the booklet is well written.

Waking up at 5:55 AM
I am more of the 7:00 AM kind, you know. Normally I would probably have had no problem waking up early for the tournament, but yesterday I had to stay up late writing a certain editorial that will be published soon. I woke up at 5:55 AM but went back to sleep until 6:30-ish.

Then you have to download the tar.gz that contains everything needed to solve the problems. I slowly opened all of the problems and the submission and standings pages, emptied my contest folder so that I could place source code there, and finally took a look at which problem had been solved the most. It was clearly A.

A - The rock, paper, lizard, Spock one
This was very easy, even the large input. There are always two objects that beat any given object, so it is always possible to pick one that is different from the previous one. A quick pair of points.

M
Last year I decided that it is best to avoid these problems that just give you negative penalty time. They are not that fun to solve and they don't really help your ranking that much. The time you spend on them could be used to solve the other problems.

B
I thought to myself: "You are a problem setter and you love maze BFS problems; if you don't solve at least B1, shame on you!" So, I gave it a try.

I wrote a program that generated random mazes and called the provided C++ source code until max_size reached 5000, and it found a maze absurdly quickly. I submitted it and... it was wrong. For some reason the random search was reaching max_size > 5000, but when using the same maze in a fresh instance of the program the result was 1029. Eventually I noticed why. I do not know whether this was intended and the problem setters are evil geniuses, but the C++ code does not clear its queue initially, so if you try the random approach it starts doing funny things after the first execution. After fixing the queue bug, it seemed that any random input would yield a size of around 1200 tops. So I tried to find a pattern in small cases that would maximize the queue size. Eventually time ran out; I could not believe I had already spent over an hour on this problem, so I decided to try other things...

When I opened the solutions booklet, I got really excited about the beautiful fractals that were required to solve this problem, but I have to wonder exactly how you would generate them. Oh well.

The next easiest problem seemed to be D1.

D1
I noticed that you cannot evenly divide a grid in two when both the number of rows and the number of columns are odd. Then I noticed that having at least one even dimension is not only necessary but also sufficient. A cool problem. I gave some thought to D2, but decided it was better to try the other easy problems first.

I1
You had to write a program in a strange ASM-like language. Negating 7 integers with a single NOT operation is actually very easy: pack them together into one register using shift and OR operations, negate that register, and then revert the shifts and ORs to get the 7 numbers back.
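As a small illustration of the packing trick in a familiar language (plain C++ rather than the contest's ASM-like language; the 8-bit values are just an arbitrary choice for the example):

#include <cstdint>
#include <cstdio>

int main()
{
    uint8_t v[7] = {1, 2, 3, 100, 200, 7, 42};

    // Pack the seven 8-bit values into one 64-bit register with shifts and ORs.
    uint64_t packed = 0;
    for (int i = 0; i < 7; i++)
        packed |= (uint64_t)v[i] << (8 * i);

    // A single NOT flips every bit of the register, negating all seven values at once.
    packed = ~packed;

    // Revert the shifts to recover the seven (now bitwise-negated) values.
    for (int i = 0; i < 7; i++) {
        uint8_t negated = (packed >> (8 * i)) & 0xFF;
        printf("%d -> %d\n", (int)v[i], (int)negated);   // negated == (uint8_t)~v[i]
    }
    return 0;
}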

So far the contest is going well.

F1
There are 3^12 choices for the decision matrix, and for each matrix you can try the 8 different possibilities for the coin flips to calculate the probability that the matrix yields. 3^12 * 8 is not that high, so a simple brute force through all the matrices, remembering the best one, is enough.

The final result made plenty of sense. Whenever the other two players both show an H, guess tails; if they both show a T, guess heads; otherwise, pass. What happens is that whenever a guess is right, the other two players will pass (because they each see both an H and a T).
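The brute force itself is tiny. Here is a sketch of how I would write it (my own code, under the assumption that the team wins exactly when at least one player guesses and nobody guesses wrong, which is how I read the problem):

#include <cstdio>

int main()
{
    const int TOTAL = 531441;                    // 3^12 decision matrices
    int bestWins = 0;
    for (int s = 0; s < TOTAL; s++) {
        // Decode the matrix: entry (player, observed pair) -> 0 = guess H, 1 = guess T, 2 = pass.
        int cell[3][4];
        int t = s;
        for (int p = 0; p < 3; p++)
            for (int o = 0; o < 4; o++) { cell[p][o] = t % 3; t /= 3; }

        int wins = 0;
        for (int flips = 0; flips < 8; flips++) {    // the 8 equally likely coin outcomes
            int coin[3] = { (flips >> 0) & 1, (flips >> 1) & 1, (flips >> 2) & 1 };
            bool someoneGuessed = false, someoneWrong = false;
            for (int p = 0; p < 3; p++) {
                int a = coin[(p + 1) % 3], b = coin[(p + 2) % 3];
                int action = cell[p][2 * a + b];
                if (action == 2) continue;           // pass
                someoneGuessed = true;
                if (action != coin[p]) someoneWrong = true;
            }
            if (someoneGuessed && !someoneWrong) wins++;
        }
        if (wins > bestWins) bestWins = wins;
    }
    printf("best probability = %d/8\n", bestWins);   // prints 6/8 for the strategy above
    return 0;
}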

H1
Oh, that language was crazy. At first I did not get why H2 was supposed to be harder than H1: just type a - after the + and it reverts the effects of the + character. Anyway, I noticed that digits are allowed and that q can type the source code itself. A digit does nothing while there is no buffer yet. So we could just do TEAM_DISq. The q would print itself, though, so we need to remove it somehow. I noticed the ! and ? commands, which remove the last or first 47 characters, so I added another 45 useless characters to the source code and put a ! at the end. That was wrong, because ! is the one that removes the first 47 elements. I replaced it with a ? before reading the result of the previous submission, which said: Detected a ! in buffer.

It turns out the buffer can never contain non-alphanumeric characters. It wasn't a big deal; I just used the O command to remove the characters instead. It removes the positions that are primes or powers of two, so I placed the useless characters at those positions.

However, this condition, which I didn't notice at first, is what explains why H2 is much harder: a + in the source code would make you unable to use q.

What now?
There was only one hour left, and it seemed that the next easiest problem was D2. I noticed that if at least one of r or c is even, it is always possible to make a connected output. The problem was building it. After considering many cases I finally thought of something that could work. But while coding it I noticed that I had forgotten to handle cases with r=1 or c=1; my construction rule didn't work for them. By then it was already 1 minute before the end of the contest, so I gave up.

The end
Final ranking position: 216. That was underwhelming. Wait, so you mean this was a team contest? Ok, what happens if I filter out the multi-person teams? Position 39 out of at least 250, still fairly low. What? You mean a single Russian guy was able to get 20 points all by himself? Almost three times the points I got? Oh...

The only thing that bugs me about IPSC is that there is never really enough time for me to try all of the problems. There are always good problems that you don't even get to attempt. I guess that is because the contest is designed for three-person teams. The problems are usually so interesting that it is hard to read all the statements before starting to solve one, because you can't resist trying to solve the one you just read.

Thursday, June 02, 2011

Thoughts after SRM 508

It was a good match overall but I made a couple of mistakes that really penalized my score.

Div1 500 - YetAnotherORProblem
(Link to problem statement)

The sum must be equal to the binary OR between all the elements of the sequence. That is another way to say that at no moment should there be carry operations when adding the numbers together (in binary). As long as there are no carries, then the sum and the binary OR will be equal.

So the trick is that, for each bit position i, at most one number in the sequence may have a 1 in its i-th bit.
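In code, the observation amounts to this little check (an illustration of my own, not something the actual solution needs to run):

#include <vector>
using namespace std;

// The sum equals the OR exactly when no bit position is used by two
// different numbers, i.e. when adding them never produces a carry.
bool sumEqualsOr(const vector<long long> &seq)
{
    long long usedBits = 0;
    for (int i = 0; i < (int)seq.size(); i++) {
        if (usedBits & seq[i]) return false;   // a shared bit would produce a carry
        usedBits |= seq[i];
    }
    return true;
}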

At first I thought of an idea for a dp solution: keep a set of the bit positions that already contain a one, then go through each position of the sequence and pick a combination of ones and zeros for it such that:
- The combination of ones and zeros is <= the corresponding element of R.
- If the i-th bit is in the set of bits that already have a 1, then the i-th bit of the combination must be 0. Otherwise it can be either.

Then I noticed that the constraints (R[i] up to 10^18) allowed very large numbers of up to 60 bits. 60 bits means there are 2^60 different sets, and that is a lot.

After some minutes I eventually noticed that all the constraints are rather large, except the constraint on the number of elements of R, which is at most 10. Always pay attention to the constraints and consider which one is suspiciously small. This means that although the solution cannot be exponential in the number of bits of the input numbers, it is fine for it to be exponential in the number of elements of the R array.

So the dp idea is basically the same, but in reverse: instead of considering the numbers one by one, consider the bits one by one, starting with the left-most bit. Keep a set of which of the N numbers are already strictly less than their respective R[i] (once one of its more significant bits is smaller than in R[i], you can use either 0 or 1 for the remaining bits; until then, wherever R[i] has a 0 bit you are forced to place a 0). Since we visit the bit positions from 60 down to 0, at each bit we assign something to every number of the sequence.

One obvious choice is to pick 0 for this bit in every number of the sequence; this may alter our record of which numbers are already smaller. The other option is that exactly one of the elements of the sequence gets a 1 in this bit, which also modifies the set.

So we get a recurrence that is not cyclic, since it keeps decreasing the bit number; all we need is dp to finish it. During the match I initially had issues debugging. Besides the typical typing mistakes, I also made the mistake of using 1000000007 as the modulo instead of 1000000009. Always pay attention to those small details.


struct YetAnotherORProblem
{
    long long mem[1<<10][63];

    vector<long long> R;
    int n;

    long long rec(int mask, int bits) {
        long long & res = mem[mask][bits];
        if (res == -1) {
            res = 0;

            if (bits == 0) {
                // No more bits left; the bits assigned so far already
                // satisfy the property, so this is one valid way.
                res = 1;
            } else {
                int p = bits - 1;
                // Assign bits for the p-th position:

                // Case 1: every number gets a 0 in this position.
                int zmask = mask;

                // Placing a 0 where R[i] has a 1 makes number i strictly
                // smaller than R[i] from now on. Update the mask accordingly
                // (a set bit in the mask means "already smaller than R[i]").
                for (int i = 0; i < n; i++) {
                    if ( !(mask & (1<<i)) && (R[i] & (1ll<<p)) ) {
                        zmask += (1<<i);
                    }
                }
                res = rec(zmask, bits-1);

                // Case 2: exactly one number, the o-th, gets a 1 in this position.
                for (int o = 0; o < n; o++) {
                    int nmask = zmask;
                    if ( !(mask & (1<<o)) ) {
                        if (R[o] & (1ll<<p)) {
                            // We based the mask on the all-zeros case, but the
                            // o-th number now has a 1 here just like R[o] does,
                            // so its "already smaller" flag must not be set.
                            nmask -= (1<<o);
                        } else {
                            // R[o] has a 0 here and the number is still tied
                            // with R[o]; a 1 would exceed it, so skip.
                            continue;
                        }
                    }
                    res += rec(nmask, bits-1);
                }
                res %= 1000000009;
            }
        }
        return res;
    }

    int countSequences(vector<long long> R)
    {
        n = R.size();
        this->R = R;
        memset(mem, -1, sizeof(mem));
        return (int)rec(0, 62);
    }
};


Div1 250 - DivideAndShift
Link to problem statement

By the time I finished coding the 500-pointer, I was 15th in the division and first in my room. It seemed many people were taking a long time to solve the 250. I knew that as long as I got around 220+ points on it, I would be fine.

I opened the 250 and... it seemed hard. But not for long. The main point, I think, is to notice that it is never necessary to perform shift operations before division operations. For example, assume you shift before issuing a division that turns a list of 10 numbers into 5 groups of 2 numbers. If M=1, a single shift right converts M to 2, and after doing the division the result is the same as if you had first done the division and THEN shifted right.

It is harder to see when M=1 and the shift is to the left. In that case M becomes 10. It seems like this changes things, but in fact it only changes which group you keep; after dividing and keeping the last group, M becomes 2, which is the same as if you had shifted AFTER dividing into groups.

So just try all possible ways to divide the cards and then calculate the number of shifts required. The total cost is the number of divisions plus the minimum number of shifts. The minimum number of shifts is easy to get: if the groups contain X elements, then M%X gives the position of M relative to its group. Everything becomes easier if you first decrement M and treat positions as 0-based.

To do the divisions, note that after performing enough divisions by prime numbers, the size of the group you keep can be any divisor of N. So you can just iterate through all divisors d (doable in O(sqrt(N))), count the number of divisions necessary to reach that group size, which equals the number of prime factors of N/d, and then calculate the shift cost.

The minimum among all of those total costs (divisions plus shifts) is the result.
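Putting it all together, a sketch of the whole approach could look like this (my own reconstruction and naming, not necessarily the official signature):

#include <algorithm>
#include <climits>
using namespace std;

// N cards in a row, card M (1-based) is the special one. A division by a prime
// keeps the group containing card M; a shift rotates the row by one position.
int minimumOperations(int N, int M)
{
    M--;                                       // 0-based position of the special card
    int best = INT_MAX;
    for (int i = 1; (long long)i * i <= N; i++) {
        if (N % i != 0) continue;
        int sizes[2] = { i, N / i };           // both are possible final group sizes
        for (int t = 0; t < 2; t++) {
            int d = sizes[t];
            // Divisions needed: number of prime factors (with multiplicity) of N/d.
            int x = N / d, divisions = 0;
            for (int p = 2; (long long)p * p <= x; p++)
                while (x % p == 0) { x /= p; divisions++; }
            if (x > 1) divisions++;
            // Shift cost: bring position M % d to the front of a group of d cards.
            int pos = M % d;
            int shifts = min(pos, d - pos);
            best = min(best, divisions + shifts);
        }
    }
    return best;
}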

During the match, I needed plenty of time to find a bug that was causing the last example to fail. It was only after a long wait that I finally noticed it: I was only performing left shifts in my solution, because I had not yet noticed that right shifts were also allowed.

Always read the problem statement paying attention to details. The delay cost me so much that I dropped many positions from the very hopeful situation I initially had.

Div1 1000
After (and while) triple-checking everything in the 250 and the 500, I opened the 1000-pointer. It is probably one of the most horrifying problems. I sort of figured that the solution is probably much easier than the problem appears, but I guess appearances are important to me. In the end, many people solved it correctly, so in a strange way this problem was apparently easier than most of the recent 1000s. That is strange.

Challenge phase
I was either going to have a great position or fail the 500 and get a very bad one. In either case a successful challenge would not be that helpful, and the penalty for a failed challenge seemed like too much. I decided to be passive and avoid any trouble.

System tests
My solutions survived and that is great. I reached 2075, which is an all-time maximum. Hopefully I will be able to reach 2100 before the first semester of the year is over. That would allow me to make 2150 the target for the year.