Thursday, April 24, 2014

SRM 618: When python failed

This is the eighth of ten matches in which I promised to use python exclusively. This is also the first match in which I feel like I can claim I lost plenty of points because of this restriction.

Div1 250: The one with heteronormativity

So I am supposed to be solving this problem, and all I can think of after reading for the first time is that the problem's definition of family allows incest but is all about not allowing same-gender pairings :/ okay. ..

So a "family" is defined as a graph, in which each edge either has zero incident edges or exactly two incident edges. The two incident edges MUST be from two nodes with different genders. Also the graph must be topsorted. Given a graph of up of 100 nodes return if it could be a family or not. Basically, you have to assign genders to each node. It turns out that the input format guarantees that the graph is topsorted and that each node has 0 or 2 incident nodes. So all that we need to do is ask if we can assign the genders correctly.

If two nodes appear as the parents of a node, then their genders must be different. If we imagine a different graph, in which each of these pairs of nodes that must have a different gender are connected by a single edge. Then the graph must be bipartite. If the graph is bipartite, then the answer is that Yep, the graph could be a "family". Else not. Checking if a graph is bipartite is an easy application of the Depth-first search.

class Family:
 def isFamily(self, parent1, parent2):
    n = max(max(parent1),max(parent2)) + 1
    # Make a set of edges in the hopefully bipartite graph, they are undirected 
    E = set(zip(parent1, parent2)) | set( zip(parent2, parent1) )
    E.remove( (-1,-1) )
    
    # is the graph bipartite?
    color = [-1] * n
    result = [ "Possible" ]
    def dfs(x, c):
        if color[x] == -1:
            color[x] = c
            for (u,v) in E:
                if u == x:
                    dfs(v, 0 if c == 1 else 1)
        elif color[x] != c:
            result[0]  = "Impossible"
        
    for i in range(n):
        if color[i] == -1:
            dfs(i, 0)
        
    return result[0]

I took a bit of time mostly because of the above-mentioned distraction :P, but python helped me keep the code clean. I think it should pass unless 100x100 is suddenly too slow (I doubt it).

Div1 500: The one with strings.

You should count the number of maximal-length strings that use an alphabet of `n` letters and follow these properties:

  • No consecutive equal letters.
  • For each pair of (possibly equal) characters X and Y, string "XYXY" is NOT a subsequence of the string.

The first issue is how to know what is the maximum length possible. After plenty of analysis, I found that the maximum length possible is 2n-1. However, there are two methods to achieve it:

  • One way, is to surround a string of 2(n-1)-1 length using (n-1) alphabet characters, using two instances of one character different to those (n-1) ones. For example: A????????????A, where ????? contains no A.
  • The other way has 3 instances of the extreme character. A??????A!!!!!!!A, but also ?????? and !!!!!! cannot share characters.

So the rest is to count the number of valid patterns. It is not necessary to worry about which character goes to which position of the pattern, because we can just multiply the result by `n!`.

The rest is an `O(n^2)` dynamic programming, unfortunately, `n` is up to 5000 (I wish it was 1000), so no matter what I do, the thing will take at least 6 seconds. That's too much.

More info about the recurrence in the comments of the code:

MOD = 1000000007

def modFactorial(x):
    r = 1
    for i in xrange(1, x + 1):
        r = (r * i) % MOD
    return r
        

class LongWordsDiv1:
 def count(self, n):
    # The maximum number of letters is 2*n-1, although there are multiple ways 
    # to reach that maximum
    # Example, for n = 5
    # 012343210
    # 010234320
    # 010232420
    # 012103430
    # 012321040
    # 012131040
    # there should be 9, but can't think of the other 3
    
    dp = [0] * 5001
    dp[0] = 1
    dp[1] = 1
    for i in xrange(2,n+1):
        #1) 2n - 1 = 2 + (2(n-1) - 1)
        res = dp[i-1]
        #2) find a,b: 2n - 1 = 3 + 2a + 2b - 2
        #                 b  = (2n - 1 - 3 - 2a + 2) / 2
        #                 b  = (2n - 2a - 2) / 2
        #                 b  = n - a - 1
        a = 1
        while a < i:
            b = i - a - 1
            if b > 0:
                res += (dp[b] * dp[a]) % MOD
            else:
                break
            a += 1
        dp[i] = res

    return ( dp[n] * modFactorial(n) ) % MOD

It is possible I missed something in this analysis, though.

Monday, April 21, 2014

Topcoder SRM 617 recap: regular

This SRM took me by surprise, I think the programming contest calendar I use wasn't updated correctly or I didn't read it correctly the first time I read it, so I thought SRM 617 was another date. On Sunday, I got the Topcoder email and surprise!. It's a good thing that we no longer have that bug in which invitation emails don't get sent...

This is the 7-th of 10 SRMs in which I am using python exclusively. So far I don't think it has caused any negative impact to my rating. I think that unless you are in such a high rating position that you really need frequent div1 1000s, python's drawbacks are not a big deal and the pros tend to compensate for the cons :/

Div1 250: The one with cutting cake

You have the nonsensical notion of a one-dimensional cake of length `n`. Your friends are going to come visit, all you know is that the number of friends will be a proper divisor of `n`, but you are not sure which one. You want to make cuts to the cake in such a way that you can distribute the contiguous pieces evenly to your friends, no matter what the number of friends is. The pieces a friend receives may be of different length and the number of pieces each friend gets is not necessarily always the same, the only condition is that they are contiguous after cutting the cake and that the total length of cake each friend receives is equal. Return the minimum number of cuts needed so that this always works, regardless of the number of friends.

Well, the key is that the amount of cake should be the same for each friend, so for each `y` that is a proper divisor of `n` and `x = n/y`, there MUST exist cuts at `x,2x, 3x, ... `, The problem asks for the minimum, so we shouldn't do any more cuts than those `x,2x, 3x, ... `, for each `x`. The trick is that some values `kx_0` and `rx_1` may be the same for different divisors `y_0, y_1` . So we should just count all the numbers that are equal to some `k * (n//y)` where `k` is a integer, `y` is a divisor of `n` and `k * (n//y) < n`.

So simple simulation will be fine. During the match, I had the feeling that with `n <= 100000` and the 2 seconds limit, this simulation would be a bit slow for python. I did things like getting the divisiors in `O(sqrt(n))` time. And even after that, I made sure to test the worst input. The worst input is the number `n <= 100000` that has the most divisors. This is equal to the largest possible highly composite number or 83160 , in short. It turns out it wasn't necessary, also, with some knowledge of iterators and sets, the code can be much simpler than the code I submitted during the match:

class MyLongCake:
 def cut(self, n):
    cuts = set()
    for d in properDivisors(n):              # for each proper divisor d:
        cuts |= set( xrange(n/d, n, n/d) )   # add n/d, 2n/d, 3n/d, etc to set
    return len(cuts) + 1                     # return number of cuts + 1

def properDivisors(n):
    return (i for i in xrange(1, n) if n % i == 0)

During the challenge phase, you'd figure from reading other people's codes that the integers that get cut are exactly those who are not coprime with `n`. So result is actually `n - phi(n)`

Div1 800: strings

I opened this problem accidentally when I intended to open div1 500 :/ . It seemed like I had no idea how to solve it. Also, when a problem score is much smaller than usual, it is possibly much harder than usual (Sorry admins, everyone knows you are backward when choosing problem scores). so I skipped to div1 500, a bit too late, perhaps.

Div1 500: The one with pie and dolphins

You have 50 programmers, for each `i < n`, where `n <= 1000`, you should choose between:

  • Giving a dolphin to programmer `C_0[i]` and a pie to `C_1[i]`.
  • Or
  • Giving a dolphin to `C_1[i]` and a pie to `C_0[i]`.

After you give all these pies and dolphins, each programmer will calculate the absolute value of the difference between the number of pie and dolphin they got. You want to minimize the total sum of all these absolute values.

ooh boy. So I stared at the computer screen for more than half an hour. Everything I thought was non-sense for a good while. Eventually, I decided that the problem was most likely greedy. So the new issue was how to think of a greedy solution. If you process the decisions in the order of i, it is difficult to predict what will happen later and how it should affect your decision. But there is no need to do that. You can really do a decision whatever time you want. A decision will change the result in -2,-1,0,1 or 2, points. What we can try, is repeat this 1000 times: Pick the decision (of all not picked yet) that will change the score for the best. This is easily implemented through just simulation. Complexity is `O(T^2)` where `T` is the number of choices.

I was typing this, and had some bugs. It took longer than usual to debug because it is not easy to know if an answer is correct or not. (This was the first Topcoder problem ever that allowed multiple correct answers to the same test case). Eventually, I found out that in two lines I had i where I meant j. Too bad, coding phase already ended. I tried the solution in the practice room after fixing it , and it passes although the execution time gets awfully close to 2 seconds .

class PieOrDolphin:
 def Distribute(self, choice1, choice2):
    choices = zip(choice1, choice2)
    n = 50  # values of choice1, choice2 go from 0 to 49, we can assume 50 people
    pie = [0] * n
    dolphin = [0] * n
    res = [ 3 ] * len(choices)
    for i in range(0, len(choices)):
        best = (3, (0,0), -1, 3)
        for (j, (a,b) ) in enumerate(choices):
            if res[j] == 3:
                # try (a,b) or (b,a) for choice:
                for (x,y,z) in ( (a,b,1), (b,a,2) ):
                    # calculate how the score would improve
                    p = abs(pie[x] - dolphin[x] + 1) -  abs(pie[x] - dolphin[x])
                    q = abs(pie[y] - dolphin[y] - 1) -  abs(pie[y] - dolphin[y])
                    best = min(best, (p+q, (x,y), j, z) )

        (x , (a,b), bi, t) = best
      # print (a,b),' would improve by ', x, ' index ',bi
        res[bi] = t
        pie[a] += 1
        dolphin[b] += 1
    return tuple(res)

So why is this correct? Well, no idea, but I thought it should be correct because , in theore, each decision is used in the best way ever. And each decision should reduce the effectiveness of at most one decision, in theory. I have nothing formal to prove.

I know that the real answer involves turning the thing into a graph and doing some proofs to show that it is easy. I think that this graph solution eventually reduces to this greedy. But no idea yet.

So?

I think it was a fine problem set, if a bit unusual (odd problem scores, flexible answers in div1 500).

Wednesday, April 16, 2014

SRM 616 recap and python review

It was a fine match. Actually the first SRM problem set by tourist. Well, it is also the sixth match of 10 matches in my python challenge. Having finished 60% of it, I have a new perspective. At this moment, I think that when the challenge ends, in SRM 621, I might actually still use python. At least in problems where I am sure it can shine.

Div1 250 - The one with alarm clocks

Match starts, open the easy problem in division 1. Meet a simulation problem. problem statement. In this problem, you start with a value `S` at time 0. Every second, S is increased by `D`, then the alarms scheduled to run at that second ring. Each alarm that rings will decrease `S` by some special value specific to the alarm. We have the alarms periods (Between 1 and 10), first ring time (between 1 and its period) and volume (the value by which `S` is decreased). Each alarm rings at times: start, start + period, start + 2*period, etc. There are up to 50 alarms. When `S` becomes 0 or less , you wake up immediately. Find the maximum starting value of `S` such that you will eventually wake up. If no matter what value of `S` you pick, you wake up eventually, return -1.

We can combine the cycles together by taking the LCM of all their periods. Because the periods are not more than 10, the combined cycle length is not more than 2520 = 2*2*2*3*3*5*7. You can answer the question by just simulating the first LCM seconds. For more info, read the editorial.

This is a problem that does okay in python. Python can really help keep the code small in comparison to the other languages. I took some advantage of it during the match and got a good score. But in reality, my original code could be improved in many ways. For example, did you know there is a builtin GCD function?.

from fractions import gcd
def lcm(x,y):
    return x * (y / gcd(x,y))

class WakingUp:
 def maxSleepiness(self, period, start, volume, D):
    sleep = 0
    n = 0
    for i in xrange(1, reduce(lcm, period) +1 ):           # LCM is at most 2520
        sleep += D
        for (p,s,v) in zip(period, start, volume):
            if (i - s) % p == 0:
                sleep -= v
        n = min(n, sleep)
    return -1 if (sleep < 0) else -n

Many details, for example using reduce to get the lcm of all the periods. The zip function to iterate through the alarms without having to do much indexing...

A subtle one: To check if an alarm should ring at time i, we use (i - s) % p == 0. In c++ you would need : i % p == s % p. The reason is that python's % operator is not broken for negative numbers. Very handy.

For a comparison, try the fastest c++ submission. The algorithm is exactly the same, but you can see that it has to "weigh" more, it has many sources of complications. Having to use a loop to calculate the LCM of all periods, or using the for j = 0; j < n; j++ to get the special alarms.

Div1 500

After finishing div1 250, I moved to the next problem. It just seemed ... impossible. It turns out that it is just a mathy problem. I am not very good with them.

I skipped this problem during the match. Later I , of course, had to learn it to solve the editorial. Once I understood I implemented a python solution. It actually took a while to code, and I am not sure if it is really using all of python's potential. You can read the code at the editorial.

Div1 1000

I explained this problem in detail before: here

Just like there are topcoder problems that are not my style at all. There are some which are really my style. This was the rare div1 hard I actually solved the instant after reading it. Because I just like grid-bitmasks dp problems a lot.

There is a world of difference between having a correct idea for a problem and actually implementing it. I noticed the `O(n^5)` idea , while correct and possibly able to pass in c++, would fail utterly to run in time in python.

So let's dissect why the following python solution (closest I got to good performance) have much more issues passing than the c++ equivalent you can see in the editorial I linked.

sys.setrecursionlimit(10**6)

class ThreeLLogo:
 def countWays(self, grid):
    n = len(grid)
    m = len(grid[0])
    
    def allValidMasks():
        yield 0
        for i in xrange(m-1):
            yield (1 << i) 
            for j in xrange(i+1, m-1):
                yield (1 << i) | (1 << j)
                for k in xrange(j+1, m-1):
                    yield (1 << i) | (1 << j) | (1 << k)
    
    MAX_MASKS = 4090
    validMasks = [0] * MAX_MASKS
    getMaskId = dict()
    t = 0
    for mask in allValidMasks():
        validMasks[t] = mask
        getMaskId[mask] = t
        t += 1
     
    @MemoizedFunction( [n+1,m+1,4,MAX_MASKS,2] )
    def f(x,y, need, maskid, horz):
        mask = validMasks[maskid]
        if x == n:
            # base case
            return 1 if ( (mask == 0) and (need == 0) ) else 0
        elif y == m:
            return 0 if (horz == 1) else f(x + 1,0, need, maskid, 0)
        elif (1<<y) & mask:
            if (horz == 1) or (grid[x][y] == '#'):
                #must continue the left "L" shape, but also must continue the above one
                return 0
            else:
                # 1) Finish the L shape above.
                nmask = mask ^ (1 << y)
                res  = f(x,y+1, need, getMaskId[nmask], 1)
                # 2) Continue the L shape vertically
                res += f(x,y+1, need, maskid, 0)
                return res
        elif horz == 1:
            return 0 if (grid[x][y] == '#') else sum(f(x,y+1, need, maskid, s) for s in (0,1)) 
        else:
            # 1) Do nothing
            res = f(x,y+1, need, maskid, 0)
            # 2) Do something, Create a new L shape
            if (grid[x][y] == '.') and (need > 0) and (y < m - 1) :
                nmask = mask | (1 << y)
                res += f(x,y+1, need - 1, getMaskId[nmask], 0)
            return res
     
    return f(0,0, 3, 0, 0) 


# decorator that adds memoization to a function using a list of bounds
# ej: @MemoizedFunction( [10, 10, 10] )
def MemoizedFunction(bounds):
    def deco(f):
        mem = [-1] * reduce( lambda a,b: a*b, bounds )
        def g(*args):
            p = 1
            k = 0
            for i in xrange(len(args)):
                (a,b) = (args[i], bounds[i])
                k = a * p + k
                p = p * b  
            if mem[k] == -1:
                mem[k] = f(*args)
            return mem[k]
        return g
    return deco

Well, things that were already mentioned, the default recursion limit is a bit large. Python is just slower than c++, mostly because its code does much more than c++. It has this dynamism that makes it so powerful but also slower. In an i5 computer, it takes around 10 seconds, while the goal is 3 seconds in TC's servers... Another large issue was the memory usage. Even with the clever decorator that makes all of the dp table contents exist within a single list, we would need around 768 MB to make all fit in memory.

During the match, I began to code a solution to the problem anyway. Because of the python challenge, I couldn't use c++. I still wanted to solve it because it was fun and also, if the python solution works in small inputs, it would mean I at least understood the problem, which would be helpful later and it would be a moral victory :). So I tried to do it, but I had bugs. Story of my life, I couldn't really finish the code until after I spent a couple of days working in the rest of the editorial..

My first code (once fixed) was actually more compact and slow than the one above:

import sys
sys.setrecursionlimit(10**6)
class ThreeLLogo:
 def countWays(self, grid):
    n = len(grid)
    m = len(grid[0])
    @MemoizedFunction
    def f(x,y, need, arms, horz):
        if x == n:
            # base case
            return 1 if ( (arms == () ) and (need == 0) ) else 0
        elif y == m:
            return 0 if (horz == 2) else f(x + 1,0, need, arms, 0)
        elif y in arms:
            if (horz == 2) or (grid[x][y] == '#'):
                #must continue the left "L" shape, but also must continue the above one
                return 0
            else:
                # 1) Finish the L shape above.
                narms = tuple( z for z in arms if z != y )
                res  = f(x,y+1, need, narms, 2)
                # 2) Continue the L shape vertically
                res += f(x,y+1, need, arms, 0)
                return res
        elif horz == 2:
            return 0 if (grid[x][y] == '#') else f(x,y+1, need, arms, 1)
        else:
            # 1) Do nothing
            res = f(x,y+1, need, arms, 0)
            # 2) Do something
            if (grid[x][y] == '.'):
                # 2.1) Create a new L shape
                if need > 0:
                    narms = tuple( sorted(arms + (y,)) )
                    res += f(x,y+1, need - 1, narms, 0)
                # 2.2) Continue the left L shape:
                if horz == 1:
                    res += f(x,y+1, need, arms, 1)
            return res
     
    return f(0,0, 3, (), 0) 

However, this helped me discover something cool. After writing this code, it was easy to optimize it and later translate to c++. Behaving as a human python-c++ compiler. When the python challenge ends, I might try this more often. Code a prototype in python making sure the logic works, then translate to c++ :).

The other ones

Then I wrote the editorial. Besides of the div1 hard that appears to be impossible in python, the remaining problems work just fine.

A notable example: This 3-liner solution for div2 500 which could be made into a one-liner.

class ColorfulCoinsEasy:
 def isPossible(self, values):
    limits = sorted([ (values[i+1]/values[i]) for i in range(len(values) -1) ])   ## get list of values[i+1] / values[i], sort it
    condition = all(i+1 < limi for (i,limi) in enumerate(limits))                 ## Required: for each i: i + 1 < limits[i].
    return "Possible" if condition else "Impossible"

Monday, April 14, 2014

SRM 616 hard problem: ThreeLLogo

This editorial is http://apps.topcoder.com/wiki/display/tc/SRM+616. Since Topcoder insist on restricting the wiki for non-registered users, I will post the explanation here.

Link to problem statement

The first part of the explanation explains what the problem asks us to count. Just notice that the constraints say there will be at most 30 columns and at most 30 rows:


A change in format

This explanation will interpret the question slightly different from the statement. The purpose is that I find it easier to make explanation graphics under the following interpretation: We have a grid, some cells (white) which allow us to put shapes on them and some cells (black) that don't the "L" shapes occupy multiple cells, at least three.

Fill the grid

There are different approaches to this problem. The one intended by the problem setter is to see it as a dynamic programming problem. Instead of focusing on there being a need for just 3 L shapes and trying ad hoc counting, perhaps it is more straightforward to see this problem as a grid/bitmask dynamic programming one. Much like FourBlocks or FloorBoards

The idea is to fill the grid in row-major order. In each cell, we can leave it empty, or choose to start a L shape or perhaps continue an already started L shape. So given a row `x` and a column `y`, assume that all the previous cells (in row-major order) have already been filled.

In this example, the current cell we are deciding is green and the remaining undecided cells are red. Because there is an unfinished vertical arm of an L shape above this cell, we have two options:

  • Use cell `(x,y)` to continue the vertical arm of the L shape. We move to the next cell:

  • Use cell `(x,y)` to finish it. Then we also move to the next cell:

Assume we did the latter, now we need to decide what to do with the new cell `(x,y)`, this time the cell has a horizontally-adjacent L shape which must be finished. We have two options:

  • Continue this horizontal shape, we will need to add more of this shape in later steps:




    If we make this decision, the next step will have a blocked cell as `(x,y)`, we cannot put any shape on this cell, but due to the decision we are forced to place something, so the number of ways to fill the remaining cells correctly is 0.

  • Finish this horizontal shape:



    Let's pick this decision

After doing this, the new `(x,y)` position is a blocked cell, our only choice is to do nothing and leave this cell empty. Move to process the next cell, you will find a situation similar to the first example, we can choose to continue the above L shape or finish it, this time we'll pick to continue it.

This is another interesting situation, `(x,y)` is an empty cell, there is no vertical or horizontal L-shape that we must continue. We can leave this cell empty:

Or, because there are currently only two L shapes, we can start a whole new one:

Dynamic programming

Once we understand how is it possible to fill the grid in row-major order, we can see that, in order to count the number of valid ways, there are few things to consider:

  • `(x,y)` : The current position in the row-major order. There are `O(nm)` such pairs. We assume that all earlier cells have already been chosen.
  • `k` : The number of L shapes we still haven't created.
  • `V` : A set of the columns currently containing vertical arms of (currently) incomplete L-shapes. This is a subset of the `m` columns, but it can contain at most 3 elements, thus there are `O(m^3)` such sets. More specifically, you shouldn create L shapes at the right-most column, so `V` is only a subset of the first `m-1` columns and with some calculation you can find that there are 4090 such sets in total for `m = 30`.
  • The state of the cell to the left of `(x,y)`, does it contain an unfinished horizontal L arm? We can use the variable `"horz"` for this, if `"horz"` is 1, we must continue the horizontal shape else it is 0.

In other words, the number of ways to fill the cells painted red in the following two examples ( `(x,y)` and the later cells in row-major order ) should be exactly the same:

This allows an `O(nm^4)` approach through memoization of a recursive implementation of the idea to count the ways to fill the grid. But that is not the whole story. We need to be careful: In practice, there are 4 values for `k`, 2 values for `"horz"`, 4090 values for the `V` set and 31*31 values for the `(x,y)` (we need the base case `x = n` and the special case `y = m` to keep the code simple). Also, the result needs to be a 64 bits integer. Thus the memory usage is 4*2*4090*31*31 * 8 bytes ~= 239 MB, quite close to the 256 MB limit and thus we should avoid overhead. Also notice that slight modifications to the logic we used could have made us require more states. There is a 3 seconds limit but we should also avoid having too much of a constant factor. A key aspect of the implementation is how to deal with the `V` set. As is usual, we can use bitmasks, but of course we cannot downright use 30 bit bitmasks to store results in memory, a workaround is to index the bit masks.

Code

    
vector<string> grid;
int n, m;

long mem[31][31][4][MAX_MASKS][2];
map<int,int> getMaskId;
int validMasks[MAX_MASKS];

long f(int x, int y, int need, int maskid, int horz) {
    long & res = mem[x][y][need][maskid][horz];
    if (res != -1) {
        return res;
    }
    int mask = validMasks[maskid];
    if (x == n) {
        // base case
        res = ( ( (mask == 0) && (need == 0) ) ? 1: 0 );
    } else if (y == m) {
        res = ( (horz == 1) ? 0 : f(x + 1,0, need, maskid, 0) );
    } else if ( (1<<y) & mask ) {
        if ( (horz == 1) || (grid[x][y] == '#') ) {
            // if (horz == 1) must continue the left "L" shape,
            //  but also must continue the above one
            res = 0;
        } else {
            //1) Finish the L shape above.
            int nmask = mask ^ (1 << y);
            res  = f(x,y+1, need, getMaskId[nmask], 1);
            //2) Continue the L shape vertically
            res += f(x,y+1, need, maskid, 0);
        }
    } else if (horz == 1) {
        if (grid[x][y] == '#') {
            res = 0;
        } else {
            res = f(x,y+1, need, maskid, 1) + f(x,y+1, need, maskid, 0);
        }
    } else {
        //1) Do nothing
        res = f(x,y+1, need, maskid, 0);
        //2) Create a new L shape
        if ((grid[x][y] == '.') && (need > 0) && (y + 1 < m) ) {
            //2.1) Create a new L shape
            int nmask = mask | (1 << y);
            res += f(x,y+1, need - 1, getMaskId[nmask], 0);
        }
    }
    return res;
}
 

long countWays(vector<string> grid)
{
    this->grid = grid;
    n = grid.size(), m = grid[0].size();
    
    // generate all bitmasks with <= 3 turned on bits:
    int t = 0;
    auto addMask = [&](int mask) {  // don't mind the lambda syntax, it
        getMaskId[mask] = t;       // simplifies having to repeat this code
        validMasks[t++] = mask;    // for each number of 1 bits.
    };
    addMask(0);
    for (int i = 0; i + 1 < m; i++) {
        addMask( 1 << i );
        for (int j = i + 1; j + 1 < m; j++) {
            addMask( (1 << i) | (1 << j) );
            for (int k = j + 1; k + 1 < m; k++) {
                addMask( (1 << i) | (1 << j) | (1 << k) );
            }
        }
    }
    cout << t << " masks found for " << m << endl; 
    
    // run the recursive solution:
    memset(mem, -1, sizeof(mem));
    return f(0,0, 3, 0, 0);
}

Friday, April 04, 2014

SRM 615: Recursive, recursive

Already reached the half of this python experiment. This time , python hasn't really helped me and it wasn't an obstacle either. Quite mundane, really.

Div1 250: The one with ameba

First of all, this is a 7:00 AM match so I had a bit of issues getting brain to work at first. I didn't really understand the statement very well initially. Then the problem itself is quite special, ad hoc and cool. So it took a while. Even so, I was surprised when it turned out that tons of people had a much better score than I.

The problem (after tons of translation) is as follows:

An amoeba has size `s` , a integer. There is a list `X` of amounts of gel it met in chronologically order, so first it met `X[0]` gel, then `X[1]` gel, etc. `X` has at most 200 elements. Whenever the amoeba meets a gel with size exactly equal to its current size `s`. The ameoba is forced to eat the gel and thus double its size. `s' = 2s`. We don't know the initial size `s_0`, but we wonder how many integers `u` exist such that the final size of the ameoba cannot be equal to `u`.

All integers that do not belong to `X` are possible final sizes. For example, take a integer `a` that is not in `X` and make it the starting size. Since none of the gels the amoeba will meet is exactly `a`, the amoeba will never double its size and the final size is `a`.

So we should only care about the integers that are already in `X`. For each of those integers, answer: "Can the final size be equal to `X[i]`?".

Given a integer `a`, how do you know if it is a possible final size? Focus on the last integer in the sequence: `X[n-1]`. There are two possibilities in which the final size is `a`:

  • The size was already equal to `a` before the amoeba met gel `X[n-1]` and the amoeba didn't eat it. This requires two conditions to be true:
    • `a != X[n-1]`, because the amoeba must not eat it.
    • It is possible for the amoeba to reach size `a` after meeting the first `n-1` gels.
  • The size of the amoeba became `a` after meeting `X[n-1]`:
    • This means that `2X[n-1] = a`.
    • And also `X[i]` should be a possible size after meeting the first `n-1` gels.

So we have a recursive idea. The issue is that `n = 200`, so let's get clever. First find the set of impossible integers for the first 1 elements (easy), then use that set to build the set of impossible integers for the first 2 elements, then the first 3 elements and so and so. E.g: If you know the set of impossible elements for the first `n-1` elements, it is easy to know if `a` or `X[i]` are impossible or not to have after meeting the first `n-1` elements. This yields an `O(n^2log(n))` solution if you use a logarithmic set or `O(n^2)` if you use a hash table.

class AmebaDiv1:
 def count(self, X):
    impossible = { X[0] }
    for i in range(1, len(X)):
        impossible2 = set()
        # which are impossible for X[:i+1] ?
        for j in xrange(0,i+1):
            a = X[j]
            # is a possible?
            # a) ignore X[i],
            poss = False
            if X[i] != a and a not in impossible:
                poss = True
            # b) use X[i] to double
            if 2*X[i] == a and X[i] not in impossible:
                poss = True
            if not poss:
                impossible2.add(a)
        impossible = impossible2
    
    return len(impossible)

So I said python didn't help that much. It is a very procedural algorithm and well, I think that if I did it in c++ it would take the same amount of lines. There are possibly much better python implementations.

The rest

Tried the other problems, not very thrilled about having to write editorial because div1 550 seems hard and div1 950 is incredibly complex in what it asks for. Currently waiting for the challenge phase to end. Blog post will be posted automatically once it does so.

While watching the challenge phase, I learned a new solution and the reason everyone had such fast scores. Remember that either `a` is possible in the first `n-1` or `X[i]` is possible. It follows that the only starting integers you should try and simulate are those already in `X[i]`, if you test each of those integers as the starting one, you can easily find which `X[i]` are possible. The solution is then to simply do that.

Tuesday, April 01, 2014

Explanation for SRM 613 Hard problem is ready

And because TC wiki is still closed to guests, I decided to post it here:

This is a cool little evil problem that is actually "easy" in concept. There is just a little magic trick to make the dynamic programming work.

We have a board of `M` columns and `N` rows. `M` is up to 200, `N` is up to 50. For each row `i`, The area composed out of left[i] of the left-most cells must contain exactly one checker and the area composed out of the right[i] of the right-most cells must contain exactly one checker. These areas never overlap. Each column may contain at most one checker. Return the number of valid ways to follow these condition modulo 1000000007.

If there were no left parts

Let's try an easier version of the problem. What if the problem just had the right-most parts' condition? We have a list right such for the i-th that you must put exactly 1 checker on the right[i] right-most cells in that row. You can also put at most once cell per column.

This version is much simpler to solve. We can try to decide, from column 0 to column `M-1`, what to do:

  • We could decide not to put any checker on this column.
  • We can put a checker, outside any right-most area.
  • We can put a checker inside one of the right-most areas.

There are some useful observation we can make:

If we place the checker on an unrestricted cell or don't put a checker at all, the number of ways to decide the remaining columns shall be exactly the same, regardless of what cell we pick (or not) for the checker:

The second observation is more important. In this example, in column 0 there are two "active" right-part. In column 2, there are three active right parts. In column 3, five active right parts. Etc. Once we take this perspective, we can find a recurrence by noticing what happens if we put a checker on one of the column's active parts.

Placing a checker will just make one of the active right parts, no longer usable. All that matters is the number of active right parts in a column and the number of parts that were already used in previous steps.

With this information it is possible to make an `O(MN)` dynamic programming approach. Use a function `f(x, r)` where `x` is the number of columns we already processed and `r` is the number of available right-parts, which increases in the columns where new right parts begin and decrements whenever we choose to use one of them.

The left parts

When the left parts are included, we have more issues. From the previous section we know that all we need in order to handle the right parts is to know how many of them were already used in the previous steps. So let's focus on the left parts.

Once again, we decide from column 0 to column 1, to the previous options we add an option: Add the checker on top of a left restriction area. In this case, however, the left restriction areas are not indistinguishable from each other...

When deciding for column 5 and later, the only information we need about the right parts is that one is taken. The left parts are more complicated because the two taken ones are very different from the free one.

The clever solution is to avoid making a decision until it is absolutely necessary. Let's just decide in which columns we are going to place checkers on left areas. Without selecting the row specifically until it is completely necessary.

For example, we process the first two columns and decide that they both will contain checkers that would occupy left areas. We don't decide the actual rows used by those checkers yet.

Before moving to column 2, we need to make sure that the left area in row 0 (top) is taken by a checker. Because this left area will not be active anymore. There are two columns that can contain this checker, pick one:

The column that we do not pick would still be available for other rows later. The nice thing is that we can pick any of them and the number of ways to fill the remaining columns will remain the same. In this case, out of two columns assigned to have a checker in the left area, we need to pick one. But multiple left areas may finish in the same column. We can just use a binomial coefficient to calculate the number of ways to pick.

In column 2, we will be forced to leave a checker in the last cell of the left area in the third row. Then we progress to column 3. This will have four options: 1) Place checker on a right area. 2) Place checker on one of the left areas in this column (but leave the decision of the specific row open). 3) Place checker in a cell that has no restriction and 4) Don't place a checker in this column:

What we can conclude is that we can represent the state of a sub-problem with three parameters: `x`: The number of columns already visited / index of the current column. `u`: Number of columns we decided to make available to get used by left areas. `r`: Number of active right areas. In the dynamic programming, we make decisions to alter these parameters, and, when one or more left-areas finish, we need to multiply the current result by a binomial coefficient to choose the specific columns that will be used by those rows. We also need to multiply by a factorial to permute the positions. See below:

In this case, there are 24 ways to choose the position for the checkers in the 3 left areas that have just finished. There are 4 available columns and 3 areas that must finish: `((4),(3))` multiplied by `3!` permutations of the positions. Those images above are just two out of the 24 examples.

Code

With knowledge of the 3 needed parameters to represent the state, we can build a dynamic programming / memoziation solution that needs `O(MN^2)` time.

const int MOD = 1000000007;

long C[201][201];
long fact[201];

void init()
{
    memset(C, 0, sizeof(C));
    for (int i = 0; i <= 200; i++) {
        C[i][0] = 1;
        for (int j = 1; j <= i; j++) {
            C[i][j] = (C[i-1][j] + C[i-1][j-1]) % MOD;
        }
    }
    fact[0] = 1;
    for (int i = 1; i <= 200; i++) {
        fact[i] = (i * fact[i-1]) % MOD;
    }
}


int dp[201][51][51];
// x: processed columns
// u: left parts used
// r: active right parts
int rec(int x, int u, int r)
{
    int& res = dp[x][u][r];
    if (res == -1) {
        
        if (x == M) {
            res = (r == 0);
        } else {
        
            res = 0;
            // common code used to process a number of used left parts nu and 
            // active right parts nr once the transitions below pick it:
            // We use a lambda to be able to update the res variable correctly.
            // p is the number of times we need to add this case.
            auto process = [&](long p, int nu, int nr) {
                if (nu >= endLeft[x]) {
                    p *= ( C[nu][endLeft[x]] * fact[endLeft[x]] ) % MOD;
                    p %= MOD;
                    res += (int)( (p * rec(x+1, nu - endLeft[x], nr) ) % MOD );
                    res %= MOD;
                }
            };
            int nr = r + startRight[x];
            
            // use a left part
            if (u + 1 <= activeLeft[x]) {
                process(1, u+1, nr);
            }
            // use a right part 
            if (nr > 0) {
                process( nr, u, nr - 1);
            }
            // don't use anything
            process(1, u, nr);
            // put in middle
            if ( middle[x] > 0) {
                process( middle[x], u, nr );
            }
        }
        
    }
    return res;
}

int M;
int endLeft[200];
int startRight[200];
int activeLeft[200];
int middle[200];

int getNumber(vector<int> left, vector<int> right, int M)
{
    init();
    this->M = M;
    memset(dp, -1, sizeof(dp));
    
    fill(endLeft, endLeft + M, 0);
    fill(startRight, startRight + M, 0);
    fill(middle, middle + M, 0);
    
    for (int i = 0; i < left.size(); i++) {
        for (int j = left[i]; j < M - right[i]; j++) {
            middle[j]++;
        }
    }
    
    for (int l : left) {
        endLeft[l-1]++;
    }
    activeLeft[0] = left.size();
    for (int i = 1; i < M; i++) {
        activeLeft[i] = activeLeft[i-1] - endLeft[i-1];
    }
    for (int r: right) {
        startRight[ M - r]++;
    }
    return rec(0, 0, 0);
}