Sunday, October 27, 2013

Coding habits

I was wandering in quora and finally found a interesting thread: what are some of your weird coding habits?. I've been meaning to talk about my style choices, specifically in programming contests where the people that think that throwing away readability in exchange of a bit typing time is a fair exchange. I really like my code to be readable (for me) and have forged some coding habits.

#defines

In c++, #defines are a meta-programming tool inherited from c++. It is great to use defines if you want to have code that is created conditionally (using #ifdef). Some defines are useful because the text-processing allows things that a constant function cannot do, like: #define x(a) cout << "#a" << a . In the past, I used a for_each #define because it was much clearer than the alternative (But now c++11 has range-based for loops and a for_each macro is not needed anymore) Any other #define use tends to be something that Satan has introduced in c++ to allow people to make horrible code.

I am unapologetic. All those people using a REP #define to replace for loops. Also you, yes, you who uses #defines for stuff that typedefs and constants can do just fine. You should be ashamed. It is as if you were throwing plastic to the ocean of code. Think of the children, there are newb programmers that read up your code in the high ranks of SRMs and and think "COOL". Yes, really, this awful thing you are doing is IMITABLE BEHAVIOR. How do you sleep at night?

I am not talking just about code purity, #defines are risky and can bring you confusing bugs.

//First of all, this causes a syntax error:  
#define MAX 1000 //Discouraging comments?

// Now think about this, if I use a constant:
const int MAX_N = 50;
// I can then use the same constant to generate other constants
const int MAX_V = MAX_N * MAX_N + 2; // maximum number of vertices in the graph
                                     // note how I can comment that code

//If I instead used defines for constants:
#define MAX_N 50
#define MAX_V MAX_N * MAX_N + 2

// cool, right? NO!

// The following code would be wrong in the #define version:
x = (MAX_V * 2);
cout << MAX_N << " " << MAX_V << " " << x << endl;
// in const version:   "50 2502 5004"
// in #define version: "50 2502 2504" 

// Using a #define in code is  a text replace. It is semantically different to
// a constant, that's why you shouldn't use them in place of constants.

// And yes, you can "fix" this by adding brackets:

#define MAX_V (MAX_N * MAX_N + 2)

// But ask yourself: how many people who use #defines instead of constants actually
// use brackets when declaring them?


Code blocks

When I started programming I really thought saving up on code lines made me look clever. I did aberrations such as this:

for (int i=0; i<n; i++) for (int j=0; j<n; i++) for (int k=0; k<n; i++)
    dist[i][j] = min( dist[i][j],  dist[i][k] + dist[k][j] ); // floyd Warshall

//Or this:
for (int i=0; i<n; i++)
    for (int j=0; j<n; i++)
        for (int k=0; k<n; i++)
            dist[i][j] = min( dist[i][j],  dist[i][k] + dist[k][j] ); // floyd Warshall

        
// or even this:
for (int i=0; i<n; i++)
    for (int j=0; j<n; i++)
        for (int k=0; k<n; i++)
        {
            dist[i][j] = min( dist[i][j],  dist[i][k] + dist[k][j] ); // floyd Warshall
            cout << i << ", " << j << "= " << dist[i][j];
        }
// (Just one curly is needed, right? )

So yeah, c++ allows you to save the hassle of using curly braces when there is only one statement. You can use if(cond) statement; or for(x;y;z;) statement; or any similar variation. You save up the number of lines!

Nowadays I have a strict rule. I try to almost always use curly braces. Just because the language allows you not to use them doesn't mead you shouldn't. I have an exception though, else statement after if:

if (x == 1) {
    return 7;
} else if ( 0 <= x && x <= 10 ) {
    return 8;
} else if ( x > 78 ) {
    return 12;
} else {
    return 0;
}

The reason for this exception is the lack of an elseif keyword.

Why use always curly braces? The more I programmed the more it seemed that I had no way to really predict how I was going to have to modify my code. Today it appears that this if statement requires to execute only one line:

if (x == 1) return 7;

But maybe tomorrow you are going to need to add something to it. Maybe a special case you were missing. Maybe you need to add debug code... In order to update this code you'll need to restructure it , add curly, add the indentation, add the new code. But what if later you decide you need to remove it again? Once again you reset the style... It is too many steps before updating something, and each time you change something in your code you risk causing more bugs. This simple if is... simple, what if it is one line right next and inside many other blocks? I don't know, it just feels better when all your code blocks are braced.

Adding braces is half the battle. Things like :

//1.
if (x == 1)
{
    return 7;
}

if (x == 1) { return 7; }

if (x == 1)
    { return 7; }

if (x == 1) {
    return 7; }

Are also problematic to my taste. I prefer the egyptian brackets style. If I am going to add curly braces to everything, I better not use up too many lines of code. I used to use the style in the first example all the time. Now it just looks quite heavy. The return 7 seems too far away from the x == 1 :).

Oddly , I don't like to use egyptian brackets outside of statement blocks. The braces for a function / method / struct / class / namespace / must be the heavy ones. Why? I am not sure. But it seems intuitive to me. Almost as if it is because the statements are shorter than the functions.

Indent

Really, why do people use tabs for indents? It breaks so many programs that it is not even funny. I prefer 4 spaces. I am not sure why, but less than 4 spaces tends to seem too tight and more requires far more of a panoramic view. I have exceptions. I know that I sometimes do something like this:

for (int x0 = 0; x0 < w; x0++) {
 for (int y0 = 0; y0 < h; y0++) {
    for (int x1 = 0; x1 < w; x1++) {
     for (int y1 = 0; y1 < h; y1++) {
        if ( something ) {
            doWhatever();
        }
     }
    }
 }     
}

Some times the indentation depth becomes too wide, even though the logic of the algorithm is not really nesting four loops. When I look at the code above, I really think I am nesting two loops and not four. One loop for point `(x0,y0)` and another for point `(x1,y1)`. That is why the indentation is so peculiar. I wish languages had better facilities to do "2d loops". In fact, I think they do:

for (int x0 = 0, y0 = 0; y0 < h; x0 = (x0 + 1) % w, y0 += !x0) {
    for (int x1 = 0, y1 = 0; y1 < h; x1 = (x1 + 1) % w, y1 += !x1) {
        if ( something ) {
            doWhatever();
        }
    }
}

But that would be unorthodox, wouldn't it? :)

Memoization functions

This is one I learned the hard way over the years. I always try to have exactly one return statement in a memoized recursive function.

int f(int x, int y)
{
    int & res = dp[x][y];
    if (res == -1) {
        res = 0;
        if (x == 0) {
            res = y;
        } else {
            res = f(x-1, y);
            if (y > 0) {
                res = f(x-1, y-1);
            }
        }
    }
    return res;
}

Why? let us check out the alternative:

int f(int x, int y)
{
    if (x == 0) {
        return y;
    }
    int & res = dp[x][y];
    if (res != -1) {
        return res;
    }
    res = f(x-1, y);
    if (y > 0) {
        res = f(x-1, y-1);
    }
    return res;
}

Seems inicuous, but it is bug-prone to have this habit. In this case y is accessible in `O(1)` time, but maybe another recurrence requires you to return some f(y) function that is `O(y)`. If we did that, in this function, the f(y) call wouldn't get memoized. There are tons of similar ways in which you can forget to memoize something if you are liberal with return statements. My favorite is when coding in a language that doesn't have by-ref variables, or when you use that code style in c++:

int f(int x, int y)
{
    if (x == 0) {
        return y;
    }
    int res = dp[x][y];
    if (res != -1) {
        return res;
    }
    if (x >= y/2) {
        res = f(x, y - x);
        return res; // oops , forgot to set dp[x]y] = res
    }
    res = f(x-1, y);
    if (y > 0) {
        res = f(x-1, y-1);
    }
    dp[x][y] = res;
    return res;
}

Saturday, October 26, 2013

SRM 595

I just finished the editorial for SRM 595: http://apps.topcoder.com/wiki/display/tc/SRM+595. Don't forget to vote.

It is a good moment to talk about the match.

During the match

Lately I am using the standard strategy: easy problem first, then medium, leave hard for when writing the editorial. My only twist is that I force a 5 minute break after every 25 minutes spent working on a problem (So it is the pomodoro technique applied to SRMs, giving it a try).

I opened 250, at first it seemed harder than it actually was. It turned out to be quite simple. But I did take longer than I hoped for. I actually had an unforeseen delay. I have been doing testing on the very unstable, cutting edge Greed plugin 2.0 alpha, and forgot to change it back to 1.5 before the match. I actually thought it would work all right, and it did, until I tried to submit and it was actually having issues there. Well, it was simple to change back to 1.5 but it did delay me for a bit.

I was actually lucky here because from the start I decided to use 1-indexed arrays to fill the colors. It is very rare I do this, but in this case it saved me from a potential bug because .size() in STL structures is unsigned (Which is VERY stupid, really, this is the worst good in ALL of the STL design). Apparently this issue hit many coders during the match.

I opened div1 500, and I tried to solve it. I wwas having issues because my idea of an approach was to cut the string in two, count the number of valid substrings in one half and the other and then combine the parts. This logic was more complicated than first picking the first substring and then counting the others, which I could only find later in the match. I spent all the coding phase trying to get it to work.

Challenge phase was a bit fruitless. But I was glad to find my code for 250 was one of the cleanest and not bugged.

Comments about the problems / editorial

Div2 easy: A bit harder than usual but a good pick.

Div2 medium: I don't see the need of making this a different problem than div1 easy (ok they are the same problem but with different constraints). I like to make fun of Java, and usually pick it only when I already plan to write c++ and python code and try to make the Java code as overloaded with tons of code as possible, just so that Java looks bad ^^.

Div2 hard: Ok problem, a bit too standard, but I think that's a good thing for div2 hard.

div1 easy: Ok for the slot

Div1 medium: It was a interesting one, I wish I solved it.

Div1 hard: Amazingly cool. When I first read the explanation rng_58 sent me (Which is almost exactly the same as the "short summary" in the editorial) I was a bit confused: WHAT TRIANGLES? WHAT IF THEY OVERLAP? IT DOESN'T MAKE SENSE. I had to spend all this day thinking about it and reverse-engineering his code. But once I noticed exactly what method to split by triangles is used, it is a very cool problem, refreshing, actually.

I intended to finish this editorial in 24 hours to save as much time as possible for a school assignment, I failed, as I actually took 50.5 hours, too bad.

Friday, October 18, 2013

Editorial for SRM 594 Div1 Hard: FoxAndAvatar

The editorial for this problem is up

What a cute problem. Certainly interesting to completely decode it and find all the little tricks that make brute force smarter enough to work.

I wonder if I broke a record in the number of images. 19 images for a single problem?

Thursday, October 17, 2013

SRM 594 editorial (Part 1)

The first part of SRM 594 editorial, containing explanations for all problems but division 1 hard is up at: http://apps.topcoder.com/wiki/display/tc/SRM+594

I really hate that there was a delay. This time I really have no excuse, except for the fact that this problem set, specifically division 1 medium and division 2 hard seem to have been designed with the specific objective of making editorials long to write . They have these hard to think of solutions. Hard to prove solutions. And explanations that need to be image-intensive.

While in that topic, I think my explanations for both of those problems are terrible. In Div2 1000 I had no choice but to keep saying "connected component". Div1 medium is bad because it really does a bad job explaining how to come up with that approach. Probably because nobody knows how to do that, it seems to be something that happens espontaneously in good coders's brains without explanation.

I now have less than 48 to solve and write the explanation for div1 hard. I hope I can do it. The fact that nobody solved it during the match (Including Petr and tourist ) does not make me optimistic.

Tuesday, October 15, 2013

SRM 594: More slowness

One of the reasons I was trying a risky strategy to put more focus on div1 medium / hard and risk not solving div1 easy is matches like this one, you only solve div1 easy, and you take too long, so you get a low score and is not too different from getting a zero, but much more boring. It gets frustrating.

Div1 250: The one with planets.

You get two arrays A, B, lists of planets in increasing order of distance from the sun. Each list might be missing some of the planets in the solar system. The lists contains planet dimensions, but both lists use a different measurement system, so two planets with sizes {6, 18} in one list might have {10, 30} in another list. Return the minimum number of planets in a solar system for both lists to be correct. In other words, find the maximum number of planets the lists can have in common (and the proportions have to be consistent) and subtract from the total number of planets in the list.

I took too long to think of this solution, but the key is to figure that at least one pair of planets from each list can be equal. (Worst case scenario, you make the last planet in one list and the first planet in the other equal). We can iterate through all possible pairs `(A[i], B[j])` and ask "What is the minimum number of planets if we assume these two are equal?". If we determine that two integers `A_i,B_j` describe the same planet in different measurement systems, then it is easy to see the conversion rate between the two systems. One unit in system A is equivalent to `B_i/A_i` units in system B. So we can just multiply all planets with that proportion. If we want to avoid doubles, we can actually find the greatest common divisor `g = gcd(A_i,B_j)`, and multiply each array by `B_j/g` and `A_i/g` , respectively, that would make all numbers in the same measurement system.

Once all numbers have the same system, we just need to find the maximum number of planets that are equal. Since both lists have the same order. This is actually the same as the longest common subsequence, a classical dynamic programming problem.

long gcd(long a, long b)
{
    while (b != 0) {
        tie(a,b) = make_tuple( b, a % b);
    }
    return a;
}
vector<long> fix(vector<int> A)
{
    int g = A[0];
    for (int x: A) {
        g = gcd(g, x);
    }
    vector<long> LA;
    for (int &x: A) {
        LA.push_back(x / g);
    }
    return LA;
}

int sub(vector<long> A, vector<long> B, int i, int j)
{
    //if we assume A[i] and B[j] are the same planet, what is the
    // minimum number of planets?
    long a = A[i], b = B[j];
    long g = gcd(a, b);
    
    // Scale vectors such that the new vectors A' , B' 
    // Have:  A'[i] = B'[j] = lcm(A[i], B[j])
    for (long & x: A) {
        x *= (b / g);
    }
    for (long & x: B) {
        x *= (a / g);
    }
    
    // largest common subsequence:
    int nA = A.size(), nB = B.size();
    int dp[nA + 1][nB + 1];
    for (int s = 0; s <= nA; s++) {
         for (int t = 0; t <= nB; t++) {
             if (s == 0 || t == 0) {
                 dp[s][t] = 0;
             } else {
                 bool eq = (A[s-1] == B[t-1]);
                 dp[s][t] = max( {dp[s-1][t], dp[s][t-1], eq + dp[s-1][t-1] } );
             }
         }
    }
    return nA + nB - dp[nA][nB];

}

int minimalPlanets(vector<int> _A, vector<int> _B)
{
    // if the result is not |A| + |B|, at least one pair must be equal
    vector<long> A = fix(_A);
    vector<long> B = fix(_B);
    int nA = A.size(), nB = B.size();
    int res = nA + nB;
    
    for (int i=0; i<nA; i++) {
        for (int j=0; j<nB; j++) {
            res = std::min(res, sub(A,B, i,j));
        }
    }
    return res;
}

I had a fun delay, when multiplying A by `B_j/g` and B by `A_i/g`, I didn't notice that `A[i]` changes after the first step. Good thing example 2 catches this mistake.

Div1 550

I tried plenty of things with no success. At first I thought it was clearly a Maximum Flow problem. It is at least very easy to know, through min-cut the minimum number of black checkers needed to remove all white checkers. Is there a way to do the same but to remove only a limited number of white checkers? I have no idea. But I noticed Petr uses maxflow. Albeit a bit more straightforward, maybe there is a way to solve the whole problem with max flow.

Later I thought [maybe it is greedy?] but nah, I was able to come up with counter examples to any greedy idea I had, so probably not.

Challenge phase

I thought that maybe overflow was going to be a possible mistake guys would make. But not too much luck.

There was a greedy solution to 550 in my room, but it took me a while to understand its strategy and come up with a counter example, someone else challenged while I was inputting my challenge case.

So...

I need to improve my speed. No idea how can it be done. I thought limiting the time I have for 250 to 10 minutes would work, but it didn't seem to. Any comments about the match?

Wednesday, October 09, 2013

So, TCO 2013 video contest

Yeah, I am not going to the TCO this year, at least not through a video contest. That was far-fetched. Anyway, I take this opportunity to say that some videos in this contest are really great, are actual videos and deserve votes. I suggest voting on the ones you like the most. The raw number of linkes determines the winner (and yeah, I am not sure that's a good selection method)

I like the first one because it really shows how it feels to be a TopCoder.

Monday, October 07, 2013

WolfDelaymasterHard editorial

Yep, with this problem I finished the editorial for SRM 593

http://apps.topcoder.com/wiki/display/tc/SRM+593#WolfDelaymasterHard

I really intended to get it done in 48 hours, but I had an 8 hours delay from that objective. This is a problem that really needs you to get clever even after you understand how to solve it. It has a couple of good twists from the standard dp implementation.

It is a very cool problem. But sometimes I wonder how are targets physically able to solve these problems in less than an hour. Even after understanding it, I really needed a day to implement the code (and think/debug hard binary search stuff...)

I wonder if lambda abuse is suitable for editorial code. First-class functions really help with a problem that requires 4 binary searches... If it isn't suitable for editorial code then at least there is the Java version from the admins to compensate.

Don't forget to vote in this video so I can travel to the TCO :(

Sunday, October 06, 2013

SRM 593 Editorial (Part 1)

The first part of the SRM 593 is ready at http://apps.topcoder.com/wiki/display/tc/SRM+593. This is the first editorial in which we are using a new delivery process. It basically means I am supposed to release the editorial minus one problem within 48 hours since the start of the match and I can use additional 48 hours to focus on the remaining problem (Although I'll personally prefer to do it all in 48 hours whenever possible). Let us see how it goes.

Comments

I am disappointed I didn't solve div1 450 during the match. During a match I usually forget to formalize the stuff that happens in a problem. It tends to be a winning move as it makes simplifications clearer. Oh well. Better luck next time.

PS: Don't forget to take me to TCO! :) http://www.youtube.com/watch?v=OMh8Z9Jdug4

c++ and python topcoder generic testers

A sequel to post: did I just spent all my Saturday customizing my Tester?. I have been improving the idea further. After adapting the ideas to python, it caused me to work harder in the c++ version.

Test case format

When I tried to adapt the ideas to python, I was able to have this useful format for test cases:

TEST_CASES = [
    ( [ ("---","---","---",) ], 0 ),
    ( [ ("-X--","---X","----","-X--",) ], 1 ),
    ( [ ("XXXX","---X","---X","---X",) ], 2 ),
    ( [ ("XX-XX--","-XX-XXX","X-XX--X","X--X-X-","XX-X-XX","-X-XX-X","-XX-XX-",) ], 3 ),
    #Your custom test goes here:
    #( [ ], None ),
]

This would be an example for SRM 593 div1 250: HexagonalBoard. It is much simpler than the odd template stuff that I barely managed to have in c++ and it is shorter.

It led me to try to work harder on the c++ version, maybe there really was a way to use {} format without having to type input {} and output {}... It turns out that once you are using constructors you can totally use them:

template<class I, class O> vector<pair<I,O>> getTestCases() { return {    
    { { {"---","---","---"} }, {0} },
    { { {"-X--","---X","----","-X--"} }, {1} },
    { { {"XXXX","---X","---X","---X"} }, {2} },
    { { {"XX-XX--","-XX-XXX","X-XX--X","X--X-X-","XX-X-XX","-X-XX-X","-XX-XX-"} }, {3} },
    // Your custom test goes here:
    { { {"XXXX","XXXX","XXXX","XXXX"} }, {3} },
    { { {"XX-","---","---"} }, {2} },
    { { {"XX","--"} }, {2} },
    { { {"-XX", "X-X", "XX-"} }, {2} },
    { { {"-XX","X-X","XX-"} }, {2} }
    { { {"XXX","XXX","XXX"} }, { } }
    //{ { {}}, {} },    
};}

You can see some custom examples I added today during the match and also while working on the problem for the editorial. An empty { } for the output means the output is unknown...

This works because we are calling constructors of an input and an output structs. The input struct has an implicit constructor to fill all the arguments of a problem. The output has two constructors, one that takes an output result, and one that takes no arguments. The reason {} syntax wasn't working for tuples, was that tuples apparently define only an explicit constructor? Either way once you switch to fixed structs it finally works.

This input format in c++ is quite useful. The compatibility with TopCoder's system tests format makes debugging during practice easier. Also it is easy to copy test cases you created and have stored in the file to the challenge window.

Making the problem-specific code as small as possible

The whole idea of the "meta-tester" is to have the code baggage that we need to automatically generate and include in the problem's file as small as possible. That's why I wanted to have most of the tester in a separate file that is not-specific to a single problem but works with any problem (It is also easier to edit the tester this way). Again, python made it way too easy. This is a sample file generated for python for the HexagonalBoard problem:

import math,string,itertools,fractions,heapq,collections,re,array,bisect
# SRM 593 - HexagonalBoard (250 points) by vexorian :

class HexagonalBoard:
 def minColors(self, board):
    return 0

# CUT begin
#-------------------------------------------------------------------------------
CASE_TIME_OUT = 2.00

TEST_CASES = [
    ( [ ("---","---","---",) ], 0 ),
    ( [ ("-X--","---X","----","-X--",) ], 1 ),
    ( [ ("XXXX","---X","---X","---X",) ], 2 ),
    ( [ ("XX-XX--","-XX-XXX","X-XX--X","X--X-X-","XX-X-XX","-X-XX-X","-XX-XX-",) ], 3 ),
    #Your custom test goes here:
    #( [ ], None ),
]

def isTestDisabled(i):
    return False

#-------------------------------------------------------------------------------
if __name__ == '__main__':
    import sys, tester
    tester.run_tests(
        HexagonalBoard, 'minColors', TEST_CASES, isTestDisabled, 
        1381033582, 250, CASE_TIME_OUT, True
    )
# CUT end

look at, basically the only part code that is not useful for the problem is the call to the tester.run_tests(). Everything else - The problem class that will be sent to topcoder, the TEST_CASES array that allows us to add test cases and the isTestDisabled() function that allows test cases to be disabled - Is very useful to have in the problem's code file.

How does run_tests work? First, it needs to receive the TEST_CASES list and the isTestDisabled function. But we can also send it the HexagonalBoard class. Python is quite dynamic, so it can even instantiate objects from a variable that determines what class. minColor is between quotes, it is the name of the method. After instantiating HexagonalBoard() it needs to call the method, we use a trick like:

getattr(instance, methodName)( arguments list)

to use the method name to call the method from an instance.

The input / arguments list we are supposed to send this method. Python stores them in a list. And no matter the list, we can uncompress it when sending this group of arguments to the method by using the * python operator:

getattr(instance, methodName)( *some_list)

Since the python code was getting so small, I tried to work more in the c++ one. The first candidate for cleaning was the output struct. It is a struct that holds the return type, a boolean to say that there is no return value and two constructors:

struct output {
    int value; bool u=false;
    
    output(int value) { this->value = value; }
    output() { u = true; }
};

We can actually make this thing a template. Being a template we need the code to exist in the generic tester instead of being specific to the problem:

template<typename T> struct output {
    T value; bool u=false;
    
    output(T value) { this->value = value; }
    output() { u = true; }
};

The real code bloat came from the input struct though:

struct input {
    vector<string> board;
     
    int run(double & elapsed) {
        StringPath* instance = new StringPath();
        time_t s = clock();
        int r = instance->countBoards( this->board );
        elapsed = (double)(clock() - s) / CLOCKS_PER_SEC;
        delete instance;
        return r;
    }
 
    void print() {
        cout << "[" << Tester::pretty(board) "]";
    }
};

What complicates things in c++ is that the number of arguments the method needs might vary. I think it might be possible to do this all with a variable template, however, maybe the reason tuples use explicit constructors is because of a limitation in the variadic-ness? Also, it sounds very complicated to do it. So I looked for other ways to reduce the footprint.

The print() method exists because of the variable number of arguments. So the part of the generic tester that needs to print the arguments can't do it alone. However, there was too much code in the print() method. Also, it determined the format the arguments are printed instead of letting the generic tester do it. I managed to create a variadic template that prints any sequence of arguments of any type, following the tester's format:

/*==============================================================================
  Converts given argument into a string containing pretty-looking output
*/
string pretty(string s)
{
    return "\"" + s + "\"";
}
string pretty(double x)
{
    ostringstream s;
    s.precision(10);
    s << x;
    return s.str();
}
template <typename T> string pretty(T t) {
    ostringstream s;
    s << t;
    return s.str();
}
template <typename T> string pretty(vector<T> t)
{
    ostringstream s;
    s << "{";
    for (int i=0; i<t.size(); i++) {
        s << (i?",":"") << t[i];
    }
    s << "}";
    return s.str();
}

/*==============================================================================
  Prints a test case's argument list
  
  printArgs(a,b,c) ->  cout << "[" << pretty(a) <<","<< pretty(b) <<"," <<  pretty(c) <<"]"
*/
void printArgsCont() {}

template<typename T, typename... Args> void printArgsCont(T value, Args... args)
{
    cout << "," << pretty(value);
    printArgsCont(args...);
}

template<typename T, typename... Args> void printArgs(T value, Args... args)
{
    cout << "[" << pretty(value);
    printArgsCont(args...);
    cout << "]";
}

Variadic templates are just great life savers. Now the print() method in the output struct looks like this:

void print() { Tester::printArgs( /*list of comma-separated arguments*/ ) ; }

The larger code usage in input struct was the run method. It also implemented tons of things that should really belong to the tester code, like measuring time or constructing the problem class. Well, it turns out that since the constructor for the problem class has no arguments, the tester can instanciate the problem class (we would need it to be a template argument). The new run method can just take a pointer to the class instance. It just needs to call the method. It is also not necessary to measure time inside the run() method, the tester can measure the amount of time the run() method takes, the overhead would be minor.

struct input {
    /*variables that hold each of the arguments*/;

    int run( /*ProblemClass*/ * x) {
        return x->minColors( /* comma - separated arguments */ );
    }
    void print() { Tester::printArgs( /* comma - separated arguments */ ); }
};

Problem name

At one point the tester received the problem name both as a string and as a template argument. Was it possible to make it receive it just as a template argument? There is something called typeid(T).name() in c++. Unfortunately, it seems to return some random numbers before the class name in gcc. It turns out we can fix it...

/*==============================================================================
    Returns the name of a type in human-readable form. It likely only works 100%
    correctly with classes that don't belong to a namespace and is not very
    portable.
*/
template<typename T> string getTypeName()
{
    string s = typeid(T).name();
    int i= 0; 
    while ( (i < s.size()) && ('0' <= s[i] && s[i] <= '9') ) {
        i++;
    }
    return s.substr(i);
}

What if?

The issue with the input struct is that it has to store the arguments. What if an argument is called input? That would cause some compile errors. A fix is to call the struct _input, it is unlike TopCoder would name arguments starting with _. We can also just add _ to all the argument names. But a method that also reduces the amount of code is to name the arguments inside the input struct as just (p0, p1, p2, ...). Unfortunately, greed 1.5 didn't have an option to get the number / index of an argument. I actually had to do a pull request to add this . The next version of greed should have the feature, but you can also just get the latest source code from github and compile it. Or modify the test code template I'll attach soon so that it uses a _input struct

The result

This is how the c++ test code generated for HexagonalBoard looks like:

#include <bits/stdc++.h>
// SRM 593 - HexagonalBoard (250 points) by vexorian
using namespace std;

struct HexagonalBoard
{
    int minColors(vector<string> board)
    {
        return 0;
    }
};

// CUT begin
//------------------------------------------------------------------------------
const double CASE_TIME_OUT = 2.0;

bool disabledTest(int x)
{
    return 0 * x;
}
template<class I, class O> vector<pair<I,O>> getTestCases() { return {
    { { {"---","---","---"} }, {0} },
    { { {"-X--","---X","----","-X--"} }, {1} },
    { { {"XXXX","---X","---X","---X"} }, {2} },
    { { {"XX-XX--","-XX-XXX","X-XX--X","X--X-X-","XX-X-XX","-X-XX-X","-XX-XX-"} }, {3} },
    // Your custom test goes here:
    //{ { {}}, {} },
};}

//------------------------------------------------------------------------------
// Tester code:
    #include "../tester.cpp"
    struct input {
        vector<string> p0;

        int run(HexagonalBoard* x) {
            return x->minColors(p0);
        }
        void print() { Tester::printArgs(p0); }
    };
    
    int main() {
        return Tester::runTests<HexagonalBoard>(
            getTestCases<input, Tester::output<int>>(), disabledTest, 
            250, 1381061147, CASE_TIME_OUT, true
        );
    }
// CUT end

The extra code that is not solution and not test cases is very small now. I wonder if this is the smallest it can get. Probably not if it is possible to do the input with variadic templates.

Tweaks to the tester

While modifying the auto-generated codes, I also tweaked the tester. Remember the dilemma between compact and not compact code? I think the appeal of the compact code is as a report. This way things like output results kind of pollute the report. I turned the whole thing into a hybrid. First it runs each test case and the output shows everything including parameter lists and results and the test cases are separated by a big part. At the end though, a report is shown. The report has just the results and a debug mesage won't clip it. It looks like this:

(I also found out a perfect character and format to represent a disabled case (test 5 in the image is disabled).

By modifying the boolean value in the call to the tester, you can switch to the old format...

Get stuff

Saturday, October 05, 2013

Together we can send vexorian to the TCO 2013

Did you know that there was a video contest in TopCoder? There is, and the winner wins a trip to the TCO, and free trips to TCO are the best stuff you could ever win. If you didn't know about the contest, well, today is the deadline, so maybe it is not a good idea to find out through this post. I only found out yesterday while lurking in the forums.

Submitting a video is actually quite embarrassing, so I hesitated to try it. Eventually (tonight) though I said to myself: sure why not? I decided to give it a try. A little issue with that is the total lack of a camera or video editing skills or creativity or capability to humor.

But anyway here is the result:

The winner will be the one with the most likes in the video (*cough* vote *cough*).

SRM 593: Meh

The one with hexagon grid

You are asked to color some hexagons in an hexagon grid using the least number of colors in such a way that no two adjacent hexagons have the same color.

I messed up during the match, first I thought that the maximum number of colors to color it was 4. It turns out it was 3, which I found out too late. Under the assumption it was 4 I wasted some precious minutes. What is really sad is that I actually looked at this page: http://en.wikipedia.org/wiki/Hexagonal_tiling#Uniform_colorings, in the first minute after opening the problem and I still didn't notice.

Once I knew the maximum result was 3, then you have to verify if it is possible to color using 0, 1, or 2 colors.

I made the assumption that this depended only on the degree, which was wrong. If there are no hexagons to color, the result is zero. If the maximum degree (maximum number of must-paint hexagons adjacent to another must-paint hexagon) is 1 or 2, then we can paint them using 2 colors (because they form a collar or a line). My mistake was in assuming that degree = 3 meant that you needed 3 colors. There is a specific case for which this isn't true:

-X-
-XX
-X-

Too bad I didn't find it, I actually tried a few hexagons with degree 3 and convinced myself that the adjacent hexagons would always share an edge. If I didn't make this blunder, I would have just done a bipartite check. If the graph is bipartite, then you can paint using 2 colors. That's the solution.

The one with teams

This one was 450 but I really couldn't break its secret. I was trying a dynamic programming, but it wasn't working at all. Then the match ended.

Challenge phase

I knew that my 250 was slow, but I was also pretty sure it was tricky (I even prepared some cases, because there were no example cases in which the maximum degree was 1, for example). But I didn't have that much luck. The codes were very complicated to read. I did find one solution that wasn't even relying on degrees but on partial degrees. So I challenged it, and was happy, because the 50 pts would fix the slow submission. So I went to see the room summary and boom: My solution was already challenged. Probably if I noticed about my solution getting challenged I wouldn't have risked making that challenge...

Comments?

I liked the 250 until I learned checking for bipartiteness was mandatory. Too messy for the slot. 450 said "difference" when it should have specified absolute value of the difference, this made me waste time coding a wrong solution.

Friday, October 04, 2013

More answers to Quora questions.

Weeks ago I wrote a post: vexorian answers quora questions. It was a fun blog post to wire it, so I am going to answer some more:

What are the things you wish you knew when you started programming for the first time?

In general:

  • Speed sucks. There will be some situations in which speed matters, and it is good to have a language that is not like super ultra slow. But unless you have very specialized applications, low level speed should never be your primary objective. (high level optimization is cool though)
  • You'll spend more time reading your code than typing it. So make it readable for god's sake.

For "Competitive programming" :

  • There is no money in participating in contests. There is good money in working behind the scenes though.
  • You did not solve a problem until you code a passing solution. No, you didn't. No, just knowing what the solution is is not the same as solving the problem.
  • It is pointless to spend weeks trying to come up with solutions on your own. If you can't solve a (practice) problem after some hours, search for the solution.
  • Similarly, it is pointless to look for answers immediately after reading a problem's statement. Always give yourself a chance to at least try solving it.
  • You need a plugin to code in TopCoder. Really, you do.

(topcoder) Do you use the Coding Area to code your solution or a separate IDE?

IDEs are the spawn of Satan, but the arena is very limited. You need a plugin. Until recently I've been using a modded KawigiEdit. Lately I am using Greed. I use jEdit to edit code. Each programmer needs a specific set of editor features and in my case I prefer to have fewer features as big editors "get in the way".

How should I practice so that I will be at a level where I can approach TopCoder's Div1-500 problems with confidence?

Sink or swim. What I did when I noticed I was getting stuck in division 1 easy difficulty was to open the div 1 500 first. This way I was forced to solve a div1 medium every once in a while or else my rating would be destroyed. Plus I eventually got used to at least being able to try this slot. I know that this doesn't help your issue about [confidence], because you'll need to at least once try to solve a div1 500 during a match knowing that you don't have a confidence to solve them. But you don't really need the confidence.

Difficulty is not static though, with time problems became tougher and I have had to repeat the effort more than once.

Is it better to learn algorithms from a book or by solving online problems?

I've been meaning to say something about the Books topic in this blog for a while now. Books have not really helped me that much. Whenever people ask me about books, I bring the two closest things to a book that helped me, like Programming Challenger and CLRS. But the other day I learned that I don't even know the author's name for Programming Challenger.

In my case, I think that reading online resources (Editorials, tutorials, howtos, wikipedia pages, etc) and asking questions in forums was far, far more useful than books. Any good contest site should have a resource like that. Try to solve a problem, then after the contest, ask how to solve it, read editorials and any material suggested by people explaining the problem. But if you like getting theory from books, feel free to do it.

Why do a lot of successful competitive programmers not participate actively on CodeChef but participate often on sites like TopCoder and Codeforces?

I don't know about successful coders, but I can speak for myself.

  • I don't have time to participate in all contests released in a month, so I have to put priorities.
  • I choose TopCoder as my first priority, because the 75 minutes format is a lot more comfortable than longer format, and because I would likely have to solve those problems anyway because of my job.
  • Second priority is Codeforces because it is has a very competitive userbase. Although it sort of bores me some times.
  • Why Codeforces and not Codechef? The Codechef format is not helpful. Spending 15 days on a contest would really kill me and leave me with no time for other things. The cook-offs sound ok, but they happen only once a month, and on Sundays, and at a very bad time of the day for me.

How should programmers place their hands on the keyboard in order to increase the typing speed

Uh? Unless you have serious difficulty typing at a decent pace, you shouldn't care that much about your typing speed. Other factors are more important, like thinking of the algorithm you need to code.

If you do lack typing skill (For example, you need to see the keyboard while you are typing), then there are good resources to learn typing out there in the web and practice makes best. typeracer offers both a place where you can practice typing competitively and the forum is a good entry point to the part of the internet that cares about typing speed and has tips/etc.

I learned to type in school around 18 years ago. They made do typing in typewriters, if you got something wrong, you had to restart over. Yeah.

Do you find skills you achieved during programming contests to be useful outside of them?

I personally wouldn't have learned C++ nearly as well as I had without programming contests. Programming contests give you a very good opportunity not just to code solutions to known problems. but to make mistakes when coding those solutions. Because the problems are well-defined, we can run your solutions against a fixed-tester and find a plethora of implementation bugs. Constantly. This will teach you about what things can fail, what pitfalls your language has. What sort of mistakes you are likely to make.

Rant: My god, look at the answers to this question. Read this: Regarding the smug opposition to programming contests. Specially the anonymous guy who thinks adding a competitive layer to programming is to "prostitute" programming.

Programming contests are fun and that should be your first priority. There are, however, some nice skills you can learn through them. It is true that they are skills you can learn in other ways like making contributions to open source software, the demoscene, modding, making your own apps, building robots, etc*. To each their own, if you have more fun at competitive programming than in those areas, then programming contests are a good way for you to develop these skills. The trick to programming is to pick an area you are passionate about and develop your skills there. It is incredibly stupid to be afraid to pick areas of programming that do not have the approval from random people in question sites...

(*But of course not college! lol!. I am talking about skills that require creative coding ability, things college cannot really teach (sorry!) ) Speaking of college, while a good college should be better at teaching you CS theory, the programming contest world is probably a better place to learn CS theory than 99% of colleges.

What do you do when your code is not accepted but you don't find a test case that makes it fail?

If you can't really find any test case, then you are wasting your time solving problems in a judge site that doesn't share test cases. That is very bad for your productivity. Avoid those sites like the plague they are. While there are urban myths about how making you find the wrong test case by yourself will make you better at programming, you should consider that transparency is very importance. Have you stopped to think, what if? What if your code is actually correct? What if the judge was wrong? What if that's the reason you can't find a wrong test case? Transparency = guarantees. If somebody thinks he will get better by guessing the test case he is failing, he can just not take a look at the test cases.

If your problem is different and you actually know that your code is failing a case, but it is a very complicated one so it doesn't tell you much. Then you'll need to get creative.

  • First of all, why is it that apparently none of the simple test cases was wrong? Maybe your code has issues that are specific to large test cases? E.g: A memory write out of bounds of an array could be corrupting your memory. E.g.2: Overflow.
  • In topcoder, sometimes some large cases of the system tests are first in order to other, simpler cases. But since system tests finish running once they find a wrong test case, you cannot find out the simpler cases. What I do sometimes is manually pre-code the results to each complex case until I find a simple case that executes after them and is also wrong.
  • Read editorial / tutorial and try solving with the solution described in there . If the solution in the editorial is the same as yours then you probably need to inspect each line for possible errors. That is sometimes all I can do, verify each line, sometimes multiple times. Sometimes I fall sleep while doing it.
  • Ask in the forums. But please, help us help you. You need to explain your solution and why do you think it should be correct. If you don't have a proof that your solution should work, then it probably won't work. Probably its bugs are not small implementation mistakes but the whole algorithm is wrong. Sometimes it is better to just redo the work. If you have been having issues proving a solution is correct, then maybe you should try to prove it is wrong. Unfortunately this involves finding counter-examples and that's exactly what you are looking for.
  • If everything fails: Grab somebody else's correct code and generate random, small, test cases until you find one that makes your solution give a wrong solution. Maybe you will finally find something.
  • If that fails, give up.

Why is Python not allowed as one of the languages for ICPC?

I'd say some tournaments just can't keep up :/

How can I print 1 to 100 in C++ without a loop, GOTO or recursion?

Lambda power (requires c++11):

#include<functional>
#include<iostream>
using namespace std;

int x = 0;
void q()
{
    if (x != 100) {
        cout << (++x) << endl;
    }
}

// t(q())    returns a function that calls q() two times.
// t(t(q())) returns a function that calls t(q()) two times == q() four times
// ...
function<void()> t( function<void()> g) {
    return [&g] {
        g();g();
    };
}

int main()
{
    // call q() 128 times:
    t(t(t(t(t(t(t(q)))))))();
    return 0;
}

Laziness power:

#include <iostream>
using namespace std;

int main()
{
     cout << "Please type all numbers from 1 to 100 in a single line:"<<endl;
     string x;
     getline(cin, x);
     cout << x; //If user does his/her job correctly, this statement will print the numbers
     return 0;

}

Codeforces #204 (div 1) Recap

So, I could only solve 2 problems and I am not entirely sure of them.

Problem A - The one with rounding

Link to problem statement

The first thing is to notice that all numbers are guaranteed to have exactly three decimal points. From now on. we can just care of the decimal part (x[i] will be number from 0 to 999), and minimize the absolute value of the difference.

  • If x[i] is 0 (So it was a number that ends with .000, like 12.000), then no matter what we do, it will contribute 0 to the difference.
  • Else if we choose to round down the number, the difference increases by (-x[i]) / 1000.0 .
  • If we choose to round up, then the difference changes by (1000 - x[i]) / 1000.0.

n numbers have to be rounded up. If i of those numbers are zero, then the total difference will be: (1000 * (n - i) - sum(x[i]) ) / 1000.0. In other words, the only thing that determines the difference is the number of zeros that we round up. We can just brute force for this number and pick the best difference we find

// input:
string a[4000];
int n2; // n in the statement

// other:
int x[4000];

double solve()
{
    int n = n2 * 2;
    for (int i=0; i<n; i++) {
        istringstream st( a[i] );
        int z=-1, f=-1; char ch;
        st >> z >> ch >> f;
        x[i] = f; // only the modulo , the decimal digits matters
    }
    int res = 5000000;
    int z = count(x, x + n, 0);     // number of zeros
    int sum = accumulate(x, x+n, 0); // total sum
    
    // how many zeros will be ceil?
    for (int i = 0; i <= z && i <= n2; i++) {
        res = min(res, abs( (n2 - i) * 1000 - sum ) );  
    }
    // divide result by 1000
    return res / 1000.0;
}

During the match I first tried a badly-thought greedy algorithm. I even tested it comparing it with a bruteforce algorithm before submitting, but it still failed, apparently the bug with the lame greedy needed very specific input. But while coding the greedy algorithm I noticed the -x[i] and 1000-x[i], which helped me solve the problem correctly (assuming it passes)

Problem B - The one with permutations

Problem statement

The first thing I noticed was that result for the second example was a integer, that is too much of a coincidence. Somehow I decided that maybe the result is equal to (number of steps needed to sort the permutation without the random stuff) * 2 - 1. It turns out the estimation was wrong, but the idea was almost correct (I think).

Anyway, assume you have to do X steps in order to solve it correctly. After you make a step, your friend will throw a coin, and there is a 50% probability he will help you and a 50% probability he will make things worse. It appears to me that he helping you will always decrease the number of required steps by 1, and he not helping you will always increase the number of required steps by 1 (This in addition to the step you performed). The result is the following recurrence, 'f(x)` where x is the number of required steps.:

`f(0) = 0`
`f(1) = 1`
`f(x) = 2 + f(x-2)/2 + f(x) / 2`
`f(x) = 4 + f(x-2)`

We can just fill the recurrence iteratively.

int n;
int p[3000];


double solve()
{
    //a) cost to sort
    int res = 0;
    for (int i = n - 1; i >= 0; i--) {
        pair<int,int> mx = make_pair(-1, -1); 
        for (int j = 0; j <= i; j++) {
            mx = std::max(mx, make_pair(p[j], j));
        }
        res += i - mx.second;
        for (int j = mx.second; j < i; j++) {
            p[j] = p[j+1];
        }
        p[n-1] = mx.first;
    }
    // b) Recurrence:
    double dp[3];
    dp[0] = 0;
    dp[1] = 1;
    for (int i = 2; i <= res; i++) {
        dp[i % 3] = 4 + dp[ (i + 1) % 3 ];
        cout << i << " ; " << dp[i % 3] << endl;
    }
    
    return dp[res % 3];
}

Problem C : The one with parenthesis and cycles

Statement

I didn't have much time to code it (less than 10 minutes), but I think I found a solution.

It appears to me that the best parenthesis sequence has to be cyclic because the cost function is cyclic. When n is even, it should be of length n, when n is odd, it should be of length 2n. Since m is guaranteed to be even, we can half it.

So we just need to generate the cyclic parenthesis sequence. The cost of the sequence is easy to find because the costs are cyclic. Constraints are large in the case n is odd. But I am pretty sure we can use a meet in the middle approach so that the process is `O(2^n)`.