Thursday, November 28, 2013

SRM 598: Dirty tricks

Another week, another Topcoder SRM.

Div1 250: The one with bins.

We have up to 50 items with weights `100 <= item[i] <= 300`, and we have to put these items in bins that have a maximum capacity of 300, which means that the total weight of the items in a bin cannot exceed 300. Return the minimum number of bins we need to store all items.

It is unusual for an input to have a lower bound. Each item's weight is at least 100, which means that a bin can hold at most 3 items. Moreover, if a bin ever holds 3 items, all 3 of them must have weight exactly 100. So we can treat the weight-100 items separately.

There is a case, `{100, 100, 100, 100, 100, 100, 200, 200, 200}`, that shows that the optimal strategy is not always to put the 100-weight items in as many 3-item bins as possible: pairing each 200 with a 100 and putting the remaining three 100s together uses 4 bins, whereas making two 3-item bins of 100s forces 5. However, the number of bins holding 3 items is small, at most 50/3, so we can just try every possible number of 3-item bins: 0, 1, 2, ..., `h/3` (where `h` is the number of items of weight 100). Once that number is fixed, every remaining bin contains at most 2 items, and we should maximize the number of pairs of items with total weight <= 300.

#include <vector>
#include <algorithm>
using namespace std;

int optimal2(vector<int> item)
{
    // solve if a bin can only hold at most 2 items.
    int cost = item.size();
    sort(item.begin(), item.end());

    // Each bin can now have at most 2 items
    int i = 0, j = (int)item.size() - 1;

    // Maximize the number of pairs of items that can enter a single bin: 
    while (i < j) {
        // simply match the smallest item with the largest item possible:
        while ( (j > i) && (item[j] + item[i] > 300) ) {
            j--;
        }
        if ( (j > i) && (item[i] + item[j] <= 300) ) {
            cost--;
            i++;
            j--;
        } else {
            break;
        }
    }
    return cost;
}

int minBins(vector<int> item)
{
    vector<int> not100;
    int count100 = 0;
    for (int x: item) {
        if (x != 100) {
            not100.push_back(x);
        } else {
            count100++;
        }
    }
    // It is possible for a bin to hold 3 items, but they must all be 100.
    int res = item.size();
    for (int a = 0; a <= count100 / 3; a++) { // number of bins that hold 3 items
        // the rest of the bins can hold at most 2 items:
        vector<int> tem = not100;
        for (int i = 0; i < count100 - a * 3; i++) {
            tem.push_back(100);
        }
        int cost = a + optimal2(tem);
        res = std::min(res, cost);
    }
    
    return res;
}
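
For the tricky case above, `minBins({100,100,100,100,100,100,200,200,200})` returns 4: three bins with a 200 and a 100 each, plus one bin with the remaining three 100s, instead of the 5 bins that greedily filling 3-item bins of 100s would require.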

I made plenty of mistakes while coming up with this solution. I coded a wrong solution that failed test cases (the one I mentioned above). Then I coded another wrong solution that passed all the example cases, but I was very suspicious of it and eventually noticed its mistake. I had the feeling this problem was going to have many wrong solutions, and I was right. Unfortunately, during the challenge phase I remembered that I am unable to read other people's code. Many of the solutions seemed overly complicated and possibly wrong, but I had no idea how to find a good case against them.

Div1 550

It seemed complicated, so I eventually decided to test my luck, skip the div1 550, and open the hard problem.

Div1 950: The one with cities

You have a country, a set of n (at most 50) cities. The cities have connections, and these connections make the graph of cities a tree (how suspiciously convenient). The plan is to put beacons in some of the cities so that the list of distances from a city to each of the beacons forms a sequence that uniquely identifies the city.

A basic `O(2^n)` solution would be to try each subset of cities, place beacons on exactly those cities, and pick the smallest subset for which the unique identification works. It occurred to me that with backtracking and some cleverness, maybe we could optimize this solution to run fast enough and give a correct answer at least most of the time. I expected it to be quite hard to make system tests for this problem.

So I wrote a backtracking algorithm. All cities begin with beacons on them, and for each city from 0 to n-1 we decide whether or not to remove its beacon. A beacon can only be removed if the remaining beacons still allow the identification. Note that when all cities have beacons, the identification works just fine, because each city is the only city at distance 0 from its own beacon.
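
The identification check is the workhorse here. A minimal sketch of it, with helper names I am making up (the real problem's interface differs): precompute all pairwise distances in the tree with a BFS from each city, then verify that every city's vector of distances to the current beacons is distinct.

#include <vector>
#include <queue>
#include <set>
using namespace std;

// dist[a][b] = number of edges on the path between cities a and b.  The graph
// is a tree, so a BFS from every city gives these distances.
vector<vector<int>> computeDistances(const vector<vector<int>>& adj) {
    int n = adj.size();
    vector<vector<int>> dist(n, vector<int>(n, -1));
    for (int s = 0; s < n; s++) {
        queue<int> q;
        dist[s][s] = 0;
        q.push(s);
        while (!q.empty()) {
            int u = q.front(); q.pop();
            for (int v : adj[u]) {
                if (dist[s][v] == -1) {
                    dist[s][v] = dist[s][u] + 1;
                    q.push(v);
                }
            }
        }
    }
    return dist;
}

// True if every city gets a distinct vector of distances to the beacon cities.
bool identifies(const vector<int>& beacons, const vector<vector<int>>& dist) {
    int n = dist.size();
    set<vector<int>> seen;
    for (int c = 0; c < n; c++) {
        vector<int> signature;
        for (int b : beacons) signature.push_back(dist[c][b]);
        if (!seen.insert(signature).second) return false; // duplicate signature
    }
    return true;
}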

The backtracking wouldn't run in time, but my dirty trick was to make the search stop after a high number of steps, so that we spend the 2 seconds of allowed time searching and return the best result found within that limit.
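
A sketch of what that looks like, reusing identifies() from the sketch above (the Backtracker name and the exact recursion are my own illustration, not the code I actually submitted):

#include <algorithm>
#include <vector>
using namespace std;

// Step-limited backtracking sketch.  Beacons start on every city; for each
// city in order we try both removing its beacon (when identification still
// holds) and keeping it, and the whole search gives up once the step budget
// runs out -- the "dirty trick".
struct Backtracker {
    const vector<vector<int>>& dist;  // all-pairs tree distances
    long long stepsLeft;              // hard step budget
    int best;

    Backtracker(const vector<vector<int>>& d, long long stepLimit)
        : dist(d), stepsLeft(stepLimit), best((int)d.size()) {}

    void go(int city, vector<int>& beacons) {
        if (stepsLeft-- <= 0) return;                 // stop early, keep best so far
        best = min(best, (int)beacons.size());        // current beacon set is always valid
        if (city == (int)dist.size()) return;
        // Option 1: remove this city's beacon if the cities stay identifiable.
        auto it = find(beacons.begin(), beacons.end(), city);
        if (it != beacons.end()) {
            beacons.erase(it);
            if (identifies(beacons, dist)) go(city + 1, beacons);
            beacons.insert(lower_bound(beacons.begin(), beacons.end(), city), city);
        }
        // Option 2: keep the beacon on this city and move on to the next one.
        go(city + 1, beacons);
    }
};

Starting it with beacons = {0, 1, ..., n-1} and a step limit tuned to stay within the time limit reproduces the "stop after a high number of steps" idea.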

That wasn't good enough: what if the optimal set of beacons to remove happens to consist of the last-indexed cities? My next idea was to make this an unlikely event: the algorithm would randomly relabel the input cities, lowering the chance that a crafted test case hits the bad ordering. This approach would still be wrong if there was a case that required one very specific set of beacons to be removed, but at least the probability of that would be low. I had one more trick in mind: spend half of the time on a greedy variant that gives priority to removing beacons from nodes with few edges.
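
Something along these lines for the random relabeling, with a hypothetical `solve` standing in for the step-limited backtracking; since the answer does not depend on how the cities are labeled, we can keep shuffling the labels and rerunning the solver until the time budget runs out:

#include <algorithm>
#include <chrono>
#include <functional>
#include <random>
#include <vector>
using namespace std;

// Keep relabeling the cities at random and rerunning a step-limited solver
// until the time budget is spent, keeping the smallest beacon count found.
int randomizedSearch(const vector<vector<int>>& adj,
                     function<int(const vector<vector<int>>&)> solve,
                     double timeLimitSeconds) {
    int n = adj.size();
    int best = n; // a beacon on every city always identifies them all
    mt19937 rng(12345);
    auto start = chrono::steady_clock::now();
    while (chrono::duration<double>(chrono::steady_clock::now() - start).count()
               < timeLimitSeconds) {
        // pick a random relabeling of the cities
        vector<int> perm(n);
        for (int i = 0; i < n; i++) perm[i] = i;
        shuffle(perm.begin(), perm.end(), rng);
        // build the tree with the new labels and solve it
        vector<vector<int>> relabeled(n);
        for (int u = 0; u < n; u++)
            for (int v : adj[u]) relabeled[perm[u]].push_back(perm[v]);
        best = min(best, solve(relabeled));
    }
    return best;
}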

Implementing the dirty tricks was not easy and took me quite a while, so much so that the only code I could submit was incomplete: it only did the randomized idea, and it still allocated only half of the execution time to it. The randomized search seemed to return a wrong result for a large case I had, so I sort of suspected it would fail either in the challenge phase or in the system tests.

Unfortunately, that didn't happen, and my very dirty brute force algorithm passed system tests. Good for my rating.

Comments?
