This Haskell cicada was a Google Codejam competition problem way back in 2012 Codejam Round 1B 2012: Safety in Numbers. No, I did not participate. I’m not very good and I’m also not very fast, two things needed in coding competitions. I do like these competitions as a source of problems to tackle. .

Problem

There are nn contestants in a reality TV show. Each contestant is assigned a point value by the judges and receives votes from the audience. The point value given by the judges and the audience’s votes are combined to form a final score for the contestant, in the following way:

Let ss be the sum of the judge-assigned point values of all contestants. Now suppose a contestant got sis_i points from the judges, and that they received a fraction αi\alpha_i (between 00 and 11, inclusive) of the audience’s votes (αi\alpha_i might be, for example, 0.30.3). Then that contestant’s final score is ci=si+αisc_i=s_i+\alpha_i s. Note that the sum of all contestants’ audience vote fractions must be one.

The contestant with the lowest score is eliminated.

Given the points contestants got from judges, your job is to find out, for each contestant, the minimum percentage of audience votes they must receive in order for them to be guaranteed not to be eliminated, no matter how the rest of the audience’s votes are distributed.

If the lowest score is shared by multiple contestants, no contestants will be eliminated.

Solution

We have the popular vote distributed over the nn candidates. Each candidate gets fraction αi[0,1]\alpha_i \in [0, 1] of it, and the sum of the fractions add up to one.

i=0n1αi=1 \sum^{n-1}_{i=0} \alpha_i = 1

We also have the score from the judges. Each candidate gets score sis_i from the judges and

s=i=0n1si s = \sum^{n-1}_{i=0} s_i

is the sum of all the judge’s scores.

The final score of a candidate is ci=si+αisc_i = s_i + \alpha_i s.

For cic_i to be eliminated, it needs to be smaller than every cj,jic_j, j \ne i.

The interesting thing about this problem is that the final score of each candidate depends on a distribution of the total judge score according to audience votes. It means that the final scores are not independent of each other.

It seems useful to sort the candidates by the score they got from the judges:

s0s1sn1 s_0 \leq s_1 \leq \ldots \leq s_{n-1}

For a given candidate ii we need to figure out for each possible value of αi\alpha_i if there exists an audience vote distribution of the remaining 1αi1 - \alpha_i that would eliminate candidate ii. The possible values of αi\alpha_i are in the range [0,1][0, 1].

We define the following function for each candidate:

fi(αi)={1 candidate safe with αi0 candidate not safe with αi f_i(\alpha_i) = \begin{cases} 1 & \text{ candidate safe with } \alpha_i\\ 0 & \text{ candidate not safe with } \alpha_i \end{cases}

A candidate ii is safe with αi\alpha_i iff there is no distribution of the remaining audience votes 1αi1 - \alpha_i that would eliminate candidate ii. It seems we should consider the worst case distribution of the remaining audience votes. If the worst case doesn’t eliminate the candidate, then they are safe. The candidate score is ci=si+αisc_i = s_i + \alpha_i s. Let kk be the index such that

sk1ci<sk s_{k-1} \leq c_i < s_k

We define s1=s_{-1} = - \infty and sn=s_n = \infty for the two cases where cic_i is smaller than all judge scores or bigger than all judge scores. The worst case then is distributing the remaining 1αi1 - \alpha_i audience votes in such a way to lift the candidates 0,1,k10, 1, \ldots k-1 over candidate ii. It is the worst case because the candidates k,,n1k, \ldots, n-1 already have bigger total scores than candidate ii no matter the remaining distribution, so using it fully to lift over the others is the worst case. We get kk inequalities:

sj+αjs>ci,0j<k s_j + \alpha_j s > c_i, 0 \leq j < k

The worst case is if all remaining audience votes go to these 0,1,k10, 1, \ldots k-1 candidates:

j=0k1αj=1αi \sum_{j=0}^{k-1} \alpha_j = 1 - \alpha_i

We add left and right sides of the inequalities and get:

j=0k1sj+(1αi)s>kci \sum_{j=0}^{k-1}s_j + (1 - \alpha_i) s > k c_i

This gives us a criteria to compute fi(αi)f_i(\alpha_i) for each value of αi\alpha_i:

fi(αi)={1 if j=0k1sj+(1αi)skci0 if j=0k1sj+(1αi)s>kci f_i(\alpha_i) = \begin{cases} 1 & \text{ if }\sum_{j=0}^{k-1}s_j + (1 - \alpha_i) s \leq k c_i\\ 0 & \text{ if }\sum_{j=0}^{k-1}s_j + (1 - \alpha_i) s > k c_i \end{cases}

fif_i is a step function. Once it reaches one, it stays there because increasing αi\alpha_i decreases the remaining audience votes so there is even less of a chance to lift the losing candidates past candidate ii using audience votes.

The step can occur inside or outside interval [0,1][0, 1]. If it is outside then candidate ii is either always eliminated or always safe, no matter the audience votes.

Let’s write some code. Before we can compute fi(αi)f_i(\alpha_i) we need to compute the index kk where cic_i falls in the sorted sequence of judge scores sjs_j. The sequence is sorted, so we can use binary search. We might as well write a more general binary search, that finds the index at which a given predicate changes from False to True.

binarySearch :: (Int->Bool) -> Int -> Int -> Int
-- Preconditions:
-- predicate p has to be defined on domain [low, high)
-- low < high
-- predicate p has to satisfy: p(i) => (\forall low <= j < i: not p(j))
--                                        and
--                                      (\forall i <= j < high: p(j))
-- Return:
-- returns smallest index i with p(i),
-- if all values i in domain result in not p(i), then returns high
binarySearch p low _ | (p low) = low
binarySearch _ low high | (low + 1 == high) = high
binarySearch p low high = if (p m) then binarySearch p low m
                                   else binarySearch p m high 
        where m = low + (high - low) `div` 2

For the step function itself we need to consider the fact that the search space is continuous because it is the interval of real numbers [0,1][0, 1]. We can still use binary search but instead of an exact result we have to accept an approximation: we stop when we narrowed the interval where we know the step is to a desired precision.

findStep' :: Double -> (Double -> Bool) -> Double -> Double -> Double
findStep' d _ low high | (high - low) < d = (high - low) / 2
findStep' d p low high = if (p m) then findStep' d p low m
                                  else findStep' d p m high
         where m = low + (high - low) / 2

findStep :: Double -> (Double -> Bool) -> Double
-- Preconditions:
-- predicate p has to be defined on domain [0,1]
-- predicate p has to satisfy: p(i) => (\forall low <= j < i: not p(j))
--                                        and
--                                      (\forall i <= j < high: p(j))
-- returns x such that y \in (x - d, x + d) with y = infinum{z: p(z)}
findStep _ p | (p 0.0) = 0.0
findStep _ p | (not (p 1.0)) = 1.0
findStep d p = findStep' d p 0.0 1.0