For this Haskell cicada I’m dusting off a small problem that I used long time ago to practice Scala:

Given is a list of people at a party and for each person the list of people they know at the party. We want to find the celebrities at the party. A celebrity is a person that everybody at the party knows but that only knows other celebrities. At least one celebrity is present at the party.

Our goal is to find a useful property that will let us distinguish between celebrities and normies and the concept of “peer” will be useful for this distinction. But first lets model our party as a directed graph $G = (V, E)$. The vertices $V$ are the people at the party and the graph has an edge $(u, v) \in E$ iff person $u$ knows person $v$.

We say $u$ and $v$ are peers if they know each other, ie $(u, v) \in E$ and $(v, u) \in E$.

Let $K(u) = \{v \in V: (u, v) \in E\}$ be the set of people that $u$ knows. Let $P(u) \subset K(u)$ be the set of peers of $u$ and let $N(u) = K(u) \setminus P(u)$ be the set of non-peers of $u$.

Theorem 1: If $N(u)$ is not empty then $u$ is a normie.

Proof: If $N(u)$ is not empty then there exists a $v \in N(u)$ which means that $v$ doesn’t know $u$. So $u$ cannot be a celebrity.

Theorem 2: If $u$ is a normie, then all celebrities are in $N(u)$.

Proof: Since we are talking celebrities, $u$ must know them all (by definition of celebrity). But since $u$ is a normie, none of the celebrities can be a peer of $u$. So they are all in $N(u)$.

Armed with these two theorems we can form an algorithm: pick any person, compute their set of non-peers. If it is empty, we got lucky and picked a celebrity and we are done (the chosen person and their peers are the celebrities). If the set of non-peers is not empty then we have a normie at our hands and we can look for celebrities in its set of non-peers. Eliminating $u$ and their peers from the graph allows us to continue the search recursively on a smaller problem. We know the algorithm has to terminate and find the celebrities because we know that at least one celebrity is present.

So let’s do it in Haskell.

We need a directed graph and for this problem an adjacency list representation seems appropriate:

import Data.Array

type Vertex = Int

type DiGraph = Array Vertex [Vertex]

We’ll want a predicate that tells us if a person $u$ is known by another person $v$, ie $(v, u) \in E$:

knows :: DiGraph -> Vertex -> Vertex -> Bool
knows gr u v = elem u (gr!v)

We did it backwards because we will use currying to define a predicate to filter peers and non-peers of $u$.

Speaking of filtering lists according to predicates: we want this functionality so we might as well make it general:

partition :: Eq a => (a -> Bool) -> [a] -> ([a], [a])
partition p xs = foldr pf ([], []) xs
    where 
        pf x (ys, zs) = if (p x) then ((x:ys), zs) else (ys, (x:zs))

The function partition takes a predicate and a list and returns a tuple of lists: the list of elements satisfying the predicate and the list of elements not satisfying the predicate.

We use this function to split the set of people known to $u$ into peers and non-peers:

peersAndNonPeers :: DiGraph -> Vertex -> ([Vertex], [Vertex])
peersAndNonPeers gr u = partition (knows gr u) (gr!u)

We can now define the recursive algorithm outlined above:

findCelebsThru :: Vertex -> DiGraph -> [Vertex]
findCelebsThru u gr = if (null nprs) then (u:prs) else findCelebsThru (head nprs) gr
    where
        (prs, nprs) = peersAndNonPeers gr u

and the wrapper which picks a starting person for our search:

findCelebs :: DiGraph -> [Vertex]
findCelebs gr = findCelebsThru u gr
   where
    u = (fst . bounds) gr

That’s it. Worst case runtime is $O(n^2)$ because we can imagine a conga line of people at a party where everybody in the line only knows the person in front of them and the person at the head of the line. Partition is what contributes to the squared runtime. A simple observation here lets us convert to a linear runtime: we don’t need the peers and we can stop partitioning at the first non-peer we find (this optimization is left as an exercise to the reader).