Introduction

In this Haskell cicada we dissect a clever algorithm for finding a cycle in a sequence of values generated by a function from a finite set to itself. I used this algorithm in a math note without further explaining it or its Haskell implementation.

Before we dive in, let us set the stage. Given is a finite set $S$ and a function

$$ f: S \mapsto S $$

We also choose a starting point $x_0 \in S$ and define the following sequence:

$$ \begin{aligned} x_0 & \\ x_1 & = f(x_0) = f^1(x_0) \\ x_2 & = f(x_1) = f(f(x_0)) = f^2(x_0) \\ x_3 & = f(x_2) = f(f(f(x_0))) = f^3(x_0) \\ \ldots \\ x_n & = f(x_{n-1}) = f^n(x_0), \forall n \in \mathbb{N} \end{aligned} $$

The preconditions that $S$ is finite and $f$ is a function are strong preconditions. Using them we can show that any sequence defined as above will have a (potentially zero-length) prefix before it ends in a cycle which it cannot escape anymore. The sequence draws the Greek letter $\rho$ when laid out as a functional graph. Speaking of a functional graph, we should introduce it because it is useful for the cycle finding problem: A functional graph of $f: S \mapsto S$ is a directed graph where the nodes are the elements of $S$. If $u, v \in S$ and $f(u) = v$ then the directed edge $(u, v)$ is an edge of the functional graph.

Here is an example: $S =\{🎀, 🎧, 🎷, 🎸, 🎻, πŸ₯, πŸ””, πŸͺ—, πŸ“»\}$.

$$ \begin{aligned} f(🎀) & = 🎷 \\ f(🎧) & = πŸͺ— \\ f(🎷) & = 🎻 \\ f(🎸) & = 🎷 \\ f(🎻) & = 🎸 \\ f(πŸ₯) & = πŸ”” \\ f(πŸ””) & = πŸ”” \\ f(πŸͺ—) & = 🎸 \\ f(πŸ“») & = πŸ”” \end{aligned} $$

The corresponding functional graph is

Canvas 1Layer 1🎀🎧🎷🎸🎻πŸ₯πŸ””πŸͺ—πŸ“»

The sequence starting with $x_0 = 🎧$ is $🎧πŸͺ—πŸŽΈπŸŽ·πŸŽ»πŸŽΈπŸŽ·πŸŽ»πŸŽΈπŸŽ·πŸŽ»\ldots🎸🎷🎻\ldots$.

Looking at just those nodes in the functional graph, we can see the Greek letter $\rho$ appearing.

Canvas 1backgroundgraph🎧🎷🎸🎻πŸͺ—letter

Let’s first prove that a cycle exists. Assume there is no cycle. We define the function $g: \mathbb{N} \mapsto S$ with $g(n) = f^n(x_0)$. If there is no cycle, then for any $n \neq m$ we have $f^n(x_0) \neq f^m(x_0)$, so $g$ is injective and $g(\mathbb{N}) \subset S$ is infinite which is a contradiction to $S$ being finite.

Once you’re in the cycle, there is no way out. That is because $f$ is a function, so every node in the functional graph has out-degree equal to one (by the definition of a function, $f$ associates one and only one element of $S$ to each element of $S$). No node in the cyce can point to the next node in the cycle and also a node outside of the cycle for an escape path.

Let $\mu$ be the index of the first element of the cycle, ie the smallest integer for which $x_\mu$ is in the cycle. Everything before $x_\mu$ is the little appendix of the letter $\rho$ in the functional graph. Let $\lambda$ be the length of the cycle, ie the smallest integer offset at which elements in the cycle repeat, so for example $x_\mu = x_{\mu + \lambda}$, but $x_\mu \neq x_{\mu + i}$ for any $1 \leq i < \lambda$.

We are ready to state our goal: given $S$, $f$ and $x_0$ we want to find $\mu$ and $\lambda$.

Floyd’s Algorithm

Pretend that we can find the value $x_{k \lambda}$ for some large enough $k \in \mathbb{N}$ such that it is in the cycle (remember, we don’t know $\lambda$ or $k$ but pretend that we somehow were able to find out $x_{k \lambda}$). How would that help us ? Assume we move two pointers in the sequence, one starting from $x_0$ and the other starting from $x_{k \lambda}$. Assume we move these two pointers one step at a time by applying $f$ to each.

So the tuple of pointers starts at $(x_0, x_{k \lambda})$. After the first step it will be $(x_1, x_{k \lambda + 1})$. After the second step it will be $(x_2, x_{k \lambda + 2})$ and after $i$ steps it will be $(x_i, x_{k \lambda + i})$. The following holds:

$$ \begin{align} x_i & \neq x_{k \lambda + i}, \forall 1 \leq i < \mu \\ x_\mu & = x_{k \lambda + \mu} \end{align} $$

The reason (1) holds is because by definition of $\mu$, it is the smallest index where the value is in the cycle, so nothing before it can be equal to any other value in the sequence. The reason (2) holds is by definition of $\lambda$, offset by any multiple of $\lambda$ lands you in the same spot in the cycle (convince yourself using the functional graph representation).

But (1) and (2) give us a great way to find $\mu$. Keep moving the pointer tuple until the two values in the tuple are equal. That gives us $\mu$ assuming we can find $x_{k \lambda}$ as we promised in the beginning of this section. How do we accomplish that ?

The key observation is this: $x_i = x_{k \lambda}$ if and only if $x_i = x_{2 i}$. To see this for the $(\Rightarrow)$ direction, offset the equation $x_i = x_{k \lambda}$ by $i$ (ie apply $f$ $i$ times to both sides of the equation) getting $x_{i + i} = x_{k \lambda + i}$. But on the left side of this new equation we have $x_{2 i}$ and on the right side we have $x_i$ (because a multiple of $\lambda$ gets you back where you are in the cycle, so $x_{k \lambda + i} = x_i$). For the $(\Leftarrow)$ direction we reverse the same equations.

This key observation lets us send two pointers into the sequence (the tortoise and the hare), both starting at $x_0$. At each iteration, one pointer (the tortoise) moves forward only one step in the sequence and the other pointer (the hare) moves forward two steps. So at each iteration the tortoise is at $x_i$ and the hare is at $x_{2 i}$. Whenever the values are equal we also know we found the value at a multiple of $\lambda$ index and we have fullfilled our promise of finding a $x_{k \lambda}$.

Ok, this gives us $\mu$ and from there it is simply walking the cycle to find $x_\mu$ again which gives us $\lambda$.

Imperative implementation

The imperative implementation is a straightforward translation of the strategy from the previous section.

Here it is in Python (copied from the wikipedia page):

def floyd(f, x0):
    # Main phase of algorithm: finding a repetition x_i = x_2i.
    # The hare moves twice as quickly as the tortoise and
    # the distance between them increases by 1 at each step.
    # Eventually they will both be inside the cycle and then,
    # at some point, the distance between them will be
    # divisible by the period Ξ».
    tortoise = f(x0) # f(x0) is the element/node next to x0.
    hare = f(f(x0))
    while tortoise != hare:
        tortoise = f(tortoise)
        hare = f(f(hare))
  
    # At this point the tortoise position, Ξ½, which is also equal
    # to the distance between hare and tortoise, is divisible by
    # the period Ξ». So hare moving in circle one step at a time, 
    # and tortoise (reset to x0) moving towards the circle, will 
    # intersect at the beginning of the circle. Because the 
    # distance between them is constant at 2Ξ½, a multiple of Ξ»,
    # they will agree as soon as the tortoise reaches index ΞΌ.

    # Find the position ΞΌ of first repetition.    
    mu = 0
    tortoise = x0
    while tortoise != hare:
        tortoise = f(tortoise)
        hare = f(hare)   # Hare and tortoise move at same speed
        mu += 1
 
    # Find the length of the shortest cycle starting from x_ΞΌ
    # The hare moves one step at a time while tortoise is still.
    # lam is incremented until Ξ» is found.
    lam = 1
    hare = f(tortoise)
    while tortoise != hare:
        hare = f(hare)
        lam += 1
 
    return lam, mu

Functional implementation in Haskell

The functional implementation (from wiki.haskell.org) looks a little intimidating at first glance:

findCycle :: Eq a => [a] -> ([a],[a])
findCycle xxs = fCycle xxs xxs
  where fCycle (x:xs) (_:y:ys)
         | x == y              = fStart xxs xs
         | otherwise           = fCycle xs ys
        fCycle _      _        = (xxs,[]) -- not cyclic
        fStart (x:xs) (y:ys)
         | x == y              = ([], x:fLength x xs)
         | otherwise           = let (as,bs) = fStart xs ys in (x:as,bs)
        fLength x (y:ys)
         | x == y              = []
         | otherwise           = y:fLength x ys

We can extract the helper functions and promote them to top-level functions. This will enable us to play with each one and annotate it with type information to better understand what it does.

Consider fLength:

fLength :: Eq a => a -> [a] -> [a]
fLength x (y:ys)
    | x == y              = []
    | otherwise           = y:fLength x ys
fLength _ [] = []

It looks like given an element and a list, it returns the prefix of that list up to (but not including) that element. A quick evaluation in ghci confirms it:

ghci> fLength 3 [1,2,3,4,5]
[1,2]

Now let’s look at fStart:

fStart :: Eq a => [a] -> [a] -> ([a], [a])
fStart (x:xs) (y:ys)
    | x == y              = ([], x:fLength x xs)
    | otherwise           = let (as,bs) = fStart xs ys in (x:as,bs)
fStart _ [] = ([], [])
fStart [] _ = ([], [])

The function takes in two lists and returns a tuple of lists. It handles two cases. The first case is when the heads of the two lists are equal. In this case it returns an empty list as the first member of the tuple and for the second member of the tuple it calls fLength, searching for the head in the tail of the first list. It sure looks like it searches for another occurrence of $x_\mu$ at the point where $\mu$ was found and we need to find the extent of the cycle. To recap, in this case it returns a tuple consisting of an empty list and the list representing the cycle.

The second case is when the heads of the two lists are not equal. In this case fStart recursively calls itself with the tails of the two lists and then returns a tuple whose first member is accumulating what is not equal to the second list from the first list. The second member of the tuple just munches through the second list.

So it seems that overall fStart takes care of computing the prefix and the cycle and assumes that the first input list is at the start of the sequence and the second input list is at $x_{k \lambda}$.

The last helper function is fCycle:

fCycle :: Eq a => [a] -> [a] -> [a] -> ([a], [a])
fCycle xxs (x:xs) (_:y:ys)
    | x == y              = fStart xxs xs
    | otherwise           = fCycle xxs xs ys
fCycle xxs _      _        = (xxs,[]) -- not cyclic

Ignore the first parameter. It is there as an artifact of the promotion to top-level function. The other two parameters are two lists: the tortoise and the hare. The function munches through the tortoise one element at a time and through hare two elements at a time. Looks familiar ? It indeed does the first part of the Floyd algorithm, advancing the tortoise and the hare until their heads are equal, in which case it calls fStart with just the right arguments to satisfy the contract of fStart: first argument to fStart is the initial list $x_0$ and second argument is where $x_{k \lambda}$ is.

And there you have it. In a way it’s a direct translation of the imperative program, but instead of pointers or indices into lists, it uses the actual lists and traverses them through deconstruction. The helper functions are set up in such a way that the right values are in scope and results from one helper can be transferred as an input into the next helper to continue the computational steps: fCycle calls itself recursively until its base case is reached ($x_{k \lambda}$ is found) and in the base case expression it calls fStart which in turn calls itself recursively until its base case is reached ($\mu$ is found) at which point it calls fLength to compute the last part, namely $\lambda$.