Detecting whether a single linked list has a loop/cycle is a popular interview type question. The typical solution is to use two pointers and advance one faster than the other and see if the two pointers end up colliding with each other. Simple enough. However, what I found less obvious is finding out exactly where the loop starts. A Google search will reveal plenty of code and diagrams to the latter problem but it took me some time to really “get it”. This is my own explanation of the problem to serve as a reminder for myself, should I need it in the future.
Detecting the loop
To detect the presence of a loop in the linked list we use the following algorithm
- initialise the slow/fast pointer to the start of the linked list
- advance the slow pointer one node (next node)
- advance the faster pointer two nodes (next next node)
- if the two pointers point to the same location then we have detected a loop, algorithm terminates
- repeat 2 (unless the end of the linked list is reached by the faster pointer, in which case there is no loop)
Below is a cutesy example to illustrate the algorithm. The rabbit represents the fast pointer and the turtle the slow pointer. The blue boxes are the linked list nodes. This linked list has a loop from node 7 back to node 3.
Both the rabbit and turtle initially start at node 0. Applying the algorithm above they both end up at node 5. If you’re confirming my results by using your finger you may think I’ve stuffed up and believe that the correct answer should be a collision at node 3. If you follow the algorithm exactly, you will see that you check for the collision AFTER moving both pointers simultaneously (step 2 and 3). The rabbit will always travel twice the distance of the turtle. From the above example, we can see the turtle has traveled 5 nodes forward, and the rabbit has done 10. Using your fingers you can confirm they both end up at node 5.
Finding where the loop is
Now here is where thing’s get a little messy. After detecting the presence of a loop we want to know exactly where it occurs, that is the culprit loop node, should we want to remedy it. It’s easy when we can see the whole picture like the above diagram, but we don’t in practice.
First let’s start by introducing a few variables as shown below.
I’ve introduced variable A, B, C and L. Hopefully, the labels next to the variables are self explanatory. I’ve put values next to the variables for clarification, but in practice we won’t know them explicitly or need to. The aim is to find the variable A in terms of the others.
From the above diagram, we can see a relationship between A, B, C. Our first thought might be A + B = C but this does not hold for the rabbit because the rabbit goes through the loop at least once. We’ll need a definition to cater for the loop, like so
So far so good, except we don’t know B. L we can technically find but won’t have to, you’ll see why shortly.
Let’s introduce one more definition
Where did this one come from?? Let me explain. When the turtle reached node C, it has traveled A+B distance. The rabbit also reached C by traveling 2(A+B) distance. Let’s think about this for one moment. The rabbit traveled an extra (A+B) and still came back to the same node C. This implies the loop cycle length is (A+B). Let that sink in for a bit if it hasn’t already.
Okay, now that we got the two key definition let’s do something interesting. I’m going to take definition 1 and advance an extra A nodes to both sides.
A + modulus(B, L) = C (definition 1)
Advance A nodes on both sides
A + modulus(B + A, L) = (C advanced A nodes)
Substituting definition 2 for B + A gives
A + modulus(L, L) = (C advanced A nodes)
A + 0 = (C advanced A nodes)
A = (C advanced A nodes)
or an alternatively
(Node 0 advanced A nodes) = (C advanced A nodes)
The above result suggests that if we keep one slow pointer at node 0 and one slow pointer at C, and keep advancing them both one node at a time they will eventually collide at the same spot, namely node A. And indeed this is the actual algorithm to the problem. Try it below with the example linked list we’ve been using.
In practice you probably want to start the first turtle at node 1 instead of node 0, so that when the second turtle reaches node 7, its next pointer will point to node 3. This way you can detect that node 7 points to node 3 and possibly fix up the linked list by making node 7′s next pointer point to NULL.