63

I was asked this question in interview: "How to detect the loop in linked list?", I solved this but immediately the interviewer asked me how do I remove the loop in a linked list. I fumbled.

So any pointers on how to solve this, may be pseudo code, or method definition?

I'm comfortable with Java so I have tagged this question under java.

For Instance this linked list has a loop

 0--->1---->2---->3---->4---->5---->6
                  ▲                 |
                  |                 ▼
                 11<—-22<—-12<—-9<—-8
Algorithmist
  • 6,657
  • 7
  • 35
  • 49
SuperMan
  • 3,532
  • 12
  • 45
  • 49

7 Answers7

69

There are two parts to this problem:

  1. Detect if there is a loop in the list
  2. Identify the start of the loop

Once you know where the loop starts, it's easy to identify the last element in the list since it's the element in the list following the start of the loop that ends up pointing back to the start of the loop. It is then trivial to set the next pointer/reference of this element to null to correct the cyclic link list (not circular linked list which is where the last elements points back to the first - this would be a specific instance of cyclic lists).

  1. Floyd's cycle detect algorithm, also called the tortoise and hare algorithm as it involves using two pointers/references that move at different speeds, is one way of detecting the cycle. If there is a cycle, the two pointers (say p1 and p2) will end up pointing to the same element after a finite number of steps. Interestingly, it can be proved that the element at which they meet will be the same distance to the start of the loop (continuing to traverse the list in the same, forward direction) as the start of the loop is to the head of the list. That is, if the linear part of the list has k elements, the two pointers will meet inside the loop of length m at a point m-k from the start of the loop or k elements to the 'end' of the loop (of course, it's a loop so it has no 'end' - it's just the 'start' once again). And that gives us a way to find the start of the loop:

  2. Once a cycle has been detected, let p2 remain pointing to the element where the loop for the step above terminated but reset p1 so that it's pointing back to the head of the list. Now, move each pointer one element at a time. Since p2 began inside the loop, it will continue looping. After k steps (equal to the distance of the start of the loop from the head of the list), p1 and p2 will meet again. This will give you a reference to the start of the loop.

  3. It is now easy to set p1 (or p2) to point to the element starting the loop and traverse the loop until p1 ends up pointing back to the starting element. At this point p1 is referencing the 'last' element list and it's next pointer can be set to null.


Here's some quick and dirty Java code assuming a linked list of Nodes where a Node has a next reference. This could be optimized but it should give you the basic idea:

Node slow, fast, start;
fast = slow = head;

//PART I - Detect if a loop exists
while (true)
{
    // fast will always fall off the end of the list if it is linear
    if (fast == null || fast.next == null)
    {
        // no loop
        return;
    }
    else if (fast == slow || fast.next == slow)
    {
        // detected a loop
        break;
    }
    else
    {
        fast = fast.next.next; // move 2 nodes at at time
        slow = slow.next; // move 1 node at a time
    }
}

//PART II - Identify the node that is the start of the loop
fast = head; //reset one of the references to head of list

//until both the references are one short of the common element which is the start of the loop
while(fast.next != slow.next) 
{
    fast = fast.next;
    slow = slow.next;
}

start = fast.next;

//PART III - Eliminate the loop by setting the 'next' pointer 
//of the last element to null
fast = start;
while(fast.next != start)
{
    fast = fast.next;
}

fast.next = null; //break the loop

This explanation might help the why behind Part II:

Assume the length of the cycle is M, and the length of the rest of the linked list is L. Let's figure out what is the position in the cycle when t1/t2 first meet?

Define the first node in the cycle is position 0, following the links we have position 1, 2,..., up to M-1. ( when we walk in the cycle, our current position is (walk_length) mod M, right?) Suppose t1/t2 first meet at position p, then their travel time are the same, (L+k1*M+p)/v = (L+k2*M+p)/2v for some k1

So it concludes that if t1 start from p, t2 start from head and move at the same speed, then will grantee to meet at position 0, the first node of the cycle. QED.

More references:

Community
  • 1
  • 1
no.good.at.coding
  • 20,221
  • 2
  • 60
  • 51
  • 1
    I really liked learning from your answer, thanks for the thoroughness as well as the link. – TomHarrigan Apr 09 '11 at 20:42
  • 3
    I don't get this part under "until both the references are one short..." since they move at the same speed now, it seems like `fast.next` may *never* be `slow.next` (they chase each-other around the cycle forever). –  Apr 09 '11 at 20:47
  • @pst Note that `fast` is reset to the starting of the **list** - so they're not both traversing the loop - only `slow` moves inside the loop, `fast` is traversing the part of the list *outside* the loop, moving towards the start of the loop. – no.good.at.coding Apr 09 '11 at 20:50
  • 1
    @no.good.at.coding But if they don't meet where the loop starts then they will never meet. I don't see how it is guaranteed that they *do* meet there. –  Apr 09 '11 at 20:52
  • 1
    I'm not sure your `k` is guaranteed to be correct, as you cannot be certain _where_ in the loop the hare meets the tortoise. – Thorbjørn Ravn Andersen Apr 09 '11 at 21:03
  • @Dante, the trick is understanding that when the hare (which moves 2) and the tortoise (which moves 1) are BOTH in the loop, they will eventually at one point or another hit the same spot. The hare may go around more than once and then meet the tortoise again, but eventually they will. – Thorbjørn Ravn Andersen Apr 09 '11 at 21:09
  • @Thorbjørn It can be proven that `slow` and `fast` will meet at an element that is as far from the start of the loop, as the start of the loop is from the head of the list – no.good.at.coding Apr 09 '11 at 21:28
  • 2
    @no.good.at.coding Yeah, that was the part I was missing. A +1 for you, sir. –  Apr 09 '11 at 21:38
  • @pst Sorry I can't provide a more rigorous formal proof but math isn't my strongest suite! – no.good.at.coding Apr 09 '11 at 22:03
  • I've posted a response in an attempt to explain the meeting of the nodes. – bitxwise Apr 10 '11 at 08:03
  • You say, "the same distance from the start of the loop as the start of the loop is from the head of the list." This is wrong. It should be "same distance to the start of the loop as the distance from the head of the list to the start". Basically the distance you want is from the meeting point to the start of the loop, not from the start of the loop to the meeting point. – Chris Hopman Apr 11 '11 at 05:20
  • @Chris You're right, of course. I phrased that rather ambiguously and I see now that from could be interpreted as 'from the start of the loop to the current point' when instead, I was intending to convey 'from this point, to the start of the loop, while traversing the list in the forward direction (i.e. the same as so far)'. I'll fix it. Thanks for pointing it out! – no.good.at.coding Apr 11 '11 at 16:05
  • What if k>m? its not working if k=7 and total number of node is 9. – Parag Bafna Mar 04 '13 at 09:05
  • 1
    For Part III in the code, can't we just do a slow.next = null, instead of iterating through the loop again? – Faux Pas Feb 21 '14 at 00:29
  • Excellent way of explanation..clean, precise and good use of name for variables to make it readable and understandable. – Dipesh Gupta Mar 02 '14 at 12:19
16

Solution 1 - courtesy of Career Cup and "Cracking the Coding Interview" book:

public static LinkedListNode findStartOfLoop(LinkedListNode head) {
    LinkedListNode n1 = head;
    LinkedListNode n2 = head; 

    // find meeting point using Tortoise and Hare algorithm
    // this is just Floyd's cycle detection algorithm
    while (n2.next != null) { 
        n1 = n1.next; 
        n2 = n2.next.next; 
        if (n1 == n2) { 
            break; 
        }
    }

    // Error check - there is no meeting point, and therefore no loop
    if (n2.next == null) {
        return null;
    }

    /* Move n1 to Head. Keep n2 at Meeting Point.  Each are k steps
    /* from the Loop Start. If they move at the same pace, they must
     * meet at Loop Start. */
    n1 = head; 
    while (n1 != n2) { 
        n1 = n1.next; 
        n2 = n2.next; 
    }
    // Now n2 points to the start of the loop.
    return n2;
}

The explanation for this solution is straight from the book:

If we move two pointers, one with speed 1 and another with speed 2, they will end up meeting if the linked list has a loop. Why? Think about two cars driving on a track; the faster car will always pass the slower one!

The tricky part here is finding the start of the loop. Imagine, as an analogy, two people racing around a track, one running twice as fast as the other. If they start off at the same place, when will they next meet? They will next meet at the start of the next lap.

Now, let’s suppose Fast Runner had a head start of k meters on an n step lap. When will they next meet? They will meet k meters before the start of the next lap. (Why? Fast Runner would have made k + 2(n - k) steps, including its head start, and Slow Runner would have made n - k steps Both will be k steps before the start of the loop ).

Now, going back to the problem, when Fast Runner (n2) and Slow Runner (n1) are moving around our circular linked list, n2 will have a head start on the loop when n1 enters. Specifically, it will have a head start of k, where k is the number of nodes before the loop. Since n2 has a head start of k nodes, n1 and n2 will meet k nodes before the start of the loop.

So, we now know the following:

  1. Head is k nodes from LoopStart (by definition)
  2. MeetingPoint for n1 and n2 is k nodes from LoopStart (as shown above)

Thus, if we move n1 back to Head and keep n2 at MeetingPoint, and move them both at the same pace, they will meet at LoopStart

Solution 2 - courtesy of me :)

public static LinkedListNode findHeadOfLoop(LinkedListNode head) {

    int indexer = 0;
    Map<LinkedListNode, Integer> map = new IdentityHashMap<LinkedListNode, Integer>();
    map.put(head, indexer);
    indexer++;

    // start walking along the list while putting each node in the HashMap
    // if we come to a node that is already in the list, 
    // then that node is the start of the cycle 
    LinkedListNode curr = head;
    while (curr != null) {

        if (map.containsKey(curr.next)) {
            curr = curr.next;
            break;
        }
        curr = curr.next;
        map.put(curr, indexer);
        indexer++;
    }
    return curr;
}

I hope this helps.
Hristo

Hristo
  • 45,559
  • 65
  • 163
  • 230
  • 1
    I see same problem with this as with no.good.at.coding -- the "while n1 is different than n2" may not terminated because there is no guarantee that n1 and n2 will ever be equal because "n1 starts at head", but n2 "starts somewhere the rabbit and hair met in the cycle". If they don't meet at the loop itself then they will both get stuck in the cycle chasing each-other at the same speed. Because the lead-up to the cycle differs and the cycle-length differs, not sure there is a guarantee that distance head -> cycleNode = distance meetingNode -> cycleNode. –  Apr 09 '11 at 21:26
  • However, I *am failing at coming up with a counter-case*, please help! :p (See no.good.at.coding's answer and links which explains why I can't find a counter-case ;-). A +1 for this answer as well. Same reasoning. –  Apr 09 '11 at 21:37
  • I'm just going to quote the interview book I read and paste their explanation to **Solution 1** marked above. – Hristo Apr 09 '11 at 21:41
  • Your solution (2) seems to only work if the linked list has unique items. – bitxwise Apr 10 '11 at 07:19
  • @bitxwise... I guess you're right. suggestions on improvement? – Hristo Apr 10 '11 at 07:21
  • 1
    @Hristo - Your approach relies on the uniqueness of list items so it cannot truly address the existence of a loop or cycle. The only true uniqueness of non-unique item VALUES would be the memory address of the objects representing those items or perhaps an ID of a non-primitive object containing the value. Since Java doesn't allow you to see machine address (I think), you'd have to go with the latter. This is also because (I think) Java's CONTAINS method uses an class's EQUALS method, which compares the hash code of an object and not the memory address. – bitxwise Apr 10 '11 at 08:14
  • @Hristo - Change your Map to be an IdentityHashMap and it should work better. – bitxwise Apr 10 '11 at 08:32
  • @bitxwise... thanks for the suggestion. I made the change above. I only changed the one line where I declare my Map. Everything else should be the same right? – Hristo Apr 10 '11 at 19:46
  • @Hristo - Yeah, what you have should work. It's more resource intensive than the tortoise/hare algorithm but it should work. – bitxwise Apr 10 '11 at 20:30
  • @bitxwise... I agree that it has higher space complexity, which is why I included the first solution. But I also think its an intuitive solution so I wanted to include it :) thank you! – Hristo Apr 10 '11 at 20:45
6

This response is not meant to compete for the answer, but rather to explain a little more on the meeting of the two nodes in the tortoise and the hare algorithm.

  1. Both nodes will eventually enter the loop. Because one moves faster (F) than the other (S), (F) will eventually lap (S).

  2. If the loop's start is at the list's head, (F) must meet (S) back at the list's head. This is ONLY because (F)'s speed is 2X (S)'s; if it was 3X this then would not be true. This is true because (F) completes one lap when (S) completes half a lap, so when (S) completes its first lap, (F) has completed two laps - and is back at the start of the loop with (S).

  3. If the loop's start is NOT at the list's head, then by the time (S) enters the loop, (F) has had a head start of (k) nodes in the loop. Because (S)'s speed is only one node at a time, it will meet (F) at (k) nodes from the loop's start - as in, (k) more steps before reaching the start, NOT (k) steps AFTER the start. This would NOT be true if (S)'s speed was not one and the speed ratio was not 2:1 between (F) and (S).

    3.1. This is where it gets a little tricky to explain. We can agree that (F) will continue lapping (S) until they eventually meet (see 1 above), but why at (k) nodes from the loop's start? Consider the following equation where M is the number of nodes or distance of the loop and k is the head start (F) had; the equation represents distance traveled by (F) given time t on the left in terms of distance traveled by (S) on the right:

    d_F(t) = 2 * d_S(t) + k

    So when (S) enters the loop and has traveled 0 distance in the loop, (F) has traveled only the (k) distance. By the time d_S = M - k, d_F = 2M - k. Because we also have to use modular math in consideration that M represents the total distance of a single lap in the loop, the POSITION of (F) and (S) at any whole M (no remainder) is 0. So then in terms of POSITION (or the differential), this leaves k (or rather, -k).

    And so finally, (S) and (F) will meet at position (0 - k), or (k) nodes away from the start of the loop.

  4. Given [3] above, as (k) represents the head start (F) had, and as (F) had traveled 2X the distance (S) traveled to enter the loop from the head of the list, (k) also represents the distance from the list's start, which then represents the start of the loop.

It's a bit late here so I hope I've articulated effectively. Let me know otherwise and I'll try and update my response.

bitxwise
  • 3,534
  • 2
  • 17
  • 22
  • Bitxwise.. neat, but care to add code, like method definition ? – SuperMan Apr 10 '11 at 18:30
  • @SuperMan - no.good.at.coding's answer includes a code example but he had difficulty explaining why the algorithm actually works (i.e. why the 2 nodes are guaranteed to meet at a particular point that indicates the beginning of the loop). I was merely adding my 2 cents on why/how the tortoise/hare algorithm works. no.good.at.coding's code example could definitely be cleaner and perhaps I can add a cleaner coding example later - but to no.good.at.coding's credit, he, himself, admitted the code example was quick and dirty. – bitxwise Apr 10 '11 at 20:34
5

If using an identity hash map (such as IdentityHashMap) is permissible this is terribly easy to solve. It does use more space, though.

Traverse the nodes list. For each node encountered, add it to the identity map. If the node already existed in the identity map then the list has a circular link and the node which was prior to this conflict is known (either check before moving or keep the "last node") -- simply set "next" as appropriate to break the cycle.

Following this simple approach should be a fun exercise: code is left as an exercise for the reader.

Happy coding.

  • 1
    Maybe finally this will turn out to be the only way. But I just do not want to give up too soon. XD – Dante May Code Apr 09 '11 at 20:54
  • 1
    @Dante Jiang It's not the only way. no.good.at.coding is onto something and his approach can be adapted. Once the cycle is detected, keep running the rabbit/hair, but this time, build up the list which is the *inverse* of where the rabbit/hair met the 2nd time (each time), if care is taken to ensure the rabbit/hair don't meet at the same place, this new list will be smaller *and will include the cycle node* until the list will be a cycle length of one (or two). If two, walk from head until this cycle is intercepted (gives exact node), then keep walking until the node before that node... –  Apr 09 '11 at 21:06
  • Well, I was wrong. Both the answers with the rabbit/hair cycle detection work. It's a curious property of *where* they are guaranteed to meet if finding a cycle when both starting from head (try to work out a counter-case like me ;-). In any case, the above solution will still work, even if not ideal for this. –  Apr 09 '11 at 21:39
  • Can't we we it using a normal HashMap ? – rai.skumar Jan 13 '14 at 07:11
3
 0--->1---->2---->3---->4---->5---->6
                  ▲                 |
                  |                 ▼
                 11<—-22<—-12<—-9<—-8  

Insert dummy node after every node of link list and before inserting check that node next to next is dummy or not. If next to next is dummy, insert null in next of that node.

 0-->D->1-->D->2-->D->3->D-->4->D-->5->D-->6
                     ▲                      |
                  /                         ▼
                 11<—D<-22<—D<-12<—D<-9<—D<--8 


Node(11)->next->next == D
Node(11)->next =null
Parag Bafna
  • 22,812
  • 8
  • 71
  • 144
0
//Find a Loop in Linked List and remove link between node

    public void findLoopInList() {
            Node fastNode = head;
            Node slowNode = head;
            boolean isLoopExist = false;
            while (slowNode != null && fastNode != null && fastNode.next != null) {
                fastNode = fastNode.next.next;
                slowNode = slowNode.next;
                if (slowNode == fastNode) {
                    System.out.print("\n Loop Found");
                    isLoopExist = true;
                    break;
                }
            }
            if (isLoopExist) {
                slowNode = head;
                Node prevNode = null;
                while (slowNode != fastNode) {
                    prevNode = fastNode;
                    fastNode = fastNode.next;
                    slowNode = slowNode.next;
                }
                System.out.print("Loop Found Node : " + slowNode.mData);
                prevNode.next = null; //Remove the Loop
            }
        }

:)GlbMP

-2

Easiest and unique way

To solve this problem we just count no of nodes (that's it). I bet you haven't seen this solution till now in any competitive website, because i made it today on my own...

void removeTheLoop(Node *root)
{
    std :: unordered_set < struct Node * > s;
    if(root == NULL)
        return ;

    s.insert(root);
    int before_size = s.size();

    while(1)
    {
        if(root -> next == NULL)
            return;
        s.insert(root -> next);
        if(before_size == s.size())
        {
            root -> next = NULL;
            return;
        }
        before_size = s.size();
        root = root -> next;
    }
}

How it works:

TIME COMPLEXITY: O(n)...SPACE COMPLEXITY: O(n)

  • It simply counts the no of elements. We will take unordered_set in c++.
  • It inserts the element if it is not present in the container and increases its size.
  • Now the suspense begins when the node comes that point to the node that already added , so in that case size doesn't increase and we will make next to that as NULL.

UPVOTE IT IF YOU THINK IT IS UNIQUE.

DecPK
  • 24,537
  • 6
  • 26
  • 42