0

I'm new to the Tortoise and Hare algorithm by Floyd. Given an array of words such as ['cat', 'dog', 'cat', 'elephant']. Is it possible to use the idea of the Tortoise and Hare algorithm to find the duplicate word?

As I searched online, it seems to only possible with an array of number. If my question is not clear enough, please comment and I can try to provide more information.

Many thanks,

utopia
  • 322
  • 1
  • 4
  • 22
  • Have you tried? It doesn't seem logical for it only to be for an array of numbers. – TheChicken4452 Jul 25 '21 at 22:05
  • 1
    @KevinHwang I don't even understand how you'd use it for numbers. The algorithm is very specific to a problem with linked lists. – Mark Ransom Jul 25 '21 at 22:15
  • @MarkRansom I think I understand the algorithm wrong then. – TheChicken4452 Jul 26 '21 at 00:45
  • @MarkRansom I think you've misunderstood. The asker is asking if you could implement the Tortoise and Hare algorithm with an array of words instead of an array of numbers. I meant that if it worked with an array of numbers, why wouldn't it work with an array of strings? Here's an implementation for an array of integers (https://stackoverflow.com/questions/64561399/understanding-why-floyds-tortoise-and-hare-algorithm-works-when-applied-to-an-a) – TheChicken4452 Jul 26 '21 at 00:56
  • 2
    I don't see how T&H can be applies to finding duplicates. It detects cycles in linked structures. If there's a cycle, the hare pointer eventually "laps" the tortoise pointer so they point at the same element. nb: The Wikipedia page rather confusingly illustrates the algorithm with a diagram that looks like an array. Read the description carefully. That's not what it is. – Gene Jul 26 '21 at 01:05
  • 1
    @KevinHwang Please read that article more carefully. The conclusion is that finding duplicates (regardless of type) with T&H does not work. – Gene Jul 26 '21 at 01:07
  • @Gene My bad. I guess this question is the duplicate of the linked question then? – TheChicken4452 Jul 26 '21 at 01:31
  • @KevinHwang I meant that I don't see a way to make Tortoise and Hare work for a list of integers. The link you give doesn't have a working implementation. – Mark Ransom Jul 26 '21 at 03:06
  • 1
    @Gene. Actually it works to find duplicate in an array of numbers. If you have taken a look of the video on YT by Joma Tech. Seems to be a specific number array though – utopia Jul 26 '21 at 08:42
  • @utopia It works in the special case when the numbers are 0 to n for an n-element array. This makes the array equivalent to a linked list because every entry is an index back into the array. The OP is _not_ asking about this very special case. – Gene Jul 27 '21 at 01:32

1 Answers1

2

No, the Tortoise and Hare algorithm serves to find a cycle in a linked list, meaning that if you follow the links, you will eventually arrive at a node that was already visited. Note that this does not have anything to do with the values stored in the list. It isn't an algorithm to find duplicate values in a list. A list with duplicate values may very well have no cycles.

An array has by definition no cycles. If we want to describe an array in terms of links, then element links to element +1 and the last element in the array doesn't link to any other element: there is no cycle.

Array of numbers

it seems to only be possible with an array of numbers.

This must be about a specific use of an array, where it is used to encode a linked list that has nodes without values, but only next-pointers. In that case the values in the array are the links, i.e. the value at a certain index is the index of the next "node" in the list. Then indeed, this is just a different representation of a linked list that could have a cycle.

This would explain that it only works with numbers, as indexes are numbers. Moreover, the numbers would have to be integers in the range 0..-1. Here is an example:

[2, 3, 1, 0]
 

Here the "node" at index 0 links to the node at index 2, which in turn links to the node at index 1, which links to the node at index 3, which closes the cycle, and links to the node at index 0.

Note that the array happens to have no duplicate "values", but has a cycle.

If however, the array has a cycle that does not link back to index 0, then it will have a duplicate "value". For instance:

[1, 2, 3, 2]

Here the first node at index 0 is not part of the cycle,.. the path starting at index 0 enters the loop at index 2, which links to index 3, which links back to index 2, ...etc.

So, if the assignment is to find duplicate values where all the values can be interpreted as indices, then the Tortoise and Hare algorithm can be used. But be aware that:

  • this works only for specific arrays, i.e. where the values can all be interpreted as indices (or positions, or whichever 1-to-1 mapping you can define between a value and an index).
  • A cycle can be detected that links back to the very first element of the array, and in that case the array actually does not need to have a duplicate at all. Whether this cycle would be detected with the Tortoise and Hare algorithm, depends on the actual flavour of that algorithm. It can be designed to not detect a full cycle that links back to index 0, which is what you would want to go undetected.

Array of strings

Strings do not represent indexes, so the above does not work for strings.

The only things that come to mind is to have something like this:

[{ "value": "cat", "next": 1}, 
 { "value": "dog", "next": 2},
 { "value": "cat", "next": 3},
 { "value": "elephant", "next": 0}]

...which is using the index of the array again to establish the links.

Or an object (dictionary), like this:

 {
   "cat": "dog",
   "dog": "cat",
   "cat": "elephant",
   "elephant": "cat"
 }

...which essentially represents the same graph, but using strings as keys in the object representation of the list.

Conclusion

If the question is about an array with just string values, there is no way that the Tortoise and Hare algorithm can help to identify duplicate values. But there are specific structures where you could use number or string information to link nodes, but then these are not the values, but the links. Such arrays could at face value have no duplicates, but still represent a linked list that has a cycle. So also in that case, the Tortoise and Hare algorithm will not help to find duplicate values. It serves to find a cycle of linked nodes.

trincot
  • 317,000
  • 35
  • 244
  • 286