7

Hello, I stumbled upon the following question: you are given an unsorted doubly linked list, and you should find and delete the duplicates from it.

What is the best way to do it with minimum algorithmic complexity?

Thank you.

bragboy
  • Related question: http://stackoverflow.com/questions/4976765/how-will-you-delete-duplicate-odd-numbers-from-a-linked-list – jweyrich May 04 '11 at 18:40
  • Related question: http://stackoverflow.com/questions/1532819/algorithm-efficient-way-to-remove-duplicate-integers-from-an-array – Robᵩ May 04 '11 at 19:11

4 Answers

7

If space is abundant and you really have to optimize for time, perhaps you can use a HashSet (or the equivalent in C++). You read each element and push it into the hash set. If the hash set reports that the element is already present, that node is a duplicate and you simply delete it.

The complexity is O(n).
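
A minimal sketch of this approach (not from the original answer), assuming C++11 and a hypothetical Node struct with an int key and prev/next pointers; std::unordered_set plays the role of Java's HashSet here:

#include <unordered_set>

// Hypothetical node layout, for illustration only.
struct Node {
    int key;
    Node* prev;
    Node* next;
};

// Keeps the first occurrence of each key; expected O(n) time, O(n) extra space.
void removeDuplicates(Node*& head) {
    std::unordered_set<int> seen;
    for (Node* cur = head; cur != nullptr; ) {
        Node* next = cur->next;
        if (!seen.insert(cur->key).second) {            // insert failed -> key already seen
            if (cur->prev) cur->prev->next = cur->next; // unlink the duplicate
            else           head = cur->next;
            if (cur->next) cur->next->prev = cur->prev;
            delete cur;
        }
        cur = next;
    }
}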

bragboy
  • The equivalent in C++ is the set<> container: http://www.cplusplus.com/reference/stl/set/ – Baltasarq May 04 '11 at 18:49
  • I think it is only O(n) with a perfect hash function. Otherwise, you need to store colliding values on a list, increasing the complexity. – Robᵩ May 04 '11 at 19:15
  • @Rob: Yes, that is why I said space being abundant. For example, if the input numbers are known to be between, say, 1 and 10000, we could use an array's index as the hashing algorithm here. Moreover, in data-structure terms a hash insert and get is always O(1). You cannot manipulate the Big O further. – bragboy May 04 '11 at 19:20
  • @Rob Even with collisions, a hash function is still effectively constant time in the number of elements in the hash (unless your hash table is like 95% full, in which case you need a bigger table); it's just that the `C` may be bigger. – Mark B May 04 '11 at 19:28
  • @Baltasarq - There is no equivalent container in C++03. Java's [`Hashset`](http://download.oracle.com/javase/1.4.2/docs/api/java/util/HashSet.html) has O(1) performance. [`std::set::insert`](http://www.cplusplus.com/reference/stl/set/insert/), in contrast, has O(log n). – Robᵩ May 04 '11 at 19:46
  • @Bragboy, @Mark, I was unfamiliar with Java `Hashset`, so came to the wrong conclusion about its performance. I stand by my claim that a hashed bucket list with a constant-size hash table has O(n) performance for find, insert, etc. But a `Hashset` has a dynamically-resized hash table, which is clearly O(1). – Robᵩ May 04 '11 at 19:49
4

Think of it as two singly linked lists instead of one doubly linked list, with one set of links going first to last and another set going last to first. You can sort the second list with a merge sort, which will be O(n log n). Now traverse the list using the first link. For each node, check if (node.back)->key==node.key and if so remove it from the list. Restore the back pointer during this traversal so that the list is properly doubly linked again.

This isn't necessarily the fastest method, but it doesn't use any extra space.
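
A hedged sketch of just the duplicate-removal pass, assuming the list has already been put into fully sorted order (e.g. by the merge sort described above) so that equal keys sit next to each other. Note this is a simplified variant that gives up the original forward order, which the answer's two-chains trick preserves; Node is the same hypothetical struct (int key, prev/next pointers) as in the sketch under the first answer.

// Sketch only: assumes the nodes are already in sorted order, so duplicates are adjacent.
// Node is the hypothetical struct from the earlier sketch: { int key; Node *prev, *next; }.
// Single O(n) pass, no extra space.
void removeAdjacentDuplicates(Node* head) {
    Node* cur = (head != nullptr) ? head->next : nullptr;
    while (cur != nullptr) {
        Node* next = cur->next;
        if (cur->prev->key == cur->key) {       // same key as predecessor -> duplicate
            cur->prev->next = cur->next;        // cur is never the head here
            if (cur->next) cur->next->prev = cur->prev;
            delete cur;
        }
        cur = next;
    }
}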

Mark Ransom
3

Assuming that the potential employer believes in the C++ library:

// untested, O(n*log(n))
#include <list>
#include <set>

template <class T>
void DeDup(std::list<T>& l) {
    std::set<T> s(l.begin(), l.end());        // set keeps one sorted copy of each value
    std::list<T>(s.begin(), s.end()).swap(l); // rebuild the list from the set and swap it in
}
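
For illustration, a quick (untested, C++03-style) way to exercise it; note that the elements come back sorted, because std::set orders them, so the original list order is not preserved:

#include <iostream>
#include <list>

int main() {
    const int a[] = { 3, 1, 3, 2, 1 };
    std::list<int> l(a, a + 5);
    DeDup(l);
    for (std::list<int>::const_iterator it = l.begin(); it != l.end(); ++it)
        std::cout << *it << ' ';              // prints: 1 2 3
    std::cout << std::endl;
    return 0;
}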
Robᵩ
0

With minimum complexity? Simply traverse the list up to X times (where X is the number of items), starting at the head, and then delete (and reassign pointers) down the list. O(n log n) (I believe) time in the worst case, and really easy to code.
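
A minimal sketch of that nested traversal (again reusing the hypothetical Node struct with an int key and prev/next pointers from the first answer's sketch); each node deletes any later node carrying the same key:

// Sketch only: for each node, remove every later node with an equal key.
// No extra memory; the worst case is quadratic in the number of nodes.
void removeDuplicatesNested(Node* head) {
    for (Node* cur = head; cur != nullptr; cur = cur->next) {
        Node* runner = cur->next;
        while (runner != nullptr) {
            Node* next = runner->next;
            if (runner->key == cur->key) {
                runner->prev->next = runner->next;      // runner is never the head
                if (runner->next) runner->next->prev = runner->prev;
                delete runner;
            }
            runner = next;
        }
    }
}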

Tejs
  • X times X number of items makes it an O(n^2) worst case. – Ferruccio May 04 '11 at 18:41
  • This couldn't be a solution that has minimum complexity. Using a Hashset as in one of the other answers would definitely reduce the complexity. – Victor Zamanian May 04 '11 at 18:46
  • That's minimal coding complexity, not minimal algorithmic complexity, which I'm sure is what the OP is after. – Mark B May 04 '11 at 18:57
  • Depending on your specific requirements, this isn't a bad solution. It's stable, requires no additional memory, and is trivial to implement. Given the original wording of the question, if it weren't for the incorrect complexity guess, I'd say this is a reasonable answer. – Dennis Zickefoose May 04 '11 at 20:52