1

Lets say I have an array of integer elements, here I want to remove all the duplicate elements and print the remaining elements with out using any Java.util classes. I solved it using 2 pointers to scan and remove all the duplicates but which takes O(N^2). I just wanted to know is there any algorithm which can finish this task in O(N)?

Example:

Input Array:    [1, 2, 3, 4, 5, 4, 3, 4, 6]
Expected Array: [1, 2, 5, 6]
tkr_in
  • 135
  • 1
  • 9
  • write your own `Set` implementation with amortized `O(1)` behavior. – roippi Nov 04 '13 at 06:43
  • @roippi we can try by implementing our own collection but idea here is to not to depend on collection – tkr_in Nov 04 '13 at 06:56
  • If they are Java integers, they are fixed length so you can bucket- or radix-sort them in O(n). If they are Java arrays, the input is O(1) anyway because they have a constant length limit. That's silly, but then so is pretending that your real life computer is a theoretical one with limitless memory (but constant-size pointers) and other impossibilities. – harold Nov 04 '13 at 07:20
  • Here's an answer: http://stackoverflow.com/a/1533667/1400793. O(N), zero additional memory. – Paul Hankin Nov 04 '13 at 07:32
  • @Anonymous There are too many wishy-washy steps in the high-level description of that algorithm (e.g. pretty much the entire step 2) that you'd have to do an in-depth analysis of the code to see whether or not it's actually O(N) (and there's likely to be plenty of hash collisions, i.e. it's unlikely to be O(N)). – Bernhard Barker Nov 04 '13 at 08:05
  • The question is absurd. It uses collections already in what's provided and in the expected result, but asks not to use them in solution. This question has to ask about a specific language, where collection is defined to be something that doesn't include arrays. Additionally: can the elements of an array be sorted? What does it mean for the elements to be the same? –  Nov 04 '13 at 10:48
  • @wvxvw What I mean here that we should not use java.util classes to solve the problem or should not implement any collection to solve this. Sorting of elements doesn't matter for the given input array. – tkr_in Nov 04 '13 at 14:43
  • Probably, in the future, it'd help if you tagged the question with the corresponding language tag. You can't not implement any collection to solve this, because you already have implemented at least one. Your question is effectively asking how do I do X without doing X. Java makes random distinction between what is a collection and what isn't based on some bizarre naming scheme. If you wanted to make your question language-agnostic, you'd need to actually get down to describe the details of the mechanism you don't want to use. –  Nov 04 '13 at 15:13

3 Answers3

3

You could use buckets to achieve O(N) + C (with a HUGE C) but at the cost of storage

  1. Create an array of ints of MAX_INT size called bucket[MAX_INT]
  2. Loop through input array. If value is x, increment bucket[x]++;
  3. Loop through input array again. For every x if bucket[x] == 1, add into expected array.

the bucket[] array can be replaced with a better data structure. But it still achieves O(N)

lead_the_zeppelin
  • 2,017
  • 13
  • 23
  • If my input array size is 6 and lets first element is 200, then bucket[200] = 1 will work? – tkr_in Nov 04 '13 at 06:55
  • 1
    It's `O(N + C)`, not `O(N) + C`, and you can replace `MAX_INT` with the maximum value from the array (or the range of values without changing much). – Bernhard Barker Nov 04 '13 at 06:57
  • It is O(n) but at the cost of huge space. – Nishant Lakhara Nov 04 '13 at 07:05
  • To clarify - I'm saying, from a syntax perspective, `O(N) + C` doesn't make sense (nor do I really know what it is intended to mean). The one's a complexity class, the other's a constant - you can't add them. And we can say allocating memory for the `O(C)` array takes `O(C)`, thus the total running time is `O(N + C)`. – Bernhard Barker Nov 04 '13 at 07:08
1

I would use a Set.

for each item:
  if item is in the Set
    ignore it
  else
    copy it somewhere or output it or whatever
    add it to the Set

That should work for you.

Using a HashSet seems to have O(1) complexity for add and contains (Time complexity of set in Java)

Community
  • 1
  • 1
Mike T
  • 4,747
  • 4
  • 32
  • 52
  • 1
    How can we do with out using set/collection? – tkr_in Nov 04 '13 at 07:04
  • Sorry. I missed that part of your question. You could implement your own Hashed Set. It would probably only be 30-40 lines of code, but it would require some dynamic length memory which would then rely on some other Java.util classes (like an ArrayList) – Mike T Nov 04 '13 at 07:07
  • @MikeT 'Dynamic length memory' is trivial to implement with a self-written linked-list. – Bernhard Barker Nov 04 '13 at 07:16
  • @Dukeling: Indeed but that would mean O(n) insertions and O(n) lookups but O(1) is required – Mike T Nov 04 '13 at 07:38
  • Wait, I figured you were talking about the linked-lists required for separate chaining. If you were actually talking about the base array of the hash table, then note that this is resizing the hash table is somewhat more complicated than simply increasing the size of the underlying array (you have to rehash all the values), but, since you know the input array size to start, you can allocate enough memory for all the values from the start so resizing isn't necessary. – Bernhard Barker Nov 04 '13 at 07:48
  • I was suggesting this as a method to reduce the space though – Mike T Nov 04 '13 at 09:33
0

You can sort all the elements with any algorithm like merge sort, quick sort that has a complexity of n lg n. And then use a stack to add elements to it. keep on pushing elements to the stack until you find a duplicate. In case of duplicate elements pop from the stack.it will remove duplicate elements. The total complexity would be O(n lg n). This can be an efficient way to solve the problem.

Nishant Lakhara
  • 2,295
  • 4
  • 23
  • 46