8

Given an array of N integer such that only one integer is repeated. Find the repeated integer in O(n) time and constant space. There is no range for the value of integers or the value of N

For example given an array of 6 integers as 23 45 67 87 23 47. The answer is 23 (I hope this covers ambiguous and vague part)

I searched on the net but was unable to find any such question in which range of integers was not fixed. Also here is an example that answers a similar question to mine but here he created a hash table with the highest integer value in C++.But the cpp does not allow such to create an array with 2^64 element(on a 64-bit computer).

I am sorry I didn't mention it before the array is immutable

Anubhav Agarwal
  • 1,982
  • 5
  • 28
  • 40
  • 14
    Good Question. Your thoughts or attempts at the answer are missing though. – Alok Save Nov 24 '11 at 16:45
  • 2
    Found it! Can I keep it? – Kerrek SB Nov 24 '11 at 16:53
  • 5
    @Constantinius: Are you saying that this is an easy problem? I don't see a solution myself, and I don't think one exists. – interjay Nov 24 '11 at 17:01
  • *".But the cpp does not allow such to create an array with 2^64 element(on a 64-bit computer)."* What do you mean? – BrokenGlass Nov 24 '11 at 17:40
  • (And there's technically no way to do this if there's no bounds on the range of the integers. log-n always gets you, one way or the other.) – Hot Licks Nov 24 '11 at 17:40
  • @BrokenGlass the range of integers he's working with is not limited, and for a simple hash table where the integers are a key, you'd need (more than) 2^64 elements. – Seth Carnegie Nov 24 '11 at 17:41
  • You don't need such huge array to implement the hash table. – svick Nov 24 '11 at 17:42
  • @BrokenGlass -- I assume you're referring to the fact that he only needs (2^64)/8 bytes for a simple "scorecard" approach. But that's still like more bits than atoms in the known universe. Or something like that. – Hot Licks Nov 24 '11 at 17:44
  • @BrokenGlass If you would have read that link it says to create an array of lenght N, where N is the largest value an integer can take place.However in cpp (on 64 bit computer) the largest value an int can take is 2^32, but I cannot create an array of this size. – Anubhav Agarwal Nov 24 '11 at 17:45
  • @HotLicks this was not a homework, however an interview question – Anubhav Agarwal Nov 24 '11 at 17:46
  • @svick -- There's no limit on the number of integers. A hash table will need as many entries as there are integers in the list. Far less space-efficient than using a simple bit array. – Hot Licks Nov 24 '11 at 17:47
  • @svick then how would we implement the hash table – Anubhav Agarwal Nov 24 '11 at 17:47
  • The previous question was closed for being "rhetorical". But, by definition, homework and interview questions are "rhetorical". Seems like a "gotcha". – Hot Licks Nov 24 '11 at 17:52
  • Please provide an answer to this question or atleast reopen it. The previous question was closed on the arguments of being unambiguous and vague.I think this is not vague.The previous question has been closed so I guess people will not answer to that. – Anubhav Agarwal Nov 24 '11 at 18:00
  • To avoid the memory issue, you could generate your hashes and store them in a file, and then use a stream iterator to find what you need. Sounds feasible? – jweyrich Nov 24 '11 at 18:05
  • @Christian Rau: That other question has been reopened since your comment. – BoltClock Nov 24 '11 at 20:35
  • @BoltClock I know, that's why I said he should look there for the answers. – Christian Rau Nov 24 '11 at 20:48
  • 1
    possible duplicate of [Finding repeating signed integers with O(n) in time and O(1) in space](http://stackoverflow.com/questions/8208363/finding-repeating-signed-integers-with-on-in-time-and-o1-in-space) – Yakov Galka Nov 25 '11 at 06:21
  • possible duplicate of [Finding duplicates in O(n) time and O(1) space](http://stackoverflow.com/questions/5739024/finding-duplicates-in-on-time-and-o1-space) – dmckee --- ex-moderator kitten Nov 26 '11 at 03:31
  • possible duplicate of [Algorithm to find a repeated number in a list that may contain any number of repeats](http://stackoverflow.com/questions/6433414/algorithm-to-find-a-repeated-number-in-a-list-that-may-contain-any-number-of-rep) – PengOne Dec 01 '11 at 21:29

8 Answers8

9

Jun Tarui has shown that any duplicate finder using O(log n) space requires at least Ω(log n / log log n) passes, which exceeds linear time. I.e. your question is provably unsolvable even if you allow logarithmic space.

There is an interesting algorithm by Gopalan and Radhakrishnan that finds duplicates in one pass over the input and O((log n)^3) space, which sounds like your best bet a priori.

Radix sort has time complexity O(kn) where k > log_2 n often gets viewed as a constant, albeit a large one. You cannot implement a radix sort in constant space obviously, but you could perhaps reuse your input data's space.

There are numerical tricks if you assume features about the numbers themselves. If almost all numbers between 1 and n are present, then simply add them up and subtract n(n+1)/2. If all the numbers are primes, you could cheat by ignoring the running time of division.

As an aside, there is a well-known lower bound of Ω(log_2(n!)) on comparison sorting, which suggests that google might help you find lower bounds on simple problems like finding duplicates as well.

Syscall
  • 19,327
  • 10
  • 37
  • 52
Jeff Burdges
  • 4,204
  • 23
  • 46
  • The lower bound for a comparison sort is Ω(n log n). See chapter 8 in [Introduction to Algorithms](http://mitpress.mit.edu/algorithms/) for a proof. – Richard Povinelli Nov 25 '11 at 05:44
  • 1
    Yes, that's Stirling's approximation. – Jeff Burdges Nov 25 '11 at 05:50
  • @JeffBurdges the array is immutable – Anubhav Agarwal Nov 25 '11 at 05:57
  • 1
    @Jeff: Yep, you're right. I'm just not use to seeing written as Ω(log_2(n!)). Too much turkey, too late at night :-) – Richard Povinelli Nov 25 '11 at 06:00
  • @Anubhav Agarwal Fine, it's cheating declaring the k constant in O(kn) anyways, but Tarui proved no solution exists, so the only question is : How does one cheat? An interviewer who asks this probably means linear space when they say constant space, well unless you're interviewing for a professorship in complexity theory. – Jeff Burdges Nov 25 '11 at 06:21
  • The link to Tarui's paper doesn't work. Can you update the link? – D.W. Dec 01 '19 at 17:59
5

If the array isn't sorted, you can only do it in O(nlogn).

Some approaches can be found here.

Igor
  • 26,650
  • 27
  • 89
  • 114
4

If the range of the integers is bounded, you can perform a counting sort variant in O(n) time. The space complexity is O(k) where k is the upper bound on the integers(*), but that's a constant, so it's O(1).

If the range of the integers is unbounded, then I don't think there's any way to do this, but I'm not an expert at complexity puzzles.

(*) It's O(k) since there's also a constant upper bound on the number of occurrences of each integer, namely 2.

Fred Foo
  • 355,277
  • 75
  • 744
  • 836
3

In the case where the entries are bounded by the length of the array, then you can check out Find any one of multiple possible repeated integers in a list and the O(N) time and O(1) space solution.

The generalization you mention is discussed in this follow up question: Algorithm to find a repeated number in a list that may contain any number of repeats and the O(n log^2 n) time and O(1) space solution.

Community
  • 1
  • 1
PengOne
  • 48,188
  • 17
  • 130
  • 149
2

The approach that would come closest to O(N) in time is probably a conventional hash table, where the hash entries are simply the numbers, used as keys. You'd walk through the list, inserting each entry in the hash table, after first checking whether it was already in the table.

Not strictly O(N), however, since hash search/insertion gets slower as the table fills up. And in terms of storage it would be expensive for large lists -- at least 3x and possibly 10-20x the size of the array of numbers.

Hot Licks
  • 47,103
  • 17
  • 93
  • 151
2

As was already mentioned by others, I don't see any way to do it in O(n).

However, you can try a probabilistic approach by using a Bloom Filter. It will give you O(n) if you are lucky.

ruslik
  • 14,714
  • 1
  • 39
  • 40
0

We can do in linear time o(n) here as well

public class DuplicateInOnePass {

    public static  void duplicate()

   {
        int [] ar={6,7,8,8,7,9,9,10};
        Arrays.sort(ar);
        for (int i =0 ; i <ar.length-1; i++)
        {


            if (ar[i]==ar[i+1])
                System.out.println("Uniqie Elements are" +ar[i]);

        }  

    }

    public static void main(String[] args) {
        duplicate();
    }
}
SirPeople
  • 4,248
  • 26
  • 46
OSAAT
  • 1
0

Since extra space is not allowed this can't be done without comparison.The concept of lower bound on the time complexity of comparison sort can be applied here to prove that the problem in its original form can't be solved in O(n) in the worst case.

bashrc
  • 4,725
  • 1
  • 22
  • 49