17

I am trying to answer for the below question : You have an array of integers, such that each integer is present an odd number of time, except 3 of them. Find the three numbers.

so far I came with the brute force method :

 public static void main(String[] args) {
    // TODO Auto-generated method stub

    int number[] = { 1, 6, 4, 1, 4, 5, 8, 8, 4, 6, 8, 8, 9, 7, 9, 5, 9 };
    FindEvenOccurance findEven = new FindEvenOccurance();
    findEven.getEvenDuplicates(number);

  }

  // Brute force
  private void getEvenDuplicates(int[] number) {

    Map<Integer, Integer> map = new HashMap<Integer, Integer>();

    for (int i : number) {

      if (map.containsKey(i)) {
        // a XOR a XOR a ---- - -- - - odd times = a
        // a XOR a ---- -- -- --- - even times = 0
        int value = map.get(i) ^ i;
        map.put(i,value);
      } else {
        map.put(i, i);
      }
    }

    for (Entry<Integer, Integer> entry : map.entrySet()) {

      if (entry.getValue() == 0) {
        System.out.println(entry.getKey());
      }

    }
  }

It works fine but not efficient.

The o/p :

1
5
6
8

But the questions specifies we need to do this in O(1) space and O(N) time complexity. For my solution, the time complexity is O(N) but space also O(N). Can some one suggest me a better way of doing this with O(1) space ?

Thanks.

Pham Trung
  • 11,204
  • 2
  • 24
  • 43
Sarah
  • 403
  • 5
  • 16
  • 5
    "except 3 of them", and your example has 4 of them !?! –  Jan 27 '16 at 11:06
  • In fact the first statement conflicts with the code and output. So some solutions try to find three non-paired integers when other solutions show ways to find all integers except non-paired. Please, edit your question and specify **strictly** what do you want! – Edgar Rokjān Jan 27 '16 at 18:59
  • Since you have to iterate over the map again to retrieve the result would the time complexity not exceed O(N) ? Any how you could sort it in-place. The time would increase to n*log(n) or some variation of it but your space complexity would then reduce to zero ! – Ravindra HV Jan 27 '16 at 19:38
  • I sure hope the problem is not about digits (for any base fixed before N) - the example gives no clue. – greybeard Jan 31 '16 at 20:45
  • For measurements of what you _can_ do: [discussion of scalability](http://codereview.stackexchange.com/a/36870/93149). – greybeard Feb 03 '16 at 10:54

6 Answers6

4

I spent some time solving this problem. Seems that I found solution. In any case I believe, that community will help me to check ideas listed below.

First of all, I claim that we can solve this problem when the number of non-paired integers is equal to 1 or 2. In case of 1 non-paired integer we just need to find XOR of all array elements and it'll be the answer. In case of 2 non-paired integers solution becomes more complicated. But it was already discussed earlier. For example you can find it here.

Now let's try to solve problem when the number of non-paired integers is equal to 3.

At the beginning we also calculate XOR of all elements. Let's denote it as X.

Consider the i-th bit in X. I assume that it's equal to 0. If it's equal to 1 the next procedure is practically the same, we just change 0 to 1 and vice versa.

So, if the i-th in X bit is equal to 0 we have two possible situations. One situation is when all non-paired integers have 0 in the i-th bit. Another situation is when one non-paired integer has 0 in the i-th bit, and two non-paired integers have 1 in i-th bit. This statement is based on simple XOR operation properties. So we have one or three non-paired integers with 0 in the i-th bit.

Now let's divide all elements into the two groups. The first group is for integers with 0 in the i-th bit position, the second is for integers with 1 in the i-th bit position. Also our first group contains one or three non-paired integers with '0' in the i-th bit.

How we can obtain the certain number of non-paired integers in the first group? We just need to calculate XOR of all elements in the second group. If it's equal to zero, than all non-paired integers are in the first group and we need to check another i. In other case only one non-paired integer is in the first group and two others are in the second and we can solve problem separately for this two groups using methods from the beginning of this answer.

The key observation is that there's i such that one non-paired integer has i-th bit that differs from the i-th bits of the two other non-paired integers. In this case non-paired integers are in both groups. It's based on the fact that if there's no such i then bits in all positions in non-paired integers are similar and they are equal to each other. But it's impossible according to the problem statement.

This solution can be implemented without any additional memory. Total complexity is linear with some constant depending on the number of bits in array elements.

Edgar Rokjān
  • 17,245
  • 4
  • 40
  • 67
  • XOR implies a known number of bits, so that's not a solution for the infinite set of mathematical integers. It's a valid "computing" solution, but in that case I argue that the other solutions are also O(1), cf my answer, don't you agree ? – Ilya Jan 17 '16 at 10:20
  • 2
    @Ilya I think we can't solve such problems as purely mathematical problems. In practice we assume that *XOR* operation has *O(1)* complexity because the total number of bits is limited. I just want to show that we can solve this problem without any additional enormous arrays which sizes depend on the bits number... – Edgar Rokjān Jan 17 '16 at 10:54
  • I'm OK with that, but my main point is that the original "brute force" solution is *also* O(1) so it also a valid solution to the exposed problem. Don't you agree ? – Ilya Jan 17 '16 at 11:15
  • And NB in the "brute force" solution, the additional array is guaranteed to be smaller than the tested array, so not necessarily "enormous". – Ilya Jan 17 '16 at 11:21
  • @Ilya I agree with the statement that, strictly saying, OP uses *O(1)* additional memory because map size is limited my maximal integer value. I suppose that OP wants solution without additional arrays or maps. – Edgar Rokjān Jan 17 '16 at 11:23
  • @Ilya when I talked about enormous additional arrays I meant Lashane and SGM1 solutions. – Edgar Rokjān Jan 17 '16 at 11:26
  • 1
    Yes, maybe, she doesn't say. But my second point is that the OP solution is arguably better than the "improved" solutions in the answers. So if your solution works, I would rate them 1) Yours 2) OP 3) Lashane & SGM1. And all O(1) on the condition that bit numbers is fixed. – Ilya Jan 17 '16 at 11:27
  • I think your solution is probably correct (but not sure, my brain overheats somewhere in the middle), so I'll upvote it. Even if there's a mistake, it's a great effort. – Ilya Jan 17 '16 at 11:49
  • "If it's equal to zero, than all non-paired integers are in the first group and we need to check another i." ...And what happens if it's zero for all *i* ? – Ilya Jan 17 '16 at 12:27
  • @Ilya In the penultimate paragraph I explained that there's always *i* such that we can divide non-paired integers into the two separate groups. – Edgar Rokjān Jan 17 '16 at 12:55
  • @Ilya I enhanced this paragraph slightly – Edgar Rokjān Jan 17 '16 at 13:02
  • See also the solutions for k > 3, with a generalization of your algorithm up to k=6: http://stackoverflow.com/questions/11916062/find-the-k-non-repeating-elements-in-a-list-with-little-additional-space?rq=1 – Ilya Jan 17 '16 at 14:33
  • @Ilya Nice discussion. But these solutions look too complicated at first glance. I'll try to understand them a little bit later. Thanks. – Edgar Rokjān Jan 18 '16 at 12:44
  • Yes, to be frank some parts are flying way over my head, but it's interesting. And BTW in the case you're (mostly) agreeing with my answer, please upvote it. – Ilya Jan 18 '16 at 12:58
  • Another way to think about this is that the number of bits is >= log(n) (the number of elements in the array) under reasonable assumptions that the number of elements in the array is <= MAXINT. So this in essence is an n(log(n)) time solution. – Ben Jackson Jan 26 '16 at 01:30
  • 4
    I think this solutions finds integers with odd counts, not even. The linked answer finds numbers that are **not-repeating.** e.g. **odd**. The question asks for integers with even counts. – Ben Jackson Jan 26 '16 at 01:40
  • @BenJackson I reread the question and... It looks a bit controversial... In the first statement OP asks to find three distinct numbers, but then he posts duplicates as the result. Finding all duplicates seems unresolvable for me. Of course, my solution is for non-repeating numbers. – Edgar Rokjān Jan 26 '16 at 19:01
  • Now I got confused about the question stating _all odd but three_ myself :-/ – greybeard Jan 31 '16 at 20:33
  • @greybeard Seems that OP isn't interested even in editing her answer :( – Edgar Rokjān Jan 31 '16 at 21:46
  • To divide into groups based on i-th bit is causing O(n log n) time complexity. Good, but not OP's dream solution. – Christopher Oezbek Feb 01 '16 at 23:16
  • @ChristopherOezbek I assume that `XOR` is constant time operation. I doubt if OP's dream solution is possible under really strict restriction. My main goal was to decide how to avoid using additional memory and minimize total complexity. – Edgar Rokjān Feb 02 '16 at 06:39
4

Unfortunately it is not possible to achieve such a solution with O(1) space and O(n) complexity if we use a strict sense of space, i.e. O(1) space is bound by the max space used in the input array.

In a weak sense of space, where one arbitrary large Integer number does still fit into O(1), you can just encode your counter into the bits of this one integer. Start with all bits set to 1. Toggle the n-th bit, when you encounter number n in the input array. All bits remaining 1 at the end represent the 3 numbers which were encountered an even number of times.

Christopher Oezbek
  • 23,994
  • 6
  • 61
  • 85
  • 1
    Regarding your first comment, I believe that it is standard in complexity terms to understand "O(1) space" as "O(1) space _in excess of the input itself_". Otherwise, complexity classes like [L](https://en.wikipedia.org/wiki/L_%28complexity%29) would not make sense. Regarding your second comment, accessing arbitrarily large integers in this way is typically contrary to the standard RAM model used in complexity, where only integers of size log(n) can be accessed in unit time. – mhum Jan 27 '16 at 02:22
  • @Edgar: Yes, that would be the easy way. – Christopher Oezbek Jan 27 '16 at 10:06
  • @EdgarRokyan: Sorry to say but there is no solution to this problem that fulfills the O constraints given. IF the question would be the other way around: All but three integers exist an even number of times, then we could get a better solution (Still not O(1) space). – Christopher Oezbek Jan 27 '16 at 17:43
  • I think that I didn't read your solution carefully. You say explicitly about constraints. So it has no sense to consider arrays with large elements cause in this case additional large integer number doesn't fit into O(1) space. – Edgar Rokjān Jan 27 '16 at 19:10
  • However we solve different problems cause OP didn't specify carefully what she wants. Seems that I can solve this problem without large integers or additional arrays if we try to find three non-paired integers. – Edgar Rokjān Jan 27 '16 at 19:13
1

There's two ways to look at your problem.

The first way, as a mathematical problem with an infinite set of integer, it seems unsolvable.

The second way, as a computing problem with a finite integers set, you've already solved it (congratulations !). Why ? Because storage space is bounded by MAX_INT, independently of N.

NB an obvious space optimization would be to store the values only once, erasing the previous value for even counts, you'll gain half the space.

About the other answers by @Lashane and @SGM1: they also solve the "computing" problem, but are arguably less efficient than yours in most real-world scenarios. Why ? Because they pre-allocate a 512MB array, instead of allocating proportionaly to the number of different values in the array. As the array is likely to use much less than MAX_INT different values, you're likely to use much less than 512MB, even if you store 32bits for each value instead of 1. And that's with 32 bits integers, with more bits the pre-allocated array would grow exponentially, OTOH your solution only depends on the actual values in the array, so is unaffected by the number of bits of the system (i.e. max int value).

See also this and this for better (less space) algorithms.

Community
  • 1
  • 1
Ilya
  • 5,377
  • 2
  • 18
  • 33
  • We need some way to evaluate the practical complexity of algorithm, and thus we typically limit ourselves to non-infinite integers to do so. We define the largest possible integer (MAXSIZE). In this case, the operator XOR on integers <= MAXSIZE are considered to take O(1) time (or perhaps in some systems O(log(MAXSIZE)) time. Similarly, each integer <= MAXSIZE in is considered to take O(1) space (or perhaps O(MAXSIZE) space). It is standard practice to evaluate algorithms with these assumptions in place. – Ben Jackson Jan 26 '16 at 00:46
  • @BenJackson that's OK, I'm just saying than except for Edgar's proposal, all the solution were using O(MAXSIZE) in space, and the original proposal (ironically) probably using less space in practice. NB Edgar's solution was added after my first answer. – Ilya Jan 26 '16 at 15:54
1

consider for example the numbers allowed are of size 4 bits, which means the range of numbers allowed from 0 to 24-1 which is a constant number 16, for every possible input we run over all array and xor the occurrence of this number, if the result of xor is zero, we add current value to the overall result. this solution is O(16N) which is O(N) and use only one extra variable to evaluate the xor of current number which is O(1) in terms of space complexity.

we can extend this method to our original problem, but it will have a very big constant number in terms of run time complexity which will be proportional to the number of bits allowed in the original input.

we can enhance this approach by run over all elements and find the Most significant bit over all input data, suppose it is the 10th bit, then our run time complexity will become O(210N) which is also O(N).

another enhancement can be found in the below image, but still with the worst case complexity as discussed before. enter image description here

finally I believe that, there exist another better solution for this problem but I decided to share my thought.

Edit:

the algorithm in the image may not be clear, here is some explanation to the algorithm.
it start with the idea of trying to divide the elements according to there bits, in other words make the bits as a filter, at each stage xor the divided elements, until the xor result is zero, then it is worth to check this group one by one as it will for sure contain at least one of the desired outputs. or if two consultative filters result in the same size we will stop this filter, it will be more clear with example below.
input: 1,6,4,1,4,5,8,8,4,6,8,8,9,7,9,5,9
we start by dividing the elements according to the Least significant bit.
1st bit zero : 6,4,4,8,8,4,6,8,8
6 xor 4 xor 4 xor 8 xor 8 xor 4 xor 6 xor 8 xor 8 = 4
so we will continue dividing this group according to the 2nd bit.
1st bit zero and 2nd bit zero : 4,4,4,8,8,8,8
4 xor 4 xor 4 xor 8 xor 8 xor 8 xor 8 xor 8 = 4.
so we will continue dividing this group according to the 3rd bit.
1st bit zero and 2nd bit zero and 3rd bit zero : 8,8,8,8
8 xor 8 xor 8 xor 8 = 0
so we will go through every element under this filter as the result of xor is zero and we will add 8 to our result so far.
1st bit zero and 2nd bit zero and 3rd bit one : 4,4,4
4 xor 4 xor 4 = 4
1st bit zero and 2nd bit zero and 3rd bit one and 4th bit zero : 4,4,4
4 xor 4 xor 4 = 4.
so we will stop here as this filter contain the same size as previous filter
now we will go back to the filter of 1st and 2nd bit
1st bit zero and 2nd bit one : 6,6
6 xor 6 = 0.
so we will go through every element under this filter as the result of xor is zero and we will add 6 to our result so far.
now we will go back to the filter of 1st bit
1st bit one : 9,5,9,7,9,1,1
now we will continue under this filter as the same procedure before.
for complete example see the above image.

Mahmoud Emam
  • 151
  • 8
  • If I don't squint too hard, you are doing a "custom" counting sort. – greybeard Jan 27 '16 at 13:25
  • yes this is similar to counting sort, but i first think of it as trying to distribute the elements according to there bits, see the [image](http://i.stack.imgur.com/o8vTe.jpg) in the answer, this is what i come with first. – Mahmoud Emam Jan 27 '16 at 13:32
  • If you have numbers from 0 to 15, then O(16*n) is o(n^2). Just looking at the tree makes it clear that time complexity is not o(n). – Christopher Oezbek Feb 01 '16 at 23:05
  • @ChristopherOezbek the allowed numbers are from 0 to 15 but nothing said that repetition is not allowed, so you can have 1000 number but there values are in range 0 to 15. – Mahmoud Emam Feb 02 '16 at 04:12
0

Your outline of the problem and the example do not match. You say you're looking for 3 integers in your question, but the example shows 4.

I'm not sure this is possible without additional constraints. It seems to me that worst case size complexity will always be at least O(N-6) => O(N) without a sorted list and with the full set of integers.

If we started with sorted array, then yes, easy, but this constraint is not specified. Sorting the array ourselves will be too time or space complex.

Nick Kuznia
  • 1,698
  • 17
  • 27
-2

My stab at the an answer, using Lashane's proposal in slightly different way:

char negBits[268435456]; // 2 ^ 28 = 2 ^ 30 (number of negative integer numbers) / 8 (size of char)
char posBits[268435456]; // ditto except positive

int number[] = { 1, 6, 4, 1, 4, 5, 8, 8, 4, 6, 8, 8, 9, 7, 9, 5, 9 };

for (int num : number){
   if (num &lt 0){
       num = -(num + 1);// Integer.MIN_VALUE would be excluded without this + 1
       negBits[ &lt&lt 4] ^= ((num & 0xf) >> 1);
   }
   else {
       posBits[num &lt&lt 4] ^= ((num & 0xf) >> 1);
       // grab the rite char to mess with
       // toggle the bit to represent the integer value.
   }
}

// Now the hard part, find what values after all the toggling:

for (int i = 0; i &lt Integer.MAX_VALUE; i++){
    if (negBits[i &lt&lt 4] & ((i & 0xf) >> 1)){
        System.out.print(" " + (-i - 1));
    }
    if (posBits[i &lt&lt 4] & ((i & 0xf) >> 1)){
        System.out.print(" " + i);
    }
}

As per discussion in comments, below points are worth noting to this answer:

  • Assumes Java in 32 bit.
  • Java array have an inherent limit of Integer.MAX_INT
SGM1
  • 968
  • 2
  • 12
  • 23
  • 1
    I have the same objection here that I have to Lashane's answer. This for loop `for (int num : number)` must contain a counter that counts through the `N` different indices of the array, and assign the value to `num`. Even if you think `int` is of constant size, that counter must have size at least `log N` bits, or there is no way that the loop is possible. If fewer than `N` states are representable using the memory available then you can't keep track of what the next number is, or exit the loop at the correct time. – Chris Beck Jan 15 '16 at 21:27
  • Do you suppose that your solution use `O(1)` of additional memory? – Edgar Rokjān Jan 15 '16 at 21:28
  • @ChrisBeck this for loop should not contain counter, it uses iterator, which could internally use BIgDecimal counter with almost infinite length – Iłya Bursov Jan 15 '16 at 21:29
  • @Lashane, no that wouldn't even work, when it gets too big it would lose precision and then when you try to increment the counter it would discard your increments – Chris Beck Jan 15 '16 at 21:30
  • @ChrisBeck A counter is space complexity 1, u reuse the same variable so no addition space is used. I just didn't want to do ad full for loop for that one. – SGM1 Jan 15 '16 at 21:31
  • @SGM1: I don't know what you mean "reuse the same variable", please explain how you want to implement the for loop, because the naive way will use `log N` space at least – Chris Beck Jan 15 '16 at 21:36
  • "Assumes 32 bit" is changing the question (a lot). – Ilya Jan 15 '16 at 21:37
  • @ChrisBeck `for (int i = 0; i < number.lenth; i++)` instead of my for each loop. – SGM1 Jan 15 '16 at 21:38
  • @SGM1 Chris point is that number.length could exceed 2^31 (not in java, but in general) – Iłya Bursov Jan 15 '16 at 21:39
  • @SGM1, ok but then if the maximum represenatable `int` is less than `number.length` then you are in trouble, because you will loop infinitely or have an error depending on what language you are in – Chris Beck Jan 15 '16 at 21:40
  • @Ilya I agree, but this is the best I got, abuse the fact that is is still bound between min int and max int. – SGM1 Jan 15 '16 at 21:40
  • @ChrisBeck Ok now I see your complaint, I made the assumption this would be In Java. – SGM1 Jan 15 '16 at 21:42
  • 1
    There's no "Java" tag, is it Java-only ? – Ilya Jan 15 '16 at 21:43