find the duplicate number with O(1) space and O(n) time

Question

I'm solving a question in leetcode

Given an array nums containing n + 1 integers where each integer is between 1 and n (inclusive), prove that at least one duplicate number must exist. Assume that there is only one duplicate number, find the duplicate one in O(n) time and O(1) space complexity

class Solution(object):
    def findDuplicate(self, nums):
        """
        :type nums: List[int]
        :rtype: int
        """
        xor=0
        for num in nums:
            newx=xor^(2**num)
            if newx<xor:
                return num
            else:
                xor=newx

I got the solution accepted but I have been told that it is neither O(1) space nor O(n) time.

can anyone please help me understand why?

This is probably a duplicate of [Find repeating in O(n) and constant space](https://stackoverflow.com/questions/9072600/find-repeating-in-on-and-constant-space) and/or [How to find time complexity of an algorithm](https://stackoverflow.com/questions/11032015/how-to-find-time-complexity-of-an-algorithm) or [Time complexity of power()](https://stackoverflow.com/questions/5231096/time-complexity-of-power) — Bernhard Barker, Sep 14 '17 at 15:03

score 6 · Answer 1 · answered Sep 14 '17 at 15:05

6

Your question is actually hard to answer. Typically when dealing with complexities, there's an assumed machine model. A standard model assumes that memory locations are of size log(n) bits when the input is of size n, and that arithmetic operations on numbers of size log(n) bits are O(1).

In this model, your code isn't O(1) in space and O(n) in time. Your xor value has n bits, and this doesn't fit in a constant memory location (it actually needs n/log(n) memory locations. Similarly, it's not O(n) in time, since the arithmetic operations are on numbers larger than log(n) bits.

To solve your problem in O(1) space and O(n) time, you've got to make sure your values don't get too large. One approach is to xor all the numbers in the array, and then you'll get 1^2^3...^n ^ d where d is the duplicate. Thus you can xor 1^2^3^..^n from the total xor of the array, and find the duplicate value.

def find_duplicate(ns):
    r = 0
    for i, n in enumerate(ns):
        r ^= i ^ n
    return r

print find_duplicate([1, 3, 2, 4, 5, 4, 6])

This is O(1) space, and O(n) time since r never uses more bits than n does (that is, approximately ln(n) bits).

answered Sep 14 '17 at 15:05

Paul Hankin

54,811
11
92
118

Another method would be to xor all the values and if n is not 2^x-1, xor the result with n+1, n+2, ... until you reach a power of two minus 1, I think it would require less xor operations, but isn't that elegant. – maraca Sep 14 '17 at 15:23
Computing xor 1..n seems inefficient. Another solution on the same theme in any language with numbers following modular arithmetic (like unsigned in C) is to add the numbers and then subtract their sum (which is ((unsigned)n)*(n+1)/2). – Hans Olsson Sep 14 '17 at 15:29
It's not `n/log(n)` memory locations, it's `n/k` where `k` is the number of bits per location. – Mark Ransom Sep 14 '17 at 15:29
1

@HansOlsson this has the disadvantage that the sum can get too big, with the xor solution you just need as many bits as n. – maraca Sep 14 '17 at 15:32
1

@HansOlsson similarly, you can compute the value of 1^2^3...^n quickly, based on n mod 4. http://www.geeksforgeeks.org/calculate-xor-1-n/ . I computed it explicitly because it's easier to explain and doesn't change the complexity. – Paul Hankin Sep 14 '17 at 16:12
1

@MarkRansom Yes, but the Transdichotomous model assumes the memory locations are of size ~lg(n). – Paul Hankin Sep 14 '17 at 16:14

sascha · Answer 2 · 2017-09-14T15:03:45.987

0

Your solution is not O(1) space, meaning: your space/memory is not constant but depending on the input!

newx=xor^(2**num)

This is a bitwise XOR over log_2(2**num) = num bits, where num is one of your input-numbers, resulting in a log_2(2**num) = num bit result.

So n=10 = log_2(2^10) = 10 bits, n=100 = log_2(2^100) = 100 bits. It's growing linearly (not constant).

It's also not within O(n) time-complexity as you got:

an outer loop over all n numbers
- and a non-constant / non O(1) inner-loop (see above)
- assumption: XOR is not constant in regards to bit-representation of input
  - that's not always treated like that; but physics support this claim (Chandrasekhar limit, speed of light, ...)

edited Sep 14 '17 at 15:03

answered Sep 14 '17 at 14:08

sascha

32,238
6
68
110

10 numbers will result in 11 bits, not 1024 bits. And 100 numbers would be 101 bits. The space complexity is linear, not exponential. – interjay Sep 14 '17 at 14:18
10 bits can express 1024 numbers, the number of bits is not as big as you say, only the calculated number to xor with. – maraca Sep 14 '17 at 14:18
You're on the right track with this answer, but your calculation of the number of bits required is way off. Let me know when you fix it. – Mark Ransom Sep 14 '17 at 14:23
XOR is actually constant (not dependent on the bit pattern), except that it grows as the size of the integers grow. – Mark Ransom Sep 14 '17 at 14:57
My term *bit-representation* includes the length (which i suppose is used like that in complexity theory too). – sascha Sep 14 '17 at 14:59

score 0 · Answer 3 · answered Aug 26 '21 at 13:35

This question has to be solved with linked list Floyd's algorighm.

Convert the array to a linked list. There are n+1 positions but only n values.

For example if you have this array: [1,3,4,2,2] convert it to linked list.

How the pointing works

Starting from index 0, look at which element in that position. it is 1. Then index 0 will point to nums1. 0 is pointing 3. then figure out which value 3 is pointing to. That will nums[3] and so on.

Now you converted this to linked list, you have to use Floyd's hare and tortoise algorithm. Basically you have two pointers, slow and fast. If there is cycle, slow and fast pointers are gonna meet at some point.

from typing import List
class Solution:
    def findDuplicate(self, nums: List[int]) -> int:
        # slow and fast are index
        slow,fast=0,0
        while True:
            slow=nums[slow]
            fast=nums[nums[fast]]
            if slow==fast:
                break
        # so far we found where slow and fast met.
        # to find where cycle starts we initialize another pointer from start, let's name is start
        # start and slow will move towards each other, and meeting point will be the point that you are looking for
        start=0
        while True:
            slow=nums[slow]
            start=nums[start]
            if slow==start:
                return slow

Notice none of the elements after first index ever points to value at index 0. because our range is 1-n. We are tracking where we point to by nums[value] but since no value will be 0, nothing will point to nums[0]

score 0 · Answer 4 · answered Dec 19 '22 at 11:39

0

You can find the xor of all the number in the array (lets call it x) and then calculator xor of the number 1,2,3,....,n (lets call it y). Now, x xor y will be your answer

answered Dec 19 '22 at 11:39

Gaurav Jha

161
1
7

find the duplicate number with O(1) space and O(n) time

4 Answers4