5

So I failed the programming interview question that is like

"Given an array of ints 1, 2, ..., n with one of them missing, find the missing one."

The interviewer said the correct answer is to sum the numbers and subtract the sum from n(n+1)/2, that is, apply the formula https://en.wikipedia.org/wiki/1_%2B_2_%2B_3_%2B_4_%2B_%E2%8B%AF

enter image description here

and said that any computer science student would've done this. My solution was like

char takenSpots [] = n*malloc(sizeof(char)); 
for (int k = 0; k < n; ++k) takenSpots[arr[k]-1] = 'x';
for (int k = 0; k < n; ++k) if (takenSpots[k] != 'x') return (k+1);

which isn't as "cool" as the summation solution that I confess I would've never thought to try.

First of all, isn't there danger of overflow using the summation method? I mean, what if arr contains ~((int)0) and ~((int)0) - 1 ? Then won't arr[0] + arr[1] + ... + arr[n-1] overflow? Or will the solution still work since 1 + 2 + ... + n overflows too?

user5124106
  • 393
  • 2
  • 3
  • 11
  • 1
    `for(i = 1; i < n; i++){ if(arr[i - 1] != arr[i] - 1){ printf("Found it: %i\n", arr[i] - 1); } }` or am I interpreting the question wrong? – Kninnug Aug 15 '15 at 20:55
  • `char takenSpots [] = n*malloc(sizeof(char));` is several compiler errors. – melpomene Aug 15 '15 at 20:55
  • 2
    @Kninnug: It may not be clear from the question, but the idea is that the integers are not necessarily in order. – psmears Aug 15 '15 at 20:56
  • One of your `takenSpots[k]` is uninitialized. – melpomene Aug 15 '15 at 20:57
  • @psmears Ah, that makes sense. They should've phrased that though. – Kninnug Aug 15 '15 at 20:57
  • 2
    Well, you should definitely start by initializing all the entries in the `takenSpots` array! – barak manos Aug 15 '15 at 21:02
  • From a practical point of view, the interviewer is completely wrong. He asked about a special case that you will never, ever encounter in practice, with a solution that only applies to this very special case. But there is a whole huge class of problems that you will actually encounter are solved sometimes with a bit map, sometimes using a set, so I would have been more than happy with the "takenSpots" solution if it was implemented correctly, because that shows the candidate can solve problems that may actually happen in real life. – gnasher729 Aug 15 '15 at 21:55
  • @Kninnug I had exactly the same idea, also not realizing they could be out of order. That would be a lot faster than summing them all. But with them out of order I cannot think of a better way than the interviewer's answer. – WDS Aug 16 '15 at 07:34
  • 1
    Why would that be faster? n/2 comparisons on average, n worst case. If you assume that you have 1 to n with one missing, you can do a binary search in O (log n). Now _that_ is faster. – gnasher729 Aug 16 '15 at 08:13
  • This question http://stackoverflow.com/questions/7153659/find-an-integer-not-among-four-billion-given-ones has some answers that go into detail about the special case of only one missing integer. It also has some correctly-implemented (unlike the OPs) taken-map solutions (mostly with bitmaps), and some clever and space-efficient alternatives to one-pass bitmaps. – Peter Cordes Aug 16 '15 at 08:27

4 Answers4

8

Indeed, the solution the interviewer proposed would suffer from overflow. In fact, there is a better solution that does not suffer from overflow, which the interviewer might have followed up with if you had suggested the summation one.

The better solution is much less intuitive, but it's pretty well-known nowadays, as far as I know.

It relies on the following:

x xor y = y xor x
x xor 0 = x
x xor x = 0

So the better solution involves computing the xor of all the given numbers and then xoring that with the xor of all numbers from 1 to n. Those that appear in your array will cancel out with those from 1 to n. The one that was missing will remain.

As for how to think about such solutions, there's no recipe. You just have to be familiar with such challenge / interview questions and have some training in computer science and math.

Don't feel too bad about not getting it. I almost certainly would not have gotten the summation solution in an interview setting if I hadn't seen the problem before after finishing college (I might have figured it out eventually in my own time) and I definitely would not have come up with the xor solution without seeing it first. These kind of questions are pretty much hit or miss. And if it's a hit, they'll probably ask something else until you don't know the answer.

They don't do this to catch you off guard. They do it to see how you think about it, usually: can you come up with a naive solution (you did), can you tell what's wrong with it? can you think of ways to improve it, maybe with some nudging in the right direction? can you explain how you'd go about researching a better solution? The thought process can be more important than the solution.

If all the interviewer said was that your solution is bad and you should have known the better solution (I don't think there's a checklist of things you must absolutely know about X after finishing a major in X), you should look for a better place to work.

IVlad
  • 43,099
  • 13
  • 111
  • 179
4

If you use unsigned integers, the method using the formula will still work because C standard specifies modular arithmetic for multiplication and addition. With n-bit integers you are guaranteed to get the correct result modulo 2^n, and since you know the result must be in range 0-2^n you know it must be correct.

Joni
  • 108,737
  • 14
  • 143
  • 193
  • Are you sure? How do you divide by 2 in modular arithmetic? – IVlad Aug 15 '15 at 20:57
  • 4
    You're right, you have to be a little bit more careful and divide n or n+1 (which ever is even) by 2 before the multiplication (i.e. before modular arithmetic is needed) – Joni Aug 15 '15 at 21:02
  • 2
    If you're careful to do that (note that `(a + b) % c = (a % c + b % c) % c` and the same applies to multiplication, but not to division: (a / b) % c != ((a % c) / (b % c)) % c)`, this should work. If you're testing language knowledge, this is more valuable than the xor solution I think. – IVlad Aug 15 '15 at 21:15
0

Almost any computer science student should have solved it. His solution runs in O(n) time and O(1) space, your solution runs in O(n) time with bigger constant and O(n) space. So it is not surprising that his solution is better.

Regarding the overflow. Almost with any problem you can come up with the input that will overflow. If you hear the problem "how to add two numbers" and then see the solution a + b, it will also suffers overflow. Or output the number you get from input also suffers overflow.

A good idea during the interview is to ask what kind of input do you expect. And when you propose the solution, you have to tell what kind of problems will you expect. My add two numbers will work if the sum is less than X. If you want better, use big int arithmetic.

P.S. have you thought that you might not have enough memory for malloc operation when the number n is too high?

Salvador Dali
  • 214,103
  • 147
  • 703
  • 753
  • I didn't downvote, but I disagree with "Almost any computer science student should have solved it". It's a question that requires mathematical knowledge or familiarity with such problems that is not necessarily needed in many programming jobs. And they don't teach this stuff in college AFAIK. – IVlad Aug 15 '15 at 21:04
  • 1
    @IVlad, thanks for your opinion. But I thought that mathematical knowledge is something CS student should already have, applying to CS or develop during the study. If one does not know how to sum numbers from 1 to n, it is hard to imagine how will he analyses the complexity of qsort or dijkstra. But I agree, it is hard to tell what CS student can and can not do, this is why I added "almost". This means that out 100 CS people, I expect 90 to know it, and would consider that 10 are not yet ready to take a job. – Salvador Dali Aug 15 '15 at 21:16
  • I don't think it's necessarily an issue of not knowing those things, but an issue of identifying that they are needed for the problem and putting them all together. Someone might know what xor does, but I'm willing to bet that most people who know what xor is would not figure out the xor solution if they haven't heard of it before. It's one thing to know the individual elements, and another to be able to tell that they can be applied to a given problem. For the latter, you need knowledge about those kind of problems, and such knowledge is not really developed during formal CS study, AFAIK. – IVlad Aug 15 '15 at 21:19
0

This is a math question, not a programming question. I'm not sure if computer science majors take a math course that includes series, and if not, then it's somewhat random if a programmer has encountered this before. There are other series like power series as part of many http://en.wikipedia.org/wiki/List_of_mathematical_series.

At what point do these questions stop being programming questions and mathematical questions instead?

rcgldr
  • 27,407
  • 3
  • 36
  • 61