10

I was asked this question in an interview. Although the interview was for dot net position, he asked me this question in context to java, because I had mentioned java also in my resume.

How to find the index of an element having value X in an array ?

I said iterating from the first element till last and checking whether the value is X would give the result. He asked about a method involving less number of iterations, I said using binary search but that is only possible for sorted array. I tried saying using IndexOf function in the Array class. But nothing from my side answered that question.

Is there any fast way of getting the index of an element having value X in an array ?

teenup
  • 7,459
  • 13
  • 63
  • 122

10 Answers10

16

As long as there is no knowledge about the array (is it sorted? ascending or descending? etc etc), there is no way of finding an element without inspecting each one.

Also, that is exactly what indexOf does (when using lists).

f1sh
  • 11,489
  • 3
  • 25
  • 51
  • 1
    "... without inspecting each one until you find it.", actually. If your element is the first one, you don't need to inspect every element n the array/list. – Joachim Sauer Aug 17 '10 at 08:33
  • 2
    Sorry, i thought that was obvious! Anyone who doesn't mind that in his implementation doesn't deserve to run it... – f1sh Aug 17 '10 at 08:56
  • But that doesn't mean that it couldn't be done faster i.e. by parallelization. – Peladao Aug 17 '10 at 12:35
  • Parrallelization is only potentially faster. If the item is near the start of the array then you're performing more work to parrelalize the task than you would have from just starting to search the array. Parrelalization isn't a magic bullet that makes everything faster, it's better for large tasks for sure, but there's an overhead cost that makes it less efficient for small tasks. – Doctor Jones Nov 12 '13 at 11:29
12

How to find the index of an element having value X in an array ?

This would be fast:

int getXIndex(int x){
    myArray[0] = x;
    return 0;
}
Ultimate Gobblement
  • 1,851
  • 16
  • 23
  • 2
    your function should receive the array as a parameter. Other than that, excellent answer (which also illustrates one danger of TDD) :) – Mikeage Aug 17 '10 at 10:38
  • 2
    I must start writing my code like this - it saves so much time. – mdma Aug 17 '10 at 11:02
  • 2
    @fortran, you _do_ realise that the hover-help for the upvote button says "useful" rather than "funny", yes? I'm all for humour but, unless it comes with an _actual_ useful answer, it's probably best left in a comment :-) – paxdiablo Apr 10 '15 at 10:40
9

A practical way of finding it faster is by parallel processing.

Just divide the array in N parts and assign every part to a thread that iterates through the elements of its part until value is found. N should preferably be the processor's number of cores.

Peladao
  • 4,036
  • 1
  • 23
  • 43
  • Nice idea. But is it really useful to scale it using the processor's core count? The VM doesn't divide the thread equally amongst all cores, or does it? – f1sh Aug 17 '10 at 11:11
  • 2
    java has the ParallelArray (generic) and ParallelLongArray (primitive) for this. http://gee.cs.oswego.edu/dl/jsr166/dist/extra166ydocs/extra166y/ParallelLongArray.html. The `indexOf` operation is implemented using the N partitions scheme. – mdma Aug 17 '10 at 11:13
  • @f1sh - in practice (on Windows, where most of my experience lies) you find cpu usage can approach 100% for all cores when using a ForkJoinPool sized to the number of available cores. And this also gives you the benefit of using more L1 cache than with the single-threaded case, which since L1 is usually dedicated to each core/pair of cores. – mdma Aug 17 '10 at 11:16
  • That is the responsibility of the VM. In general, using multiple threads on a multi-core CPU should always be faster than just one thread. (This is disregarding the overhead of dividing the array and starting threads. The array should of course be sufficiently large to justify parallelization). – Peladao Aug 17 '10 at 11:20
  • 1
    For greater parallelism you could always assign the work to a GPU or two: http://www.jcuda.de/ – Ultimate Gobblement Aug 17 '10 at 12:27
3

If a binary search isn't possible (beacuse the array isn't sorted) and you don't have some kind of advanced search index, the only way I could think of that isn't O(n) is if the item's position in the array is a function of the item itself (like, if the array is [10, 20, 30, 40], the position of an element n is (n / 10) - 1).

gustafc
  • 28,465
  • 7
  • 73
  • 99
2

Maybe he wants to test your knowledge about Java.

There is Utility Class called Arrays, this class contains various methods for manipulating arrays (such as sorting and searching)

http://download.oracle.com/javase/6/docs/api/java/util/Arrays.html

In 2 lines you can have a O(n * log n) result:

    Arrays.sort(list); //O(n * log n)
    Arrays.binarySearch(list, 88)); //O(log n)
Torres
  • 5,330
  • 4
  • 26
  • 26
  • Sorting is already `O(n*log(n))` in the best case for mergesort (that is the algorithm implemented in Java utils if I remember well), so your solution is worse than linear (`O(n*log(n)^2)` indeed). – fortran Aug 17 '10 at 13:15
  • 1
    should be O(n*log n) as the sorting (assuming comparison model) takes O(n*log n) and the binary search is O(log n) so the total is O(logn + nlogn) = O(nlogn) – Ahmed Kotb Aug 17 '10 at 13:18
  • @Ahmed you're quite right! I'd update my comment but the edition time has expired :-s – fortran Aug 17 '10 at 13:20
  • You´re right Ahmed, I was only thinking in the search problem. I´ve fixed it, thank you. – Torres Aug 18 '10 at 09:34
  • Extract for the documentation: _Sorts the specified array of bytes into ascending numerical order. The sorting algorithm is a tuned **quicksort**, adapted from Jon L. Bentley and M. Douglas McIlroy's "Engineering a Sort Function", Software-Practice and Experience, Vol. 23(11) P. 1249-1265 (November 1993). This algorithm offers n*log(n) performance on many data sets that cause other quicksorts to degrade to quadratic performance._ – Torres Aug 18 '10 at 09:42
  • 2
    The problem with this is that it _doesn't_ give you the position of the element in the array as requested, simply because _it changes the order of the array!_ There is no relationship between the index after sorting and the index before sorting. – paxdiablo Apr 10 '15 at 10:42
1

Puneet - in .net its:

string[] testArray = {"fred", "bill"};
var indexOffset = Array.IndexOf(testArray, "fred");

[edit] - having read the question properly now, :) an alternative in linq would be:

string[] testArray = { "cat", "dog", "banana", "orange" };
int firstItem = testArray.Select((item, index) => new
{
    ItemName = item,
    Position = index

}).Where(i => i.ItemName == "banana")
  .First()
  .Position;

this of course would find the FIRST occurence of the string. subsequent duplicates would require additional logic. but then so would a looped approach.

jim

jim tollan
  • 22,305
  • 4
  • 49
  • 63
  • That method does not exist in Java... a) What Array class are you using? b) method names don't start with uppercase letters in java – f1sh Aug 17 '10 at 08:19
  • It seems you havn't read the entire question, I asked them for using the IndexOf function but they were asking me to write logic myself. They were saying that the simple loop you have written is correct but this can be improved and made in less number of iterations. Just think about it!!! – teenup Aug 17 '10 at 08:20
  • f1sh - oops, correct and corrected... puneet - yes, can see that now. kinda rushed it a bit!! – jim tollan Aug 17 '10 at 08:21
  • 1
    @Puneet Dudeja well, since it's not possible to do this in less number of iterations, they wanted you to say so and explain why. Either that or they're just plain wrong. – KeatsPeeks Aug 17 '10 at 08:22
  • This is basically the *opposite* of what the OP wants (!); you are simply adding *more* overhead (the construction of many new objects of anonymous type) to what is effectively still the determination of an item's index by enumeration (using `Where`/`First`). – Dan Tao Aug 17 '10 at 13:14
1

It's a question about data structures and algorithms (altough a very simple data structure). It goes beyond the language you are using.

If the array is ordered you can get O(log n) using binary search and a modified version of it for border cases (not using always (a+b)/2 as the pivot point, but it's a pretty sophisticated quirk).

If the array is not ordered then... good luck.

He can be asking you about what methods you have in order to find an item in Java. But anyway they're not faster. They can be olny simpler to use (than a for-each - compare - return).

There's another solution that's creating an auxiliary structure to do a faster search (like a hashmap) but, OF COURSE, it's more expensive to create it and use it once than to do a simple linear search.

helios
  • 13,574
  • 2
  • 45
  • 55
0

If the only information you have is the fact that it's an unsorted array, with no reletionship between the index and value, and with no auxiliary data structures, then you have to potentially examine every element to see if it holds the information you want.

However, interviews are meant to separate the wheat from the chaff so it's important to realise that they want to see how you approach problems. Hence the idea is to ask questions to see if any more information is (or could be made) available, information that can make your search more efficient.

Questions like:


1/ Does the data change very often?

If not, then you can use an extra data structure.

For example, maintain a dirty flag which is initially true. When you want to find an item and it's true, build that extra structure (sorted array, tree, hash or whatever) which will greatly speed up searches, then set the dirty flag to false, then use that structure to find the item.

If you want to find an item and the dirty flag is false, just use the structure, no need to rebuild it.

Of course, any changes to the data should set the dirty flag to true so that the next search rebuilds the structure.

This will greatly speed up (through amortisation) queries for data that's read far more often than written.

In other words, the first search after a change will be relatively slow but subsequent searches can be much faster.

You'll probably want to wrap the array inside a class so that you can control the dirty flag correctly.


2/ Are we allowed to use a different data structure than a raw array?

This will be similar to the first point given above. If we modify the data structure from an array into an arbitrary class containing the array, you can still get all the advantages such as quick random access to each element.

But we gain the ability to update extra information within the data structure whenever the data changes.

So, rather than using a dirty flag and doing a large update on the next search, we can make small changes to the extra information whenever the array is changed.

This gets rid of the slow response of the first search after a change by amortising the cost across all changes (each change having a small cost).


3. How many items will typically be in the list?

This is actually more important than most people realise.

All talk of optimisation tends to be useless unless your data sets are relatively large and performance is actually important.

For example, if you have a 100-item array, it's quite acceptable to use even the brain-dead bubble sort since the difference in timings between that and the fastest sort you can find tend to be irrelevant (unless you need to do it thousands of times per second of course).

For this case, finding the first index for a given value, it's probably perfectly acceptable to do a sequential search as long as your array stays under a certain size.


The bottom line is that you're there to prove your worth, and the interviewer is (usually) there to guide you. Unless they're sadistic, they're quite happy for you to ask them questions to try an narrow down the scope of the problem.

Ask the questions (as you have for the possibility the data may be sorted. They should be impressed with your approach even if you can't come up with a solution.

In fact (and I've done this in the past), they may reject all your possibile approaches (no, it's not sorted, no, no other data structures are allowed, and so on) just to see how far you get.

And maybe, just maybe, like the Kobayashi Maru, it may not be about winning, it may be how you deal with failure :-)

paxdiablo
  • 854,327
  • 234
  • 1,573
  • 1,953
0

Take a perfectly unsorted array, just a list of numbers in memory. All the machine can do is look at individual numbers in memory, and check if they are the right number. This is the "password cracker problem". There is no faster way than to search from the beginning until the correct value is hit.

Kevin Kostlan
  • 3,311
  • 7
  • 29
  • 33
0

Are you sure about the question? I have got a questions somewhat similar to your question.

Given a sorted array, there is one element "x" whose value is same as its index find the index of that element.

For example:

         //0,1,2,3,4,5,6,7,8,9, 10     
int a[10]={1,3,5,5,6,6,6,8,9,10,11};

at index 6 that value and index are same.

for this array a, answer should be 6.

This is not an answer, in case there was something missed in the original question this would clarify that.

thkala
  • 84,049
  • 23
  • 157
  • 201
Mani
  • 603
  • 1
  • 5
  • 6