2

In the pursuit of learning I have written a multi-threaded linear search, designed to operate on an int[] array. I believe the search works as intended, however after completing it I tested it against a standard 'for loop' and was surprised to see that the 'for loop' beat my search in terms of speed every time. I've tried tinkering with the code, but cannot get the search to beat a basic 'for loop'. At the moment I am wondering the following:

Is there an obvious flaw in my code that I am not seeing?

Is my code perhaps not well optimised for CPU caches?

Is this just the overheads of multi-threading slowing down my program and so I need a larger array to reap the benefits?

Unable to work it out myself, I am hoping someone here may be able to point me in the right direction, leading to my question:

Is there an inefficiency/flaw in my code that is making it slower than a standard loop, or is this just the overheads of threading slowing it down?

The Search:

public class MLinearSearch {



private MLinearSearch() {};

public static int[] getMultithreadingPositions(int[] data, int processors) {
    int pieceSize = data.length / processors;
    int remainder = data.length % processors;

    int curPosition = 0;
    int[] results = new int[processors + 1];
    for (int i = 0; i < results.length - 1; i++) {

        results[i] = curPosition; 
        curPosition += pieceSize;

        if(i < remainder) {
            curPosition++;
        } 
    }

    results[results.length - 1] = data.length;
    return results;
}

public static int search(int target, int[]data) {
    MLinearSearch.processors = Runtime.getRuntime().availableProcessors();
    MLinearSearch.foundIndex = -1;

    int[] domains = MLinearSearch.getMultithreadingPositions(data, processors);
    Thread[] threads = new Thread[MLinearSearch.processors];


    for(int i = 0; i < MLinearSearch.processors; i++) {
        MLSThread searcher = new MLSThread(target, data, domains[i], domains[(i + 1)]);
        searcher.setDaemon(true);
        threads[i] = searcher;
        searcher.run();
    }


    for(Thread thread : threads) {
        try {
            thread.join();
        } catch (InterruptedException e) {
            return MLinearSearch.foundIndex;
        }
    }
    return MLinearSearch.foundIndex;   
}



private static class MLSThread extends Thread {
    private MLSThread(int target, int[] data, int start, int end) {
         this.counter = start;
         this.dataEnd = end;
         this.target = target;
         this.data = data;
    }


    @Override
    public void run() {

        while(this.counter < (this.dataEnd) && MLinearSearch.foundIndex == -1) {
            if(this.target == this.data[this.counter]) {
                MLinearSearch.foundIndex = this.counter;
                return;
            } 
            counter++;
        }
    }



    private int counter;
    private int dataEnd;
    private int target;
    private int[] data; 
}


private static volatile int foundIndex = -1;
private static volatile int processors;

}

Note: "getMultithreadingPositions" is normally in a separate class. I have copied the method here for simplicity.

This is how I've been testing the code. Another test (Omitted here, but in the same file & run) runs the basic for loop, which beats my multi-threaded search every time.

public class SearchingTest {

@Test
public void multiLinearTest() {
    int index = MLinearSearch.search(TARGET, arrayData);
    assertEquals(TARGET, arrayData[index]);
}



private static int[] getShuffledArray(int[] array) {
    // https://stackoverflow.com/questions/1519736/random-shuffling-of-an-array
    Random rnd = ThreadLocalRandom.current();
    for (int i = array.length - 1; i > 0; i--)
    {
      int index = rnd.nextInt(i + 1);
      int a = array[index];
      array[index] = array[i];
      array[i] = a;
    }

    return array;
}


private static final int[] arrayData = SearchingTests.getShuffledArray(IntStream.range(0, 55_000_000).toArray());
private static final int TARGET = 7;

}

The loop beating this is literally just a for loop that iterates over the same array. I would imagine for smaller arrays the for loop would win out as its simplicity allows it to get going before my multi-threaded search can initiate its threads. At the array size I am trying though I would have expected a single thread to lose out.

Note: I had to increase my heap size with the following JVM argument:

-Xmx4096m

To avoid a heap memory exception.

Thank you for any help offered.

Levenal
  • 3,796
  • 3
  • 24
  • 29
  • 1
    `MLinearSearch.foundIndex == -1` kill most of performance. It is sgared variable and you check it every iteration. Anyway you will not gain any performance boost from multithreading here. Most time will be spend in waiting for data from memory and you randomize access by using several thread so it increase chances of cache miss. Multithreading is usefull if you perform calculation, but in your example only do few comparisions which is not enough to load CPU. – talex Aug 17 '18 at 09:56

0 Answers0