I'm trying to measure the performance improvement of my code when I run my multithreaded Android app on a multicore device (like the S3) versus a single-core Android device. To measure performance, I run the tasks sequentially and then in parallel. I first implemented plain Java threads, which didn't seem to make a difference. I then tried AsyncTask, but I only got a small performance improvement.

Can you let me know how I can write code that makes sure that each of my tasks / threads are being run on different cores as opposed to a single one? If that is not possible, how can I maximize the use of multiple cores for my app?

Here's the code for the onCreate method of the activity that executes the tasks.

@Override
protected void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    setContentView(R.layout.activity_local_executor);

    // THREAD_POOL_EXECUTOR lets the four tasks run in parallel;
    // plain execute() runs AsyncTasks one at a time on Honeycomb and later.
    multiplicationTask t1 = new multiplicationTask();
    t1.executeOnExecutor(AsyncTask.THREAD_POOL_EXECUTOR);

    multiplicationTask t2 = new multiplicationTask();
    t2.executeOnExecutor(AsyncTask.THREAD_POOL_EXECUTOR);

    multiplicationTask t3 = new multiplicationTask();
    t3.executeOnExecutor(AsyncTask.THREAD_POOL_EXECUTOR);

    multiplicationTask t4 = new multiplicationTask();
    t4.executeOnExecutor(AsyncTask.THREAD_POOL_EXECUTOR);
}

Here is the AsyncTask that is run from the onCreate method:

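import android.os.AsyncTask;
import android.util.Log;

class multiplicationTask extends AsyncTask<Integer, Integer, String> {

    @Override
    protected void onPreExecute() {
        Log.v("PrintLN", "Executing Task");
    }

    @Override
    protected String doInBackground(Integer... params) {
        // Assumption: the task times itself; the original does not show where the time comes from.
        long start = System.nanoTime();
        // Do lots of floating point operations that are independent of anything whatsoever
        long elapsedMs = (System.nanoTime() - start) / 1000000;
        // The returned value is handed to onPostExecute as `result`.
        return String.valueOf(elapsedMs);
    }

    @Override
    protected void onPostExecute(String result) {
        Log.v("PrintLN", "Done Task: " + result);
    }

}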
afahim
  • FWIW, this is called "thread affinity" and it isn't even supported in standard Java - at least [not in the JCL](http://stackoverflow.com/questions/2238272/java-thread-affinity). –  Feb 17 '13 at 06:54
  • How are your matrix multiplications implemented? I don't know whether the JIT does loop manipulations. If not, matrix multiply can be dominated by memory transfers rather than in-processor computation. I suggest substituting something that is pure compute, with little or no memory access. – Patricia Shanahan Feb 17 '13 at 07:12
  • @PatriciaShanahan for testing purposes I'm multiplying two 8x8 float arrays with each other - what else would you suggest? – afahim Feb 17 '13 at 07:17
  • 8x8 should be small enough to fit in cache, so it should be CPU bound if you do enough of them to make it measurable. – Patricia Shanahan Feb 17 '13 at 07:30

2 Answers

Android has a limitation here: you can't manually pin a thread to a particular core. The best you can do is use a ThreadPoolExecutor to control the number of threads. The OS scheduler will then distribute those threads across the cores according to how busy each core is.
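As a rough sketch of that idea, the pool below is sized to the number of cores the runtime reports, and the tasks are simply handed to it. The class name `CoreSizedPool` and the `burn` method are hypothetical stand-ins for the question's floating point work, and the code uses plain `java.util.concurrent` rather than anything Android-specific.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class CoreSizedPool {

    // Hypothetical stand-in for the independent floating point work in the question.
    static double burn(int iterations) {
        double acc = 0.0;
        for (int i = 1; i <= iterations; i++) {
            acc += Math.sqrt(i);
        }
        return acc;
    }

    // Runs taskCount independent tasks on a pool with one thread per reported core.
    static List<Double> runAll(int taskCount, int iterations) {
        int cores = Runtime.getRuntime().availableProcessors();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                cores, cores,                  // core and maximum pool size
                0L, TimeUnit.MILLISECONDS,     // keep-alive for surplus threads (none here)
                new LinkedBlockingQueue<Runnable>());
        try {
            List<Future<Double>> futures = new ArrayList<>();
            for (int t = 0; t < taskCount; t++) {
                futures.add(pool.submit(() -> burn(iterations)));
            }
            List<Double> results = new ArrayList<>();
            for (Future<Double> future : futures) {
                results.add(future.get()); // blocks until that task has finished
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runAll(4, 1000000));
    }
}
```

Which core each pool thread lands on is still up to the scheduler; the code only decides how many threads compete for the cores.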

Kavin Varnan
  • thanks @PrinceVegeta - I've also added code to my question - is this what you are talking about when referring to ThreadPoolExecutor? – afahim Feb 17 '13 at 07:04

Can you let me know how I can write code that makes sure that each of my tasks / threads are being run on different cores as opposed to a single one?

That will be handled automatically by the JVM / Android: the OS scheduler decides which core each thread runs on. If you don't see any performance gains, the likeliest reasons are:

  • the tasks are not parallelisable (for example you calculate two results but the second depends on the first so they run sequentially)
  • the tasks are not CPU-bound (e.g. if you read a huge file, the bottleneck is the speed of your storage, not the CPU, and adding threads won't help)
  • you don't start enough threads / you start too many threads

I suggest you show the code that creates and starts the threads, and give an idea of what the tasks do, if you need a more specific answer.
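One way to check whether the work is parallelisable and CPU-bound is to time the same tasks run one after another and then on a pool. The sketch below assumes the task is a repeated 8x8 float matrix multiplication, as mentioned in the comments on the question; `SpeedupCheck`, `multiplyRepeatedly`, and the repetition counts are made up for illustration, and the timings it prints will differ from device to device.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SpeedupCheck {

    // Hypothetical CPU-bound task: multiply two 8x8 float matrices `repetitions` times.
    static float multiplyRepeatedly(int repetitions) {
        float[][] a = new float[8][8];
        float[][] b = new float[8][8];
        float[][] c = new float[8][8];
        for (int i = 0; i < 8; i++) {
            for (int j = 0; j < 8; j++) {
                a[i][j] = i + j;
                b[i][j] = i - j;
            }
        }
        float checksum = 0f;
        for (int r = 0; r < repetitions; r++) {
            for (int i = 0; i < 8; i++) {
                for (int j = 0; j < 8; j++) {
                    float sum = 0f;
                    for (int k = 0; k < 8; k++) {
                        sum += a[i][k] * b[k][j];
                    }
                    c[i][j] = sum;
                }
            }
            checksum += c[r % 8][(r / 8) % 8]; // use the result so the loop is not dead code
        }
        return checksum;
    }

    // Runs the tasks one after another on the calling thread.
    static float[] runSequential(int taskCount, int repetitions) {
        float[] results = new float[taskCount];
        for (int t = 0; t < taskCount; t++) {
            results[t] = multiplyRepeatedly(repetitions);
        }
        return results;
    }

    // Runs the same tasks on a fixed pool with one thread per reported core.
    static float[] runParallel(int taskCount, int repetitions) {
        ExecutorService executor =
                Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        try {
            List<Future<Float>> futures = new ArrayList<>();
            for (int t = 0; t < taskCount; t++) {
                futures.add(executor.submit(() -> multiplyRepeatedly(repetitions)));
            }
            float[] results = new float[taskCount];
            for (int t = 0; t < taskCount; t++) {
                results[t] = futures.get(t).get();
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        int tasks = 4;
        int reps = 2000000;
        runSequential(tasks, reps); // warm-up so the JIT has compiled the hot loop
        runParallel(tasks, reps);
        long t0 = System.nanoTime();
        runSequential(tasks, reps);
        long t1 = System.nanoTime();
        runParallel(tasks, reps);
        long t2 = System.nanoTime();
        System.out.println("sequential ms: " + (t1 - t0) / 1000000);
        System.out.println("parallel ms:   " + (t2 - t1) / 1000000);
    }
}
```

If the two timings come out about the same on a multicore device, the tasks are probably too short, not CPU-bound, or dependent on each other.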

EDIT

Note that AsyncTask's primary use is to run short background tasks that interact with the UI. In your case, a plain executor would probably be better. The basic syntax to create one would be:

private static final int N_CPUS = Runtime.getRuntime().availableProcessors() + 1;
private final ExecutorService executor = Executors.newFixedThreadPool(N_CPUS);

And to use it:

executor.submit(new Runnable() {
    public void run() {
        Log.v("PrintLN", "Executing Task");
        //your code here
        Log.v("PrintLN", "Done Task: " + resulting_time);
    }
});

And don't forget to shut down the executor when you are done.
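A minimal sketch of one common shutdown pattern follows. The class name `ExecutorShutdown` and the 10 second limit are arbitrary choices for the example, not something the question prescribes.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ExecutorShutdown {

    // Submits taskCount trivial tasks, shuts the pool down, and returns how many completed.
    static int runAndShutDown(int taskCount) {
        ExecutorService executor =
                Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        AtomicInteger completed = new AtomicInteger();
        for (int t = 0; t < taskCount; t++) {
            executor.execute(() -> {
                // your code here
                completed.incrementAndGet();
            });
        }
        executor.shutdown(); // no new tasks are accepted; already submitted ones still run
        try {
            // Assumption: 10 seconds is long enough for this sketch.
            if (!executor.awaitTermination(10, TimeUnit.SECONDS)) {
                executor.shutdownNow(); // interrupt whatever is still running
            }
        } catch (InterruptedException e) {
            executor.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println(runAndShutDown(8));
    }
}
```

In an activity, the natural place for the `shutdown()` call would probably be a lifecycle callback such as `onDestroy`, so that pool threads do not outlive the screen that started them.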

Now, the performance improvement depends on a number of factors. In your case, if the tasks are too short-lived, the context switching overhead (when the CPU switches from one thread to another) can be large enough to offset the benefits of using multiple threads.
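One way to keep tasks from being too short-lived is to group many small units of work into a few larger chunks, roughly one per thread. The sketch below illustrates that; `ChunkedWork` and `unit` are hypothetical names, and `unit` merely stands in for something as small as a single 8x8 multiplication.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ChunkedWork {

    // Hypothetical short-lived unit of work; far too small to deserve a task of its own.
    static double unit(int i) {
        return (double) i * i;
    }

    // Splits totalUnits units into `chunks` contiguous ranges and submits one task per range.
    static double sumInChunks(int totalUnits, int chunks) {
        ExecutorService executor = Executors.newFixedThreadPool(chunks);
        try {
            int chunkSize = (totalUnits + chunks - 1) / chunks; // ceiling division
            List<Future<Double>> parts = new ArrayList<>();
            for (int c = 0; c < chunks; c++) {
                final int from = c * chunkSize;
                final int to = Math.min(totalUnits, from + chunkSize);
                parts.add(executor.submit(() -> {
                    double partial = 0.0;
                    for (int i = from; i < to; i++) {
                        partial += unit(i);
                    }
                    return partial;
                }));
            }
            double total = 0.0;
            for (Future<Double> part : parts) {
                total += part.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println(sumInChunks(1000000, cores));
    }
}
```

With this layout the per-task overhead is paid once per chunk rather than once per unit.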

Another possible bottleneck is synchronization: if you exchange data across threads continuously this will cause a lot of overhead too.
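To illustrate the synchronization point, the sketch below contrasts two ways of producing the same count: one touches a shared `AtomicLong` on every iteration, the other accumulates in a local variable and publishes once per thread. The class and method names are hypothetical, and the example only shows the two shapes; how large the difference is would have to be measured on the target device.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

class SharedVsLocal {

    // Submits the task `threads` times to a pool of that size and waits for completion.
    static void runOnPool(int threads, Runnable task) {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            executor.execute(task);
        }
        executor.shutdown();
        try {
            if (!executor.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Contended: every iteration updates state shared by all threads.
    static long countShared(int threads, int perThread) {
        AtomicLong shared = new AtomicLong();
        runOnPool(threads, () -> {
            for (int i = 0; i < perThread; i++) {
                shared.incrementAndGet();
            }
        });
        return shared.get();
    }

    // Mostly uncontended: each thread works locally and publishes a single result.
    static long countLocal(int threads, int perThread) {
        AtomicLong shared = new AtomicLong();
        runOnPool(threads, () -> {
            long local = 0;
            for (int i = 0; i < perThread; i++) {
                local++;
            }
            shared.addAndGet(local);
        });
        return shared.get();
    }

    public static void main(String[] args) {
        System.out.println(countShared(4, 1000000));
        System.out.println(countLocal(4, 1000000));
    }
}
```

Both methods return the same total; the second simply exchanges data across threads once per task instead of once per iteration.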

assylias
  • Thanks @assylias, I've updated the question with the code, can you please look into it now? Let me know if you want more info – afahim Feb 17 '13 at 07:02