I have been using the Java version of libsvm for many data mining problems. However, I noticed that even on a multicore machine, libsvm uses only one core; it does not parallelize the problem. When I searched the FAQ I found a C++ solution [http://www.csie.ntu.edu.tw/~cjlin/libsvm/faq.html#f432]. The existing Java method looks like this:
@Override
float[] get_Q( int i, int len )
{
    float[][] data = new float[1][];
    int start, j;
    if ( ( start = cache.get_data( i, data, len ) ) < len )
    {
        for ( j = start; j < len; j++ )
        {
            data[0][j] = ( float ) ( y[i] * y[j] * kernel_function( i, j ) );
        }
    }
    return data[0];
}
I tried to apply the same idea in Java by changing the for loop of get_Q in class SVC_Q as follows:
@Override
float[] get_Q( int i, int len )
{
    float[][] data = new float[1][];
    int start, j;
    if ( ( start = cache.get_data( i, data, len ) ) < len )
    {
        ExecutorService executorService = Executors.newFixedThreadPool( Runtime.getRuntime()
            .availableProcessors() ); // number of threads
        for ( j = start; j < len; j++ )
        {
            final int count = j;
            executorService.submit( new Runnable()
            {
                @Override
                public void run()
                {
                    data[0][count] = ( float ) ( y[i] * y[count] * kernel_function( i, count ) );
                }
            } );
        }
        executorService.shutdown();
    }
    return data[0];
}
After this change it does use all the cores on my machine, but the results got worse: the percentage of correctly classified instances on a new test set dropped from 78% to 58%, and the training time did not decrease either. So obviously I am not doing this right. Is there a proper way to parallelize libsvm? What is the mistake in my code?
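One thing I am not sure about: shutdown() only stops the pool from accepting new tasks, it does not wait for the queued tasks to finish, so my get_Q may be returning a partially filled row. Here is a minimal standalone sketch of the wait-for-completion pattern I think is needed (the idx * 0.5f values are just a stand-in for the real kernel computation):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParallelFill {
    // Fill data[j] in parallel; the value is a hypothetical stand-in
    // for y[i] * y[j] * kernel_function(i, j).
    static float[] fill(int len) throws InterruptedException {
        final float[] data = new float[len];
        ExecutorService pool = Executors.newFixedThreadPool(
                Runtime.getRuntime().availableProcessors());
        for (int j = 0; j < len; j++) {
            final int idx = j;
            pool.submit(new Runnable() {
                @Override
                public void run() {
                    data[idx] = idx * 0.5f; // simulated kernel value
                }
            });
        }
        pool.shutdown();                            // stop accepting new tasks...
        pool.awaitTermination(1, TimeUnit.MINUTES); // ...and block until queued tasks ran
        return data;                                // only now is the array fully filled
    }

    public static void main(String[] args) throws InterruptedException {
        float[] d = fill(1000);
        System.out.println(d[999]);
    }
}
```

Without the awaitTermination call the caller can read the array while worker threads are still writing it, which would explain results changing from run to run; creating a fresh pool and one task per element inside every get_Q call also adds overhead that could eat any speedup.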