I have a project that is an image processing app for android devices. For working with image I choose opencv android framework. The whole project consist of some general parts such as blocking the input image, compute dct of each block, sorting the result, compare the feature that get from each block, and finally show some results.
I write this project but it contain so many heavy computing like dct, sorting etc, so I can't even run it on my emulator because it take long time and my laptop shutdown middle of processing. I decided to optimize the processing using parallel computing and gpu programming (it is obvious that some parts like computing dct of blocks can become parallel, but I am not sure about some other parts like sorting), anyway there is a problem that I can't find any straightforward tutorial for doing this.
Here is the question, is there any way to do that or not ? I need it to be global for most of android device not for an especial device !!!
Or beside the gpu programming and parallel computing is there anyway to speed the processing up? (maybe there is other libraries better than opencv!)