
Is it possible to speed up this mapping function by dividing the lookups between multiple processes or threads?

for k, v in map_dict.iteritems():
    result_arr[k]=input_arr[v]

Note: k and v are tuples, as result_arr and input_arr are two-dimensional.
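
For concreteness, a minimal made-up setup (sizes and coordinates are purely illustrative) would be:

import numpy as np

# Illustrative setup only: a random input image, a blank output image, and a
# dict whose keys index result_arr and whose values index input_arr, as in the loop above.
input_arr = np.random.randint(0, 256, size=(1080, 1920)).astype(np.uint8)
result_arr = np.zeros_like(input_arr)
map_dict = {(50, 60): (10, 700), (51, 60): (11, 700)}  # destination (row, col) -> source (row, col)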

  • As a rule of thumb, Python usually doesn't gain from parallel processing unless the amount of data is huge and there is little shared data. The reason is that threads don't execute fully in parallel due to the GIL, and processes can't efficiently share data. – orlp Dec 30 '13 at 14:36
  • Thanks, I am aware of this, but in this case the amount of data is huge. I will consider moving to C, but for now I would like to finish a prototype in Python. I would be grateful for any speed improvements. –  Dec 30 '13 at 14:39
  • What types are `result_arr` and `input_arr`? If they're `numpy` arrays (which your reference to their being `two dimensional` suggests), `result_arr[map_dict.keys()] = input_arr[map_dict.values()]` will probably be faster than any explicit loop, and it may be parallelized by numpy. – Blckknght Dec 30 '13 at 14:41
  • Is it possible? Yes: sort and then split the dictionary among a pool (e.g. http://stackoverflow.com/questions/3842237/parallel-processing-in-python?rq=1). Does it speed up the whole process? That depends on many factors (above all, the size of the data and the overheads). – furins Dec 30 '13 at 14:43
  • @Blckknght both are image arrays in the format (width, height, color). The dictionary contains the [x1,y1] keys to look up in one image and the [x2,y2] values saying where to place them in the output image. –  Dec 30 '13 at 15:07
  • @Furins how do I do that? –  Dec 30 '13 at 15:37
  • Wait, it's still unclear to me which algorithm and which data you are trying to optimize. Are you trying to translate/scale/rotate one image based on lookup/match points by accessing each pixel of a 2D matrix? Can you please add a more complete example of your code (including variable definitions)? – furins Dec 30 '13 at 16:10
  • @furins I'm not trying to translate, scale or rotate the image. Each pixel of one image is drawn onto the other according to the translation of that pixel referenced in the dictionary. You could think of it as scattering the image according to rules, if you like. So if you have the key/value (10,700),(202,4) then output_arr[202][4]=input_arr[10][700]. There are fewer keys in the dictionary than pixels in the image, so many of them will stay blank. I'm not sure what you want for a definition: I have output_arr = np.zeros_like(input_arr), map_dict[(i,p)]=(x,y), input_arr[x][y]=[r,g,b]. –  Dec 30 '13 at 21:06
  • Now I understand it better, thanks. I'll try to answer tomorrow if nobody else answers in the meantime (it's a bit late in my timezone, sorry). Have you tried Blckknght's approach? – furins Dec 30 '13 at 21:26

1 Answer


You may consider Theano or Cython and, building on Blckknght's comment, use this syntax:

result_arr[map_dict.keys()] = input_arr[map_dict.values()]
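
In practice, for 2-D coordinate tuples you will probably have to unpack the keys and values into explicit row and column index arrays first, otherwise numpy may treat the list of tuples as indices along the first axis only. A minimal sketch, assuming (as in the loop in the question) that the keys index result_arr and the values index input_arr:

import numpy as np

# Turn the (row, col) tuples into separate index arrays so that the fancy
# indexing addresses individual pixels rather than whole rows.
dest = np.array(list(map_dict.keys()))    # shape (N, 2): coordinates in result_arr
src = np.array(list(map_dict.values()))   # shape (N, 2): coordinates in input_arr

result_arr[dest[:, 0], dest[:, 1]] = input_arr[src[:, 0], src[:, 1]]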

Trying to split the original key list into parts and assign each part to a different multiprocessing Pool worker (as I suggested in a comment) is unlikely to improve on that much, even for huge sets of points.
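
If you still want to experiment with splitting the work anyway, here is a rough sketch of chunking the lookups over a multiprocessing Pool (parallel_remap and _gather are made-up names, and the call should live under an `if __name__ == "__main__"` guard when run as a script). Note that input_arr gets pickled and sent to every worker, which is usually where the time goes:

import numpy as np
from multiprocessing import Pool

def _gather(args):
    # Worker: pull the requested source pixels out of its own copy of the image.
    img, rows, cols = args
    return img[rows, cols]

def parallel_remap(input_arr, result_arr, map_dict, workers=4):
    dest = np.array(list(map_dict.keys()))    # (N, 2) coordinates in result_arr
    src = np.array(list(map_dict.values()))   # (N, 2) coordinates in input_arr
    chunks = np.array_split(np.arange(len(dest)), workers)
    jobs = [(input_arr, src[c, 0], src[c, 1]) for c in chunks]
    pool = Pool(workers)
    try:
        for c, pixels in zip(chunks, pool.map(_gather, jobs)):
            # Only the parent process writes to result_arr, so no shared memory is needed.
            result_arr[dest[c, 0], dest[c, 1]] = pixels
    finally:
        pool.close()
        pool.join()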

furins
  • Thanks, I am already using Cython. And `result_arr[map_dict.keys()] = input_arr[map_dict.values()]` is not producing the same results; for some reason the data is scrambled, perhaps because of some sorting problem. If there were a way to break the process up, that would be good, because I want to scale it a lot further with either multiple cores or even multiple computers. –  Jan 03 '14 at 22:45