I have an array with about 64,000 features (labeled arbitrarily 0-64,000), and I'm trying to write a function that compares each feature to each of its neighbors. Doing this iteratively takes a prohibitive amount of time. I'm trying to speed up the process by creating a nested function that is applied to each feature with pandas.DataFrame.apply(), using the following code:

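import numpy
import pandas

def textureRemover(pix, labeledPix, ratio):
    counter = 0

    numElements = numpy.amax(labeledPix)       # highest region label
    maxSize = numpy.count_nonzero(labeledPix)  # number of labeled pixels

    # One row per region ID, so that apply() visits each region once
    allElements = pandas.DataFrame(numpy.arange(numElements))

    def func(regionID):
        # With axis=1, apply() passes each row in as a pandas Series
        ...

    allElements.apply(func, axis=1)
    return pix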

Here func() needs access to the parameters of, and the variables defined within, textureRemover(), and each call of func() will alter the very large arrays pix and labeledPix.

I've tried using the line: global pix, labeledPix, counter, ratio, maxSize. However, if I use this line in textureRemover(), I receive an error that variables cannot be both parameters and global. And if I use the line within func(), I receive an error that these variables are undefined.

How can I make func() able to access and modify these variables?

asheets
  • There are a couple of questions here. The first is whether using `apply` will improve performance over iteration. The second is how to use `apply` with a function that takes multiple parameters. These should be split into two different questions, the latter of which has an answer [here](https://stackoverflow.com/a/12183507/3639023), hope this helps – johnchase Dec 29 '17 at 18:45
  • Passing a very large array into a function is as cheap as passing anything else. It's passed by reference, not copied, so any mutations done to it by the function remain. Also, you can just refer to names in the outer scope (global or not), unless you try to assign a name. So `pix[...] = ...` should just work inside `func`. – 9000 Dec 29 '17 at 19:03
  • @9000 I did not realize that! Without declaring anything as global, it seems to work. If you post that as an answer I can mark it – asheets Dec 29 '17 at 19:19
  • @asheets: Appreciated :) – 9000 Dec 29 '17 at 20:23

1 Answer

Passing a very large array into a function is as cheap as passing anything else. Python passes a reference to the object rather than a copy, so any in-place mutations the function makes are visible to the caller.
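A minimal sketch of this point, assuming a hypothetical helper `zero_first_row` and a small stand-in array (neither comes from the question):

```python
import numpy

def zero_first_row(arr):
    # Hypothetical helper: writes into the caller's array in place.
    # Nothing is returned, because no copy was ever made.
    arr[0, :] = 0

pix = numpy.ones((3, 3))   # small stand-in for the real, very large array
same_object = pix          # a second name bound to the same array object
zero_first_row(pix)        # after this call, the first row of pix is zero
```

Because only a reference is handed over, the cost of the call does not depend on the size of the array.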

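Also, you can just refer to names in the outer scope (global or not) without declaring them, unless you try to assign to a name. So `pix[...] = ...` should just work inside func, because it mutates the array rather than rebinding the name pix. Rebinding a name, such as `counter += 1`, does need a `nonlocal` declaration inside func (or `global` for a module-level name).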
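As a rough sketch of how this could look for the question's setup: the names `texture_remover`, `region`, `labels` and `image` are hypothetical, and zeroing out each region is only a placeholder for the elided body of func, not the asker's actual logic.

```python
import numpy
import pandas

def texture_remover(pix, labeled_pix):
    counter = 0  # rebound inside func, so func must declare it nonlocal
    num_elements = int(labeled_pix.max())
    # One row per region label (labels assumed to start at 1, 0 = background)
    all_elements = pandas.DataFrame({"region": numpy.arange(1, num_elements + 1)})

    def func(row):
        nonlocal counter                 # needed because counter is reassigned
        region_id = row["region"]        # with axis=1, row is a pandas Series
        mask = labeled_pix == region_id  # reading an outer name needs no declaration
        pix[mask] = 0                    # in-place mutation needs no declaration either
        counter += 1

    all_elements.apply(func, axis=1)
    return pix, counter

labels = numpy.array([[0, 1], [2, 2]])
image = numpy.array([[5.0, 6.0], [7.0, 8.0]])
result, visited = texture_remover(image, labels)
```

Here `result` is the same object as `image`, so the caller sees the changes whether or not it uses the return value.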

9000