6

I'd like to be able to define a python function which does some sort of data structure traversal and performs some operation (defined in a callback) at each step. Here's a simple example to illustrate.

def min_value(list):
   m = 0
   for i in range(0, len(list)):
     if list[i] < m: m = list[i]
   return m

This example is trivial, but becomes more complex if I'm traversing an n-dimension array, a numpy array, or I want to define a more complex method of stepping through the data structure.

I'm looking for a "pythonic" analogue to the following JavaScript which does the same thing.

traverse = function(list, callback) {
   for(i = 0; i < list.length; i++) {
      callback(list[i])
   }
}

min_value = function(list) {
   m = 0
   traverse(list, function(element) {
      if(element < m) m = element
   });
}

If I wanted a function max_value(list), it'd be handy to be able to just adjust the one line to if(element > m) m = element without having to to copy/paste my traversal function. What's the best way to do this in python?

EDIT (THE ANSWER): Here I use the accepted answer below to show how my original JS example could be written in python 3. There are clearly better ways to find the minimum value of a list of scalars, but this is the pattern I was looking for.

def traverse(list, callback):
   for i in range(0, len(list)):
      callback(list[i])

def min_value(list):
   m = 0

   def update_min(val):
      nonlocal m
      if(val < m): m = val

   traverse(list, update_min)
   return m

EDIT (COMPLEX EXAMPLE): Here's a more complex and direct example of my use case. I have an image and I want to do some processing on a "focus region" of the image. Let's say, for example, I want to find the darkest 10x10, 20x20, 30x30, etc pixel regions of the image. My image is a numpy array. I'll use a "bastard" multiline lambda to illustrate what I'd like to do, though as far as I know lambdas cannot be used in this way.

In python:

# find the coordinates and of darkest region
def darkest_region(image, fsize):
   darkest_coords = [0, 0]
   darkest_avg = 256

   # illegal lambda
   g = lambda x, y, r:
      # r is an fsize x fsize region
      avg_darkness = # blah blah, traverse and average pixels
      if(avg_darkness < darkest_avg): darkest_coords = [x, y]

   focus_scan(image, fsize, g)
   return darkest_coords

# scan all regions of fsize x fsize
def focus_scan(image, fsize, callback):
   height, width = image.shape[:2]
   for y in range(0, height - fsize):
     for x in range(0, width - fsize):
        focus = image[y:y+fsize, x:x+fsize]
        callback(x, y, focus)

Given that I can't have a multiline lambda like that, I think what I need to do is get a little fancy with the arguments and return value of focus_scan, in order to accommodate a variety of possible inputs/outputs. In this example, I could pass focus scan darkest_avg and have it return the coordinates I'm looking for, but I might want to pass it something other than a scalar, or have it return a more complex data structure. I imagine I'll have to do something like:

def focus_scan(image, fsize, callback, args):
   ... # loopy loops
      some_result = callback([x, y, focus], args)

   return some_result

I like the inline function approach I can use in other programming languages, but it doesn't look like it's a common pattern in python.

CaptainStiggz
  • 1,787
  • 6
  • 26
  • 50
  • 1
    `functools.reduce`, maybe? I don't think there's a "singular way" to do this effectively, at all. Different operations will require different (specific) tools, especially with NumPy. – miradulo Mar 01 '17 at 22:33
  • Or in this case, there's already a `min` function that takes a `key` argument... – ShadowRanger Mar 01 '17 at 22:35
  • @Mitch: There are tons of general ways to accomplish stuff like this (I'd recommend learning the `functools` and `itertools` modules if you want to do more powerful stuff), e.g. the aforementioned `reduce` function, but when you have more obvious options, it's best to avoid unnecessary complication. – ShadowRanger Mar 01 '17 at 22:40
  • @Mitch: very often a 'general way' is cleaner and _shorter,_ independent of the language, as long as you hit the right generalization. Something like a "visitor pattern" is one way to think about the OP's case; "traversable" and "foldable" is another, etc. – 9000 Mar 01 '17 at 22:41
  • 1
    @9000 That is fair, I suppose I was considering more just NumPy as OP mentioned it, where vectorized routines often require some abstractions that can make a "general way" challenging, depending on how you define "general". – miradulo Mar 01 '17 at 22:43
  • Just use a normal function instead of the multiline lambda. To get access to the darkest_coords variable you can use the `nonlocal` keyword in Python 3. In Python 2 you have to mutate the list, because the `nonlocal` keyword is not available. Also, binding a lambda (a anonymous function) to a name is pointless. – skrx Mar 02 '17 at 00:30

3 Answers3

3

You can totally pass a function to your function, just like in JS.

def myFancyTraversal(data, callback):
  result = 0
  for element in data:  # or anything more fancy
    result = callback(element, result)
  return result

But before reinventing wheels, make sure you're fluent with list comprehensions, map, reduce, and lambda.

9000
  • 39,899
  • 9
  • 66
  • 104
  • 1
    Thanks for this. I edited my question to provide a more complex example use-case, along with my interpretation of your answer. Basically, I'm gonna need to pass more arguments to `myFancyTraversal` and have it actually return something, as there's no way to have tracked variables like `min_value` or `darkest_avg` scope to an inline function as in JS or other languages. So each of my different `callback` functions needs to accept all those variables as arguments, and return something for `myFancyTraversal` to return to its calling function. – CaptainStiggz Mar 01 '17 at 23:31
  • The beauty of the inline approach available in JS is that it allows me to define a function or module whose only concern is "traversing" or "scanning," without having to provide any context on what is being done with the scan. But it seems that with `myFancyTraversal`, and python in general, that context is required. In other words, the `myFancyTraversal` actually needs some context about `callback` in the form of arguments, and needs to actually return something, even though its job is simply to "traverse." Is that correct? – CaptainStiggz Mar 01 '17 at 23:51
  • 1
    I see, what you need is not a function, and not even a proper procedure with side effects; you want something like a macro that would meld with the code around it, access its local variables, etc. This way, the "callback" needs the context about the place where it would run. I stay away from such approaches because it's hard (for me) to reason about them or test them; I tend to use proper functions, pure where possible. But the macro-like approach is still attainable, as @skrx shows above. – 9000 Mar 02 '17 at 04:15
3

Just use a normal function instead of the multiline lambda. To get access to the darkest_coords variable you can use the nonlocal keyword in Python 3. In Python 2 you have to mutate the list, because the nonlocal keyword is not available.

def darkest_region(image, fsize):
    """Find the coordinates of darkest region."""
    darkest_coords = [0, 0]
    darkest_avg = 256

    def g(x, y, r):
        nonlocal darkest_coords
        # r is an fsize x fsize region
        avg_darkness = # blah blah, traverse and average pixels
        if avg_darkness < darkest_avg:
            darkest_coords = [x, y]

    focus_scan(image, fsize, g)
    return darkest_coords


def focus_scan(image, fsize, callback):
    """Scan all regions of fsize x fsize."""
    height, width = image.shape[:2]
    for y in range(0, height - fsize):
        for x in range(0, width - fsize):
            focus = image[y:y+fsize, x:x+fsize]
            callback(x, y, focus)
skrx
  • 19,980
  • 5
  • 34
  • 48
2

For your case, I would just use the builtin min function:

>>> min([1, 2, 3, 4, 5])
1

But if you wanted to have an inline function, you would use a lambda and a map. The map function is essentially a built-in version of your traverse function.

def findMin(arr):
    m = 100 #Must be a value larger than can be in your list
    min_func = lambda el: el if el < m else m #Use an inline lambda to declare a function.
    for val in map(min_func, arr): #Returns the function called on 
                                   #every item in the array
        m = val #Set m to the new lowest value
    return m

The min_func lambda is equivalent to:

def min_func(el):
    if el < m:
        return el
    return m
Community
  • 1
  • 1
rassar
  • 5,412
  • 3
  • 25
  • 41