
NumPy newbie here. I'm trying to normalize (aka feature scaling, standardization) my inputs to a neural network. I'm just doing linear scaling, and the formula I'm using is:

I = Imin + (Imax-Imin)*(D-Dmin)/(Dmax-Dmin)

where I is the scaled input value, Imin and Imax are the desired min and max of the scaled values, D is the original data value, and Dmin and Dmax are the min and max of the original data values. I want a Python function that takes a NumPy array and returns an array with all the values normalized. This is what I'm thinking so far.

def get_normalized_values(array):
    """I = Imin + (Imax-Imin)*(D-Dmin)/(Dmax-Dmin)"""
    imin = -1
    imax = 1
    dmin = array.amin()
    dmax = array.amax()

    normalized = imin + (imax - imin)*(array - dmin)/(dmax - dmin)

    return normalized

My question is: will this work? Or do I have to loop through each element in the array and perform the math? Can you just do math like this with arrays and scalars? That is, will array - dmin create a new temporary array where each value has dmin subtracted? I'm not sure if this is the right terminology, but I think this is a "vectorized" approach?

Update

Is there a way to have this modify the array in place? That is, rather than returning a copy of the array, can the function take the array and modify the original?

User
  • Things like this normally work fine, so just try it. One thing: amin and amax should just be min and max. Or argmin/argmax. – Brian Larsen Apr 13 '12 at 21:28

2 Answers


I believe you need to change the calls amin() and amax() to just be calls to min() and max(), as in my_array.max().

Otherwise, this should work fine. You can do things in NumPy much like Octave/Matlab, such as adding a scalar to an array, and it automatically applies the operation to every element. Sometimes you might need slightly different syntax (like knowing the difference between numpy.dot() and just multiplying two arrays, which is element-wise), but in general things like this are as straightforward as you have indicated.
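As a minimal sketch of the function with the min()/max() change applied: the `imin` and `imax` keyword arguments and the sample data are my own additions for illustration, and I'm assuming a floating-point input whose max differs from its min (otherwise the division is by zero).

```python
import numpy as np

def get_normalized_values(array, imin=-1.0, imax=1.0):
    """I = Imin + (Imax-Imin)*(D-Dmin)/(Dmax-Dmin)"""
    dmin = array.min()  # method form, not array.amin()
    dmax = array.max()
    # The scalars are applied to every element; no explicit loop is needed.
    # The result is a new array; the input is left untouched.
    return imin + (imax - imin) * (array - dmin) / (dmax - dmin)

# Illustrative data only (assumed float dtype).
data = np.array([0.0, 5.0, 10.0])
scaled = get_normalized_values(data)
print(scaled)  # [-1.  0.  1.]
```

Each intermediate expression such as `array - dmin` produces a temporary array, which is what "vectorized" usually refers to here.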

ely
  • What's the difference between amin, amax and min, max? – User Apr 13 '12 at 21:38
  • I don't believe that `amin()` is a member function of array types in NumPy. I'm pretty sure you use `numpy.amin()` to compute a minimum along an axis, and that `my_array.amin()` will throw an error (it definitely throws an error for me in NumPy 1.5.1 when I test your code above). The proper method that is implemented for array types is the `min()` method, such as `my_array.min()`. So you can choose to call `amin` if you want, but you can't call it with the dot-syntax after the array's name. – ely Apr 13 '12 at 22:00
  • Also, there's a small personal aesthetic preference. I use the NumPy function `argmin()` often to get the index of a minimum entry. And I don't like the potential confusion of `min`, `argmin` and `amin`. I tend to only use things that are class methods for the array types, and I avoid any array operation that is applied by calling `np.some_function(array)`. I try to stick to `array.some_function()`, and there is no such thing for `amin()`, it can only be called as `np.amin(array)`. – ely Apr 13 '12 at 22:05
  • That's because I modified my question and was hoping for an answer on the edit. But instead I decided to open a new question: http://stackoverflow.com/questions/10149416/numpy-modify-array-in-place – User Apr 13 '12 at 23:11
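A short sketch of the distinction discussed in the comments above, using a made-up three-element array: `np.amin` is a module-level function, `min()` and `argmin()` are array methods, and there is no `amin` attribute on the array itself.

```python
import numpy as np

# Illustrative array; the values are arbitrary.
a = np.array([3.0, 1.0, 2.0])

print(a.min())             # method form, prints 1.0
print(np.amin(a))          # function form, same value: 1.0
print(a.argmin())          # index of the minimum entry: 1
print(hasattr(a, "amin"))  # False, so a.amin() raises AttributeError
```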

It's Python - just try it (tm)

I really don't know the answer, but my way of finding out would be to paste the code from the question into an IPython terminal session. Generally, whenever I have wondered how to do something like this in NumPy, the simple way has worked.
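For the in-place part of the question, here is the kind of thing one could paste into a session to try. This is only a sketch under assumptions: `normalize_in_place` is a hypothetical name, the array is assumed to have a floating-point dtype (augmented assignment on an integer array would not be able to hold the fractional results), and max is assumed to differ from min.

```python
import numpy as np

def normalize_in_place(array, imin=-1.0, imax=1.0):
    """Rescale a float array to [imin, imax], overwriting its contents."""
    dmin = array.min()
    dmax = array.max()
    # Augmented assignments write into the existing buffer
    # instead of binding the name to a new array.
    array -= dmin
    array *= (imax - imin) / (dmax - dmin)
    array += imin

# Illustrative data only.
data = np.array([0.0, 5.0, 10.0])
normalize_in_place(data)
print(data)
```

The function returns nothing; the caller's array itself now holds the scaled values.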

Martin Beckett