
After doing some processing on an audio or image array, the result needs to be normalized to a given range before it can be written back to a file. This can be done like so:

# Normalize audio channels to between -1.0 and +1.0
audio[:, 0] = audio[:, 0] / abs(audio[:, 0]).max()
audio[:, 1] = audio[:, 1] / abs(audio[:, 1]).max()

# Normalize image to between 0 and 255
image = image / (image.max() / 255.0)

Is there a less verbose way to do this, ideally with a convenience function? matplotlib.colors.Normalize() doesn't seem to be related.

– endolith (edited by Ashwin Nanjappa)

8 Answers

Answer (score 210)
# Normalize audio channels to between -1.0 and +1.0
audio /= np.max(np.abs(audio),axis=0)
# Normalize image to between 0 and 255
image *= (255.0/image.max())

Using /= and *= allows you to eliminate an intermediate temporary array, thus saving some memory. Multiplication is less expensive than division, so

image *= 255.0/image.max()    # Uses 1 division and image.size multiplications

is marginally faster than

image /= image.max()/255.0    # Uses 1+image.size divisions

Since we are using basic numpy methods here, I think this is about as efficient a solution as numpy allows.
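
A quick way to sanity-check the multiplication-vs-division claim (a sketch; absolute timings vary by machine and array size):

import timeit
import numpy as np

image = np.random.rand(1000, 1000)

mul = timeit.timeit(lambda: image * (255.0 / image.max()), number=100)
div = timeit.timeit(lambda: image / (image.max() / 255.0), number=100)
print(f"multiply: {mul:.3f}s  divide: {div:.3f}s")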


In-place operations do not change the dtype of the container array. Since the desired normalized values are floats, the audio and image arrays need to have a floating-point dtype before the in-place operations are performed. If they are not already of floating-point dtype, you'll need to convert them using astype. For example,

image = image.astype('float64')
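
Putting it together for an integer-valued image (a minimal sketch; the data is illustrative):

import numpy as np

image = np.random.randint(0, 1000, size=(4, 4))  # hypothetical integer data
image = image.astype('float64')                  # float dtype first
image *= 255.0 / image.max()                     # in-place scaling now works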
– unutbu (edited by Alex Punnen)
  • Why is multiplication less expensive than division? – endolith Nov 14 '09 at 22:41
  • I don't know exactly why. However, I am confident of the claim, having checked it with timeit. With multiplication, you can work with one digit at a time. With division, especially with large divisors, you have to work with many digits, and "guess" how many times the divisor goes into the dividend. You end up doing many multiplication problems to solve one division problem. The computer algorithm for doing division may not be the same as human long division, but nevertheless I believe it's more complicated than multiplication. – unutbu Nov 15 '09 at 00:49
  • Probably worth mentioning a divide by zero for blank images. – cjm2671 Jun 22 '14 at 13:21
  • @endolith multiplication is less expensive than division because of the way it's implemented at the Assembly level. Division algorithms can't be parallelized as well as multiplication algorithms. https://en.wikipedia.org/wiki/Binary_multiplier – Mat Jones Nov 27 '16 at 04:27
  • @mjones.udri Yeah but if you're dividing an entire array by a scalar, shouldn't it save time by multiplying by the scalar's inverse? – endolith Nov 27 '16 at 04:44
  • @endolith if you tell it to behave that way, then yes. – Mat Jones Nov 28 '16 at 18:28
  • @mjones.udri Are there numerical problems with doing it automatically, though? – endolith Nov 28 '16 at 19:29
  • @endolith nope! Think about the definition of division: multiplication by inverse. 10 / 5 = 10 * (1 / 5) – Mat Jones Nov 29 '16 at 15:26
  • Minimizing the number of divisions in favor of multiplications is a well-known optimization technique. – Mat Jones Nov 29 '16 at 15:27
  • @mjones.udri Yes that's true for mathematical concepts but I'm asking if it's true for fixed-length floating-point numbers. Does it cause any numerical error in odd cases like denormals, etc? – endolith Nov 29 '16 at 15:33
  • @endolith with floating point numbers that's possible, but I'm not sure. I think it depends on how much precision (if any) is lost during manipulation. – Mat Jones Nov 29 '16 at 15:56
  • you eliminate an intermediate temporary array, but you may get `TypeError: Cannot cast ufunc multiply output from dtype('float64') to dtype('int64') with casting rule 'same_kind'` if `image` is int type. – RNA Aug 11 '19 at 14:38
  • what am I missing, won't this normalize between -255 and 255, after multiplying the [-1, 1] normalized array by 255? – Jazz Weisman Feb 07 '23 at 19:17
Answer (score 129)

If the array contains both positive and negative data, I'd go with:

import numpy as np

a = np.random.rand(3,2)

# Normalised [0,1]
b = (a - np.min(a))/np.ptp(a)

# Normalised [0,255] as integer: don't forget the parentheses before astype(int)
c = (255*(a - np.min(a))/np.ptp(a)).astype(int)

# Normalised [-1,1]
d = 2.*(a - np.min(a))/np.ptp(a)-1

If the array contains nan, one solution could be to just remove them as:

def nan_ptp(a):
    return np.ptp(a[np.isfinite(a)])

b = (a - np.nanmin(a))/nan_ptp(a)

However, depending on the context you might want to treat nan differently, e.g. interpolate the value, replace it with e.g. 0, or raise an error.
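
For instance, a minimal sketch of the replace-with-0 option (the nan keyword requires numpy >= 1.17):

a = np.nan_to_num(a, nan=0.0)  # replace nan with 0 before normalizing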

Finally, even though it's not OP's question, standardization is worth mentioning:

e = (a - np.mean(a)) / np.std(a)
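
As a comment below notes, this is also available as scipy.stats.zscore; a quick consistency check (assuming scipy is installed):

from scipy import stats
import numpy as np

a = np.random.rand(3, 2)
e = (a - np.mean(a)) / np.std(a)
assert np.allclose(e, stats.zscore(a, axis=None))  # z-score over the whole array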
– user2821 (edited by Alex Poca)
  • Depending on what you want, this is not correct, as it flips the data. For example, the normalization to [0, 1] puts the max at 0 and the min at 1. For [0, 1], you can simply subtract the result from 1 to get the correct normalization. – Alan Turing May 20 '18 at 10:48
  • Thanks for pointing it out @AlanTuring, that was very sloppy. The code, as posted, ONLY worked if the data contained both positive and negative values. That might be rather common for audio data. However, the answer is updated to normalise any real values. – user2821 May 20 '18 at 11:10
  • The last one is also available as `scipy.stats.zscore`. – Lewistrick May 10 '19 at 09:01
  • d might flip the sign of samples. If you want to keep the sign you can use: `f = a / np.max(np.abs(a))`... unless the whole array is all zeroes (avoid DivideByZero). – Pimin Konstantin Kefaloukos Dec 21 '19 at 13:51
  • Please make sure `ptp` value is not 0 to not receive `nan`. – Mcmil Mar 11 '20 at 13:31
  • `numpy.ptp()` returns 0 if that is the range, but `nan` if there is one `nan` in the array. However, if the range is 0, normalization is not defined. This raises an error as we attempt to divide by 0. – user2821 Mar 12 '20 at 02:27
Answer (score 46)

You can also rescale using sklearn.preprocessing.scale. The advantages are that you can normalize the standard deviation in addition to mean-centering the data, and that you can do this along either axis: by features or by records.

from sklearn.preprocessing import scale
X = scale(X, axis=0, with_mean=True, with_std=True, copy=True)

The keyword arguments axis, with_mean, and with_std are self-explanatory, and are shown in their default state. The argument copy performs the operation in place if it is set to False.
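
A small sketch of the per-column behaviour on a toy array (the data is illustrative):

import numpy as np
from sklearn.preprocessing import scale

X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
X_scaled = scale(X, axis=0)      # standardize each column
print(X_scaled.mean(axis=0))     # ~[0. 0.]
print(X_scaled.std(axis=0))      # ~[1. 1.]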

– cjohnson318 (edited by Intrastellar Explorer)
  • X = scale( [1,2,3,4], axis=0, with_mean=True, with_std=True, copy=True ) gives me an error – Yfiua Apr 06 '16 at 08:28
  • X = scale( np.array([1,2,3,4]), axis=0, with_mean=True, with_std=True, copy=True ) gives me an array of [0,0,0,0] – Yfiua Apr 06 '16 at 08:28
  • sklearn.preprocessing.scale() has the drawback that you do not know what is going on. What is the factor? What compression of the interval? – MasterControlProgram Nov 29 '16 at 18:48
  • These scikit preprocessing methods (scale, minmax_scale, maxabs_scale) are meant to be used along one axis only (so either scale the samples (rows) or the features (columns) individually). This makes sense in a machine learning setup, but sometimes you want to calculate the range over the whole array, or use arrays with more than two dimensions. – Toby Nov 16 '17 at 15:56
  • Does not work for arrays with dimension > 2. – zfj3ub94rf576hc4eegm Dec 18 '20 at 05:58
Answer (score 21)

You are trying to min-max scale the values of audio between -1 and +1 and image between 0 and 255.

Using sklearn.preprocessing.minmax_scale should easily solve your problem.

e.g.:

from sklearn.preprocessing import minmax_scale

audio_scaled = minmax_scale(audio, feature_range=(-1, 1))

and

shape = image.shape
image_scaled = minmax_scale(image.ravel(), feature_range=(0,255)).reshape(shape)

note: Not to be confused with the operation that scales the norm (length) of a vector to a certain value (usually 1), which is also commonly referred to as normalization.
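
For contrast, a minimal sketch of that norm-scaling operation:

import numpy as np

v = np.array([3.0, 4.0])
unit = v / np.linalg.norm(v)  # the length becomes 1; the range does not become [0, 1]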

– fabda01
Answer (score 18)

This answer to a similar question solved the problem for me with

np.interp(a, (a.min(), a.max()), (-1, +1))
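
A quick usage sketch showing the minimum and maximum mapping to the new endpoints:

import numpy as np

a = np.array([0.0, 5.0, 10.0])
print(np.interp(a, (a.min(), a.max()), (-1, +1)))  # [-1.  0.  1.]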
– Surya Narayanan
Answer (score 12)

You can use the "i" (as in idiv, imul..) version, and it doesn't look half bad:

image /= (image.max()/255.0)

For the other case you can write a function to normalize a 2-D array by columns:

def normalize_columns(arr):
    # Scale each column in place so its peak absolute value is 1.0
    rows, cols = arr.shape
    for col in range(cols):
        arr[:, col] /= abs(arr[:, col]).max()
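
A usage sketch (assuming arr already has a float dtype, per the caveat in the accepted answer):

import numpy as np

arr = np.random.randn(4, 2)      # e.g. two channels of float data
normalize_columns(arr)           # modifies arr in place
print(np.abs(arr).max(axis=0))   # [1. 1.]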
– u0b34a0f6ae
Answer (score 6)

A simple solution is to use the scalers offered by the sklearn.preprocessing library.

import sklearn.preprocessing as sk

scaler = sk.MinMaxScaler(feature_range=(0, 250))
scaler = scaler.fit(X)
X_scaled = scaler.transform(X)
# Checking reconstruction
X_rec = scaler.inverse_transform(X_scaled)

The reconstruction error X_rec - X will be zero. You can adjust the feature_range for your needs, or even use a standard scaler, sk.StandardScaler().

– Pantelis
  • does not work for 1D array – Wildhammer Oct 25 '21 at 17:14
  • Sure, if you consult the function's documentation (https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MinMaxScaler.html): X is array-like of shape (n_samples, n_features), the data used to compute the per-feature minimum and maximum used for later scaling along the features axis. You can just do X = X[..., np.newaxis] (multiple samples, one feature) and it will work for a 1-D array, as sketched below. – Pantelis Oct 26 '21 at 20:14
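
A minimal sketch of that 1-D workaround (the array contents are illustrative):

import numpy as np
from sklearn.preprocessing import MinMaxScaler

x = np.array([1.0, 2.0, 3.0, 4.0])
scaler = MinMaxScaler(feature_range=(0, 250))
x_scaled = scaler.fit_transform(x[..., np.newaxis]).ravel()
print(x_scaled)  # approximately [0, 83.33, 166.67, 250]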
Answer (score 3)

I tried following this, and got the error

TypeError: ufunc 'true_divide' output (typecode 'd') could not be coerced to provided output parameter (typecode 'l') according to the casting rule ''same_kind''

The numpy array I was trying to normalize was an integer array. It seems this kind of implicit type casting was deprecated in versions > 1.10, and you have to use numpy.true_divide() to resolve it.

import numpy as np

arr = np.array(img)
arr = np.true_divide(arr, [255.0], out=None)

img was a PIL.Image object.
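
Equivalently (a minimal sketch, not from the original answer), converting the dtype up front avoids the casting error entirely:

import numpy as np

arr = np.asarray(img, dtype=np.float64)  # cast to float before dividing
arr /= 255.0                             # in-place division is now safe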

– srdg