I'm trying to subtract each column's average from that column of an array using slicing and broadcasting. I don't understand how to transpose, or why I need to. Right now I have the given array Y:

    Y_avg = Y.mean(axis=0)
    Z = (Y.T - Y_avg).T

This is supposed to create an array whose column-wise averages are all 0, but that's not what I am getting.

  • This works just fine for me. The reason you need to transpose is because of how numpy internally broadcasts array shapes. If you tried to do `Y - Y_avg` directly, it would not have the correct shapes to perform the operation since `.mean()` on an axis effectively drops a dimension. Then once you perform the operation on a transposed `Y`, you transpose the result back to the original shape of `Y`. – Philip Ciunkiewicz Jul 01 '20 at 21:58

2 Answers

And what are you getting? Initializing an array and subtracting its mean with axis=0 (the only axis, because this is a 1D array) works as intended.

    import numpy as np

    Y = np.array([1, 2, 3])
    Y_avg = Y.mean(axis=0)  # scalar mean of the 1D array
    print(Y - Y_avg)        # print() call, so this also runs on Python 3

This outputs [-1. 0. 1.] as expected.
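
The 1D case never broadcasts across two axes, so it may not reproduce the problem. As a rough sketch (the arrays below are made up for illustration), the transposed expression from the question can be tried on 2D data: on a square array it runs but subtracts column mean j from row j, and on a non-square array it does not broadcast at all.

```python
import numpy as np

# Hypothetical square data, chosen only for illustration
Y = np.array([[1.0, 2.0],
              [3.0, 6.0]])

Y_avg = Y.mean(axis=0)   # column means: [2. 4.]
Z = (Y.T - Y_avg).T      # expression from the question

# Row j has had column mean j subtracted, so the columns are not centered
print(Z)
print(Z.mean(axis=0))    # not zero

# With a non-square array the same expression fails to broadcast
Y_rect = np.arange(6.0).reshape(3, 2)
try:
    (Y_rect.T - Y_rect.mean(axis=0)).T   # (2, 3) - (2,)
    rect_ok = True
except ValueError:
    rect_ok = False
print(rect_ok)
```

If the real Y happens to be square, this would explain getting a result with no error but with non-zero column averages.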

pkthudah

What you're seeing is that taking the mean along an axis drops that dimension. With `axis=1`, the result goes from shape (n, k) to shape (n,), which isn't compatible with (n, k) for broadcasting a subtraction. Plenty has been written on that, e.g. here https://stackoverflow.com/a/24564015/3798897
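
To make the shape bookkeeping concrete, here is a small sketch on a hypothetical (n, k) = (3, 2) array. It only inspects shapes and checks which subtraction NumPy accepts; broadcasting aligns shapes from the trailing axis, so a length-k vector lines up with the columns while a length-n vector does not.

```python
import numpy as np

Y = np.arange(6.0).reshape(3, 2)   # hypothetical (n, k) = (3, 2) array

print(Y.mean(axis=0).shape)   # (2,): one mean per column
print(Y.mean(axis=1).shape)   # (3,): one mean per row

# (3, 2) - (2,) broadcasts: the means align with the last axis
col_centered = Y - Y.mean(axis=0)

# (3, 2) - (3,) does not broadcast: 2 and 3 disagree on the last axis
try:
    Y - Y.mean(axis=1)
    row_broadcast_ok = True
except ValueError:
    row_broadcast_ok = False
print(row_broadcast_ok)
```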

Instead of multiple transposes it might be more convenient to reshape the averages so that they're broadcastable:

    # Turn the 1D row means, shape (n,), into a 2D column vector, shape (n, 1)
    Y_avg = Y.mean(axis=1).reshape(-1, 1)
    Z = Y - Y_avg

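
An alternative to reshaping by hand is the `keepdims=True` argument of `mean`, which keeps the reduced axis with length 1. The sketch below uses made-up data and is one possible way to write it, not the only one; it covers both the column-wise case from the question and the row-wise case.

```python
import numpy as np

# Hypothetical data, chosen only for illustration
Y = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [6.0, 30.0]])

# Column-wise centering: the means keep shape (1, 2) and broadcast down the rows
col_mean = Y.mean(axis=0, keepdims=True)
Z = Y - col_mean

# Row-wise centering: the means keep shape (3, 1), same as reshape(-1, 1)
row_mean = Y.mean(axis=1, keepdims=True)
Z_rows = Y - row_mean

print(Z.mean(axis=0))       # column averages, all (numerically) zero
print(Z_rows.mean(axis=1))  # row averages, all (numerically) zero
```

Because the reduced axis is kept, the same line works for either axis without any transposes.
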
Hans Musgrave