I've been trying to subtract two matrices (I'm not sure they are really matrices, since one of them is a pandas Series), but the result is wrong. I've added the code and outputs. How can I get the correct 200x1 result?

*X is 200x4 and w is 4x1

Code

apotamkinn
  • Can you provide a sample and the expected output, please? – Corralien Mar 06 '22 at 10:25
  • The problem is that you are falling into the [numpy broadcasting](https://numpy.org/doc/stable/user/basics.broadcasting.html) trap. You need to make sure that y also has shape (200, 1) instead of (200,). You can do this by adding a [newaxis](https://stackoverflow.com/questions/29241056/how-does-numpy-newaxis-work-and-when-to-use-it). – PrinsEdje80 Mar 06 '22 at 10:27
  • I tried converting y_hat to a pandas Series but haven't tried the inverse. Thank you. – apotamkinn Mar 06 '22 at 10:32

2 Answers

Maybe this example helps:

import numpy as np
import pandas as pd

rng = np.random.default_rng(2022)
df = pd.DataFrame(rng.integers(0, 10, (5, 4)))
sr = pd.Series(rng.integers(0, 10, (5, )))
>>> df
   0  1  2  3
0  7  2  7  0
1  1  6  9  0
2  0  6  8  7
3  8  1  5  0
4  0  4  8  9

>>> sr
0    3
1    9
2    0
3    2
4    6
dtype: int64

>>> df - sr  # wrong: aligns sr's index with df's column labels
   0  1  2  3   4
0  4 -7  7 -2 NaN
1 -2 -3  9 -2 NaN
2 -3 -3  8  5 NaN
3  5 -8  5 -2 NaN
4 -3 -5  8  7 NaN

>>> df.sub(sr, axis=0)  # works: aligns sr's index with df's row index
   0  1  2  3
0  4 -1  4 -3
1 -8 -3  0 -9
2  0  6  8  7
3  6 -1  3 -2
4 -6 -2  2  3
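
Applied to shapes like those in the question, a minimal sketch might look like the following. The random data and the assumption that X is a DataFrame, w a NumPy array, and y a Series are mine, since the original code is not shown as text. It shows two ways to get matching shapes before subtracting.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(200, 4)))  # hypothetical features, 200x4
w = rng.normal(size=(4, 1))                  # hypothetical weights, 4x1
y = pd.Series(rng.normal(size=200))          # hypothetical targets, shape (200,)

y_hat = X.to_numpy() @ w                     # NumPy array, shape (200, 1)

# Option 1: turn y into a (200, 1) column so both operands are 2-D
resid_col = y.to_numpy().reshape(-1, 1) - y_hat

# Option 2: flatten y_hat to (200,) and keep the result as a Series
resid_sr = y - y_hat.ravel()

print(resid_col.shape, resid_sr.shape)
```

Option 1 gives the 200x1 array asked for. Option 2 keeps the pandas index, which may be more convenient if the result goes back into a DataFrame.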
Corralien

You can reshape y_hat:

y - y_hat.reshape(-1,1)

The reason you get (200,200) is numpy broadcasting. It treats the 1-D y_hat of shape (200,) as (1,200), so to make the shapes match, numpy broadcasts both y and y_hat to (200,200) and then does the subtraction.
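
The shapes involved can be checked with a small sketch. The arrays here are made-up stand-ins, assuming y is a (200, 1) column and y_hat is 1-D as described above.

```python
import numpy as np

n = 200
y = np.arange(n, dtype=float).reshape(-1, 1)  # stand-in column vector, shape (200, 1)
y_hat = np.ones(n)                            # stand-in 1-D array, shape (200,)

# (200, 1) - (200,): the 1-D operand is treated as (1, 200),
# and both are stretched to (200, 200) before subtracting.
wrong = y - y_hat
print(wrong.shape)

# Reshaping y_hat into a column keeps the subtraction element-wise.
right = y - y_hat.reshape(-1, 1)
print(right.shape)

# Indexing with np.newaxis adds the same trailing axis.
also_right = y - y_hat[:, np.newaxis]
```

Printing `.shape` on both operands before subtracting is a quick way to catch this kind of mismatch.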

sagi