
I have two pandas DataFrames, each of shape (15, 1).

When I subtract one from the other, the result is a DataFrame of shape (15, 15). The first column holds the correct differences, but the other fourteen columns are filled with NaN values. (I get the same result with both the traditional subtraction notation and .sub().)

Why is it introducing 14 additional columns? Shouldn't the result be a (15, 1) dataframe?


The dataframes are a concatenation of sections of another dataframe, hence the column/row labelling.

tcolbert
  • What is your code for the subtraction? – jezrael Sep 25 '20 at 06:57
  • @jezrael Something along the lines of `df3 = df1.sub(df2)` or `df3 = df1 - df2` – tcolbert Sep 25 '20 at 07:19
  • What do `print(df1.info())` and `print(df2.info())` show? – jezrael Sep 25 '20 at 07:21
  • @jezrael Ah, that may have uncovered the issue. One of them was a true Series; the other was a DataFrame with one column and 15 entries. Not realizing this, I was attempting to subtract a DataFrame from a Series, and switching the two (although the values are then negative) got rid of the unwanted columns filled with NaN values. I did not know about info(). – tcolbert Sep 25 '20 at 08:05
  • Yes, that was the reason. It is like `df1.sub(df2['B'])` in my sample. – jezrael Sep 25 '20 at 08:06

1 Answer


I think you need to subtract Series, selecting the column from each DataFrame:

import numpy as np
import pandas as pd

np.random.seed(2020)

df1 = pd.DataFrame({'A':np.random.randint(10, size=15)})
df2 = pd.DataFrame({'B':np.random.randint(10, size=15)})

s = df1['A'].sub(df2['B'])
print (s)
0    -3
1     2
2    -2
3     6
4    -1
5    -5
6     1
7     4
8    -1
9    -1
10    3
11    0
12   -2
13    1
14   -4
dtype: int32

Or select the column from the second DataFrame only and pass axis=0 to DataFrame.sub, so that the Series is aligned on the row index:

s = df1.sub(df2['B'], axis=0)
print (s)
    A
0  -3
1   2
2  -2
3   6
4  -1
5  -5
6   1
7   4
8  -1
9  -1
10  3
11  0
12 -2
13  1
14 -4
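
To see where the extra columns can come from, here is a minimal sketch. It assumes the single column of the DataFrame is labelled 0 (the actual labels in the question are not known) and uses made-up random data. By default, DataFrame.sub matches the index of a Series against the column labels of the DataFrame, so every Series label that is not a column becomes a new NaN column:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical data: a one-column DataFrame whose column label is 0 (an assumption)
df1 = pd.DataFrame({0: rng.integers(10, size=15)})
# A plain Series with the default index 0..14
s2 = pd.Series(rng.integers(10, size=15), name='B')

# Default axis='columns': the Series index (0..14) is aligned with the column labels,
# so the result has one column per label and only column 0 holds numbers
wide = df1.sub(s2)
print(wide.shape)

# axis=0 aligns the Series with the row index instead, keeping a single column
fixed = df1.sub(s2, axis=0)
print(fixed.shape)
```

Note that in the wide result, column 0 is df1[0] minus the single value s2[0], not a row-by-row difference, so it is worth checking those numbers before trusting them.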
jezrael
  • I'm not very familiar with pandas. Does specifically naming the column B to subtract from A yield a different result than subtracting the entirety of df2 from df1? – tcolbert Sep 25 '20 at 07:13
  • @tcolbert - I think it is more complicated than it seems; you can check [this](https://stackoverflow.com/questions/53217607/how-do-i-operate-on-a-dataframe-with-a-series-for-every-column) for more info. – jezrael Sep 25 '20 at 07:15
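
As a rough illustration of the difference asked about above, using small made-up frames: subtracting two whole DataFrames aligns on column labels as well as on the index, so with differently named columns A and B nothing matches and every cell is NaN. Selecting the columns, or dropping the labels with to_numpy(), avoids that:

```python
import pandas as pd

# Small hypothetical frames with differently named columns
df1 = pd.DataFrame({'A': [5, 7, 9]})
df2 = pd.DataFrame({'B': [1, 2, 3]})

# DataFrame - DataFrame: columns A and B do not match, so the result is all NaN
both = df1 - df2
print(both)

# Series - Series: only the row index is aligned
by_col = df1['A'] - df2['B']
print(by_col)

# DataFrame - ndarray: labels are ignored, values are subtracted by position
by_val = df1 - df2.to_numpy()
print(by_val)
```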