1

I have a quick simple question. I have a dataframe such as this

  Column_4 Column_4
0     Data     Data
1   102832    79639
2    50034    21058
3    38230    18963
4    34503    14216

What I want to do is divide the values in each column by a scalar value that is derived from another dataframe, such as this.

          0         1
0  103000.0     500.0

So this is what I want to happen:

    Column_4           Column_4
0     Data               Data
1   102832/103000.0     79639/500.0
2    50034/103000.0     21058/500.0
3    38230/103000.0     18963/500.0
4    34503/103000.0     14216/500.0

I know you can divide a dataframe by scalar values, and even dataframes by values in other dataframes (by using the index to match them); however I have yet to find a way to divide a column of a dataframe by a scalar value that is obtained from another dataframe using pandas such as the above example.

Edit: There was a string in my column (Data), so I have removed that. This is the new dataframe.

  Column_4 Column_4 
1   102832   79639   
2    50034   21058    
3    38230   18963   
4    34503   14216     
5    34219   11450    
samman
  • 559
  • 10
  • 29

4 Answers4

1

You can convert the second data frame to list and do normal division (pandas will broadcast the division to each column)

# test data
import pandas as pd
df1 = pd.DataFrame({'a': [1,2,3], 'b': [4,5,6]})
df2 = pd.DataFrame({'c':[7], 'd':[8]})

#df1
#   a  b
#0  1  4
#1  2  5
#2  3  6

#df2
#   c  d
#0  7  8

res = df1[['a','b']]/df2.loc[0,['c','d']].tolist()

print(res)

#          a      b
#0  0.142857  0.500
#1  0.285714  0.625
#2  0.428571  0.750
9mat
  • 1,194
  • 9
  • 13
  • Is there anyway to use pandas commands exclusively though (without having to convert my dataframe into a list)? – samman Dec 20 '19 at 03:06
1

If there is only one row in your second df:

df1 = pd.DataFrame({"col1":["data",100,200,300,400],
                    "col2":["data",200,300,400,500]})

df2 = pd.DataFrame({"col1":[100],
                    "col2":[200]})

print (df1.iloc[1:].astype(int)/df2.values)

   col1  col2
0   1.0   1.0
1   2.0   1.5
2   3.0   2.0
3   4.0   2.5
Henry Yik
  • 22,275
  • 4
  • 18
  • 40
  • When I try that I get this error: ValueError: Unable to coerce to DataFrame, shape must be (10, 4): given (1, 2) ......which is strange because you have a (4,2) and are dividing by a (1,2) in this example. So if it works for you it should work for me as well – samman Dec 20 '19 at 02:57
  • I apologize, I had the setup incorrectly (the # of columns didn't match). With that corrected though, I get this error: TypeError: unsupported operand type(s) for /: 'str' and 'float' – samman Dec 20 '19 at 03:16
  • Looking at your data again you seemed to have `data` as a string value in your columns. If so you need to skip that column. See edited above. – Henry Yik Dec 20 '19 at 03:18
  • So I removed that row entirely. But I'm still getting the same error – samman Dec 20 '19 at 03:27
  • Try `.astype(int)` before the division then, but i think you should probably go through your column for some cleanup first. There might be more strings in your column than you realize. – Henry Yik Dec 20 '19 at 03:28
  • Does dataframe.drop actually remove the value from the dataframe? Or does it simply hide it? I notice my datatable starts from 1 now instead of zero (which would imply they're still there, just not visible anymore) – samman Dec 20 '19 at 03:35
  • You need to specify whether you want to do it in place by calling `df.drop(...,inplace=True)` or through reassignment `df = df.drop(...)`. – Henry Yik Dec 20 '19 at 03:41
  • In place simply replies with none. This is the line I used .drop([0],axis=0). Also using .astype(int) worked, but I simply don't understand why it didn't read my values as int – samman Dec 20 '19 at 03:43
  • Like i said, you either do `inplace=True` which modifies your df in place and returns None, or assign it explicitly by `df = df.drop([0],axis=0).`. [Here](https://stackoverflow.com/questions/43893457/understanding-inplace-true) is a good read on how inplace work. – Henry Yik Dec 20 '19 at 03:45
  • I understand that, what I don't understand is why I need to put in .astype(int), when my values are integers and not strings. – samman Dec 20 '19 at 03:48
  • `drop` returns a copy of the original df which had a `dtype` of `object` since you had string in it, so you need to coerce it into `int`. – Henry Yik Dec 20 '19 at 03:57
1

Try this:

import pandas as pd
import numpy as np

df1 = pd.DataFrame([[102832, 79639], [50034, 21058], [38230, 18963], [34503, 14216]], columns=['A', 'B'])
df1

    A       B
0   102832  79639
1   50034   21058
2   38230   18963
3   34503   14216

df2 = pd.DataFrame([[103000.0, 500.0]])
df2

           0        1
0   103000.0    500.0

Columns names need to be the same from both tables

df1 / pd.DataFrame(np.repeat(df2.values, df1.shape[0], axis=0), columns=['A', 'B'])

    A           B
0   0.998369    159.278
1   0.485767    42.116
2   0.371165    37.926
3   0.334981    28.432
cetres
  • 56
  • 3
  • This provides me with this error: TypeError: unsupported operand type(s) for /: 'str' and 'float' – samman Dec 20 '19 at 03:17
  • samman, the values of your first dataframe needs to be in numeric format. Try this: `df1.apply(pd.to_numeric, errors='coerce')` – cetres Dec 20 '19 at 03:32
  • I used dataframe.drop to remove it, but I'm getting the same error – samman Dec 20 '19 at 03:36
  • Can you see both columns with int64 types using this command? `df1.dtypes` – cetres Dec 20 '19 at 03:41
  • This is the output. Column_4 object Column_4 object dtype: object – samman Dec 20 '19 at 03:46
  • so this is why is not working. Both columns needs a numeric format i.e. int64, float64, etc. Not object. Verify the command: `df1 = df1.apply(pd.to_numeric, errors='coerce')` You need to set the variable df1 again – cetres Dec 20 '19 at 03:53
1

With you data set, you need to do it like this:

df4['Column_4'][1:].astype(int) / df5['0'][0]                                                                                                                                        
Out[363]: 
1    0.998369
2    0.485767
3    0.371165
4    0.334981
oppressionslayer
  • 6,942
  • 2
  • 7
  • 24