In my script, I would like to read a CSV file and convert the input values at the same time. However, the value of one column depends on the value of another column (which is not itself converted). Is there any way to achieve that in read_csv, or do I have to change it after the CSV is read?

file.csv

date        total        percentage
03/25/2017  100          1%
04/15/2016  200          6%

expected output

date        total        success
03/25/2017  100          1
04/15/2016  200          12

import pandas

def convert_succes(percentage):
    # is there any way to pass a 'total' value to this function?
    return percentage / 100

names = ['date', 'total', 'success']

converters = {
    'date': pandas.to_datetime,
    'success': convert_succes,
}

input_report = pandas.read_csv('file.csv', names=names, converters=converters)
taurus05
maslak

1 Answer

Strip the trailing % from the percentage strings, convert them to float, then multiply by total and divide by 100:

df['success'] = df.total * df.percentage.str.rstrip('%').astype('float') / 100.0
print(df)

         date  total percentage  success
0  03/25/2017    100         1%      1.0
1  04/15/2016    200         6%     12.0
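
If the goal is to keep everything in one expression right after reading, the same calculation can be chained onto read_csv with DataFrame.assign. This is only a minimal sketch: it assumes the file is whitespace-delimited, and the sample rows are inlined through io.StringIO (the SAMPLE name is just for illustration) so the snippet runs on its own.

```python
import io
import pandas as pd

# Inline copy of the sample rows; assumes whitespace-delimited columns.
SAMPLE = """date        total        percentage
03/25/2017  100          1%
04/15/2016  200          6%
"""

df = (
    pd.read_csv(io.StringIO(SAMPLE), sep=r"\s+")
    # assign() receives the freshly read frame, so both columns are available here.
    .assign(
        success=lambda d: d["total"]
        * d["percentage"].str.rstrip("%").astype(float)
        / 100.0
    )
)
print(df[["date", "total", "success"]])
```

The computation still happens after the file has been parsed; assign just avoids a separate statement.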

To convert the percentage from string to float while the file is being read, pass a converter for that column:

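import pandas as pd

def p2f(x):
    # Strip the percent sign and return the value as a fraction, e.g. '6%' -> 0.06
    return float(x.strip('%')) / 100

# `file` and sep='whatever' are placeholders for your own path and delimiter
df = pd.read_csv(file, sep='whatever', converters={'percentage': p2f})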
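
A converter is called with one cell's raw string at a time, so it cannot see the total value from the same row; the multiplication has to happen once the frame exists. A short sketch of the two steps together, again assuming a whitespace-delimited file and using an inlined copy of the sample data (SAMPLE is a hypothetical name):

```python
import io
import pandas as pd

# Inline copy of the sample rows; assumes whitespace-delimited columns.
SAMPLE = """date        total        percentage
03/25/2017  100          1%
04/15/2016  200          6%
"""

def p2f(x):
    # Receives a single cell as a string, e.g. "6%"; other columns are not visible here.
    return float(x.strip("%")) / 100

df = pd.read_csv(io.StringIO(SAMPLE), sep=r"\s+", converters={"percentage": p2f})

# percentage now holds fractions (0.01, 0.06), so success is a plain product.
df["success"] = df["total"] * df["percentage"]
print(df)
```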
anky
  • Second solution does not calculate the right success, because it just removes the % and divides it by 100. It should use 'total' column to calculate proper success. – maslak Jan 30 '19 at 12:28
  • @myslak that was just for conversion, post that just multiply. :) `df.total*df.percentage` – anky Jan 30 '19 at 12:29
  • @myslak if this solution solved your problem, please consider marking it as [accepted](https://meta.stackexchange.com/questions/5234/how-does-accepting-an-answer-work/5235#5235) – anky Jan 30 '19 at 13:53