I have 3 datasets. For each one I need to read it, do some calculations on some of its columns to create new columns, and finally save the new columns in a new CSV file. I do not know how to do this dynamically, since each iteration has to save under a different name. For example, the following code does not work:

df.to_csv("./dataset/file'+i+'.csv',index=False) 

Here, i is the iteration number in my loop.

Elham

1 Answer

Referring to my answer to a similar problem, here is a solution with pandas. Assume the content of the CSV file (mock_data.csv) is as follows:

Name, Age, Gender
John, 20, Male
Jack, 22, Male
Jill, 18, Female

And my code is as follows:

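import pandas as pd

# skipinitialspace=True drops the space after each comma in the sample file
df = pd.read_csv("mock_data.csv", skipinitialspace=True)

for index, row in df.iterrows():
    file_name = row['Name'] + ".csv"  # Change the column name accordingly
    # Turn the row (a Series) back into a one-row DataFrame and write it out
    pd.DataFrame(row).T.to_csv(file_name, index=False)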

This creates file names from the values in the "Name" column (John, Jack and Jill) and produces three files: John.csv, Jack.csv and Jill.csv. The content of John.csv is as follows:

Name    | Age   |  Gender |
---------------------------
John    | 20    |  Male   |

Content of Jack.csv is as follows:

Name    | Age   |  Gender |
---------------------------
Jack    | 22    |  Male   |

Content of Jill.csv is as follows:

Name    | Age   |  Gender |
---------------------------
Jill    | 18    |  Female |
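One caveat with naming files after a column value: if the same name appears in more than one row, each later row overwrites the file written for the earlier one. A minimal sketch of one way around that, assuming you want all rows that share a name in the same file, is to group by the column first. The function name write_per_name and its out_dir parameter are made up for this illustration.

```python
from pathlib import Path

import pandas as pd


def write_per_name(df, out_dir, column="Name"):
    """Write one CSV per distinct value in `column` (hypothetical helper)."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = {}
    # groupby yields (value, sub-DataFrame) pairs, one per distinct value
    for name, group in df.groupby(column):
        path = out_dir / f"{name}.csv"
        group.to_csv(path, index=False)
        paths[name] = path
    return paths
```

With the sample data above this still gives John.csv, Jack.csv and Jill.csv, but a second "John" row would land in John.csv next to the first instead of replacing it.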

P.S.: If you don't want the header row, pass header=False when calling to_csv(). For example:

pd.DataFrame(row).T.to_csv(file_name, index=False, header=False)

The key point is to build file_name as a variable, either from one of your columns or from some other input such as a loop counter, before (or while) passing it to to_csv().
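Applied to the loop in the question, this could look roughly like the sketch below. The line in the question mixes double and single quotes, and it would also need i converted to a string before concatenating; an f-string takes care of both. The function name process_datasets, the columns a and b, and the total column are placeholders I made up, so swap in your own calculation.

```python
from pathlib import Path

import pandas as pd


def process_datasets(frames, out_dir):
    """Save one CSV per input DataFrame as file1.csv, file2.csv, ..."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for i, df in enumerate(frames, start=1):
        # Placeholder calculation: build the new column(s) from existing ones
        result = pd.DataFrame({"total": df["a"] + df["b"]})
        # The f-string converts the counter i to text inside the file name
        out_path = out_dir / f"file{i}.csv"
        result.to_csv(out_path, index=False)
        written.append(out_path)
    return written
```

You would call it with the three datasets already read in, for example process_datasets([pd.read_csv(p) for p in input_paths], "./dataset"), where input_paths is your own list of input files.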

kingmakerking