I have been given 1000 CSV files (data0.csv, data1.csv, data2.csv, ..., data999.csv) and I need to read all of them. My approach was to read data0.csv and transpose it, then loop over all the data*.csv files, transpose each one, and append them. However, I am getting an error. Could someone help me out?

Reading data0.csv and transposing it:

import pandas as pd

df = pd.read_csv('data0.csv')
print(df.head(10))
df_temp = df
df_main = df_temp.transpose()
df_main

new_df = [df_main]
for i in range(1000):
    filename = "data%d.csv" % i
    df_s = pd.read_csv(filename)
    new_df = pd.concat([df_s])
new_df[1]

[Screenshot: looping through the 1000 files, transposing, and concatenating]

After transposing and appending all 1000 CSV files, I should get a DataFrame of 1000 rows x 150 columns, but that is not what I am getting.

  • Do you have the same headers for the files? – Ranika Nisal Jul 26 '20 at 04:00
  • Yes. Here is a link to the header of the dataset: https://res.cloudinary.com/dnec0sr03/image/upload/v1595737510/Screen_Shot_2020-07-26_at_12.24.01_AM_bdooqr.png I have also added pictures of my code. –  Jul 26 '20 at 04:25
  • I believe https://stackoverflow.com/questions/20906474/import-multiple-csv-files-into-pandas-and-concatenate-into-one-dataframe should answer your question then. Additionally, when posting new questions, try to post a sample of the code you wrote instead of a picture; it makes it easier for others to debug. – Ranika Nisal Jul 26 '20 at 04:34
  • Hi, I have now posted the sample code along with the error. Could you please check it? I had already seen that Stack Overflow link before posting this, and it didn't work for me. –  Jul 26 '20 at 04:41

1 Answer

I couldn't test this, because you did not provide an example of your file as text. Please try to provide a minimal reproducible example next time.

My solution is a minor variation of this SO post mentioned by @Ranika Nisal.

import pandas as pd
# Read every file into a list of DataFrames, then stack them row-wise.
dfs = [pd.read_csv(f'data{i}.csv') for i in range(1000)]
df = pd.concat(dfs, axis=0, ignore_index=True)
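The two lines above stack the files as they are. If each file has to be transposed first, as described in the question, something along these lines might work. This is only a sketch: the helper name load_transposed is made up for illustration, and it assumes every file has a header row and the same number of data rows, which I could not verify without a sample file.

```python
from pathlib import Path

import pandas as pd


def load_transposed(directory, n_files):
    """Read data0.csv .. data{n_files-1}.csv, transpose each, and stack them.

    Hypothetical helper; assumes every file has a header row and the
    same number of data rows.
    """
    frames = []
    for i in range(n_files):
        path = Path(directory) / f"data{i}.csv"
        # .T turns the rows of one file into columns.
        frames.append(pd.read_csv(path).T)
    # ignore_index=True replaces the repeated header labels with 0..n-1.
    return pd.concat(frames, axis=0, ignore_index=True)
```

For the files in the question this would be called as load_transposed('.', 1000). If each file holds a single column of 150 values, the result should have one row per file and 150 columns.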

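Your solution did not build a list of all the DataFrames, which is what pd.concat() needs in order to combine them. Inside the loop, pd.concat([df_s]) receives a one-element list, and its result overwrites new_df on every iteration, so new_df ends up holding only the last file. Also, new_df[1] does not select the second DataFrame: on a DataFrame it looks up a column labeled 1, and no such column exists. That is most likely why you received a KeyError.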
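To make the difference concrete, here is a small sketch that uses three tiny in-memory DataFrames in place of the CSV files. The column names a and b are invented for the demonstration.

```python
import pandas as pd

# Three small stand-ins for the CSV files (made-up data).
frames = [pd.DataFrame({"a": [i], "b": [i + 1]}) for i in range(3)]

# Pattern from the question: the result is overwritten on every pass,
# so only the last frame survives.
result = None
for frame in frames:
    result = pd.concat([frame])
print(result.shape)  # (1, 2)

# Indexing a DataFrame with [1] looks for a column labeled 1.
raised = False
try:
    result[1]
except KeyError:
    raised = True
print(raised)  # True

# Passing the whole list to pd.concat combines all frames.
combined = pd.concat(frames, axis=0, ignore_index=True)
print(combined.shape)  # (3, 2)
```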

above_c_level