I'm trying to build a single DataFrame from multiple CSV files, all of which are already downloaded. I wrote a for loop for this, but it seems to run only once: the result contains the data from just one file.

csvs = [
    "2014-2015.csv",
    "2015-2016.csv",
    "2016-2017.csv",
    "2017-2018.csv",
    "2018-2019.csv",
    "2019-2020.csv",
    "2020-2021.csv",
    "2021-2022.csv"
]
 
for csv in csvs: 
    data = pd.read_csv(csv)
    data_final = pd.DataFrame()
    data_final = data_final.append(data)
accdias
  • You seem to be (re-)creating your "final" dataframe inside the loop, effectively throwing away the previously created df in the next iteration. Move the creation before the loop. – Mike Scotty Sep 01 '21 at 10:50
  • The following link might help you https://stackoverflow.com/questions/20906474/import-multiple-csv-files-into-pandas-and-concatenate-into-one-dataframe – Rinshan Kolayil Sep 01 '21 at 10:55
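
A minimal sketch of the fix the first comment describes, assuming pandas is installed. The helper name `combine_in_loop` is hypothetical and not part of the original post. The accumulator is created once, before the loop, and `pd.concat` stands in for `DataFrame.append`, which was removed in pandas 2.0:

```python
import pandas as pd


def combine_in_loop(paths):
    """Accumulate every CSV in `paths` into one DataFrame (hypothetical helper)."""
    # Create the accumulator once, before the loop, so it survives each iteration.
    data_final = None
    for path in paths:
        data = pd.read_csv(path)
        if data_final is None:
            # First file: nothing to concatenate with yet.
            data_final = data
        else:
            # Later files: stack the new rows under the ones collected so far.
            data_final = pd.concat([data_final, data])
    return data_final
```

Calling `combine_in_loop(csvs)` with the list from the question should return the rows of all eight files. Concatenating inside the loop copies the accumulated data on every iteration, so for many files it is usually cheaper to collect the frames in a list and concatenate once.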

2 Answers

Read each CSV file into its own DataFrame, collect the DataFrames in a list, and then concatenate them with pd.concat:

import pandas as pd

csvs = ["2014-2015.csv", "2015-2016.csv", "2016-2017.csv", "2017-2018.csv", "2018-2019.csv", "2019-2020.csv", "2020-2021.csv", "2021-2022.csv"]

# Read each file into its own DataFrame.
list_of_dataframes = []
for filename in csvs:
    list_of_dataframes.append(pd.read_csv(filename))

# Concatenate all of them into a single DataFrame.
merged_df = pd.concat(list_of_dataframes)

print(merged_df)

You can do that with a comprehension like this:

import pandas as pd

csv_files = [
    '2014-2015.csv',
    '2015-2016.csv',
    '2016-2017.csv',
    '2017-2018.csv',
    '2018-2019.csv',
    '2019-2020.csv',
    '2020-2021.csv',
    '2021-2022.csv'
]

combined_df = pd.concat([pd.read_csv(path) for path in csv_files])
accdias
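
A possible variation on the approaches above, offered as a sketch rather than as part of either answer. By default `pd.concat` keeps each file's own row index, so the combined frame has repeated index values; `ignore_index=True` renumbers the rows instead. The helper name `combine_csvs` and the `source_file` column are assumptions added for illustration:

```python
import pandas as pd


def combine_csvs(paths):
    """Concatenate CSVs, tagging each row with its file name (hypothetical helper)."""
    frames = [
        # `source_file` is an assumed column name; it must not clash with a real column.
        pd.read_csv(path).assign(source_file=path)
        for path in paths
    ]
    # ignore_index=True renumbers rows 0..n-1 instead of repeating each file's index.
    return pd.concat(frames, ignore_index=True)
```

With the file list from the question, `combine_csvs(csv_files)` would return one frame in which the extra column records which season's file each row came from.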