I've checked other questions here, but I don't think they answer my issue (though it is quite possible I just don't understand the solutions).

I have daily CSV data files and have created a year-long pandas DataFrame with a datetime index. I'm trying to merge all of these CSVs onto the main DataFrame and populate its columns, but because the files all share the same column names I end up with hundreds of columns carrying the _x and _y suffixes.

I want to populate these columns in place. I know there must be a logical way of doing so, but I can't seem to find it.

Edit to add info:

The original dataframe has several columns, of which I use a subset.

Index  SOC  HiTemp  LowTemp  UploadTime           Col_B  Col_C  Col_D  Col_E
0      55   24      22       2019-01-01T00:02:00  z      z      z      z
1
2

I create an empty DataFrame with the DatetimeIndex I want, then run a loop over all of the CSV files.

import os
import pandas as pd

# One row per minute for the whole of 2019 (365 * 24 * 60 = 525600 rows)
datindex = pd.date_range(start="01/01/2019", periods=525600, freq="min")
master_index = pd.DataFrame(index=datindex)

for fname in os.listdir('.'):
    data = pd.read_csv(fname)
    # "2019-01-01T00:02:00" -> "2019-01-01-00:02" (swap the 'T', drop the seconds)
    data["UploadTime"] = data["UploadTime"].str.replace('T', '-').str[:-3]
    data["UploadTime"] = pd.to_datetime(data["UploadTime"], format="%Y-%m-%d-%H:%M")
    data.drop_duplicates(subset="UploadTime", keep='first', inplace=True)
    data.set_index("UploadTime", inplace=True)
    selection = data[['Soc', 'EDischarge', 'EGridCharge', 'Echarge', 'Einput',
                      'Pbat', 'PrealL1', 'PrealL2', 'PrealL3']].copy(deep=True)
    # Each pass adds a new set of columns instead of filling the existing ones
    master_index = master_index.merge(selection, how="left", left_index=True, right_index=True)

The initial merge creates the appropriate columns in master_index, but each subsequent merge creates a new set of columns. I want them to fill the same columns instead, overwriting the NaN values that the initial merge put there. That way I should end up with as complete a dataset as possible (some days and timestamps are missing).
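To make the problem concrete, here is a minimal sketch with made-up data. The two tiny frames `day1` and `day2` and their values are hypothetical stand-ins for the daily CSVs, not real readings. It reproduces the suffixed columns and builds the single-column result I am after.

```python
import pandas as pd

# Hypothetical stand-in for the year-long index: four days instead of a year
idx = pd.date_range(start="2019-01-01", periods=4, freq="D")
master = pd.DataFrame(index=idx)

# Two made-up "daily files", each covering a different timestamp
day1 = pd.DataFrame({"Soc": [55.0]}, index=idx[:1])
day2 = pd.DataFrame({"Soc": [60.0]}, index=idx[1:2])

# Same pattern as the loop above: merge each file onto the master frame
merged = master.merge(day1, how="left", left_index=True, right_index=True)
merged = merged.merge(day2, how="left", left_index=True, right_index=True)
print(list(merged.columns))

# What I want instead: one Soc column, NaN only where no file has data
desired = pd.DataFrame({"Soc": [55.0, 60.0, None, None]}, index=idx)
print(desired)
```

The second merge finds a `Soc` column on both sides, so pandas keeps both and renames them rather than combining them.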

J.Komodo
  • Could you give some data and code along with an example of your desired output? – bjschoenfeld Aug 27 '19 at 16:35
  • Possible duplicate of [Import multiple csv files into pandas and concatenate into one DataFrame](https://stackoverflow.com/questions/20906474/import-multiple-csv-files-into-pandas-and-concatenate-into-one-dataframe) – Hampus Larsson Aug 27 '19 at 16:36

1 Answer

If you're talking about the headers as the 'appendix', you probably need to skip the first line before opening the CSVReader. EDIT: This assumes the columns in all of the CSVs are in the same order; otherwise you'd have to map them to a list after reading in the header.
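If the 'appendix' instead means the _x/_y suffixes that pandas adds on a merge, one possible approach is to avoid merging inside the loop altogether: stack the daily frames with `pd.concat`, drop repeated timestamps, and `reindex` onto the full index once at the end. This is only a sketch under assumptions: the helper name `combine_days` and the toy data are made up for illustration, and it assumes every frame is already indexed by its parsed UploadTime.

```python
import pandas as pd


def combine_days(frames, full_index):
    """Stack daily frames row-wise and align them to full_index.

    Hypothetical helper: timestamps missing from every frame become NaN rows.
    """
    combined = pd.concat(frames)
    # Keep the first row for any timestamp that appears more than once
    combined = combined[~combined.index.duplicated(keep="first")]
    return combined.sort_index().reindex(full_index)


# Made-up data standing in for two daily CSVs
full_index = pd.date_range(start="2019-01-01", periods=4, freq="min")
day1 = pd.DataFrame({"Soc": [55.0, 56.0]}, index=full_index[:2])
day2 = pd.DataFrame({"Soc": [60.0]}, index=full_index[3:4])

result = combine_days([day1, day2], full_index)
print(result)
```

In the loop from the question, that would mean appending each `selection` to a list and calling the helper once after the loop. Because the frames are stacked by rows rather than joined by columns, the shared column names never collide.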

Catamondium