
I've been experimenting with some data management techniques using pandas and the csv module. What I'm trying to do is read a CSV file, append some extra rows to it, and save it again in an identical format.

I've created a DataFrame of shape (250, 20) with random values, dates as the index, and letters as columns, then saved it as a CSV file. Ultimately, what I've tried is to append the same DataFrame below the existing CSV file.

import csv

def _writeBulk(savefile, data):
    df = data.reset_index()
    with open(savefile, 'w', newline='') as outfile:
        writer = csv.writer(outfile)
        writer.writerows(df.to_numpy().tolist())

def _writeData(savefile, data):
    df = data.reset_index()
    with open(savefile, 'w', newline='') as outfile:
        writer = csv.writer(outfile)
        for row in range(df.shape[0]):
            writer.writerow(df.iloc[row].tolist())

When reading the file back in after the edit, the result I'm expecting is a DataFrame of shape (500, 20). But it seems the file no longer has headers (columns), giving a shape of (499, 200).


I've searched for solutions or explanations, and skipping the header while writing rows is the closest I've come to the actual issue:

Skip the headers when editing a csv file using Python

Any explanations or solutions would be appreciated.

Ji-Mu Kang
    1. Headers: Either add writer.writeheaders or use csv.DictWriter. 2. "I've tried is to append the same dataframe below an existing csv file." - 'w' mode will overwrite your whole file, not append. Any existing headers and all existing rows will be lost – h4z3 Oct 06 '21 at 13:43
  • Can't believe I didn't notice that... thank you very much. I'll try 'a' mode instead. – Ji-Mu Kang Oct 06 '21 at 13:48

1 Answer


Firstly, if your .csv has the date as its first column (also the index), you don't have to use .reset_index() when you read that file into a DataFrame.

For example, your CSV might look like this in Excel:

When you read that into a DataFrame, it becomes:


If you simply want to append a new DataFrame with the same columns (Date, A, B, C, ...), you can do so:

import pandas as pd

source_df = pd.read_csv('initial.csv')

# copy of the same dataframe
# it could be a different df, as per your requirement
new_df = source_df.copy()

# appending the 2nd dataframe to the 1st
# (DataFrame.append was removed in pandas 2.0; pd.concat is the equivalent)
final_df = pd.concat([source_df, new_df])

# writing to .csv
final_df.to_csv('final.csv', index=False)

We're setting index=False to keep the DataFrame's default integer index (0, 1, 2, 3...) from being written to the final .csv file.

Your final df is going to look like:


And when viewed in Excel:


Having said all this, if you want your DataFrame to keep the date as its index, but the written .csv should still have the columns date, A, B, C, ...

Do this:

import pandas as pd

source_df = pd.read_csv('test.csv')

# copy of the same dataframe
new_df = source_df.copy()

# appending the 2nd dataframe to the 1st
# (pd.concat replaces the removed DataFrame.append)
final_df = pd.concat([source_df, new_df])

final_df.set_index(['date'], inplace=True)
final_df.to_csv('final.csv')

And your df will look like:


And when viewed in Excel:


Shine J
  • Thanks for your contribution. The data parameter is the dataframe that is already in memory, with dates as index. So before writing rows, I had to reset the index for list conversion. I've chosen this method since the data I'm supposed to actually work with has 1e8 rows & I don't think reading in the whole thing and concatenating it will be a good idea. Do you have any sort of suggestions that effectively concatenates such large datasets? – Ji-Mu Kang Oct 06 '21 at 16:35
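For the follow-up question about very large data, one common pattern (an assumption on my part, not something from the answer above) is to avoid reading the existing file at all and append new rows directly with DataFrame.to_csv(mode='a', header=False), optionally in chunks to bound memory use:

```python
import pandas as pd

def append_in_chunks(savefile, data, chunk_size=1_000_000):
    """Append `data` (dates as index) to an existing CSV without re-reading it.

    Assumes the file already exists and its header matches `data`'s columns.
    """
    for start in range(0, len(data), chunk_size):
        # mode='a' appends; header=False avoids duplicating the header row
        data.iloc[start:start + chunk_size].to_csv(
            savefile, mode='a', header=False)
```

Since nothing is read back, this scales to files far larger than memory; only one chunk of the new data is serialized at a time.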