I have a script that runs on a daily basis to collect data. I record this data in a CSV file using the following code:

old_df = pd.read_csv('/Users/tdonov/Desktop/Python/Realestate Scraper/master_data_for_realestate.csv')
old_df = old_df.append(dataframe_for_cvs, ignore_index=True)
old_df.to_csv('/Users/tdonov/Desktop/Python/Realestate Scraper/master_data_for_realestate.csv')

I am using append(ignore_index=True), but after every run of the code I still get an additional column created at the start of my CSV. I delete these columns manually, but is there a way to prevent them in the code itself? I looked at the function's documentation but I am still not sure whether it is possible. My result file gets the following columns added (one per run): [screenshot]

It is really annoying to have to delete them every time.

Update: The data looks like this: [screenshot] However, the id is not unique in my case; it can repeat every day. It is the id of an online offer, and an offer can be available for one day, a couple of days, or 5 months.

smci
tsetsko
  • Can you show a sample of your actual CSV file (as text)? –  Nov 27 '21 at 18:07
  • You _have_ to have an index. If you don't want it to be the numbers, you can set a different column to be the index. Perhaps `id`? `old_df = old_df.set_index('id')` –  Nov 27 '21 at 18:08
  • I updated the question. My ID is not really a unique identifier in this case. It is the ID of an offer. The offer can be available today and for 5 more days, or months, etc. I use this ID to count how many days an offer has been live. – tsetsko Nov 27 '21 at 18:14

1 Answer

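Did you try passing index=False when writing the file?

old_df.to_csv('/Users/tdonov/Desktop/Python/Realestate Scraper/master_data_for_realestate.csv', index=False)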
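
To illustrate why this helps: by default `to_csv` writes the row index as an extra first column with an empty header, and `read_csv` then loads that column back as ordinary data, so every write/read cycle adds one more column. The sketch below reproduces this with an in-memory buffer. The sample columns (`id`, `price`) and the helper name `roundtrip` are made up for the demonstration, since the real file's layout is not shown in the question.

```python
import io

import pandas as pd


def roundtrip(df, write_index):
    """Write df to CSV text and read it back, like one daily run of the script."""
    buffer = io.StringIO()
    df.to_csv(buffer, index=write_index)
    buffer.seek(0)
    return pd.read_csv(buffer)


# Hypothetical sample data; the real columns are an assumption.
df = pd.DataFrame({"id": [101, 102], "price": [1000, 2000]})

# Default behaviour (index written): two round trips add two extra columns.
with_index = roundtrip(roundtrip(df, write_index=True), write_index=True)
print(list(with_index.columns))

# With index=False the column list is unchanged after two round trips.
without_index = roundtrip(roundtrip(df, write_index=False), write_index=False)
print(list(without_index.columns))
```

The extra columns come from the write step, not from `append(ignore_index=True)`, which only renumbers the rows in memory.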
Antony Hatchkins
  • Does the index column come from `append` or from `to_csv`? I don't know why I was not looking at `to_csv`. Yes, this solves the problem. – tsetsko Nov 27 '21 at 18:34
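
For readers on newer pandas versions, here is a minimal sketch of the whole daily update with the fix applied. It assumes pandas 2.0 or later, where `DataFrame.append` has been removed, so `pd.concat` is swapped in as the equivalent. The function name `append_daily_data`, the temporary file, and the sample columns are hypothetical stand-ins for the real script and master CSV.

```python
import os
import tempfile

import pandas as pd


def append_daily_data(master_path, new_rows):
    """Append new_rows to the CSV at master_path without writing the index."""
    old_df = pd.read_csv(master_path)
    # pd.concat stands in for DataFrame.append, which was removed in pandas 2.0.
    combined = pd.concat([old_df, new_rows], ignore_index=True)
    # index=False keeps the row index out of the file.
    combined.to_csv(master_path, index=False)
    return combined


# A temporary file and made-up rows stand in for the real master CSV.
tmp_dir = tempfile.mkdtemp()
master_path = os.path.join(tmp_dir, "master.csv")
pd.DataFrame({"id": [101], "price": [1000]}).to_csv(master_path, index=False)

# Simulate three daily runs; the same offer id may repeat each day.
for day in range(3):
    todays_rows = pd.DataFrame({"id": [101], "price": [1000 + day]})
    result = append_daily_data(master_path, todays_rows)

print(list(result.columns))
print(len(result))
```

A file that already contains leftover index columns from earlier runs still needs those columns removed once; this sketch only prevents new ones from being added.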