I want to add a new row to a pandas DataFrame each time I run my program. I don't know the data in advance: my functions store the values in variables, and I want to add those variables as a row. So far I have only managed to add one row, and each time I run the program that row is replaced by the new one. I want the new row to be added below the existing one instead of replacing it.

net_index = mylist.index('NET PAYE EN EUROS ')
net = mylist[net_index + 2]

total_index = mylist.index('CONGES ')
total = mylist[total_index - 1]


df = pd.DataFrame(columns=['Mois','Nom','Adresse','Net_payé','Total_versé'])
new = {'Mois': mois, 'Nom': nom, 'Adresse': adresse,'Net_payé':net, 'Total_versé':total}
df= df.append(new, ignore_index=True)

This is part of my code. First I create an empty DataFrame with the column names, then a dict of the variables, whose values change on each run.

This is the result I get, but each time I run the program, the row is replaced by the new one instead of being added.

I suppose I need a loop, but I have never got one to work. I have searched everywhere for a solution without finding one.

Do you know what I can do? Thank you so much.

Fleur

3 Answers

Apparently, you are not saving the dataframe anywhere. Once your program exits, all data and variables are erased (lost). You cannot retrieve data from a previous run. The solution is to save the dataframe into a file before exiting your program. Then for each run, load the previous data from file.
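
The load-then-save cycle described above could look roughly like the sketch below. The file name `test.csv` and the column names come from the thread; the helper name `add_row_and_save` is hypothetical, and the sketch assumes the CSV, if it exists, was written with the same columns.

```python
import os
import pandas as pd

CSV_PATH = 'test.csv'  # file name used elsewhere in this thread
COLUMNS = ['Mois', 'Nom', 'Adresse', 'Net_payé', 'Total_versé']

def add_row_and_save(new_row, path=CSV_PATH):
    """Add one row (a dict) to the rows already stored in `path`, then save.

    Hypothetical helper: it reloads the previous runs' rows if the file
    exists, so the new row is added instead of replacing the old one.
    """
    new_df = pd.DataFrame([new_row], columns=COLUMNS)
    if os.path.exists(path):
        # Rows written by earlier runs of the program
        old_df = pd.read_csv(path, encoding='utf-8')
        new_df = pd.concat([old_df, new_df], ignore_index=True)
    # Write the full table back, old rows plus the new one
    new_df.to_csv(path, header=True, index=False, encoding='utf-8')
    return new_df
```

In the program from the question, this would be called once per run, for example `add_row_and_save({'Mois': mois, 'Nom': nom, 'Adresse': adresse, 'Net_payé': net, 'Total_versé': total})`, in place of creating an empty DataFrame each time.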

Abdur Rakib

Actually, yes, I do save the DataFrame to a CSV file, because my goal is to write the variables' values to a CSV. But the result is the same as I showed before: the first row is always replaced instead of a new one being added.

df = pd.DataFrame(columns=['Mois','Nom', 'Adresse','Net_payé','Total_versé'])
new = {'Mois': mois, 'Nom': nom, 'Adresse': adresse,'Net_payé':net, 'Total_versé':total}
df =df.append(new, ignore_index=True)
    
df.to_csv('test.csv', header=True, index=False, encoding='utf-8')

Thanks for your reply!

Fleur

There are multiple ways to add rows to an existing DataFrame. One way is to use pd.concat, of which the df.append function on the last code line in your question is a specific use case.
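
As a minimal sketch of the pd.concat route: `DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0, so on current versions pd.concat is the way to add a row. The helper name `append_row` and the sample values are placeholders, not part of the original code.

```python
import pandas as pd

COLUMNS = ['Mois', 'Nom', 'Adresse', 'Net_payé', 'Total_versé']

def append_row(df, row):
    """Return a new DataFrame with `row` (a dict) added as the last row."""
    row_df = pd.DataFrame([row], columns=df.columns)
    return pd.concat([df, row_df], ignore_index=True)

# Placeholder values standing in for mois, nom, adresse, net and total
df = pd.DataFrame(
    [{'Mois': 'janvier', 'Nom': 'Dupont', 'Adresse': '1 rue A',
      'Net_payé': 1500.0, 'Total_versé': 1800.0}],
    columns=COLUMNS,
)
df = append_row(df, {'Mois': 'février', 'Nom': 'Martin', 'Adresse': '2 rue B',
                     'Net_payé': 1600.0, 'Total_versé': 1900.0})
```

Note that this only grows the DataFrame within a single run; keeping rows between runs still requires loading them from a file first.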

However, the method I prefer is to collect my data in lists, and then create a new DataFrame from scratch. First make sure all the variables you want to place in the columns are lists of the same length. Something like this (lists with a length of 2 in this example):

mois_data = [1,2]
nom_data = [3,4]
adresse_data = [5,6]
net_paye_data = [7,8]
total_verse_data = [9,10]

Place this data in a dictionary. Make sure to use the column names of your DataFrame as the keywords (note: the acute accents in some of your column names might cause problems here, so to be safe I'm omitting them. You can add the accents back later with the rename function).

data_dict = dict(Mois=mois_data, Nom=nom_data, Adresse=adresse_data, Net_paye=net_paye_data, Total_verse=total_verse_data)

Then create the dataframe, using the dictionary as data input:

df = pd.DataFrame(data=data_dict, columns=['Mois','Nom','Adresse','Net_paye','Total_verse'])

Which results in:

   Mois  Nom  Adresse  Net_paye  Total_verse
0     1    3        5         7            9
1     2    4        6         8           10
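
When some variables are single values and others are lists, the single values can be repeated to the common length first, and the accented column names restored afterwards with `rename`. The following is a sketch under those assumptions; all sample values are hypothetical.

```python
import pandas as pd

# Hypothetical inputs: some variables are single values, others are lists
mois = 'janvier'
nom = ['Dupont', 'Martin']
adresse = ['1 rue A', '2 rue B']
net = 1500.0
total = 1800.0

# Repeat each single value so every column has the same length
total_list_length = len(nom)
data_dict = dict(
    Mois=[mois] * total_list_length,
    Nom=nom,
    Adresse=adresse,
    Net_paye=[net] * total_list_length,
    Total_verse=[total] * total_list_length,
)

df = pd.DataFrame(data=data_dict)

# Put the accents back on the column names
df = df.rename(columns={'Net_paye': 'Net_payé', 'Total_verse': 'Total_versé'})
```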
MK95
  • I tried your code, placing the variables in a list like this: `data = [[mois,nom,adresse,net,total], []]`. The DataFrame is created, but it's the same problem as before: it only fills in the first row and doesn't add a row each time I run the program. I think I have to use a loop, but I don't really know how. – Fleur Jan 19 '22 at 13:50
  • What kind of variables are _mois_, _nom_ etc.? Simple values (e.g. mois=5) or lists/arrays (e.g. mois=[5,6,7])? – MK95 Jan 19 '22 at 13:55
  • The variables mois, net and total are simple values, but the others are list items or strings! In my first post there is a link showing the DataFrame with the variables' values in it. – Fleur Jan 19 '22 at 14:36
  • Okay, I understand your problem a bit better now. A good first step is to make sure all variables are lists of the same length, as this makes it easier to place them in a DataFrame. For variables that are simply a value or a string (like mois), you can create such a list easily using `[mois] * total_list_length`. I have changed my answer based on your response, so once all variables are lists of the same length, you can retry it. Hope this helps! – MK95 Jan 19 '22 at 15:00
  • Thank you so much for your time, I will try it! Have a nice day. – Fleur Jan 19 '22 at 15:19
  • No problem! If it works, please accept the answer as the solution to this topic, on the left of my post :) – MK95 Jan 19 '22 at 15:24