I want to add missing rows to a dataframe based on the column "id". The ids should be consecutive integers from 1 to 60000. A small example follows, with id ranging from 1 to 5: I need to add rows for ids 1, 3, and 4, with every value set to 0, to the table below.

id value1 value2
2 13 33
5 45 24

The final dataframe would become

id value1 value2
1 0 0
2 13 33
3 0 0
4 0 0
5 45 24

2 Answers

1

You can set the 'id' column as the index, then use the reindex method to conform df to a new index running from 1 to 5. reindex places NaN in every row that had no entry in the previous index, so use fillna to fill these with 0, then reset the index and finally cast df back to int dtype (the NaN values will have turned the columns into floats):

df = df.set_index('id').reindex(range(1,6)).fillna(0).reset_index().astype(int)

Output:

   id  value1  value2
0   1       0       0
1   2      13      33
2   3       0       0
3   4       0       0
4   5      45      24
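
As a variation on the one-liner above, reindex also accepts a fill_value argument, which fills the new rows directly so the columns never pass through NaN and the fillna/astype steps are not needed. The following is a minimal sketch under assumptions: the frame is rebuilt from the question's example, and `n` is a placeholder name for the upper bound (5 here, 60000 for the real data).

```python
import pandas as pd

# Example frame from the question.
df = pd.DataFrame({"id": [2, 5], "value1": [13, 45], "value2": [33, 24]})

n = 5  # assumed upper bound of the id range; 60000 in the real data

# Build the full target index explicitly and name it "id" so that
# reset_index() restores a column with that name.
target = pd.RangeIndex(start=1, stop=n + 1, name="id")

# fill_value=0 fills rows for ids that were missing, so no NaN appears
# and the integer columns keep their dtype.
full = df.set_index("id").reindex(target, fill_value=0).reset_index()

print(full)
```

Note that reindex raises an error if the existing 'id' column contains duplicates, so this approach assumes each id appears at most once.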
0

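You may want to look at the DataFrame.append method: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.append.html

It adds rows to a DataFrame. Note that it was deprecated in pandas 1.4 and removed in pandas 2.0, so on current versions pandas.concat does the same job.

You could use something like the following:

import pandas as pd

# Build the missing rows in one frame, then append them in a single concat
missing = pd.DataFrame({'id': [1, 3, 4], 'value1': 0, 'value2': 0})
df = pd.concat([df, missing], ignore_index=True)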

If you want them to be in order by id afterwards, you could sort it:

df.sort_values(by=['id'], inplace=True)
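
Listing the missing ids by hand does not scale to a range of 1 to 60000. A rough sketch of the same idea with the missing ids computed instead is shown here; it uses pandas.concat because DataFrame.append is not available in pandas 2.0 and later. The names `n`, `missing_ids`, `filler`, and `result` are placeholders chosen for illustration, and the frame is rebuilt from the question's example.

```python
import pandas as pd

# Example frame from the question.
df = pd.DataFrame({"id": [2, 5], "value1": [13, 45], "value2": [33, 24]})

n = 5  # assumed upper bound of the id range; 60000 in the real data

# ids in 1..n that do not yet appear in the frame
missing_ids = sorted(set(range(1, n + 1)) - set(df["id"].tolist()))

# One filler row per missing id; the scalar 0 is broadcast to every row.
filler = pd.DataFrame({"id": missing_ids, "value1": 0, "value2": 0})

# Append all filler rows at once, then sort by id and renumber the index.
result = (
    pd.concat([df, filler], ignore_index=True)
      .sort_values("id", ignore_index=True)
)

print(result)
```

Building all filler rows first and concatenating once avoids copying the frame on every loop iteration, which matters when tens of thousands of rows are missing.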
Vayun