Insert new rows into pandas dataframe depending on range of 2 columns

Question

I have a df like:

ID       value1   start     end
 1         100     1        2       
 1         200     2        4
 2         50      1        1
 2         30      3        5

I want to fill in new rows for each value between start and end

The final df should look like this where period is each unique value between start and end

    ID       value1      period     
     1         100     1        
     1         100     2            
     1         200     2 
     1         200     3 
     1         200     4       
     2         50      1        
     2         30      3    
     2         30      4 
     2         30      5

score 1 · Accepted Answer · answered Feb 28 '19 at 03:42

After using for loop create the list of column with range , this become an unnesting problem

df['New']=[list(range(x,y+1)) for x , y in zip(df.start,df.end)]
Yourdf=unnesting(df,['New'])
Yourdf
   New  ID  value1  start  end
0    1   1     100      1    2
0    2   1     100      1    2
1    2   1     200      2    4
1    3   1     200      2    4
1    4   1     200      2    4
2    1   2      50      1    1
3    3   2      30      3    5
3    4   2      30      3    5
3    5   2      30      3    5

score 1 · Answer 2 · answered Feb 28 '19 at 03:54

Or use the below, list comprehension:

>>> df['period']=[','.join(map(str,range(x,y+1))) for x,y in zip(df['start'],df['end'])]
>>> df.set_index(['ID','value1','start','end']).stack().str.split(',', expand=True).stack().unstack(-2).reset_index(-1,drop=True).reset_index()
   ID  value1  start  end period
0   1     100      1    2      1
1   1     100      1    2      2
2   1     200      2    4      2
3   1     200      2    4      3
4   1     200      2    4      4
5   2      30      3    5      3
6   2      30      3    5      4
7   2      30      3    5      5
8   2      50      1    1      1
>>>

If you want to drop columns:

>>> df['period']=[','.join(map(str,range(x,y+1))) for x,y in zip(df.pop('start'),df.pop('end'))]
>>> df.set_index(['ID','value1']).stack().str.split(',', expand=True).stack().unstack(-2).reset_index(-1,drop=True).reset_index()
   ID  value1 period
0   1     100      1
1   1     100      2
2   1     200      2
3   1     200      3
4   1     200      4
5   2      30      3
6   2      30      4
7   2      30      5
8   2      50      1
>>>

Insert new rows into pandas dataframe depending on range of 2 columns

2 Answers2