I have a dataframe that looks like below:

 L_Type     L_ID    C_Type      E_Code
    0       1           1         9
    0       1           2         9
    0       1           3         9
    0       1           4         9
    0       2           1         2
    0       2           2         2
    0       2           3         2
    0       2           4         2
    0       3           1         3
    0       3           2         3
    0       3           3         3
    0       3           4         3

I need to insert a new row after every 4th row, incrementing the value in the third column (C_Type) by 1 while keeping the values in the first two columns the same. The new row should have no value in the last column (E_Code), as in the table below:

 L_Type     L_ID    C_Type          E_Code
    0       1           1           9
    0       1           2           9
    0       1           3           9
    0       1           4           9
    0       1           5           
    0       2           1           2
    0       2           2           2
    0       2           3           2
    0       2           4           2
    0       2           5           
    0       3           1           3
    0       3           2           3
    0       3           3           3
    0       3           4           3
    0       3           5           

I have searched other threads but could not figure out the exact solution:

How to insert n DataFrame to another every nth row in Pandas?

Insert new rows in pandas dataframe

Baig
1 Answer

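You can select every 4th row by slicing, add 1 to the C_Type column, set E_Code to NaN, and add 0.5 to the index so that each new row sorts directly after the row it was copied from. Giving the new rows unique fractional index values (rather than duplicating the existing ones) makes the ordering 100% correct, because the default sorting method in DataFrame.sort_index is quicksort, which is not stable. Finally, join both DataFrames with concat, sort the index, and create a default index with DataFrame.reset_index and drop=True: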

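import numpy as np
import pandas as pd

# C_Type must be numeric so that + 1 works (it raises TypeError on strings)
df['C_Type'] = df['C_Type'].astype(int)

# take every 4th row, increment C_Type, blank out E_Code, shift the index by 0.5
df2 = (df.iloc[3::4]
         .assign(C_Type = lambda x: x['C_Type'] + 1, E_Code = np.nan)
         .rename(lambda x: x + .5))
# sorting by index places each new row right after its source row
df1 = pd.concat([df, df2], sort=False).sort_index().reset_index(drop=True)
print(df1)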
    L_Type  L_ID  C_Type  E_Code
0        0     1       1     9.0
1        0     1       2     9.0
2        0     1       3     9.0
3        0     1       4     9.0
4        0     1       5     NaN
5        0     2       1     2.0
6        0     2       2     2.0
7        0     2       3     2.0
8        0     2       4     2.0
9        0     2       5     NaN
10       0     3       1     3.0
11       0     3       2     3.0
12       0     3       3     3.0
13       0     3       4     3.0
14       0     3       5     NaN
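
For anyone who wants to try this without the original data, here is a minimal self-contained sketch of the same approach. The sample frame is rebuilt from the table in the question, and `append_row_per_block` is a hypothetical helper name, not something from pandas. The final cast to the nullable `Int64` dtype is an optional extra that is not part of the answer above; it is one way to keep E_Code as whole numbers (9 rather than 9.0) while leaving the inserted rows empty (`<NA>`). The sketch assumes every block has exactly `n` rows.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (all columns as integers).
df = pd.DataFrame({
    "L_Type": [0] * 12,
    "L_ID": [1] * 4 + [2] * 4 + [3] * 4,
    "C_Type": [1, 2, 3, 4] * 3,
    "E_Code": [9] * 4 + [2] * 4 + [3] * 4,
})


def append_row_per_block(frame, n=4):
    """Hypothetical helper: after every n-th row, insert a copy of that row
    with C_Type + 1 and an empty E_Code."""
    # The index trick relies on a default 0..len-1 index, so reset it first.
    frame = frame.reset_index(drop=True)
    extra = (frame.iloc[n - 1::n]
                  .assign(C_Type=lambda x: x["C_Type"] + 1, E_Code=np.nan)
                  .rename(lambda i: i + 0.5))  # 3 -> 3.5, 7 -> 7.5, ...
    out = pd.concat([frame, extra]).sort_index().reset_index(drop=True)
    # Optional: nullable integer dtype shows missing values as <NA>
    # instead of forcing the whole column to float.
    out["E_Code"] = out["E_Code"].astype("Int64")
    return out


result = append_row_per_block(df)
print(result)
```

If the blocks can have different lengths, slicing with a fixed step no longer lines up with the last row of each L_ID, so this sketch would need a different way to pick the rows to copy.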
jezrael
  • Hi jezreal, thanks for the quick reply, when I run the code I get an error message: TypeError: can only concatenate str (not "int") to str – Baig Sep 17 '19 at 06:27
  • @Baig - I think there are strings in `df['C_Type']` column, so use `df['C_Type'] = df['C_Type'].astype(int)` before my solution – jezrael Sep 17 '19 at 06:28
  • Thanks for the update. I have changed the column from string to int and it's fine now. Is there any way I can get rid of the values in the last column for the 5th row which we just inserted? – Baig Sep 17 '19 at 06:32