
My question is similar to one asked here. I have a dataframe and I want to repeat each row k times, where k is the value in that row's n column. I also want to add a column that counts the repeats of each row from 0 to k-1. For example:

import pandas as pd

df = pd.DataFrame(data={
  'id': ['A', 'B', 'C'],
  'n' : [  1,   2,   3],
  'v' : [ 10,  13,   8]
})

what_i_want = pd.DataFrame(data={
  'id': ['A', 'B', 'B', 'C', 'C', 'C'],
  'n' : [ 1, 2, 2, 3, 3, 3],
  'v' : [ 10,  13, 13, 8, 8, 8],
  'repeat_id': [0, 0, 1, 0, 1, 2]
})

The command below does half of the job. I am looking for a pandas way of adding the repeat_id column.

df.loc[df.index.repeat(df.n)]
kampta

1 Answer

Use GroupBy.cumcount to number the repeats, and call copy to avoid SettingWithCopyWarning:

If you modify values in df1 later, the modifications do not propagate back to the original data (df), and pandas warns about it.

df1 = df.loc[df.index.repeat(df.n)].copy()
df1['repeat_id'] = df1.groupby(level=0).cumcount()
df1 = df1.reset_index(drop=True)
print(df1)
  id  n   v  repeat_id
0  A  1  10          0
1  B  2  13          0
2  B  2  13          1
3  C  3   8          0
4  C  3   8          1
5  C  3   8          2
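
The three steps above can be checked against the expected frame from the question. This is a minimal sketch that only reuses the df and what_i_want definitions from the question; the use of pd.testing.assert_frame_equal for the comparison is an addition here, not something the answer relied on.

```python
import pandas as pd

df = pd.DataFrame(data={
    'id': ['A', 'B', 'C'],
    'n': [1, 2, 3],
    'v': [10, 13, 8],
})

what_i_want = pd.DataFrame(data={
    'id': ['A', 'B', 'B', 'C', 'C', 'C'],
    'n': [1, 2, 2, 3, 3, 3],
    'v': [10, 13, 13, 8, 8, 8],
    'repeat_id': [0, 0, 1, 0, 1, 2],
})

# Repeat each index label n times, then select those labels.
# The repeated rows keep their original index label (0, 1, 1, 2, 2, 2).
df1 = df.loc[df.index.repeat(df['n'])].copy()

# Rows that came from the same original row share an index label,
# so counting within each label gives 0..n-1.
df1['repeat_id'] = df1.groupby(level=0).cumcount()

# Replace the duplicated labels with a fresh 0..len-1 index.
df1 = df1.reset_index(drop=True)

# Raises AssertionError if the frames differ in values, dtypes or index.
pd.testing.assert_frame_equal(df1, what_i_want)
```

Note that grouping with level=0 assumes the index of df is unique before the repeat. If it is not, calling df.reset_index(drop=True) first should restore that assumption.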
jezrael
  • Thanks for a quick reply! Apparently I can accept an answer only after 10 minutes :D – kampta May 11 '18 at 11:22
  • @kampta - Really nice question (input data, output data, what you tried); unfortunately that is not very common on SO these days... – jezrael May 11 '18 at 11:25
  • What happens if you don't `copy()`? I'm struggling to see what the issue with doing that is. – FHTMitchell May 11 '18 at 11:27
  • @FHTMitchell - I get `SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy` – jezrael May 11 '18 at 11:29
  • @jezrael Yeah I got that warning too. I continued anyway and nothing happened to `df`. Odd. – FHTMitchell May 11 '18 at 11:31
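
The observation in the last comment can be reproduced with a small sketch. It assumes only the df from the question; the names original and no_copy are made up for illustration. The warning is silenced here so the snippet runs quietly, since whether it appears at all depends on the pandas version.

```python
import warnings

import pandas as pd

df = pd.DataFrame(data={
    'id': ['A', 'B', 'C'],
    'n': [1, 2, 3],
    'v': [10, 13, 8],
})
original = df.copy()  # snapshot to compare against afterwards

with warnings.catch_warnings():
    # Older pandas versions may emit SettingWithCopyWarning here.
    warnings.simplefilter('ignore')
    # Same selection as in the answer, but without .copy().
    no_copy = df.loc[df.index.repeat(df['n'])]
    no_copy['repeat_id'] = no_copy.groupby(level=0).cumcount()

print(df.equals(original))   # True: df is untouched
print(list(df.columns))      # ['id', 'n', 'v']
```

Selecting with an array of repeated labels builds a new object, so the assignment lands on no_copy and never reaches df. The explicit copy() in the answer does not change the result; it makes the intent clear and keeps the warning from appearing.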