create n copies of rows in pandas

Question

What is the fastest way to expand (copy n times) the rows of a dataframe based on the value of a column? For example, if the value of the column in a row is 10, that row has to be copied 10 times.

Example:

import pandas as pd
df = pd.DataFrame({"A":[1,45], "B":[2,3]})

operation

The result should look like this:

score 4 · Accepted Answer · answered Mar 02 '20 at 19:48

You can make do with repeat and loc:

df.loc[df.index.repeat(df['B'])]

Output:
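
As a minimal sketch using the example frame from the question (the name `expanded` is introduced here only for illustration), the call and its result look like this:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 45], "B": [2, 3]})

# Repeat each index label as many times as the value in column B,
# then select those (now duplicated) labels with loc.
expanded = df.loc[df.index.repeat(df["B"])]
print(expanded)
#     A  B
# 0   1  2
# 0   1  2
# 1  45  3
# 1  45  3
# 1  45  3
```

The original index labels are kept, so they are duplicated in the result.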

Quang Hoang

This gives an error: "TypeError: Cannot cast array data from dtype('int64') to dtype('int32') according to the rule 'safe'" – pnkjmndhl Mar 02 '20 at 20:04
It worked after changing the code to df.loc[df.index.repeat(df['B']).astype('int')] – pnkjmndhl Mar 02 '20 at 20:10

score 1 · Answer 2 · answered Mar 02 '20 at 19:56

Try using this:

import numpy as np

df.loc[np.repeat(df.index.values, df['B'])]

This repeats each row's index label the number of times specified in column B, and loc then selects the rows for those labels.
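
To make the intermediate step visible, here is a small sketch on the question's example frame. The names `labels` and `result` are assumptions added for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 45], "B": [2, 3]})

# Each index label is repeated by the matching count in column B.
labels = np.repeat(df.index.values, df["B"])
print(labels)  # [0 0 1 1 1]

# loc accepts the repeated labels and returns one row per entry.
result = df.loc[labels]
```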

You can also look here: Python Pandas replicate rows in dataframe. It is not exactly the same need, but it has a lot of solutions to learn from for replicating rows.

Miki Segall

This gives an error: "TypeError: Cannot cast array data from dtype('int64') to dtype('int32') according to the rule 'safe'" – pnkjmndhl Mar 02 '20 at 20:05

score 1 · Answer 3 · ansev · 2020-03-02T20:11:10.450

We could also use DataFrame.reindex with Index.repeat:

df.reindex(df.index.repeat(df['B']))
    A  B
0   1  2
0   1  2
1  45  3
1  45  3
1  45  3

If you need to (for example, if you hit the TypeError reported in the comments):

df.reindex(df.index.repeat(df['B']).astype(int))
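
The thread does not show what causes the TypeError, so the following is only a sketch of one thing that could be tried: casting the repeat counts themselves to NumPy's native index type (`np.intp`) before calling `repeat`. The names `counts` and `expanded` are assumptions for illustration, and whether this avoids the error on the affected setup is untested here.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 45], "B": [2, 3]})

# np.intp is the integer type NumPy uses for indexing on this platform
# (assumption: the reported error comes from a mismatch with this type).
counts = df["B"].to_numpy().astype(np.intp)

# Same reindex/repeat approach, but with explicitly cast counts.
expanded = df.reindex(df.index.repeat(counts))
```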

edited Mar 02 '20 at 20:11

answered Mar 02 '20 at 19:57

ansev

This gives an error: "TypeError: Cannot cast array data from dtype('int64') to dtype('int32') according to the rule 'safe'" – pnkjmndhl Mar 02 '20 at 20:05
try: `df.index.repeat(df['B']).astype(int)` – ansev Mar 02 '20 at 20:09
