I want to duplicate each row according to the value in column 'n', and replace the value in column 'v' with the average (v divided by n).
I am following the example in "Replicating rows in a pandas data frame by a column value".
import pandas as pd
import numpy as np
df = pd.DataFrame(data={
    'id': ['A', 'B', 'C'],
    'n': [1, 2, 3],
    'v': [10, 13, 8]
})
df2 = df.loc[np.repeat(df.index.values, df.n)]
# pd.__version__ 0.20.3
# np.__version__ 1.15.0
But it raises an error:
Traceback (most recent call last):
  File "C:\Python27\Working Scripts\pv.py", line 14, in <module>
    df2 = df.loc[np.repeat(df.index.values, df.n)]
  File "C:\Python27\lib\site-packages\numpy\core\fromnumeric.py", line 445, in repeat
    return _wrapfunc(a, 'repeat', repeats, axis=axis)
  File "C:\Python27\lib\site-packages\numpy\core\fromnumeric.py", line 61, in _wrapfunc
    return _wrapit(obj, method, *args, **kwds)
  File "C:\Python27\lib\site-packages\numpy\core\fromnumeric.py", line 41, in _wrapit
    result = getattr(asarray(obj), method)(*args, **kwds)
TypeError: Cannot cast array data from dtype('int64') to dtype('int32') according to the rule 'safe'
What is going wrong here, and how can I correct it? Thank you. (Other pandas and numpy scripts work fine on this computer.)
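For reference, here is a minimal sketch of the result I am after, together with one workaround that may be worth trying. It rests on an assumption that I have not confirmed: that the TypeError appears because the repeat counts in column 'n' are int64 while this Python build uses a 32-bit native index type, so np.repeat refuses the cast. Under that assumption, converting the counts to np.intp before repeating should avoid the error. The names repeats and df2 are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(data={
    'id': ['A', 'B', 'C'],
    'n': [1, 2, 3],
    'v': [10, 13, 8],
})

# Assumption: the error is caused by int64 repeat counts on a build whose
# native index type (np.intp) is 32-bit, so cast the counts explicitly.
repeats = df['n'].values.astype(np.intp)

# Repeat each index label n times, select those rows, and renumber the index.
df2 = df.loc[np.repeat(df.index.values, repeats)].reset_index(drop=True)

# Replace v with v divided by n, so each duplicated row carries the average.
df2['v'] = df2['v'] / df2['n']

print(df2)
```

With the sample data this should give one row for A, two for B, and three for C, with 'v' holding 10, 6.5, and 8/3 respectively.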