My question is little bit different than the question posted here
So I thought to open a new thread.I have a pandas data frame with 5 attributes.One of these attribute is created using pandas series.Here is the sample code for creating the data frame
import numpy as np
mydf1=pd.DataFrame(columns=['group','id','name','mail','gender'])
data = np.array([2540948, 2540955, 2540956,2540956,7138932])
x=pd.Series(data)
mydf1.loc[0]=[1,x,'abc','abc@xyz.com','male']
I have another data frame,the code for creating the data frame is given below
mydf2=pd.DataFrame(columns=['group','id'])
data1 = np.array([2540948, 2540955, 2540956])
y=pd.Series(data1)
mydf2.loc[0]=[1,y]
These are sample data. Actual data will have large number of rows & also the series length is large too .I want to match mydf1 with mydf2 & if it matches,sometime I wont have matching element in mydf2,then I will delete values of id from mydf1 which are there in mydf2 for example after the run,my id will be for group 1 2540956,7138932
. I also tried the code mentioned in above link. But for the first line
counts = mydf1.groupby('id').cumcount()
I got error message as TypeError: 'Series' objects are mutable, thus they cannot be hashed in my Python 3.X. Can you please suggest me how to solve this?