I'm trying to find out how many duplicated sentences my dataframe has, where a duplicate is any exact-match sentence that occurs more than once. I'm using DataFrame.duplicated, but it ignores the first occurrence of each sentence. Instead of printing every duplicated row, I want to print each duplicated sentence once, along with the number of times it occurs.
The code I'm trying is:
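import pandas as pd

# fileinput is the path to the CSV file (defined elsewhere).
# Read only the first line; if its first field has no spaces, treat it as a header and skip it.
wdata = pd.read_csv(fileinput, nrows=0).columns[0]
skip = int(wdata.count(' ') == 0)
wdata = pd.read_csv(fileinput, names=['sentences'], skiprows=skip)
# duplicated() flags repeats but not the first occurrence (keep='first' by default).
data = wdata[wdata.duplicated()]
print(data)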
#dataframe example
#hi how are you
#hello sam how are you doing
#hello sam how are you doing
#hello Alex how are you doing
#hello sam how are you doing
#let us go eat
#where is the dog
#let us go eat
I want my output to be something like:
#hello sam how are you doing 3
#let us go eat 2
With the duplicated function I get this output:
#hello sam how are you doing
#hello sam how are you doing
#let us go eat
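For reference, the output above follows from the default `keep='first'` of `DataFrame.duplicated`, which flags every repeat except the first occurrence. Below is a minimal sketch of one way to get each repeated sentence once with its count, using `Series.value_counts`. The inline list stands in for the CSV file, and the names `flagged`, `counts` and `repeated` are only for illustration.

```python
import pandas as pd

# Sample data from the question, inlined in place of the CSV file.
wdata = pd.DataFrame({'sentences': [
    'hi how are you',
    'hello sam how are you doing',
    'hello sam how are you doing',
    'hello Alex how are you doing',
    'hello sam how are you doing',
    'let us go eat',
    'where is the dog',
    'let us go eat',
]})

# duplicated() defaults to keep='first', so first occurrences are not flagged.
flagged = wdata[wdata.duplicated()]

# Count every sentence, then keep only those seen more than once.
counts = wdata['sentences'].value_counts()
repeated = counts[counts > 1]
print(repeated)
```

This only works when the repeated strings are exactly identical, including whitespace.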
This is the output I'm getting with the second answer:
wdata = pd.read_csv(fileinput, nrows=0).columns[0]
skip = int(wdata.count(' ') == 0)
wdata = pd.read_csv(fileinput, names=['sentences'], skiprows=skip)
data = wdata.groupby(['sentences']).size().reset_index(name='counts')
#   sentences                     counts
#0  hello Alex how are you doing  1
#1  hello sam how are you doing   3
#2  hi how are you                1
#3  let us go eat                 1
#4  let us go eat                 1
#5  where is the dog              1
I want my output to be something like:
#hello sam how are you doing 3
#let us go eat 2
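The two separate `let us go eat` rows with a count of 1 each suggest those strings are not exactly identical, possibly because of trailing spaces or other invisible whitespace. That is an assumption and has not been checked against the actual file. Here is a rough sketch that normalizes whitespace before grouping and then keeps only counts above 1. The inline data, including the trailing space on the last row, is hypothetical.

```python
import pandas as pd

# Hypothetical data: the last 'let us go eat' has a trailing space,
# an assumed (not confirmed) cause of the split groups.
wdata = pd.DataFrame({'sentences': [
    'hi how are you',
    'hello sam how are you doing',
    'hello sam how are you doing',
    'hello Alex how are you doing',
    'hello sam how are you doing',
    'let us go eat',
    'where is the dog',
    'let us go eat ',
]})

# Strip leading/trailing whitespace and collapse internal runs to one space.
wdata['sentences'] = (
    wdata['sentences'].str.strip().str.replace(r'\s+', ' ', regex=True)
)

# Group on the cleaned text, count, and keep only repeated sentences.
data = wdata.groupby(['sentences']).size().reset_index(name='counts')
data = data[data['counts'] > 1].reset_index(drop=True)
print(data)
```

If the groups still split after this, printing `repr()` of the affected rows may show what differs between them.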