I have a pandas DataFrame with one text column. I want to find out which phrases are the most common in this column.
For example, in the text below you can see that phrases like "a very good movie", "last night", etc. appear many times.
I think this can be done with n-grams, for example by defining a phrase as a sequence of 3 to 5 consecutive words, but I do not know how to do that.
import pandas as pd
text = ['this is a very good movie that we watched last night',
'i have watched a very good movie last night',
'i love this song, its amazing',
'what should we do if he asks for it',
'movie last night was amazing',
'a very nice song was played',
'i would like to se a good show',
'a good show was on tv last night']
df = pd.DataFrame({"text":text})
print(df)
So my goal is to rank the phrases (3 to 5 words long) by how many times they appear in the column.
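To make the goal concrete, here is a minimal sketch of one possible approach, not a definitive solution: slide a window of 3, 4, and 5 words over each row and tally the resulting phrases with `collections.Counter`. The helper name `count_phrases` and its parameters `n_min` and `n_max` are made up for illustration, and the tokenization is a plain whitespace split, so punctuation stays attached to words (for example "song," in the data above).

```python
from collections import Counter

import pandas as pd


def count_phrases(texts, n_min=3, n_max=5):
    """Count every run of n_min..n_max consecutive words across all texts.

    Hypothetical helper for illustration; not part of pandas.
    """
    counts = Counter()
    for line in texts:
        # Naive tokenization: lowercase and split on whitespace only.
        words = line.lower().split()
        for n in range(n_min, n_max + 1):
            # Slide a window of n words over the row.
            for i in range(len(words) - n + 1):
                counts[" ".join(words[i:i + n])] += 1
    return counts


# Two rows from the question, repeated here so the sketch runs on its own.
df = pd.DataFrame({"text": [
    "this is a very good movie that we watched last night",
    "i have watched a very good movie last night",
]})

counts = count_phrases(df["text"])

# most_common() returns (phrase, count) pairs sorted by descending count.
ranking = pd.DataFrame(counts.most_common(), columns=["phrase", "count"])
print(ranking.head(10))
```

Note that shorter phrases contained in a longer repeated phrase are counted as well, so "a very good" and "very good movie" both show up alongside "a very good movie". If scikit-learn is available, `CountVectorizer(ngram_range=(3, 5))` should give comparable counts with its own tokenization rules.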