I have a dataset containing article abstracts and the gender of each author. I want to count how often each word occurs, broken down by gender, so that I can plot the number of word repetitions against gender.
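
    import pandas as pd

    data_path = '/content/digitalhumanities - forum-and-fiction.csv'

    def change_table(data_path):
        # Load the CSV and keep only the Gender and Abstract columns
        df = pd.read_csv(data_path)
        final = df.drop(["Title", "Author", "Season", "Year", "Keywords", "Issue No", "Volume"], axis=1)
        # Use Gender as the index so it becomes the column header after transposing
        fin = final.set_index('Gender')
        return fin

    change_table(data_path).T
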
This is the output I got:
| Gender | None | Female | Male | None | None | Male ,Female | None | Male ,Female |
|:---------|---|---|---|---|---|---|---|---|
| Abstract | This article describes Virginia Woolf's preocc... | The Amazonian region occupies a singular place... | This article examines Kipling's 1901 novel Kim... | Pamela; or, Virtue Rewarded uses a literary fo... | This article examines Nuruddin Farah's 1979 no... | Ecological catastrophe has challenged the cont... | British political fiction was a satirical genr... | The Lydgates have bought too much furniture an... |
How can I count the occurrences of each word in the abstracts per gender and put the result in a data frame?

Expected output (example):
|gender|male|female|none|
|------|----|------|----|
| This | 3| 0| 0|
| occupies | 5| 3| 0|
| examines | 6| 0| 0|
| British | 0| 0| 7|
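
One possible way to build a table of this shape is sketched below. It is only a sketch under a few assumptions that the post does not confirm: the columns are named `Gender` and `Abstract` as in the table above, a missing gender should be labelled `none` (recent pandas versions may read the string `None` from a CSV as a missing value), a row labelled `Male ,Female` counts toward both `male` and `female`, and words are split with a very simple regular expression that keeps the original capitalisation. The function name `word_counts_by_gender` and the sample rows are made up for illustration.

```python
import re
from collections import Counter

import pandas as pd


def word_counts_by_gender(df, text_col="Abstract", gender_col="Gender"):
    """Return a table with one row per word and one column per gender."""
    counts = {}
    genders = df[gender_col].fillna("None")   # assumption: missing gender -> "none"
    texts = df[text_col].fillna("")
    for gender, text in zip(genders, texts):
        # Simple tokenizer (assumption): runs of ASCII letters, optional apostrophe part
        words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", str(text))
        # Assumption: "Male ,Female" contributes to both the male and female columns
        for label in str(gender).split(","):
            label = label.strip().lower()
            counts.setdefault(label, Counter()).update(words)
    # Dict of dicts -> DataFrame; words never seen for a gender get a count of 0
    table = pd.DataFrame({g: dict(c) for g, c in counts.items()})
    return table.fillna(0).astype(int).rename_axis("word")


# Tiny made-up sample with the same two columns as the real data
example = pd.DataFrame({
    "Gender": ["Male", "Female", None, "Male ,Female"],
    "Abstract": [
        "This article examines Kim",
        "The region occupies a singular place",
        "British political fiction",
        "This article examines furniture",
    ],
})
print(word_counts_by_gender(example))
```

With the real file, the same function could be applied to `pd.read_csv(data_path)` directly, without dropping columns or transposing first. For a plot, something like `table.nlargest(20, "male").plot.bar()` would draw the 20 most frequent words for one gender, assuming matplotlib is installed.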