I have a pandas DataFrame representing a library. The columns hold metadata such as author, title and year, plus the text. The text column contains lists with the book text, where each list element is one sentence of the book (see below):
   Author  Title  Text
0  Smith   ABC    ["This is the first sentence", "This is the second sentence"]
1  Green   XYZ    ["Also a sentence", "And the second sentence"]
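To make the example reproducible, the frame can be built roughly like this. It is a minimal sketch that contains only the two sample rows shown above, and it leaves out the year column because the sample does not show it:

```python
import pandas as pd

# Each cell in "Text" is a Python list of sentences (strings).
df = pd.DataFrame(
    {
        "Author": ["Smith", "Green"],
        "Title": ["ABC", "XYZ"],
        "Text": [
            ["This is the first sentence", "This is the second sentence"],
            ["Also a sentence", "And the second sentence"],
        ],
    }
)

print(df)
```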
I want to carry out some NLP analysis on the sentences. For individual examples I would use a list comprehension, but how can I apply a list comprehension to the whole column in the most Pythonic way?
What I want to do is, for example, make a new column with a list of the sentences containing the word "the", similar to this question: "How to test if a string contains one of the substrings in a list, in pandas?"
However, that question uses a DataFrame with a string column, not a list column.
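To make the goal concrete, here is a sketch of the result I am after. It runs a plain list comprehension on each row through `Series.apply`; I am not sure this is the most idiomatic approach. The column name `Text_with_the` and the helper `sentences_with_word` are names I made up for illustration, and matching on whitespace-separated, lower-cased tokens is my own assumption about what "containing the word" should mean:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Author": ["Smith", "Green"],
        "Title": ["ABC", "XYZ"],
        "Text": [
            ["This is the first sentence", "This is the second sentence"],
            ["Also a sentence", "And the second sentence"],
        ],
    }
)


def sentences_with_word(sentences, word="the"):
    """Return the sentences that contain `word` as a whole token (hypothetical helper)."""
    # Lower-case and split on whitespace so "the" does not match inside "other".
    return [s for s in sentences if word in s.lower().split()]


# Apply the list comprehension to every list in the column.
df["Text_with_the"] = df["Text"].apply(sentences_with_word)

print(df[["Author", "Text_with_the"]])
```

For the two sample rows this keeps both of Smith's sentences and only "And the second sentence" for Green.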