I have a long string named data that contains a YouTube video transcript. I also have a CSV of videos with their transcripts stored in a column named "subtitle". TL;DR: I want to find the row (video) whose subtitle contains this string.
My code:
if dfSubtitles['subtitle'].str.contains(data, regex=False).any():
    print('exist')
else:
    print('not exist')
Currently, my code only checks whether the string exists anywhere in the data frame, and it correctly reports that it does. However, every attempt at retrieving the row has failed, because the check above only returns a boolean value. How can I retrieve the row where my string exists?
No, this solution does not work:
print(dfSubtitles[dfSubtitles['subtitle'].str.contains(data)])
This outputs:
Empty DataFrame
Columns: [Unnamed: 0, categoryName, categoryId, channel, videoId, subtitle]
Index: []
exist
Why does my code find my string but output an empty DataFrame?
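One difference I can see between the two snippets: the first passes regex=False and the second does not, so in the second call data is treated as a regular expression. Below is a minimal sketch of that difference. The two-row data frame and the value of data are made up for illustration (they are not from my CSV), and whether this is what happens with my real transcript is an assumption.

```python
import pandas as pd

# Hypothetical stand-in for the real CSV; only the "subtitle" column name
# matches the original data frame.
dfSubtitles = pd.DataFrame({
    "videoId": ["a1", "b2"],
    "subtitle": [
        "[Music] hello everyone and welcome to the channel",
        "today we are reviewing a laptop",
    ],
})

# Hypothetical transcript fragment containing regex metacharacters ("[" and "]").
data = "[Music] hello everyone"

# Literal substring match: the brackets are compared character by character.
literal_mask = dfSubtitles["subtitle"].str.contains(data, regex=False)

# Default is regex=True: "[Music]" is read as a character class, so the
# pattern no longer matches the literal text.
regex_mask = dfSubtitles["subtitle"].str.contains(data)

# Indexing with the boolean mask returns the matching rows, not a single bool.
literal_rows = dfSubtitles[literal_mask]
regex_rows = dfSubtitles[regex_mask]

print(literal_rows)
print(regex_rows)
```

In this sketch, the mask built with regex=False selects the first row, while the mask built with the default settings selects nothing, which looks like the empty data frame above.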