I want to be able to select a string between two specific substrings (as the following), but with a loop that will iterate through every row in the dataframe.
CODE:
import pandas as pd
df = pd.DataFrame(['first: hello1 \nSecond this1 is1 a1 third: test1\n', 'first: hello2 \nSecond this2 is2 a2 third: test2\n', 'first: hello3 \nSecond this3 is3 a3 third: test3\n'])
df = df.rename(columns={0: "text"})
def find_between(df, start, end):
return (df.split(start))[1].split(end)[0]
df2 = df['text'][0]
print(find_between(df3, 'first:', '\nSecond'))
[OUTPUT NEEDED] Dataframe with following information:
output
0 hello1
1 hello2
2 hello3
the find_between() function is created based on Find string between two substrings, but here you're only able to do that for one specific variable (df2) that is already saved as a string, as the shown example. I need to be able to do that for every row (strings) in the 'df' dataframe.
I would really much appreciate if anyone can help me with this! Thank you!