I have seen that similar questions have been posted, but the solutions there do not work for me, I believe because I am working with a column in a dataframe.

I have a column that holds a string. I find the first occurrence of a term, and this works. I then want to find the second occurrence of that term, and this doesn't work.

My code

import pandas as pd
data = {"Text" : ["['one', 'one two']","['two one', 'three']"]}
df = pd.DataFrame(data)

# Yes, the data looks like a list, but each cell is a string and I treat it as one

# Find the first occurrence of "one" in each string - works
df["Ones"] = df.Text.str.find("one")
# Find the next occurrence of "one" (the first row has one) - does not work
df["NextOnes"] = df.Text.str.find("one",df.Ones +1)

The line for "NextOnes" returns NAs. In my real code, it returns blanks. If I replace the reference to the column with a fixed number, such as 2, then it returns the correct value. However, this value needs to be dynamic, since the starting position differs from row to row.

I have just got the needle/haystack approach (a find_nth helper) to work, but building in a for loop seems inefficient here:

for i in range(len(df)):
    df.loc[i, "Next"] = find_nth(str(df.Text[i]), "one", 2)
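
The post does not show find_nth, so the following is only a sketch of what such a helper might look like, assuming it returns the index of the n-th non-overlapping occurrence (counting from 1) or -1 when there is none. The name and signature are taken from the loop above; the body is an assumption. It also swaps the index-based loop for Series.map, which still runs Python code once per row but avoids assigning cell by cell.

```python
import pandas as pd


def find_nth(haystack: str, needle: str, n: int) -> int:
    """Hypothetical helper: index of the n-th (1-based) occurrence, or -1."""
    start = haystack.find(needle)
    while start >= 0 and n > 1:
        # Continue searching just past the previous match.
        start = haystack.find(needle, start + len(needle))
        n -= 1
    return start


data = {"Text": ["['one', 'one two']", "['two one', 'three']"]}
df = pd.DataFrame(data)

# Apply the helper to every cell; str() guards against non-string values.
df["Next"] = df["Text"].map(lambda s: find_nth(str(s), "one", 2))
```

With the sample data, the first row has a second "one" and the second row does not, so the second row gets -1.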
James Oliver

1 Answer

You could try using the find method of str, passing the index of the previous match as the starting point (the start parameter of find).
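
A minimal sketch of this idea applied one row at a time, on the assumption that Series.str.find takes a single integer start for the whole column rather than a per-row Series. Each string is paired with its own offset and passed to Python's built-in str.find. The column names come from the question; the -1 guard for rows with no first match is an addition.

```python
import pandas as pd

data = {"Text": ["['one', 'one two']", "['two one', 'three']"]}
df = pd.DataFrame(data)

# First occurrence: the default scalar start of 0 applies to every row.
df["Ones"] = df["Text"].str.find("one")

# Next occurrence: pair each string with its own start offset.
# Rows with no first match (-1) stay at -1 instead of searching again.
df["NextOnes"] = [
    text.find("one", first + 1) if first != -1 else -1
    for text, first in zip(df["Text"], df["Ones"])
]
```

This is still a Python-level loop, so it is unlikely to be faster than the find_nth version, but it keeps the dynamic start value for each row.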

A.M. Ducu
  • Sadly I already tried this and it didn't work. Covered in the original post. df["NextOnes"] = df.Text.str.find("one",df.Ones +1) Or do you mean something different? – James Oliver Jun 23 '21 at 08:06
  • Yes, that is what I meant. Unfortunately, I have no better idea. Your "slow" solution seems good though. – A.M. Ducu Jun 23 '21 at 18:32