I have some values in a column, like:

data1 = "any_number_of_characters&wanted=gsjgj87-hdjh_66&"
data2 = "any_number_of_characters&wanted=g232gj87-hdjh_66#dhvdhohoh"
data3 = "any_number_of_characters&wanted=gsjgj87-hdjh_66?uhdjd=skjhnknkn"

Assume these values arrive one after another for the same field in a loop. I locate the string 'wanted=' in each value and then want everything after it, up to the first '&', '#' or '?'.

If I do this with an if-else chain, it works for one character but fails for the others, because it is unpredictable which of them appears first.

I want to achieve this with something like the find() function, without iterating character by character, but I am not able to search for the first occurrence of any one of these three characters.

  • Please elaborate on your question. How are you reading this file: in chunks or line by line (provide extra code)? Are the start (wanted=) and end (&, #, ?) patterns always present on the same line? Can there be more than one per line? – Newbie Dec 03 '20 at 10:44
  • Does this answer your question? [Split string with multiple delimiters in Python](https://stackoverflow.com/questions/4998629/split-string-with-multiple-delimiters-in-python) – harunB10 Dec 03 '20 at 10:52

1 Answer

You could search for each of the stop characters after 'wanted=' and take the one that is closest to the start pattern:

data1 = "any_number_of_characters&wanted=gsjgj87-hdjh_66&"
data2 = "any_number_of_characters&wanted=g232gj87-hdjh_66#dhvdhohoh?uhdjd=skjhnknkn"
data3 = "any_number_of_characters&wanted=gsjgj87-hdjh_66?uhdjd=skjhnknkn#dhvdhohoh"


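def extractData(data, startPattern, endCharacters='&#?'):
    # Locate the start pattern; return None when it is absent.
    start = data.find(startPattern)
    if start == -1:
        return None
    valueStart = start + len(startPattern)
    # Position of each stop character after the pattern (-1 means not found).
    positions = [data.find(ec, valueStart) for ec in endCharacters]
    # The value ends at the nearest stop character, or at the end of the string.
    end = min([pos for pos in positions if pos != -1], default=len(data))
    return data[valueStart:end]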


for data in (data1, data2, data3):
    print(extractData(data, 'wanted='))

Out:

gsjgj87-hdjh_66
g232gj87-hdjh_66
gsjgj87-hdjh_66
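
As an alternative to the find()-based approach, a regular expression with a negated character class can stop at whichever of the stop characters comes first. This is only a sketch: the name extract_with_regex and its parameters are made up for illustration, and it assumes that returning None is acceptable when the start pattern is missing.

```python
import re


def extract_with_regex(data, start_pattern, end_characters="&#?"):
    # Build "<start_pattern>([^<stop chars>]*)": the group captures everything
    # after the start pattern up to the first stop character or end of string.
    # re.escape keeps both pieces literal inside the pattern.
    pattern = re.escape(start_pattern) + "([^" + re.escape(end_characters) + "]*)"
    match = re.search(pattern, data)
    # Hypothetical convention: None signals that the start pattern is absent.
    return match.group(1) if match else None


if __name__ == "__main__":
    samples = (
        "any_number_of_characters&wanted=gsjgj87-hdjh_66&",
        "any_number_of_characters&wanted=g232gj87-hdjh_66#dhvdhohoh?uhdjd=skjhnknkn",
        "any_number_of_characters&wanted=gsjgj87-hdjh_66?uhdjd=skjhnknkn#dhvdhohoh",
    )
    for sample in samples:
        print(extract_with_regex(sample, "wanted="))
```

For the three sample strings this should print the same values as the output above. If no stop character follows the value, the capture simply runs to the end of the string.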
Maurice Meyer