I have code that fetches information from some URLs. How can I change it so that it gives me only one string result? Currently it returns all results, but I need just one. In the image below, the green rectangle is the correct result, but if the page at the URL contains the string more than once, both are shown (red rectangle).

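import re
from urllib.request import urlopen  # assuming Python 3
from bs4 import BeautifulSoup

for idx, row in df.iterrows():
    url = row['e.URL'].replace('/v01/', '/depot/')
    x = urlopen(url)
    new = x.read()
    soup = BeautifulSoup(new, "lxml-xml")
    # findall returns every match; join concatenates them into one string
    match = ''.join(re.findall(r"(?i)cl[a-zA-Z]{3}\d{5}", str(soup)))
    df.at[idx, 'NEW_APP'] = match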

The line below returns all the matches:

match = ''.join(re.findall(r"(?i)cl[a-zA-Z]{3}\d{5}", str(soup)))

See image below for reference:

[Image: the green rectangle marks the correct result]

salem1992

1 Answer

If you do not want multiple matches, you can use re.search, which stops at the first match:

found = re.search(r"(?i)cl[a-zA-Z]{3}\d{5}", str(soup))
match = found.group(0) if found else ''

Or you can keep using findall as you do now, but take only the first match:

matches = re.findall(r"(?i)cl[a-zA-Z]{3}\d{5}", str(soup))
match = matches[0] if matches else ''
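
To see the difference without fetching any URLs, here is a minimal sketch that runs the question's pattern on a made-up string. The `sample` text and the helper name `first_match` are assumptions for illustration only; they are not part of the original code.

```python
import re

# Pattern from the question: "cl", three letters, five digits, case-insensitive.
PATTERN = re.compile(r"(?i)cl[a-zA-Z]{3}\d{5}")

def first_match(text):
    """Return the first match of PATTERN in text, or '' if there is none."""
    found = PATTERN.search(text)
    return found.group(0) if found else ''

# Hypothetical sample text in which the code appears twice.
sample = "<app>CLabc12345</app><ref>clXYZ67890</ref>"

print(''.join(PATTERN.findall(sample)))   # every match, glued together
print(first_match(sample))                # only the first match
print(repr(first_match("no code here")))  # empty string when nothing matches
```

Inside the loop from the question, the assignment would then become df.at[idx, 'NEW_APP'] = first_match(str(soup)).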
Encrypted