
I'm trying to extract the strings between parentheses `()` that contain a certain substring.

I've already been able to extract the strings between `()`, but I couldn't work the substring requirement into the pattern.

Reference: Regular expression to extract text between square brackets

links = re.findall(r'\(([^)]+)\)', page_content)

My code extracts every string between `()`. Where do I insert the substring in the regex?

Sample input:

(XYZ) | **Birthdate**: Dec 10, 1983; - **Social Media**: [Daum Cafe](http://cafe.daum.net/swedjs), [Instagram](https://www.instagram.com/skslzowk/), [Facebook](https://www.facebook.com/sdiwoel)

The output should be only the Facebook link: https://www.facebook.com/sdiwoel

DatCra
  • A sample input with matches and non-matches, the substring, and expected output would be very helpful... – MonkeyZeus Aug 01 '19 at 18:42
  • Why should there only be one link? Are you targeting Facebook links explicitly? If there were two Facebook links then should both of them get matched? – MonkeyZeus Aug 01 '19 at 19:58

1 Answer


It depends on what exactly you want to match. If it's only about Facebook links, as you wrote, perhaps simply

re.findall(r'\(([^)]+www\.facebook[^)]+)\)', page_content)
# ['https://www.facebook.com/sdiwoel']

could help.
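If the substring is not fixed, the same idea can be wrapped in a small helper. This is only a sketch: the function name `links_containing` is made up for illustration, and it assumes the links never contain a closing parenthesis themselves. `re.escape` keeps characters such as `.` in the substring literal.

```python
import re

def links_containing(text, substring):
    """Return the parenthesized strings in `text` that contain `substring`."""
    # [^)]* on both sides keeps the match inside a single pair of parentheses;
    # re.escape makes sure the substring is matched literally.
    pattern = r'\(([^)]*' + re.escape(substring) + r'[^)]*)\)'
    return re.findall(pattern, text)

# Sample input taken from the question.
page_content = (
    "(XYZ) | **Birthdate**: Dec 10, 1983; - **Social Media**: "
    "[Daum Cafe](http://cafe.daum.net/swedjs), "
    "[Instagram](https://www.instagram.com/skslzowk/), "
    "[Facebook](https://www.facebook.com/sdiwoel)"
)

print(links_containing(page_content, "www.facebook"))
# ['https://www.facebook.com/sdiwoel']
```

Unlike the pattern above, this one uses `*` instead of `+`, so it also matches when the substring sits at the very start or end of the parentheses. If there are several Facebook links in the text, all of them are returned.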

SpghttCd