Take the following string as an input example:

Post-Match Thread: Barcelona - Olimpia Milano [EuroLeague Regular Season, Round 24]

The goal is to extract the home team, the away team and the competition. In this case, that would be Barcelona, Olimpia Milano and EuroLeague, respectively. Using the match object's group() method, I was able to retrieve both teams just fine (although a strip() call was necessary), but for some reason the current regular expression fails to capture the competition. I tried several options; the current iteration is the following:

title_group_regex = r"Post-Match Thread: (?P<home_team>(.*?))-(?P<away_team>(.*?))\[(?P<competition>(.*?))\b"
title_group = re.search(title_group_regex, thread_title)

Here I was trying to match anything after the [ char up to a word boundary. However, this results in AttributeError: 'NoneType' object has no attribute 'group'. I imagine I'm close but missing that final touch. Thank you.
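For reference, here is a minimal, self-contained check of the pattern exactly as posted above. It is only a sketch, and the `thread_title` value is assumed to be the example string from the top of the question. With this version of the pattern the search appears to succeed, but the lazy `(.*?)` before `\b` can stop without consuming anything, which would explain an empty `competition` rather than a missing match.

```python
import re

# Assumed input: the example title quoted in the question.
thread_title = (
    "Post-Match Thread: Barcelona - Olimpia Milano "
    "[EuroLeague Regular Season, Round 24]"
)

# Pattern exactly as posted in the question.
title_group_regex = r"Post-Match Thread: (?P<home_team>(.*?))-(?P<away_team>(.*?))\[(?P<competition>(.*?))\b"
title_group = re.search(title_group_regex, thread_title)

# A lazy (.*?) expands as little as possible. The position right after
# "[" (between "[" and "E") is already a word boundary, so the group
# is allowed to stop before consuming a single character.
competition = title_group.group("competition")
print(repr(competition))  # → ''
```

The team groups still carry the surrounding spaces, which is why the strip() call is needed.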

Joao Pereira
  • `r"\\b"` means ``\`` and `b`. A word boundary is `r"\b"` or `"\\b"` – Wiktor Stribiżew Feb 19 '20 at 11:47
  • @WiktorStribiżew same thing happens when trying to apply `"\b"` – Joao Pereira Feb 19 '20 at 11:49
  • `[.*?]` matches a single char, `.` , `*` or `?`. What were you up to? Escape `[` if you mean to match a literal `[`. But `]\b` will only match if there is a word char after `]`. Why use `\b` at all? I doubt your actual problem is related to a word boundary. See https://regex101.com/r/m7Vzyg/1. And the [code generated online](https://regex101.com/r/m7Vzyg/1/codegen?language=python). – Wiktor Stribiżew Feb 19 '20 at 11:55
  • @WiktorStribiżew that was a mistake. I updated the regex and it is now returning an empty string for `competition`. gonna check the link though. – Joao Pereira Feb 19 '20 at 12:00
  • Please update the question once you know the actual requirements. – Wiktor Stribiżew Feb 19 '20 at 12:03
  • @WiktorStribiżew the requirements are absolutely clear. they are explained in plain text. the only thing I changed was regex that had an error. – Joao Pereira Feb 19 '20 at 12:04
  • I guess you just need `\w+` instead of `.*?\b`, use `Post-Match Thread: (?P<home_team>(.*?))-(?P<away_team>(.*?))\[(?P<competition>(\w+))`, see https://regex101.com/r/m7Vzyg/5. Still a duplicate though. – Wiktor Stribiżew Feb 19 '20 at 12:07
  • @WiktorStribiżew that's it, thank you so much. – Joao Pereira Feb 19 '20 at 12:09
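The `\w+` suggestion from the comments can be sketched as follows. This is an illustration rather than a definitive solution: `thread_title` is assumed to be the example string from the question, the redundant inner parentheses around each group are dropped, and the `None` check is an addition that avoids the AttributeError when a title does not fit the pattern.

```python
import re

# Assumed input: the example title quoted in the question.
thread_title = (
    "Post-Match Thread: Barcelona - Olimpia Milano "
    "[EuroLeague Regular Season, Round 24]"
)

# \w+ must consume at least one word character, so the competition
# group captures the first word after "[" instead of an empty string.
title_group_regex = (
    r"Post-Match Thread: (?P<home_team>.*?)-(?P<away_team>.*?)"
    r"\[(?P<competition>\w+)"
)
title_group = re.search(title_group_regex, thread_title)

# re.search returns None when nothing matches; check before calling group().
if title_group is None:
    raise ValueError(f"Unexpected thread title: {thread_title!r}")

# The lazy team groups include the spaces around "-", hence strip().
home_team = title_group.group("home_team").strip()
away_team = title_group.group("away_team").strip()
competition = title_group.group("competition")

print(home_team, away_team, competition, sep=" | ")
# → Barcelona | Olimpia Milano | EuroLeague
```

Note that `\w+` stops at the first non-word character, so a competition name containing a space would be cut at its first word. A team name containing a hyphen would also split early at the `-` separator.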
