
I am writing a Python 3 program that keeps track of hours spent with a client. One way to log hours is with a string like "Client 9:35am 1:35pm", where the first time is the start and the second is the end.

To extract the times from the string, I used regex101.com to construct the following pattern:

r"[01]?[0-9]:[0-5][0-9]\s*([Aa][Mm]?|[Pp][Mm]?)"

When I test it on the above example with regex101, it correctly identifies the two times as two separate matches. However, when I use the pattern in Python, the list that re.findall returns contains only the am/pm suffixes:

re.findall(r"[01]?[0-9]:[0-5][0-9]\s*([Aa][Mm]?|[Pp][Mm]?)", "Client 9:35am 1:35pm")
['am', 'pm']

How can I change this so that matches contain the whole time?

W Stokvis
  • `r"[01]?[0-9]:[0-5][0-9]\s*([Aa][Mm]?|[Pp][Mm]?)"` => `r"(?i)[01]?[0-9]:[0-5][0-9]\s*[AP]M?"` – Wiktor Stribiżew Jul 25 '18 at 19:57
  • There is no space between the time and the AM/PM. You can also simplify your AM/PM search. `re.findall(r"[01]?[0-9]:[0-5][0-9][Aa]?[Pp]?[Mm]", "Client 9:35am 1:35pm")` https://docs.python.org/2/library/re.html – addohm Jul 25 '18 at 20:13
  • @WiktorStribiżew I don't think this should be marked as a duplicate. The problem was in the OP's syntax. Also, the answer to that post does not help or answer his question. – addohm Jul 25 '18 at 20:14
  • Of course it is a duplicate, and the "syntax" issue is that OP uses a capturing group while the non-capturing one should have been used. – Wiktor Stribiżew Jul 26 '18 at 07:53

1 Answer


Use a non-capturing group:

r"[01]?[0-9]:[0-5][0-9]\s*(?:[Aa][Mm]?|[Pp][Mm]?)"  # note the "?:"

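If the pattern contains capturing groups, re.findall returns the text matched by those groups instead of the entire matches: a list of strings for a single group, or a list of tuples for several groups. A non-capturing group, written (?:...), still groups the alternation but is not reported, so each list item is the whole match.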
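As a minimal sketch of the difference, here are both patterns run on the string from the question. The variable names are only illustrative, and the last part shows one possible alternative (re.finditer with group(0)) for cases where you want to keep the capturing group:

```python
import re

text = "Client 9:35am 1:35pm"

# Capturing group: findall reports only what the group matched.
capturing = r"[01]?[0-9]:[0-5][0-9]\s*([Aa][Mm]?|[Pp][Mm]?)"
suffixes = re.findall(capturing, text)
print(suffixes)  # ['am', 'pm']

# Non-capturing group (?:...): findall reports each whole match.
non_capturing = r"[01]?[0-9]:[0-5][0-9]\s*(?:[Aa][Mm]?|[Pp][Mm]?)"
times = re.findall(non_capturing, text)
print(times)  # ['9:35am', '1:35pm']

# Alternative: keep the capturing group and read the whole match
# from each match object; group(0) is always the entire match.
whole = [m.group(0) for m in re.finditer(capturing, text)]
print(whole)  # ['9:35am', '1:35pm']
```

The finditer variant is handy if you later want both the full time and the am/pm part, since m.group(1) is still available on each match object.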

user2390182