Properly using Python re.groupdict()

Question

I have been learning a lot about regex lately and have just come across the groupdict() method of match objects. I am trying to build a dict from the following email header:

EMAIL_HEADER = """Return-Path: <bounces+5555-7602-redacted-info>
...
Received: by 10.8.49.86 with SMTP id mf9.22328.51C1E5CDF
    Wed, 19 Jun 2013 17:09:33 +0000 (UTC)
Received: from NzI3MDQ (174.37.77.208-static.reverse.softlayer.com [174.37.77.208])
by mi22.sendgrid.net (SG) with HTTP id 13f5d69ac61.41fe.2cc1d0b
for <redacted-info>; Wed, 19 Jun 2013 12:09:33 -0500 (CST)
Content-Type: multipart/alternative;
boundary="===============8730907547464832727=="
MIME-Version: 1.0
From: redacted-address
To: redacted-address
Subject: A Test From SendGrid
Message-ID: <1371661773.974270694268263@mf9.sendgrid.net>
Date: Wed, 19 Jun 2013 17:09:33 +0000 (UTC)
X-SG-EID: P3IPuU2e1Ijn5xEegYUQ...
X-SendGrid-Contentd-ID: {"test_id":"1371661776"}"""

I want to match the "From", "To", "Subject" and "Date" lines and turn them into a groupdict. Trying to start small and build up, I used:

details = re.search(r'(?P<from>(?<=From: )[a-z]+-[a-z]+)|(?P<to>(?<=To: )[a-z]+-[a-z]+)', EMAIL_HEADER).groupdict()

This returns: {'from': 'redacted-address', 'to': None}

If I remove the |, I get an error that essentially means my regex did not match at all. Can anyone explain what is happening? I don't understand why removing the pipe character makes the search return None. The examples I saw did not use the pipe character; they looked like my pattern above, just without the pipe between the groups. Any tips or help would be appreciated. Thanks!
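
For anyone who wants to reproduce the two behaviours described above, here is a minimal sketch. The `header` string is a shortened stand-in for the full header (an assumption made for brevity), and the names `FROM`, `TO`, `with_pipe` and `without_pipe` are only illustrative.

```python
import re

# Shortened stand-in for the header in the question (assumption, for brevity).
header = (
    "MIME-Version: 1.0\n"
    "From: redacted-address\n"
    "To: redacted-address\n"
    "Subject: A Test From SendGrid\n"
)

FROM = r'(?P<from>(?<=From: )[a-z]+-[a-z]+)'
TO = r'(?P<to>(?<=To: )[a-z]+-[a-z]+)'

# With the pipe: the pattern is an alternation, so only one branch takes part
# in the match and the other named group is reported as None.
with_pipe = re.search(FROM + '|' + TO, header)
print(with_pipe.groupdict())   # {'from': 'redacted-address', 'to': None}

# Without the pipe: both groups must match directly next to each other,
# which never happens in this header, so search() returns None.
without_pipe = re.search(FROM + TO, header)
print(without_pipe)            # None
```

Calling `.groupdict()` on that second result is what raises the error, since `None` has no such method.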

Not an answer but regex101.com is SUPER helpful for me when trying to figure out patterns. Here is your example; https://regex101.com/r/Dtl3le/1 — Marcel Wilson, Nov 10 '20 at 17:50
With the pipe character, the regex matches *either* From or To. Without it, it tries to match both, immediately adjacent to each other - but they aren't, there are about 5 characters (newline followed by "To: ") between the end of the first group and the start of the second. You'd have to replace the pipe by something like `.*` (and turn on the `DOTALL` option) to allow the regex to skip those characters in between the groups. — jasonharper, Nov 10 '20 at 18:01
jasonharper, that was super helpful and just what I needed. Thanks so much! And thanks to everyone else who answered as well. I appreciate it! — J. B., Nov 10 '20 at 19:38
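
A minimal sketch of the fix suggested in jasonharper's comment, assuming a shortened copy of the header and a lazy `.*?` (rather than a greedy `.*`) between the groups. The second pattern, `four`, is a hypothetical extension to all four fields; it assumes the fields appear in the order From, To, Subject, Date, each at the start of its own line.

```python
import re

# Shortened stand-in for the header in the question (assumption, for brevity).
header = (
    "MIME-Version: 1.0\n"
    "From: redacted-address\n"
    "To: redacted-address\n"
    "Subject: A Test From SendGrid\n"
    "Message-ID: <1371661773.974270694268263@mf9.sendgrid.net>\n"
    "Date: Wed, 19 Jun 2013 17:09:33 +0000 (UTC)\n"
)

# The original two groups, joined by ".*?" instead of "|".
# re.DOTALL lets "." cross the newline between the From and To lines.
pattern = re.compile(
    r'(?P<from>(?<=From: )[a-z]+-[a-z]+)'
    r'.*?'
    r'(?P<to>(?<=To: )[a-z]+-[a-z]+)',
    re.DOTALL,
)
match = pattern.search(header)
# search() returns None when nothing matches, so guard before groupdict().
details = match.groupdict() if match else {}
print(details)  # {'from': 'redacted-address', 'to': 'redacted-address'}

# Hypothetical extension to all four fields. re.MULTILINE makes "^" match
# at the start of every line; "[^\n]+" captures the rest of that line.
four = re.compile(
    r'^From: (?P<from>[^\n]+).*?'
    r'^To: (?P<to>[^\n]+).*?'
    r'^Subject: (?P<subject>[^\n]+).*?'
    r'^Date: (?P<date>[^\n]+)',
    re.DOTALL | re.MULTILINE,
)
four_match = four.search(header)
four_details = four_match.groupdict() if four_match else {}
print(four_details)
```

The single-pattern approach depends on field order. If the order can vary, searching for each field separately is one alternative.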
