I have been learning a lot about regex lately and have just encountered the groupdict method for match objects from the re module. I am trying to create one from the following email header:
EMAIL_HEADER = """Return-Path: <bounces+5555-7602-redacted-info>
...
Received: by 10.8.49.86 with SMTP id mf9.22328.51C1E5CDF
Wed, 19 Jun 2013 17:09:33 +0000 (UTC)
Received: from NzI3MDQ (174.37.77.208-static.reverse.softlayer.com [174.37.77.208])
by mi22.sendgrid.net (SG) with HTTP id 13f5d69ac61.41fe.2cc1d0b
for <redacted-info>; Wed, 19 Jun 2013 12:09:33 -0500 (CST)
Content-Type: multipart/alternative;
boundary="===============8730907547464832727=="
MIME-Version: 1.0
From: redacted-address
To: redacted-address
Subject: A Test From SendGrid
Message-ID: <1371661773.974270694268263@mf9.sendgrid.net>
Date: Wed, 19 Jun 2013 17:09:33 +0000 (UTC)
X-SG-EID: P3IPuU2e1Ijn5xEegYUQ...
X-SendGrid-Contentd-ID: {"test_id":"1371661776"}"""
I am looking to match the "From," "To," "Subject" and "Date" lines, and turn them into a groupdict. Trying to start small and build up, I used:
details = re.search(r'(?P<from>(?<=From: )[a-z]+-[a-z]+)|(?P<to>(?<=To: )[a-z]+-[a-z]+)', EMAIL_HEADER).groupdict()
This returns: {'from': 'redacted-address', 'to': None}
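For anyone who wants to reproduce this, here is a minimal sketch of the same call. It assumes a shortened stand-in for the header (the variable name header and its two-line content are only for illustration) that keeps just the two lines the pattern targets; the pattern itself is unchanged.

```python
import re

# Shortened stand-in for EMAIL_HEADER (an assumption for illustration):
# only the two lines the pattern is meant to pick up.
header = """From: redacted-address
To: redacted-address"""

# Same pattern as above, with the pipe between the two named groups.
pattern = r'(?P<from>(?<=From: )[a-z]+-[a-z]+)|(?P<to>(?<=To: )[a-z]+-[a-z]+)'

# re.search stops at the first position where the pattern matches.
match = re.search(pattern, header)
details = match.groupdict()

print(match.group(0))  # redacted-address
print(details)         # {'from': 'redacted-address', 'to': None}
```

So the overall match is only the address after "From: ", and the "to" group is reported as None.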
If I remove the |, I get an error that essentially means my regex did not match at all. Can anyone explain what is happening? I don't understand why removing the pipe character makes the search return None. I saw examples that did not use the pipe character; they looked like what I have above, just without the pipe between the groups. Any tips or help would be appreciated. Thanks!
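To make the failing case concrete, here is a minimal sketch of the version without the pipe. It again assumes a shortened two-line stand-in for the header (the names header, no_pipe and error_raised are only for illustration). The search returns None, so calling groupdict on the result raises an AttributeError.

```python
import re

# Shortened stand-in for EMAIL_HEADER (an assumption for illustration).
header = """From: redacted-address
To: redacted-address"""

# Same two named groups as before, but with the pipe removed,
# so both groups must now match back to back.
no_pipe = r'(?P<from>(?<=From: )[a-z]+-[a-z]+)(?P<to>(?<=To: )[a-z]+-[a-z]+)'

result = re.search(no_pipe, header)
print(result)  # None

# Calling .groupdict() on None is what produces the error.
error_raised = False
try:
    result.groupdict()
except AttributeError:
    error_raised = True

print(error_raised)  # True
```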