1

I created a new thread referencing this thread, which is 9 years old :

(Regular Expression patterns for Tracking numbers)

My question here, at the moment, resolves around the format of the UPS tracking number. According to UPS.com, the format for the tracking numbers with 1Z should be 1Z + 6 characters (numbers or letters) + 2 characters (numbers or letters) + 8 characters (numbers or letters), example format: 1Z 89X406 C8 33660056, however in the example for UPS, referenced in the thread link above, the matching format is centered around: 1Z 89S 406 B8 3322 005 6

In the second matching format, the pattern used is:

\b(1Z ?[0-9A-Z]{3} ?[0-9A-Z]{3} ?[0-9A-Z]{2} ?[0-9A-Z]{4} ?[0-9A-Z]{3} ?[0-9A-Z]

but, you could also use this pattern (to match the first format, according to UPS quote on quote proper format): \b(1Z ?[0-9A-Z]{6} ?[0-9A-Z]{2} ?[0-9A-Z]{8}

I guess the question I have boils down to whether there is efficiency in using either matching pattern over the other. I'm not understanding why the OP of the link above uses the second matching pattern instead of the one that conforms to the format of the UPS tracking number.

Thanks in advance, and hope this helps someone else in the future.

dataviews
  • 2,466
  • 7
  • 31
  • 64
  • the second is stricter because it will only match 6 consecutive alphanumeric characters, whereas the first allows for a space between groups of 3. so it depends on how strictly formatted your input it, as it may not always conform to the format UPS specified – Simon Nov 18 '18 at 00:06
  • Well, the second pattern does not match the `1Z 89S 406 B8 3322 005 6` in the first thread, which is probably the reason for the `{3}`s rather than `{6}`s – CertainPerformance Nov 18 '18 at 00:07

1 Answers1

1

If it were me, I wouldn't bother with spaces at all, since they don't appear to matter.

tracking_number = "1Z 89S 406 B8 3322 005 6"
# Strip spaces out
tracking_number = tracking_number.replace(' ', '')
match = re.search(r'1Z[A-Z0-9]{16}', tracking_number)
Jonah Bishop
  • 12,279
  • 6
  • 49
  • 74
  • right, but what if someone enters the tracking number with a space. essentially, my script is to run through a dataset and determine if a tracking number exists. If you get a number with spaces, then your expression would not work. – dataviews Nov 18 '18 at 01:09
  • 1
    Assuming you are in control of the dataset, you should _normalize_ all tracking numbers (i.e. all tracking numbers should omit spaces). Then the look-ups will be trivial, as you could have them in a `set` or `dict` and see if there's a match or not. If you aren't in control the dataset, then you may have to use a regex that considers the spaces. – Jonah Bishop Nov 18 '18 at 01:11