
I have a large data set of email addresses taken from 'to', 'cc', and similar fields, of the general form {some human name} {email addr}. I want to reliably parse the human-name section into first (or first+middle) and last name, and optionally suffixes such as Sr, Jr, or PhD for bonus points.

Previous SO questions have pointed to a couple of parsing libraries that do a respectable job of parsing name strings, but they all seem to assume that the names are in first-last order. However, many institutional email systems give you an address string like:

"Potter, Harry" <hpotter@hogwarts.edu>  or maybe
"Smythe III, Reginold Trevor" <twit@upperclass.org>

Is there a neat solution for handling this with the existing, very nice libraries?

This question was marked as a "duplicate" by a reader, but that seems to be in error. The original question above clearly distinguishes between the "duplicated" question about parsing person names and the problem discussed here, which is handling strings in which the first and last names are in inverted order. It is /not/ the same as the even more difficult case of Eastern-tradition names, where the family name is given first.

maab
  • https://github.com/derek73/python-nameparser is probably sufficient, but as explained in the dupe, I don't think it is solvable reliably. – kennytm Jan 27 '17 at 17:32
  • With due respect, I think you may have missed the point. I noted above that tools such as python-nameparser exist for the case when you always know the name is in first-middle-last order, but all of the examples are for that case. The issue is _also_ comprehending the occasional email person-name which is supplied as last, first. – maab Jan 27 '17 at 19:48
  • `nameparser` understands both `"First Last"` and `"Last, First"`, try it. – kennytm Jan 27 '17 at 19:49
  • You're quite right, I missed that in the pages I originally read. It seems to handle at least the test suite nicely, thanks. – maab Jan 27 '17 at 20:50
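
For anyone landing here, a minimal stdlib-only sketch of the "Last, First" handling discussed in the comments above. It is a rough heuristic, not how `nameparser` works internally: `split_display_name`, `_pop_suffix`, and the `SUFFIXES` set are hypothetical names made up for this illustration, and only `email.utils.parseaddr` comes from the standard library. It assumes a comma means inverted order unless the text after the comma is a known suffix.

```python
from email.utils import parseaddr

# Hypothetical suffix list for illustration only; extend as needed.
SUFFIXES = {"jr", "sr", "ii", "iii", "iv", "phd", "md"}


def _pop_suffix(tokens):
    """Remove and return a trailing suffix token, or return '' if none."""
    if len(tokens) > 1 and tokens[-1].rstrip(".").lower() in SUFFIXES:
        return tokens.pop()
    return ""


def split_display_name(header_value):
    """Split one address string into first, last, suffix and email."""
    # parseaddr separates the (possibly quoted) display name from the address.
    display, email_addr = parseaddr(header_value)
    before, comma, after = display.partition(",")
    if comma and after.strip().rstrip(".").lower() not in SUFFIXES:
        # Assumed inverted form: "Last [Suffix], First [Middle]".
        family = before.split()
        suffix = _pop_suffix(family)
        first, last = after.strip(), " ".join(family)
    else:
        # Natural order, optionally "First Last, Jr." with a comma before the suffix.
        tokens = display.replace(",", " ").split()
        suffix = _pop_suffix(tokens)
        first = " ".join(tokens[:-1])
        last = tokens[-1] if tokens else ""
    return {"first": first, "last": last, "suffix": suffix, "email": email_addr}


for raw in ('"Potter, Harry" <hpotter@hogwarts.edu>',
            '"Smythe III, Reginold Trevor" <twit@upperclass.org>'):
    print(split_display_name(raw))
```

Middle names stay attached to `first` here, and multi-word family names in natural order are not handled. Since, per the comments, `nameparser` already understands both orders, the display name returned by `parseaddr` can probably be handed to it directly instead of using a heuristic like this one.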

0 Answers