Here's one way to do it, using sets. If no string matches the options for a field, then -1 is returned for its index, similar to str.find().
#!/usr/bin/env python3

# Accepted spellings for each field, all lower case
accnums = {'account number', 'account #', 'account num', 'accnum'}
firstnames = {'first name', 'firstname', '1stname'}

def find_fields(seq):
    # -1 means "not found", like str.find()
    accnum, firstname = (-1, -1)
    for i, field in enumerate(seq):
        field = field.lower()  # case-insensitive match
        if field in accnums:
            accnum = i
        elif field in firstnames:
            firstname = i
    return accnum, firstname

testdata = [
    ['account number', 'first name'],
    ['account #', 'First Name'],
    ['ACCOUNT NUMBER', 'FIRST NAME'],
    ['accnum', '1stname'],
    ['country', 'lastname', 'account num', 'account type', 'firstname'],
    ['accnum', '1stname', 'account #'],
    ['albatross', 'first name'],
    ['Account Number', 'duck'],
]

for data in testdata:
    print(data, find_fields(data))
Output:
['account number', 'first name'] (0, 1)
['account #', 'First Name'] (0, 1)
['ACCOUNT NUMBER', 'FIRST NAME'] (0, 1)
['accnum', '1stname'] (0, 1)
['country', 'lastname', 'account num', 'account type', 'firstname'] (2, 4)
['accnum', '1stname', 'account #'] (2, 1)
['albatross', 'first name'] (-1, 1)
['Account Number', 'duck'] (0, -1)
Note that if multiple entries match a field, the index of the last matching entry is returned. Thus for ['accnum', '1stname', 'account #'] it returns 2 as the index of the account number field.
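If you'd rather keep the first match instead, one option is to assign an index only while it is still -1. This is just a sketch of that variation; the name find_fields_first is made up for illustration, and the two sets are repeated so the snippet runs on its own.

```python
# Same accepted spellings as in the script above
accnums = {'account number', 'account #', 'account num', 'accnum'}
firstnames = {'first name', 'firstname', '1stname'}

def find_fields_first(seq):
    """Hypothetical variant: return the index of the FIRST match per field."""
    accnum, firstname = -1, -1
    for i, field in enumerate(seq):
        field = field.lower()
        # Only record an index if this field has not been found yet
        if accnum == -1 and field in accnums:
            accnum = i
        elif firstname == -1 and field in firstnames:
            firstname = i
    return accnum, firstname

print(find_fields_first(['accnum', '1stname', 'account #']))  # (0, 1)
```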
You can expand the if: ... elif: block in find_fields() to handle more fields with varying names, but if you have a lot of these fields it would be better to modify the logic so that it works with a list of sets rather than with individual sets.
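A rough sketch of that list-of-sets idea might look like the following. The names FIELD_NAMES and find_field_indices, and the third "last name" field, are assumptions added purely for illustration. The function returns one index per set, in the same order as the list.

```python
# One set of accepted (lower-case) spellings per field.
# The third set is an assumed extra field, added only for illustration.
FIELD_NAMES = [
    {'account number', 'account #', 'account num', 'accnum'},
    {'first name', 'firstname', '1stname'},
    {'last name', 'lastname', 'surname'},
]

def find_field_indices(seq, field_names=FIELD_NAMES):
    """Return a tuple with one index per set in field_names, -1 if not found."""
    indices = [-1] * len(field_names)
    for i, field in enumerate(seq):
        field = field.lower()
        for k, names in enumerate(field_names):
            if field in names:
                indices[k] = i  # a later match overwrites an earlier one
                break
    return tuple(indices)

header = ['country', 'lastname', 'account num', 'account type', 'firstname']
print(find_field_indices(header))  # (2, 4, 1)
```

Adding a field is then just a matter of appending another set to the list; the loop itself doesn't change. As in the original, the last matching entry wins when a field appears more than once.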