TL;DR
This is a possible solution to your use case:
    character_list = []
    for char in compound_formula:
        if char.isupper():
            # An uppercase letter starts a new element symbol
            character_list.append(char)
        elif char.isalpha() and len(character_list) == 0:
            # A lowercase letter with no element before it
            print('Unexpected character {} in {}'.format(char, compound_formula))
        elif char.isalpha():
            # A lowercase letter extends the previous symbol
            character_list[-1] += char
    print(str(list(set(character_list))))
I don't know exactly what is in your compound_formula, but from what I still remember of my year of chemistry, I would guess you may run into more complex formulas where, for example, Carbon and Oxygen elements sit right next to each other.
More precisely, if I feed it CH3COOH (acetic acid), a simple logic based only on whether char is a letter won't work.
That said, you can cope with this by testing whether the letter is a capital, using isupper().
Also note that the line
    character_list[-1] += char
concatenates the current character onto the last element of the list. If we meet, for example, an e right after an F, the last element becomes Fe.
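To see that concatenation in isolation, here is a tiny sketch. The starting list ['F'] is just an illustrative value I picked, not something taken from your data.

```python
# Illustrative starting state: the loop has just seen an uppercase 'F'.
character_list = ['F']

# A lowercase letter follows, so it is glued onto the last element
# instead of being appended as a new one.
character_list[-1] += 'e'

print(character_list)  # ['Fe']
```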
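    character_list = []
    for char in compound_formula:
        if char.isupper():
            # An uppercase letter starts a new element symbol
            character_list.append(char)
        elif char.isalpha() and len(character_list) == 0:
            # A lowercase letter with no element before it
            print('Unexpected character {} in {}'.format(char, compound_formula))
        elif char.isalpha():
            # A lowercase letter extends the previous symbol
            character_list[-1] += char
    print(str(character_list))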
Here are some runs of it:
    python chemistry.py Fe6Cr1
    ['Fe', 'Cr']
    python chemistry.py C6H2
    ['C', 'H']
    python chemistry.py CH3COOH
    ['C', 'H', 'C', 'O', 'O', 'H']
Now, looking at the last result, maybe you even want a list of unique elements, which can be done by changing the last line to
    print(str(list(set(character_list))))
With this modification, the last run now gives
    python chemistry.py CH3COOH
    ['H', 'C', 'O']
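One caveat: a set does not keep insertion order, so the order of the elements in that last output can vary. If you would rather have unique elements in order of first appearance, something along these lines might do. The function names extract_elements and unique_in_order are mine, made up for the illustration; the loop is the same logic as above, just wrapped in a function that returns the list.

```python
def extract_elements(compound_formula):
    """Collect element symbols: an uppercase letter starts a new symbol,
    a lowercase letter extends the previous one, anything else is skipped."""
    character_list = []
    for char in compound_formula:
        if char.isupper():
            character_list.append(char)
        elif char.isalpha() and len(character_list) == 0:
            print('Unexpected character {} in {}'.format(char, compound_formula))
        elif char.isalpha():
            character_list[-1] += char
    return character_list


def unique_in_order(items):
    """Drop duplicates while keeping the first occurrence of each item."""
    seen = set()
    unique = []
    for item in items:
        if item not in seen:
            seen.add(item)
            unique.append(item)
    return unique


print(unique_in_order(extract_elements('CH3COOH')))  # ['C', 'H', 'O']
```

The helper only relies on membership tests against a set, so it should behave the same on Python 2 and Python 3.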