
I want my list to contain chemical elements like ['K', 'Ca', 'Fe'], but when I run my code I get ['K', 'C', 'a', 'F', 'e']. How do I fix this?

character_list = []
for char in compound_formula:
    if char.isalpha():
        character_list.append(char)

Thank you! I am a beginner, so I need the code to be as simple as possible. A little explanation would also be very helpful!

Here are some examples to test the code with: compound_formula could be Fe6Cr1 or C6H2, for example.

>>> molform("Fe6Cr1")
['Fe', 'Cr']
>>> molform("C6H2")
['C', 'H']
coco_pops
  • Please share a [Minimal, Complete, and Verifiable example](https://stackoverflow.com/help/mcve) – yatu Jun 12 '20 at 11:35
  • Can you show what the original `compound_formula` looked like? – Cory Kramer Jun 12 '20 at 11:36
  • Here are examples you can use! `>>> molform("C2H6O1")` gives `{'C':2, 'H':6, 'O':1}`, `>>> molform("C1H4")` gives `{'C':1, 'H':4}`, and there is also `>>> molform("Fe6Cr1")`. – coco_pops Jun 12 '20 at 11:39
  • compound_formula = "Fe6Cr1"
    character_list = []
    temp_str = ""
    for char in compound_formula:
        if char.isalpha():
            temp_str += char
        else:
            character_list.append(temp_str)
            temp_str = ""
    – sławomir sowa Jun 12 '20 at 12:08
  • From what still remains from my year of chemistry, I would guess you need to split the elements based on a capital letter, because you might encounter things like `ROCOOR` at some point – β.εηοιτ.βε Jun 12 '20 at 19:45

2 Answers

TL;DR

This is a possible solution to your use case:

character_list = []
for char in compound_formula:
    if char.isupper():
        character_list.append(char)
    elif char.isalpha() and len(character_list) == 0:
        print('Unexpected character {} in {}'.format(char, compound_formula))
    elif char.isalpha():
        character_list[-1] += char
print(list(set(character_list)))

I don't know exactly what is in your compound_formula, but from what I remember of my year of chemistry, I would guess you will run into more complex formulas where, for example, carbon and oxygen appear next to each other.

More precisely, if I feed in CH3COOH (acetic acid), logic that only checks whether char is a letter won't work, because it cannot tell where one element symbol ends and the next one begins.

That said, you can cope with this by testing whether the letter is a capital, using isupper().

Also note that the line

character_list[-1] += char

concatenates the current character onto the last element of the list. For example, when an e comes right after an F, the last element becomes Fe.
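To see that concatenation on its own, here is a tiny sketch. The list contents are made up for illustration and are not taken from your formula:

```python
# Hypothetical starting point: 'K' is complete, 'F' was just appended.
elements = ["K", "F"]

# The next character is a lowercase 'e', so it belongs to the last symbol.
elements[-1] += "e"

print(elements)  # ['K', 'Fe']
```

Index -1 always refers to the last item of a list, so this works no matter how many elements have been collected so far.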

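import sys

# Assumption: the formula is passed as the first command-line argument.
compound_formula = sys.argv[1]

character_list = []
for char in compound_formula:
    if char.isupper():
        # A capital letter starts a new element symbol.
        character_list.append(char)
    elif char.isalpha() and len(character_list) == 0:
        # A lowercase letter with no capital before it is invalid.
        print('Unexpected character {} in {}'.format(char, compound_formula))
    elif char.isalpha():
        # A lowercase letter extends the last symbol, e.g. 'F' becomes 'Fe'.
        character_list[-1] += char
print(character_list)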

Here are some runs of it:

python chemistry.py Fe6Cr1
['Fe', 'Cr']
python chemistry.py C6H2 
['C', 'H']
python chemistry.py CH3COOH
['C', 'H', 'C', 'O', 'O', 'H']

Looking at the last result, maybe you want a list of unique elements instead, which can be done by editing the last line to

print(list(set(character_list)))

With this modification, the last run now gives something like the following (a set is unordered, so the order of the elements may differ):

python chemistry.py CH3COOH
['H', 'C', 'O']
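If you would rather get the unique elements in the order they first appear, one option is to replace set with dict.fromkeys. This is only a minimal sketch, assuming Python 3.7 or later, where a dict keeps insertion order. The function name molform is borrowed from the question, and unlike the script above this version silently skips a leading lowercase letter instead of printing a message:

```python
def molform(compound_formula):
    """Return the unique element symbols of a formula, in first-seen order."""
    character_list = []
    for char in compound_formula:
        if char.isupper():
            # A capital letter starts a new element symbol.
            character_list.append(char)
        elif char.isalpha() and character_list:
            # A lowercase letter extends the last symbol ('F' + 'e' -> 'Fe').
            character_list[-1] += char
    # dict.fromkeys drops duplicates while keeping insertion order.
    return list(dict.fromkeys(character_list))


print(molform("Fe6Cr1"))   # ['Fe', 'Cr']
print(molform("CH3COOH"))  # ['C', 'H', 'O']
```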
β.εηοιτ.βε

This problem can also be solved using regex.

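import re

def get_elements(compound):
    # One capital letter, optional lowercase letters, optional digits.
    pattern = r"([A-Z][a-z]*)[0-9]*"
    # findall returns only the captured group, i.e. the element symbols.
    elements = re.findall(pattern, compound)
    return list(set(elements))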

Pattern: "([A-Z][a-z]*)[0-9]*"

[A-Z] matches one capital letter, which marks the start of an element symbol.
Example: F from Fe

[a-z]* matches any lowercase letters that follow, which make up the rest of the symbol.
Example: e from Fe

[0-9]* matches the quantity of the element, if there is one.
Example: 6 from Fe6

() captures only the element symbol, so the quantity is discarded.
Example: Fe from Fe6

Here are the results (the order within each list may vary, because a set is unordered):

print(get_elements("Fe6Cr1"))
['Fe', 'Cr']
print(get_elements("CH3COOH"))
['H', 'C', 'O']
print(get_elements("C6H2"))
['H', 'C']
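In the comments, the asker also shows outputs such as {'C':2, 'H':6, 'O':1}. A second capture group around the digits can produce that. The sketch below is one possible approach, not a definitive implementation: the function name count_elements is made up for illustration, and it assumes that a symbol without a number counts as 1 and that repeated symbols should be added together.

```python
import re


def count_elements(compound):
    """Map each element symbol to its total quantity in the formula."""
    counts = {}
    # With two groups, findall returns (symbol, digits) tuples.
    for symbol, digits in re.findall(r"([A-Z][a-z]*)([0-9]*)", compound):
        # Assumption: a missing number means a quantity of 1.
        quantity = int(digits) if digits else 1
        counts[symbol] = counts.get(symbol, 0) + quantity
    return counts


print(count_elements("C2H6O1"))   # {'C': 2, 'H': 6, 'O': 1}
print(count_elements("CH3COOH"))  # {'C': 2, 'H': 4, 'O': 2}
```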
abhit pahwa