I have a macro that converts a cell string that contains a chemical name to just its chemical symbol. e.g. a cell that contains "Silver ICM-MS ug/kg" would become simply "Ag"(the chemical symbol for silver). I am trying to convert this to Python to use in processing of lab data where I would convert wordy column headings in a dataframe to just chemical symbols. Headings that don't contain an element name can be left as they are. My attempt isn't working:
#load dictionary of element names, import pandas and load dummy data
dict_of_elements={"Sulphate":"SO4","Silver":"Ag","Aluminium":"Al","Americium":"Am","Argon":"Ar","Arsenic":"As","Astatine":"At","Gold":"Au","Boron":"B","Barium":"Ba","Beryllium":"Be","Bohrium":"Bh","Bismuth":"Bi","Berkelium":"Bk","Bromine":"Br","Calcium":"Ca","Cadmium":"Cd","Cerium":"Ce","Californium":"Cf","Chlorine":"Cl","Curium":"Cm","Copernicium":"Cn","Cobalt":"Co","Chromium":"Cr","Caesium":"Cs","Copper":"Cu","Dubnium":"Db","Darmstadtium":"Ds","Dysprosium":"Dy","Erbium":"Er","Einsteinium":"Es","Europium":"Eu","Fluorine":"F","Iron":"Fe","Flerovium":"Fl","Fermium":"Fm","Francium":"Fr","Gallium":"Ga","Gadolinium":"Gd","Germanium":"Ge","Hydrogen":"H","Helium":"He","Hafnium":"Hf","Mercury":"Hg","Holmium":"Ho","Hassium":"Hs","Iodine":"I","Indium":"In","Iridium":"Ir","Potassium":"K","Krypton":"Kr","Lanthanum":"La","Lithium":"Li","Lawrencium":"Lr","Lutetium":"Lu","Livermorium":"Lv","Mendelevium":"Md","Magnesium":"Mg","Manganese":"Mn","Molybdenum":"Mo","Meitnerium":"Mt","Nitrogen":"N","Sodium":"Na","Niobium":"Nb","Neodymium":"Nd","Neon":"Ne","Nickel":"Ni","Nobelium":"No","Neptunium":"Np","Oxygen":"O","Osmium":"Os","Phosphorus":"P","Protactinium":"Pa","Lead":"Pb","Palladium":"Pd","Promethium":"Pm","Polonium":"Po","Praseodymium":"Pr","Platinum":"Pt","Plutonium":"Pu","Radium":"Ra","Rubidium":"Rb","Rhenium":"Re","Rutherfordium":"Rf","Roentgenium":"Rg","Rhodium":"Rh","Radon":"Rn","Ruthenium":"Ru","Sulfur":"S","Antimony":"Sb","Scandium":"Sc","Selenium":"Se","Seaborgium":"Sg","Silicon":"Si","Samarium":"Sm","Tin":"Sn","Strontium":"Sr","Tantalum":"Ta","Terbium":"Tb","Technetium":"Tc","Tellurium":"Te","Thorium":"Th","Titanium":"Ti","Thallium":"Tl","Thulium":"Tm","Uranium":"U","Ununoctium":"Uuo","Ununpentium":"Uup","Ununseptium":"Uus","Ununtrium":"Uut","Vanadium":"V","Tungsten":"W","Xenon":"Xe","Yttrium":"Y","Ytterbium":"Yb","Zinc":"Zn"}
import pandas as pd
data = {'Silver ICP-MS': [1,2,3], 'Aluminium(III)': [4,5,6], 'Sulphate-LECO': [7,8,9], 'NOT a match': [10,11,12]}
df = pd.DataFrame(data)
df
Out[1]:
Aluminium(III) NOT a match Silver ICP-MS Sulphate-LECO
0 4 10 1 7
1 5 11 2 8
2 6 12 3 9
I have tried variations along the lines of:
element_cols=[key for key in dict_of_elemets.keys()in df.columns]
print(element_cols)
but I'm still getting to grips with list comprehensions and it's not working.
Or I can loop through and get the matches in a list and then feed that into df.columns:
cols=[]
for i in df.columns:
for key, value in dict_of_elements.items():
i= i.replace(key, value)
cols.append(i)
df.columns=cols
df
which gives:
Out[3]:
Al(III) NOT a match Ag ICP-MS SO4-LECO
0 4 10 1 7
1 5 11 2 8
2 6 12 3 9
But the entire header should be replaced with the symbol "Ag, SO4...".