Find and replace strings with both numbers and letters in pandas

Question

I receive a vector of scores from our system that generally contains numeric values (albeit stored as character strings of the form 00XXX) or No Hit statements. Sometimes we have processing errors and receive strings containing both letters and numbers (e.g. 00F69), which cause errors in a later process. I would like to replace these with blanks, which are also a valid entry. I assume a regular expression would be the right tool, but so far I haven't gotten the pattern right.

Here is how I fixed two problematic values in a subset. In our batch data other patterns can come up, so I want something more robust.

import pandas as pd
df = pd.DataFrame(data = {'Score': ['00599', 'NO HIT', '00800', '00B66', '00750', '0010E', '00900', '']})
df["Score"] = df["Score"].replace("00B66", "") 
df["Score"] = df["Score"].replace("0010E", "") 

df

My attempt with a regular expression, below, doesn't seem to work: the column does not change.

import re
df = pd.DataFrame(data = {'Score': ['00599', 'NO HIT', '00800', '00B66', '00750', '0010E', '00900', '']})
regex = '^(?=.*[0-9]$)(?=.*[a-zA-Z])'
df['Score2']= [re.sub(regex, '', str(x)) for x in df['Score']]
df
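A likely reason the column does not change: the pattern consists only of the `^` anchor and two lookaheads, which assert but consume no characters. When it matches, it matches an empty string at position 0, so `re.sub` replaces nothing with nothing. The `$` inside the first lookahead also requires the string to end in a digit, so `0010E` would never match at all. The following is a minimal sketch of one possible fix, assuming the rule is "any string containing at least one digit and at least one ASCII letter"; the name `mixed` is introduced here for illustration only.

```python
import re
import pandas as pd

df = pd.DataFrame(data={'Score': ['00599', 'NO HIT', '00800', '00B66',
                                  '00750', '0010E', '00900', '']})

# The two lookaheads only check that a digit and a letter occur somewhere;
# the trailing .* is what actually consumes the string so it can be replaced.
mixed = re.compile(r'^(?=.*[0-9])(?=.*[a-zA-Z]).*$')

df['Score2'] = [mixed.sub('', str(x)) for x in df['Score']]
print(df)
```

Under that assumption, `00B66` and `0010E` become empty strings, while the purely numeric values, `NO HIT`, and the existing blank are left untouched.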

So, what is the rule? You want to replace all strings with 2 initial zeros followed by any alphanumerics? — Wiktor Stribiżew, Jan 22 '20 at 16:44
Replace any string that has **both** numbers and letters with a blank. — DarknessFalls, Jan 22 '20 at 16:45
`df['Score'].str.replace('^([a-zA-Z]+[0-9]+|[0-9]+[a-zA-Z]+)[a-zA-Z0-9]*$', '')`? — Wiktor Stribiżew, Jan 22 '20 at 16:46
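The rule stated in the comments can be sketched in pandas in two ways. This is a hedged illustration, not a confirmed answer: it assumes "letters" means ASCII `a-z`/`A-Z`, and the names `has_digit`, `has_letter`, `via_mask`, and `via_replace` are made up for the example. Note that `Series.str.replace` treats the pattern as a literal string unless `regex=True` is passed in recent pandas versions (the default was changed in pandas 2.0), so the flag is given explicitly here.

```python
import pandas as pd

df = pd.DataFrame(data={'Score': ['00599', 'NO HIT', '00800', '00B66',
                                  '00750', '0010E', '00900', '']})

# Option 1: build two boolean masks and blank the rows where both are True.
has_digit = df['Score'].str.contains(r'[0-9]')
has_letter = df['Score'].str.contains(r'[a-zA-Z]')
via_mask = df['Score'].mask(has_digit & has_letter, '')

# Option 2: the pattern suggested in the comment, with regex=True made explicit.
# It only matches strings made up entirely of letters and digits.
pattern = r'^([a-zA-Z]+[0-9]+|[0-9]+[a-zA-Z]+)[a-zA-Z0-9]*$'
via_replace = df['Score'].str.replace(pattern, '', regex=True)

print(via_mask.tolist())
print(via_replace.tolist())
```

On this sample both options give the same result. They can differ on other inputs: the mask version also blanks mixed strings that contain spaces or punctuation (for example a hypothetical `00 B6`), whereas the comment's pattern leaves those alone.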
