We have students answer MCQs after each lesson on Socrative. They enter their name first, then answer. For each lesson, we collect the data from the Socrative platform, but we have trouble normalizing the names, so that variants such as 'John Doe', 'johndoe' or 'John,Doe' are all transformed into 'doe', as it is written in our main file.
Our main file for following up on students (loaded as a dataframe in Python) initially has just one column, the name (as a string, e.g. 'doe' for Mr. John Doe).
I'd like to write a function that goes through the 'name' column of my lesson1 dataframe and, for each value, replaces the badly typed name with the reference name.
To lowercase the names, strip surrounding spaces and remove punctuation, I've used the following code:
import re
# Lowercase, trim surrounding whitespace, then drop every non-alphanumeric character
lesson1["name"] = lesson1["name"].str.lower()
lesson1["name"] = lesson1["name"].str.strip()
lesson1["name"] = lesson1["name"].apply(lambda x: re.sub(r'[^A-Za-z0-9]+', '', x))
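For anyone who wants to reproduce the cleaning step, here is a minimal sketch on made-up data. The sample values are hypothetical (taken from the examples above), and `clean_names` is only an illustrative helper name, not something from pandas. It uses the vectorized `.str.replace(..., regex=True)` in place of `apply` with `re.sub`, which should give the same result.

```python
import pandas as pd

def clean_names(names: pd.Series) -> pd.Series:
    """Lowercase, trim, and drop every character that is not a-z or 0-9."""
    return (
        names.str.lower()
        .str.strip()
        .str.replace(r"[^a-z0-9]+", "", regex=True)
    )

# Hypothetical input, mirroring the badly typed variants described above
sample = pd.Series(["John Doe", "johndoe", "John,Doe"])
cleaned = clean_names(sample)
print(cleaned.tolist())  # ['johndoe', 'johndoe', 'johndoe']
```

After this step every variant is 'johndoe', so what remains is mapping 'johndoe' to the reference name 'doe'.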
Then I want to replace the 'name' values with the reference name where necessary. I've tried the following code on 2 lists:
bad = lesson1['name']
good = reference['name']
def changenames(lesson_list, reference_list):
    for i, name in enumerate(lesson_list):
        for j, ref in enumerate(reference_list):
            if ref in name:
                lesson_list[i] = ref
changenames(bad, good)
but 1/ it does not work because of a SettingWithCopyWarning, and 2/ I can't manage to apply it to a column of the dataframe.
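To make the goal concrete, here is a minimal sketch of the replacement I am after, on made-up data. It assumes that each cleaned name contains at most one reference name as a substring (the first match wins) and that unmatched names should be left unchanged. `match_reference` is a hypothetical helper name. The sketch builds a new column and assigns it to the dataframe as a whole, instead of writing element by element into a Series taken from it. If lesson1 is itself a filtered slice of another dataframe, taking a `.copy()` first may also be needed.

```python
import pandas as pd

# Hypothetical data: the reference file and the already cleaned lesson names
reference = pd.DataFrame({"name": ["doe", "smith"]})
lesson1 = pd.DataFrame({"name": ["johndoe", "asmith", "zed"]})

def match_reference(name, reference_names):
    """Return the first reference name found inside `name`, else `name` unchanged."""
    for ref in reference_names:
        if ref in name:
            return ref
    return name

ref_names = reference["name"].tolist()
# Compute the whole column at once and assign it back to the dataframe
lesson1["name"] = lesson1["name"].map(lambda n: match_reference(n, ref_names))
print(lesson1["name"].tolist())  # ['doe', 'smith', 'zed']
```

Note that plain substring matching can mis-assign names when one reference name is contained in another (for example 'doe' and 'doerr'), so this is only a starting point.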
Could you help me? Thanks, L.