I am creating a new column in a pandas DataFrame that should hold the short name of each operating system. I am using a regex and need to exact-match the words I want to exclude from the selection. However, when I change the regex so that it does not select those words, it stops matching them exactly. I have read as many posts here about regex exact word matching as I could find, and none of the solutions work.
For example, I have data that looks like this:
Android 10kdsh
Chrome OS
Linux ddk2
OS X 10.
Windows 7
iOS c
and I want it to look like this:
Android
Chrome
Linux
OS X
Windows
iOS
I tried code as follows:
def short_OS(webchat):
    webchat["OS"] = webchat["Operating System"].str.replace('[^(Android|^OS X|^Chrome|^Linux|^Windows|^iOS)]', "", regex=True)
    return webchat
but this leaves some of the unwanted characters in, giving:
Androiddsh
ChromeOS
Linuxdd
OS X
Windows
iOS
Obviously the above are just examples, but the principle is the same: some characters are left in because they also occur in the words I want to keep.
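A minimal reproduction of this behaviour, assuming that pandas' str.replace with regex=True treats the pattern the same way as Python's re.sub (the names pattern, kept and samples are made up for illustration). The square brackets appear to make the whole expression a single negated character class, so it removes every character that is not one of the listed characters, rather than everything that is not one of the listed words:

```python
import re

# Same pattern as in short_OS: the square brackets make this ONE negated
# character class, not a list of alternative words.
pattern = r'[^(Android|^OS X|^Chrome|^Linux|^Windows|^iOS)]'

samples = ["Android 10kdsh", "Chrome OS", "Linux ddk2", "OS X 10.", "Windows 7", "iOS c"]

# The characters that can survive are those written inside the brackets.
kept = set("(Android|^OS X|^Chrome|^Linux|^Windows|^iOS)")

for text in samples:
    result = re.sub(pattern, "", text)
    # Every surviving character is a member of the bracketed set.
    assert all(ch in kept for ch in result)
    print(repr(text), "->", repr(result))
```

This would explain why letters such as d, s and h survive: they occur in "Android", "Windows" and "Chrome", so the class does not remove them.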
I should note that wrapping the words in \b did not change the outcome, and if I use $ for the end of the string, then in the 'Android' example the '10kdsh' is still left in on the same line.
Can anyone help, please?
Thank you.
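One direction that might work, sketched under assumptions: capture the wanted name with an anchored alternation instead of deleting everything else. The helper short_os_name and the constant OS_NAME are hypothetical names, and the sketch uses Python's re module rather than pandas. The same pattern could presumably be passed to pandas' Series.str.extract with expand=False, but that is not tested here.

```python
import re

# Alternation inside a group (parentheses, not square brackets) matches whole words.
# ^ anchors at the start of the string; \b requires the name to end at a word boundary.
OS_NAME = re.compile(r'^(Android|OS X|Chrome|Linux|Windows|iOS)\b')

def short_os_name(text):
    """Return the leading OS name, or None if the text starts with none of them."""
    match = OS_NAME.match(text)
    return match.group(1) if match else None

samples = ["Android 10kdsh", "Chrome OS", "Linux ddk2", "OS X 10.", "Windows 7", "iOS c"]
short_names = [short_os_name(s) for s in samples]
print(short_names)
```

With this approach, a value that starts with none of the listed names comes back as None rather than as a partially stripped string.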