I have a DataFrame (df) of journals with many different titles.
I want to extract only the journals whose title is exactly one of the following:
Blood, Cancer, Chest, Circulation, Diabetes, JAMA, Endocrinology, Gastroenterology, Gut, Medicine, Neurology, Pediatrics, Physical therapy, Radiology, Surgery, Geriatrics
Some other journal titles contain the same words (Blood circulation, Cancer History, etc.). I do not want to select those.
Example
Id Title
1 Blood
2 Blood
3 Blood purification
4 Blood transfusion
5 Cancer
6 Chest
7 Cancer History
8 Chest Analysis
I want to keep only the exact journal titles and create a new column "Influential", but I cannot find a way to do it with str.contains or str.match.
I have tried two approaches:
df.loc[df['Title'].str.contains("Blood", case = True, na = False), 'Influential'] = 'Blood'
df.loc[df['Title'].str.match("Blood", case = True, na = False), 'Influential'] = 'Blood'
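For reference, here is a small sketch of what these two calls select on the example data. The DataFrame is rebuilt from the example above, and the names `contains_mask` and `match_mask` are only illustrative. `str.contains` looks for the pattern anywhere in the string, and `str.match` anchors it only at the start of the string, so both also pick up the longer titles that begin with "Blood".

```python
import pandas as pd

# Example data rebuilt from the table in the question
df = pd.DataFrame({
    "Id": range(1, 9),
    "Title": ["Blood", "Blood", "Blood purification", "Blood transfusion",
              "Cancer", "Chest", "Cancer History", "Chest Analysis"],
})

# str.contains: True if the pattern occurs anywhere in the title
contains_mask = df["Title"].str.contains("Blood", case=True, na=False)

# str.match: True if the pattern matches at the start of the title
# (the end of the title is not anchored)
match_mask = df["Title"].str.match("Blood", case=True, na=False)

print(df.loc[contains_mask, "Title"].tolist())
# ['Blood', 'Blood', 'Blood purification', 'Blood transfusion']
print(df.loc[match_mask, "Title"].tolist())
# ['Blood', 'Blood', 'Blood purification', 'Blood transfusion']
```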
Expected output with the exact title of the journal:
Id Title Influential
1 Blood Blood
2 Blood Blood
3 Blood purification NA
4 Blood transfusion NA
5 Cancer Cancer
6 Chest Chest
7 Cancer History NA
8 Chest Analysis NA
Should I do this with a regex somehow? Thanks.
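One possible way to get the expected output, as a minimal sketch rather than the only approach: since whole titles have to be equal, a regex may not be needed at all. `Series.isin` tests exact membership in a list, and `Series.where` keeps the title where the test is true and leaves a missing value elsewhere. The names `influential` and `exact` are illustrative, and the example assumes titles match the list exactly, including case and surrounding whitespace. A regex variant with `str.fullmatch` (available in pandas 1.1 and later) is shown for comparison.

```python
import re
import pandas as pd

# Titles to keep, copied from the list in the question
influential = ["Blood", "Cancer", "Chest", "Circulation", "Diabetes", "JAMA",
               "Endocrinology", "Gastroenterology", "Gut", "Medicine",
               "Neurology", "Pediatrics", "Physical therapy", "Radiology",
               "Surgery", "Geriatrics"]

# Example data rebuilt from the table in the question
df = pd.DataFrame({
    "Id": range(1, 9),
    "Title": ["Blood", "Blood", "Blood purification", "Blood transfusion",
              "Cancer", "Chest", "Cancer History", "Chest Analysis"],
})

# Exact membership test: True only when the whole title is in the list
exact = df["Title"].isin(influential)

# Copy the title where it is an exact match, leave a missing value otherwise
df["Influential"] = df["Title"].where(exact)

# Regex alternative: fullmatch requires the pattern to cover the entire title
pattern = "|".join(re.escape(title) for title in influential)
exact_regex = df["Title"].str.fullmatch(pattern, na=False)

print(df)
```

If the real titles differ in case or have stray spaces, normalising them first (for example with `str.strip()`) before calling `isin` would be one way to handle that, at the cost of no longer being a strictly exact comparison.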