I have a dataframe df:
chr gene_name
0 1 ARF3
1 1 ABC
2 1 ARF3, ENSG123
3 1 ENSG,ARF3
4 1 ANG
5 2 XVY
6 2 PQR
7 3 RST
8 4 TAC
and a gene_list:
gene_list = ['ARF3','ABC' ]
Now I need to get the rows from the data frame (df) for which the gene name either exactly matches an element of gene_list or contains one as a substring.
So, I tried:
df2 = df[df.gene_name.isin(gene_list)]
I retrieved:
chr gene_name
0 1 ARF3
1 1 ABC
but what I am expecting is:
chr gene_name
0 1 ARF3
1 1 ABC
2 1 ARF3, ENSG123
3 1 ENSG,ARF3
so basically all the rows in the data frame where an element of gene_list is a substring of gene_name.
I would have thought of using .contains() if I were looking the other way round, that is, if gene_name in the data frame were a substring of an element in gene_list.
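For reference, here is a minimal sketch of the substring match described above. It assumes that joining the genes into one regular expression and passing it to Series.str.contains() is an acceptable way to test "any element of gene_list appears inside gene_name". The name pattern and the way the frame is rebuilt are only for illustration.

```python
import re

import pandas as pd

# Rebuild the example frame from the question (default 0..8 index)
df = pd.DataFrame({
    "chr": [1, 1, 1, 1, 1, 2, 2, 3, 4],
    "gene_name": ["ARF3", "ABC", "ARF3, ENSG123", "ENSG,ARF3", "ANG",
                  "XVY", "PQR", "RST", "TAC"],
})
gene_list = ["ARF3", "ABC"]

# Illustrative name: one alternation such as "ARF3|ABC".
# re.escape protects against regex metacharacters in gene names.
pattern = "|".join(re.escape(gene) for gene in gene_list)

# Keep rows whose gene_name contains any listed gene as a substring
df2 = df[df["gene_name"].str.contains(pattern, regex=True)]
print(df2)
```

Note that this is plain substring matching, so a hypothetical gene_name such as ARF31 would also be kept. If only whole gene names should match, the comma-separated values would have to be split first.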
Any help is appreciated.