I have two DataFrames. One looks like this:
df1.head()
#CHR Start End Name
chr1 141474 173862 SAP
chr1 745489 753092 ARB
chr1 762988 794826 SAS
chr1 1634175 1669127 ETH
chr1 2281853 2284259 BRB
And the second DataFrame looks as follows:
df2.head()
#chr start end
chr1 141477 173860
chr1 745500 753000
chr16 56228385 56229180
chr11 101785507 101786117
chr7 101961796 101962267
I am looking to map the first three columns from the two DataFrames and create a new DataFrame, df3. For example, if #chr from both df1 and df2 are equal, then look for df2.start >= df1.start and df2.end <= df1.end.
If this is the case, print it out as the following:
df3.head()
#chr start end Name
chr1 141477 173860 SAP
chr1 745500 753000 ARB
So far I have tried to create a function for doing this:
def start_smaller_than_end(df1,df2):
    if df1.CHR == df2.CHR:
        df2.start >= df1.Start
        df2.End <= df2.End
    return df3
However, when I run it I get the following error:
df3(df1, df2)
name 'df3' is not defined
Any suggestions and help are greatly appreciated.
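For reference, the matching rule described above (same chromosome, and the df2 interval lying inside the df1 interval) could be sketched roughly as follows. This is only a minimal sketch, assuming pandas is available and that the columns are named exactly as in the head() output shown above; the function name contained_intervals is made up for illustration. It joins the two frames on the chromosome column and then keeps the rows where the containment condition holds.

```python
import pandas as pd

# Sample data copied from the df1.head() and df2.head() output above.
df1 = pd.DataFrame({
    "#CHR": ["chr1", "chr1", "chr1", "chr1", "chr1"],
    "Start": [141474, 745489, 762988, 1634175, 2281853],
    "End": [173862, 753092, 794826, 1669127, 2284259],
    "Name": ["SAP", "ARB", "SAS", "ETH", "BRB"],
})
df2 = pd.DataFrame({
    "#chr": ["chr1", "chr1", "chr16", "chr11", "chr7"],
    "start": [141477, 745500, 56228385, 101785507, 101961796],
    "end": [173860, 753000, 56229180, 101786117, 101962267],
})


def contained_intervals(df1, df2):
    """Hypothetical helper: df2 intervals that fall inside a df1 interval."""
    # Pair every df2 row with every df1 row on the same chromosome.
    merged = df2.merge(df1, left_on="#chr", right_on="#CHR", how="inner")
    # Keep only the pairs where the df2 interval lies within the df1 interval.
    inside = (merged["start"] >= merged["Start"]) & (merged["end"] <= merged["End"])
    return merged.loc[inside, ["#chr", "start", "end", "Name"]].reset_index(drop=True)


df3 = contained_intervals(df1, df2)
print(df3)
```

Note that the merge builds every same-chromosome pair before filtering, so on large inputs the intermediate frame may become very big; a dedicated interval-overlap approach might be a better fit in that case.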