I'd like to build a new DataFrame from two differently structured DataFrames, matching rows on a condition while keeping an extra column (sample_id). My first DataFrame is:
sample_id motif chromosome position
1 CT-G.A chr1 7300
1 TA-C.C chr1 1000
1 TC-G.C chr2 1200
1 TC-G.C chr2 3000
2 CG-A.T chr2 12898
2 CA-G.T chr2 234235
The second DataFrame is:
geneID chromosome start end
E1 chr1 100 10300
E2 chr1 1100 20122
E3 chr2 1200 2000
E4 chr2 400 234236
E5 chr2 12000 20000
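I want to build the new DataFrame using this condition:
if (first$chromosome == second$chromosome & second$start <= first$position & first$position <= second$end)
When it holds, the motif lies in that gene. The result should have one row per sample_id and one column per (geneID, motif) pair, where each cell counts how many rows of the first DataFrame satisfy the condition. Hence I want to create this DataFrame: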
sample_id E1,CT-G.A E1,TA-C.C E1,TC-G.C E1,TC-G.C E1,CG-A.T E1,CA-G.T E2,CT-G.A E2,TA-C.C E2,TC-G.C E2,CG-A.T E2,CA-G.T E3,CT-G.A E3,TA-C.C E3,TC-G.C E3,CG-A.T E3,CA-G.T E4,CT-G.A E4,TA-C.C E4,TC-G.C E4,CG-A.T E4,CA-G.T E5,CT-G.A E5,TA-C.C E5,TC-G.C E5,CG-A.T E5,CA-G.T E6,CT-G.A E6,TA-C.C E6,TC-G.C E6,CG-A.T E6,CA-G.T
1 1 1 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0
2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 1 0 0 0 0 0 0
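To pin down the rule, the nonzero cells above can be reproduced with a naive nested loop. This is only a sketch in plain Python, not the DataFrame solution I'm after; the names `motifs`, `genes`, and `count_motifs_in_genes` are made up for illustration, and (geneID, motif) pairs that never match are simply absent, which stands for a count of 0.

```python
from collections import Counter

# Rows of the first DataFrame: (sample_id, motif, chromosome, position)
motifs = [
    (1, "CT-G.A", "chr1", 7300),
    (1, "TA-C.C", "chr1", 1000),
    (1, "TC-G.C", "chr2", 1200),
    (1, "TC-G.C", "chr2", 3000),
    (2, "CG-A.T", "chr2", 12898),
    (2, "CA-G.T", "chr2", 234235),
]

# Rows of the second DataFrame: (geneID, chromosome, start, end)
genes = [
    ("E1", "chr1", 100, 10300),
    ("E2", "chr1", 1100, 20122),
    ("E3", "chr2", 1200, 2000),
    ("E4", "chr2", 400, 234236),
    ("E5", "chr2", 12000, 20000),
]


def count_motifs_in_genes(motifs, genes):
    """Count, per sample, how often each (geneID, motif) pair meets the condition."""
    counts = {}
    for sample_id, motif, chrom, pos in motifs:
        for gene_id, gene_chrom, start, end in genes:
            # Same chromosome, and position inside [start, end] (both ends inclusive)
            if chrom == gene_chrom and start <= pos <= end:
                counts.setdefault(sample_id, Counter())[(gene_id, motif)] += 1
    return counts


counts = count_motifs_in_genes(motifs, genes)
for sample_id in sorted(counts):
    print(sample_id, dict(counts[sample_id]))
```

The loop compares every motif row with every gene row, so it is fine for a toy example like this but would be slow on large inputs; it is meant only to make the expected counts unambiguous.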