I am having trouble concatenating two dataframes under a certain condition. I have looked at several posts but found no solution that helped me.

Here is my data :

Dataframe 1 :

"year"     "var"          "x"                 "y"               "info"
"1992","mean_ndvi","4878686.57157449","5393968.15997648","0.386875003576279"
"1992","mean_ndvi","4896433.83572102","5398120.2484886","0.373374998569489"
"1992","mean_ndvi","4900572.93504345","5370687.20427196","0.394125014543533"
"1992","mean_ndvi","4902934.77310431","5361773.82267221","0.271333336830139"
"1992","mean_ndvi","4763325.11415408","5286260.42907455","0.341958343982697"
"1992","mean_ndvi","4659782.7218849","5251960.76092113","0.407333344221115"
"1992","mean_ndvi","4672416.53746615","5253639.4841048","0.443416655063629"
"1992","mean_ndvi","4688194.71187035","5255824.40292703","0.334916681051254"
"1992","mean_ndvi","4697653.82879809","5257181.46577816","0.367166668176651"

Dataframe 2 :

"year"         "x"             "y"             "species"
 "2014" "4001758.3924046" "3138415.9463486"     "Sus scrofa"
 "2016" "3990684.89200331" "3088575.79671371" "Capreolus capreolus"
 "2014" "4002641.44272945" "3078682.12799716" "Capreolus capreolus"
 "2014" "3946723.09681777" "3153792.59524072" "Capreolus capreolus"
 "2014" "3975356.46700669" "2974349.6604129" "Cervus elaphus"
 "2014" "4001283.9265329" "3137527.57584417" "Capreolus capreolus"
 "2014" "3946723.09681777" "3153792.59524072" "Capreolus capreolus"
 "2014" "3946723.09681777" "3153792.59524072" "Capreolus capreolus"
 "2017" "4000195.01511827" "3103181.07855945" "Capreolus capreolus"

The first dataframe contains far more data than the second one. What I want to do is concatenate the two dataframes and keep only the rows from the first dataframe that also appear in the second dataframe.

I tried different methods (select and filter, merge, cbind, and doing it "by hand" with for loops), but I can't manage to obtain anything that works.

I also spent a lot of time looking for a solution online. Either I can't see how to adapt an existing solution to my problem, or nobody has had the same problem, or I haven't searched enough.

If you have any clue how I could do this, please share it; I know it can be very simple.

Dataframe 1 :

"1992","mean_ndvi","4688194.71187035","5255824.40292703","0.334916681051254"
"1992","mean_ndvi","4697653.82879809","5257181.46577816","0.367166668176651"
"1992","mean_ndvi","4657938.8843526","5242452.09422199","0.43491667509079"
"1992","mean_ndvi","4661111.26475011","5242863.65256642","0.523041665554047"
"1992","mean_ndvi","4692800.91855509","5247191.53424558","0.405791670084"

Dataframe 2 :

"2014" "4001758.3924046" "3138415.9463486" "Sus scrofa"
"2016" "3990684.89200331" "3088575.79671371" "Capreolus capreolus"
"1992" "4657938.8843526" "5242452.09422199" "Capreolus capreolus"
"2017" "4000167.53545378" "3103446.42513062" "Sus scrofa"
"1992" "4688194.71187035" "5255824.40292703" "Capreolus capreolus"

Result : 

"1992" "4657938.8843526" "5242452.09422199" "Capreolus capreolus" "0.43491667509079"
"1992" "4688194.71187035" "5255824.40292703" "Capreolus capreolus" "0.334916681051254"

Here is the output of dput (the first 10 rows):

First dataframe (with a lot of data)
structure(list(x = c(4878686.57157449, 4896433.83572102, 4900572.93504345, 
4902934.77310431, 4763325.11415408, 4659782.7218849, 4672416.53746615, 
4688194.71187035, 4697653.82879809, 4657938.8843526), y = c(5393968.15997648, 
5398120.2484886, 5370687.20427196, 5361773.82267221, 5286260.42907455, 
5251960.76092113, 5253639.4841048, 5255824.40292703, 5257181.46577816, 
5242452.09422199), year = c(1993L, 1993L, 1993L, 1993L, 1993L, 
1993L, 1993L, 1993L, 1993L, 1993L), info = c(0.396166652441025, 
0.373374998569489, 0.394125014543533, 0.28979167342186, 0.344375014305115, 
0.414458334445953, 0.416541665792465, 0.342583328485489, 0.378208339214325, 
0.440750002861023)), .Names = c("x", "y", "year", "info"), row.names = c(NA, 
10L), class = "data.frame")

For the other dataframe, dput returns the whole data frame. I don't understand why, so I can't post that output here; it doesn't make any sense.

1 Answer

Try this:

# Get fancy data
set.seed(666)

df1 <- iris[sample(x = 1:10, size = 6, replace = FALSE),]
df2 <- iris[sample(x = 1:10, size = 6, replace = FALSE),]

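# Get common rows: build one key string per row, then match the keys.
# do.call(paste, ...) pastes column-wise and avoids the as.matrix()
# number formatting that apply() would trigger on a data.frame.
key1 <- do.call(paste, c(df1, sep = "-"))
key2 <- do.call(paste, c(df2, sep = "-"))
index <- match(key1, key2)
index <- index[!is.na(index)]

df3 <- df2[index, ]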

As you can see, df3 will be a data.frame containing only the rows of df2 that also occur in df1.
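
The approach above compares whole rows, whereas the two dataframes in the question share only some columns. A minimal sketch of an alternative, assuming "year", "x" and "y" are stored as numbers (as in the dput output) and are the columns that identify a row: an inner join with base R's merge(). The small data frames below are rebuilt from the sample rows in the question; the name result is only illustrative.

```r
# Sample rows from the question, assuming numeric year/x/y columns
df1 <- data.frame(
  year = c(1992L, 1992L, 1992L, 1992L, 1992L),
  var  = "mean_ndvi",
  x    = c(4688194.71187035, 4697653.82879809, 4657938.8843526,
           4661111.26475011, 4692800.91855509),
  y    = c(5255824.40292703, 5257181.46577816, 5242452.09422199,
           5242863.65256642, 5247191.53424558),
  info = c(0.334916681051254, 0.367166668176651, 0.43491667509079,
           0.523041665554047, 0.405791670084),
  stringsAsFactors = FALSE
)

df2 <- data.frame(
  year    = c(2014L, 2016L, 1992L, 2017L, 1992L),
  x       = c(4001758.3924046, 3990684.89200331, 4657938.8843526,
              4000167.53545378, 4688194.71187035),
  y       = c(3138415.9463486, 3088575.79671371, 5242452.09422199,
              3103446.42513062, 5255824.40292703),
  species = c("Sus scrofa", "Capreolus capreolus", "Capreolus capreolus",
              "Sus scrofa", "Capreolus capreolus"),
  stringsAsFactors = FALSE
)

# Inner join (merge's default): keep only rows whose (year, x, y)
# combination occurs in both data frames, and attach "info" to them
result <- merge(df2, df1[, c("year", "x", "y", "info")],
                by = c("year", "x", "y"))

print(result)
```

Because the join keys are floating-point coordinates, this only matches when the values are identical in both dataframes. If the coordinates come from different computations, rounding "x" and "y" to a common precision in both dataframes before merging may be necessary.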