
I have two data frames:

> a1
v1  v2  v3
ABCA1   --> GIF
ACTA1   --| CSNK2A1
ACTN4   --| HDAC7
ACTN4   --> RARA

> a2
v1  v2  v3
ABCA1   --| GIF
ACTA1   --| CSNK2A1
ABCD2   --| HDAC7
ACTN4   --> XYZ1

I want the rows where a1$v1 == a2$v1 && a1$v3 == a2$v3 && a1$v2 != a2$v2. So the outcome should be:

> a3
ABCA1   --> GIF

Row 1 fulfills all three conditions. In row 2, condition 3 is not fulfilled; in row 3, condition 1 is not fulfilled; and in row 4, condition 2 is not fulfilled.

user3253470

1 Answer


If we compare the corresponding columns of 'a1' and 'a2' row by row and use the elementwise & instead of &&, we get the expected output:

a1[(a1$v1==a2$v1) & (a1$v3==a2$v3) & (a1$v2 != a2$v2), , drop=FALSE]
#    v1  v2  v3
#1 ABCA1 --> GIF

According to the description in ?"&&":

‘&’ and ‘&&’ indicate logical AND and ‘|’ and ‘||’ indicate logical OR. The shorter form performs elementwise comparisons in much the same way as arithmetic operators. The longer form evaluates left to right examining only the first element of each vector.
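The difference quoted above can be seen on two short logical vectors. This is only an illustrative sketch: the vectors `x` and `y` are made up for the example and are not part of the question's data. Note also that, as far as I know, recent R versions (4.3.0 onward) raise an error when `&&` is given a vector longer than one instead of silently using its first element, so the sketch passes `&&` only single values.

```r
# Hypothetical logical vectors, used only to illustrate the two operators
x <- c(TRUE, TRUE, FALSE)
y <- c(TRUE, FALSE, FALSE)

# Elementwise AND: one result per position, which is what row indexing needs
elementwise <- x & y
print(elementwise)

# Scalar AND: meant for single TRUE/FALSE values, e.g. inside if()
scalar <- isTRUE(x[1]) && isTRUE(y[1])
print(scalar)
```

Because `a1[...]` needs one TRUE/FALSE per row, the elementwise form is the one that fits when subsetting a data frame.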

Update

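If we need to compare each row in 'a1' against all the rows of 'a2', we can paste the columns of each dataset into one string per row using do.call(paste, ...), and then either loop over the pasted rows of 'a1' with lapply and compare each against the pasted rows of 'a2', or do the same in one step with outer. Note that both forms test whether entire rows are identical.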

 lapply(do.call(paste, a1), '==', do.call(paste, a2))

Or

 outer(do.call(paste, a1), do.call(paste, a2), '==')
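If the goal is instead the one described in the comments (any row of 'a1' against any row of 'a2', with 'v1' and 'v3' equal and 'v2' different), one possible approach is to join the two data frames on 'v1' and 'v3' with `merge` and then keep the pairs whose 'v2' values differ. This is a sketch under that assumption, not necessarily what was intended; the `.a1`/`.a2` suffixes and the names `pairs` and `a3` are chosen here for illustration, and 'a2' has its rows 1 and 3 swapped, as suggested in the comments, to show that row order does not matter.

```r
a1 <- data.frame(
  v1 = c("ABCA1", "ACTA1", "ACTN4", "ACTN4"),
  v2 = c("-->", "--|", "--|", "-->"),
  v3 = c("GIF", "CSNK2A1", "HDAC7", "RARA"),
  stringsAsFactors = FALSE
)

# Same rows as a2 in the question, but with rows 1 and 3 swapped
a2 <- data.frame(
  v1 = c("ABCD2", "ACTA1", "ABCA1", "ACTN4"),
  v2 = c("--|", "--|", "--|", "-->"),
  v3 = c("HDAC7", "CSNK2A1", "GIF", "XYZ1"),
  stringsAsFactors = FALSE
)

# Inner join on v1 and v3: every a1 row is paired with every a2 row
# that has the same v1 and v3; the v2 columns get the given suffixes
pairs <- merge(a1, a2, by = c("v1", "v3"), suffixes = c(".a1", ".a2"))

# Keep only the pairs whose v2 values differ
a3 <- pairs[pairs$v2.a1 != pairs$v2.a2, , drop = FALSE]
print(a3)
```

With this data, only the ABCA1/GIF pair remains, with `-->` in 'a1' and `--|` in 'a2'.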

data

a1 <- structure(list(v1 = c("ABCA1", "ACTA1", "ACTN4", "ACTN4"),
 v2 = c("-->", 
"--|", "--|", "-->"), v3 = c("GIF", "CSNK2A1", "HDAC7", "RARA"
)), .Names = c("v1", "v2", "v3"), class = "data.frame", 
row.names = c(NA, -4L))

a2 <- structure(list(v1 = c("ABCA1", "ACTA1", "ABCD2", "ACTN4"), 
v2 = c("--|", 
"--|", "--|", "-->"), v3 = c("GIF", "CSNK2A1", "HDAC7", "XYZ1"
)), .Names = c("v1", "v2", "v3"), class = "data.frame",
row.names = c(NA, -4L))
akrun
  • @user3253470 It gave the expected output as you showed based on the input datasets. – akrun Oct 05 '15 at 16:29
  • @user3253470 Check whether the columns have leading/trailing spaces. In that case `==` may return `FALSE`, i.e. `"-->" == " -->" #[1] FALSE` – akrun Oct 05 '15 at 16:33
  • @user3253470 If that is the case, use `library(stringr); a1[] <- lapply(a1, str_trim); a2[] <- lapply(a2, str_trim)` and then proceed as before. – akrun Oct 05 '15 at 16:37
  • Thanks, it's done. I don't have enough score to upvote the answer! – user3253470 Oct 06 '15 at 08:56
  • There is a little problem. I want to compare if "any" `a1$v1 == a2$v1` & `a1$v3 == a2$v3` & `a1$v2 != a2$v2`. This code is making a comparison of every particular row (for example Row 1 of `a1` vs Row 1 of `a2`). Can you help me on this? @akrun – user3253470 Oct 06 '15 at 14:18
  • I think my comment is confusing. In other words, take 1 row from `a1` and compare it against all rows of `a2` for fulfilling the above condition. Not just a comparison of Row 1 from `a1` to Row 1 from `a2` and Row 2 from `a1` to Row 2 from `a2` and so on. – user3253470 Oct 06 '15 at 14:23
  • Try `lapply(do.call(paste, a1), '==', do.call(paste, a2))` or `outer(do.call(paste, a1), do.call(paste, a2), '==')` – akrun Oct 06 '15 at 14:32
  • Sorry for bothering you again. I think I am still not clear. `ABCA1` is `a1$V1`, `-->` is `a1$V2` and `GIF` is `a1$V3`, overall this comprises Row 1 of `a1`. Now apply above condition (`a1[(a1$v1==a2$v1) & (a1$v3==a2$v3) & (a1$v2 != a2$v2), , drop=FALSE]`) for row 1 of `a1` vs. every row of `a2`. Where all 3 conditions meet, print only those lines. – user3253470 Oct 06 '15 at 14:51
  • @user3253470 You mentioned to compare the rows of one against the other. Sorry. – akrun Oct 06 '15 at 14:52
  • The previous code was comparing row-wise. It worked for the example because the rows that fulfilled the conditions were in the same position (e.g. row 1 of a1 vs. row 1 of a2). If I disturb the row order, that code doesn't work. – user3253470 Oct 06 '15 at 14:59
  • @user3253470 Based on the example, wouldn't it be all `FALSE` after comparing? – akrun Oct 06 '15 at 15:05
  • Yes you're right. Instead of giving hard coded position e.g `a1$V1` it should be something else. Just re rank the data.frame a2, replace the row 1 with row 3 and vice versa. Now I want the same result as with old `ABCA1 --> GIF`. – user3253470 Oct 06 '15 at 15:11
  • @user3253470 If you want the same result as before, why are you doing it this way? – akrun Oct 06 '15 at 15:12
  • After this re-ranking, row 1 of a1 and row 3 of a2 will fulfill the matching condition and should be printed as the result. As my dataset is quite big, there should be a general approach for matching: any row vs. any row comparison, where the 1st and 3rd elements of those rows are the same and the 2nd element is not. – user3253470 Oct 06 '15 at 15:14
  • @user3253470 If you are comparing any rows, then use `%in%` instead of `==` – akrun Oct 06 '15 at 15:17
  • In my actual data, a1 comprises 2919 rows and a2 comprises 3623 rows. I want every row in both data.frames where elements 1 and 3 are identical (`(a1$element1==a2$element1) & (a1$element3==a2$element3)`) while element 2 is not identical (`(a1$element2 != a2$element2)`) – user3253470 Oct 06 '15 at 15:18
  • @user3253470 I guess this should be a separate question as the original question was different. – akrun Oct 06 '15 at 15:19
  • And what I should use against `!=` ?? – user3253470 Oct 06 '15 at 15:20
  • @user3253470 You can do `!a1$element2 %in% a2$element2` – akrun Oct 06 '15 at 15:22
  • How to represent `element` ?? I have posted the new question, perhaps you can help me there [link](http://stackoverflow.com/questions/32973838/comparing-elements-of-data-frames-in-r) – user3253470 Oct 06 '15 at 15:31
  • @user3253470 I think you need to create a separate example where this solution won't give the same output as you desired so that it becomes less confusing. – akrun Oct 06 '15 at 15:32
  • [Here is the new question](http://stackoverflow.com/questions/32973838/comparing-elements-of-data-frames-in-r) – user3253470 Oct 06 '15 at 15:38
  • @user3253470 Glad to know that the problem is solved. – akrun Oct 08 '15 at 08:56
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/91716/discussion-between-user3253470-and-akrun). – user3253470 Oct 08 '15 at 09:07
  • can you help me with this question: [http://stackoverflow.com/questions/36420909/compare-matrices-to-find-the-differences] – user3253470 Apr 05 '16 at 08:56