Comparing two data frames in R

I'm just trying to repeat what I did months ago, with the same set of data.

These are my dataframes:

df1
>    1  2  3  
> 1 AT GC CC 
> 2 AG GC CT 
> 3 GG TT <NA>

df2
>    1  2   3  
> 1 AT <NA> GG 
> 2 AG  GC  CG 
> 3 GG  TT  AA

The result I want:
>      1     2     3  
> 1 TRUE <NA>  FALSE 
> 2 TRUE TRUE  FALSE 
> 3 TRUE TRUE  <NA>

The factor columns in my two data frames (df1, df2) don't have the same levels, so I need to convert them to character.

A <- lapply(df1, as.character)
B <- lapply(df2, as.character)

So I do the comparison:

A == B

But I obtain an error:

Error in A == B : 
  comparison of these types is not implemented

str(df1):

'data.frame':   82 obs. of  24 variables:
 $ rs1  : Factor w/ 3 levels "AA","AC","CC": 2 3 3 3 3 1 2 3 3 2 ...
 $ rs2 : Factor w/ 2 levels "TC","TT": 2 2 2 2 2 2 2 2 2 2 ...
 $ rs3 : Factor w/ 2 levels "AG","GG": 2 2 2 2 2 2 1 2 2 2 ...
 $ rs4 : Factor w/ 3 levels "CC","TC","TT": 1 1 1 1 1 2 1 1 1 1 ...
 $ rs5 : Factor w/ 2 levels "TC","TT": 2 2 2 2 2 2 2 2 2 2 ...
 $ rs6 : Factor w/ 3 levels "GG","TG","TT": 3 1 2 2 1 2 1 2 2 1 ...
 $ rs7 : Factor w/ 3 levels "AA","AG","GG": 1 2 2 1 2 1 2 2 2 2 ...
 $ rs8  : Factor w/ 3 levels "AA","AG","GG": 3 2 3 2 3 3 2 2 2 2 ...
 $ rs9 : Factor w/ 3 levels "CC","CG","GG": 3 3 3 3 3 3 3 3 3 3 ...
 $ rs10  : Factor w/ 2 levels "CC","TC": 1 1 1 1 1 1 1 1 2 1 ...
 $ rs11  : Factor w/ 3 levels "GG","TG","TT": 1 2 1 1 2 1 1 1 2 1 ...
 $ rs12  : Factor w/ 2 levels "AC","CC": 2 2 2 2 2 2 2 2 2 2 ...
 $ rs13  : Factor w/ 2 levels "CC","TC": 1 1 1 1 1 1 1 1 1 1 ...
 $ rs14  : Factor w/ 2 levels "CC","TC": 1 1 2 1 1 1 1 1 1 1 ...
 $ rs15  : Factor w/ 2 levels "CG","GG": 2 2 2 2 2 2 2 2 2 2 ...
 $ rs16: Factor w/ 2 levels "AC","CC": 2 2 1 2 2 2 2 2 2 2 ...
 $ rs17  : Factor w/ 3 levels "AA","AG","GG": 3 3 3 3 3 3 3 1 3 3 ...
 $ rs18  : Factor w/ 2 levels "AA","AG": 1 1 1 1 1 1 1 1 2 1 ...
 $ rs19  : Factor w/ 3 levels "AA","AG","GG": 1 3 1 1 1 2 1 1 2 2 ...
 $ rs20    : Factor w/ 2 levels "AA","AC": 1 1 1 1 1 1 1 1 1 1 ...
 $ rs21: Factor w/ 1 level "CC": 1 1 1 1 1 1 1 1 1 1 ...
 $ rs22  : Factor w/ 3 levels "AA","AC","CC": 2 2 2 2 3 2 2 3 3 1 ...
 $ rs23      : Factor w/ 4 levels "A","AA","AC",..: 4 2 NA NA 2 2 2 3 3 2 ...
 $ rs24   : Factor w/ 2 levels "AG","GG": 2 2 2 2 2 2 2 2 2 1 ...

str(df2):

'data.frame':   82 obs. of  24 variables:
 $ rs1  : Factor w/ 3 levels "CC","AC","AA": 2 1 1 1 1 3 2 1 1 2 ...
 $ rs12  : Factor w/ 2 levels "TC","TT": 2 2 2 2 2 2 2 2 2 2 ...
  ..- attr(*, "names")= chr  "i001.p2" "i002.p2" "i003.p2" "i005.p2" ...
 $ rs3 : Factor w/ 2 levels "AG","GG": 2 2 2 2 2 2 1 2 2 2 ...
  ..- attr(*, "names")= chr  "i001.p2" "i002.p2" "i003.p2" "i005.p2" ...
 $ rs4 : Factor w/ 3 levels "CC","TC","TT": 1 1 1 1 1 2 1 1 1 1 ...
  ..- attr(*, "names")= chr  "i001.p2" "i002.p2" "i003.p2" "i005.p2" ...
 $ rs5 : Factor w/ 3 levels "TT","TC","CC": 1 1 1 1 1 1 1 1 1 1 ...
 $ rs6 : Factor w/ 3 levels "GG","TG","TT": 3 1 2 2 1 2 1 2 2 1 ...
  ..- attr(*, "names")= chr  "i001.p2" "i002.p2" "i003.p2" "i005.p2" ...
 $ rs7 : Factor w/ 3 levels "GG","AG","AA": 3 2 2 3 2 3 2 2 2 2 ...
 $ rs8  : Factor w/ 3 levels "GG","AG","AA": 1 2 1 2 1 1 2 2 2 2 ...
 $ rs9 : Factor w/ 3 levels "CC","CG","GG": 3 3 3 3 3 3 3 3 3 3 ...
  ..- attr(*, "names")= chr  "i001.p2" "i002.p2" "i003.p2" "i005.p2" ...
 $ rs10  : Factor w/ 2 levels "TC","CC": 2 2 2 2 2 2 2 2 1 2 ...
 $ rs11  : Factor w/ 3 levels "GG","TG","TT": 1 2 1 1 2 1 1 1 2 1 ...
  ..- attr(*, "names")= chr  "i001.p2" "i002.p2" "i003.p2" "i005.p2" ...
 $ rs12  : Factor w/ 2 levels "AC","CC": 2 2 2 2 2 2 2 2 2 2 ...
  ..- attr(*, "names")= chr  "i001.p2" "i002.p2" "i003.p2" "i005.p2" ...
 $ rs13  : Factor w/ 2 levels "CC","TC": 1 1 1 1 1 1 1 1 1 1 ...
  ..- attr(*, "names")= chr  "i001.p2" "i002.p2" "i003.p2" "i005.p2" ...
 $ rs14  : Factor w/ 2 levels "CC","TC": 1 1 2 1 1 1 1 1 1 1 ...
  ..- attr(*, "names")= chr  "i001.p2" "i002.p2" "i003.p2" "i005.p2" ...
 $ rs15  : Factor w/ 3 levels "AG","CG","GG": 3 3 3 3 3 3 3 3 3 3 ...
  ..- attr(*, "names")= chr  "i001.p2" "i002.p2" "i003.p2" "i005.p2" ...
 $ rs16: Factor w/ 2 levels "AC","CC": 2 2 1 2 2 2 2 2 2 2 ...
  ..- attr(*, "names")= chr  "i001.p2" "i002.p2" "i003.p2" "i005.p2" ...
 $ rs17  : Factor w/ 3 levels "AA","AG","GG": 3 3 3 3 3 3 3 1 3 3 ...
  ..- attr(*, "names")= chr  "i001.p2" "i002.p2" "i003.p2" "i005.p2" ...
 $ rs18  : Factor w/ 2 levels "AG","AA": 2 2 2 2 2 2 2 2 1 2 ...
 $ rs19  : Factor w/ 3 levels "AA","AG","GG": 1 3 1 1 1 2 1 1 2 2 ...
  ..- attr(*, "names")= chr  "i001.p2" "i002.p2" "i003.p2" "i005.p2" ...
 $ rs20    : Factor w/ 3 levels "AA","AC","CC": 1 1 1 1 1 1 1 1 1 1 ...
  ..- attr(*, "names")= chr  "i001.p2" "i002.p2" "i003.p2" "i005.p2" ...
 $ rs21: Factor w/ 1 level "CC": 1 1 1 NA 1 1 1 1 1 1 ...
  ..- attr(*, "names")= chr  "i001.p2" "i002.p2" "i003.p2" "i005.p2" ...
 $ rs22  : Factor w/ 3 levels "AA","AC","CC": 2 2 2 1 3 2 2 3 3 1 ...
  ..- attr(*, "names")= chr  "i001.p2" "i002.p2" "i003.p2" "i005.p2" ...
 $ rs23      : Factor w/ 3 levels "CC","AC","AA": 1 3 2 2 3 3 3 2 2 3 ...
 $ rs24   : Factor w/ 2 levels "GG","AG": 1 1 1 1 1 1 1 1 1 1 ...
mppd
  • This will work with data.frames. In your question above, post the result of `str(df1)` and `str(df2)`. – lmo Oct 02 '17 at 15:52
  • Please update your question with a reproducible example. – josliber Oct 02 '17 at 15:56
  • I finally recreated your error message: `==` does not work to compare two list objects (that are not data.frames). For example, `list(1) == list(1)` will return the same error. You have to convert A and B to data.frames using `as.data.frame` or `data.frame` before the comparison. – lmo Oct 02 '17 at 16:20
  • @lmo I've tried that, but it doesn't work: `A <- lapply(df1, as.character); B <- lapply(df2, as.character); A <- data.frame(A); B <- data.frame(B); A == B` gives `Error in Ops.factor(left, right) : levels of two factors are different`. – mppd Oct 02 '17 at 16:39
  • In that instance, you'll need to add `stringsAsFactors=FALSE` to the `data.frame` call. – lmo Oct 02 '17 at 16:41
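
Putting the comments above together, here is a minimal sketch of the suggested fix on the 3x3 toy data from the question. The column names `X1` to `X3` are placeholders, and `stringsAsFactors = TRUE` is set explicitly only to mimic the factor columns shown in `str(df1)` (since R 4.0.0 the default is `FALSE`). It assumes both data frames have the same dimensions and column order.

```r
# Toy data from the question, stored as factors with differing levels.
# Column names X1..X3 are placeholders, not the real rs* names.
df1 <- data.frame(X1 = c("AT", "AG", "GG"),
                  X2 = c("GC", "GC", "TT"),
                  X3 = c("CC", "CT", NA),
                  stringsAsFactors = TRUE)
df2 <- data.frame(X1 = c("AT", "AG", "GG"),
                  X2 = c(NA, "GC", "TT"),
                  X3 = c("GG", "CG", "AA"),
                  stringsAsFactors = TRUE)

# lapply() returns a plain list, so rebuild a data frame from it and
# keep the columns as character instead of converting back to factors.
A <- data.frame(lapply(df1, as.character), stringsAsFactors = FALSE)
B <- data.frame(lapply(df2, as.character), stringsAsFactors = FALSE)

# Element-wise comparison; the result is NA wherever either side is NA.
res <- A == B
print(res)
```

If you would rather keep the original objects, `df1[] <- lapply(df1, as.character)` (and the same for `df2`) should also work: it replaces the columns in place without turning the data frame into a list.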
