I have two files. The first is a data frame with times in one column and individuals in a second:

#      [Time]      [Individual]
# [1]    1528142     C5A1790 
# [2]    1528142     C5A1059 
# [3]    1528142     C5A1084 
# [4]    1528142     C5A1564
# [5]    1528142     C5A1239
# [6]    1528142     C5A1180 

The second is an N x N matrix in which both rows and columns are individuals, including those in the data frame.

#            [C5A1084] [C5A1059] [C5A1790] [C5A1180] 
# 1 [C5A1084]    0        0.5        1         0
# 2 [C5A1059]   0.5        0         0         1
# 3 [C5A1790]    1         1         0        0.5
# 4 [C5A1180]    0         1        0.5        0

I need to create a vector containing the row numbers in the matrix at which I can find the individuals from the data frame, and in the order that they are listed in the data frame. For these example data it would be (3,2,1,4).

I tried to use the which() function as

RingIndex <- which(Matrix$IDcolumn == FrameIDs)

and received the "longer object length is not a multiple of shorter object length" message, presumably because the matrix includes more individuals than the data frame. %in% and match() are also returning errors stating that replacement has fewer rows than data.

Following the advice in the comments, I tried

RingIndex <- which(Matrix$IDcolumn %in% FrameIDs)

which successfully returned the correct row numbers, but in ascending order rather than the order of the original data. The match() function continues to complain of different replacement and original lengths.

What approach could I use to get my vector?

Many thanks!

Tess H

1 Answer

df <- data.frame(Time = runif(6,1528142,1528150), 
                 Individuals = c("C5A1790","C5A1791","C5A1792","C5A1793","C5A1794","C5A1795"))
> df
     Time Individuals
1 1528144     C5A1790
2 1528143     C5A1791
3 1528144     C5A1792
4 1528148     C5A1793
5 1528145     C5A1794
6 1528143     C5A1795

nnMatrix <- matrix(runif(36,0,1),6,6)
colnames(nnMatrix) <- df$Individuals
rownames(nnMatrix) <- df$Individuals
> nnMatrix
           C5A1790   C5A1791   C5A1792    C5A1793   C5A1794    C5A1795
C5A1790 0.08096946 0.8716328 0.6895134 0.05692825 0.4555460 0.53224424
C5A1791 0.42568532 0.5920239 0.4523232 0.11516185 0.8053652 0.72299411
C5A1792 0.42439187 0.6101881 0.8534429 0.86010851 0.1269521 0.41066857
C5A1793 0.26043345 0.8011337 0.8032234 0.30930988 0.2298927 0.93320166
C5A1794 0.43065533 0.2161525 0.6702832 0.89304071 0.6765714 0.09769635
C5A1795 0.70594252 0.1048099 0.7478553 0.87839534 0.5173364 0.69957502


> sapply(df$Individuals, function(t) which(colnames(nnMatrix) == t))
[1] 1 2 3 4 5 6

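If you reverse the order of the column names and rerun the same call:

colnames(nnMatrix) <- rev(colnames(nnMatrix))
> sapply(df$Individuals, function(t) which(colnames(nnMatrix) == t))
[1] 6 5 4 3 2 1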

You may want to check for repetition and missing values, but the main approach is the same.
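
One way to run those checks before indexing, as a minimal sketch. The vectors ids and mat_names are made-up stand-ins for df$Individuals and colnames(nnMatrix), chosen so that one ID is missing and one name is repeated:

```r
# Hypothetical stand-ins for df$Individuals and colnames(nnMatrix)
ids       <- c("C5A1790", "C5A1059", "C5A1084", "C5A1564")
mat_names <- c("C5A1084", "C5A1059", "C5A1790", "C5A1180", "C5A1059")

# IDs from the data frame that do not appear in the matrix at all
missing_ids <- ids[!ids %in% mat_names]

# Names that occur more than once in the matrix; which(... == t) would
# return several positions for these, so sapply() would give back a list
dup_names <- unique(mat_names[duplicated(mat_names)])

print(missing_ids)
print(dup_names)
```

If either result is non-empty, the sapply() call above will not simplify to a plain integer vector, so it is probably worth resolving those cases first.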

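As suggested in the comments (@GKi), match() also works. It returns positions in the order of its first argument, with NA for any individual not found among the column names: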

> match(df$Individuals,colnames(nnMatrix))
[1] NA  1  3  4  5  6
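
Applied to the example data from the question, the same idea might look like the sketch below. The names frame_ids, m and ring_index are my own, and the matrix contains only the four individuals shown in the question's excerpt, so the two IDs that are not in that excerpt come back as NA. It uses rownames() because the question asks for row numbers:

```r
# IDs from the question's data frame, in data-frame order
frame_ids <- c("C5A1790", "C5A1059", "C5A1084", "C5A1564", "C5A1239", "C5A1180")

# Matrix rebuilt from the excerpt shown in the question
ids <- c("C5A1084", "C5A1059", "C5A1790", "C5A1180")
m <- matrix(c(0,   0.5, 1,   0,
              0.5, 0,   0,   1,
              1,   1,   0,   0.5,
              0,   1,   0.5, 0),
            nrow = 4, byrow = TRUE, dimnames = list(ids, ids))

# Row position of each data-frame ID, kept in data-frame order;
# IDs absent from the matrix are returned as NA
ring_index <- match(frame_ids, rownames(m))
print(ring_index)                       # 3 2 1 NA NA 4

# Drop the NAs if only the matched rows are wanted
print(ring_index[!is.na(ring_index)])   # 3 2 1 4
```

Unlike which(x %in% y), which reports positions in ascending order, match() keeps the order of its first argument, which is what the question asks for.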
fra