3

I have a datatable formatted as follows:

Name      X1234  X5555  X3000  X5000  X7500  X8745  X9451  X8338  X8377
Object 1  0+     0+     1+     0+     0+     0+     0+     0+     0+
Object 2  1+     0+     0+     0+     0+     0+     0+     0+     0+
Object 3  0+     0+     0+     0+     1+     0+     0+     0+     0+

My datatable holds a couple of hundred rows, say Objects 1 to 100, all structured as shown above. Each row (one Object) has more than a hundred columns. The column names are dynamic but always start with X, and in one of these columns I'm looking for the value 1+. The next step I want to accomplish is adding an extra column, let's name it Number, and filling it with the name of the column in which that row's value is 1+.

So, my desired result would be:

Name      X1234  X5555  X3000  X5000  X7500  X8745  X9451  X8338  Number
Object 1  0+     0+     1+     0+     0+     0+     0+     0+     X3000
Object 2  1+     0+     0+     0+     0+     0+     0+     0+     X1234
Object 3  0+     0+     0+     0+     1+     0+     0+     0+     X7500

In R, what would be the best way to accomplish this? I have looked up and experimented with functions like apply and which, but unfortunately haven't found a working solution yet.

I'm fairly new to developing scripts in R so my apologies if my question isn't clear or simple to answer.

Similar case in Python: Find the column name which has the maximum value for each row

rewiind
  • What if there are multiple columns with values +1 in the same row? – missuse Oct 13 '17 at 08:52
  • For each row there is only one column where the value is 1+. – rewiind Oct 13 '17 at 08:57
  • Related: [For each row return the column name of the largest value](https://stackoverflow.com/questions/17735859/for-each-row-return-the-column-name-of-the-largest-value). Just use a _condition_ in `max.col` (i.e `d == "1+"`), instead of the raw data. – Henrik Oct 13 '17 at 09:02

4 Answers

2

An approach with which:

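idx <- which(dat == "1+", arr.ind = TRUE)
# which() lists matches in column-major order, so sort by row first
dat$Number <- names(dat)[idx[order(idx[, "row"]), "col"]]
# [1] "X3000" "X1234" "X7500"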
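To try this on the question's sample, the data can be rebuilt as in the sketch below. The construction of `dat` is an assumption based on the table in the question (only five of the X columns are reproduced). The matches from `which()` are sorted by row before use, because `arr.ind = TRUE` returns them column by column.

```r
# Sample data reconstructed from the question (assumed; subset of columns)
dat <- data.frame(
  Name  = c("Object 1", "Object 2", "Object 3"),
  X1234 = c("0+", "1+", "0+"),
  X5555 = c("0+", "0+", "0+"),
  X3000 = c("1+", "0+", "0+"),
  X5000 = c("0+", "0+", "0+"),
  X7500 = c("0+", "0+", "1+"),
  stringsAsFactors = FALSE
)

# Row/column positions of every "1+" cell, listed column by column
idx <- which(dat == "1+", arr.ind = TRUE)

# Reorder the matches by row so they line up with the rows of dat
idx <- idx[order(idx[, "row"]), , drop = FALSE]

dat$Number <- names(dat)[idx[, "col"]]
dat$Number
# [1] "X3000" "X1234" "X7500"
```

This relies on every row containing exactly one 1+, as stated in the comments on the question; otherwise the number of matches would not equal the number of rows.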
Sven Hohenstein
1

We can use max.col to find, for each row, the column index of the TRUE value in the logical matrix df1[-1] == "1+". Add 1 to that index because the comparison skips the first column (Name). Then index names(df1) with the result to get the corresponding column names.

df1$Number <- names(df1)[max.col(df1[-1]=="1+")+1]
df1$Number
#[1] "X3000" "X1234" "X7500"
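Since the X column names are dynamic, a variant can select them by pattern instead of by position. A minimal sketch, assuming a small stand-in for `df1` built from the question's table; `x_cols` is a name introduced here for illustration only:

```r
# Stand-in for df1 (assumed; three of the X columns from the question)
df1 <- data.frame(
  Name  = c("Object 1", "Object 2", "Object 3"),
  X1234 = c("0+", "1+", "0+"),
  X3000 = c("1+", "0+", "0+"),
  X7500 = c("0+", "0+", "1+"),
  stringsAsFactors = FALSE
)

# Names of all columns that start with "X"
x_cols <- grep("^X", names(df1), value = TRUE)

# ties.method = "first" makes the result deterministic; the default is "random"
df1$Number <- x_cols[max.col(df1[x_cols] == "1+", ties.method = "first")]
df1$Number
# [1] "X3000" "X1234" "X7500"
```

Note that a row with no 1+ at all would still get a column name back, because all of its entries tie, so this depends on the one-match-per-row guarantee.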
akrun
1

You can use apply and which:

df <- data.frame( x1 = c(0, 0, 1), x2 = c(1, 0 , 0), x3 = c(0, 1 , 0) )
idx <- apply( df, 1, function(row) which( row == 1 ) )
cbind( df, Number = colnames( df[ , idx] ) )

  x1 x2 x3 Number
1  0  1  0     x2
2  0  0  1     x3
3  1  0  0     x1
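The example above uses numeric 0/1 values. A sketch of the same idea for the "1+" strings from the question, returning NA when a row has no match, might look like this. The sample `df` (including the extra Object 4 row with no match) and the helper name `first_match` are assumptions made for illustration:

```r
df <- data.frame(
  Name  = c("Object 1", "Object 2", "Object 3", "Object 4"),
  X1234 = c("0+", "1+", "0+", "0+"),
  X3000 = c("1+", "0+", "0+", "0+"),
  X7500 = c("0+", "0+", "1+", "0+"),
  stringsAsFactors = FALSE
)

# Only look at the columns whose names start with "X"
x_cols <- grep("^X", names(df), value = TRUE)

# For one row, return the name of the first column equal to "1+", or NA
first_match <- function(row) {
  hit <- which(row == "1+")
  if (length(hit) == 0) NA_character_ else names(row)[hit[1]]
}

df$Number <- unname(apply(df[x_cols], 1, first_match))
df$Number
# [1] "X3000" "X1234" "X7500" NA
```

Because the helper always returns exactly one value per row, `apply` yields a plain character vector that can be assigned as a column directly.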
Adam
0

You can also use the col function to return the proper variable name index like this. Note that the names come back in column order, not row order:

names(mat)[col(mat)[which(mat == "1+")]]
[1] "X1234" "X3000" "X7500"
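If the names are needed in row order, for example to assign them as a new column, `row()` can supply the ordering. A sketch under the assumption that `mat` is a data frame shaped like the question's table; it is converted with `as.matrix` first, and `m`, `hits`, `by_row` and `res` are illustrative names:

```r
# Stand-in for mat (assumed; three of the X columns from the question)
mat <- data.frame(
  Name  = c("Object 1", "Object 2", "Object 3"),
  X1234 = c("0+", "1+", "0+"),
  X3000 = c("1+", "0+", "0+"),
  X7500 = c("0+", "0+", "1+"),
  stringsAsFactors = FALSE
)

m <- as.matrix(mat)              # character matrix with the same column names
hits <- which(m == "1+")         # linear indices, in column-major order
by_row <- order(row(m)[hits])    # permutation that sorts the hits by row

res <- colnames(m)[col(m)[hits]][by_row]
res
# [1] "X3000" "X1234" "X7500"
```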
lmo