I would like to be able to refer to columns by both name and index in a single vector. For example, I specify only:

EDIT: I changed the order of the original vector as I want the order to not matter.

columns <- c(1:7, "j", 8, "i")

I would then like to look up the column names for the numeric indices (1 to 8) and put them into the vector in the correct positions, alongside the names that were already given. I have a general idea, but coding-wise I am not getting very far:

library(data.table)
df <- fread(
"a b c d e f g h i j
1 2 3 4 5 6 7 8 9 10",
  header = TRUE
)

function(data, columns){
  nums <- as.numeric(columns)
  named_columns <- ?
  nums <- nums[!is.na(nums)]
  name_nums <- colnames(data)[nums]

  all_columns <- setdiff(named_columns, name_nums)
  # Order of the original vector?
  column_names <- result
}

and then return the result as a vector, so that the outcome will be:

column_names <- c("a", ..., "j", "h", "i")

Could anyone help me to get a bit further?

Tom

1 Answer

If I understand your question correctly, you want to pass a vector that mixes column names and indices, and get back a vector of column indices only. If so, the following should help:

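df <- data.table::fread("a b c d e f g h i j
                         1 2 3 4 5 6 7 8 9 10",
                        header = TRUE)
columns <- c(1:8, "i", 9, "j")

col2num <- function(df, columns) {
  # Names cannot be coerced and become NA (hence the warning below)
  nums <- as.numeric(columns)
  # Fill the NA slots with the positions of the named columns
  nums[is.na(nums)] <- which(names(df) == columns[is.na(nums)])
  return(nums)
}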

col2num(df, columns)
#> Warning in col2num(df, columns): NAs introduced by coercion
#>  [1]  1  2  3  4  5  6  7  8  9  9 10
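
For the vector from the question, where the names are not listed in column order, a variant based on `match()` may be more robust. This is only a minimal sketch under a few assumptions: `col2name_match` is a hypothetical helper name, the toy `df` is rebuilt in base R so the snippet runs without data.table, and every non-numeric entry is assumed to be a column name.

```r
# Toy data with columns a..j, mirroring the example above (base R only).
df <- as.data.frame(matrix(1:10, nrow = 1, dimnames = list(NULL, letters[1:10])))

# Hypothetical helper, not part of the original answer.
col2name_match <- function(df, columns) {
  columns <- as.character(columns)
  # Names cannot be coerced and become NA; the warning is silenced on purpose.
  nums <- suppressWarnings(as.numeric(columns))
  is_name <- is.na(nums)
  # match() looks up each name separately, so the order of the names does not matter.
  nums[is_name] <- match(columns[is_name], names(df))
  if (anyNA(nums)) stop("unknown column name(s) in 'columns'")
  # Translate every position back to its column name, keeping the input order.
  names(df)[nums]
}

columns <- c(1:7, "j", 8, "i")
col2name_match(df, columns)
```

Returning names rather than positions keeps the result valid even if the columns are reordered later; drop the final lookup and return `nums` if indices are wanted instead.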
M--
  • Thank you very much! I was initially going for column names as output instead of indices, but I guess it is only a small step from there. I will try it with my actual data now. – Tom Jul 18 '19 at 19:29
  • @Tom for that, `return(names(df)[nums])` works instead of `return(nums)`. P.S. I would name the function `col2name` in that case ;) – M-- Jul 18 '19 at 19:31
  • I combined your answer with the movemedatatable solution proposed here: https://stackoverflow.com/questions/18339370/reordering-columns-in-a-large-dataframe to make it very easy to move columns around. Works like a charm. Thank you so much! (I wanted the names because, unlike the indices, they don't change.) – Tom Jul 18 '19 at 19:40
  • I have been trying to pass an object (a vector of column names) to your solution, but I am getting an error: `Error in nums[is.na(nums)] <- which(names(data) == colnums[is.na(nums)]) : replacement has length zero`. Is there any reason this should not work? – Tom Jul 22 '19 at 07:00
  • When you say *vector of column names*, do you mean something like `c("i", "j")`? Off the top of my head, I think it should still work. Maybe post another question with a reproducible example? – M-- Jul 22 '19 at 14:05
  • That is indeed what I meant. Please see the new post here https://stackoverflow.com/questions/57293986/a-function-that-allows-referral-to-columns-with-both-index-and-name – Tom Jul 31 '19 at 15:12
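
One plausible cause of the "replacement has length zero" error mentioned in the comments, assuming the names were passed in an order that differs from the column order, is that `==` compares element by element (recycling the shorter vector) instead of looking each name up. The sketch below uses stand-in vectors (`col_names` and `wanted` are illustrative names, not taken from the thread) to show the difference from `match()`.

```r
col_names <- letters[1:10]   # stands in for names(df): "a" ... "j"
wanted <- c("j", "i")        # names listed in a different order than the columns

# `==` recycles `wanted` as "j", "i", "j", "i", ... along col_names, so "i"
# (position 9) is compared with "j" and "j" (position 10) with "i".
hits <- which(col_names == wanted)
length(hits)                 # no matches, so an assignment has nothing to use

# match() looks up each wanted name on its own, whatever the order.
positions <- match(wanted, col_names)
positions
```

Whether this is what happened with the actual data cannot be confirmed from the thread; it only shows that the `which(... == ...)` form depends on how the names line up.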