36

Consider the following code:

a = "col1"
b = "col2"
d = data.frame(a=c(1,2,3),b=c(4,5,6))

This code produces the following data frame:

  a b
1 1 4
2 2 5
3 3 6

However, the desired data frame is:

  col1 col2
1    1    4
2    2    5
3    3    6

Further, I'd like to be able to write something like d$a and have it return d$col1, since a = "col1".

How can I tell R that "a" is a variable and not a name of a column?

webb
CodeGuy
  • You can't use `$` like that. [**See here**](http://stackoverflow.com/a/18228613/1478381) for more information on why. You can however do `d[ , a ]` to achieve what you want. – Simon O'Hanlon Nov 01 '13 at 16:28
  • Try out this code. Any idea how to avoid this error, or what this error is? columnName = "col1"; value = 5; d = data.frame(); d[,columnName] = value; – CodeGuy Nov 01 '13 at 16:53
  • You have an empty data frame. There is no variable "columnName" in it, so you can't call it or assign a value to it. – gung - Reinstate Monica Nov 01 '13 at 16:56
  • So how can I fix this so it works? I want to start with an empty data frame – CodeGuy Nov 01 '13 at 16:57
  • I suppose you could start w/ `d = data.frame(NA)`, although you'd always have a column of `NA`s in your data frame. I don't usually start w/ an empty data frame. – gung - Reinstate Monica Nov 01 '13 at 17:04
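
For the empty-data-frame case raised in the comments above, one possible workaround is sketched below. It assumes that adding a first row with `rbind()` is acceptable; `columnName` and `value` are the names from the comment, and the approach itself is only a suggestion, not something proposed in the thread.

```r
columnName <- "col1"
value <- 5

# Start from a completely empty data frame, as in the comment
d <- data.frame()

# Build a one-row data frame whose column name comes from the variable,
# then append it; rbind() ignores the empty (zero-column) data frame
new_row <- setNames(data.frame(value), columnName)
d <- rbind(d, new_row)

names(d)
```

This avoids the placeholder column of `NA`s that `data.frame(NA)` would leave behind.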

2 Answers

41

After creating your data frame, you can rename its columns with colnames() (see ?colnames). For example, you would have:

d = data.frame(a=c(1,2,3), b=c(4,5,6))
colnames(d) <- c("col1", "col2")

You can also name your variables when you create the data frame. For example:

d = data.frame(col1=c(1,2,3), col2=c(4,5,6))
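
If the names are held in variables, as in the question, they can also be applied at creation time. The following is a minimal sketch using `setNames()` from base R; the variables `a` and `b` are taken from the question.

```r
a <- "col1"
b <- "col2"

# data.frame() would treat a= and b= as literal column names, so create the
# columns unnamed and let setNames() apply the names stored in the variables
d <- setNames(data.frame(c(1, 2, 3), c(4, 5, 6)), c(a, b))

names(d)
```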

Further, if you have the names of columns stored in variables, as in

a <- "col1"

you can't use $ to select a column via d$a. R will look for a column whose name is a. Instead, you can do either d[[a]] or d[,a].
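
As a small illustration of the difference, assuming the data frame already has the columns col1 and col2:

```r
a <- "col1"
d <- data.frame(col1 = c(1, 2, 3), col2 = c(4, 5, 6))

# `$` takes the name literally, so this looks for a column called "a"
is.null(d$a)           # TRUE, because no such column exists

# `[[` and `[` evaluate the variable, so these select column "col1"
by_double <- d[[a]]    # numeric vector
by_single <- d[, a]    # numeric vector as well, for a single column
as_frame  <- d[a]      # one-column data frame
```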

joran
gung - Reinstate Monica
  • In case of `data.frame`, `names(d)<- c("col1", "col2")` will do. – Metrics Nov 01 '13 at 16:34
  • That's a good point, @Metrics. In truth, I almost never use `names()`; `colnames()` seems conceptually clearer to me. Is there some benefit, other than typing 3 fewer characters? – gung - Reinstate Monica Nov 01 '13 at 16:49
  • You are correct @gung. That also holds if you want consistency :) – Metrics Nov 01 '13 at 16:55
  • @Metrics, consistency in what sense? A `list` doesn't have `colnames`, but a `data.frame` is a type of a `list`. On the other hand, `data.frame`s are similar to a rectangular `matrix` which has `colnames` and `rownames`.... Sigh... – A5C1D2H2I1M1N2O1R2T1 Nov 01 '13 at 17:27
11

You can do it this way:

a = "col1"
b = "col2"
d = data.frame(a=c(1,2,3),b=c(4,5,6))

> d
  a b
1 1 4
2 2 5
3 3 6

# Renaming the columns
names(d) <- c(a,b)
> d
  col1 col2
1    1    4
2    2    5
3    3    6

# Calling by names
d[, a]
Rohit Das