
I'm trying to pass column indices to ggplot inside a function that I'll be using repeatedly, like this:

myplot <- function(df){
    ggplot(df, aes(df[, 1], df[, 2])) + geom_point()
}

I'll always use the first column as my x variable and the second column as my y variable, but the column names change between data sets. I've searched all over. Any ideas?

EDIT:

This is the answer I used:

require(ggplot2)

myplot <- function(df){
   ggplot(df, aes_string(colnames(df)[1], colnames(df)[2])) + geom_point()
}
N8TRO
  • It seems that your question title corresponds to Paul Hiemstra's answer, which regards using string column names. But the body of your question regards using their index, which is indeed a duplicate as flagged. I'd recommend changing the body of your question to match the answer. – Max Ghenis Aug 14 '15 at 20:43
  • @MaxGhenis Both cases (and more) are answered in Paul's answer and within the comments below. – N8TRO Aug 16 '15 at 18:29
  • I see that. The mismatch in content is still confusing for those who will stumble upon this; questions should be specific and unambiguous. Changing it may also provide an opportunity to be unflagged as duplicate. – Max Ghenis Aug 17 '15 at 06:18
  • @MaxGhenis What would you suggest as a revised title? – N8TRO Aug 20 '15 at 06:00
  • **Edit:** ignore below, I didn't initially misread. The question should read "R pass variable column *indices* to ggplot2", and is correctly flagged as a duplicate. *Orig; ignore:* Shoot, I'm sorry, I just realized I'd misread the question, thinking that passing indices was being problematic. This is just a false duplicate marking, probably takes someone with more karma than I to request a review. Sorry about that, N8TRO! – Max Ghenis Aug 22 '15 at 16:56
  • @MaxGhenis Title edit done. – N8TRO Aug 24 '15 at 00:14

1 Answer


You can use aes_string instead of aes to pass column names as strings rather than as objects, i.e.:

myplot = function(df, x_string, y_string) {
   ggplot(df, aes_string(x = x_string, y = y_string)) + geom_point()
}
myplot(df, "A", "B")
myplot(df, "B", "A")
Paul Hiemstra
  • Thanks, it works, though not ideal because I'd have to manually give the column names. Any way to get around this? – N8TRO Mar 17 '13 at 07:47
  • `aes_string(colnames(df)[1], colnames(df)[2])` – baptiste Mar 17 '13 at 08:00
  • In general, in `ggplot2` you do not provide vectors in `aes`. In `aes` you provide a mapping of the aesthetics of the plot to columns in the data, with no need to hardcode the data in `aes`. – Paul Hiemstra Mar 17 '13 at 08:04
  • If your column name is `a-b`, for example, then this gives the error `Error in eval(expr, envir, enclos) : object 'a-b' not found`. Using `environment = environment()` with `aes` is another fix, as linked above under OP's post. – Arun Mar 17 '13 at 08:38
  • I think there is no way, other than passing vectors, in `ggplot2` to make a plot with this kind of column name. I never encountered this, but I follow Hadley's style and always use `_` ;). – Paul Hiemstra Mar 17 '13 at 08:45
  • @PaulHiemstra, I already linked one way of doing this (without using column names at all). If you insist on using column names, then try this instead: `set.seed(45); df <- data.frame(x=gl(5,5), y=runif(25)); myplot2 = function(df, col1, col2) { ggplot(df, aes(x = get(names(df)[col1]), y = get(names(df)[col2])), environment = environment()) + geom_point() }`. From this it is straightforward to change this function to take column names as arguments. – Arun Mar 17 '13 at 10:33
  • Nice, thanks for the code Arun. – Paul Hiemstra Mar 17 '13 at 11:36
  • Just what I was looking for, you saved me a good deal of time, thanks :) – vruizext Mar 25 '15 at 09:46
  • @PaulHiemstra, @Arun: Using `aes_q` is another way of passing non-standard column names to `ggplot`. See [here](http://stackoverflow.com/a/30618800/2591234) for why I think that's the preferable solution. – shadow Jun 03 '15 at 11:33
  • aes_string() is now soft deprecated. I think current tidyverse practice would be to use .data calls, as I show at my solution on this related post: https://stackoverflow.com/a/69286410/9059865 – Bryan Shalloway Sep 23 '21 at 20:55
  • Unfortunately, this solution doesn't work if a column name contains multiple words. – Vladimir Mikheev Mar 21 '23 at 10:29
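
Following up on the `.data` suggestion and the comments about column names like `a-b` or names with several words, here is a minimal sketch of the index-based function written without `aes_string`. It assumes ggplot2 3.0.0 or later, where `aes()` supports the `.data` pronoun. The name `myplot_idx`, its `xi`/`yi` arguments, and the example data frame are hypothetical and only for illustration.

```r
library(ggplot2)

# Hypothetical helper: plot the column at position `xi` against the column
# at position `yi`. Columns are looked up by name through the .data pronoun,
# so non-syntactic names (hyphens, spaces) need no special quoting.
myplot_idx <- function(df, xi = 1, yi = 2) {
  x_name <- names(df)[xi]
  y_name <- names(df)[yi]
  ggplot(df, aes(x = .data[[x_name]], y = .data[[y_name]])) +
    geom_point() +
    labs(x = x_name, y = y_name)  # set axis labels explicitly from the column names
}

# Example data with names that would trip up aes_string().
# check.names = FALSE keeps the names exactly as written.
df <- data.frame(`a-b` = 1:5, `my value` = (1:5)^2, check.names = FALSE)

p <- myplot_idx(df)        # first column on x, second on y
p_swapped <- myplot_idx(df, 2, 1)
```

Because the lookup is `.data[[name]]` with a plain string, the same pattern should work if the function takes column names instead of positions: pass the strings straight through in place of `names(df)[xi]`.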