
I am looping over a list of dataframes in R and want to use their names as part of the filename I save my plots under.

The code below is my attempt at iterating through dataframes, plotting their first column (var1) versus their second (var2) and then saving the plot.

first.data = data.frame( var1 = 1:4, var2 = 5:8 );
second.data = data.frame( var1 = 9:12, var2 = 13:16 );

for ( dataFrame in list(first.data, second.data) ) {
     plot( dataFrame[["var1"]], dataFrame[["var2"]] );
     dev.copy( pdf, paste( dataFrame, "_var1_vs_var2.pdf", sep="" ) );
     dev.off();
}

I expect this loop to produce PDF files with filenames of the form "first.data_var1_vs_var2.pdf", but instead the name of the data frame is replaced with the first column in the frame, so I get something like "c(1, 2, 3, 4)_var1_vs_var2.pdf".

holocronweaver
  • It's more difficult and less likely that you'll get a response, as your code isn't reproducible (i.e. I can't run it because you haven't supplied a data set). Check out this [LINK](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) to learn how to do this. – Tyler Rinker May 18 '12 at 16:59
  • Loop through the names of the list instead of the list elements. – Aaron left Stack Overflow May 18 '12 at 17:03

3 Answers


The only way I know to do this directly on the dataframes in a list is to attach a comment holding each name, which then carries the name into the loop:

df1 <- data.frame(var1=rnorm(10), var2=rnorm(10))
df2 <- data.frame(var1=rnorm(10), var2=rnorm(10))
comment(df1) <- "df1"
comment(df2) <- "df2"

for ( dataFrame in list(df1,df2) ) { 
     dFnm <- comment(dataFrame) 
     pdf(file=paste( dFnm, "_var1_vs_var2.pdf", sep="" ))
     plot( dataFrame[["var1"]], dataFrame[["var2"]] )     
     dev.off();
}

(You do lose the names of objects when they get passed as the loop variables. If you do deparse(substitute()) inside that loop, you get "dataFrame" rather than the original names.) The other way would be to use names of the dataframes, but then you will need to use get or do.call, which might get a bit messier. This way seems fairly straightforward.
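For comparison, here is a minimal sketch of the names-based route mentioned above, using get to look each data frame up by its name. It reuses the data frames from the question; out_dir and written are hypothetical helpers added only so the files land in a temporary directory and the result can be inspected.

```r
# Data frames from the question.
first.data  <- data.frame(var1 = 1:4,  var2 = 5:8)
second.data <- data.frame(var1 = 9:12, var2 = 13:16)

# Hypothetical helpers: write into a temp directory and record the filenames.
out_dir <- tempdir()
written <- character(0)

# Loop over the names (strings) rather than over the data frames themselves.
for (nm in c("first.data", "second.data")) {
  dataFrame <- get(nm)  # fetch the object that has this name
  out_file <- file.path(out_dir, paste0(nm, "_var1_vs_var2.pdf"))
  pdf(file = out_file)
  plot(dataFrame[["var1"]], dataFrame[["var2"]])
  dev.off()
  written <- c(written, basename(out_file))
}

written
```

Because the loop variable is the name, the string is available for the filename and the data frame is only one get call away.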

Brian Diggs
IRTFM

The code below answers the question in the title, but may or may not be of any help regarding the question in the body of the post:

my.data <- read.table(text='
    x1    x2     x3
     1    10    111
     2    20    222
     3    30    333
     4    40    444
     5    50    555
', header = TRUE, stringsAsFactors = FALSE)

my.data

deparse(substitute(my.data))

# [1] "my.data"

I found this solution here:

https://www.mail-archive.com/r-help@r-project.org/msg60789.html

after fairly extensive searching, and thought it might be helpful to others to include the code with the current question, which is the first hit I get when searching the internet for: convert data frame name to string r.

The answer by BondedDust does mention deparse.
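As a rough illustration of where deparse(substitute()) does and does not recover the original name, here is a small sketch. The wrapper name_of is a hypothetical helper, not something from the linked thread.

```r
# Hypothetical helper: return the expression the caller supplied, as a string.
name_of <- function(df) {
  deparse(substitute(df))
}

first.data <- data.frame(var1 = 1:4, var2 = 5:8)

# Called directly with the variable, the original name is recovered.
direct_name <- name_of(first.data)

# Called on a loop variable, only the loop variable's name is seen.
loop_names <- character(0)
for (dataFrame in list(first.data)) {
  loop_names <- c(loop_names, name_of(dataFrame))
}

direct_name  # "first.data"
loop_names   # "dataFrame"
```

This matches the remark in the other answer: inside the loop the original name is already gone, so it has to be carried along some other way.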

Here is code that appears to address the question in the body of the post, although I left out the code for plot generation:

df1 <- data.frame(var1=rnorm(10), var2=rnorm(10))
df2 <- data.frame(var1=rnorm(10), var2=rnorm(10))

list.function <-  function() { 

     sapply(c("df1", "df2"), get, environment(), simplify = FALSE) 
} 

my.list <- list.function()

my.df.names <- names(my.list)
# [1] "df1" "df2"

for (i in seq_along(my.list)) {

     df.name <- my.df.names[i]
     print(df.name)

}

[1] "df1"
[1] "df2"
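
A possibly shorter way to build the same named list is base R's mget, which looks up several objects by name at once. This is only a sketch of an alternative to list.function above; file.names is a hypothetical variable showing how the names could feed into the filenames from the question.

```r
df1 <- data.frame(var1 = rnorm(10), var2 = rnorm(10))
df2 <- data.frame(var1 = rnorm(10), var2 = rnorm(10))

# mget returns a list whose element names are the object names.
my.list <- mget(c("df1", "df2"))

# Build one filename per data frame from the list names.
file.names <- character(0)
for (i in seq_along(my.list)) {
  file.names <- c(file.names, paste0(names(my.list)[i], "_var1_vs_var2.pdf"))
}

file.names
```

seq_along is used instead of 1:length(...) so the loop body is skipped if the list is ever empty.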
Mark Miller
  • So to others: just keep in mind that the simple answer is `deparse(substitute(your_df))`. – MS Berends Aug 01 '17 at 06:24
  • Thanks @MarkMiller and @MSBerends I struggled with this when using it inside a function where I passed in the df name e.g. clean_data(df1). I wanted to report "dataframe df1 is cleaned". When I used `deparse(substitute(df1))` in the paste it gave me a dput like structure of all the data and not the name of the dataframe. How do you get this to work with parameters passed into a function? – micstr May 27 '21 at 08:19
  • Perhaps it’s some of the `sys.calls()`? – MS Berends May 28 '21 at 14:35
  • Thanks, this helped me a lot! AND I don't think there's an easy equivalent in Python that can do this in Pandas (i.e., take the name of a dataframe and make it a string)! The Lord Always Delivers! – George Hayward Mar 18 '22 at 04:13

Slightly different solution:

dataframe1 = data.frame(iv = rnorm(50), dv = rnorm(50))
dataframe2 = data.frame(iv = rnorm(50), dv = rnorm(50))
dataframe3 = data.frame(iv = rnorm(50), dv = rnorm(50))

LIST = list(dataframe1 = dataframe1,
            dataframe2 = dataframe2,
            dataframe3 = dataframe3)


for(i in seq_along(LIST)){
  pdf(file=paste(names(LIST)[i], paste(colnames(LIST[[i]]), collapse="."), 
                 "pdf", sep="."))
  plot(LIST[[i]][,1],LIST[[i]][,2], 
       xlab = colnames(LIST[[i]])[1], 
       ylab = colnames(LIST[[i]])[2],
       main = paste("Plot based on data in", names(LIST)[i]))
  dev.off()
  }
Alex