2

I'm looking to create multiple data frames using a for loop and then stitch them together with merge().

I'm able to create my data frames using assign(paste(), blah). But then, in the same for loop, I need to delete the first column of each of these data frames.

Here's the relevant bits of my code:

for (j in 1:3)
{
    #This is to create each data frame
    #This works
    assign(paste(platform, j, "df", sep = "_"), read.csv(file = paste(masterfilename, extension, sep = "."), header = FALSE, skip = 1, nrows = 100))

    #This is to delete first column
    #This does not work
    assign(paste(platform, j, "df$V1", sep = "_"), NULL)
}

In the first situation I'm assigning my variables to a data frame, so they inherit that type. But in the second situation, I'm assigning it to NULL.

Does anyone have any suggestions on how I can work this out? Also, is there a more elegant solution than assign(), which seems to bog down my code? Thanks,

n.i.

swetharevanur
  • This would all be much, much simpler if you stopped using `assign` and simply put all the data frames in a list. – joran Jul 11 '14 at 14:23
  • Maybe read [Why `assign` is bad](http://stackoverflow.com/questions/17559390/why-is-assign-bad) – MrFlick Jul 11 '14 at 14:37

2 Answers

4

`assign` can be used to build variable names, but `"name$V1"` isn't a variable name. The `$` is an operator in R, so you're trying to build a function call, and you can't do that with `assign`. In fact, in this case it's best to avoid `assign` completely. You don't need to create a bunch of different variables. If your data.frames are related, just keep them in a list.

mydfs <- lapply(1:3, function(j) {
    # Read the file, then drop the first column before returning the data frame
    df <- read.csv(file = paste(masterfilename, extension, sep = "."),
        header = FALSE, skip = 1, nrows = 100)
    df$V1 <- NULL
    df
})

Now you can access them with `mydfs[[1]]`, `mydfs[[2]]`, etc. And you can run functions over all the data sets with any of the `*apply` family of functions.
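
Since the question mentions stitching the data frames together with `merge()`, here is a minimal sketch of how a list makes that step straightforward. The toy data and the shared key column `id` are assumptions made for illustration; the real files may need a different `by` column.

```r
# Toy stand-ins for the data frames read from file (hypothetical data).
mydfs <- list(
    data.frame(id = 1:3, a = c(10, 20, 30)),
    data.frame(id = 1:3, b = c("x", "y", "z")),
    data.frame(id = 1:3, c = c(TRUE, FALSE, TRUE))
)

# Apply a function to every element, e.g. count the rows of each data frame.
row_counts <- sapply(mydfs, nrow)

# Fold merge() over the list: merge the first two data frames,
# then merge that result with the third.
merged <- Reduce(function(x, y) merge(x, y, by = "id"), mydfs)
```

`Reduce` works for any number of data frames in the list, so the code does not change if the loop later reads more files.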

MrFlick
1

As @joran pointed out in his comment, the proper way of doing this would be using a list. But if you want to stick to assign you can replace your second statement with

assign(paste(platform, j, "df", sep = "_"), 
    get(paste(platform, j, "df", sep = "_"))[
        2:length(get(paste(platform, j, "df", sep = "_")))])
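
To see why the original `assign(..., NULL)` line had no visible effect, and how the `get`/`assign` round trip behaves, here is a small self-contained sketch. The data frame `platform_1_df` and its contents are made up for illustration, and the name is built from the literal string `"platform"` rather than the `platform` variable from the question.

```r
# Hypothetical data frame standing in for one of the files read in the loop.
platform_1_df <- data.frame(V1 = 1:3, V2 = 4:6, V3 = 7:9)

# assign() treats its first argument as a literal name, so this creates a new
# object called "platform_1_df$V1" and leaves the data frame untouched.
assign("platform_1_df$V1", NULL)
stray_exists <- exists("platform_1_df$V1")
cols_before <- names(platform_1_df)

# get() fetches the object by name; modify the copy, then assign() it back.
nm <- paste("platform", 1, "df", sep = "_")
tmp <- get(nm)
tmp$V1 <- NULL
assign(nm, tmp)
cols_after <- names(platform_1_df)
```

After the first `assign`, `cols_before` still contains `V1`; only the `get`/`assign` pair actually changes the data frame.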

If you wanted to use a list instead, your code to read the data frames would look like

dfs <- replicate(3,
    read.csv(file = paste(masterfilename, extension, sep = "."),
        header = FALSE, skip = 1, nrows = 100), simplify = FALSE)

Note that you can use `replicate` because your call to `read.csv` does not depend on `j` in the loop. Then you can remove the first column of each data frame:

dfs <- lapply(dfs, function(d) d[-1])

Or, combining everything in one command:

dfs <- replicate(3,
    read.csv(file = paste(masterfilename, extension, sep = "."),
        header = FALSE, skip = 1, nrows = 100)[-1], simplify = FALSE)
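
If the file name did depend on `j`, `replicate` would no longer fit, but `lapply` over the file paths would. The following is only a sketch under that assumption: the temporary files, their names, and their contents are invented here so the example can run on its own.

```r
# Create three small CSV files in a temporary directory (made-up data).
paths <- file.path(tempdir(), paste0("platform_", 1:3, ".csv"))
for (j in seq_along(paths)) {
    write.csv(data.frame(id = 1:4, value = j * (1:4)), paths[j], row.names = FALSE)
}

# Read each file and drop its first column in one step.
dfs <- lapply(paths, function(p) {
    read.csv(file = p, header = FALSE, skip = 1, nrows = 100)[-1]
})

# Clean up the temporary files.
unlink(paths)
```

Each element of `dfs` is still a data frame, because `[-1]` on a data frame drops a column without simplifying the result to a vector.
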
konvas