
I’m trying to run this operation in a loop:

library(data.table)

dc_clean202211 <- dc_clean202211[, .SD[.N], by="id_call"]

To start I read RDS files like this:

file_list <- list(202211,202210,202209,202208,202207,202206,202205,202204,202203,202202,202201,202112)
for (i in file_list){
# Create one object per file
assign(paste0("dc_clean", i),readRDS(paste0("data/dc_clean", i , ".RDS")))
assign(paste0("dc_clean", i), paste0("dc_clean", i)[, .SD[.N], by="id_call"])
}

But I want to run the operation from the first snippet inside my loop (or in a separate one).

I tried a lot of possibilities, but I don't know how to do this.

C.Tom
  • 4
    So what's wrong with your current code? Do you get an error or something? It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Jan 05 '23 at 15:04
  • Yeah, sorry, I changed my code to show the problem – C.Tom Jan 05 '23 at 15:11
  • 4
    To get the value of a variable from a string, use `get()`. `assign(paste0("dc_clean", i), get(paste0("dc_clean", i))[, .SD[.N], by="id_call"])` But if you are new to R, I strongly suggest avoiding get/assign. Things are much easier if you store values in named lists and apply transformation functions to those lists rather than creating a bunch of global variables with data embedded in the variable name. – MrFlick Jan 05 '23 at 15:13

1 Answer


I suggest one of two options, assuming that the files are similarly structured.

  1. Put all frames into a list of frames, so that any operation you do to one can be easily extended to all with the simple use of lapply.

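    files <- unlist(file_list)  # the question's year-month codes as a plain vector
    ## read each file into one element of a named list
    alldat <- lapply(setNames(nm = files), function(z) readRDS(paste0("data/dc_clean", z, ".RDS")))
    ## or perhaps more succinctly
    alldat <- lapply(setNames(paste0("data/dc_clean", files, ".RDS"), files), readRDS)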
    

    names(alldat) will be the values in files (for example "202211"), for easy referencing.

    Your follow-on code can then be:

    lapply(alldat, function(DT) DT[, .SD[, .N], by = "id_call"])
    

    to see the number of rows per id_call per file.
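
    Note that `.SD[, .N]` counts rows, whereas the `.SD[.N]` from the question keeps the last row of each group. A minimal sketch of both on a list of frames, assuming two small made-up tables in place of the real RDS files (the `status` column and its values are hypothetical):

    ```r
    library(data.table)

    # Hypothetical stand-ins for two monthly files; real data would come from readRDS()
    alldat <- list(
      "202211" = data.table(id_call = c(1, 1, 2), status = c("open", "closed", "open")),
      "202210" = data.table(id_call = c(3, 3, 3), status = c("open", "hold", "closed"))
    )

    # Last row per id_call in each table (the operation from the question)
    last_rows <- lapply(alldat, function(DT) DT[, .SD[.N], by = "id_call"])

    # Number of rows per id_call in each table
    # (same counts as .SD[, .N], but the result column is named N)
    row_counts <- lapply(alldat, function(DT) DT[, .N, by = "id_call"])

    last_rows[["202211"]]
    row_counts[["202210"]]
    ```

    Because `lapply` returns a new list, assign the result back (for example `alldat <- last_rows`) if the reduced tables should replace the originals.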

  2. Put all frames into one frame, optionally preserving the name from which it was imported. Since we already read in all of the .rds files, we can simply row-bind them with:

    alldat1 <- rbindlist(alldat, idcol = "filename")
    

    There are a few caveats:

    • If there are columns present in some that are not present in all, then you need to add fill=TRUE, and those columns will have NA values where they did not exist in the original file;
    • If the order of columns is not identical (or there are some present/missing), then I also suggest adding use.names=TRUE;
    • All classes should match. That is, column A must have the same class in every frame. rbindlist can coerce in simpler cases, as demonstrated by
    rbindlist(list(data.table(a=1), data.table(a=2L))) # numeric and integer
    #        a
    #    <num>
    # 1:     1
    # 2:     2
    rbindlist(list(data.table(a=1), data.table(a="2"))) # numeric and character
    #         a
    #    <char>
    # 1:      1
    # 2:      2
    

    but if there is ambiguity or uncertainty, I suggest you verify your columns before going much further.
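
    The first two caveats can be sketched as follows, assuming two made-up tables whose columns differ in order and presence (the `agent` and `duration` columns are hypothetical):

    ```r
    library(data.table)

    # Two toy tables: different column order, and each has a column the other lacks
    a <- data.table(id_call = 1:2, agent = c("x", "y"))
    b <- data.table(duration = c(30, 45), id_call = 3:4)

    # use.names = TRUE matches columns by name rather than position;
    # fill = TRUE pads columns missing from a table with NA
    combined <- rbindlist(list(a = a, b = b),
                          use.names = TRUE, fill = TRUE, idcol = "filename")
    combined
    ```

    Rows from `a` get `NA` in `duration`, and rows from `b` get `NA` in `agent`.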

    With this, you can do

    alldat1[, .SD[,.N], by = .(filename, id_call)]
    

    for the same basic results (notice the addition of filename to the by= argument).
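
    If the goal is the question's original operation (keep the last row per id_call), the same grouping works on the combined frame. A small sketch under the assumption of two toy tables standing in for the real files (column `status` is hypothetical):

    ```r
    library(data.table)

    # Hypothetical stand-ins for two monthly files
    alldat <- list(
      "202211" = data.table(id_call = c(1, 1, 2), status = c("open", "closed", "open")),
      "202210" = data.table(id_call = c(1, 1), status = c("open", "hold"))
    )

    # One combined frame; the list names become the "filename" column
    alldat1 <- rbindlist(alldat, idcol = "filename")

    # Last row per id_call within each file
    last_rows <- alldat1[, .SD[.N], by = .(filename, id_call)]

    # Row counts per id_call within each file
    row_counts <- alldat1[, .N, by = .(filename, id_call)]
    ```

    Grouping by filename as well as id_call keeps the same id_call in different files from being merged into one group.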

r2evans