-2

I have a table looking like this:

    df <- data.frame(week = c("one","one","two","two"),
                     Day = c("day1", "day2","day1","day2"),
                     daily_freq = c(100,110,90,90),
                     city1 = c(20,30,20,30),
                     city2 = c(10,20,30,40),
                     city3 = c(30,40,10,10),
                     city4 = c(40,20,30,10))

And I'm calculating several summary tables, for example, one table with total frequencies for that period:

resume_table <- setDT(df)[, .( total_freq = sum(daily_freq),
                               city1 = sum(city1),
                               city2 = sum(city2),
                               city3 = sum(city3),
                               city4 = sum(city4))
                                       ,by = .(week)]

Finding a table with total frequencies looking like this:

    week  total_freq city1 city2 city3 city4
    one        210    50    30    70    60
    two        180    50    70    20    40

But because I have several cities (more than 40), and I need to calculate several summary tables, I would like to have, for example, a vector with the cities:

      cities <- c("city1","city2","city3","city4")

and beeing able to call this vector each time I do a table summing over this vector variables and still summing the other columns. My code doesn't work:

resume_table2 <- setDT(df)[, .(total_freq = sum(daily_freq),
                     lapply(.SD, sum), .SDcols = cities)
                               ,by = .(week)]

What's wrong?

1 Answers1

0

We can specify the 'cities' vector in .SDcols and loop over the .SD to get the sum

setDT(df)[, lapply(.SD, sum), .SDcols = cities]
#   city1 city2 city3 city4
#1:   105   100    55    65

If we need the 'daily_freq' as well, then concatenate with the 'cities'

setDT(df)[, lapply(.SD, sum), .SDcols = c('daily_freq', cities)]
#    daily_freq city1 city2 city3 city4
#1:        325   105   100    55    65
akrun
  • 874,273
  • 37
  • 540
  • 662
  • But when I add the sum of the daily_freq it doesn't work: res <- setDT(df)[, freq = sum(daily_freq), lapply(.SD, sum), .SDcols = cities] – Student1000 Feb 27 '18 at 13:20
  • but how do I sum the first column with daily frequencies? – Student1000 Feb 27 '18 at 13:27
  • I mean, how do I sum the SECOND column with daily frequencies: daily_freq – Student1000 Feb 27 '18 at 13:34
  • It Works, but the problem is I have more columns and I need to calculate summary columns in my summary tables like data_min = min(created), and other columns with tests like men = sum(gender == 'M') – Student1000 Feb 27 '18 at 14:24
  • 1
    @Student1000 You can do that `setDT(df)[, c(list(min_freq = min(daily_freq)), lapply(.SD, sum)), .SDcols = cities]` – akrun Feb 27 '18 at 15:08