Is there an R function to group a table by a certain variable?

Question

I am trying to collapse several rows of my data into a single row, with the values moved into separate columns. Is there a way I can group rows together by a certain variable?

I have tried using the group_by() function from the dplyr package, but it does not seem to solve my issue.

library(dplyr)
late <- read.csv(file.choose())
late <- group_by(late, state, add = FALSE)

The data set I have (named "late") now is in this form:

ontime   state   count
0        AL      1
1        AL      44
null     AL      3
0        AR      5
1        AR      50
...

But I would like it to be:

state    count0    count1    countnull
AL       1         44        3
AR       5         50        null
...

Ultimately, I want to calculate count0/count1 for each state. So if there is a better way of going about this, I would be open to any suggestions.

score 0 · Answer 1 · answered Apr 11 '19 at 19:59

You could do this with dcast() from the reshape2 package:

library(reshape2)

df = data.frame(
  ontime = c(0,1,NA,0,1),
  state = c("AL","AL","AL","AR","AR"),
  count = c(1,44,3,5,50)
)

dcast(df, state ~ ontime, value.var = "count")
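
To get from the reshaped table to the count0/count1 ratio the question asks for, something like the following may work. It is only a sketch on the toy data above; the names `wide` and `ratio` are introduced here for illustration and are not part of the original answer.

```r
library(reshape2)

# Toy data from the answer; ontime is NA where the question shows "null"
df <- data.frame(
  ontime = c(0, 1, NA, 0, 1),
  state  = c("AL", "AL", "AL", "AR", "AR"),
  count  = c(1, 44, 3, 5, 50)
)

# One row per state, one column per ontime value
wide <- dcast(df, state ~ ontime, value.var = "count")

# The new columns are named after the ontime values, so the counts for
# ontime == 0 and ontime == 1 are in the columns "0" and "1"
wide$ratio <- wide[["0"]] / wide[["1"]]
wide
```

Using `[["0"]]` instead of `$` avoids having to backtick-quote the numeric column names.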

answered Apr 11 '19 at 19:59

Fino


score 0 · Answer 2 · answered Apr 11 '19 at 20:06

With spread() from tidyr:

library(dplyr)
library(tidyr)

df %>%
  mutate(ontime = paste0('count', ontime)) %>%
  spread(ontime, count)

Output:

  state count0 count1 countnull
1    AL      1     44         3
2    AR      5     50        NA

Data:

df <- structure(list(ontime = structure(c(1L, 2L, 3L, 1L, 2L), .Label = c("0", 
"1", "null"), class = "factor"), state = structure(c(1L, 1L, 
1L, 2L, 2L), .Label = c("AL", "AR"), class = "factor"), count = c(1L, 
44L, 3L, 5L, 50L)), class = "data.frame", row.names = c(NA, -5L
))
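
In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(), so the same reshape can also be written as below. This is a sketch rather than part of the original answer: it assumes a recent tidyr, rebuilds the data with plain character columns for brevity, and the names `result` and `ratio` are introduced here. It also adds the count0/count1 ratio the question asks for.

```r
library(dplyr)
library(tidyr)

# Same values as the data above, with ontime kept as text ("null" included)
df <- data.frame(
  ontime = c("0", "1", "null", "0", "1"),
  state  = c("AL", "AL", "AL", "AR", "AR"),
  count  = c(1L, 44L, 3L, 5L, 50L)
)

result <- df %>%
  # names_prefix prepends "count" to each new column name,
  # replacing the mutate(paste0(...)) step
  pivot_wider(names_from = ontime, values_from = count,
              names_prefix = "count") %>%
  # ratio of the ontime == 0 count to the ontime == 1 count per state
  mutate(ratio = count0 / count1)

result
```

States with no "null" row, such as AR here, get NA in countnull, which does not affect the ratio.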
