I am trying to collapse several rows of my data into a single row per group, moving their values into additional columns. Is there a way to group rows together by a certain variable?

I have tried the group_by() function from the dplyr package, but it does not solve my issue.

library(dplyr)
late <- read.csv(file.choose())
late <- group_by(late, state, add = FALSE)

The data set I have (named "late") now is in this form:

ontime   state   count
0        AL      1
1        AL      44
null     AL      3
0        AR      5
1        AR      50
...

But I would like it to be:

state    count0    count1    countnull
AL       1         44        3
AR       5         50        null
...

Ultimately, I want to calculate count0/count1 for each state. So if there is a better way of going about this, I would be open to any suggestions.

Cettt

2 Answers

You could do this with dcast() from the reshape2 package:

library(reshape2)

df = data.frame(
  ontime = c(0,1,NA,0,1),
  state = c("AL","AL","AL","AR","AR"),
  count = c(1,44,3,5,50)
)

# value.var names the column whose values fill the cells of the wide table
dcast(df, state ~ ontime, value.var = "count")
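
Since the end goal is count0/count1 per state, the ratio can be computed directly on the reshaped table. This is a minimal sketch that assumes the same toy data as above; the names `wide` and `ratio` are only illustrative. The new columns take their names from the values of ontime, so they are looked up by string.

```r
library(reshape2)

# Same toy data as in the answer above
df <- data.frame(
  ontime = c(0, 1, NA, 0, 1),
  state  = c("AL", "AL", "AL", "AR", "AR"),
  count  = c(1, 44, 3, 5, 50)
)

# One row per state, one column per distinct value of ontime
wide <- dcast(df, state ~ ontime, value.var = "count")

# Columns are named after the ontime values, so "0" and "1" are
# accessed with [[ ]] rather than $
wide$ratio <- wide[["0"]] / wide[["1"]]

print(wide)
```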
Fino

With spread() from tidyr:

library(dplyr)
library(tidyr)

df %>%
  mutate(ontime = paste0('count', ontime)) %>%
  spread(ontime, count)

Output:

  state count0 count1 countnull
1    AL      1     44         3
2    AR      5     50        NA
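
In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(), so the same reshape can also be sketched with the newer function. The example below assumes the same data as this answer, with ontime stored as character. It also adds the count0/count1 ratio the question asks for; the names `wide` and `ratio` are only illustrative.

```r
library(dplyr)
library(tidyr)

# Same data as in this answer, with ontime as character
df <- data.frame(
  ontime = c("0", "1", "null", "0", "1"),
  state  = c("AL", "AL", "AL", "AR", "AR"),
  count  = c(1L, 44L, 3L, 5L, 50L)
)

wide <- df %>%
  # names_prefix prepends "count" to each new column name,
  # replacing the separate paste0() step
  pivot_wider(names_from = ontime, values_from = count,
              names_prefix = "count") %>%
  # Ratio requested in the question
  mutate(ratio = count0 / count1)

print(wide)
```

States with no row for a given ontime value (here AR and "null") get NA in that column, as in the spread() output.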

Data:

df <- structure(list(ontime = structure(c(1L, 2L, 3L, 1L, 2L), .Label = c("0", 
"1", "null"), class = "factor"), state = structure(c(1L, 1L, 
1L, 2L, 2L), .Label = c("AL", "AR"), class = "factor"), count = c(1L, 
44L, 3L, 5L, 50L)), class = "data.frame", row.names = c(NA, -5L
))
acylam