I have a dataset, df:

  state      id     year     yes
  ga         1      2020     10%
  ca         2      2020     10%
  va         1      2020     20%
  ga         1      2001     10%
  ca         2      2001     20%
  va         1      2001     10%

I want the output to look like this:

  state      id     year     yes
  ga         1      2001     10%
  ga         1      2020     10%
  ca         2      2001     20%
  ca         2      2020     10%
  va         1      2001     10%
  va         1      2020     20%

dput:

structure(list(state = structure(c(2L, 1L, 3L, 2L, 1L, 3L), .Label = c("ca", 
"ga", "va"), class = "factor"), id = c(1L, 2L, 1L, 1L, 2L, 1L), 
year = c(2020L, 2020L, 2020L, 2001L, 2001L, 2001L), yes = structure(c(1L, 
1L, 2L, 1L, 2L, 1L), .Label = c("10%", "20%"), class = "factor")), 
class = "data.frame", row.names = c(NA, -6L))

This is what I have tried:

library(dplyr)
df1<-df %>% group_by(state)
Lynn
  • You didn't explain in words what you need. You just want to order the data? Why is this the order you need? – David Arenburg May 10 '20 at 09:37
  • Hey, please check https://stackoverflow.com/questions/18839096/rearrange-a-data-frame-by-sorting-a-column-within-groups. – Rhino8 May 10 '20 at 09:38
  • Does this answer your question? [rearrange a data frame by sorting a column within groups](https://stackoverflow.com/questions/18839096/rearrange-a-data-frame-by-sorting-a-column-within-groups) – Rhino8 May 10 '20 at 09:39

2 Answers

If you want to keep the states in the order in which they first appear in the data, and sort by year within each state, we can use match and unique.

library(dplyr)
df %>% arrange(match(state, unique(state)), year)

#  state id year yes
#1    ga  1 2001 10%
#2    ga  1 2020 10%
#3    ca  2 2001 20%
#4    ca  2 2020 10%
#5    va  1 2001 10%
#6    va  1 2020 20%

In base R, we can use order:

df[with(df, order(match(state, unique(state)), year)), ]
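
To see why the match/unique key works, here is a minimal sketch that rebuilds the question's data and inspects the sort key step by step. The names first_seen, key and sorted are illustrative only, and plain character columns are used instead of factors for brevity (an assumption; the same calls also work on the factor columns from the dput output).

```r
# Rebuild the question's data; character columns instead of factors
# (an assumption made for brevity).
df <- data.frame(
  state = c("ga", "ca", "va", "ga", "ca", "va"),
  id    = c(1L, 2L, 1L, 1L, 2L, 1L),
  year  = c(2020L, 2020L, 2020L, 2001L, 2001L, 2001L),
  yes   = c("10%", "10%", "20%", "10%", "20%", "10%"),
  stringsAsFactors = FALSE
)

# unique() keeps the states in order of first appearance: "ga" "ca" "va"
first_seen <- unique(df$state)

# match() maps each state to its position in that vector: 1 2 3 1 2 3
key <- match(df$state, first_seen)

# Sort by the key first, then by year within each state
sorted <- df[order(key, df$year), ]
```

Because the key is just an integer vector, it can be passed to order or arrange alongside any other sort column.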
Ronak Shah

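We can convert state to a factor whose levels follow the order of first appearance, then sort by that factor and by year: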

library(dplyr)
df %>%
   arrange(factor(state, levels = unique(state)), year)
#  state id year yes
#1    ga  1 2001 10%
#2    ga  1 2020 10%
#3    ca  2 2001 20%
#4    ca  2 2020 10%
#5    va  1 2001 10%
#6    va  1 2020 20%

Or with base R:

df[with(df, order(factor(state, levels = unique(state)), year)), ]
akrun