Reordering a data frame according to a vector (for instance, for correct ggplot2 plotting) has come up on SO several times, for instance in this nice thread here. Yet I cannot get this to work, perhaps because some values are duplicated (at least, that is what R's warning is about). Toy example using dplyr:

require(dplyr)
set.seed(8)
df <- tbl_df(data.frame(
  v1 = rnorm(8),
  v2 = rep(rnorm(4),2),
  v3 = rep(sample(LETTERS[],4),2)))

v1 is only there so that the rows differ from one another. The levels of v3 are now:

levels(df$v3) 
[1] "A" "B" "C" "D"

I would like to reorder v3 according to v2, which contains duplicate values.

df[order(df$v2),"v2"][[1]]
[1] -3.0110517 -3.0110517 -0.7597938 -0.7597938 -0.5931743 -0.5931743  0.2920499  0.2920499

Why does this not work?

df %>%
  mutate(v3 = factor(v3, levels=df[order(df$v2),"v2"][[1]]))

UPDATE: Nor does this work:

df %>%
  mutate(v3 = factor(v3, levels=unique(df[order(df$v2),"v2"][[1]])))

Gives:

           v1         v2 v3
1 -0.08458607 -3.0110517 NA
2  0.84040013 -0.5931743 NA
3 -0.46348277 -0.7597938 NA
4 -0.55083500  0.2920499 NA
5  0.73604043 -3.0110517 NA
6 -0.10788140 -0.5931743 NA
7 -0.17028915 -0.7597938 NA
8 -1.08833171  0.2920499 NA
user3375672
  • @hrbrmstr: No (see my update) – user3375672 Jan 29 '15 at 12:47
  • You have `"v2"` in the `df[order…` but I think you want that to be `v3` - try `unique(df[order(df$v2),"v3"][[1]])`. But if you want an ordered factor, you'll need to set `ordered=TRUE` in the `factor` call. – hrbrmstr Jan 29 '15 at 12:51
  • That's a beauty! I did not notice that! If you post it as an answer I will immediately accept. – user3375672 Jan 29 '15 at 12:53

1 Answer

To avoid the "duplicate" warnings and also create an ordered factor over v3 (ordered by v2), you can do:

df %>%
  mutate(v3 = factor(v3, 
                     ordered=TRUE, 
                     levels=unique(df[order(df$v2),"v3"][[1]])))
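
As a minimal sketch of why the attempts in the question return NA while the call above works: the question takes the levels from v2, and none of those numbers equals a value of v3, so nothing matches. The toy frame below is hypothetical (fixed values instead of the seeded random ones in the question), and the base R `reorder()` call is shown only as a possible alternative, not as part of the approach above.

```r
# Hypothetical toy data: fixed values instead of the seeded random ones
toy <- data.frame(
  v2 = rep(c(0.3, -3.0, -0.8, -0.6), 2),
  v3 = rep(c("A", "B", "C", "D"), 2),
  stringsAsFactors = TRUE
)

# Levels taken from v2: no level equals any value of v3, so every entry is NA
bad <- factor(toy$v3, levels = unique(toy$v2[order(toy$v2)]))

# Levels taken from v3, listed in the order given by v2
good <- factor(toy$v3,
               ordered = TRUE,
               levels = unique(toy$v3[order(toy$v2)]))

# Base R alternative: reorder() sorts the levels by a per-level
# summary of v2 (the mean by default); the result is an unordered factor
alt <- reorder(toy$v3, toy$v2)

levels(good)
levels(alt)
```

Note that `reorder()` only changes the order of the levels, which is usually enough for ggplot2 axis ordering; `ordered=TRUE` is needed only if you want comparisons such as `<` to work on the factor.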
hrbrmstr