
I'm trying the following code:

stest <- data.frame(group=c("John", "Jane", "James"), mean=c(3, 5, 1))
transform(stest, group = reorder(group, mean))

I expect the output to be ordered by mean. Instead, I get:

  group mean
1  John    3
2  Jane    5
3 James    1

That is, the same order as in the original data frame.

Am I missing something? How do I correctly order a data frame by one of its numeric variables?

The recommendations I've found suggest using reorder, but I can't make it work as expected. Could any loaded packages interfere?

Anton Tarasenko
  • Maybe I don't get what you're after, but wouldn't `stest[order(stest$mean),]` be sufficient? – Chargaff Dec 02 '13 at 14:28
  • @Chargaff Yep, it returns the right order, but when I'm trying to use this dataframe in `ggplot`, `ggplot` still plots it in the previous order. – Anton Tarasenko Dec 02 '13 at 14:38
  • @BlueMagister from the OP's last comment it looks like it may actually be a dupe of http://stackoverflow.com/q/5208679/1317221 – user1317221_G Dec 02 '13 at 14:44
  • @user1317221_G Agreed. I cannot change my close vote to that question, however - only retract the close vote altogether. At the least, the title is ambiguous enough to point to both questions. – Blue Magister Dec 02 '13 at 14:47

3 Answers


From the documentation:

reorder is a generic function. The "default" method treats its first argument as a categorical variable, and reorders its levels based on the values of a second variable, usually numeric.

Note: it reorders the levels, not the values, of the factor variable (group in your case).

Compare:

levels(stest$group)
[1] "James" "Jane"  "John" 

with

>  reorder(stest$group, c(1,2,3))
[1] John  Jane  James
attr(,"scores")
James  Jane  John 
    3     2     1 
Levels: John Jane James
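
To make the distinction concrete, here is a minimal sketch using the data from the question. One assumption is made loudly: group is stored as a factor via stringsAsFactors = TRUE, because R 4.0 and later no longer convert strings to factors by default. It also assumes no package has registered its own reorder method for factors.

```r
# Data from the question; stringsAsFactors = TRUE is an assumption that
# mirrors the pre-4.0 default, so that group is a factor.
stest <- data.frame(group = c("John", "Jane", "James"),
                    mean  = c(3, 5, 1),
                    stringsAsFactors = TRUE)

reordered <- reorder(stest$group, stest$mean)

# The values keep their positions; only the level order differs.
as.character(stest$group)   # "John" "Jane" "James"
as.character(reordered)     # "John" "Jane" "James"
levels(stest$group)         # "James" "Jane" "John" (alphabetical)
levels(reordered)           # "James" "John" "Jane" (sorted by mean)
```

So the printed column looks identical before and after; the change is only visible through levels().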

EDIT 1

From your comment:

"@Chargaff Yep, it returns the right order, but when I'm trying to use this dataframe in ggplot, ggplot still plots it in the previous order."

It seems you actually do want to reorder the levels, for use in a ggplot. I suggest:

stest$group <- reorder(stest$group, stest$mean)
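
As a rough sketch of why sorting the rows did not change the plot: a discrete axis in ggplot2 follows the factor levels, not the row order. The plotting call is only an illustration and assumes ggplot2 is installed; group is again assumed to be a factor (stringsAsFactors = TRUE).

```r
# Data from the question, with group stored as a factor (assumption).
stest <- data.frame(group = c("John", "Jane", "James"),
                    mean  = c(3, 5, 1),
                    stringsAsFactors = TRUE)

# Sorting the rows changes their order but leaves the levels alone.
sorted <- stest[order(stest$mean), ]
levels(sorted$group)   # still alphabetical

# Reordering the levels is what changes a discrete axis.
stest$group <- reorder(stest$group, stest$mean)
levels(stest$group)    # now sorted by mean

# Illustrative plot only; skipped when ggplot2 is not installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(stest, ggplot2::aes(x = group, y = mean)) +
    ggplot2::geom_col()
}
```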

EDIT 2

Regarding your last comment that the above line of code has "no effect": it clearly does change the levels:

> stest$group
[1] John  Jane  James
Levels: James Jane John         # <-------------------------------
> stest$group <- reorder(stest$group, stest$mean)              # |
> stest$group                                                  # |
[1] John  Jane  James                                          # |
attr(,"scores")                                                # | DIFFERENT :)
James  Jane  John                                              # |
    1     5     3                                              # | 
Levels: James John Jane        # <--------------------------------
user1317221_G
  • Sorry, I didn't understand the difference. The docs say that it reorders the levels, so why isn't Jane with `5` at the top or bottom? – Anton Tarasenko Dec 02 '13 at 14:44
  • Look at my example. Your original levels are in the order `"James" "Jane" "John"`. I changed them by `1,2,3`, hence the levels, *not the data in the columns*, are now `John Jane James`. Perhaps you should read `?levels` – user1317221_G Dec 02 '13 at 14:46
  • I've tried `levels(stest$group) <- reorder(stest$group, stest$mean)` on the initial data, and it returned the same results `"John" "Jane" "James"` of `levels(stest$group)`. Can you help me to clarify why this happens? – Anton Tarasenko Dec 02 '13 at 15:03
  • use `stest$group <- reorder(stest$group, stest$mean)` – user1317221_G Dec 02 '13 at 15:17
  • Unfortunately, that has no effect. The result is identical to the initial data. – Anton Tarasenko Dec 02 '13 at 15:22
  • I get `Error in attr(, "scores") : argument 1 is empty` when I try `attr(,"scores")`. Is some package involved? – Anton Tarasenko Dec 02 '13 at 15:31
  • ...............only paste the code in front of `>`. At this point I give up. – user1317221_G Dec 02 '13 at 15:34
  • I found my mistake: http://stackoverflow.com/a/20335767/911945 . Thanks a lot for your clues. Wouldn't solve it without your help. – Anton Tarasenko Dec 02 '13 at 18:59

I think you want the order function, which returns an index, not reorder, which changes the order of factor levels. This would do it:

> stest[order(stest$mean),]
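
A short sketch of what order returns and how it is used for row indexing, on the data from the question (the variable names idx, sorted and sorted_desc are just for illustration):

```r
# Data from the question.
stest <- data.frame(group = c("John", "Jane", "James"),
                    mean  = c(3, 5, 1))

# order() returns the row positions that would sort the vector.
idx <- order(stest$mean)    # 3 1 2

# Indexing the rows with that permutation sorts the data frame.
sorted <- stest[idx, ]      # James (1), John (3), Jane (5)

# decreasing = TRUE sorts from largest to smallest.
sorted_desc <- stest[order(stest$mean, decreasing = TRUE), ]
```

Note that this sorts the rows only; if group is a factor, its levels are left as they were.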
Stephen Henderson

I've found my mistake thanks to user1317221_G and others.

The correct code that would order my dataset is:

stest$group <- reorder(stest$group, stest$mean, FUN=identity)

While

stest$group <- reorder(stest$group, stest$mean)

didn't order my data frame. I'm not sure why the default FUN = mean didn't work, but I had to specify identity.

A possible reason is described in this question: "Reordering factor gives different results, depending on which packages are loaded".
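
One way to check for that kind of interference, as a hedged sketch: look for a reorder method registered for factors by a loaded package, and compare with base R's default method called directly. In a clean session I would expect no factor method to be found; which package, if any, supplies one in your session is not something this snippet assumes.

```r
# Data from the question, with group stored as a factor (assumption).
stest <- data.frame(group = c("John", "Jane", "James"),
                    mean  = c(3, 5, 1),
                    stringsAsFactors = TRUE)

# Look for a factor-specific reorder method; NULL means none is registered.
factor_method <- utils::getS3method("reorder", "factor", optional = TRUE)
if (!is.null(factor_method)) {
  # Report which namespace the masking method comes from.
  message("reorder.factor found in: ",
          environmentName(environment(factor_method)))
}

# Calling the stats default method directly sidesteps S3 dispatch.
by_mean <- stats:::reorder.default(stest$group, stest$mean)
levels(by_mean)   # levels sorted by mean
```

If the direct call and a plain reorder() give different levels, some loaded package is changing the dispatch.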

UPDATE

The first line of code alone is not enough. reorder does not coerce its second argument to a factor, so the final ordering may be incomplete (e.g., higher values ending up below lower ones in descending order).

Therefore, to be sure you have the right order:

stest$group <- reorder(stest$group, as.factor(stest$mean), FUN=identity)
Anton Tarasenko