My data.frame is stateData. When I execute stateData[order(stateData$"heart failure"),], where "heart failure" is a column name, I get my data frame back, but the heart failure column is only partly in increasing order, like this:

10.0, 10.1, 10.3, 10.7, 15.0, 15.1, 15.9, 8.1, 8.3, 8.9, 9.0, 9.1

Here are details:
dput(head(stateData))
heart failure = structure(c(97L, 44L, 25L, 6L, 52L, 57L ), .Label = c("10.0", "7.2", "7.3", "7.4", "7.5", "7.6", "7.7", "7.8", "7.9", "8.0", "8.1", "8.2", "8.3", "8.4", "8.5", "8.6", "8.7", "8.8", "8.9", "9.0", "9.1", "9.2", "9.3", "9.4", "9.5", "9.6", "9.7", "9.8", "9.9", "Not Available"), class = "factor"),

Why is it not sorting it all the way?

Any help is appreciated! Thank you!

Edit: Here is my solution! I got it, thanks for all of the advice!

stateData[,"heart failure"] <- as.numeric(levels(stateData[,"heart failure"])[stateData[,"heart failure"]])
sortedData <- stateData[order(stateData[,"heart failure"]),]
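
For anyone who wants to try this without the original file, here is a minimal sketch of the same conversion on a toy data frame. The values and the state column are made up for illustration; only the "heart failure" column name and the "Not Available" entry come from the question.

```r
# Toy stand-in for stateData (hypothetical values, not the real data).
stateData <- data.frame(
  state = c("A", "B", "C", "D", "E"),
  `heart failure` = c("10.1", "8.3", "Not Available", "15.0", "9.1"),
  check.names = FALSE,      # keep the space in the column name
  stringsAsFactors = TRUE   # reproduce the factor column from the question
)

f <- stateData[, "heart failure"]

# Convert the level labels to numbers, then index by the factor.
# "Not Available" cannot be parsed, so it becomes NA (normally with a warning).
stateData[, "heart failure"] <- suppressWarnings(as.numeric(levels(f))[f])

# order() puts NA last by default.
sortedData <- stateData[order(stateData[, "heart failure"]), ]
sortedData[, "heart failure"]   # 8.3 9.1 10.1 15.0 NA
```

Rows that held "Not Available" end up as NA at the bottom of the sorted result.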
user1047260
  • 7
    Without seeing your data, my best guess is that your "heart failure" entries are characters. But, you should provide a reproducible example (hint: provide the output of `dput(head(stateData))` or a subset of the columns to make a minimal reproducible example). – Jota May 05 '14 at 22:04
  • @beginneR - that will throw an error as the interpreter will see `stateData$heart` and then freak out because there is some extra text just hanging around. – thelatemail May 05 '14 at 22:39
  • 1
    @user1047260 as you can see, the "Not Available" value caused the data to be stored as a factor instead of as a numeric value. You might try `stateData$"heart failure" <- as.numeric(stateData$"heart failure")` – josliber May 05 '14 at 22:58
  • That is a malformed factor variable. You need to review your data management procedures. The problem likely starts at the initial data input steps. – IRTFM May 06 '14 at 04:15

1 Answer

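The column stateData$"heart failure" is a factor, so when R sorts it, it orders by the factor's levels, which are strings stored in alphabetical order ("10.0" comes before "8.1"), not by numeric value. If you want the data sorted numerically, convert the column to numeric first: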

 stateData$"heart failure" <- as.numeric(levels(stateData$"heart failure"))[stateData$"heart failure"]
 stateData[order(stateData$"heart failure"),]
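
To see why the factor sorts this way, here is a small sketch on a made-up factor (not the real stateData column). It also shows that calling as.numeric() directly on a factor returns the internal level codes rather than the values, which is why the conversion goes through levels().

```r
# Hypothetical factor with a few of the values from the question.
f <- factor(c("10.0", "8.1", "15.0", "9.0"))

levels(f)                   # "10.0" "15.0" "8.1" "9.0"  (string order)
as.integer(f)               # 1 3 2 4  (position of each value in levels)

# order() on a factor sorts by those integer codes, hence the odd result.
sorted_labels <- as.character(f[order(f)])
sorted_labels               # "10.0" "15.0" "8.1" "9.0"

codes  <- as.numeric(f)             # 1 3 2 4: codes, not the values
values <- as.numeric(levels(f))[f]  # 10.0 8.1 15.0 9.0: the actual numbers
sort(values)                        # 8.1 9.0 10.0 15.0
```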
shirewoman2
  • close, but see http://stackoverflow.com/questions/3418128/how-to-convert-a-factor-to-an-integer-numeric-without-a-loss-of-information – GSee May 05 '14 at 23:32
  • 2
    I suppose it could be called "close" if it weren't simply wrong and that the correct answer is in the R-FAQ. – IRTFM May 05 '14 at 23:40
  • Ah, I see your point, @GSee. I wasn't thinking about the ranking problem. user1047260: I like your answer. I don't know how you're reading in your data, but using colClasses when you do so and setting the data type as you read it in might eliminate the need for it. i.e. read.csv("Data.csv", colClasses=c("character", "numeric")) – shirewoman2 May 06 '14 at 04:58
  • 2
    Laura, please edit your answer per the FAQ, i.e. `as.numeric(as.character(....))` – Carl Witthoft May 06 '14 at 11:25
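
The two suggestions in the comments above can be sketched roughly as follows. The inline CSV text and the na.strings value are assumptions for illustration. Setting colClasses to "numeric" on its own would likely fail on the "Not Available" entries, so this sketch marks that string as missing at read time instead.

```r
# Hypothetical inline CSV standing in for the real file.
csv_text <- "state,heart failure
A,10.1
B,8.3
C,Not Available
D,15.0"

# Treat "Not Available" as NA while reading, so the column parses as numeric.
stateData <- read.csv(text = csv_text, check.names = FALSE,
                      na.strings = "Not Available")
is.numeric(stateData[["heart failure"]])   # TRUE

# NA rows sort last by default.
sortedData <- stateData[order(stateData[["heart failure"]]), ]

# If the column is already a factor, the FAQ idiom converts labels, not codes.
f <- factor(c("10.1", "8.3", "15.0"))
converted <- as.numeric(as.character(f))   # 10.1 8.3 15.0
```

Handling the missing-value marker at import keeps the column numeric from the start, so no factor conversion is needed afterwards.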