I'm working with a dataset (from a biostatistics research project) that records expression ranks as values such as "1.00e3" rather than "1.00e+03", which seems to confuse the system when ranking. Does anyone have ideas on how to convert these "e" notations to standard form within the data frame? I have already tried scipen and formatC.
- Try with `options(scipen = 999)` – akrun Jan 14 '20 at 20:23
- `1.00e3` is still considered as standard. What do you mean by `non-standard`? – Onyambu Jan 14 '20 at 20:30
- `as.numeric("1.00e3")` seems to work just fine. Can you create some sort of [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and the code you are running that makes it clear what the problem is? – MrFlick Jan 14 '20 at 21:30
1 Answer
It seems you are trying to rank a character vector as if it were numeric, which can indeed go awry: character strings are sorted character by character, not by numeric value. The trick is to convert to numeric before ranking.
    x = c("1.00e3", "1.00e+04", "1.0e05")
    sort(x)
    # "1.00e+04" "1.00e3" "1.0e05"
    sort(as.numeric(x))
    # 1e+03 1e+04 1e+05
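Since the question is about a data frame, here is a minimal sketch of the same idea applied to a column. The data frame and its column names (`gene`, `expr`, `expr_rank`) are made up for illustration; substitute your own.

```r
# Hypothetical data: `gene` and `expr` are placeholder column names
df <- data.frame(
  gene = c("A", "B", "C"),
  expr = c("1.00e3", "1.00e+04", "1.0e05"),
  stringsAsFactors = FALSE
)

# as.numeric() parses all three spellings of the exponent
df$expr <- as.numeric(df$expr)

# Rank by numeric value, largest first
df$expr_rank <- rank(-df$expr)

# Show the rows in rank order
df[order(df$expr_rank), ]
```

If the column was read in as a factor rather than character, convert with `as.numeric(as.character(df$expr))` instead, since `as.numeric()` on a factor returns the internal level codes.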
If you really need the values as character strings rather than numeric (which seems unlikely), you can reformat them after converting:
    format(as.numeric(x), scipen=999)
    # [1] "1e+03" "1e+04" "1e+05"
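As far as I can tell, `scipen` is normally set through `options()` and is not picked up by `format()` here, which would explain why the output above still shows exponents. If you want explicit control over the string form, the sketch below uses the `scientific` argument of `format()` and the `format`/`digits` arguments of `formatC()`. The variable names `num`, `fixed`, and `sci` are just for illustration.

```r
x <- c("1.00e3", "1.00e+04", "1.0e05")
num <- as.numeric(x)

# Plain decimal strings with no exponent; trim = TRUE drops the padding
fixed <- format(num, scientific = FALSE, trim = TRUE)
fixed
# [1] "1000"   "10000"  "100000"

# Uniform scientific notation with two decimals, e.g. "1.00e+03"
sci <- formatC(num, format = "e", digits = 2)
sci
# [1] "1.00e+03" "1.00e+04" "1.00e+05"
```

Either way the result is character again, so do any sorting or ranking on the numeric values first and format only for display.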

dww