
I have a numeric vector imported from Excel that is formatted in a "weird" way: for example, 12.000 stands for 12000. I want to convert every value that has a decimal part to a whole number (in this example by multiplying by 1000, since R reads 12.000 as 12 and what I really want is 12000). I tried converting the vector to character and then manipulating it to add zeros. I don't think this is the best way, but what I'm trying looks like this:

vec <- c(12.000, 5.300, 5.000, 33.400, 340, 3200)
vec <- as.character(vec)

> vec
[1] "12"   "5.3"  "5"    "33.4" "340"  "3200"



x <- "([0-9]{1})"
xx <- "([0-9]{2})"
x.x <- "([0-9]{1}\\.[0-9]{1})"
xx.x <- "([0-9]{2}\\.[0-9]{1})"

I created these regular expressions so that I could set up a condition: if grepl(x, vec) is true, I do paste0(vec, "000") for the elements of vec that match. My idea is to do this for all possible cases, which are: add "000" if x or xx matches, and add "00" if x.x or xx.x matches.

Does anyone have an idea of what I could do? Is there a simpler approach?

Thank you!!

oguz ismail
rebeca
  • Please show the desired result. Why not try to read it in with `colClasses = "character"` and then use `as.numeric(gsub("[.]", "", df$V1))` – Rich Scriven Dec 15 '15 at 01:29
  • It's not clear how the content of `vec` was derived / read in? There should be an upstream solution. – Benjamin Dec 15 '15 at 01:36
  • How are you importing the data from Excel; saving it as CSV first? Is your issue because the file uses continental (European) numeric format? If so you may want to check the top answer in this question http://stackoverflow.com/questions/6123378/how-to-read-in-numbers-with-a-comma-as-decimal-separator – Ricky Dec 15 '15 at 01:49
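
As a rough sketch of what the first comment suggests: if the column can be read as text so that the dot and the trailing zeros survive, removing the dot and converting to numeric gives the intended values. The `raw` vector below is an assumption about what the strings would look like when read as character; it is not taken from the asker's file. This only works when the trailing zeros are still present (a string like "5.3" would become 53, not 5300).

```r
# Assumed input: the values as text, exactly as displayed in Excel
raw <- c("12.000", "5.300", "5.000", "33.400", "340", "3200")

# Treat "." as a thousands separator: drop it, then convert to numeric
vals <- as.numeric(gsub(".", "", raw, fixed = TRUE))
vals
```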

1 Answer


You need to read the vector as character in the first place. If you read it as numeric, R will interpret each value as a number and drop the decimal point along with any trailing zeros.

df <- read.csv(text= "Index, Vec
           1, 12.00
           2, 5.3
           3, 5
           4, 33.4
           5, 340
           6, 3200",
           colClasses = c("numeric", "character"))


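# Flag the entries that contain a decimal point
isDot <- grepl("\\.", df$Vec)
# Scale only those entries by 1000 (the result is coerced back to character here)
df$Vec[isDot] <- as.numeric(df$Vec[isDot])*1000
# Convert the whole column to numeric
df$Vec <- as.numeric(df$Vec)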
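
The same idea can be wrapped in a small function so it can be applied to any character column. This is only a sketch: `scale_dotted` and its `factor` argument are hypothetical names introduced here, and it assumes, as above, that every value containing a dot should be multiplied by 1000 while values without a dot are already correct.

```r
# Hypothetical helper (illustrative, not part of base R)
scale_dotted <- function(x, factor = 1000) {
  x <- trimws(x)                          # drop stray whitespace from the import
  has_dot <- grepl(".", x, fixed = TRUE)  # TRUE where the string contains a dot
  out <- as.numeric(x)
  out[has_dot] <- out[has_dot] * factor   # scale only the dotted entries
  out
}

# Same values as in the example data above, kept as character
vec <- c("12.00", "5.3", "5", "33.4", "340", "3200")
res <- scale_dotted(vec)
res
```

Doing the arithmetic on a separate numeric vector avoids the intermediate coercion back to character that happens when numbers are assigned into a character column.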
jMathew