I'm trying to convert a column of money amounts to numeric values. A much simplified version of my data is:
SoccerPlayer = c("A","B","C","D","E")
Value = c("10K","25.5K","1M","1.2M","0")
database = data.frame(SoccerPlayer,Value)
I'm facing the following issue. If there were no decimal points and all the amounts used the same unit, such as only K (thousands) or only M (millions), this would work perfectly:
library(stringi)
database$Value = as.numeric(gsub("K","000",database$Value))
But since there are both K and M values in my data, I'm trying to write it like this:
library(stringi)
if(stri_sub(database$Value,-1,-1) == 'M'){
  database$Value = gsub("M","000000",database$Value)
}
if(stri_sub(database$Value,-1,-1) == 'K'){
  database$Value = gsub("K","000",database$Value)
}
as.numeric(database$Value)
This produces the following warning messages:
Warning message:
In if (stri_sub(database$Value, -1, -1) == "M") { :
the condition has length > 1 and only the first element will be used
Warning message:
In if (stri_sub(database$Value, -1, -1) == "K") { :
the condition has length > 1 and only the first element will be used
Warning message:
NAs introduced by coercion
After running this, the data looks like this:
> print(database$Value)
[1] "10000" "25.5000" "1M" "1.2M" "0"
Only the K (thousands) values were converted. I also don't know how to handle the decimal point, which produces results like "25.5000" or "1.2000000" (had the M conversion worked).
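To make the goal concrete, here is a minimal sketch of the kind of element-wise conversion I'm after. It assumes the only suffixes are a trailing K (thousands) or M (millions) and that anything else is already a plain number. The helper name `parse_money` is hypothetical, and this may well not be the best way to do it:

```r
# Hypothetical helper: convert strings such as "25.5K" or "1.2M" to numbers.
# Assumption: the only suffixes are a trailing "K" (thousands) or "M" (millions).
parse_money <- function(x) {
  # Pick a multiplier per element; values without a suffix are multiplied by 1.
  multiplier <- ifelse(grepl("M$", x), 1e6,
                ifelse(grepl("K$", x), 1e3, 1))
  # Strip the trailing K or M, convert what is left, then scale it.
  # Multiplying (rather than appending zeros) keeps "25.5K" from becoming "25.5000".
  as.numeric(sub("[KM]$", "", x)) * multiplier
}

SoccerPlayer <- c("A", "B", "C", "D", "E")
Value <- c("10K", "25.5K", "1M", "1.2M", "0")
# stringsAsFactors = FALSE keeps Value as character on older R versions.
database <- data.frame(SoccerPlayer, Value, stringsAsFactors = FALSE)

database$Value <- parse_money(database$Value)
print(database$Value)
```

With the sample data this should give 10000, 25500, 1000000, 1200000 and 0. Unlike `if`, both `ifelse` and `grepl` work on every element of the column at once.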
I'm new to programming, and any help or thoughts on how to solve this would be much appreciated.