I have a data frame df_before in which one of the columns contains values such as:
id
123456789
1.11E+2
3.52E+4
5.60E+5
0001112345857RAE
and I would like to convert them, in df_after, to:
id
123456789
111
35200
560000
0001112345857RAE
Basically, I want to strip the period (.) and replace any E+XX with the number of 0's implied by the exponent. This is what I have tried:
library(stringr)  # provides str_detect()
df_after$id <- ifelse(str_detect(df_before$id, "E\\+\\d+$"),
                      gsub("E\\+\\d+",
                           strrep("0", as.numeric(gsub(".*E\\+(\\d+)$", "\\1", df_before$id)) - 2),
                           gsub("\\.", "", df_before$id)),
                      df_before$id)
Each smaller piece of the above code works on a single input. For example, this:
strrep("0", as.numeric(gsub(".*E\\+(\\d+)$", "\\1", "6.32E+3")))
results in:
"000" # which is as expected
also:
gsub("E\\+\\d+",
     strrep("0", as.numeric(gsub(".*E\\+(\\d+)$", "\\1", "6.32E+3")) - 2),
     gsub("\\.", "", "6.32E+3"))
gives:
"6320" # as expected and desired
But when I apply it to the whole column using ifelse and str_detect (which itself works as expected for the entries containing E+XX), it runs very slowly and returns NAs along with some values like 6320NA000NA000NA000NA000....<truncated>
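A quick check on a few of the values hints at where the NAs and the slowness might come from. This is only a sketch of a possible cause, not a confirmed diagnosis, and the vector ids below is a made-up sample built from the values shown above. ifelse() evaluates the whole "yes" expression on the full column, so the exponent-extracting gsub also runs on rows that have no E+XX:

```r
# Hypothetical sample, built from the values shown above
ids <- c("6.32E+3", "123456789", "0001112345857RAE")

# Rows without E+XX do not match the pattern, so gsub() returns them unchanged
extracted <- gsub(".*E\\+(\\d+)$", "\\1", ids)

# as.numeric() then yields a huge count for "123456789" and NA for the
# alphanumeric id (the coercion warning is suppressed here)
n_zeros <- suppressWarnings(as.numeric(extracted))

# Show how many zeros strrep() would be asked to build for each row
print(data.frame(id = ids, extracted = extracted, n_zeros = n_zeros))
```

So for the second row strrep() would be asked for over a hundred million zeros, and for the third row the count is NA. Separately, as far as I can tell gsub() only uses the first element of a replacement vector, so it cannot apply a different replacement to each row.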
Could someone please assist me in fixing this block of code so it will work with the dataframe column?
Thank you so much!
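For reference, here is a minimal base R sketch of the behaviour I am after, in case it helps to frame an answer. The helper name expand_sci is hypothetical (not an existing function), and it assumes that only values that are entirely of the form digits.digitsE+digits should be converted. Instead of hard-coding "- 2", it counts the digits after the period, and it leaves a value untouched if the exponent is too small to give a whole number.

```r
# Hypothetical helper, not from any package: expands strings such as
# "3.52E+4" to "35200" and leaves every other value unchanged.
expand_sci <- function(x) {
  x <- as.character(x)
  out <- x

  # Touch only values that are entirely digits[.digits]E+digits
  is_sci <- grepl("^[0-9]+(\\.[0-9]+)?E\\+[0-9]+$", x)
  sci <- x[is_sci]

  mantissa <- sub("E\\+[0-9]+$", "", sci)               # e.g. "3.52"
  exponent <- as.integer(sub("^.*E\\+", "", sci))       # e.g. 4
  digits   <- sub(".", "", mantissa, fixed = TRUE)      # e.g. "352"
  n_frac   <- nchar(sub("^[0-9]+\\.?", "", mantissa))   # e.g. 2

  # Zeros to append; a negative count would mean a non-integer result,
  # so those values are left as they are
  n_zero <- exponent - n_frac
  ok <- !is.na(n_zero) & n_zero >= 0

  expanded <- sci
  expanded[ok] <- paste0(digits[ok], strrep("0", n_zero[ok]))
  out[is_sci] <- expanded
  out
}

# Sample values from the question
ids <- c("123456789", "1.11E+2", "3.52E+4", "5.60E+5", "0001112345857RAE")
print(expand_sci(ids))
```

With a data frame this would be used as df_after$id <- expand_sci(df_before$id). Working on the matching subset only avoids running strrep() on rows that have no exponent.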