I have been given an Excel file extracted from an SQL database, so the numerical fields are presented with a comma as the decimal separator, e.g. 12345,67.

I import the file in R 3.2.5 using library(readxl) and it correctly displays the numerical field as 12345.67.

I set the comma as the decimal separator in the options because I need the reports to be printed in that format. However, I can't find a way to set the dot (".") as the thousands separator. I have tried format and formatC, and converting their output back to numeric results in

"Warning message:
NAs introduced by coercion"

warning. Both format and formatC return character strings, not numeric values.

Please use this mock data frame for reproducible research.

variable=c("a","b","c","d","e","f","g","h","i","j")
value=c(0.00,196036.95,196036.95, 10244.29,90470.73,33926.53,37142.59,6.65,15.40,180941.47)
df=data.frame(variable, value)
#View the data frame

print(df)

#a quick bar plot (requires ggplot2)
library(ggplot2)
plot=ggplot(data=df, aes(x=variable, y=value)) +  geom_bar(stat="identity")
plot

#set comma as decimal point. We need dot as thousands separator.
options(OutDec= ",")

print(df)  # here the numbers are correctly presented as extracted from the SQL database
plot

This is what happens when I convert the output of format (or formatC) back to numeric:

as.numeric(format(df$value, big.mark = "."))
 [1] NA NA NA NA NA NA NA NA NA NA
Warning message:
NAs introduced by coercion 
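
For context, the NAs appear because as.numeric() only parses strings that use "." as the decimal mark and contain no grouping marks, so "196.036,95" cannot be read as a number. The following is a minimal sketch of how such strings could be converted back, assuming the dot is only ever a thousands separator and the comma only ever a decimal mark. The helper name parse_eu is hypothetical and not part of any package.

```r
# Hypothetical helper: turn strings such as "196.036,95" back into numbers.
# Assumes "." is only used for grouping and "," only as the decimal mark.
parse_eu <- function(x) {
  x <- trimws(x)                       # drop the padding added by format()
  x <- gsub(".", "", x, fixed = TRUE)  # remove thousands separators
  x <- sub(",", ".", x, fixed = TRUE)  # use "." as the decimal mark
  as.numeric(x)
}

formatted <- format(c(196036.95, 10244.29, 6.65),
                    big.mark = ".", decimal.mark = ",")

# Direct coercion fails: the string is not in a form as.numeric() understands.
direct <- suppressWarnings(as.numeric(formatted))

# Stripping the separators first recovers the original values.
values <- parse_eu(formatted)
```

Note that a numeric vector has no separators of its own; the thousands and decimal marks only exist in the character representation produced for display.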

I need the final results to be reported in this format, but as numeric values, so I can generate plots and tables:

 format(df$value, big.mark = ".")
 [1] "      0,00" "196.036,95" "196.036,95" " 10.244,29" " 90.470,73" " 33.926,53" " 37.142,59" "      6,65" "     15,40" "180.941,47"

EDIT: My sessionInfo()

R version 3.2.5 (2016-04-14)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

locale:
[1] LC_COLLATE=Greek_Greece.1253  LC_CTYPE=Greek_Greece.1253    LC_MONETARY=Greek_Greece.1253 LC_NUMERIC=C                  LC_TIME=Greek_Greece.1253    

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] pander_0.6.0   scales_0.4.1   forcats_0.2.0  png_0.1-7      gtable_0.2.0   cowplot_0.6.3  ggpubr_0.1.0   zoo_1.8-0      gtools_3.5.0   reshape2_1.4.2
[11] gridExtra_2.3  gridBase_0.4-7 ggplot2_2.2.1  readxl_0.1.1  

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.10     magrittr_1.5     munsell_0.4.3    colorspace_1.3-2 lattice_0.20-33  stringr_1.2.0    plyr_1.8.4       tools_3.2.5      yaml_2.1.14     
[10] lazyeval_0.2.0   digest_0.6.12    tibble_1.3.0     labeling_0.3     stringi_1.1.5   
KRStam
  • If you want your variable to print with a given format but actually be numeric underneath, you can define your own print method. This might help: https://stackoverflow.com/questions/28159936/formatting-large-currency-or-dollar-values-to-millions-billions – user20650 Feb 23 '18 at 11:30
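
A rough sketch of what the comment above seems to suggest: an S3 class that stays numeric underneath but prints with "." as thousands separator and "," as decimal mark. The class name eu_num and its constructor are made up for illustration, and the two-decimal display is an assumption. This is not a complete implementation; for example, subsetting with `[` drops the class unless a matching method is added.

```r
# Hypothetical S3 class "eu_num": doubles underneath, formatted on display.
eu_num <- function(x) {
  stopifnot(is.numeric(x))
  structure(x, class = "eu_num")
}

# format() method: fixed notation, two decimals, European separators.
format.eu_num <- function(x, ...) {
  formatC(unclass(x), format = "f", digits = 2,
          big.mark = ".", decimal.mark = ",")
}

# print() method: show the formatted strings, return the object unchanged.
print.eu_num <- function(x, ...) {
  print(format(x), quote = FALSE)
  invisible(x)
}

v <- eu_num(c(0, 196036.95, 6.65))
print(v)                    # displayed with "." and "," separators
total <- sum(unclass(v))    # arithmetic still uses the underlying doubles
```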

1 Answer


You were close, you just need to specify this in scale_y_continuous:

plot + scale_y_continuous(labels = function(x) format(x, big.mark = ".", decimal.mark = ","))
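
The same idea may carry over to a report table: keep the column numeric for plotting and calculations, and convert it to formatted strings only in a copy that is used for display. Below is a minimal sketch under that assumption. The helper fmt_eu and the data frame report are illustrative names, and the two-decimal format is assumed from the question's sample output.

```r
# Mock data in the spirit of the question; value stays numeric throughout.
df <- data.frame(
  variable = c("a", "b", "c", "d"),
  value    = c(0.00, 196036.95, 10244.29, 6.65)
)

# Hypothetical helper: two decimals, "." for thousands, "," for decimals.
fmt_eu <- function(x) {
  formatC(x, format = "f", digits = 2, big.mark = ".", decimal.mark = ",")
}

# Format only a copy meant for display; df itself is left untouched.
report <- df
report$value <- fmt_eu(df$value)

print(report, row.names = FALSE)
```

With this split, df$value can still feed ggplot or any summary, while report holds character strings intended only for the printed table.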
Hugh
  • Thank you. I need however the results to be presented in a table for reporting with dot as thousands separator and comma as decimal separator. – KRStam Feb 20 '18 at 08:57