I have a character vector of raw text that mixes English and French words, so some entries contain accented characters, like this:
1 entretien ménager
2 concepteur réseaux
3 service à la clientèle
4 sécurité
5 infirmière auxiliaire
6 opérateur de machinerie en usine
7 consultant stratégique
8 ménage
9 ingénieur civil, gérant projet
10 éducatrice
The command Encoding(variable) tells me that there's a mix of 'unknown' and UTF-8 encodings. All of the ones above are coded as UTF-8.
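As a side note on that mix, here is a minimal sketch of how it can be inspected. It assumes the 'unknown' entries are simply the ones made up of ASCII characters only, since R never marks pure-ASCII strings with an encoding. The vector below is illustrative and uses \u escapes so the snippet does not depend on how the script file itself is saved.

```r
# Illustrative vector (same words as in the reproduction);
# \u escapes always yield strings marked as UTF-8
vec <- c("s\u00e9curit\u00e9", "service \u00e0 la client\u00e8le",
         "assembleur", "labour")

# Declared encoding of each element
enc <- Encoding(vec)
print(enc)

# Flag elements that contain at least one byte above 127 (i.e. non-ASCII)
non_ascii <- vapply(
  vec,
  function(s) any(as.integer(charToRaw(s)) > 127L),
  logical(1),
  USE.NAMES = FALSE
)

# If the assumption holds, only the non-ASCII elements are marked "UTF-8"
print(data.frame(vec, enc, non_ascii, stringsAsFactors = FALSE))

# Check that every element is a valid UTF-8 byte sequence
all_valid <- all(validUTF8(vec))
```

If enc is "UTF-8" exactly where non_ascii is TRUE, the mix of 'unknown' and UTF-8 is expected behaviour rather than a sign of corrupted data.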
This code replicates the problem on my mac:
library(foreign)  # loaded in my session; write.csv() itself comes from utils
vec <- c('sécurité', 'service à la clientèle', 'assembleur', 'labour')
write.csv(data.frame(vec), file = '~/Desktop/test.csv')
I have tried the same with write_excel_csv() and I get the same results.
I can only assume this is some kind of problem with the UTF-8 encoding, but I can't see how to figure this out.
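One way to narrow this down might be to check whether the bytes on disk are already correct. The sketch below is only a diagnostic under assumptions: it writes to a temporary file (a hypothetical stand-in for ~/Desktop/test.csv), passes fileEncoding = "UTF-8" explicitly, reads the file back while declaring it as UTF-8, and looks at the first three bytes for a byte order mark.

```r
# Illustrative data; \u escapes keep the script independent of file encoding
vec <- c("s\u00e9curit\u00e9", "service \u00e0 la client\u00e8le",
         "assembleur", "labour")

# Hypothetical output location, used here instead of ~/Desktop/test.csv
path <- tempfile(fileext = ".csv")

# Write with an explicit file encoding
write.csv(data.frame(vec, stringsAsFactors = FALSE), file = path,
          row.names = FALSE, fileEncoding = "UTF-8")

# Read the file back, declaring its contents as UTF-8
back <- read.csv(path, encoding = "UTF-8", stringsAsFactors = FALSE)

# TRUE means the text survived the write/read round trip unchanged
round_trip_ok <- identical(enc2utf8(back$vec), enc2utf8(vec))
print(round_trip_ok)

# Inspect the first three bytes: EF BB BF would be a UTF-8 byte order mark
first_bytes <- readBin(path, what = "raw", n = 3)
has_bom <- identical(first_bytes, as.raw(c(0xEF, 0xBB, 0xBF)))
print(first_bytes)
print(has_bom)
```

If round_trip_ok is TRUE, the file itself is probably fine, and the issue is more likely in how the program opening it guesses the encoding than in R's output.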
Thank you.
Results of sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.4
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8
attached base packages:
[1] grid stats graphics grDevices utils datasets methods base
other attached packages:
[1] readr_1.1.1 bindrcpp_0.2.2 labelled_1.0.0 haven_1.1.1.9000
[5] survey_3.32-1 survival_2.41-3 Matrix_1.2-10 car_2.1-5
[9] stargazer_5.2 foreign_0.8-69 tidyr_0.8.0 dplyr_0.7.4
[13] ggplot2_2.2.1
loaded via a namespace (and not attached):
[1] Rcpp_0.12.16 pillar_1.2.1 compiler_3.4.1 nloptr_1.0.4
[5] plyr_1.8.4 bindr_0.1.1 forcats_0.3.0 tools_3.4.1
[9] lme4_1.1-13 tibble_1.4.2 gtable_0.2.0 nlme_3.1-131
[13] lattice_0.20-35 mgcv_1.8-17 pkgconfig_2.0.1 rlang_0.2.0
[17] cli_1.0.0 rstudioapi_0.7 yaml_2.1.18 parallel_3.4.1
[21] SparseM_1.77 hms_0.4.1 MatrixModels_0.4-1 nnet_7.3-12
[25] glue_1.2.0 R6_2.2.2 minqa_1.2.4 purrr_0.2.4
[29] magrittr_1.5 scales_0.5.0 MASS_7.3-47 splines_3.4.1
[33] assertthat_0.2.0 pbkrtest_0.4-7 colorspace_1.3-2 quantreg_5.33
[37] utf8_1.1.3 lazyeval_0.2.0 munsell_0.4.3 crayon_1.3.4
I should add that I have looked at some of the related issues on GitHub and SO, such as this, this, and this, but have not found my answer.