My system is Windows 10, with R 3.5.3 and RStudio 1.1.463. The locale is as
below:
> Sys.getlocale()
[1] "LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252"
My classmate gave me a UTF-8 CSV file, sample.csv, produced on a Linux system. The file can be reproduced with the PHP script below:
<?php
$a = array(
    'col1' => 12,
    'col2' => 'Y',
    'col3' => '<p style="text-align: center;">
<strong style="text-align: center;"><span style="color: rgb(105, 105, 105); font-family: verdana, arial, sans-serif; font-size: 13px;">版权</span></strong></p>
<p>
<span style="color: rgb(105, 105, 105); font-family: verdana, arial, sans-serif; font-size: 13px;">bla</span></p>
<p>
<span style="color: rgb(105, 105, 105); font-family: verdana, arial, sans-serif; font-size: 13px;"><img alt="" src="/functions/2.jpg" style="width: 400px; height: 500px;" /></span></p>
<p>
<span style="color: rgb(105, 105, 105); font-family: verdana, arial, sans-serif; font-size: 13px;">bla</span></p>
',
    'col4' => '<br />
');
// Write the header row, then the data row, to sample.csv.
$fp = fopen("sample.csv", "wb");
$question_list_cols = array('col1', 'col2', 'col3', 'col4');
fputcsv($fp, $question_list_cols);
// fputcsv() returns false on failure.
if (!fputcsv($fp, array_values($a))) {
    echo "fail<br />";
}
fclose($fp);
?>
When I read sample.csv in R with df <- read.csv("sample.csv", header=TRUE), I get the error: invalid input found on input connection.
I tried the answers to similar questions on SO, but none of them worked.
The problem is caused by the Chinese characters 版权. Everything works when I remove them.
How can I read a UTF-8 CSV file containing Chinese characters in R?
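
One thing that may be worth checking is sketched below. It has not been confirmed on this setup. It first verifies that the file's bytes are valid UTF-8, then reads it with the encoding argument of read.csv, which marks the strings as UTF-8 rather than re-encoding them into the CP1252 locale. The path variable, the working-directory location of the file, and stringsAsFactors = FALSE are assumptions made for illustration.

```r
# Assumption: sample.csv is in the current working directory.
path <- "sample.csv"

# Read the raw bytes and check that they form valid UTF-8.
# validUTF8() requires R >= 3.3.0.
bytes <- readBin(path, what = "raw", n = file.info(path)$size)
print(validUTF8(rawToChar(bytes)))

# encoding = "UTF-8" only marks the input strings as UTF-8;
# it does not convert them to the native (CP1252) encoding.
df <- read.csv(path, header = TRUE, encoding = "UTF-8",
               stringsAsFactors = FALSE)

# Inspect how R has marked the column that holds the Chinese text.
print(Encoding(df$col3))
```

If the first check does not print TRUE, the file itself would be the thing to look at before changing how it is read.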