0

I've spent a fair bit of time reviewing and I have probably missed something but here goes.

I get a csv from a vendor that is UTF-8 encoded. Just to be sure I open in in notepad and save as utf-8. My php admin Server charset is: cp1252 West European (latin1). I am on a shared server and this cannot be changed. (I believe this is causing the issue. I have this same code running on another server that is set to UTF-8 and it works correctly)

The Server Connection Collation is utf8mb4_unicode_ci

I have a PHP script and form that is trying to upload the file to a table.
The table collation is utf8_general_ci. All the column collations are set to utf_8_general_ci.

About 50% of the rows import correctly, but I get many errors in fields that have quotation marks, apostrophe's and special characters.

The Initial Character set in my code is reporting Latin 1.

I am trying to figure out what I need to do here. If I use the phpmyadmin import everything imports and displays fine.

Somehow I assume I have to code the utf8-file as cp1252. I see lots of examples converting cp1252 to UTF-8 but not the other way around if this is indeed what I need to do.

I've learned a lot reading these forums but am stumped here. Any help appreciated.

Edit: I don't have enough points to respond so thanks all. As some imports worked and some didn't I got stuck on this process, but see the issue now. Not as smart as I thought obviously. I feel like this should be deleted now as it may lead others astray with poor code. I will do so if advised.

Edit 2: I have it working with prepared statements now and am posting the code below so it may help others. I still need to add confirmation (ie(your import was successful etc).


error_reporting(E_ALL);
include "conn.php";


echo "Initial character set is: " . $conn -> character_set_name();

if(isset($_POST["import"])){
    $fileName = $_FILES["file"]["tmp_name"];
mysqli_query($conn,'TRUNCATE TABLE data');
    if($_FILES["file"]["size"] > 0){
    $file = fopen($fileName, "r");
    fgets($file);  // read one line for nothing (skip header)

        while(($column = fgetcsv($file, 0, ",")) !== FALSE){
        $sqlInsert = "Insert into data (ID, Fname, Lname,  ShortBiography ) 
        
        
        values ('"  . $column[0] ."','" . $column[1] ."','" . $column[2] ."','" . $column[3] "')";
        
        $result = mysqli_query($conn, $sqlInsert);
                 * if ( ! mysqli_query( $conn, $sqlInsert ) ) {
                printf( "Error: %s\n", mysqli_error( $conn ) );
                } */ 
            if(!empty($result)){
            echo "";
            }else{
            echo "<br />Problem in importing - ".$column[1]." /<br> ";
            
            }
 
        }
    }
}
?>
<form class="form-horizontal" action="" method="post" name="uploadCsv" enctype="multipart/form-data">
<div>
<label>Choose CSV File</label>
<input type="file" name="file" accept=".csv">
<button type="submit" name="import">Import</button>
</div>
</form>

Edit 2 working code:

<?php
error_reporting(E_ALL);
include "conn.php";


echo "Initial character set is: " . $conn -> character_set_name();

if(isset($_POST["import"])){
    $fileName = $_FILES["file"]["tmp_name"];
mysqli_query($conn,'TRUNCATE TABLE data');
    if($_FILES["file"]["size"] > 0){
    $file = fopen($fileName, "r");
    fgets($file);  // read one line for nothing (skip header)

        while(($column = fgetcsv($file, 0, ",")) !== FALSE){
        $stmt = $conn->prepare("Insert into data (ID, Fname, Lname,  ShortBiography )  VALUES (?,?,?,?)");
        $stmt->bind_param("isss",$column[0], $column[1], $column[2], $column[3]);
        $stmt->execute();
        }
    }
}
?>
<form class="form-horizontal" action="" method="post" name="uploadCsv" enctype="multipart/form-data">
<div>
<label>Choose CSV File</label>
<input type="file" name="file" accept=".csv">
<button type="submit" name="import">Import</button>
</div>
</form>```

rookiest
  • 19
  • 4
  • It's not an encoding issue, it's how you're building the query. You should bind the queries or escape them at the very least. Reference this for binding: https://stackoverflow.com/questions/38919714/using-bind-param-with-mysqli-query – Marshall C Jul 06 '22 at 01:58
  • And use a phpadmin which support UTF-8. Shared server? Should not matter. Just install you own instance on your working space. It is just a PHP program which it is also free software (and open source), so you can download and install freely anywhere you want (also multiple instance on the same machine). CP1252 is not Latin1 (try not to confuse encoding, or things become more complex). Saving again the file in UTF-8 is often worse (but if notepad is magic and can detect the initial encoding): windows programs tend to add unpredictably BOM (and most other program dislike it). – Giacomo Catenazzi Jul 06 '22 at 06:43
  • **Warning:** You are wide open to [SQL Injections](https://php.net/manual/en/security.database.sql-injection.php) and should use parameterized **prepared statements** instead of manually building your queries. They are provided by [PDO](https://php.net/manual/pdo.prepared-statements.php) or by [MySQLi](https://php.net/manual/mysqli.quickstart.prepared-statements.php). Never trust any kind of input! Even when your queries are executed only by trusted users, [you are still in risk of corrupting your data](http://bobby-tables.com/). [Escaping is not enough!](https://stackoverflow.com/q/32391315) – Dharman Jul 06 '22 at 10:06

0 Answers0