4

I have been handed this project that has a large number of issues with it. One of them is that the end-user is uploading a CSV file that is exported directly from MS Access into a directory on their web server. The next step is to truncate a couple of database tables and then insert all the records from the CSV into the database.

However, there is an issue with the apostrophe character that MS Access is using for apostrophes. It isn't a single quote ', nor is it a double quote ". It is an apostrophe. Like this:

"Rock/Classic 80’s-90’s"

Now, I have tried the following in my PHP to strip them out:

$d = str_replace("’", "", $d);
$d = str_replace(array("'", "\'", "\’", "’"), "", $d);

However, this does not seem to work. In fact, when running SQL queries based off of this data, it always seems to somehow convert the ’ into ' without stripping them out, and then causing a SQL error since it thinks that the string has been terminated early.

This is one of the code blocks I am using:

$band_insert = "INSERT INTO `schedule` (`Band`, `Date`, `Genre`, `Club`, `Location`, `Venue`, `Time`) VALUES ( '%s', '%s', '%s', '%s', '%s', '%s', '%s' )";
$result = $mysqli->query('TRUNCATE TABLE `schedule`');
if(!$result) die('Truncate error');

if( ($handle=fopen('./export/schedule.csv', 'r')) !== FALSE)
{
    while( ($data=fgetcsv($handle, 1000, ',', '"', '\\')) !== FALSE )
    {
        foreach($data as $d) 
        {
            $d = str_replace("’", "", $d);
            # For debugging purposes only
            echo "<p>$d</p>";
        }
        $sql = sprintf($band_insert, $data[0], $data[1], $data[2], $data[3], $data[4], $data[5], $data[6]);
        #$sql = $mysqli->real_escape_string($sql);
        $result = $mysqli->query($sql);
        if( ! $result ) $log[] = lg("Unable to perform query ($mysqli->errno): $mysqli->error");
    }
    $log[] = lg("Successful upload (".date("Y-m-d").").");

    fclose($handle);
}

The question becomes, why is this not working? When I echo out the $d value, it prints a ? in a square. Even with header('Content-type: text/html; charset=utf-8'); at the top of the file.

Jim P
  • 534
  • 2
  • 8
  • 24
  • You should be able to leverage this answer: http://stackoverflow.com/questions/1262038/how-to-replace-microsoft-encoded-quotes-in-php – blearn Jun 15 '12 at 17:23
  • possible duplicate: http://stackoverflow.com/questions/1318869/mysql-choking-on-curly-smart-quotes – Alexander Jun 15 '12 at 17:30
  • I've tried both of those solutions, neither of which seem to work correctly. – Jim P Jun 15 '12 at 17:36

3 Answers3

4

I've had some similar hurdles with Access and Excel, and use this that I picked up somewhere to scrub the MS characters (so all due credit to it's original author). Perhaps you can use it as is, or adapt accordingly:

// First, replace UTF-8 characters.
$text = str_replace(
array("\xe2\x80\x98", "\xe2\x80\x99", "\xe2\x80\x9c", "\xe2\x80\x9d", "\xe2\x80\x93", "\xe2\x80\x94", "\xe2\x80\xa6"),
array("'", "'", '"', '"', '-', '--', '...'),
$text);

// Next, either REPLACE their Windows-1252 equivalents.
$text = str_replace(
array(chr(145), chr(146), chr(147), chr(148), chr(150), chr(151), chr(133)),
array("'", "'", '"', '"', '-', '--', '...'),
$text);

// OR, STRIP their Windows-1252 equivalents.
$text = str_replace(
array(chr(145), chr(146), chr(147), chr(148), chr(150), chr(151), chr(133)),
array('', '', '', '', '', '', ''),
$text);
GDP
  • 8,109
  • 6
  • 45
  • 82
  • This works like a charm. The only issue being that it now refuses to remove the ' from parts of the data. – Jim P Jun 15 '12 at 17:48
  • Not following you....example? Just adjusted the answer to STRIP those characters instead of REPLACING them. Is that what you mean? – GDP Jun 15 '12 at 18:06
  • I tried that also, but for some reason it is still adding a ' character in the spot where the apostrophes were. When running a different script, it is saying that there is an invalid UTF-8 character on each of those lines. I took the route of stripping all of the items from but the UTF and the Windows replaces. – Jim P Jun 15 '12 at 18:51
  • It's actually still putting in the MS versions of those symbols for some reason. – Jim P Jun 15 '12 at 18:57
  • So, found the issue with this being the pass by value in the foreach loop. Changed `foreach($data as $d)` to `foreach($data as &$d)` and it works great. Thanks! – Jim P Jun 15 '12 at 21:03
0

I think there is something wrong you had written echo "<p>$d</p>"; .I think this should be

echo "<p>".$d."</p>";
Rakesh Shetty
  • 4,548
  • 7
  • 40
  • 79
0

I took your code and got it to work using a variable.

$string = "Rock/Classic 80’s-90’s";
$replace = "’";
$string = str_replace($replace, "", $string);
echo "<p>$string</p>";
Cameron Chapman
  • 796
  • 9
  • 19