This one made me doubt my sanity. I'm working on a simple script that reads URLs from a CSV file and then runs a regex against them, but my pattern kept failing. On closer inspection I noticed something strange: the strings that fgetcsv() returns show a completely wrong length when displayed with var_dump(). Any idea what is going on here and how to sanitize these strings?
Sample code:
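<?php
// Open the CSV file for reading
$read = fopen("input.csv", "r");
// fgetcsv() returns false at end of file, so compare strictly
while (($data = fgetcsv($read, null, ",", "\"", "\\")) !== false) {
    var_dump($data[0]);
    echo mb_detect_encoding($data[0]);
    echo "\n";
}
fclose($read);
?>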
The output looks like this:
string(25) "/index.html"
ASCII
string(23) "/login.html"
ASCII
string(15) "/insta/"
ASCII
The visible strings match what is in my CSV file, but as you can see, the reported string length isn't right: "/index.html" is 11 characters, yet var_dump() reports 25. What is going on here? Are there invisible characters I'm not seeing? Is it some strange encoding problem? How can I fix this?
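One way to check for invisible characters is to dump each field's raw bytes instead of relying on how the string is rendered. The following is only a minimal sketch: `describe_bytes()` and `strip_non_printable()` are hypothetical helper names that are not part of the script above, the sample value is made up, and the stripping step assumes the URLs are plain printable ASCII (which the `ASCII` result from mb_detect_encoding() suggests but does not prove).

```php
<?php
// Hypothetical helper (not from the original script): show a field's
// raw bytes so anything invisible in the terminal or browser shows up.
function describe_bytes(string $value): string
{
    // strlen() counts bytes, which is the number var_dump() reports.
    $hex = implode(' ', str_split(bin2hex($value), 2));
    return sprintf('%d bytes: %s', strlen($value), $hex);
}

// Hypothetical helper: keep only printable ASCII (0x20 to 0x7E).
// Assumes the URLs are plain ASCII; adjust the range otherwise.
function strip_non_printable(string $value): string
{
    return preg_replace('/[^\x20-\x7E]/', '', $value);
}

// Made-up sample value containing a NUL byte and a trailing tab.
$sample = "/in\0sta/\t";
echo describe_bytes($sample), "\n";
echo describe_bytes(strip_non_printable($sample)), "\n";
```

Calling `describe_bytes($data[0])` inside the loop should show which extra bytes account for the difference between the visible text and the reported length. Stripping is best done only after the hex dump has shown what the extra bytes actually are, since blindly removing them can hide the real cause.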