The encoding of the file name in the $_FILES['file'] global variable in PHP is different when a file is uploaded from a Mac than when it is uploaded from a Windows computer.

I'm developing a script using PHP 7 on a newly updated Debian/Apache2. It worked before the update, but now fails for files uploaded from a Mac.

The script receives an uploaded file and stores the file name in a database (MariaDB/MySQL):

$filename = "$_FILES['files']";
$sql = "INSERT INTO upload_files (filename) VALUES ('$filename')";

When the filename contains non-ASCII characters, such as æøå, this query executes correctly if the file is uploaded from a Windows computer. But when the file is uploaded from a Mac, the encoding is different, and the query fails:

#1366 - Incorrect string value '\xCC\x8A...' for column filename....
user29809
  • `INSERT INTO upload_files (filename) VALUES ('$filename')`. Why do you have quotes around `$_FILES['files']`? And you know that will probably be an array, right? This will also be vulnerable to SQL injection attacks. You also mention two index names for your file upload, `files` and `file`, which is it? – Jonnix Aug 13 '19 at 10:35
  • This isn't the original code. I wrote it directly to illustrate the problem: filenames are encoded differently when files are uploaded from a Mac vs. a Windows PC. Sorry for the sloppy writing. – user29809 Aug 13 '19 at 10:44
  • https://stackoverflow.com/questions/10957238/incorrect-string-value-when-trying-to-insert-utf-8-into-mysql-via-jdbc may be relevant. – Jonnix Aug 13 '19 at 10:45
  • **WARNING**: Whenever possible use **prepared statements** to avoid injecting arbitrary data in your queries and creating [SQL injection bugs](http://bobby-tables.com/). These are quite straightforward to do in [`mysqli`](http://php.net/manual/en/mysqli.quickstart.prepared-statements.php) and [PDO](http://php.net/manual/en/pdo.prepared-statements.php) where any user-supplied data is specified with a `?` or `:name` indicator that’s later populated using `bind_param` or `execute` depending on which one you’re using. – tadman Aug 13 '19 at 18:23
  • You really need to kick the habit of stringifying things for no reason. Don't do `"$x"`, just do `$x`. Your `$_FILES` thing here is surrounded in quotes which can actually mangle the meaning badly. PHP is not bash. – tadman Aug 13 '19 at 18:24

1 Answer

Hex CC8A is the UTF-8 encoding of U+030A "COMBINING RING ABOVE". A combining mark follows its base character, so perhaps there is a vowel just before it?

Perhaps something is defaulting to latin1 or some other non-utf8 encoding. It is hard to say what because it is a filename, not data in a table.

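Hex C385 is the UTF-8 encoding of Å as a single precomposed character (U+00C5). It is canonically equivalent to 41CC8A, which is A followed by a combining ring (the decomposed form). If you were to use the former, it could (if appropriate) be converted to a latin1 character.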
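
The two byte sequences above can be compared, and the decomposed form converted to the precomposed one, with a short PHP sketch. It assumes the `intl` extension is installed (it provides the `Normalizer` class); the helper name `normalize_filename` is hypothetical and not part of the original script. Whether this alone resolves the #1366 error depends on the character set of the column and the connection, so treat it as an illustration rather than a confirmed fix.

```php
<?php
// Sketch only: requires the intl extension, which provides the Normalizer class.

// Decomposed form: "A" followed by U+030A COMBINING RING ABOVE.
$decomposed = "A\u{030A}";
// Precomposed form: the single code point U+00C5.
$precomposed = "\u{00C5}";

echo bin2hex($decomposed), "\n";   // 41cc8a
echo bin2hex($precomposed), "\n";  // c385

// Hypothetical helper: convert a file name to the precomposed form (NFC)
// before it is stored.
function normalize_filename(string $name): string
{
    // Normalizer::normalize() returns false on failure (for example on
    // invalid UTF-8); in that case the name is returned unchanged.
    $nfc = Normalizer::normalize($name, Normalizer::FORM_C);
    return $nfc === false ? $name : $nfc;
}

echo bin2hex(normalize_filename($decomposed)), "\n"; // c385
```

In an upload handler this might be applied as `normalize_filename($_FILES['file']['name'])`, assuming the form field is named `file`, with the result then passed to a prepared statement rather than interpolated into the SQL string.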

Rick James