%-encoding is for URLs. Filenames are not URLs. You use the form:
http://example.org/images/113_Atl%C3%A9tico%20Madrid.png
in the URL, and the web server will decode that to a filename something like:
/var/www/example-site/data/images/113_Atlético Madrid.png
You should use rawurlencode() when you're preparing the filename to go in a URL, but you shouldn't use it to prepare the filename for disc storage.
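As a minimal sketch of that split: the raw name goes into the path on disc, and rawurlencode() is applied only when building the URL. The directory and base URL are taken from the examples above; the variable names are made up for illustration.

```php
<?php
// "\u{E9}" is é written as an escape, so the bytes are UTF-8 (C3 A9)
// regardless of how this source file happens to be saved.
$filename   = "113_Atl\u{E9}tico Madrid.png";
$storageDir = '/var/www/example-site/data/images'; // illustrative path
$baseUrl    = 'http://example.org/images';         // illustrative URL

// Path on disc: the filename is used as-is, with no %-encoding.
$path = $storageDir . '/' . $filename;

// URL: %-encode the filename as a single path segment.
$url = $baseUrl . '/' . rawurlencode($filename);

echo $path, "\n";
echo $url, "\n";
```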
There is an additional problem here: storing non-ASCII filenames on disc is unreliable across platforms. Especially if you run on a Windows server, PHP file APIs like move_uploaded_file() can very likely use an encoding that you didn't want, and you might end up with a mangled filename like 113_AtlÃ©tico Madrid.png.
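To illustrate how that kind of mangling can arise, here is a small sketch that deliberately misreads the UTF-8 bytes of the name as Windows-1252. The choice of Windows-1252 is an assumption for the example; the code page actually in play depends on the server's configuration. It needs the iconv extension.

```php
<?php
// UTF-8 encoded name; "\u{E9}" is é (bytes C3 A9).
$utf8Name = "113_Atl\u{E9}tico Madrid.png";

// Pretend the bytes are Windows-1252 and convert them to UTF-8 for display.
// Each of the two bytes of é becomes its own character.
$misread = iconv('Windows-1252', 'UTF-8', $utf8Name);

echo $misread, "\n";
```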
There isn't necessarily an easy fix to this, but you could use any form of encoding, even %-encoding. So if you stuck with your current rawurlencode() for making filenames:
/var/www/example-site/data/images/113_Atl%C3%A9tico%20Madrid.png
that would be OK, but you would then have to apply rawurlencode() twice to generate the matching URL:
http://example.org/images/113_Atl%25C3%25A9tico%2520Madrid.png
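A rough sketch of that double encoding, with rawurldecode() standing in for the single decoding step the web server performs (the variable names are illustrative):

```php
<?php
$originalName = "113_Atl\u{E9}tico Madrid.png"; // "\u{E9}" is é in UTF-8

// Name stored on disc: %-encoded once, so it is plain ASCII.
$storedName = rawurlencode($originalName);

// URL path segment: encode the stored name again, so that one round of
// decoding gives back the stored (still %-encoded) name.
$urlSegment = rawurlencode($storedName);

echo $storedName, "\n";
echo $urlSegment, "\n";

// One decode of the URL segment should match the name on disc.
var_dump(rawurldecode($urlSegment) === $storedName);
```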
But in any case, it's very risky to include potentially-user-supplied arbitrary strings as part of a filename. You may be open to directory traversal attacks, where the name contains a string like /../../ to access the filesystem outside of the target directory. (And these attacks commonly escalate to execute-arbitrary-code attacks for PHP apps, which are typically deployed with weak file permissions.) You would be much better off using an entirely synthetic name, as suggested (+1) by @MatthewBrown.
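One possible way to build such a synthetic name is sketched below. The function name and the extension whitelist are assumptions made for this example, not something from the question. Only a whitelisted extension survives from the client-supplied name; nothing else the user typed reaches the filesystem. Note that checking the extension says nothing about the file's actual content.

```php
<?php
/**
 * Build a filename that contains no user-supplied text.
 * Hypothetical helper: the name and the whitelist are illustrative.
 */
function syntheticUploadName(string $clientName): string
{
    $allowed = ['png', 'jpg', 'gif']; // assumed whitelist

    // Take only the extension from the client-supplied name.
    $ext = strtolower(pathinfo($clientName, PATHINFO_EXTENSION));
    if (!in_array($ext, $allowed, true)) {
        throw new InvalidArgumentException('Unsupported file type');
    }

    // 16 random bytes rendered as 32 hex characters.
    return bin2hex(random_bytes(16)) . '.' . $ext;
}

// A traversal attempt in the client name has no effect on the result.
$name = syntheticUploadName('../../etc/passwd.png');
echo $name, "\n";
```

The result could then be passed as the destination name to move_uploaded_file(), with the original name kept in a database column if it needs to be shown to users.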
(Note this still isn't the end of security problems with allowing user file uploads, which it turns out is a very difficult feature to get right. There are still issues with content-sniffing and plugins that can allow image files to be re-interpreted as other types of file, resulting in cross-site scripting issues. To prevent all possibility of this it is best to only serve user-supplied files from a separate hostname, so that XSS against that host doesn't get you XSS against the main site.)