
I'm building a webpage and have a problem with file upload: umlauts in the file name get turned into odd characters.

For example, when I upload a file called "töö.docx" and look at its name in the upload folder, it appears as "tƶƶ.docx".

When I print the file name in index.php, it shows the correct name, "töö.docx".

But if I go into the upload folder, manually rename "tƶƶ.docx" to "töö.docx", and then print the file name in index.php, it shows "t��.docx", which is wrong.

Here is the code for upload in index.php:

<form method="post" enctype="multipart/form-data">
  <strong>File upload:</strong>
  <small>(max 8 Mb)</small>
  <input type="file" name="fileToUpload" required>
  <input type="submit" value="Upload" name="submit">
</form>

And here is the upload controller code:

$doc_list = array();
foreach (new DirectoryIterator('uploads/') as $file) {
    if ($file->isDot() || !$file->isFile()) continue;
    $doc_list[] = $file->getFilename();
}

$target_dir = "uploads/";
$target_file = $target_dir . basename( isset($_FILES["fileToUpload"]["name"]) ? $_FILES["fileToUpload"]["name"] : "");
$file = isset($_FILES["fileToUpload"]) ? $_FILES["fileToUpload"] : "";
$up_this = isset($_FILES["fileToUpload"]["tmp_name"]) ? $_FILES["fileToUpload"]["tmp_name"] : "";
$file_name = isset($_FILES["fileToUpload"]["name"]) ? $_FILES["fileToUpload"]["name"] : "";

if (!empty($file)) {
    if(isset($_POST["submit"])) {
        if (file_exists($target_file)) { // check the destination path, not the bare client file name
            echo "File already exists.";
            exit;
        } else {
            $upload =  move_uploaded_file($up_this, $target_file);
            if ($upload) {
                echo "File ". '"' . basename($file_name). '"' . " has been uploaded";
            } else {
                echo "Could not upload file";
                exit;
            }
        }
    }
}

I use the variable $doc_list to print the names of the documents in the folder in index.php:

<div>
    <?php if (!empty($doc_list)) foreach ($doc_list as $doc_name) { ?>
        <tr>
            <td><?= $doc_name ?></td>
        </tr>
    <?php } ?>
</div>

I've set the website charset to utf-8, and I still don't know why it's not displaying the correct file name with umlauts.

Phil
User_T
  • Have you tried wrapping the filename in `utf8_encode(..)`? – Rizky Fakkel May 20 '15 at 23:06
  • That fixed the file name on the form but not the file name in the folder. – User_T May 20 '15 at 23:11
  • When you say *"look at the name in the uploaded folder"*, how are you doing that? Also, how is `$doc_list` populated? – Phil May 20 '15 at 23:14
  • "look at the name in the uploaded folder": by that I mean when I go into that folder, the umlauts in the file name have been replaced by those weird characters. `$doc_list` is an array from the controller that holds all the file names in the uploads folder. – User_T May 20 '15 at 23:19
  • Program: PhpStorm; `$doc_list` is an array that gets its items from a foreach construct and uses the `getFilename` function. – User_T May 20 '15 at 23:31
  • Sorry if my English is not so good. I use Windows 8.1, PhpStorm encoding is set to utf-8, and the `$doc_list` source and everything else I use is already in the first post, but if you missed it, this is all I use: `$doc_list = array(); foreach (new DirectoryIterator('uploads/') as $file) { if ($file->isDot() || !$file->isFile()) continue; $doc_list[] = $file->getFilename(); }` – User_T May 20 '15 at 23:49
  • My apologies, I did miss `$doc_list` at the top of your script. I assume you have `<meta charset="utf-8">` in your page's `<head>` section? – Phil May 21 '15 at 00:12
  • Possible duplicate of [how to iterate over non-English file names in PHP](http://stackoverflow.com/questions/2947941/how-to-iterate-over-non-english-file-names-in-php) – Phil May 21 '15 at 00:14
  • Every page I have has charset="utf-8" in it. I think something else is messing with the encoding, either during `move_uploaded_file` or somewhere else. – User_T May 21 '15 at 00:18
  • I think it's easier just to use `preg_replace` to remove the umlauts. – User_T May 21 '15 at 00:50
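
The idea in the last comment, storing an ASCII-only name instead of the original one, could look roughly like the sketch below. The function name `ascii_upload_name` and the replacement map are hypothetical and only cover a few common letters. The source file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper: reduce an uploaded file name to plain ASCII.
// Assumes this source file and the incoming name are both UTF-8.
function ascii_upload_name($name)
{
    // Transliterate a few common letters; extend the map as needed.
    $map = array(
        'ä' => 'a', 'ö' => 'o', 'ü' => 'u', 'õ' => 'o',
        'Ä' => 'A', 'Ö' => 'O', 'Ü' => 'U', 'Õ' => 'O',
        'ß' => 'ss',
    );
    $name = strtr($name, $map);

    // Replace every remaining byte outside a conservative safe set.
    // Without the /u flag the pattern works byte by byte, so an unmapped
    // multi-byte character becomes one underscore per byte.
    return preg_replace('/[^A-Za-z0-9._-]/', '_', $name);
}
```

In the controller above this would be applied when building `$target_file`, for example `$target_dir . ascii_upload_name($file_name)`. The trade-off is that the original spelling is lost, so it only helps if the displayed name does not need the umlauts.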

1 Answer


Try adding accept-charset="UTF-8" to the form, like this:

<form method="post" enctype="multipart/form-data" accept-charset="UTF-8">
Michal Przybylowicz
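
For context on the duplicate linked in the comments: PHP versions before 7.1 on Windows pass file names to the file system in the system's ANSI code page rather than UTF-8, which would explain both the garbled name in the folder and the broken display after a manual rename. A minimal sketch of converting in both directions follows. The helper names `to_fs_name` and `from_fs_name` are hypothetical, and the code page `CP1252` is an assumption that has to be replaced with whatever the server actually uses.

```php
<?php
// Assumption: the Windows server uses the Windows-1252 ANSI code page.
// Replace this with the code page the machine actually uses.
const FS_CHARSET = 'CP1252';

// Hypothetical helper: UTF-8 name (as sent by the browser) to file system bytes.
// iconv() returns false if a character has no equivalent in FS_CHARSET.
function to_fs_name($utf8Name)
{
    return iconv('UTF-8', FS_CHARSET, $utf8Name);
}

// Hypothetical helper: file system bytes (as returned by getFilename()) to UTF-8.
function from_fs_name($fsName)
{
    return iconv(FS_CHARSET, 'UTF-8', $fsName);
}
```

With the question's controller this would mean moving the upload to `$target_dir . to_fs_name($file_name)` and filling the list with `from_fs_name($file->getFilename())`. On PHP 7.1 or later, or on a file system that stores UTF-8 names, the conversion should not be needed.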