
I am creating a site where authenticated users can write messages for the index page.

On the message creation page I have one textbox for the title of the message and another for the message text.

The message is exported to a .txt file, and I derive the name of the .txt file from the title, like this:

Title: This is a message (The filename will be: thisisamessage.txt)

The original title is stored in a database record, along with the .txt filename as the path.

For converting the title text I am using a function that looks like this:

function filenameconverter($title){
    $filename=str_replace(" ","",$title);
    $filename=str_replace("ű","u",$filename);
    $filename=str_replace("á","a",$filename);
    $filename=str_replace("ú","u",$filename);
    $filename=str_replace("ö","o",$filename);
    $filename=str_replace("ő","o",$filename);
    $filename=str_replace("ó","o",$filename);
    $filename=str_replace("é","e",$filename);
    $filename=str_replace("ü","u",$filename);
    $filename=str_replace("í","i",$filename);
    $filename=str_replace("Ű","U",$filename);
    $filename=str_replace("Á","A",$filename);
    $filename=str_replace("Ú","U",$filename);
    $filename=str_replace("Ö","O",$filename);
    $filename=str_replace("Ő","O",$filename);
    $filename=str_replace("Ó","O",$filename);
    $filename=str_replace("É","E",$filename);
    $filename=str_replace("Ü","U",$filename);
    $filename=str_replace("Í","I",$filename);
    return $filename;
}

It works fine most of the time, but sometimes it fails. For example, the title "Pamutkéztörlő adagoló és higiéniai kéztörlő adagoló" should be saved as:

pamutkeztorloadagoloeshigieniaikeztorloadagolo.txt

Most of the time it is. But sometimes, when I enter the same title, the filename comes out as:

pamutkă©ztă¶rlĺ‘adagolăłă©shigiă©niaikă©ztă¶rlĺ‘adagolăł.txt

I'm Hungarian, so the titles will be in Hungarian too; that's why I have to replace these characters.

I'm using XAMPP with Apache and phpMyAdmin.

Isaac Bennetch
zoldingo
  • So, why are you saving it to a .txt file? The database can use a UTF-8 character encoding, meaning you don't have to do such a replace, and it would make more sense to just get the submitted text from the database. Or do you have a specific reason not to do that? – Florian Humblot Feb 24 '17 at 07:21
  • That will solve the problem, I think, but any idea why the above-mentioned function is not working well? – zoldingo Feb 24 '17 at 07:53
  • It's a homebrew function, so I wouldn't know exactly what is wrong, but this answer to a similar question could prove useful: http://stackoverflow.com/questions/1525830/how-do-i-use-filesystem-functions-in-php-using-utf-8-strings – Florian Humblot Feb 24 '17 at 07:55

2 Answers


I would rather use a generated unique ID for each file as its filename and save the real name in a separate column. This way you avoid someone overwriting files by simply uploading them several times. But if title-based names are what you want, you will find several approaches to cleaning filenames here on SO; one very good one that I have used is http://cubiq.org/the-perfect-php-clean-url-generator
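
As a rough sketch of the unique-ID idea (the function name `generateStorageName()` and the `title`/`path` keys are made up for illustration, and `random_bytes()` needs PHP 7 or later), it could look something like this:

```php
<?php
// Hypothetical helper: builds a random file name that does not depend on the title.
function generateStorageName(string $extension = 'txt'): string
{
    // 16 random bytes become 32 hexadecimal characters (plain ASCII).
    return bin2hex(random_bytes(16)) . '.' . $extension;
}

$title = 'Pamutkéztörlő adagoló és higiéniai kéztörlő adagoló';
$storageName = generateStorageName();

// What would go into the database row (column names are assumptions):
// the title stays exactly as the user typed it, only the generated
// name is ever used on the filesystem.
$record = ['title' => $title, 'path' => $storageName];

echo $record['path'], PHP_EOL;
```

Since the filename is plain ASCII hex, the accented characters never reach the filesystem at all, and two messages with the same title no longer collide.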

Alex Odenthal

intl

I don't think it is advisable to use str_replace manually for this purpose. You can use the bundled intl extension, available as of PHP 5.3.0. Make sure the extension is enabled in your XAMPP settings.

Then use the transliterator_transliterate() function to transform the string. It can also convert the string to lowercase in the same call. Credit goes to simonsimcity.

<?php
$input = 'Pamutkéztörlő adagoló és higiéniai kéztörlő adagoló';
$output = transliterator_transliterate('Any-Latin; Latin-ASCII; lower()', $input);
print(str_replace(' ', '', $output)); //pamutkeztorloadagoloeshigieniaikeztorloadagolo
?>
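
If you want this as a drop-in replacement for the function in the question, a minimal sketch might look like the following. The name `titleToFilename()` and the fallback map are my own assumptions, not anything from the manual; the fallback is only there so the sketch still runs when intl is not enabled, and it assumes the script file itself is saved as UTF-8.

```php
<?php
// Hypothetical helper: turns a UTF-8 title into an ASCII-only .txt file name.
function titleToFilename(string $title): string
{
    $ascii = false;

    if (function_exists('transliterator_transliterate')) {
        // intl is enabled: let ICU remove the diacritics and lowercase the text.
        $ascii = transliterator_transliterate('Any-Latin; Latin-ASCII; Lower()', $title);
    }

    if ($ascii === false) {
        // Fallback without intl: map the Hungarian accented letters by hand.
        // This only matches if this file and $title are both UTF-8.
        $map = [
            'á' => 'a', 'é' => 'e', 'í' => 'i', 'ó' => 'o', 'ö' => 'o',
            'ő' => 'o', 'ú' => 'u', 'ü' => 'u', 'ű' => 'u',
            'Á' => 'A', 'É' => 'E', 'Í' => 'I', 'Ó' => 'O', 'Ö' => 'O',
            'Ő' => 'O', 'Ú' => 'U', 'Ü' => 'U', 'Ű' => 'U',
        ];
        $ascii = strtolower(strtr($title, $map));
    }

    // Keep only lowercase ASCII letters and digits; spaces and punctuation are dropped.
    return preg_replace('/[^a-z0-9]+/', '', $ascii) . '.txt';
}

echo titleToFilename('Pamutkéztörlő adagoló és higiéniai kéztörlő adagoló'), PHP_EOL;
// pamutkeztorloadagoloeshigieniaikeztorloadagolo.txt
```

Note that different titles can still map to the same filename (for example titles that differ only in punctuation), so you may want to check whether the file already exists before writing.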

P.S. Unfortunately, the PHP manual page for this function doesn't list the available transliterator IDs, but you can take a look at Artefacto's answer here.


iconv

Using iconv leaves stray quote marks where the diacritics were, which is probably not what you expect.

print(iconv("UTF-8","ASCII//TRANSLIT",$input)); //Pamutk'ezt"orl"o adagol'o 'es higi'eniai k'ezt"orl"o adagol'o

mb_convert_encoding

Converting the encoding from a Hungarian ISO charset to ASCII or UTF-8 also gives problems similar to the ones you mentioned.

print(mb_convert_encoding($input, "ASCII", "ISO-8859-16")); //Pamutk??zt??rl?? adagol?? ??s higi??niai k??zt??rl?? adagol??
print(mb_convert_encoding($input, "UTF-8", "ISO-8859-16")); //PamutkéztörlŠadagoló és higiéniai kéztörlŠadagoló
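
The garbled results above are what a byte-level encoding mismatch looks like. I can't confirm that this is what happens in your setup, but it would also explain why str_replace() sometimes misses: it compares raw bytes, so a search string saved in one encoding never matches input that arrives in another. A small sketch (the helper name `isValidUtf8()` is hypothetical, and the byte escapes just spell out "kéz" in the two encodings):

```php
<?php
// "é" is two bytes in UTF-8 but a single byte in ISO-8859-2.
$utf8Title   = "k\xC3\xA9z"; // "kéz" encoded as UTF-8
$latin2Title = "k\xE9z";     // "kéz" encoded as ISO-8859-2

// The search string as it would look in a source file saved as UTF-8.
$search = "\xC3\xA9";

$fromUtf8   = str_replace($search, 'e', $utf8Title);   // bytes match, "é" is replaced
$fromLatin2 = str_replace($search, 'e', $latin2Title); // bytes differ, nothing is replaced

var_dump($fromUtf8 === 'kez');          // bool(true)
var_dump($fromLatin2 === $latin2Title); // bool(true)

// Hypothetical helper: preg_match() with the /u modifier fails on invalid UTF-8,
// which makes it usable as a cheap validity check before doing any replacing.
function isValidUtf8(string $text): bool
{
    return preg_match('//u', $text) === 1;
}

var_dump(isValidUtf8($utf8Title));   // bool(true)
var_dump(isValidUtf8($latin2Title)); // bool(false)
```

If the check fails for some submissions, it may be worth making sure that the form page, the PHP file and the database connection all use UTF-8.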

P.S. Similar questions can also be found here and here.

Community
Rheza