Because the files were compressed several times to move them from one server to another, a huge number of images had their names changed from text to Unicode escape codes in the U+0600 range (which is Arabic, by the way).

Here is a sample

#U062a#U0637#U0628#U064a#U0642#U0627#U062a-#U0645#U062c#U0627#U0646#U064a#U0629-#U0644#U0644#U062a#U0644#U0648#U064a#U0646.jpg

I used this tool to convert it: http://www.branah.com/unicode-converter. I had to delete the "#" characters first, though.

The problem is that there are more than 500 files, probably up to 1,000. I am using WordPress, and all the files are on the server.

Is there any way to convert them, perhaps using PHP or some other script?


Update 01:

Since I am using CentOS, I found this useful tool. It's called convmv.

Here's a link to the tool: https://www.j3e.de/linux/convmv/

It's a Perl script, and here is a list of its options: https://www.j3e.de/linux/convmv/man/

The problem is that I still don't know which encoding to convert from and which to convert to.

Does anyone have any experience with this?


Update 02: Trying to run the Script provided by Kenosis

I first checked the script's syntax:

# perl -wc perl_script.pl
perl_script.pl syntax OK

Then I ran the script without the syntax check:

# perl -w perl_script.pl
Testing: #U0627#U0644#U0623#U064a#U0628#U0627#U062f-Air-150x150.png -> lfybd-Air-150x150.png
Testing: #U0627#U0644#U0623#U064a#U0628#U0627#U062f-Air-244x300.png -> lfybd-Air-244x300.png
Testing: #U0627#U0644#U0623#U064a#U0628#U0627#U062f-Air-332x190.png -> lfybd-Air-332x190.png
Testing: #U0627#U0644#U0623#U064a#U0628#U0627#U062f-Air-518x400.png -> lfybd-Air-518x400.png
Testing: #U0627#U0644#U0623#U064a#U0628#U0627#U062f-Air.png -> lfybd-Air.png
File 'perl_script.pl' not in convertible format!
Done!

Your help is much appreciated. Thanks.

Bilal Khoukhi

2 Answers

Perhaps the following will be helpful:

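use strict;
use warnings;
use open qw(:std :utf8);    # print the decoded names as UTF-8

my $rename = 0;             # 0 = dry run (report only), 1 = actually rename

# Process every entry in the current directory.
for my $oldFileName (<*>) {
    my $newFileName = $oldFileName;

    # Replace each "#Uxxxx" escape (four hex digits) with the character it
    # encodes; the rest of the name is left untouched.
    $newFileName =~ s/#U([0-9a-f]{4})/chr( hex $1 )/gei;

    # Nothing was replaced, so there is nothing to rename.
    if ( $newFileName eq $oldFileName ) {
        warn "File '$oldFileName' not in convertible format!\n";
        next;
    }

    # Never overwrite an existing file.
    if ( -e $newFileName ) {
        warn "File '$newFileName' already exists!\n";
        next;
    }

    print $rename ? 'Renaming: ' : 'Testing: ';
    print "$oldFileName -> $newFileName\n";
    rename $oldFileName, $newFileName if $rename;
}

print "Done!\n\n";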

Run this on a test or backup directory first!

Place the script into the directory where the files need to be renamed, then invoke it as follows:

perl script.pl

The script reads all the file names in the current directory. The substitution replaces each #Uxxxx escape with the Unicode character it encodes and leaves the rest of the name (Latin characters, digits, the extension) untouched. You're warned if a file name isn't in a convertible format or if the new name already exists; both checks are there for safety.

By default, $rename is set to zero (false), so you can do a non-invasive run to see the renaming results. Set $rename to 1 or a non-zero value to do the actual renaming.
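
For readers who would rather not use Perl, the same substitution can be sketched in Python. This is only an illustration, not part of the script above: it assumes every escape is `#U` followed by exactly four hex digits, as in the samples from the question, and the name `decode_escapes` is made up for this example.

```python
import re

# Assumption: each escape is "#U" followed by exactly four hex digits,
# as in the sample file names from the question.
ESCAPE = re.compile(r"#U([0-9a-fA-F]{4})")


def decode_escapes(name: str) -> str:
    """Replace every #Uxxxx escape with the character it encodes.

    Anything that is not an escape (Latin letters, digits, the file
    extension) is returned unchanged.
    """
    return ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), name)
```

Calling `decode_escapes` on a name without any escapes returns it unchanged, which is how a rename loop can tell that a file needs no renaming.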

Hope this helps!

Kenosis
  • It gave me an Internal Error: "The server encountered an internal error or misconfiguration and was unable to complete your request." – Bilal Khoukhi Dec 22 '13 at 23:09
  • I don't think so... is that an RPM on CentOS? – Bilal Khoukhi Dec 22 '13 at 23:31
  • @BilalKhoukhi - It looks like there's an rpm of it [here](http://rpmfind.net//linux/RPM/mandriva/devel/cooker/sparcv9/media/contrib/release/perl-Text-Unidecode-0.40.0-2.noarch.html), but you could also do the following at the command line: `sudo cpan install Text::Unidecode` and you should be set. – Kenosis Dec 22 '13 at 23:45
  • I installed Test::Unicode, and still get the Internal Server Error – Bilal Khoukhi Dec 22 '13 at 23:52
  • Actually no, I didn't before, and that was my fault. I didn't know I should run it from the command line. Now I did: I first tested with perl -wc, then I ran it with perl -w. I updated my question so you can see the result. – Bilal Khoukhi Dec 23 '13 at 00:02
  • I need to point out that I need it to be converted to Arabic (U+0600 - U+06FF). I notice that the script converted it to Latin letters. Example: lfybd-Air-150x150.png – Bilal Khoukhi Dec 23 '13 at 00:08
  • @BilalKhoukhi - My apologies, as I may have misunderstood. You just need it converted to Arabic? – Kenosis Dec 23 '13 at 00:12
  • @BilalKhoukhi - For example: #U062a#U0637#U0628#U064a#U0642#U0627#U062a-#U0645#U062c#U0627#U0646#U064a#U0629-#U0644#U0644#U062a#U0644#U0648#U064a#U0646.jpg -> تطبيقات-مجانية-للتلوين.jpg – Kenosis Dec 23 '13 at 00:17
  • Yes, only to Arabic, and I need it to ignore the numbers that are added at the end of the file name. I mean, I want them to stay there unchanged. The name may also contain Latin characters, but those aren't messed up, so they need to be ignored as well. This is an example: #U0627#U0644#U0623#U064a#U0628#U0627#U062f-Air-150x150.png – Bilal Khoukhi Dec 23 '13 at 00:40
  • @BilalKhoukhi - Updated the script. Example: #U0627#U0644#U0623#U064a#U0628#U0627#U062f-Air-150x150.png -> الأيباد-Air-150x150.png `Text::Unidecode` wasn't needed after all... – Kenosis Dec 23 '13 at 00:46
  • I ran the script over ssh and I see the result there, as shown in my update, except this time the characters are ?????, which are Arabic and just can't be shown in the ssh window. The problem is that I don't see the files updated; their names are still the same. What's happening? – Bilal Khoukhi Dec 23 '13 at 00:53
  • @BilalKhoukhi - 1) Your console's not able to display the Unicode characters. 2) You need to set `my $rename = 1;` for the renaming to occur. – Kenosis Dec 23 '13 at 00:55
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/43716/discussion-between-kenosis-and-bilal-khoukhi) – Kenosis Dec 23 '13 at 01:04
  • It actually worked when I created a directory, put 5 files in it, and tested. However, when I went to the directory that contains all the messed-up files, it didn't work. I have to tell you that it also contains files with Latin characters and Arabic characters. So why is it ignoring the files that need to be changed? – Bilal Khoukhi Dec 23 '13 at 01:07

Referring to this answer, you can use this simple function:

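<?php
    // Expects a string of 4-digit hex code points only, e.g. "062a0637"
    // (any "#U" markers must already have been removed).
    function uni2arabic($uni_str)
    {
        $All = '';  // initialise the result to avoid an undefined-variable notice
        for ($i = 0; $i < strlen($uni_str); $i += 4)
        {
            // Build an HTML entity such as "&#x062a;" and decode it to UTF-8.
            $new = "&#x" . substr($uni_str, $i, 4) . ";";
            $txt = html_entity_decode($new, ENT_COMPAT, "UTF-8");
            $All .= $txt;
        }

        return $All;
    }
?>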

You can then use a foreach loop over all the files, and it will convert the Unicode codes to Arabic text for you.

Enijar
  • Is this an automated script? I mean, I want something that I just point at the files and it converts the characters. I don't want to do it manually; it's a very big number of files. I am only interested in changing the names of the files, that's all. – Bilal Khoukhi Dec 22 '13 at 11:54