
I struggle to create directory names containing Unicode. I am on Windows XP and Perl Camelbox 5.10.0.

Until now I used make_path from File::Path to create directories, which worked fine until the first Cyrillic directory name appeared.

For files Win32API::File qw ( CreateFileW ) works fine if the file name is UTF-16LE encoded. Is there something similar for directories? Or maybe a parameter to tell CreateFileW to create the Unicode path if it doesn't exist?

Thanks,
Nele

Nele Kosog
  • See also http://stackoverflow.com/questions/1721807/how-do-i-create-then-use-long-windows-paths-from-perl – Sinan Ünür Feb 02 '10 at 14:43
  • @Sinan: If I read it correctly, your example below is a small part of the answer you gave in the aforementioned thread. I read the thread briefly when it came up on SO in November but didn't remember it. Thanks! – Nele Kosog Feb 03 '10 at 07:09

2 Answers


Win32.pm provides an interface to CreateDirectory and friends:

Win32::CreateDirectory(DIRECTORY)

Creates the DIRECTORY and returns a true value on success. Check $^E on failure for extended error information.

DIRECTORY may contain Unicode characters outside the system codepage. Once the directory has been created you can use Win32::GetANSIPathName() to get a name that can be passed to system calls and external programs.
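Since the question was about replacing make_path, here is a minimal sketch of how Win32::CreateDirectory might be called once per path level. The helper names path_prefixes and make_unicode_path are made up for this illustration, and the code assumes relative or drive-letter paths only (no UNC paths). It also assumes that $^E compares numerically to the Win32 error code, with 183 being ERROR_ALREADY_EXISTS.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical helper: list each directory level of $path, shortest first.
# Assumes a relative path or a drive-letter path such as E:\a\b (no UNC paths).
sub path_prefixes {
    my ($path) = @_;
    my @prefixes;
    my $current = '';
    for my $part (grep { length } split /[\\\/]+/, $path) {
        $current = length($current) ? "$current\\$part" : $part;
        # A bare drive letter ("E:") is not a directory we need to create.
        push @prefixes, $current unless $current =~ /^[A-Za-z]:$/;
    }
    return @prefixes;
}

# Hypothetical make_path-style wrapper around Win32::CreateDirectory.
sub make_unicode_path {
    my ($path) = @_;
    require Win32;    # loaded lazily so the file still compiles elsewhere
    for my $dir (path_prefixes($path)) {
        next if Win32::CreateDirectory($dir);
        # 183 is ERROR_ALREADY_EXISTS; note it is also reported when a
        # plain file of that name exists, which this sketch does not detect.
        next if $^E == 183;
        die "Failed to create '$dir': $^E";
    }
    return 1;
}

# Only attempt the real call on Windows.
if ($^O eq 'MSWin32') {
    make_unicode_path('data\\Москва\\2010');
}
```

Creating one level at a time keeps the sketch close to what make_path does, at the cost of one API call per path component.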

Previous answer:

Note: Keeping this here for the record because you were trying to use CreateDirectoryW directly in your program.

To do this by hand, import CreateDirectoryW using Win32::API:

Win32::API->Import(
    Kernel32 => qq{BOOL CreateDirectoryW(LPWSTR lpPathNameW, VOID *p)}
);

You need to encode the $path for CreateDirectoryW:

#!/usr/bin/perl

use strict; use warnings;
use utf8;    # this source file is saved as UTF-8

use Encode qw( encode );
use Win32::API;

# Import the wide-character (Unicode) variant from kernel32.dll
Win32::API->Import(
    Kernel32 => qq{BOOL CreateDirectoryW(LPWSTR lpPathNameW, VOID *p)}
);

binmode STDOUT, ':utf8';
binmode STDERR, ':utf8';

my $dir_name = 'Волгогра́д';

# The W functions expect a NUL-terminated UTF-16LE string
my $wide_path = encode('UTF-16LE', "$dir_name\0");
CreateDirectoryW($wide_path, undef)
    or die "Failed to create directory: '$dir_name': $^E";

E:\> dir
2010/02/02  01:05 PM              волгогра́д
2010/02/02  01:04 PM              москва
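
The encoding step can also be looked at on its own, without any Windows API involved. The following is only a sketch: to_wide_string is a name invented for this example, and it uses UTF-16LE, which is what the Windows W functions work with and which also covers characters outside the Basic Multilingual Plane via surrogate pairs.

```perl
use strict;
use warnings;
use Encode qw( encode );

# Hypothetical helper: turn a Perl character string into the
# NUL-terminated UTF-16LE byte buffer that Windows W functions expect.
sub to_wide_string {
    my ($text) = @_;
    return encode('UTF-16LE', "$text\0");
}

my $name = "\x{43C}\x{43E}\x{441}\x{43A}\x{432}\x{430}";    # "москва"
my $wide = to_wide_string($name);

# 6 characters plus the terminator, 2 bytes each
print length($wide), "\n";          # prints 14

# The first code unit is U+043C, stored low byte first
print unpack('H4', $wide), "\n";    # prints 3c04

# A character outside the BMP becomes a surrogate pair (4 bytes),
# followed by the 2-byte terminator
print length(to_wide_string("\x{1F600}")), "\n";    # prints 6
```

The explicit "\0" matters because the API reads the buffer until it finds a 16-bit NUL, and Perl does not add one to the encoded bytes.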
Sinan Ünür
  • Thank you. I'll use Win32::CreateDirectory. I am new to the Unicode universe and find it quite complicated - maybe this is only true on Windows using Perl - I don't know. – Nele Kosog Feb 03 '10 at 15:04
  • The name returned by Win32::GetANSIPathName() doesn't seem to work with external commands. Does it perhaps need to be translated to the system codepage or something? – Freon Sandoz Mar 04 '22 at 00:05

Just updating this question for 2018.

Unbelievably, ActivePerl 5.24 still does not support simply passing Unicode paths to open() or mkdir(), and by extension File::Path::mkpath(), because the underlying Perl code still calls the 20th-century ANSI (code page) versions of the Windows API functions, such as CreateFileA(). Insanity! How could this not have been a higher priority than the myriad obscure Perl changes made in the intervening 10 years?!

This is true even if you add 'use utf8;' or any of a myriad other incantations.

So, even today, we still have to have special-case code for Windows for this most basic function (creating/accessing Unicode filenames).

Fortunately, the Win32::Unicode module has a nice, easy-to-use Win32::Unicode::Dir::mkpathW() function which does exactly what you want and works for Unicode (as well as great copyW() and moveW() functions).

Unfortunately, this module hasn't passed its installation tests since Perl 5.16 and ActiveState dropped it from its handy ppm repository ( https://code.activestate.com/ppm/Win32-Unicode/ ).

Fortunately, there is a way to get it working, as the 3 tests that fail (relating to "print") are not used for file/directory creation:

Step 1: ppm install dmake

Step 2: ppm install MinGW

Step 3: perl -MCPAN -e shell then force install Win32::Unicode

Steps 1 and 2 are required even if you already have Microsoft Visual Studio installed on your computer; apparently Perl is now built with MinGW and modules must be as well.
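
With the module installed, the Windows special case could be hidden behind a small wrapper. This is only a sketch: make_path_unicode is a made-up name, the non-Windows branch assumes the file system expects UTF-8 bytes, and since I am not certain what mkpathW() returns, the wrapper does not inspect its result.

```perl
use strict;
use warnings;
use Encode qw( encode );
use File::Path qw( make_path );

# Hypothetical wrapper: create $path (a Perl character string) and any
# missing parent directories, choosing the implementation by platform.
sub make_path_unicode {
    my ($path) = @_;
    if ($^O eq 'MSWin32') {
        # Assumes Win32::Unicode was installed as described above.
        require Win32::Unicode::Dir;
        Win32::Unicode::Dir::mkpathW($path);
    }
    else {
        # Elsewhere, file names are plain bytes; UTF-8 is assumed here.
        make_path(encode('UTF-8', $path));
    }
    return;
}

# Example call (not executed here):
#   make_path_unicode("reports/\x{41C}\x{43E}\x{441}\x{43A}\x{432}\x{430}");
```

Loading Win32::Unicode::Dir with require inside the Windows branch keeps the script usable on machines where the module is not installed.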

Phew.

Louis Semprini