19

I have part of a build process that creates hideously long paths in Windows. It's not my fault. It's several directories deep, and none of the directory names are abnormally long; they're just long and numerous enough to make it over MAX_PATH (260 chars). I'm not using anything other than ASCII in these names.

The big problem is that the blow-up happens deep in the guts of Module::Build during the dist target, although I figure the build system doesn't matter because any of them would make the same directories.

Creating one of these overly-long directories with File::Path fails:

 use File::Path qw( make_path );

 make_path( 'C:\\.....' ); # fails if path is over 260 chars

Similarly, constructing each directory level by hand fails once the absolute path would go over MAX_PATH.
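To see where a level-by-level build would tip over, a rough sketch like the following walks the components and reports the cumulative length at each level. The root `C:\build`, the component list, and the helper name `path_levels` are made up for illustration, and the cutoff is just the documented MAX_PATH value; the limit that actually bites for a given call may be somewhat lower.

```perl
use strict;
use warnings;

# Assumed limit: Microsoft documents MAX_PATH as 260 characters.
use constant MAX_PATH => 260;

# Hypothetical helper: return one record per directory level with the
# cumulative path, its length, and whether it is over the limit.
sub path_levels {
    my ($root, @comps) = @_;
    my $path = $root;
    my @levels;
    for my $comp (@comps) {
        $path .= "\\$comp";
        push @levels, {
            path     => $path,
            length   => length $path,
            too_long => length($path) > MAX_PATH ? 1 : 0,
        };
    }
    return @levels;
}

# Illustrative input only: a short root plus thirty 10-character components.
my @levels = path_levels('C:\\build', ('0123456789') x 30);

# Index of the first level whose absolute path is over the limit.
my ($first_bad) = grep { $levels[$_]{too_long} } 0 .. $#levels;

printf "first level over the limit: %d (length %d)\n",
    $first_bad + 1, $levels[$first_bad]{length};
```

Every level before that one can be created the ordinary way, which matches the behavior described above: the failure only shows up once the absolute path crosses the limit.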

This isn't new, isn't Perl's fault, and Microsoft documents it in Naming Files, Paths, and Namespaces. Their suggested fix is to add \\?\ in front of any path to access the Unicode filename API. However, that doesn't seem to be the full fix for a Perl script because it still fails:

 use File::Path qw( make_path );

 make_path( '\\\\?\\C:\\.....' );  # still fails if path is over MAX_PATH, works otherwise

This might be because make_path pulls apart its argument and then goes through the directories one level at a time, so \\?\ only applies to the top-level, which is within MAX_PATH.

I dug up a bug report to ActiveState that suggests there's something else I need to fix up to get to the Unicode filenames, and Jan Dubois gives a few more details in Re: "long" filenames on Windows 2K/XP, although I'm not sure it applies (and it is extremely old). perlrun mentions that this used to be the job of the -C switch, but apparently that part was abandoned. The perl RT queue has a more recent bug 60888: Win32: support full unicode in filenames (use Wide-system calls).

Miyagawa notes some Unicode filename issues and Win32API::File without specifically mentioning long paths. However, the Win32API::File CPAN Forum entry seems to indicate only fear, which leads to anger, which leads to hate, and so on. There's an example in the Perlmonks post How to stat a file with a Unicode (UTF16-LE) filename in Windows?. It seems that Win32::CreateDirectory is the answer, and I'll try that the next time I get next to a Windows machine.

Then, supposing I can create the long path, I still have to teach Module::Build, and maybe other things, to handle it. That might be immediately easy with monkeypatches if Win32::GetANSIPathName() does what it says on the tin.

brian d foy

5 Answers

10

Windows has two separate system calls for each function that needs to deal with strings: an "A" call using the ANSI code page (the Active Code Page, e.g. cp1252) as the encoding, and a "W" call using UTF-16le. Perl uses the "A" calls, while \\?\ only works with the "W" calls.
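As a small illustration of what a "W" call wants to receive, here is a sketch of a helper that produces the byte string for a drive-letter path: \\?\ prefix, UTF-16LE encoding, and a trailing NUL. The name `to_wide_path` and the sample path are made up for this example, and the helper deliberately ignores UNC paths, which need a different prefix.

```perl
use strict;
use warnings;
use Encode qw( encode );

# Hypothetical helper: build the argument for a "W" call from an absolute
# drive-letter path. Not part of any module.
sub to_wide_path {
    my ($path) = @_;

    # Windows does not convert forward slashes for \\?\ paths,
    # so normalize them here.
    $path =~ s{/}{\\}g;

    # Add the \\?\ prefix unless the caller already supplied it.
    $path = "\\\\?\\$path" unless $path =~ /^\\\\\?\\/;

    # "W" calls expect UTF-16LE with a terminating NUL character.
    return encode('UTF-16LE', "$path\0");
}

# Illustrative path only.
my $wide = to_wide_path('C:/some/deep/dir');
printf "%d bytes\n", length $wide;
```

The result is a byte string, not a path the rest of Perl can use, which is why it only helps when you call the "W" functions yourself.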

You can use Win32::API to access the "W" calls as shown in the script below, but Win32::LongPath not only uses the "W" calls, but automatically adds \\?\!

Example of using Win32::API to call CreateDirectoryW to use a long path (\\?\-prefixed path):

#!/usr/bin/perl

use strict;
use warnings;

use Carp;
use Encode qw( encode );
use Symbol;

use Win32;

use Win32API::File qw(
    CreateFileW OsFHandleOpen
    FILE_GENERIC_READ FILE_GENERIC_WRITE
    OPEN_EXISTING CREATE_ALWAYS FILE_SHARE_READ
);

use Win32::API;
use File::Spec::Functions qw(catfile);

Win32::API->Import(
    Kernel32 => qq{BOOL CreateDirectoryW(LPWSTR lpPathNameW, VOID *p)}
);

my %modes = (
    '<' => {
        access => FILE_GENERIC_READ,
        create => OPEN_EXISTING,
        mode   => 'r',
    },
    '>' => {
        access => FILE_GENERIC_WRITE,
        create => CREATE_ALWAYS,
        mode   => 'w',
    },
    # and the rest ...
);

use ex::override open => sub(*;$@) {
    $_[0] = gensym;

    my %mode = %{ $modes{$_[1]} };

    my $os_fh = CreateFileW(
        encode('UCS-2le', "$_[2]\0"),
        $mode{access},
        FILE_SHARE_READ,
        [],
        $mode{create},
        0,
        [],
    ) or do {$! = $^E; return };

    OsFHandleOpen($_[0], $os_fh, $mode{mode}) or return;
    return 1;
};

my $path = '\\\\?\\' . Win32::GetLongPathName($ENV{TEMP});
my @comps = ('0123456789') x 30;

my $dir = mk_long_dir($path, \@comps);
my $file = 'test.txt';
my $str = "This is a test\n";

write_test_file($dir, $file, $str);

$str eq read_test_file($dir, $file) or die "Read failure\n";

sub write_test_file {
    my ($dir, $file, $str) = @_;

    my $path = catfile $dir, $file;

    open my $fh, '>', $path
        or croak "Cannot open '$path': $!";

    print $fh $str or die "Cannot print: $!";
    close $fh or die "Cannot close: $!";
    return;
}

sub read_test_file {
    my ($dir, $file) = @_;

    my $path = catfile $dir, $file;

    open my $fh, '<', $path
        or croak "Cannot open '$path': $!";

    my $contents = do { local $/; <$fh> };
    close $fh or die "Cannot close: $!";
    return $contents;
}

sub mk_long_dir {
    my ($path, $comps) = @_;

    for my $comp ( @$comps ) {
        $path = catfile $path, $comp;
        my $ucs_path = encode('UCS-2le', "$path\0");
        CreateDirectoryW($ucs_path, undef)
            or croak "Failed to create directory: '$path': $^E";
    }
    return $path;
}
ikegami
Sinan Ünür
  • Try appending \0 at the end of path – n0rd Nov 12 '09 at 15:30
  • ...before converting it to UTF-16LE – n0rd Nov 12 '09 at 15:31
  • 2
    Okay, this is half of an answer I think. I also need the path in a form that the rest of the program can use without the acrobatics. I'm not going to be able to patch most of the CPAN toolchain for this. :) – brian d foy Nov 12 '09 at 20:28
  • I'm not in front of my Windows machine until tomorrow. That Win32::GetANSIPathName() looks promising. It sounds like it can give me a path name that everything else can use without knowing the details. – brian d foy Nov 12 '09 at 22:32
  • @Sinan - If you're going that route, what about junction points? I used your code above (the mk_long_dir part) to create a long path that was inaccessible from Explorer. I traversed as far as possible along the path and created a junction point at C:\foo to that spot. This let me traverse all the way to the end where I set another junction point at C:\bar. In Explorer, I was then able to create C:\bar\test.txt. As for Perl, I was unable to access test.txt via the long path, but was able to open, read and write to C:\bar\test.txt with no problems. – bish Nov 13 '09 at 15:06
  • @bish Of course, that works, but this needs to happen sort of transparently during the build process. – Sinan Ünür Nov 13 '09 at 16:52
  • 1
    I just highly modified this answer. – ikegami Jan 12 '21 at 23:33
6

The following code actually creates a quite deep directory structure (more than 260 characters long), at least on my machine:

use Win32::API;

$cd = Win32::API->new('kernel32', 'CreateDirectoryW', 'PP', 'N');

$dir = '\\\\?\\c:\\!experiments';

$res = 1;

do
{
    print 'path length: ' . length($dir) . "\n";
    $dirname = pack('S*', unpack('C*', "$dir\0"));  #dirty way to produce UTF-16LE string

    $res = $cd->Call($dirname, 0);
    print "$res\n";

    $dir .= '\\abcde';

} while ( $res );
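The pack/unpack trick above relies on the path being plain ASCII and on `S` being little-endian on the machine at hand. A hedged comparison, using the same starting directory, shows that for such input an explicit little-endian `v*` template and Encode's UTF-16LE produce the same bytes; Encode is the safer choice once names contain non-ASCII characters.

```perl
use strict;
use warnings;
use Encode qw( encode );

my $dir = '\\\\?\\c:\\!experiments';

# 'v' is an explicitly little-endian 16-bit value, unlike the native-endian 'S'.
my $packed = pack( 'v*', unpack( 'C*', "$dir\0" ) );

# The same string encoded properly as UTF-16LE.
my $encoded = encode( 'UTF-16LE', "$dir\0" );

print $packed eq $encoded ? "same bytes\n" : "different bytes\n";
```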
n0rd
2

I understand this is not a solution to your specific problem. However, there are a lot of scenarios where being able to map a very long path to a drive letter would allow one to sidestep the issue and would therefore be useful in dealing with very long path names without having to wade through a whole lot of Windows-specific code and docs.

Despite all the effort I put into figuring out how to do this, I am going to recommend somehow using SUBST. Win32::FileOp provides Subst and Unsubst. You can then map the top level working directory to an unused drive letter (which you can find by using Substed). I would start checking with Z and working backwards.

Or you can shell out, invoke the subst utility with no parameters to get a list of current substitutions, and choose a drive letter that is not in it.

None of this is entirely safe as substitutions could change during the build process.
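For the shell-out route, a rough sketch of picking a letter from the listing might look like this. It assumes each line of the subst output has the form `X:\: => path`; the sample listing and the function name `free_subst_letter` are invented for illustration. It only looks at substituted drives, so a real script would also have to rule out physical and network drives.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first letter, checking Z first and working
# backwards, that does not appear in the given subst listing.
sub free_subst_letter {
    my ($listing) = @_;

    # Collect the drive letters at the start of each "X:\: => path" line.
    my %used = map { ( uc($_) => 1 ) } $listing =~ /^([A-Za-z]):\\: => /mg;

    for my $letter ( reverse 'A' .. 'Z' ) {
        return $letter unless $used{$letter};
    }
    return;    # every letter is taken
}

# Made-up sample; in a real script this would be the captured output
# of running subst with no parameters.
my $listing = "Z:\\: => C:\\build\\deep\nY:\\: => C:\\other\n";

my $letter = free_subst_letter($listing);
print "$letter\n";
```

The race mentioned above still applies: another process can claim the letter between the check and the call that maps it.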

Sinan Ünür
  • Much cleaner than trying to use a Windows shortcut. – mob Nov 12 '09 at 16:11
  • 2
    This wouldn't be high on my list of things to try. It has to work across processes. Also, I need to do this up to 100 times, so I foresee other disasters. – brian d foy Nov 12 '09 at 20:19
  • 3
    The IT folks have reasonably shot down this one. This is a shared build machine with a lot going on, so I don't get to create new virtual drives. – brian d foy Nov 12 '09 at 22:33
2

This should really be a comment but posting code in comments is hardly useful.

UNC paths do not work either:

C:\> net share
perlbuild    e:\home\src

#!/usr/bin/perl

use strict;
use warnings;

use File::Path qw(make_path);
use File::Slurp;
use Path::Class;

my $top = dir('//Computer/perlbuild');
my @comps = ('0123456789') x 30;

my $path = dir($top, @comps);

make_path $path, { verbose => 1 };

my $file = file($path, 'test.txt');

write_file "$file" => 'This is a test';

print read_file "$file";

Result:

mkdir \\Computer\perlbuild\0123456789\0123456789\0123456789\0123456789\0123456
789\0123456789\0123456789\0123456789\0123456789\0123456789\0123456789\0123456789
\0123456789\0123456789\0123456789\0123456789\0123456789\0123456789\0123456789\01
23456789\0123456789: No such file or directory; The filename or extension is too
 long at C:\Temp\k.pl line 15
Sinan Ünür
-1

I had three thoughts, all of them kind of hacks:

  1. Start with some short directory names (C:\data_directory\a\b\c\d\4\5\6\...) and then rename the directories (starting with the deepest directory first of course).

  2. Create a Windows shortcut to a moderately long path and create files and subdirectories from there? (Or install Cygwin and use symlinks?)

  3. Create the desired files in a directory with a short name, zip/tar them, and unpack them to the directory with the longer name. Or create zip/tar files "by hand" and unpack them in the desired location.

mob
  • 1
    (1) is not going to work unless you are using the right Win32 API (in which case you would use it in the first place), (2) Cygwin's utilities could not navigate to or remove the subdirectories I created using the scripts in my post, (3) might work, depending on what you are using to unzip. – Sinan Ünür Nov 12 '09 at 15:08
  • Although some of those might be fantastically wicked, remember that I have to get this to work from within Module::Build. I need to create the directory and return a path so that the rest of the process is none the wiser. Creating the path is only half the problem. – brian d foy Nov 12 '09 at 20:31
  • Windows shortcuts are nothing like symlinks. They are more akin to `*.desktop` files. The main difference is that shortcuts are binary files. Only recently have you been able to have symlinks, and that requires admin level authority. – Brad Gilbert Mar 25 '14 at 06:01