
Is there an easy way to take a very long filename (which contains parameters and settings that were used to generate the file) and shorten it programmatically so it will save on Windows?

The filename cannot be sent to a URL shortening service (Bitly, Google, etc.) because the information is confidential and the network is segregated.

The filename has to be shortened internally within the application, without using a third-party library and without sending the result to a database.

e.g. Start with a file name so long that Windows will fail to save it:

E:\Results\Job\<SomeJobComponentName>\[A very long file name with the parameters and settings that were used to generate it].csv

And save it as

E:\Results\Job\<SomeJobComponentName>\0eVfd878swg9.csv

And then when the file is read, the code can convert

0eVfd878swg9.csv

back to

[A very long file name with the parameters and settings that were used to generate it].csv

Sort of like a Bitly for filenames.

This is not about encrypting the CONTENT of the file. It's simply about making sure that filenames of any length can be saved without hindrance. The filename needs to be convertible back to the original without opening the file, so that the un-shortened filename can be parsed and viewed in a viewer (which the user uses to browse the result files in the job directory).

I've previously tried the "\?\" prefix to the full path but for whatever reason this fails when using StreamWriter. It can't handle this workaround.

Let me know if I'm barking up the wrong tree here!

Using C# 7.0, Windows Server 2012 R2 and Windows 10 desktop.

Bit Racketeer
  • Create the file with something else that can handle the long name and then feed the 8.3 equivalent to streamwriter? – jhnc Feb 16 '19 at 09:17
  • btw did you really mean "\?\" and not "\\?\" ? – jhnc Feb 16 '19 at 09:26
  • Maybe use GZip? https://stackoverflow.com/a/7343623/ – Bit Racketeer Feb 16 '19 at 13:11
  • This is a pretty bad idea. Logically this info belongs inside the file itself, but you can't mess with the CSV file format. Simply punt the problem by generating *another* file with the same name but different extension. What that file looks like is entirely up to you, start with a simple .txt file. – Hans Passant Feb 16 '19 at 13:50
  • This is driven by end-user requirements. They want to be able to view a directory of hundreds (or even thousands) of results files and see what's inside each from the file-name. They'll choose which files to look inside based on the filename. Opening each file to look inside it adds human-unpopular complexity (i.e. operational inefficiency at the meatware level). – Bit Racketeer Feb 17 '19 at 12:05
  • Put it this way - I am unable to debug my end-users and tell them to change their requirements. As far as the operations of our business are concerned, their requirement is rational and reasonable and we need to make systems deliver on it. The best answer I've found is to upgrade to Windows Server 2016 or later and switch on long file names. – Bit Racketeer Feb 17 '19 at 17:32
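
Hans Passant's companion-file suggestion can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the 12-character SHA-256 stem and the `.txt` sidecar layout are hypothetical choices, not anything specified in the thread. Python is used to match the examples in the answer below.

```python
import hashlib
import tempfile
from pathlib import Path


def save_with_sidecar(directory: Path, long_name: str, data: str) -> Path:
    """Write data under a short hashed name plus a .txt sidecar holding long_name."""
    # Hypothetical naming scheme: first 12 hex digits of SHA-256 of the long name.
    stem = hashlib.sha256(long_name.encode("utf-8")).hexdigest()[:12]
    data_path = directory / f"{stem}.csv"
    data_path.write_text(data, encoding="utf-8")
    # The sidecar shares the stem, so a viewer can pair the two files by name.
    (directory / f"{stem}.txt").write_text(long_name, encoding="utf-8")
    return data_path


def original_name(data_path: Path) -> str:
    """Recover the long name from the sidecar without opening the data file."""
    return data_path.with_suffix(".txt").read_text(encoding="utf-8")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        long_name = "alpha=0.5_beta=12_" * 30 + "run.csv"  # made-up example name
        short = save_with_sidecar(Path(tmp), long_name, "a,b\n1,2\n")
        print(short.name, "->", original_name(short)[:40], "...")
```

A truncated hash is not reversible on its own; the sidecar is what makes the round trip possible. Twelve hex digits make collisions unlikely but not impossible, so a real implementation would probably want to check for an existing file first.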

2 Answers


According to your question you want to start with a filename so long that Windows will fail to save it. The limit is 255 characters per component (the part between backslashes: a folder name or a filename) and very nearly 32,767 characters, or about 9 A4 pages, for the entire path, which should be sufficient for most normal purposes.

If you are dealing with a filesystem other than NTFS (for example FAT, NFS, ISO-9660) then the limits are considerably tighter. This is about NTFS on Windows 10 Anniversary Update (2016) or later.

While Windows will save and retrieve a file with a path this long, it is possible that some APIs will not. This answer assumes that the file is actually saved under the name you want, but you have to pass a shorter name to such an API.
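
To see which limit a given path actually hits, a small diagnostic along these lines may help. It is a sketch, not a definitive validator: the 255 and 32,767 figures are the ones quoted above, the forbidden-character set follows the list in the comments below this answer, and the 260-character legacy `MAX_PATH` threshold is an addition that does not appear in the original answer. The function name `diagnose` is hypothetical.

```python
from pathlib import PureWindowsPath

MAX_COMPONENT = 255     # per-component limit quoted above
MAX_LONG_PATH = 32767   # approximate whole-path limit quoted above
LEGACY_MAX_PATH = 260   # classic MAX_PATH, including the terminating null (assumption)
FORBIDDEN = set('<>:"|?*')  # characters Windows rejects inside a component


def diagnose(path: str) -> list:
    """List reasons why a Windows path might be rejected (empty list = none found)."""
    problems = []
    p = PureWindowsPath(path)
    # Skip the anchor (for example 'E:\\') so its colon is not flagged.
    components = p.parts[1:] if p.anchor else p.parts
    for part in components:
        if len(part) > MAX_COMPONENT:
            problems.append(f"component of {len(part)} chars exceeds {MAX_COMPONENT}")
        bad = sorted(FORBIDDEN.intersection(part))
        if bad:
            problems.append(f"forbidden character(s) {bad} in component {part[:20]!r}")
    if len(path) > MAX_LONG_PATH:
        problems.append(f"path of {len(path)} chars exceeds {MAX_LONG_PATH}")
    elif len(path) >= LEGACY_MAX_PATH:
        problems.append(f"path of {len(path)} chars exceeds legacy MAX_PATH; "
                        "older APIs may refuse it")
    return problems


if __name__ == "__main__":
    for reason in diagnose("E:\\Results\\Job\\" + "p=1_" * 70 + "out.csv"):
        print(reason)
```

Because it uses `PureWindowsPath`, the check only inspects the string and never touches the filesystem, so it can be run on any machine.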

If the path or filename really is so long that Windows cannot save it, then the file cannot exist under that name, and you will have to store the name somewhere outside the filesystem. Your constraint of not sending the result to a database rules that possibility out.

There are two approaches to compression. One exploits redundancy in the data you want to compress, for example run length encoding or Huffman. But that won't work here. There is unlikely to be enough redundancy in the names to make a significant difference. The other is to generate short names and maintain a lookup table. That is what bitly does. Since you have disallowed creating your own lookup table (without sending the result to a database), your only option is to use built-in Windows facilities.
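
The first approach is easy to try for yourself. The sketch below compresses a name with `zlib` and encodes the result in filename-safe Base64, then compares lengths. The function names and the sample parameter string are made up for illustration. The fixed overhead of the zlib wrapper plus the 4/3 expansion of Base64 can easily cancel out whatever the compressor saves on a string of this size, which is the point being made above.

```python
import base64
import zlib


def squeeze(name: str) -> str:
    """Compress a name with zlib, then encode it with filename-safe Base64."""
    packed = zlib.compress(name.encode("utf-8"), 9)
    return base64.urlsafe_b64encode(packed).decode("ascii")


def unsqueeze(token: str) -> str:
    """Reverse squeeze(): Base64-decode, then decompress."""
    raw = base64.urlsafe_b64decode(token.encode("ascii"))
    return zlib.decompress(raw).decode("utf-8")


if __name__ == "__main__":
    # Hypothetical parameter-style filename stem.
    name = "model=ridge_alpha=0.0125_window=250_seed=9917_universe=EU600_cost=3.5bp"
    token = squeeze(name)
    print(len(name), "->", len(token))
    assert unsqueeze(token) == name
```

Note also that Base64 is case-sensitive while Windows filename lookups are not, so two tokens differing only in case would refer to the same file.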

When you save a file in a modern version of Windows, the filesystem will normally also generate a short 8.3 filename (unless 8.3 name creation has been disabled on the volume) that allows the file to be seen and opened by legacy applications. You can retrieve the short filename very simply, like this:

>>> import win32api
>>> win32api.GetShortPathName(r"E:\Dropbox\Rocket Cottage\Sicilian fennel and orange salad with red onion and mint.fdx")
'E:\\Dropbox\\ROCKET~1\\SICILI~1.FDX'

To convert back:

>>> win32api.GetLongPathName(r"E:\Dropbox\ROCKET~1\SICILI~1.FDX")
'E:\\Dropbox\\Rocket Cottage\\Sicilian fennel and orange salad with red onion and mint.fdx'

If using win32api falls foul of your requirement not to use a third-party library (though in a Windows installation that, frankly, borders on religious mania) then you can use subprocess to call dir /X.

C:\Users\xxxxx>dir /x E:\Dropbox\ROCKET~1\SICILI~1.FDX
 Volume in drive E is Enigma
 Volume Serial Number is D45D-0655

 Directory of E:\Dropbox\ROCKET~1

2013-04-17  18:07            17,125 SICILI~1.FDX Sicilian fennel and orange salad with red onion and mint.fdx

C:\Users\xxxxx>dir /x "E:\Dropbox\Rocket Cottage\Sicilian fennel and orange salad with red onion and mint.fdx"
 Volume in drive E is Enigma
 Volume Serial Number is D45D-0655

 Directory of E:\Dropbox\Rocket Cottage

2013-04-17  18:07            17,125 SICILI~1.FDX Sicilian fennel and orange salad with red onion and mint.fdx
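
A rough sketch of the subprocess route, using only the standard library. The function names are hypothetical, and the parsing assumes the column layout shown in the listings above; `dir` output varies with locale, so treat the parser as a heuristic. It also assumes that a generated short name always contains a `~`, which is how it tells a short name apart from the size column when no short name is listed.

```python
import ntpath
import subprocess


def parse_dir_x(output: str, long_name: str):
    """Return the 8.3 name listed next to long_name in `dir /x` output, or None."""
    for line in output.splitlines():
        line = line.rstrip()
        if not line.endswith(" " + long_name):
            continue
        # Everything before the long name: date, time, size and (maybe) short name.
        fields = line[: -len(long_name)].split()
        if fields and "~" in fields[-1]:
            return fields[-1]
    return None


def short_name(long_path: str):
    """Windows only: ask cmd.exe's built-in `dir /x` for the 8.3 name of long_path."""
    result = subprocess.run(
        ["cmd", "/c", "dir", "/x", long_path],
        capture_output=True, text=True, check=True,
    )
    return parse_dir_x(result.stdout, ntpath.basename(long_path))
```

`short_name` expects the long form of the path, since it looks for the long filename at the end of a listing line. Names containing non-ASCII characters may need an explicit `encoding` argument, because `cmd` writes its output in the console code page.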
BoarGules
  • @GenoChen Given the OP's constraints (no network, no database, browseable, roundtrippable) I don't see any plausible alternative. Maybe the downvoter can come up with something better. – BoarGules Feb 16 '19 at 14:18
  • @BoarGules I upvoted because I think this is the best choice too. – Geno Chen Feb 16 '19 at 14:23
  • A variant of an earlier comment. This is driven by end-user requirements. They want to be able to view a directory of hundreds (or even thousands) of results files and see what's inside each from the file-name (i.e. parameters & results). They'll choose which files to look inside based on the filename. Opening each file to look inside it adds human-unpopular complexity (i.e. operational inefficiency at the meatware level) so my idea is to use a GUI viewer which unpacks a Bitly-type name. But this is a workaround that I wish to avoid. (1/2) – Bit Racketeer Feb 17 '19 at 12:08
  • PS filesystem is NTFS (default in Windows 2012 R2) – Bit Racketeer Feb 17 '19 at 12:10
  • To your point about the 255 character limit - this appears to fail with StreamWriter; I assumed that this is because the total path+file name length is >255 chars (components are <240), so I tried the "\\?\" workaround but this also fails. And to your (otherwise excellent) suggestion of 8.3 filenames - this will not meet end-user requirements that file-names are human readable & describe the content inside each file. We produce tens of thousands of files and the team human-processing the results has requested human-intelligible filenames. On UNIX it would be easy...! (2/2) – Bit Racketeer Feb 17 '19 at 12:14
  • I would look more closely at why StreamWriter is failing. Is it really the total path length that is causing the failure, or is it the length of a single path component? Does a component contain disallowed characters `/ : * ? " < > |` maybe? – BoarGules Feb 17 '19 at 12:18

The best answer appears to be to upgrade to Windows Server 2016 or later and switch on long file names. There is no point fighting an uphill battle with workarounds on an outdated server operating system when a permanent & official solution is easily available. Thanks everyone for all the comments.

Bit Racketeer