
Is there a function I can use that converts dodgy filenames to good filenames?

I'm processing a large number of photos, and very occasionally my script stops because the uploader has put a tilde (~) in the filename. I'm also now wondering whether there are any other symbols that can't be in a filename, and how to escape them.

I'm looping through these files using VBScript's FileSystemObject, similar to the following:

For Each File In Files
    If InStr(UCase(File.Name), ".JPG") > 0 Then
        '// do stuff
    End If
Next
TheCarver
  • I'm not exactly sure what your error is, but `~` is a valid character for Windows file names. – aphoria Aug 02 '12 at 13:43
  • You might want to pursue why you are getting the error. Since the most likely source of the original filename is also a file from some Windows-based OS, it's highly unlikely that you would receive an invalid character. As aphoria points out, ~ is perfectly legal in a Windows filename. You should endeavour to find the true reason for the error before attempting a "fix". – AnthonyWJones Aug 02 '12 at 14:06
  • What is your error? The FileSystemObject works with the shortened 8.3 naming convention. – Nilpo Aug 02 '12 at 15:49
  • @AnthonyWJones: Sorry, my mistake, I should have looked into this further before asking on SO. I found that the file in question was corrupt. Now that I know it's corrupt, I doubt it's the FileSystemObject that is crashing; more likely the issue is the ASP.JPEG component that reads the file. There is a strange pattern to this though, as *every* corrupt file starts with an underscore and also contains a ~. So maybe I can program it to look out for such files and delete them from the folder. – TheCarver Aug 02 '12 at 19:23

1 Answer


You can make a function that will return a 'cleaned' filename like:

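Function MakeNormal(filename)
    Dim re : Set re = New RegExp

    ' Whitelist: keep word characters, spaces, colons, backslashes and dots;
    ' every other character matches and gets replaced.
    re.Pattern = "[^\w :\\\.]"
    re.Global = True   ' replace every match, not just the first one

    MakeNormal = re.Replace(filename, "_")

End Function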

msgbox MakeNormal("C:\Temp\normal filename.txt")
msgbox MakeNormal("C:\Temp\special ~!@#$%^&*() filename.txt")

' returns: "C:\Temp\normal filename.txt" and "C:\Temp\special ___________ filename.txt"

Then rename the file to the cleaned name. This becomes risky when two files differ only in their special characters, because both map to the same cleaned name.
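
One way to guard against that collision is to check whether the cleaned name is already taken and append a counter if it is. The following is only a sketch under some assumptions: `CleanName`, `UniquePath` and `RenameCleaned` are hypothetical helper names, the cleaner works on the bare file name (not a full path), and the folder path in the usage line is a placeholder.

```vbscript
Option Explicit

' Hypothetical name-only cleaner: keeps word characters, spaces, dots and hyphens.
Function CleanName(name)
    Dim re : Set re = New RegExp
    re.Pattern = "[^\w .\-]"
    re.Global = True
    CleanName = re.Replace(name, "_")
End Function

' Returns a path inside folderPath that no existing file uses,
' appending _1, _2, ... before the extension when needed.
Function UniquePath(fso, folderPath, fileName)
    Dim baseName, ext, candidate, n
    baseName = fso.GetBaseName(fileName)
    ext = fso.GetExtensionName(fileName)
    If Len(ext) > 0 Then ext = "." & ext
    candidate = fso.BuildPath(folderPath, fileName)
    n = 1
    Do While fso.FileExists(candidate)
        candidate = fso.BuildPath(folderPath, baseName & "_" & n & ext)
        n = n + 1
    Loop
    UniquePath = candidate
End Function

' Renames every file whose cleaned name differs from its current name.
Sub RenameCleaned(folderPath)
    Dim fso, file, paths, p, cleaned
    Set fso = CreateObject("Scripting.FileSystemObject")
    Set paths = CreateObject("Scripting.Dictionary")

    ' Snapshot the folder first so renaming does not disturb the enumeration.
    For Each file In fso.GetFolder(folderPath).Files
        paths.Add file.Path, file.Name
    Next

    For Each p In paths.Keys
        cleaned = CleanName(paths(p))
        If cleaned <> paths(p) Then
            fso.MoveFile p, UniquePath(fso, folderPath, cleaned)
        End If
    Next
End Sub

' Example call (placeholder path, uncomment to use):
' RenameCleaned "C:\Temp\photos"
```

With this approach, `a~.jpg` and `a#.jpg` would end up as `a_.jpg` and `a__1.jpg` rather than one rename failing because the target already exists.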

The above is the 'whitelist' variant. If you prefer a 'blacklist' version, you can replace the pattern with something like [~!@#$%^&()]
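
As for which symbols can never appear in a Windows file name: the reserved set is `\ / : * ? " < > |` plus the control characters 1 to 31, and `~` is not among them. A blacklist limited to that set might look like the sketch below. `ReplaceReserved` is a hypothetical name, and the function is meant for a bare file name, since `\` and `:` are legitimate in a full path.

```vbscript
Option Explicit

' Replaces only characters that Windows forbids in a file name:
'   \ / : * ? " < > |  and control characters 1-31.
' Pass a file name, not a full path.
Function ReplaceReserved(fileName)
    Dim re : Set re = New RegExp
    ' The doubled quote ("") is how VBScript writes a literal " inside a string.
    re.Pattern = "[\\/:*?""<>|\x01-\x1F]"
    re.Global = True
    ReplaceReserved = re.Replace(fileName, "_")
End Function

' Run with cscript.exe; under classic ASP use Response.Write instead.
WScript.Echo ReplaceReserved("holiday: day 1?.jpg")   ' holiday_ day 1_.jpg
WScript.Echo ReplaceReserved("_photo~1.jpg")          ' unchanged: _photo~1.jpg
```

Because this leaves every legal character alone, it avoids the objection that a whitelist mangles valid names, at the cost of not helping with third-party systems that reject more characters than Windows does.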

AutomatedChaos
  • I disagree. You are removing characters that are allowed to appear in valid file names. While effective, this is a bad solution. – Nilpo Aug 02 '12 at 15:46
  • @Nilpo, you have the academic right on your side. But in my experience you'll sometimes encounter systems that do not comply with the [Windows filename convention](http://msdn.microsoft.com/en-us/library/windows/desktop/aa365247.aspx), especially in the ASCII 127-255 range. If you have a third-party uploader that cannot use certain symbols, you'll have to fall back on sanitizing the input. One way to do that is to replace dodgy characters with accepted ones. – AutomatedChaos Aug 03 '12 at 11:10
  • I have never seen a single instance where the FSO would not handle a legally named file. Think about it--that would make the FSO pretty useless. You wouldn't even be able to rename a file before trying to open it or anything. It would create a whole cascading effect of other problems. – Nilpo Aug 03 '12 at 12:52
  • @Nilpo Well, you should try Sharepoint. It is not possible to upload files with ~, #, % or & in their filenames while they are allowed as characters in Windows filenames. Took me some time to find out why my screenshots did not automatically save to a certain location (like R:\TestReports\Screenshots\) while saving to any other location (C:\Temp\) was perfectly fine. Finally, I realized that the R: drive was actually mapped from a Sharepoint location and my filenames had a # in them. – AutomatedChaos Aug 06 '12 at 09:48
  • But the question isn't about Sharepoint, it's about the FSO. On a side note, Sharepoint is designed specifically for web use. It makes sense that it reserves all characters that have special meaning in URLs. – Nilpo Aug 06 '12 at 13:54