
I receive many folders from colleagues, each with sub-folders and files whose names contain special or even invisible characters such as these:

Non-standard chars: {µ, 市, ', &, 「 , invisible char, Ü, é, ... }

I am looking for a script or a Windows tool that renames all sub-folders and files in one go, replacing every non-standard character (any character not in List X) with a standard one. Bonus: ideally the tool would check List R and apply a replacement rule if one is defined for that character. If no rule is defined, it should simply replace the non-standard character with "_".

List X: {A-Z, a-z, 0-9, (, ), [, ], -, _}
List R: {é->e, ü->u, ä->a, @->at, ...} (replacement rules)

I'd appreciate any pointer to a tool or script.

Theo
A.T.
  • For the diacritic characters, I would use this: [Converting Unicode string to ASCII](https://stackoverflow.com/a/46660695/1701026) – iRon Feb 07 '21 at 08:28
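
A minimal sketch of the diacritic-stripping idea from the comment above, assuming plain .NET Unicode normalization is acceptable. The function name `Remove-Diacritics` is illustrative and not taken from the linked answer. Letters that have no canonical decomposition (for example µ or 市) pass through unchanged and would still need a List R rule or the "_" fallback.

```powershell
# Hypothetical helper: drop combining marks after canonical decomposition.
function Remove-Diacritics {
    param([string]$Text)
    # FormD splits a letter such as 'é' into 'e' plus a combining accent.
    $decomposed = $Text.Normalize([System.Text.NormalizationForm]::FormD)
    $sb = New-Object System.Text.StringBuilder
    foreach ($ch in $decomposed.ToCharArray()) {
        $category = [System.Globalization.CharUnicodeInfo]::GetUnicodeCategory($ch)
        # Keep everything except the combining (non-spacing) marks.
        if ($category -ne [System.Globalization.UnicodeCategory]::NonSpacingMark) {
            [void]$sb.Append($ch)
        }
    }
    # Recompose whatever is left.
    $sb.ToString().Normalize([System.Text.NormalizationForm]::FormC)
}

Remove-Diacritics 'filé Ünïcode.txt'   # -> file Unicode.txt
```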

2 Answers


How's this? It applies the specific replacements first, then replaces anything that is neither a tab nor in the range from space to tilde (printable ASCII), such as the Unicode snowman, with "_".

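echo hi | set-content filä.txt,filé.txt,fil☃.txt # create three test files; run from a script or the ISE
# Specific replacements first; the last pair is the catch-all for anything outside space..tilde or tab
$pairs = ('ä','a'), ('é','e'), ('[^ -~\t]','_')
# Only touch items whose name contains such a character
dir -r | where name -match '[^ -~\t]' | rename-item -newname {
  $name = $_.name
  foreach ($pair in $pairs) {
    $name = $name -replace $pair   # $pair supplies the pattern and its replacement
  }
  $name
} -whatif   # preview only; remove -whatif to actually rename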


What if: Performing the operation "Rename File" on target "Item:
C:\users\admin\foo\filä.txt Destination: C:\users\admin\foo\fila.txt".
What if: Performing the operation "Rename File" on target "Item:
C:\users\admin\foo\filé.txt Destination: C:\users\admin\foo\file.txt".
What if: Performing the operation "Rename File" on target "Item:
C:\users\admin\foo\fil☃.txt Destination: C:\users\admin\foo\fil_.txt".
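
One possible variation on the above, sketched under a few assumptions rather than taken from the answer: the catch-all pattern is narrowed to the question's List X (with "." also kept so that file extensions survive, which List X does not actually include), and items are sorted deepest-first so that renaming a folder cannot invalidate the path of an item that has not been processed yet. The function name `Get-SafeName` is hypothetical.

```powershell
# List R as ordered (pattern, replacement) pairs; the last pair is the catch-all.
# Assumption: '.' is allowed in addition to List X so extensions are preserved.
$pairs = ('é','e'), ('ü','u'), ('ä','a'), ('@','at'), ('[^A-Za-z0-9()\[\]_.-]','_')

function Get-SafeName {
    param([string]$Name)
    foreach ($pair in $pairs) {
        # -creplace is the case-sensitive form of -replace, so 'É' is not
        # matched by the 'é' rule and falls through to the catch-all.
        $Name = $Name -creplace $pair
    }
    $Name
}

# Sort-Object collects the full listing before emitting anything, and a child's
# full path is always longer than its parent's, so children are renamed first.
Get-ChildItem -Recurse |
    Sort-Object { $_.FullName.Length } -Descending |
    ForEach-Object {
        $newName = Get-SafeName $_.Name
        if ($newName -cne $_.Name) {
            # Preview only; remove -WhatIf to actually rename.
            Rename-Item -LiteralPath $_.FullName -NewName $newName -WhatIf
        }
    }
```

Two different names can map to the same result (for example `fil☃.txt` and `fil市.txt` both become `fil_.txt`). In that case `Rename-Item` reports an error for the second item instead of overwriting, so such collisions have to be resolved by hand or by extending the sketch.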
js2010

I would go through each file or directory name character by character and replace every character that falls outside the allowed set with an arbitrary placeholder character. The filter below does this for characters that are invalid in file names; you can build upon it by using a hashtable for your List X and List R items.

filter ConvertTo-FileSafe {
    # Characters the current platform forbids in file names (on Windows e.g. \ / : * ? " < > |)
    $invalidChars = [System.IO.Path]::GetInvalidFileNameChars()
    $charToReplaceWith = '.'
    # Work on a char array so individual positions can be overwritten
    $userString = [char[]]$_
    for ($currentChar = 0; $currentChar -lt $userString.Length; $currentChar++) {
        if ($invalidChars.Contains($userString[$currentChar])) {
            $userString[$currentChar] = $charToReplaceWith
        }
    }
    return ($userString -join '')
}
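
Building on the hashtable suggestion, here is a rough sketch of how List X and List R could be wired into such a filter. The names `$listX`, `$listR` and `ConvertTo-ListXName` are made up for illustration, and "." is allowed in addition to List X on the assumption that file extensions should be kept.

```powershell
# List R: explicit replacement rules. Note that @{} literals look up string
# keys case-insensitively.
$listR = @{ 'é' = 'e'; 'ü' = 'u'; 'ä' = 'a'; '@' = 'at' }

# List X as a single-character pattern (plus '.', an assumption of this sketch).
$listX = '^[A-Za-z0-9()\[\]_.-]$'

filter ConvertTo-ListXName {
    $out = foreach ($ch in ([string]$_).ToCharArray()) {
        $key = [string]$ch
        if ($key -cmatch $listX)          { $key }          # already allowed
        elseif ($listR.ContainsKey($key)) { $listR[$key] }  # explicit rule from List R
        else                              { '_' }           # fallback
    }
    -join $out
}

'filé@☃.txt' | ConvertTo-ListXName   # -> fileat_.txt
```

The filter only computes the new name. It could then be passed to `Rename-Item -NewName`, ideally with `-WhatIf` first to preview the result.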
Dan
  • The question is about transliterating characters, not about replacing invalid filename characters. – Theo Feb 06 '21 at 20:43