I have strings containing non-ASCII characters such as á, é, í, ó, ú, and I need a function that converts them to acceptable equivalents such as a, e, i, o, u. I will be creating IIS web sites from those strings (that is, I will be using them as domain names).
In general, it's called transliteration. Normalizing to FormD and filtering will work to convert composed Latin letters to [Basic Latin](http://www.unicode.org/charts/nameslist/index.html) letters but not ligatures (dž, ǣ, ij, …) and such. See this [question](https://stackoverflow.com/questions/1841874/how-to-transliterate-cyrillic-to-latin-text). – Tom Blodget Oct 10 '17 at 16:26
2 Answers
    function Convert-DiacriticCharacters {
        param(
            [string]$inputString
        )
        # Decompose each accented letter into a base letter plus combining marks.
        [string]$formD = $inputString.Normalize(
            [System.Text.NormalizationForm]::FormD
        )
        $stringBuilder = New-Object System.Text.StringBuilder
        $nonSpacingMark = [System.Globalization.UnicodeCategory]::NonSpacingMark
        for ($i = 0; $i -lt $formD.Length; $i++) {
            $unicodeCategory = [System.Globalization.CharUnicodeInfo]::GetUnicodeCategory($formD[$i])
            # Keep everything except the combining marks.
            if ($unicodeCategory -ne $nonSpacingMark) {
                $stringBuilder.Append($formD[$i]) | Out-Null
            }
        }
        # Recompose whatever is left.
        $stringBuilder.ToString().Normalize([System.Text.NormalizationForm]::FormC)
    }
The resulting function converts diacritics in the following way:

    PS C:\> Convert-DiacriticCharacters "Ångström"
    Angstrom
    PS C:\> Convert-DiacriticCharacters "Ó señor"
    O senor

Copied from: http://cosmoskey.blogspot.nl/2009/09/powershell-function-convert.html
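
As the comment under the question points out, this only handles letters that decompose into a base letter plus a combining mark. Letters such as ø, ß, æ and ł have no canonical decomposition, so they pass through unchanged. A minimal sketch of one way to cover them, assuming a hand-maintained replacement table; the function name `Convert-ToBasicLatin` and the table contents are my own illustration, not part of the original function:

```powershell
# Hypothetical helper: the name and the replacement table are illustrative assumptions.
function Convert-ToBasicLatin {
    param([string]$Text)

    # Letters with no canonical decomposition, mapped by hand (extend as needed).
    $pairs = @(
        @(0x00F8, 'o'),   # ø
        @(0x00D8, 'O'),   # Ø
        @(0x00DF, 'ss'),  # ß
        @(0x00E6, 'ae'),  # æ
        @(0x0142, 'l')    # ł
    )
    foreach ($pair in $pairs) {
        # String.Replace is ordinal and case-sensitive.
        $Text = $Text.Replace([string][char]$pair[0], $pair[1])
    }

    # Same decompose-and-filter step as the function above.
    $decomposed = $Text.Normalize([System.Text.NormalizationForm]::FormD)
    $builder = New-Object System.Text.StringBuilder
    foreach ($char in $decomposed.ToCharArray()) {
        $category = [System.Globalization.CharUnicodeInfo]::GetUnicodeCategory($char)
        if ($category -ne [System.Globalization.UnicodeCategory]::NonSpacingMark) {
            [void]$builder.Append($char)
        }
    }
    $builder.ToString().Normalize([System.Text.NormalizationForm]::FormC)
}
```

With this table, "Straße" becomes "Strasse" and "Smørrebrød" becomes "Smorrebrod". Which replacement is appropriate (for example "ae" or "a" for æ) depends on the language, so treat the table as a starting point.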

iRon
Taking this answer from a C#/.NET question, it seems to work in PowerShell when ported roughly like this:

    function Remove-Diacritics
    {
        Param([string]$Text)
        # Decompose, then drop the combining marks (.Where{} needs PowerShell 4.0 or later).
        $chars = $Text.Normalize([System.Text.NormalizationForm]::FormD).GetEnumerator().Where{
            [System.Char]::GetUnicodeCategory($_) -ne [System.Globalization.UnicodeCategory]::NonSpacingMark
        }
        # Join the remaining characters and recompose.
        (-join $chars).Normalize([System.Text.NormalizationForm]::FormC)
    }

For example:

    PS C:\> Remove-Diacritics 'abcdeéfg'
    abcdeefg
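
Since the question is about domain names, stripping the accents is probably only half the job: a host name label may contain only letters, digits and hyphens, cannot start or end with a hyphen, and is limited to 63 characters. Below is a rough sketch of how the same normalization idea could feed into that. The function name `ConvertTo-HostLabel` is hypothetical, and it uses the regex class `\p{Mn}` (non-spacing marks) instead of the character loop, which is my substitution rather than anything from the linked answer:

```powershell
# Hypothetical helper for turning free text into a single host name label.
function ConvertTo-HostLabel {
    param([string]$Text)

    # Decompose accented letters, then remove the combining marks (\p{Mn}).
    $decomposed = $Text.Normalize([System.Text.NormalizationForm]::FormD)
    $ascii = $decomposed -replace '\p{Mn}', ''

    # Lowercase, then collapse every run of other characters into one hyphen.
    $label = $ascii.ToLowerInvariant() -creplace '[^a-z0-9]+', '-'

    # A label must not start or end with a hyphen.
    $label = $label.Trim('-')

    # A label is limited to 63 characters.
    if ($label.Length -gt 63) {
        $label = $label.Substring(0, 63).TrimEnd('-')
    }
    $label
}
```

Called with "Ó señor" this gives "o-senor". Characters that do not decompose (ø, ß and so on) end up as hyphens here, so map them beforehand if they matter for your names.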

TessellatingHeckler