As the title says, I want to convert zenkaku (full-width) characters to hankaku (half-width) ones and vice versa in C#, but I can't figure out how to do it. For example, "ラーメン" to "ﾗｰﾒﾝ" and the other way around. Would it be possible to write this as a method that determines automatically which way the conversion needs to go, based on the format of the input?
2 Answers
3
You can use the Strings.StrConv() method by including a reference to Microsoft.VisualBasic.dll, or you can p/invoke the LCMapString() native function:
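    using System.Runtime.InteropServices;
    using System.Text;

    private const uint LOCALE_SYSTEM_DEFAULT = 0x0800;
    private const uint LCMAP_HALFWIDTH = 0x00400000;

    public static string ToHalfWidth(string fullWidth)
    {
        // A voiced kana such as ガ becomes two half-width characters, so reserve
        // twice the input length plus one for the null terminator.
        StringBuilder sb = new StringBuilder(fullWidth.Length * 2 + 1);
        // cchSrc = -1 tells LCMapString that the source string is null-terminated.
        LCMapString(LOCALE_SYSTEM_DEFAULT, LCMAP_HALFWIDTH, fullWidth, -1, sb, sb.Capacity);
        return sb.ToString();
    }

    [DllImport("kernel32.dll", CharSet = CharSet.Unicode)]
    private static extern int LCMapString(uint Locale, uint dwMapFlags, string lpSrcStr, int cchSrc, StringBuilder lpDestStr, int cchDest);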
and you can also do the reverse:
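    private const uint LCMAP_FULLWIDTH = 0x00800000;

    public static string ToFullWidth(string halfWidth)
    {
        // Full-width output is not expected to be longer than the input,
        // so the input length plus one for the null terminator should suffice.
        StringBuilder sb = new StringBuilder(halfWidth.Length + 1);
        LCMapString(LOCALE_SYSTEM_DEFAULT, LCMAP_FULLWIDTH, halfWidth, -1, sb, sb.Capacity);
        return sb.ToString();
    }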
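For the Strings.StrConv() route mentioned at the top, a minimal sketch might look like the following. It assumes a reference to Microsoft.VisualBasic.dll and a Windows host; the class name `StrConvWidth` and the choice of the Japanese locale ID (0x0411) are illustrative assumptions, not something the answer prescribes.

```csharp
using Microsoft.VisualBasic; // requires a reference to Microsoft.VisualBasic.dll

// Hypothetical wrapper class, named for illustration only.
public static class StrConvWidth
{
    // Assumption: pass the Japanese LCID explicitly so the result does not
    // depend on the system locale. Wide/Narrow apply to East Asian locales.
    private const int JapaneseLcid = 0x0411;

    // VbStrConv.Narrow converts full-width (zenkaku) to half-width (hankaku).
    public static string ToHalfWidth(string s)
    {
        return Strings.StrConv(s, VbStrConv.Narrow, JapaneseLcid);
    }

    // VbStrConv.Wide converts half-width (hankaku) to full-width (zenkaku).
    public static string ToFullWidth(string s)
    {
        return Strings.StrConv(s, VbStrConv.Wide, JapaneseLcid);
    }
}
```

This keeps the call site free of p/invoke declarations and buffer sizing, at the cost of the extra assembly reference.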
As for detecting the format of the input string, I'm not aware of an easy way without doing a conversion first and comparing results. (What if the string contains both full-width and half-width characters?)
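The convert-and-compare idea can be sketched roughly as below. The two converters are passed in as delegates, so either the LCMapString-based methods above or any other pair can be plugged in. The names `WidthKinds`, `WidthDetector`, `Detect` and `ConvertAuto` are made up for this sketch, and leaving mixed input untouched is just one possible policy.

```csharp
using System;

// Hypothetical result type: which widths were found in the input.
[Flags]
public enum WidthKinds
{
    None = 0,
    HalfWidth = 1,
    FullWidth = 2,
    Mixed = HalfWidth | FullWidth
}

public static class WidthDetector
{
    // Runs both conversions and checks which of them changed the input.
    public static WidthKinds Detect(string input, Func<string, string> toHalf, Func<string, string> toFull)
    {
        WidthKinds kinds = WidthKinds.None;
        // If widening changes the string, it contained half-width characters.
        if (!string.Equals(toFull(input), input, StringComparison.Ordinal))
            kinds |= WidthKinds.HalfWidth;
        // If narrowing changes the string, it contained full-width characters.
        if (!string.Equals(toHalf(input), input, StringComparison.Ordinal))
            kinds |= WidthKinds.FullWidth;
        return kinds;
    }

    // Flips the width only when the direction is unambiguous.
    public static string ConvertAuto(string input, Func<string, string> toHalf, Func<string, string> toFull)
    {
        switch (Detect(input, toHalf, toFull))
        {
            case WidthKinds.HalfWidth: return toFull(input);
            case WidthKinds.FullWidth: return toHalf(input);
            default: return input; // None (e.g. kanji only) or Mixed: leave unchanged
        }
    }
}
```

A call would look like `WidthDetector.ConvertAuto(text, ToHalfWidth, ToFullWidth)`. Note that plain ASCII letters and digits count as half-width under this test, because widening changes them too.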

John Estropia
- Thanks for the suggestion. This basically answers my question. A pity that there is no easy way to combine the two functions to get the conversion done automatically. – yu_ominae Jun 22 '11 at 04:05
- Doing so would be ambiguous anyway. If I give "ﾗーメン" (note the first char is half-width) as input to your combined function, do you output "ラｰﾒﾝ" (convert char-by-char), "ラーメン" (convert based on first char), or "ﾗｰﾒﾝ" (convert based on majority)? – John Estropia Jun 22 '11 at 04:16
- You're quite right, it gets pretty complicated... I am doing this to highlight substrings in a string containing Japanese characters. I guess in this case covering all possibilities would take too much processing power for not very much benefit to the end user. I ended up doing what you suggested by the way: converting to zenkaku and hankaku and then comparing the two results to see if anything changed, so I can eliminate kanji. Thanks for the help! – yu_ominae Jun 22 '11 at 06:39
2
One approach is to compile a list of all characters you want to convert and how they map to each other, and then iterate the input string and replace all characters in the list with their equivalent.
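    // Requires: using System.Collections.Generic; using System.Linq;
    var fullToHalf = new Dictionary<char, char>
    {
        // ...
        { '\u30E9', '\uFF97' }, // KATAKANA LETTER RA -> HALFWIDTH KATAKANA LETTER RA
        { '\u30EA', '\uFF98' }, // KATAKANA LETTER RI -> HALFWIDTH KATAKANA LETTER RI
        // ...
    };
    // Reverse lookup table, built by swapping keys and values.
    var halfToFull = fullToHalf.ToDictionary(kv => kv.Value, kv => kv.Key);

    var input = "\u30E9";
    // True only when every character has an entry in the corresponding table.
    var isFullWidth = input.All(ch => fullToHalf.ContainsKey(ch));
    var isHalfWidth = input.All(ch => halfToFull.ContainsKey(ch));

    // Characters without a mapping pass through unchanged instead of throwing KeyNotFoundException.
    var result = new string(input.Select(ch => fullToHalf.TryGetValue(ch, out var half) ? half : ch).ToArray());
    // result == "\uFF97"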
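Part of the table does not have to be typed by hand. The full-width ASCII variants U+FF01 to U+FF5E sit at a fixed offset of 0xFEE0 from U+0021 to U+007E, so they can be generated in a loop; katakana has no such fixed offset and still needs explicit pairs. The sketch below shows this under those assumptions. The class name `WidthTable` is hypothetical, and only the two katakana pairs from the snippet above are listed.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper class, named for illustration only.
public static class WidthTable
{
    private static readonly Dictionary<char, char> FullToHalf = BuildFullToHalf();
    private static readonly Dictionary<char, char> HalfToFull = Invert(FullToHalf);

    private static Dictionary<char, char> BuildFullToHalf()
    {
        var map = new Dictionary<char, char>();
        // Full-width ASCII variants: fixed offset of 0xFEE0 from plain ASCII.
        for (char full = '\uFF01'; full <= '\uFF5E'; full++)
            map[full] = (char)(full - 0xFEE0);
        map['\u3000'] = ' '; // IDEOGRAPHIC SPACE -> SPACE
        // Katakana pairs must be listed explicitly; only two are shown here.
        map['\u30E9'] = '\uFF97'; // RA
        map['\u30EA'] = '\uFF98'; // RI
        return map;
    }

    // Builds the reverse table by swapping keys and values.
    private static Dictionary<char, char> Invert(Dictionary<char, char> source)
    {
        var result = new Dictionary<char, char>();
        foreach (var kv in source)
            result[kv.Value] = kv.Key;
        return result;
    }

    public static string ToHalfWidth(string input) { return Map(input, FullToHalf); }
    public static string ToFullWidth(string input) { return Map(input, HalfToFull); }

    // Replaces each character that has a table entry; others pass through.
    private static string Map(string input, Dictionary<char, char> table)
    {
        var sb = new StringBuilder(input.Length);
        foreach (char ch in input)
        {
            char mapped;
            sb.Append(table.TryGetValue(ch, out mapped) ? mapped : ch);
        }
        return sb.ToString();
    }
}
```

Because the tables are built once in static fields, each conversion is a single pass over the input with one dictionary lookup per character. A char-to-char table cannot express voiced kana such as ガ, whose half-width form is two characters, so those would need a string-valued table.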

dtb
- Thanks for the suggestion. I thought about something like this with a string.Contains and arrays containing all characters, but am wondering about the time that would take. A dictionary seems neater so I might give this a go. – yu_ominae Jun 22 '11 at 04:02
- Just to say: thanks very much for the suggestion. I like the approach, but it was overly complex for what I was trying to achieve. Also, one potential problem I have with this is the need to create the dictionaries beforehand... A bit strange that this should be the only way to do this in pure C# when VB has the StrConv() method. – yu_ominae Jun 22 '11 at 06:41
- I've made a .NET Standard library and NuGet Package that implements this approach: https://github.com/bgever/HalfFullWidth – Bart Verkoeijen Sep 12 '20 at 17:09