I want to transliterate Japanese to Romaji with the Kakasi tool from C#. For this, I wrote a wrapper:

    [DllImport("kakasi.dll")]
    static extern int kakasi_getopt_argv(int size, IntPtr param);
    [DllImport("kakasi.dll")]
    static extern IntPtr kakasi_do([MarshalAs(UnmanagedType.LPStr)]string str);

    public static void SetParams(string [] paramz)
    {
        kakasi_getopt_argv(paramz.Length, StringToIntPtr(paramz));
    }

    public static string DoKakasi(string japanese)
    {
        return Marshal.PtrToStringAuto(kakasi_do(japanese));
    }

    private static IntPtr StringToIntPtr(string[] strings)
    {
        int bytesCount;
        IntPtr ptr = IntPtr.Zero;
        ArrayList stringBytes = new ArrayList();
        foreach (string str in strings)
        {
            stringBytes.AddRange(Encoding.Unicode.GetBytes(str));
            stringBytes.Add((byte)'\0');
        }
        bytesCount = stringBytes.Count;
        try
        {
            ptr = Marshal.AllocHGlobal(bytesCount);
            Marshal.Copy((byte[])stringBytes.ToArray(typeof(byte))
                , 0
                , ptr
                , bytesCount);
            return ptr;
        }
        catch
        {
            if (ptr != IntPtr.Zero)
                Marshal.FreeHGlobal(ptr);
            throw;
        }
    }

And then:

    KakasiCs.SetParams(new[] { "kakasi", "-ja", "-ga", "-ka", "-Ea", "-Ka", "-Ha", "-Ja", "-U", "-s" });
    var x = KakasiCs.DoKakasi("さかき");

I have 2 problems:

  1. Bad output: instead of romaji I get garbage such as "㼿?Äꈎᅵ鄠".
  2. In VS2010, every run raises a PInvokeStackImbalance warning.

Any help is appreciated. Thanks.

Makoto
Zelzer
  • I'm not familiar with this library so I can't really help, but I noticed you have some Romaji in there that doesn't correspond with any Hiragana or Katakana. Specifically, Ea, ja and s. Ea requires 2 characters, ja kind of requires 2 (I'm actually not sure how that would be represented in Unicode) and s just doesn't exist, period. – MGZero Jun 22 '11 at 15:49
  • @MGZero, thanks for pointing that out. Japanese is really a fog to me, so this is interesting info. – Zelzer Jun 22 '11 at 16:00
  • I read up on the library a bit and it turns out this is for converting Kanji to Hiragana, Katakana or Romaji. Take note that this means you really won't be able to transliterate an entire sentence using this library. – MGZero Jun 22 '11 at 16:15

1 Answer

I have used this library (only with C++ Builder). Before passing the string to Kakasi, you should convert it to the Shift-JIS code page (932). After processing, convert the result back to Unicode. Here is the code that I use:

    ...
    // Convert the Unicode input to Shift-JIS (code page 932) before calling Kakasi.
    char* shift_jis = CodePageConverter::fromUnicode(932, InputTextBox->Text.c_bstr());
    char* converted_text = ProcessText(shift_jis);
    // Convert the result from code page 932 back to Unicode for display.
    OutputTextBox->Text = CodePageConverter::toUnicode(932, converted_text);

    ...
    char* TForm1::ProcessText(char* string)
    {
      int paramscount = 0;
      char** argv = CreateParameters(paramscount);

      kakasi_getopt_argv(paramscount, argv);
      char* result = kakasi_do(string);
      DeleteArguments(argv, paramscount);
      return result;
    }
    ...
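
The `CreateParameters` and `DeleteArguments` helpers are not shown in the answer. As a minimal sketch of what such helpers might look like, assuming `kakasi_getopt_argv` expects a conventional `argv` (an array of pointers to null-terminated single-byte strings, as the `char**` in the call above suggests), the hypothetical `make_argv` and `free_argv` below build and release such an array. This layout differs from the single flat UTF-16 buffer that `StringToIntPtr` produces in the question.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper (not from the answer): builds a C-style argv array,
// i.e. an array of pointers, each pointing to its own null-terminated
// single-byte string. The array itself ends with a null pointer.
char** make_argv(const std::vector<std::string>& args) {
    char** argv = new char*[args.size() + 1];
    for (std::size_t i = 0; i < args.size(); ++i) {
        argv[i] = new char[args[i].size() + 1];
        // Copy the characters plus the terminating '\0'.
        std::memcpy(argv[i], args[i].c_str(), args[i].size() + 1);
    }
    argv[args.size()] = nullptr;
    return argv;
}

// Hypothetical counterpart: releases every string, then the pointer array.
void free_argv(char** argv, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        delete[] argv[i];
    }
    delete[] argv;
}
```

With the Kakasi header available, the array would then be passed as `kakasi_getopt_argv(static_cast<int>(args.size()), argv)` before calling `kakasi_do`, and released with `free_argv` afterwards, mirroring the order used in `ProcessText`.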
Dmitry