I have an unmanaged DLL that gets called from .NET with pre-allocated buffers to get filled within the unmanaged DLL (according to Pass C# string to C++ and pass C++ result (string, char*.. whatever) to C#).

My unmanaged function has the following prototype:

myFunc(char* a_inBuf,  int a_InLen, 
       char* a_outBuf, int* a_pOutLen, 
       char* a_errBuf, int* a_pErrLen);

So, I declare the method in the managed code like this:

public static extern int myFunc(
  [In, MarshalAs(UnmanagedType.LPStr)] string inputXml, int inputLen,
  [MarshalAs(UnmanagedType.LPStr)] StringBuilder outputXml, ref int outputLen,
  [MarshalAs(UnmanagedType.LPStr)] StringBuilder errorXml, ref int errorLen);

Before calling myFunc, I create the two StringBuilders:

StringBuilder outputXml = new StringBuilder(100);
StringBuilder errorXml  = new StringBuilder(100);

After calling myFunc, I take the two StringBuilders and write them into an XML file (one for each StringBuilder) using

using (StreamWriter writer = new StreamWriter("OutputXmlFile.xml", false, Encoding.UTF8))
{
  writer.Write(outputXml.ToString());
  writer.Close();
}

The output must be written as UTF-8, because the input is UTF-8 as well. Unfortunately, the StringBuilder holds its content as UTF-16. The unmanaged DLL fills outputXml and errorXml with UTF-8 encoded data, and this behaviour should not be changed. When the files are written, special characters contained in the StringBuilders are not written correctly.

How do I tell the StringBuilder that the content is actually NOT UTF16 but UTF8?


Edit: the answer provided by Polynomial suggests using xmlWriter to write the file. But the writing is only used for debugging output. In a normal application run, the content of outputXml and errorXml is used directly within the program. Therefore, hints regarding special XML handling classes are not useful.

The actual issue is to get the correct strings out of the StringBuilder (or convert them to be correct).

eckes

3 Answers

You cannot convince the P/Invoke marshaller to convert from UTF-8. It assumes either UTF-16 or the system default code page, and it always converts to UTF-16.
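To illustrate why the special characters come out wrong, here is a small sketch that reproduces the effect without any native code. It assumes ISO-8859-1 as a stand-in for the system default code page (the real code page depends on the machine), and the class and method names are made up for this example.

```csharp
using System;
using System.Text;

public static class MarshalMisdecodeDemo
{
    // Assumption: ISO-8859-1 stands in for the system default (ANSI) code page.
    private static readonly Encoding AssumedCodePage = Encoding.GetEncoding("iso-8859-1");

    // Simulates the native side writing UTF-8 bytes that are then
    // interpreted one byte per character, as a code-page conversion would do.
    public static string DecodeWithWrongCodePage(string original)
    {
        byte[] utf8 = Encoding.UTF8.GetBytes(original);
        return AssumedCodePage.GetString(utf8);
    }

    // Decoding the same bytes as UTF-8 gives the original text back.
    public static string DecodeAsUtf8(string original)
    {
        byte[] utf8 = Encoding.UTF8.GetBytes(original);
        return Encoding.UTF8.GetString(utf8);
    }
}
```

With this sketch, a single "ä" (two bytes in UTF-8) turns into two characters when decoded with the assumed code page, which matches the kind of damage described in the question.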

Not a problem, just do it yourself. Declare the arguments as byte[] instead. Create the arrays with the proper length before the call, and after the call use Encoding.UTF8.GetString() to convert.
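A minimal sketch of that approach could look like the following. Several things here are assumptions and not taken from the question: the DLL name "MyNative.dll", the Cdecl calling convention, the 4096-byte buffer size, the helper names, and the convention that the length parameters count bytes without the terminating NUL. Adjust them to what the native function really expects.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

public static class Utf8Interop
{
    // Hypothetical declaration: DLL name and calling convention are assumptions.
    // byte[] buffers are passed as raw bytes, so no string conversion takes place.
    [DllImport("MyNative.dll", CallingConvention = CallingConvention.Cdecl)]
    public static extern int myFunc(
        byte[] inBuf, int inLen,
        [In, Out] byte[] outBuf, ref int outLen,
        [In, Out] byte[] errBuf, ref int errLen);

    // Encodes a managed string as UTF-8 and appends a terminating NUL byte.
    public static byte[] ToUtf8Z(string text)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(text);
        Array.Resize(ref bytes, bytes.Length + 1); // the added byte is 0
        return bytes;
    }

    // Decodes at most 'length' bytes as UTF-8, stopping at the first NUL byte.
    public static string FromUtf8(byte[] buffer, int length)
    {
        int count = Math.Max(0, Math.Min(length, buffer.Length));
        int nul = Array.IndexOf(buffer, (byte)0, 0, count);
        if (nul >= 0) count = nul;
        return Encoding.UTF8.GetString(buffer, 0, count);
    }

    // Sketch of a call; assumes the native side reports the bytes written
    // through outLen and errLen. Requires the native DLL to be present.
    public static string Call(string inputXml, out string errorXml)
    {
        byte[] input = ToUtf8Z(inputXml);
        byte[] output = new byte[4096]; // assumed capacity
        byte[] error = new byte[4096];  // assumed capacity
        int outLen = output.Length;
        int errLen = error.Length;

        myFunc(input, input.Length - 1, output, ref outLen, error, ref errLen);

        errorXml = FromUtf8(error, errLen);
        return FromUtf8(output, outLen);
    }
}
```

The strings returned by FromUtf8 are ordinary .NET strings, so they can be used directly in the program or written with a StreamWriter in any encoding.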

Hans Passant

There's an absolutely awesome article on this topic which helped me solve exactly this problem. Here it is: http://www.undermyhat.org/blog/2009/08/tip-force-utf8-or-other-encoding-for-xmlwriter-with-stringbuilder/

Essentially, you have to use xmlWriter.ForceEncoding(Encoding.UTF8) to force the encoding, but there are some caveats to it. Give the article a read and it should help you understand what's going on, why it's UTF-16 in the first place and how to get round it.

Polynomial
  • thanks for the answer. I already read the article before but it doesn't actually cover my problem: the writing of the xml files is just for debugging. In normal application use, the content of `outputXml` and `errorXml` is directly used in the application without writing it out to a file... – eckes Nov 07 '11 at 09:36
Try doing something like this (it gives a way to override the default nature of .NET's UTF-16):

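using System.IO;
using System.Text;

// A StringWriter that reports a caller-chosen encoding instead of UTF-16.
public class StringWriterWithEncoding : StringWriter
{
    private readonly Encoding encoding;

    public StringWriterWithEncoding(StringBuilder builder, Encoding encoding) : base(builder)
    {
        this.encoding = encoding;
    }

    // Consumers such as XmlWriter read this property to pick the declared encoding.
    public override Encoding Encoding
    {
        get { return encoding; }
    }
}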

The idea is that this overrides the UTF-16 encoding that a StringWriter reports by default. Then you can call it like this:

StringBuilder builder = new StringBuilder();
StringWriterWithEncoding stringWriter = new StringWriterWithEncoding(builder, Encoding.UTF8);
XmlWriter writer = new XmlTextWriter(stringWriter);
// ... write the XML through writer and flush it ...
return stringWriter.ToString();
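As a self-contained illustration of what the overridden Encoding property changes, the sketch below writes the same tiny document through a plain StringWriter and through the subclass. The class is repeated so the block compiles on its own; XmlDeclarationDemo, its Write method and the "root" element are made up for this example. Note that this affects the encoding named in the XML declaration only; it does not re-decode bytes that were already converted to a string.

```csharp
using System.IO;
using System.Text;
using System.Xml;

// Repeated here so the example is self-contained.
public class StringWriterWithEncoding : StringWriter
{
    private readonly Encoding encoding;

    public StringWriterWithEncoding(StringBuilder builder, Encoding encoding) : base(builder)
    {
        this.encoding = encoding;
    }

    public override Encoding Encoding
    {
        get { return encoding; }
    }
}

public static class XmlDeclarationDemo
{
    // Writes an empty <root /> document and returns the resulting text.
    public static string Write(bool declareUtf8)
    {
        StringBuilder builder = new StringBuilder();
        StringWriter target = declareUtf8
            ? new StringWriterWithEncoding(builder, Encoding.UTF8)
            : new StringWriter(builder);

        using (target)
        using (XmlWriter writer = XmlWriter.Create(target))
        {
            writer.WriteStartElement("root");
            writer.WriteEndElement();
        }

        return builder.ToString();
    }
}
```

Calling XmlDeclarationDemo.Write(false) yields a declaration that names utf-16, while Write(true) yields one that names utf-8.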
Andrew Jackman