I could not find a better answer, so I am answering my own question.
Cause.
Many languages use non-spacing modifiers for characters. For European languages there are precomposed substitutes, e.g. "u" (U+0075) + combining diaeresis (U+0308) = "ü" (U+00FC). In this case, the solution by @tchrist is quite sufficient.
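However, for complex writing systems there are no precomposed characters to substitute for a base character plus its non-spacing modifiers. NUnit's TextMessageWriter.WriteCaretLine(int mismatch) treats the mismatch parameter as an index into the string's chars (UTF-16 code units, not bytes), while the on-screen representation of a Thai string may be shorter than that index suggests, so the caret line ("-----^") overshoots the real mismatch position.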
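To make the cause concrete, here is a minimal sketch that is not part of NUnit. The names CaretWidthDemo and CountSpacingChars are made up for this illustration, and it assumes a monospaced font in which every char that is not a non-spacing mark takes exactly one column (surrogate pairs and wide glyphs are ignored).

```csharp
using System;
using System.Globalization;

public static class CaretWidthDemo
{
    // Counts the chars that are NOT non-spacing marks, i.e. the ones
    // assumed to occupy a column of their own on screen.
    public static int CountSpacingChars(string text)
    {
        int count = 0;
        foreach (char c in text)
        {
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
                count++;
        }
        return count;
    }

    public static void Main()
    {
        // THO THAHAN + SARA II + MAI EK: one consonant carrying two marks.
        string thai = "\u0E17\u0E35\u0E48";
        Console.WriteLine(thai.Length);             // 3
        Console.WriteLine(CountSpacingChars(thai)); // 1
    }
}
```

The string is three chars long but fills only one column, so a caret line built from the char index ends up two dashes too long.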
Solution.
Force WriteCaretLine(int mismatch) to respect non-spacing modifiers by reducing the mismatch value by the number of non-spacing modifiers that occur before this offset.
Implement the supplementary classes as well; they are needed only so that the new code actually gets invoked.
Along with Thai, I have tested it with Devanagari and Tibetan. It works as expected.
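The offset adjustment itself can be sketched in isolation. This is a simplified illustration rather than the code I actually plugged into NUnit; CaretOffsetSketch and ToDisplayColumn are hypothetical names, and it again assumes one column per char that is not a non-spacing mark.

```csharp
using System;
using System.Globalization;

public static class CaretOffsetSketch
{
    // Hypothetical helper: converts a char index into a display column by
    // skipping the non-spacing marks that precede that index.
    public static int ToDisplayColumn(string text, int mismatch)
    {
        int column = 0;
        int end = Math.Min(mismatch, text.Length);
        for (int i = 0; i < end; i++)
        {
            if (CharUnicodeInfo.GetUnicodeCategory(text[i]) != UnicodeCategory.NonSpacingMark)
                column++;
        }
        return column;
    }

    public static void Main()
    {
        // The same Thai syllable followed by "A"; compared against the
        // syllable followed by "B", the first difference is at char index 3.
        string expected = "\u0E17\u0E35\u0E48A";
        Console.WriteLine(ToDisplayColumn(expected, 3)); // 1
    }
}
```

The char index 3 becomes column 1, which is where the caret should point on screen.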
Yet another pitfall. If you're using NUnit with Visual Studio through ReSharper like I do, you have to configure your Internet Explorer's fonts (this cannot be managed from R#) so that it uses proper monospaced fonts for Thai, Devanagari, etc.
Implementation.
- Inherit TextMessageWriter and override its DisplayStringDifferences;
- Implement your own ClipExpectedAndActual and FindMismatchPosition; this is where non-spacing modifiers are respected. Proper clipping is needed since it may also affect the count of non-spacing elements;
- Inherit EqualConstraint and override its WriteMessageTo(MessageWriter writer) so that your MessageWriter is used;
- Optionally, create a custom wrapper for simple invocation of the custom constraint.
The source code follows. About 80% of it does nothing new; it is copied from NUnit only because the access levels in the original code prevent reuse.
// Step 1.
// Namespaces below assume NUnit 2.x.
using System;
using System.Globalization;
using NUnit.Framework;
using NUnit.Framework.Constraints;

public class ThaiMessageWriter : TextMessageWriter
{
    /// <summary>
    /// This method is merely a copy of the original method taken from NUnit sources,
    /// except that it changes the meaning of <paramref name="mismatch"/> before the caret line is displayed.
    /// </summary>
    /// <remarks>
    /// The originally passed <paramref name="mismatch"/> is a char index, while proper display of the caret
    /// requires its position to be calculated in character placeholder units. The two differ for
    /// marks drawn over or under the line, such as a combining acute mark or complex script (Thai) marks.
    /// </remarks>
    public override void DisplayStringDifferences(string expected, string actual, int mismatch, bool ignoreCase, bool clipping)
    {
        // Maximum string we can display without truncating
        int maxDisplayLength = MaxLineLength
            - PrefixLength // Allow for prefix
            - 2;           // 2 quotation marks
        int mismatchOffset = mismatch;
        if (clipping)
            MsgUtils2.ClipExpectedAndActual(ref expected, ref actual, maxDisplayLength, mismatchOffset);
        expected = MsgUtils.EscapeControlChars(expected);
        actual = MsgUtils.EscapeControlChars(actual);
        // The mismatch position may have changed due to clipping or white space conversion
        int mismatchInCharPlaceholders = MsgUtils2.FindMismatchPosition(expected, actual, 0, ignoreCase);
        Write(Pfx_Expected);
        WriteExpectedValue(expected);
        if (ignoreCase)
            WriteModifier("ignoring case");
        WriteLine();
        WriteActualLine(actual);
        //DisplayDifferences(expected, actual);
        if (mismatch >= 0)
            WriteCaretLine(mismatchInCharPlaceholders);
    }

    // Copied because it is private in the base class
    /// <summary>
    /// Write the generic 'Actual' line for a constraint
    /// </summary>
    /// <param name="constraint">The constraint for which the actual value is to be written</param>
    private void WriteActualLine(Constraint constraint)
    {
        Write(Pfx_Actual);
        constraint.WriteActualValueTo(this);
        WriteLine();
    }

    // Copied because it is private in the base class
    /// <summary>
    /// Write the generic 'Actual' line for a given value
    /// </summary>
    /// <param name="actual">The actual value causing a failure</param>
    private void WriteActualLine(object actual)
    {
        Write(Pfx_Actual);
        WriteActualValue(actual);
        WriteLine();
    }

    // Copied because it is private in the base class
    private void WriteCaretLine(int mismatch)
    {
        // We subtract 2 for the initial 2 blanks and add back 1 for the initial quote
        WriteLine("  {0}^", new string('-', PrefixLength + mismatch - 2 + 1));
    }
}
// Step 2.
public static class MsgUtils2
{
    private static readonly string ELLIPSIS = "...";

    /// <summary>
    /// Almost a copy of the MsgUtils.ClipExpectedAndActual method
    /// </summary>
    public static void ClipExpectedAndActual(ref string expected, ref string actual, int maxDisplayLength, int mismatch)
    {
        // Case 1: Both strings fit on line
        int maxStringLength = Math.Max(expected.Length, actual.Length);
        if (maxStringLength <= maxDisplayLength)
            return;
        // Case 2: Assume that the tail of each string fits on line
        int clipLength = maxDisplayLength - ELLIPSIS.Length;
        int clipStart = maxStringLength - clipLength;
        // Case 3: If it doesn't, center the mismatch position
        if (clipStart > mismatch)
            clipStart = Math.Max(0, mismatch - clipLength / 2);
        // Shift both clipStart and maxDisplayLength if they would split off a non-placeholding symbol
        AdjustForNonPlaceholdingCharacter(expected, ref clipStart);
        AdjustForNonPlaceholdingCharacter(expected, ref maxDisplayLength);
        expected = MsgUtils.ClipString(expected, maxDisplayLength, clipStart);
        actual = MsgUtils.ClipString(actual, maxDisplayLength, clipStart);
    }

    private static void AdjustForNonPlaceholdingCharacter(string expected, ref int index)
    {
        // The upper bound check avoids indexing past the end when 'expected'
        // is shorter than the requested position.
        while (index > 0 && index < expected.Length
            && CharUnicodeInfo.GetUnicodeCategory(expected[index]) == UnicodeCategory.NonSpacingMark)
        {
            index--;
        }
    }

    public static int FindMismatchPosition(string expected, string actual, int istart, bool ignoreCase)
    {
        int length = Math.Min(expected.Length, actual.Length);
        string s1 = ignoreCase ? expected.ToLower() : expected;
        string s2 = ignoreCase ? actual.ToLower() : actual;
        // Number of characters before the current index that occupy a placeholder of their own
        int iSpacingCharacters = 0;
        for (int i = 0; i < istart; i++)
        {
            if (CharUnicodeInfo.GetUnicodeCategory(s1[i]) != UnicodeCategory.NonSpacingMark)
                iSpacingCharacters++;
        }
        for (int i = istart; i < length; i++)
        {
            if (s1[i] != s2[i])
                return iSpacingCharacters;
            if (CharUnicodeInfo.GetUnicodeCategory(s1[i]) != UnicodeCategory.NonSpacingMark)
                iSpacingCharacters++;
        }
        //
        // Strings have same content up to the length of the shorter string.
        // Mismatch occurs because string lengths are different, so show
        // that they start differing where the shortest string ends
        // (measured in placeholders, like the other return value)
        //
        if (expected.Length != actual.Length)
            return iSpacingCharacters;
        //
        // Same strings : We shouldn't get here
        //
        return -1;
    }
}
// Step 3.
public class ThaiEqualConstraint : EqualConstraint
{
    // The base class keeps 'expected' private, so store our own copy
    private readonly string _expected;

    public ThaiEqualConstraint(string expected) : base(expected)
    {
        _expected = expected;
    }

    public override void WriteMessageTo(MessageWriter writer)
    {
        // Redirect output to the customized MessageWriter,
        // then copy its text into the writer supplied by NUnit
        var myMessageWriter = new ThaiMessageWriter();
        base.WriteMessageTo(myMessageWriter);
        writer.Write(myMessageWriter);
    }
}
// Step 4.
public static class ThaiText
{
    public static EqualConstraint IsEqual(string expected)
    {
        return new ThaiEqualConstraint(expected);
    }
}