17

What test text do you type into your web forms to check that they handle all the edge cases properly (especially Unicode and XSS-style problems)?

I am particularly interested in good Unicode strings that may do something odd if they are mis-encoded when they are displayed again.

Text that contains potentially problematic characters, such as quotes, <, > and so on, would also be interesting.
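As a starting point for the HTML-sensitive characters, here is a minimal sketch in Python. The probe strings and the `render_safely` helper are made up for illustration; the only library call is the standard `html.escape`. It shows what a correctly escaped echo of each input should look like, so anything your form displays differently is worth a closer look.

```python
import html

# Hypothetical probe strings; extend the list for your own forms.
PROBES = [
    '<script>alert("xss")</script>',
    "O'Reilly & Sons",
    '"><img src=x onerror=alert(1)>',
    "1 < 2 > 0",
]

def render_safely(value: str) -> str:
    """Escape a submitted value before echoing it back into HTML."""
    # quote=True also escapes " and ', which matters inside attributes.
    return html.escape(value, quote=True)

if __name__ == "__main__":
    for probe in PROBES:
        print(render_safely(probe))
```

If the page source shows any of these probes with a raw `<`, `>` or quote character, the output is not being escaped.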

Yi Jiang
Rik Heywood

4 Answers

22

Your idea of HTML-sensitive characters is a good start. I also like using characters that are still fairly readable but fall outside ASCII. When I was doing this kind of testing for tabblo.com, I used this string:

Testing «ταБЬℓσ»: 1<2 & 4+1>3, now 20% off!

This has HTML-sensitive characters, ASCII, upper-half ISO characters, and multi-byte Unicode characters.
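A rough way to see why the mix is useful, sketched in Python. The `mojibake` helper is hypothetical; it simulates one common failure, where UTF-8 bytes are read back as Latin-1. The escaping check uses the standard `html` module.

```python
import html

SAMPLE = "Testing «ταБЬℓσ»: 1<2 & 4+1>3, now 20% off!"

def mojibake(text: str, wrong_encoding: str = "latin-1") -> str:
    """Simulate a mis-decoded display: UTF-8 bytes read with the wrong codec."""
    return text.encode("utf-8").decode(wrong_encoding)

def round_trips(text: str) -> bool:
    """True if the text survives UTF-8 encoding plus HTML escape/unescape."""
    decoded = text.encode("utf-8").decode("utf-8")
    return html.unescape(html.escape(decoded)) == text

if __name__ == "__main__":
    print(html.escape(SAMPLE))   # what the page source should contain
    print(mojibake(SAMPLE))      # what a wrong charset would display
    print(round_trips(SAMPLE))
```

The garbled output is easy to spot by eye, which is the advantage of a string that stays readable when it is handled correctly.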

Ned Batchelder
11

Turkey testing!

http://www.moserware.com/2008/02/does-your-code-pass-turkey-test.html

This is actually pretty advanced internationalization testing, not for the faint of heart, including date formatting, percent calculations, upper/lowercase translations, etc.
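The upper/lowercase part can be illustrated with a small Python sketch. Python's `str.upper()` and `str.lower()` are locale-independent, so this does not reproduce Turkish casing rules; it only shows that the Turkish letters do not survive a naive case round trip. The helper name is made up for the example.

```python
def survives_case_round_trip(text: str) -> bool:
    """True if upper-casing then lower-casing gives the text back."""
    return text.upper().lower() == text

DOTLESS_I = "\u0131"       # ı, Turkish lowercase dotless i
DOTTED_CAPITAL_I = "\u0130"  # İ, Turkish uppercase dotted I

if __name__ == "__main__":
    # ı upper-cases to a plain ASCII I, which lower-cases to a dotted i.
    print(survives_case_round_trip(DOTLESS_I))
    # İ lower-cases to two code points: i plus a combining dot above.
    print(len(DOTTED_CAPITAL_I.lower()))
    # A Turkish user would expect "TİTLE" here, not "TITLE".
    print("title".upper())
```

Any code that compares strings after changing case, such as case-insensitive usernames or keyword lookups, is a candidate for this kind of input.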

willoller
  • Not really a good test. It would fail in Poland. For example, these are valid dates in Poland: 31.12.2018, 31/12/2018, 2018-12-31, 31.12. And these are valid numbers: 1234,56 and 1 234,56 and 1.234,56 – Tom Nov 13 '18 at 20:10
  • The point is that if i18n is fully implemented for Turkey, it will work in many places worldwide, including Poland, since dates, money, and decimals are all handled using internationalized resources/libs/variables/strings. You would just set the region to Poland. – willoller Nov 13 '18 at 20:27
9

These smilies from SuperUser.com are pretty cool for testing your Unicode support as well:

https://superuser.com/questions/52671/how-do-i-create-unicode-smilies-like

٩(-̮̮̃-̃)۶ ٩(●̮̮̃•̃)۶ ٩(͡๏̯͡๏)۶ ٩(-̮̮̃•̃).
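What makes these awkward is the stack of combining marks on each base character. A minimal Python sketch using the standard `unicodedata` module follows; the `SMILEY` constant is my approximation of one of the faces above, written as escapes, and the helper names are made up.

```python
import unicodedata

# Approximation of one smiley above, spelled out as code points (an assumption).
SMILEY = "\u0669(\u0361\u0e4f\u032f\u0361\u0e4f)\u06f6"

def count_combining(text: str) -> int:
    """Count code points that are combining marks (non-zero combining class)."""
    return sum(1 for ch in text if unicodedata.combining(ch))

def strip_combining(text: str) -> str:
    """Drop combining marks, leaving only the base characters."""
    return "".join(ch for ch in text if not unicodedata.combining(ch))

if __name__ == "__main__":
    print(len(SMILEY), count_combining(SMILEY))
    print(strip_combining(SMILEY))
    # Length limits and truncation can differ before and after normalization.
    print(len("e\u0301"), len(unicodedata.normalize("NFC", "e\u0301")))
```

Fields with a maximum length, or code that truncates or reverses strings, are the places where these inputs tend to cause visible damage.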

Community
Rik Heywood
2

Well, this is a bit of a brute-force approach, but if you wanted to start from some well-formed Unicode and add some errors, a great resource for the real stuff is here: http://www.unicode.org/charts.
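One way to "add some errors", sketched in Python under the assumption that the form ultimately receives raw UTF-8 bytes. The helper names and the particular byte sequences are illustrative picks, not an exhaustive list.

```python
def truncate_last_byte(text: str) -> bytes:
    """Encode as UTF-8 and cut the final byte, splitting a multi-byte character."""
    return text.encode("utf-8")[:-1]

def is_valid_utf8(data: bytes) -> bool:
    """True if the bytes decode cleanly as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

# A few malformed inputs built from, or alongside, well-formed text.
BROKEN = [
    truncate_last_byte("caf\u00e9"),  # multi-byte character cut in half
    b"\x80",                          # continuation byte with no lead byte
    b"\xc0\xaf",                      # overlong encoding of "/"
    b"\xed\xa0\x80",                  # UTF-16 surrogate encoded as UTF-8
]

if __name__ == "__main__":
    for data in BROKEN:
        # errors="replace" shows what a lenient decoder would display.
        print(data, is_valid_utf8(data), data.decode("utf-8", errors="replace"))
```

Sending these as a raw request body (rather than typing them) checks whether the server rejects, replaces, or silently stores the bad bytes.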

John Lockwood