5

I have the following code:

string text1 = "more text";
string text2 = string.Format("some text plus {0}", text1);

I convert it to Arabic:

string text1 = "المزيد من النص";
string text2 = string.Format("بعض النص بالإضافة إلى {0}", text1);

So far no problems!

I then switch my computer's language from English to Arabic and look at the code, and the numbers have been automatically updated to Arabic digits...

string text1 = "المزيد من النص";
string text2 = string.Format("{.} بعض النصوص بالإضافة إلى", text1);

(I had to fudge this using a period, but the Arabic symbol for 0 looks similar enough for this example).

The trouble is that when the Windows language is set to Arabic, I get a string format exception because the Format command does not recognise the Arabic symbol for zero.

To be specific, my question is: how do you ensure that the zero entered remains a Latin digit, even in a non-Latin language? Or, how can I get this to work without an error?

EDIT:

Cut and pasted straight from the IDE: the first picture with Windows set to English, the second with the same code and Windows set to Arabic...

English

Arabic

damichab
  • What if you try with `String.Format(CultureInfo.InvariantCulture, "...", ...)`? – haim770 Jun 30 '21 at 07:29
  • As I understand it, that would affect the whole string. I only want to maintain the integrity of the tag. There are times when I have other numbers in the string. – damichab Jun 30 '21 at 07:32
  • *"the numbers have been automatically update to Arabic digits"* - there are no automatic updates of strings, or of numbers in strings. Where does this string with `{.}` come from? I'd expect you have localized strings, and the Arabic strings are loaded when you switch the language. So someone simply made a mistake while localizing the `{0}` sequences in the Arabic text. – Sinatr Jun 30 '21 at 07:34
  • @Sinatr Apparently not so. In english the string is "بعض النص بالإضافة إلى {0}", but when viewed when windows is set to Arabic, the zero gets updated to the Arabic symbol (which looks like a dot). When back to Windows-English, it looks like a normal zero again and the statements run without issue. – damichab Jun 30 '21 at 07:43
  • How exactly do you look at string? *"the zero gets updated"* - this is what I don't understand. The string is combination of bytes, the bytes will not change by switching language. I am trying to understand at which step this change occurs. – Sinatr Jun 30 '21 at 07:50
  • Me too. I have updated the question with screen shots. I would have thought that no matter what the symbol for zero looked like, that the underlying '\u0030' would be the same. Best I can tell, it is, but it just will not work in Arabic. – damichab Jun 30 '21 at 08:12
  • Do you restart visual studio and it's the same cs-file where `{0}` become `{.}`? I guess it has something to do with cs-file encoding. Visual Studio doesn't load it correctly. – Sinatr Jun 30 '21 at 08:47
  • Have all the numbers changed to Arabic? Do variable names have this issue too: `var test0` -> `var test.`? What about constants? Does `000` become `...`? Or does the change only happen in strings? Does the change occur to numbers without `{}` too? – Sinatr Jun 30 '21 at 08:53
  • It is just the numbers in the string. Mostly the numbers are already converted (as per Google translate), but the tags are left as is. But as you can see from the screen captures, the tags needed for string.Format change all by themselves. Changing language by installing language pack then selecting language, signing out and signing back in to effect the change. – damichab Jun 30 '21 at 09:13
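One way to settle the question raised in these comments (is the character between the braces still U+0030?) is to dump the code points of the string at run time rather than trusting how the editor draws them. This is only a sketch and not something from the original post; `DescribeChars` is a hypothetical helper name, and `\u0660` (the Arabic-Indic digit zero) is used as a stand-in for whatever character actually ends up in the file. My understanding is that composite format strings accept only the ASCII digits 0-9 as a placeholder index, so a U+0660 inside the braces would be expected to trigger the format exception.

```csharp
using System;
using System.Linq;

public static class FormatStringInspector
{
    // Hypothetical helper: lists every UTF-16 code unit of a string as U+XXXX.
    public static string DescribeChars(string s) =>
        string.Join(" ", s.Select(c => $"U+{(int)c:X4}"));

    public static void Main()
    {
        string latin = "{0}";        // ASCII digit zero, U+0030
        string arabic = "{\u0660}";  // Arabic-Indic digit zero, U+0660

        Console.WriteLine(DescribeChars(latin));   // U+007B U+0030 U+007D
        Console.WriteLine(DescribeChars(arabic));  // U+007B U+0660 U+007D
    }
}
```

Running `DescribeChars` on the real format string while Windows is set to Arabic would show whether the stored character has changed or only its on-screen rendering.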

2 Answers

0

String.Format(CultureInfo.InvariantCulture, "...", ...) should be the way to go, but you need to split the string into two parts. The first part contains the Arabic or English text. The second part contains the numeric part, and that is the one you should be formatting.

Here is the documentation from Microsoft for you to look at and determine the specific culture that you want to use. https://learn.microsoft.com/en-us/dotnet/api/system.string.format?view=net-5.0#System_String_Format_System_IFormatProvider_System_String_System_Object_

  • If I could just break up the sections, then I would not need to use the tags in the first place. – damichab Jun 30 '21 at 08:03
  • What I mean is that you break the string into two parts, and then add them in one string again `string ax = string a + string x` – Mostafa Tarek Yassien Jun 30 '21 at 08:16
  • I don't get why one would need to split the string at all. Sure, you found a workaround, but why did the constant string change? – Sinatr Jun 30 '21 at 08:45
  • @Sinatr I thought about this workaround to make a static part which is the numeric part, and another part that is dynamic which is the text part. – Mostafa Tarek Yassien Jun 30 '21 at 08:47
  • @Sinatr The numeric part needs to be fixed with a certain format, and the text part is dynamic and changes from one language to another one, – Mostafa Tarek Yassien Jun 30 '21 at 08:48
  • Why numeric part needs to be fixed? I expect unicode string containing characters from multiple languages (english + arabic) to stay as it is. I can suggest my own workaround - extract arabic text into resource dll, while keeping sources in english. But why this *morphing* of `0` occurs in the first place? It shouldn't – Sinatr Jun 30 '21 at 08:50
  • from what I understood from the question is that he needs the "0" to stay as is in any language, so I thought of this workaround so he can control this specific character with an if statement for example. – Mostafa Tarek Yassien Jun 30 '21 at 08:52
  • Another workaround idea is [convert arabic digits](https://stackoverflow.com/q/5879437/1997232) into english number just before calling `string.Format`. – Sinatr Jun 30 '21 at 08:58
  • Sorry to dip out on the conversation, just had a power cut... I think I have a solution now. I do need the tag to be zero in any language, mostly it is, just having problems with Arabic and probably Bengali. Initial solution, I am swapping out each {0} and replacing with String.Format(CultureInfo.InvariantCulture, "{0}", "{0}") and putting inline. – damichab Jun 30 '21 at 09:07
  • @damichab I think that will be a great point, but it will also act as two strings, right? – Mostafa Tarek Yassien Jun 30 '21 at 10:36
  • @MostafaTarekYassien Will still be a single string, just the tag will be a bit more complicated. I am recreating my 'Arabic' file. I'll write up an explanation/answer in the morning after some testing. Thanking everyone for their help - much appreciated. – damichab Jun 30 '21 at 10:59
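The "convert Arabic digits just before calling `string.Format`" idea from the comments above could be narrowed so that it touches only the placeholder index and leaves any other numbers in the text alone. The following is a minimal sketch under that assumption; `PlaceholderDigits` and `Normalize` are hypothetical names, and `\u0660` again stands in for an Arabic-Indic zero that has ended up inside the braces.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class PlaceholderDigits
{
    // An opening brace followed by one or more Unicode decimal digits (category Nd).
    private static readonly Regex IndexPattern = new Regex(@"\{(\p{Nd}+)");

    // Hypothetical helper: rewrites only the placeholder index to ASCII digits.
    // Digits elsewhere in the text are left exactly as they are.
    public static string Normalize(string format) =>
        IndexPattern.Replace(format, m =>
            "{" + new string(m.Groups[1].Value
                .Select(c => (char)('0' + (int)char.GetNumericValue(c)))
                .ToArray()));

    public static void Main()
    {
        string text1 = "المزيد من النص";
        // Assumed broken input: the placeholder index is U+0660 instead of U+0030.
        string broken = "بعض النص بالإضافة إلى {\u0660}";

        string text2 = string.Format(Normalize(broken), text1);
        Console.WriteLine(text2);
    }
}
```

This only helps if the stored character really is a non-ASCII digit; if the string in memory already contains U+0030, `Normalize` returns it unchanged.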
0

Final solution, with thanks to all who put in comments.

I swapped out all {0} tags and replaced them with string.Format(CultureInfo.InvariantCulture, "{0}", "{0}")

so now my given example from above looks like...

string text1 = "المزيد من النص";
string text2 = string.Format(String.Format(CultureInfo.InvariantCulture, "{0}", "{0}")  + " بعض النصوص بالإضافة إلى", text1);

[Screenshot: the actual code whilst Windows has its language set to Arabic]

It appears that with string.Format used in this fashion, the tags do not revert to Arabic symbols, and when each property is put into its own string.Format, it runs without an error.

I still do not know why I had to go to all this trouble. As the comments above say, it should have worked regardless. This to me is a big-time hack if ever I saw one!

I did make an initial mistake with secondary tags. I swapped out {1} and replaced it with string.Format(CultureInfo.InvariantCulture, "{1}", "{1}"). Of course, this is a nested string.Format call that has only one argument, so its own tag must still be {0}. Dumb error, corrected by doing the following...

string.Format(CultureInfo.InvariantCulture, "{0}", "{1}")
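To spell out why the inner tag has to stay {0}: the inner call has exactly one argument, and that argument is just the literal tag text that the outer call will parse. A small sketch with two placeholders follows; the English strings and the `NestedTags` class are made up for illustration and are not from my actual code.

```csharp
using System;
using System.Globalization;

public static class NestedTags
{
    public static string Build()
    {
        // Each inner call has one argument, so its own placeholder is always {0}.
        // The argument is the literal tag text that the outer call will see.
        string tag0 = string.Format(CultureInfo.InvariantCulture, "{0}", "{0}"); // "{0}"
        string tag1 = string.Format(CultureInfo.InvariantCulture, "{0}", "{1}"); // "{1}"

        // The outer call parses "{0} and {1}" and fills in both values.
        return string.Format(tag0 + " and " + tag1, "first", "second");
    }

    public static void Main()
    {
        Console.WriteLine(Build()); // first and second

        try
        {
            // The initial mistake: index 1 does not exist in a one-argument call.
            string.Format(CultureInfo.InvariantCulture, "{1}", "{1}");
        }
        catch (FormatException)
        {
            Console.WriteLine("FormatException: the inner tag must be {0}");
        }
    }
}
```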

damichab