I have a data source that contains a \u2265, the greater-than-or-equal sign, which appears as a single character, like this: ≥

However when the data reaches my application it comes in as \\u2265.

I'm not concerned right now with why the extra \ is occurring.

Instead I just want to be able to convert the \\u2265 to a \u2265 inside my application.

I have tried various string manipulations such as these, and they either had syntax problems, or simply didn't work:

disp = dataSource.Replace('\\', '\');  
disp = dataSource.Replace("\\", "\");

and

int x = dataSource.IndexOf("\\u");
disp = dataSource.Substring(0, x) + dataSource.Substring(x+1);

Part of the problem with the Substring technique is that \\ is being seen as a single character.

dev1998
  • Your two screenshots show the exact same code - so how are you seeing different results? Also, how is the data getting to your application? (Basically, we'll need more context in order to help.) – Jon Skeet Dec 13 '18 at 17:09
  • \ is an escape character, so when you see `"\\"` there's actually only 1 \ in the string. – user247702 Dec 13 '18 at 17:09
  • I am not sure I am following, just adding the char to the string should work. [See here](https://dotnetfiddle.net/f0v8BU) – maccettura Dec 13 '18 at 17:09
  • Thanks @Stijn for the edit. – dev1998 Dec 13 '18 at 17:11
  • I think the second screen shot shows `disp` after `disp = dataSource` whereas the first one shows `disp` right before that. – MikeH Dec 13 '18 at 17:11
  • It looks like the screenshots are confusing the issue. I'm going to get rid of them. – dev1998 Dec 13 '18 at 17:11
  • @dev1998 The screen shots actually help me... – MikeH Dec 13 '18 at 17:12
  • I think a [mcve] might help. – Phil M Dec 13 '18 at 17:13
  • @dev1998 Look at my comment and the link. Your code should work 100% as expected. If it's not, you need to give us a [MCVE] – maccettura Dec 13 '18 at 17:13
  • @maccettura the problem is that his dataSource is bringing it in as a string, not a char – MikeH Dec 13 '18 at 17:14
  • @maccettura, the incoming data is where the problem is, but I can't easily do anything about that right now. I'm just trying to recognize the \\u and fix that so it is a \u. – dev1998 Dec 13 '18 at 17:16
  • @MikeH Yeah but OP's original screenshot shows `disp` being completely overridden with the value of `dataSource`. OP's problem is _very_ unclear – maccettura Dec 13 '18 at 17:16
  • What is _"a data source that contains a `\u2265`"_? When you read it, do you see a string like _"x \u2265 y"_, with the backslash doubled up when you look at it in the debugger? If so, that's not the same as _"x ≥ y"_. The latter has a ≥ character in it, the former has that character represented as a six character string. Have you tried substituting "≥" for "\u2265"? – Flydog57 Dec 13 '18 at 17:16
  • @dev1998 you simply can't convert `\\u2265` into `≥` by replacing ``\\`` with ``\``. Check this https://stackoverflow.com/questions/9738282/replace-unicode-escape-sequences-in-a-string – laika Dec 13 '18 at 17:17
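
To make the point from the comments concrete, here is a minimal sketch of the two strings being discussed. It assumes the data arrives as the six-character text (backslash, u, 2, 2, 6, 5), which the debugger displays with the backslash doubled. The class and member names (`EscapeDemo`, `Escaped`, `Actual`, `StripBackslash`) are hypothetical and only serve the illustration.

```csharp
using System;

static class EscapeDemo
{
    // As a C# literal, "\\u2265" is six characters: \ u 2 2 6 5.
    // There is only ONE backslash in the string at runtime.
    public static readonly string Escaped = "\\u2265";

    // As a C# literal, "\u2265" is the single character ≥.
    public static readonly string Actual = "\u2265";

    // Removing the backslash from the six-character text just leaves
    // the five characters "u2265"; it never produces ≥.
    public static string StripBackslash(string s) => s.Replace("\\", "");
}
```

Here `EscapeDemo.Escaped.Length` is 6 and `EscapeDemo.Actual.Length` is 1, which is why no amount of backslash replacement turns one into the other: the text has to be decoded, not trimmed.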

2 Answers

The problem is that your data source delivers \u2265 as a six-character string (a backslash, a u, and four hex digits) rather than as the single ≥ character. If you replace the string version with the char version you'll get what you need.

data = data.Replace("\\u2265", '\u2265'.ToString());

The ToString() is needed because Replace has no overload that takes a string and a char; both arguments must be strings (or both chars).

Simpler version supplied by @Flydog57:

data = data.Replace("\\u2265", "\u2265");

General version inspired by @Jimi:

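// Requires: using System.Linq; using System.Text.RegularExpressions;
// Match a literal backslash, 'u', and four hex digits (not just 0-9,
// otherwise sequences such as \u00e9 would be missed).
Regex r = new Regex(@"\\u[0-9A-Fa-f]{4}");
var matches = r.Matches(data);
var uniqueMatches = matches
      .OfType<Match>()
      .Select(m => m.Value)
      .Distinct();

// Unescape only the matched sequences, leaving the rest of the string alone.
foreach (var m in uniqueMatches)
{
  data = data.Replace(m, Regex.Unescape(m));
}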
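
The same idea can also be written as a single pass with a match evaluator, which avoids collecting the unique matches first. This is only a sketch of an alternative, not the code the answer was accepted for; the names `UnicodeEscapes` and `Decode` are hypothetical, and it assumes every `\uXXXX` in the input is meant to be decoded.

```csharp
using System;
using System.Text.RegularExpressions;

static class UnicodeEscapes
{
    // A literal backslash, 'u', then exactly four hex digits (captured).
    private static readonly Regex Pattern =
        new Regex(@"\\u([0-9A-Fa-f]{4})");

    // Replaces each \uXXXX sequence with the character it names and
    // leaves every other backslash in the input untouched.
    public static string Decode(string input)
    {
        return Pattern.Replace(input, m =>
            ((char)Convert.ToInt32(m.Groups[1].Value, 16)).ToString());
    }
}
```

For example, `UnicodeEscapes.Decode("x \\u2265 y")` yields `x ≥ y`, while a string such as `C:\temp` (written `"C:\\temp"` in source) passes through unchanged.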
user247702
MikeH

Another, more general option is to unescape the whole string using Regex.Unescape(). It is not always applicable, because it converts every escape sequence it finds and can therefore interfere with the rest of the string content:

string myString = "\\u2265" + "20180426";
myString = Regex.Unescape(myString);

Printing myString then shows

≥20180426
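
As a hedged illustration of the "might interfere" caveat, the sketch below feeds Regex.Unescape a string that contains other backslashes besides the `\u2265`. The sample text and the names `UnescapeCaveat`, `Raw`, and `Unescaped` are made up for the example.

```csharp
using System;
using System.Text.RegularExpressions;

static class UnescapeCaveat
{
    // The text as it might arrive at runtime: C:\temp\new \u2265 5
    // (single backslashes; they are doubled only in the C# literal).
    public static readonly string Raw = "C:\\temp\\new \\u2265 5";

    // Regex.Unescape converts every escape it recognises, not only \uXXXX,
    // so here \t becomes a tab and \n becomes a newline as well.
    public static readonly string Unescaped = Regex.Unescape(Raw);
}
```

The `≥` is decoded as intended, but the path is damaged: `\t` and `\n` are turned into a tab and a newline. If the data can contain backslashes like these, targeting only the `\uXXXX` sequences is the safer route.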
Jimi
  • Nice! The last piece to the puzzle is the `Unescape`. I couldn't come up with it! – MikeH Dec 13 '18 at 17:44
  • @MikeH Yes, but it could mess with the string content. If the content is just what is shown here, it can work. – Jimi Dec 13 '18 at 17:47
  • I think my updated version (shamelessly borrowing from you) avoids the mess. +1 from me though. – MikeH Dec 13 '18 at 17:48
  • The next problem I had was deciding whether to mark your answer or @MikeH's as correct. I decided to go with MikeH's because of the way it looks for the unique matches. However, I upvoted both answers. I appreciate all the help. – dev1998 Dec 13 '18 at 18:04
  • Absolutely. @MikeH did all the *hard work*. Anyway, this is your question, you don't need to justify your choices :) – Jimi Dec 13 '18 at 18:07