I am trying to parse a string (returned by a web server) that contains non-standard (as far as I can tell) Unicode IDs such as "\Ud83c" or "\U293c", as well as plain text. I need to display this string, emojis intact, to the user in a DataGridView.
btw, I am blind so please excuse any formatting errors :(
Full example of what my code is parsing: "Castle: \Ud83d\Udc40Jerusal\U00e9m.Miles"

The code I wrote, which is failing miserably:

    Public Function ParseUnicodeId(LNKText As String) As String
        Dim workingarray() As String
        Dim CurString As String
        Dim finalString As String
        finalString = ""
        ' split at \ char
        workingarray = Split(LNKText, chr(92))
        For Each CurString In workingarray
            If CurString <> "" Then
                ' remove leading U so number can be converted to hex
                CurString = Right(CurString, Len(CurString) - 1)
                ' attempt to cut off right most chars until number can be converted to text as there is nothing separating end of Unicode chars and start of plain text
                Do While IsNumeric(CurString) = False
                    If CurString = "" Then
                        Exit Do
                    End If
                    CurString = Left(CurString, Len(CurString) - 1)
                Loop
                If CurString.StartsWith("U", StringComparison.InvariantCultureIgnoreCase) Then
                    CurString = CurString.Substring(1)
                End If
                ' convert result from above to hex
                Dim numeric = Int32.Parse(CurString, NumberStyles.HexNumber)
                ' convert to bytes
                Dim bytes = BitConverter.GetBytes(numeric)
                ' convert resulting bytes to a real char for display
                finalString = finalString & Encoding.Unicode.GetString(bytes)
            End If
        Next
        ParseUnicodeId = finalString
    End Function

I have tried this all kinds of ways but can't seem to get it right. My code currently returns empty strings; my guess is that this is caused by some of my more recent changes, which cut off the leading U and chop off one character at a time. If I take those bits out and just pass it something like "Ud83c", it works perfectly. It is only when plain text is mixed in that it fails, and I can't come up with a way to separate the escape codes from the plain text and recombine them at the end.
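
To make the goal concrete, here is a minimal sketch of the string I expect to end up with for the example above. It rests on an assumption I have not confirmed with the server: that each \Uxxxx is one UTF-16 code unit written in hex, so that \Ud83d\Udc40 is a surrogate pair for a single emoji. The module name ExpectedOutputSketch is only for illustration and is not part of my project.

```vb
Imports System

Module ExpectedOutputSketch
    Sub Main()
        ' Assumption: every \Uxxxx escape is one UTF-16 code unit written in hex.
        ' \Ud83d\Udc40 would then be a high/low surrogate pair for one emoji.
        Dim eyes As String = ChrW(&HD83D) & ChrW(&HDC40)

        ' \U00e9 would be the single character e-acute.
        Dim expected As String = "Castle: " & eyes & "Jerusal" & ChrW(&HE9) & "m.Miles"

        ' Combine the pair at index 8 (just after "Castle: ") into one code point.
        Console.WriteLine(Char.ConvertToUtf32(expected, 8).ToString("X")) ' prints 1F440

        ' The emoji takes two Chars, so the string is 25 Chars long.
        Console.WriteLine(expected.Length) ' prints 25
    End Sub
End Module
```

If that assumption holds, the parser only needs to turn each four-digit escape into one Char and leave everything else untouched; the two surrogate halves then sit next to each other in the result and display as one emoji.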