Let's say that I have a string: "\\u2026". I want to change it to "\u2026" so that it prints the Unicode character in Scala. Is there a way to do that? Thank you for your time.

Edit: Let me clarify. Due to some circumstances, I have a string like: "Following char is in unicode: \\u2026", which prints:

Following char is in unicode: \u2026

But, I want to edit it so that it prints:

Following char is in unicode: …

Thank you for the answers. This is what I ended up doing.

def FixString(string: String): String = {
  var newString = string
  // Find the first escaped sequence (a literal backslash followed by 'u')
  var start = string.indexOf("\\u")
  while (start != -1) {
    // Extract the six-character sequence, e.g. \u2026
    val end = start + 6
    val wrongString = string.substring(start, end)
    // Convert the four hex digits to the corresponding character
    val hexCode = wrongString.substring(2)
    val intCode = Integer.parseInt(hexCode, 16)
    val finalString = new String(Character.toChars(intCode))
    // Replace in the accumulated result (not the original) so earlier replacements are kept
    newString = newString.replace(wrongString, finalString)
    // Find the next escaped sequence
    start = string.indexOf("\\u", end)
  }
  newString
}

2 Answers

If you know the string is exactly \uXXXX (unescaped), then

val stringWithBackslash = "\\u2026" // just for example
val hexCode = stringWithBackslash.substring(2) // "2026"
val intCode = Integer.parseInt(hexCode, 16) // 8230
val finalString = new String(Character.toChars(intCode)) // "…"

(Code adapted from "Creating Unicode character from its number".) If not, extract the part you want with a regular expression such as """\\u([0-9a-fA-F]{4})""", which matches the four hex digits rather than decimal digits only.
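The regular-expression route could look roughly like the sketch below. It assumes every escape has exactly four hex digits; the names UnescapeUnicode and unescape are made up for illustration and do not come from the answer.

```scala
import scala.util.matching.Regex

object UnescapeUnicode {
  // Matches a literal backslash, then 'u', then exactly four hex digits.
  private val UnicodeEscape: Regex = """\\u([0-9a-fA-F]{4})""".r

  // Replaces every \uXXXX sequence in the input with the character it names.
  def unescape(input: String): String =
    UnicodeEscape.replaceAllIn(input, m => {
      val codeUnit = Integer.parseInt(m.group(1), 16)
      // quoteReplacement keeps '$' or '\' in the result from being read as group references.
      Regex.quoteReplacement(codeUnit.toChar.toString)
    })
}
```

Calling UnescapeUnicode.unescape("Following char is in unicode: \\u2026") should return the text with the ellipsis character in place of the escape, and it handles several escapes in one string.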

Alexey Romanov

The short answer to the question as asked is to use the String.replace method:

"\\u2026".replace("\\\\", "\\")

Notice that each backslash is doubled because the backslash character also begins Java String escape sequences: "\\\\" denotes two backslash characters and "\\" denotes one.
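To make the backslash counting concrete, here is a small sketch comparing a string that holds one backslash with one that holds two. The object and value names are illustrative only. The literal "\\u2026" contains a single backslash, so the replace call leaves it unchanged; it only has an effect when the string really contains two consecutive backslash characters.

```scala
object BackslashDemo {
  val oneBackslash: String = "\\u2026"      // 6 characters: \ u 2 0 2 6
  val twoBackslashes: String = "\\\\u2026"  // 7 characters: \ \ u 2 0 2 6

  // No pair of backslashes to find, so the string comes back as it was.
  val unchanged: String = oneBackslash.replace("\\\\", "\\")

  // The pair of backslashes collapses into one.
  val collapsed: String = twoBackslashes.replace("\\\\", "\\")
}
```

In both cases the result is still the six-character text \u2026, not the ellipsis character itself; turning it into the character requires parsing the hex digits.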

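If you want the JVM to perform UTF-8 I/O (not required for this question), set the Java system property file.encoding=UTF-8. It is normally passed at JVM startup (-Dfile.encoding=UTF-8); setting it at runtime, as in the session below, may not change a default charset that has already been initialized: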

$ sbt console
Welcome to Scala 2.12.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_151).
Type in expressions for evaluation. Or try :help.

scala> System.setProperty("file.encoding","UTF-8")
res0: String = UTF-8

scala> val strWithError: String = "\\u2026"
strWithError: String = \u2026

scala> val prefixedString: String = strWithError.replace("\\\\", "\\") // corrected string as per OP
prefixedString: String = \u2026

Here is bonus information, adapted from https://stackoverflow.com/a/16034658/553865 (referenced by Alexey Romanov's answer):

scala> val utfString: String = strWithError.replace("\\u", "") // utf code point
utfString: String = 2026

scala> val intCode = Integer.parseInt(utfString, 16)
intCode: Int = 8230

scala> val symbol = new String(Character.toChars(intCode))
symbol: String = …
Mike Slinn