I have a String like:

String s = "IPhone 5s &#36;400";

Which I want to get as:

String s1 = "IPhone 5s";

String s2 = "$400";

I tried taking the last word of the String and checking whether it starts with "&#"; if it does, I take it out of the String and decode it to the corresponding UTF-8 character.

However, the price may not be at the end of the String.

Also, do all currency symbols start with "&#"?

Nordehinu
Housefly
  • Use grouping to catch the string between `&#` and `;`. – Maroun Oct 01 '14 at 09:37
  • "does all the currency symbols start with "&#"?" -- that depends on how the string is encoded. The weird thing is, this seems to be XML or HTML, where plain ASCII characters such as `$` do not *need* to be encoded. – Jongware Oct 01 '14 at 09:39
  • We cannot answer the last question; you have to tell us where this `String` comes from and what its format is. If you don't know, you first need to figure that out before doing some regex... – brimborium Oct 01 '14 at 09:39
  • @brimborium It is a parsed string from XML – Housefly Oct 01 '14 at 09:41
  • @Jongware Not all of the currency symbols are within the ASCII set, therefore it would actually be nice to have all currency symbols encoded so they are easier to detect. – brimborium Oct 01 '14 at 09:41
  • @brimborium: true, but surely there could be *other* non-ASCII characters in the string as well, so one still needs to check 'everything'. – Jongware Oct 01 '14 at 09:48
  • @Jongware Correct. As long as OP doesn't completely specify the format, we can only guess. ;) – brimborium Oct 01 '14 at 10:34

2 Answers

Split before &#:

String s = "IPhone 5s &#36;400";
String[] splitArray = s.split("(?=&#)");

After that, call StringEscapeUtils.unescapeHtml4 on the strings in that array to correctly decode the symbols. If there is only one such symbol (as in your case):

splitArray[1] = StringEscapeUtils.unescapeHtml4(splitArray[1]);
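
For readers without Apache Commons on the classpath, the same split-then-decode idea can be sketched with the standard library alone. This is only a minimal sketch: it assumes the input contains numeric character references such as `&#36;` (named entities like `&euro;` are not handled), and the class and method names `EntitySplit`, `decodeNumericRefs` and `splitBeforeRef` are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EntitySplit {
    // Matches decimal (&#36;) and hexadecimal (&#x24;) numeric character references.
    private static final Pattern NUMERIC_REF =
            Pattern.compile("&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));");

    // Replaces every numeric character reference with the character it encodes.
    // Assumption: every reference is a valid code point; otherwise
    // Integer.parseInt or Character.toChars will throw.
    static String decodeNumericRefs(String input) {
        Matcher m = NUMERIC_REF.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int codePoint = (m.group(1) != null)
                    ? Integer.parseInt(m.group(1), 16)
                    : Integer.parseInt(m.group(2), 10);
            String decoded = new String(Character.toChars(codePoint));
            // quoteReplacement is needed because "$" is special in replacements.
            m.appendReplacement(out, Matcher.quoteReplacement(decoded));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Splits in front of each "&#" (zero-width lookahead), then decodes and trims each part.
    static String[] splitBeforeRef(String input) {
        String[] parts = input.split("(?=&#)");
        for (int i = 0; i < parts.length; i++) {
            parts[i] = decodeNumericRefs(parts[i]).trim();
        }
        return parts;
    }

    public static void main(String[] args) {
        String[] parts = splitBeforeRef("IPhone 5s &#36;400");
        System.out.println(parts[0]); // IPhone 5s
        System.out.println(parts[1]); // $400
    }
}
```

Because the lookahead consumes no characters, the `&#` stays attached to the second part, which is what lets it be decoded afterwards.
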
Tim Pietzcker

You can use either StringEscapeUtils or the standard library's regex support, and then split the string.

class Test {
   public static void main(String[] args) {
      String p = "IPhone 5s &#36;400";
      // Replace the encoded dollar sign with "|$" so that "|" can serve as the split marker
      // (this assumes the text itself never contains "|")
      String r = p.replaceAll("&#36;", "\\|\\$");

      String[] s = r.split("\\|");
      System.out.println(s[0].trim()); // IPhone 5s
      System.out.println(s[1]);        // $400
   }
}

Example:

http://ideone.com/epttX4
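
Since the question notes that the price may not be at the end of the string, a position-independent variant may be useful. The following is a rough sketch, not a definitive solution: it assumes the string has already been decoded, that the price is written as a currency symbol followed by digits, and that there is at most one price per string. It relies on the Unicode "currency symbol" category `\p{Sc}`, which Java's regex engine supports; the class name `PriceExtractor` and the amount format are illustrative assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PriceExtractor {
    // \p{Sc} matches any Unicode currency symbol ($, \u20AC, \u00A3, ...).
    // The amount format (digits with optional . or , groups) is an assumption.
    private static final Pattern PRICE =
            Pattern.compile("\\p{Sc}\\s?\\d+(?:[.,]\\d+)*");

    // Returns {remaining text, price}; the price is "" when none is found.
    static String[] extract(String decoded) {
        Matcher m = PRICE.matcher(decoded);
        if (!m.find()) {
            return new String[] { decoded.trim(), "" };
        }
        String price = m.group();
        // Cut the price out and collapse the whitespace left behind.
        String rest = (decoded.substring(0, m.start()) + decoded.substring(m.end()))
                .replaceAll("\\s+", " ")
                .trim();
        return new String[] { rest, price };
    }

    public static void main(String[] args) {
        String[] end = extract("IPhone 5s $400");
        System.out.println(end[0]);   // IPhone 5s
        System.out.println(end[1]);   // $400

        String[] start = extract("$400 IPhone 5s");
        System.out.println(start[0]); // IPhone 5s
        System.out.println(start[1]); // $400
    }
}
```

Matching on the decoded symbol rather than on `&#` also sidesteps the question of how the currency sign happened to be encoded in the XML.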

l'L'l