
I have a long string of currency options that looks like the following:

...<option value="USD">USD - United States Dollar</option>    <option value="JPY">JPY - Japanese Yen</option>...

What is the fastest way to extract the following two values from each option:

USD
USD - United States Dollar
metasim
david hol
  • I don't know what the "fastest" way is, but take a look at [jsoup.org](http://jsoup.org). – Michael Zajac Mar 08 '16 at 14:03
  • You mean how do you extract `X` and `Y` from the string `<option value="X">Y</option>`? – Assaf Mar 08 '16 at 14:07
  • @Assaf Yes, this is what I want to do. – david hol Mar 08 '16 at 14:09
  • Can we have the whole string? The substring you gave suggests it is actually XML, in which case you can load it with `scala.xml.XML.loadString(yourStringHere)` and then extract the elements, as shown [here](http://alvinalexander.com/scala/how-to-extract-data-from-xml-nodes-in-scala) for example. – John K Mar 08 '16 at 14:20

1 Answer


If you really just need to pull certain substrings out of a string, I'd go with a regex here.

Use capturing groups (make sure the quantifiers are not greedy) to get the parts of the string that interest you: in this case, the value attribute and the tag content.

val str =
  """<option value="USD">USD - United States Dollar</option><option value="JPY">JPY - Japanese Yen</option>"""
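// Group 1 captures the value attribute, group 2 the tag content; `+?` keeps both non-greedy.
val pattern = """<option value="(.+?)">(.+?)</option>""".r

// Print both groups for every <option> element found in the string.
pattern.findAllMatchIn(str).foreach(x => println(x.group(1) + " " + x.group(2)))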
/* output:
 * USD USD - United States Dollar
 * JPY JPY - Japanese Yen
 */
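
If you need the two values as data rather than printed output, the same idea can be sketched as a small helper that returns pairs. This is only a minimal sketch: the names `CurrencyOptions`, `extract`, `sample` and `currencies` are hypothetical, and it swaps the lazy `.+?` groups for negated character classes (`[^"]+` and `[^<]+`), which assumes the attribute value contains no double quote and the tag content contains no `<`.

```scala
object CurrencyOptions {
  // Group 1: the value attribute; group 2: the text between the tags.
  // Negated character classes stop at the closing quote / next tag,
  // so no lazy quantifier is needed.
  private val optionPattern = """<option value="([^"]+)">([^<]+)</option>""".r

  // Returns one (value, label) pair per <option> element found in `html`.
  def extract(html: String): List[(String, String)] =
    optionPattern.findAllMatchIn(html).map(m => (m.group(1), m.group(2))).toList
}

val sample =
  """<option value="USD">USD - United States Dollar</option>    <option value="JPY">JPY - Japanese Yen</option>"""

// One pair per option, in the order they appear in the string.
val currencies = CurrencyOptions.extract(sample)
```

Here `currencies.headOption` gives the first pair, if any, without throwing when the string contains no options.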
Assaf
  • Maybe you can directly use: `pattern.findAllMatchIn(str).foreach(x => println(x.group(1) + " " + x.group(2)))` – chengpohi Mar 08 '16 at 14:27
  • Correct sir, consider me chastised. – Assaf Mar 08 '16 at 14:32
  • Using regex to parse html is [generally not a good idea](http://stackoverflow.com/a/1732454/2292812) – Michael Zajac Mar 08 '16 at 14:54
  • OK, that was extremely... poetic. I agree with that 100% though, so OP, if the string is part of a larger HTML document, you should not, repeat, should not use this solution. If, however, it's just a standalone string, then go ahead. – Assaf Mar 08 '16 at 15:20