I have a really long in-memory string, stored in a val response: String, that looks something like this:

HTTP/1.1 200 OK
Server: Apache
Other: headers

<response>
<xml>
---much much more XML---
</xml>
</response>
0

I want to extract the

<response>
...
</response>

part of the string. So far, I have this:

"<response>.*?</response>".r findFirstIn response

...but for some reason Scala returns None. I did figure out a way to do this with indices and the slice function, but there has to be a neat way to do it with a regex. Does anyone know how?

fedenusy
  • http://stackoverflow.com/a/1732454/66686 – Jens Schauder Jul 17 '12 at 10:25
  • Indices plus slicing is going to be quite a bit faster. – Rex Kerr Jul 17 '12 at 10:51
  • @JensSchauder the XML is coming from a trusted source, and I know there's only one element, so regex does the job of sanitizing input before passing it to the XML builder. In any case it turned out Tag Soup's lexer could handle the input without even sanitizing it in the first place. – fedenusy Jul 20 '12 at 22:31

1 Answer

First of all, it is probably a much better idea to use an XML parser when working with XML responses. It might look like overkill at first, but as the project grows you will very likely end up having to parse more complicated XML documents, which is much harder with a regex than with a full-fledged XML parser.

Anyway, this regex works:

"(?s)<response>.*?</response>".r findFirstIn response

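(?s) sets the DOTALL flag, which makes . match line terminators as well. Without it, . stops at the first newline, so the pattern can never span the multi-line <response> element and findFirstIn returns None.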
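
To make the difference concrete, here is a minimal sketch that runs both patterns against a shortened stand-in for the response. The object name ExtractResponse and the sample string are assumptions made up for illustration; only the two regexes come from the question and the answer.

```scala
object ExtractResponse {
  // Hypothetical sample input, shortened from the example in the question.
  val response: String =
    """HTTP/1.1 200 OK
      |Server: Apache
      |
      |<response>
      |<xml>
      |</xml>
      |</response>
      |0""".stripMargin

  // Without DOTALL, '.' does not match line terminators,
  // so the pattern cannot get past the newline after <response>.
  val withoutDotAll: Option[String] =
    "<response>.*?</response>".r.findFirstIn(response)

  // With (?s), '.' also matches newlines, so the lazy '.*?'
  // can span every line up to the first </response>.
  val withDotAll: Option[String] =
    "(?s)<response>.*?</response>".r.findFirstIn(response)

  def main(args: Array[String]): Unit = {
    println(withoutDotAll) // no match expected
    println(withDotAll)    // the whole <response> element expected
  }
}
```

The lazy quantifier .*? matters here too: it stops at the first </response>, which is fine as long as the input contains only one such element, as it does in the question.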

Malte Schwerhoff
  • Thank you, that works. I'm definitely using an XML parser - this regex just cleans up the input before I pass it into the XML builder :) – fedenusy Jul 17 '12 at 10:06