I'm trying to extract a form element's value using a regular expression:

Pattern pattern = Pattern.compile("name=\"(token)\"[^>]*value=\"([^\"]+)\"", 2);

Matcher matcher = pattern.matcher(result); 

if(matcher.find())
{
    String value = matcher.group(2);
}

The string being matched (`result`) contains:

<input type="hidden" name="token" value="YToxOntzOjU"/>

However, my matcher yields no results. What am I missing?

Johan

1 Answer

You should not parse HTML with regular expressions, but the code you posted appears to work fine:

String result  = "<input type=\"hidden\" name=\"token\" value=\"YToxOntzOjU\"/>";

// Pattern.CASE_INSENSITIVE is the named constant for the flag value 2
Pattern pattern = Pattern.compile("name=\"(token)\"[^>]*value=\"([^\"]+)\"", Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(result); 

if (matcher.find()) {
   String value = matcher.group(2);
   System.out.println(value); // prints YToxOntzOjU
}

hwnd
  • @Johan Escaped quotes are needed in String _literals_. Where does that value come from? – Sotirios Delimanolis May 26 '14 at 23:09
  • @SotiriosDelimanolis Yeah, I misread it a bit. It's a response from a HttpClient -> InputStream -> to string. Perhaps I'm better off with a parsing library for html nodes? – Johan May 26 '14 at 23:11
  • @johan Yes, most probably. But the code you provided works for this case. You must not have what you are showing us. – Sotirios Delimanolis May 26 '14 at 23:12
  • @Johan Yes indeed, use a parser instead of trying to use regular expressions. – hwnd May 26 '14 at 23:13
  • @SotiriosDelimanolis Well, I've posted what I get when stepping through the code using a breakpoint. Perhaps the debugger translates HTML entities, e.g. `<` is actually something like `&lt;`? However, I think I'll go find a parsing lib instead. Got any suggestions? – Johan May 26 '14 at 23:17
  • @Johan jsoup probably. – Sotirios Delimanolis May 26 '14 at 23:19
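
One guess raised in the comments is that the actual response is entity-escaped while the debugger shows it decoded. Nothing in the thread confirms that this is what happened. A minimal sketch of how that could be checked with the same pattern is below. The class name `TokenRegexCheck` and the helpers `extractToken` and `unescapeBasic` are made up for illustration, and `unescapeBasic` only handles the four entities relevant here, not HTML entities in general.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenRegexCheck {
    // Same expression as in the question, with the flag written as a named constant.
    private static final Pattern TOKEN = Pattern.compile(
            "name=\"(token)\"[^>]*value=\"([^\"]+)\"", Pattern.CASE_INSENSITIVE);

    // Returns the captured value attribute, or empty if the pattern does not match.
    public static Optional<String> extractToken(String html) {
        Matcher m = TOKEN.matcher(html);
        return m.find() ? Optional.of(m.group(2)) : Optional.empty();
    }

    // Hypothetical helper: decodes only &lt; &gt; &quot; &amp; (an assumption made
    // for this sketch; a real decoder would cover far more entities).
    public static String unescapeBasic(String s) {
        return s.replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&amp;", "&");
    }

    public static void main(String[] args) {
        String raw = "<input type=\"hidden\" name=\"token\" value=\"YToxOntzOjU\"/>";
        String escaped = "&lt;input type=&quot;hidden&quot; name=&quot;token&quot; "
                + "value=&quot;YToxOntzOjU&quot;/&gt;";

        System.out.println(extractToken(raw));                    // Optional[YToxOntzOjU]
        System.out.println(extractToken(escaped));                // Optional.empty
        System.out.println(extractToken(unescapeBasic(escaped))); // Optional[YToxOntzOjU]
    }
}
```

If the escaped variant is what the response really contains, the pattern cannot match because it looks for a literal `"` after `name=`. Logging the raw string, rather than reading it from the debugger view, would show which case applies.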