
I want to retrieve some values from a webpage using Pattern and Matcher. The page contains this form:

<form name="loginForm"  id="loginForm"  method="post" onsubmit="ScrollUp(60);return validateLoginForm();" 
                 enctype="multipart/form-data" action="/login.php">
                 <input type="hidden" name="Rpidci" value="">
                <div class="last_box">
                    <div class="second_box_heading_panel">
                        <h1>Existing users  - 
                            <span> Login here</span>
                        </h1>
                    </div>
                    <div class="second_box_form_panel">
                        <div class="error-msg">
                                                        </div>
                        <div class="name_form_panel">
                            <div class="name">User Name
                            </div>
                            <div class="name_text_field">
                                <input name="sHZnGSgdzmIJoKWOCHmYez" type="text" class="existing_user round_four" id="sHZnGSgdzmIJoKWOCHmYez" maxlength="10" value=""/>
                            </div>
                        </div>
                        <div class="name_form_panel">
                            <div class="name">Password 
                            </div>
                            <div class="name_text_field"><input name="AWrPDfe" type="password" class="existing_user round_four" id="AWrPDfe" maxlength = "20"
                            value=""/>
                            </div>
                        </div>


                              <div class="login_btn"><a href="javascript:void(0);" onclick="javascript:ScrollUp(70);return validateLoginForm();"><img src="images/login_btn.png" title="login here" /></a></div>
                            </div>
                            </div>
                      <div class="name_form_panel"></div>

                                                        </div> 

                    </div>
              </form>

I want to retrieve the values of these two fields:

<input name="sHZnGSgdzmIJoKWOCHmYez" type="text" class="existing_user round_four" id="sHZnGSgdzmIJoKWOCHmYez" maxlength="10" value=""/>

and

<input name="AWrPDfe" type="password" class="existing_user round_four" id="AWrPDfe" maxlength = "20" value=""/>

I tried several times but failed to get any output. Please help.

EDIT:

The code I tried is below (not the same as what I wrote initially, as I got frustrated and messed it up badly):

    Matcher matcher = Pattern.compile("<form name=\"loginForm\" .+ method=\"post\" .+ action=\"/login.php\">\\s*<input[^>]+>\\s*<input[^>]+>\\s*").matcher(loginResp);

    String[] strArr = matcher.group(0).split("<input");
    String str1 = "";
    String str2 = "";
    String str3 = "";
    String str4 = "";

    Pattern localPattern = Pattern.compile(" name=\"([^\\s]+)\" type=\"text\" id=\"([^\\s]+)\" value=\"([^\\s]+)\" />");
    Matcher localMatcher2 = localPattern.matcher(strArr[3]);
    if (localMatcher2.find()) {
        str1 = localMatcher2.group(1);
        echo("STR1 " + str1);
        str2 = localMatcher2.group(3);
        echo("STR2 " + str2);
    }
android_newbie
  • You are [parsing HTML the Cthulhu way](http://www.codinghorror.com/blog/2009/11/parsing-html-the-cthulhu-way.html) – jlordo Jan 15 '13 at 09:37
  • Well, to be fair, if it's only those two values, HTML parsing would be overkill. – Sentry Jan 15 '13 at 09:38
  • If it's only those two values, HTML parsing would be trivial, and I doubt he'd be posting questions about regexps. – Brian Agnew Jan 15 '13 at 09:40
  • Maybe he doesn't want another 200kb of HTML parser code. Btw: What exactly is the problem? Do you get the wrong result? Do you get no result? You do realize that you specified at least one character as value, but your example has the empty string "", which of course doesn't match. – Sentry Jan 15 '13 at 09:42
  • @Sentry I don't get any result; that's the problem. – android_newbie Jan 15 '13 at 09:46
  • rt.jar is 50 MB on my machine. I don't know what proportion of it is loaded at runtime (that depends on the application), but 200 KB isn't going to trouble anything or anyone unless you're writing a WebStart app to be accessed via dial-up. – Brian Agnew Jan 15 '13 at 09:51
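
To illustrate the point raised in the comments (the original pattern demands a fixed attribute order and a non-empty `value`, so it cannot match the form shown), here is a minimal sketch of a more tolerant regex approach. The class and method names are hypothetical, and it assumes attribute values are double-quoted and contain no `>` character. It is still not a general HTML parser.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original post.
class LoginFieldRegex {
    // One whole <input ...> tag; [^>]* also spans line breaks.
    private static final Pattern INPUT_TAG =
            Pattern.compile("<input\\b[^>]*>", Pattern.CASE_INSENSITIVE);
    // One attribute="value" pair; the value may be empty and the '=' may be padded.
    private static final Pattern ATTRIBUTE =
            Pattern.compile("\\b([\\w-]+)\\s*=\\s*\"([^\"]*)\"");

    // Maps the name of every text/password input to its (possibly empty) value.
    static Map<String, String> textAndPasswordFields(String html) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher tag = INPUT_TAG.matcher(html);
        while (tag.find()) {
            // Collect attributes first so their order inside the tag does not matter.
            Map<String, String> attrs = new LinkedHashMap<>();
            Matcher attr = ATTRIBUTE.matcher(tag.group());
            while (attr.find()) {
                attrs.put(attr.group(1).toLowerCase(), attr.group(2));
            }
            String type = attrs.get("type");
            if ("text".equals(type) || "password".equals(type)) {
                fields.put(attrs.get("name"), attrs.getOrDefault("value", ""));
            }
        }
        return fields;
    }
}
```

Calling `LoginFieldRegex.textAndPasswordFields(loginResp)` would return the two generated field names as keys, each mapped to its current `value`, while skipping the hidden input.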

2 Answers


As ever, I would recommend using an HTML parser such as JTidy or JSoup. You can't do this reliably using regular expressions, and an HTML parser is a much easier solution.

Brian Agnew

You can use an XPath query to get the values of those two fields instead of a regular expression. Refer to this link for an XPath tutorial.
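
As a rough illustration of the XPath idea, the sketch below uses only the JDK's `javax.xml.xpath` API. It assumes the page has already been converted to well-formed XML (for example by a tidying tool), because the raw HTML in the question contains an unclosed `<input>` and would not parse as XML. The class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the original answer.
class LoginFieldXPath {
    // Returns the name attribute of each text/password input in the login form,
    // in document order. Assumes xhtml is well-formed XML.
    static String[] fieldNames(String xhtml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList inputs = (NodeList) xpath.evaluate(
                "//form[@name='loginForm']//input[@type='text' or @type='password']",
                new InputSource(new StringReader(xhtml)),
                XPathConstants.NODESET);
        String[] names = new String[inputs.getLength()];
        for (int i = 0; i < inputs.getLength(); i++) {
            names[i] = ((Element) inputs.item(i)).getAttribute("name");
        }
        return names;
    }
}
```

The `value` attribute can be read the same way with `getAttribute("value")`; the hidden input is excluded by the `@type` predicate.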

  • It would be more helpful if you could post a relevant example from that site. We have seen answers get accepted and the link later go dead, so other people who have the same question cannot find the answer. – sadaf2605 Jan 15 '13 at 09:59