
I am trying to simulate the login process to my Facebook page using HtmlUnit (and I do have good reasons for doing so). Here is my Java code:

    public static void main(String[] args) throws IOException {
        // Tried experimenting with the browser types as well, but with the same result;
        // even the no-arg constructor does not help.
        WebClient webClient=new WebClient(BrowserVersion.CHROME);

        HtmlPage page1=webClient.getPage("https://www.facebook.com/bhramakarserver");
        HtmlForm loginForm=(HtmlForm)page1.getElementById("login_form");
        HtmlTextInput username=(HtmlTextInput)page1.getElementById("email");
        HtmlPasswordInput password=(HtmlPasswordInput)page1.getElementById("pass");
        username.setValueAttribute("myFbUsername");
        password.setValueAttribute("myFbPassword");
        HtmlElement button = (HtmlElement) page1.createElement("button");
        button.setAttribute("type", "submit");

        // append the button to the form
        loginForm.appendChild(button);
        page1=button.click();

        //page1.executeJavaScript("window.scrollBy(0,6000)"); does not work
        System.out.println(page1.asXml());
        HtmlSpan postContentSpan=(HtmlSpan)page1.getByXPath("//span[@class='userContent']").get(0);
        System.out.println(postContentSpan.asXml());
    }

When I run this, I get the following error:

Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
    at java.util.ArrayList.rangeCheck(ArrayList.java:604)
    at java.util.ArrayList.get(ArrayList.java:382)
    at com.rahulserver.fbhighlight.Main.main(Main.java:35)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)

So clearly the offending line is

HtmlSpan postContentSpan=(HtmlSpan)page1.getByXPath("//span[@class='userContent']").get(0);

The XPath matches nothing, so the returned list is empty. I posted a related question and got the answer that the markup containing the span targeted by the above XPath is commented out, hence the empty result.
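
As a side note, the stack trace shows `get(0)` being called on an `ArrayList` of size 0, so the query result is an empty list and not `null`. A minimal sketch of a guard for that case, using only `java.util`; the class `FirstMatch` and its `first` helper are hypothetical names for illustration, not part of HtmlUnit:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical helper (not an HtmlUnit class): returns the first element of a
// query result, or Optional.empty() when nothing matched.
class FirstMatch {

    static <T> Optional<T> first(List<T> matches) {
        // An empty (or null) result becomes Optional.empty() instead of
        // triggering an IndexOutOfBoundsException on get(0).
        if (matches == null || matches.isEmpty()) {
            return Optional.empty();
        }
        // ofNullable also maps a null first element to Optional.empty().
        return Optional.ofNullable(matches.get(0));
    }

    public static void main(String[] args) {
        // Stand-ins for an XPath query with no hits and with one hit.
        List<String> none = Collections.emptyList();
        List<String> some = Arrays.asList("<span>post</span>");

        System.out.println(first(none).isPresent());
        System.out.println(first(some).orElse("no match"));
    }
}
```

With such a guard the program can report "no match" and dump the page for inspection instead of crashing; it does not by itself explain why the span is missing.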

So why is that happening, and how do I make it work? Since the page loads more content as you scroll down, as is usual with Facebook, I tried to simulate the scrolling using

page1.executeJavaScript("window.scrollBy(0,6000)"); 

But it still does not work and I get the same result. Here is a pastebin link to the generated HTML file: http://pastebin.com/MfXsYSJQ.

I am sure that someone on SO will be able to come up with an out-of-the-box answer to it...


rahulserver
  • Are you able to log in to the FB account? – Kick Jan 25 '14 at 07:12
  • @user2115021 Yes! The pastebin code I have shown above is the same as the source of the page I get after logging in to Facebook. I think this has something to do with the onPageletArrive scripts. – rahulserver Jan 25 '14 at 07:15
  • I ran the above code locally and found that, after the successful login, no SPAN exists with the class attribute value 'userContent'. Can you recheck which element content you want to c. – Kick Jan 25 '14 at 07:24
  • I just want to get the tag containing the post that has this text as a substring: "This is the third post". To check it, press Ctrl+F in the pastebin link I sent. You will find it in the span which looks like: This is the third post of this page. – rahulserver Jan 25 '14 at 07:30
  • I am working on it and will let you know. – Kick Jan 25 '14 at 07:41
  • Rahul, it seems that some script running on FB comments out the data. If you view the page (page.asXml()), you will find that all the SPANs holding the data are commented out. – Kick Jan 25 '14 at 07:59
  • @user2115021 Yes, you are right. But the page, as seen in a normal browser, also has the span elements commented out. So how are the commented-out lines being rendered? – rahulserver Jan 25 '14 at 08:15
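
One possible way to reach markup that is delivered inside HTML comments is to pull the comment bodies out of the serialized page (for example, the string returned by `page1.asXml()`) and search those instead. A rough sketch using only `java.util.regex`; the class `CommentedMarkup`, the method `commentBodies`, and the sample HTML are made up for illustration, and the real Facebook markup may well differ:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: collects the text found inside <!-- ... --> comments.
class CommentedMarkup {

    // Non-greedy match of one comment; DOTALL lets a comment span several lines.
    private static final Pattern COMMENT =
            Pattern.compile("<!--(.*?)-->", Pattern.DOTALL);

    static List<String> commentBodies(String html) {
        List<String> bodies = new ArrayList<>();
        Matcher matcher = COMMENT.matcher(html);
        while (matcher.find()) {
            // group(1) is the comment content without the delimiters.
            bodies.add(matcher.group(1).trim());
        }
        return bodies;
    }

    public static void main(String[] args) {
        // Invented sample; assumed to resemble the commented-out span only loosely.
        String sample = "<div><code><!-- <span class=\"userContent\">"
                + "This is the third post of this page.</span> --></code></div>";

        for (String body : commentBodies(sample)) {
            if (body.contains("userContent")) {
                System.out.println(body);
            }
        }
    }
}
```

Each extracted fragment is plain markup again, so it could be searched as text or handed to an HTML parser. A regex is a blunt tool here and assumes the comments are well formed and not nested inside scripts.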

2 Answers


The issue arises from the browser version you are using; you also need to add AJAX support and a JavaScript wait. Change the browser version and add a few more lines, as below:

WebClient webClient=new WebClient(BrowserVersion.FIREFOX_3_6);
webClient.setAjaxController(new NicelyResynchronizingAjaxController());
webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);
webClient.getOptions().setThrowExceptionOnScriptError(false);
webClient.waitForBackgroundJavaScript(50000);

BrowserVersion.FIREFOX_3_6 is deprecated, but what matters is that the application runs with it.

Feel free to accept this answer if it solves your problem.

Kick
  • As I expected, it did not work. It is not a browser issue. If you look at the actual web page, you would also be surprised to see that selecting the text of a post shows it as normal text, but if you do a "view page source", you will find it inside a comment. – rahulserver Jan 25 '14 at 10:22
  • The above code runs on my system, and your content "This is the third post of this page" is scraped. I also observed the FB behavior that the code is commented out in "view source", but if you save the source as .html and open the file, you will see all the code uncommented. – Kick Jan 25 '14 at 12:06

The code below runs on my system:

import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.NicelyResynchronizingAjaxController;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlElement;
import com.gargoylesoftware.htmlunit.html.HtmlForm;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import com.gargoylesoftware.htmlunit.html.HtmlPasswordInput;
import com.gargoylesoftware.htmlunit.html.HtmlSpan;
import com.gargoylesoftware.htmlunit.html.HtmlTextInput;
import java.io.IOException;

public class App {

    public static void main(String[] args) throws IOException {

        WebClient webClient=new WebClient(BrowserVersion.FIREFOX_3_6);
        webClient.setAjaxController(new NicelyResynchronizingAjaxController());
        webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);
        webClient.getOptions().setThrowExceptionOnScriptError(false);
        webClient.waitForBackgroundJavaScript(50000);
        HtmlPage page1=webClient.getPage("https://www.facebook.com/bhramakarserver");
        HtmlForm loginForm=(HtmlForm)page1.getElementById("login_form");
        HtmlTextInput username=(HtmlTextInput)page1.getElementById("email");
        HtmlPasswordInput password=(HtmlPasswordInput)page1.getElementById("pass");
        username.setValueAttribute("username");
        password.setValueAttribute("password");
        HtmlElement button = (HtmlElement) page1.createElement("button");
        button.setAttribute("type", "submit");

        // append the button to the form
        loginForm.appendChild(button);
        page1=button.click();

        HtmlSpan postContentSpan=(HtmlSpan)page1.getByXPath("//span[@class='userContent']").get(0);
        System.out.println("The content is "+postContentSpan.asXml());
    }
}
Kick
  • That is not possible. The pasted code runs on my system, and I am using HtmlUnit version 2.13. Have you made any changes to the above code? If not, then try the code on another system. – Kick Jan 26 '14 at 14:51
  • I have just replaced "username" and "password" with the actual user name and password I use to log in. I will check on another system and let you know. – rahulserver Jan 26 '14 at 14:56