
So I have installed Java on my CentOS server. I now want to be able to use PHP to run HTMLUnit to get a fully rendered webpage and then return the results to the user.

I see the "simple" example on the HTMLUnit site, but I know next to nothing about Java and don't know where that code needs to go or how to run it to even get the test case working (i.e. getting Google's homepage).

public void getURL() throws Exception {
    final WebClient webClient = new WebClient();
    final HtmlPage page = webClient.getPage("http://google.com"); // Pass in URL

    // RETURN "page"
}

Once the test is working I would need to be able to "pass" in the desired URL and then "capture" the output.

So far Googling has me running in circles. Does anyone have a link to a simple example, and then pointers on how to integrate it with PHP?

Thanks!

Dough Boy

2 Answers


Get the HTML using Java:


    import java.io.IOException;

    import com.gargoylesoftware.htmlunit.WebClient;
    import com.gargoylesoftware.htmlunit.html.HtmlPage;

    public class GetHtml {

        public static void main(String[] args) throws IOException {
            // Use the URL passed on the command line, falling back to the test page
            String url = args.length > 0 ? args[0] : "http://google.com";

            WebClient webClient = new WebClient();
            // Don't abort on script errors or on non-2xx responses
            webClient.getOptions().setThrowExceptionOnScriptError(false);
            webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);
            // JavaScript has to stay enabled to get the rendered page
            webClient.getOptions().setJavaScriptEnabled(true);
            HtmlPage page = webClient.getPage(url);
            // asXml() serializes the DOM as it stands after the scripts have run
            System.out.println(page.asXml());
        }

    }

Get the result from PHP:


    exec("java -jar ", $output);

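The path to your runnable jar goes after -jar. exec() fills $output with the program's output as an array, one element per line, so use implode("\n", $output) if you need a single string.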
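Since the URL will eventually arrive from PHP as a command-line argument, it may be worth checking it on the Java side before handing it to webClient.getPage(). Below is a minimal sketch using only the standard library. The class name UrlArg and its parse method are hypothetical, and the rule that only http/https URLs with a host are accepted is my assumption, not something HtmlUnit requires.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper: checks the URL handed over on the command line.
final class UrlArg {

    // Returns the parsed URI, or throws IllegalArgumentException if the
    // argument is missing, malformed, or not an http/https URL with a host.
    static URI parse(String[] args) {
        if (args.length != 1) {
            throw new IllegalArgumentException("expected exactly one argument: the URL");
        }
        final URI uri;
        try {
            uri = new URI(args[0]);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("not a valid URL: " + args[0], e);
        }
        String scheme = uri.getScheme();
        if (scheme == null
                || !(scheme.equalsIgnoreCase("http") || scheme.equalsIgnoreCase("https"))
                || uri.getHost() == null) {
            throw new IllegalArgumentException("only http/https URLs with a host are accepted");
        }
        return uri;
    }

    public static void main(String[] args) {
        try {
            // Echo the accepted URL; a real wrapper would fetch the page here instead
            System.out.println(parse(args));
        } catch (IllegalArgumentException e) {
            // Report the problem on stderr and exit non-zero so the caller can detect it
            System.err.println(e.getMessage());
            System.exit(2);
        }
    }
}
```

In the HtmlUnit program you would call something like UrlArg.parse(args).toString() and pass the result to getPage(). On the PHP side, wrapping the URL in escapeshellarg() before building the command keeps user input from being interpreted by the shell.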

Chel

You can use PHP's shell_exec() to run an HtmlUnit program from the command line and capture its output. As for the code, this should get you started:

import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class myClient {
    public static void main(String[] args) throws Exception {
        // Use the URL given on the command line, or the test page if none is given
        String url = args.length > 0 ? args[0] : "http://google.com";
        // Create and initialize WebClient object
        WebClient webClient = new WebClient();
        HtmlPage page = webClient.getPage(url);
        // Print the page markup to stdout so the caller can capture it
        System.out.println(page.asXml());
    }
}

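Then compile it once with javac myClient.java and run the compiled class from PHP (javac only compiles, it does not run anything). The HtmlUnit jars need to be on the classpath for both steps:

$html = shell_exec('java myClient');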

I can't test it at the moment, so sorry for any code mistakes.

Lars