
I'm having trouble figuring out how to get the content of some HTML after JavaScript has updated it.

Specifically, I'm trying to get the current time from the US Naval Observatory Master Clock. The page has an h1 element with the ID USNOclk in which it displays the current time.

When the page first loads, this element displays "Loading...", and then JavaScript kicks in and updates it to the current time via:

function showTime()
    {
        document.getElementById('USNOclk').innerHTML="Loading...<br />";
        xmlHttp=GetXmlHttpObject();
        if (xmlHttp==null){
            document.getElementById('USNOclk').innerHTML="Sorry, browser incapatible. <BR />";
            return;
        } 
        refresher = 0;
        startResponse = new Date().getTime();
        var url="http://tycho.usno.navy.mil/cgi-bin/time.pl?n="+ startResponse;
        xmlHttp.onreadystatechange=stateChanged;
        xmlHttp.open("GET",url,true);
        xmlHttp.send(null);
    }  

The problem is that I'm not sure how to get the updated time. When I check the element, its content is still "Loading...".

I've double-checked that JavaScript is enabled, and I've tried calling waitForBackgroundJavaScript on the WebClient, hoping that it would give the JavaScript time to start updating the page. No success so far, though.

My Current Code:

import com.gargoylesoftware.htmlunit._
import com.gargoylesoftware.htmlunit.html.HtmlPage

object AtomicTime {

  def main(args: Array[String]): Unit = {
    val url = "http://tycho.usno.navy.mil/what.html"
    val client = new WebClient(BrowserVersion.CHROME)
    
    println(client.isJavaScriptEnabled()) // returns true
    client.waitForBackgroundJavaScript(10000)
//    client.waitForBackgroundJavaScriptStartingBefore(10000) //tried this one too without success
    var response: HtmlPage = client.getPage(url)
    println(response.asText())
  }
}

How do I trigger the JavaScript to update the HTML?

Zack Yoshyaro
  • So, right off the top - your code, above, is running "server side". Given the nature of your question, I will assume that JavaScript is running "client side" (browser). You'd be well served exploring that *vast* area for any number of ideas and approaches (AJAX push comes to mind) in the context of your project & tooling. – Richard Sitze Jul 24 '13 at 20:01
  • OR exploring design alternatives - do you really need that info "up-to-date" on the server? Can something be provided in a field on a POST? – Richard Sitze Jul 24 '13 at 20:16

2 Answers


I figured it out!

HtmlPage objects have an executeJavaScript(String) method, which can be used to kick off the showTime script. Only once the script has actually started does waitForBackgroundJavaScript become relevant.

The code I ended up with:

import com.gargoylesoftware.htmlunit._
import com.gargoylesoftware.htmlunit.html.HtmlPage
import com.gargoylesoftware.htmlunit.html.DomElement

object AtomicTime {

  def main(args: Array[String]): Unit = {
    val url = "http://tycho.usno.navy.mil/what.html"
    val client = new WebClient(BrowserVersion.CHROME)

    val response: HtmlPage = client.getPage(url)
    // The parentheses are needed so that showTime is actually invoked
    response.executeJavaScript("showTime()")

    printf("Current AtomicTime: %s", getUpdatedResponse(response, client))
  }

  // Polls until the clock element no longer shows the "Loading..." placeholder
  def getUpdatedResponse(page: HtmlPage, client: WebClient): String = {
    while (page.getElementById("USNOclk").asText() == "Loading...") {
      client.waitForBackgroundJavaScript(200)
    }
    page.getElementById("USNOclk").asText()
  }
}
Zack Yoshyaro

Although the waitForBackgroundJavaScript method seems to be a good option, it's worth mentioning that it is experimental. The JavaDocs state:

Experimental API: May be changed in next release and may not yet work perfectly!

So I recommend a slightly more complex approach:

int amountOfTries = 10;
while (amountOfTries > 0 && CONDITION) {
    amountOfTries--;
    synchronized (page) {
        page.wait(1000);
    }
}

Note that the amountOfTries check is there so that you can take appropriate action if something goes wrong with the request. Without it, you could end up in an infinite loop, so be careful with that.

Then you should replace CONDITION with your actual condition. In this case it is

page.getElementById("USNOclk").asText().equals("Loading...")

In short, the code above re-checks the condition once per second, for at most 10 seconds, and stops as soon as the condition becomes false (here, as soon as the element no longer reads "Loading...").

Of course, a better approach would be to extract this retry logic into a separate method so that you can reuse it with different conditions.
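
As a rough illustration of that extraction, here is a minimal sketch using only the Java standard library. The names Poller and waitWhile are hypothetical (they are not part of HtmlUnit), and the sketch uses Thread.sleep in place of the synchronized page.wait call shown above, so treat it as one possible shape rather than a definitive implementation.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of HtmlUnit: polls a condition a bounded number of times.
final class Poller {
    private Poller() {}

    /**
     * Sleeps in steps of intervalMillis while stillPending returns true,
     * for at most maxTries sleeps.
     * Returns true if the condition cleared, false on timeout or interruption.
     */
    static boolean waitWhile(BooleanSupplier stillPending, int maxTries, long intervalMillis) {
        for (int remaining = maxTries; remaining > 0 && stillPending.getAsBoolean(); remaining--) {
            try {
                // Assumption: a plain sleep stands in for page.wait(...) from the snippet above
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up early
                Thread.currentThread().interrupt();
                return false;
            }
        }
        // One final check so the caller learns whether the wait actually succeeded
        return !stillPending.getAsBoolean();
    }
}
```

With the page from the question, a call might look like Poller.waitWhile(() -> page.getElementById("USNOclk").asText().equals("Loading..."), 10, 1000). A false return value is the cue to handle the failed request instead of reading the element.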

Mosty Mostacho