I have a simple WebViewClient class that should return all the <p> elements of the page currently loaded in a WebView.

Here is the code:

public class SearchClient extends WebViewClient {

    class MyJavaScriptInterface {
        @SuppressWarnings("unused")
        public void processHTML(String[] html) {
            Log.w("Length", String.valueOf(html.length));
            for (String s : html) {
                Log.w("Row", s.toString());
            }
        }
    }

    public SearchClient(WebView wv) {
        wv.addJavascriptInterface(new MyJavaScriptInterface(), "HTMLOUT");
    }

    public void onPageFinished(WebView view, String url) {
        view.loadUrl("javascript:window.HTMLOUT.processHTML(document.getElementsByTagName('p'));");
    }
}

document.getElementsByTagName is clearly returning the elements, because the first Log.w call in processHTML reports a length of over 100 ... but the for loop crashes. Why is this?

Nunser
  • I found out that getElementsByTagName actually returns a NodeList ... I replaced the String[] parameter with NodeList but still get a java.lang.NullPointerException – Steve Jpbs May 13 '13 at 15:28

1 Answer

NodeList exists as a JavaScript class, whereas in Java it is only an interface, so a JavaScript NodeList cannot be handed to your Java method as-is. In JavaScript, you should create a function that converts the list to an array of Strings, which you can then pass into your JavaScriptInterface method.

From this SO question a solution could be:

function toArray(obj) {
  var array = [];
  // iterate backwards, ensuring that length is a UInt32
  for (var i = obj.length >>> 0; i--;) {
    array[i] = obj[i];
  }
  return array;
}

It is also possible that this would work (from the question linked above), but I have not tested it myself:

Array.prototype.slice.call(list,0)
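
Note that both snippets above yield an array of DOM elements rather than an array of strings. The WebView bridge appears to convert only primitives and strings, which would be consistent with an array of the right length whose entries come through as null. A minimal sketch of copying each element's innerHTML into a plain string array first is shown below; toStringArray and fakeNodeList are hypothetical names used only for illustration, and the fake list stands in for document.getElementsByTagName('p') so the snippet runs outside a browser.

```javascript
// Hypothetical helper (not part of the original answer): copy each element's
// innerHTML into a plain array of strings.
function toStringArray(nodeList) {
  var strings = [];
  for (var i = 0; i < nodeList.length; i++) {
    var html = nodeList[i] && nodeList[i].innerHTML;
    // Fall back to an empty string so no entry is null or undefined.
    strings.push(html == null ? "" : String(html));
  }
  return strings;
}

// Stand-in for document.getElementsByTagName('p') (assumption for this sketch).
var fakeNodeList = { length: 2, 0: { innerHTML: "first" }, 1: { innerHTML: "second" } };
var paragraphs = toStringArray(fakeNodeList);
```

In the page itself, the argument would be document.getElementsByTagName('p') instead of the stand-in object.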

Jon
  • Ok thanks for the help but I actually went back to a String[] as a parameter to processHTML ... the reason is because the length shows that there are 100 strings being returned ... but when I try to access them in the for loop I get a nullpointer ... don't understand that if there are 100 strings present. Should I actually use NodeList and do what Jon said ... if so how do I put his code in my view.loadUrl function?? since I already have it calling getElementsByTagName. – Steve Jpbs May 13 '13 at 16:21
  • Tried this.... view.loadUrl("javascript:window.HTMLOUT.processHTML(function toArray(document.getElementsByTagName('p')) { var array = []; for (var i = obj.length >>> 0; i--;) { array[i] = obj[i]; } return array;};"); doesn't crash but I don't get the call to processHTML – Steve Jpbs May 13 '13 at 16:47
  • Maybe it is because you are passing the function definition `toArray()` itself into the `processHTML()` function? JavaScript knows that you have a function named `processHTML()`, but it only takes a `String[]` as a parameter. I think JavaScript will just silently fail if it does not find the method it thinks you want, and so will not throw any error. Try defining the `toArray()` function in your JavaScript, not just loading it up here, and then calling this on the WebView: view.loadUrl("javascript:window.HTMLOUT.processHTML(toArray(document.getElementsByTagName('p')))"); – Jon May 13 '13 at 17:47
  • Also, you cannot use `NodeList`s in Java, even if it is in a `JavaScriptInterface`. You have to convert it to a `String[]` for your `JavaScriptInterface` to be able to use it. – Jon May 13 '13 at 17:49
  • Well, there is the problem ... the page I'm loading doesn't have any JavaScript in it ... I'm trying to create a basic news reader app ... once I load the page I want to get all the <p> elements, since this is where the data I need resides. It's not my page so I can't add the JavaScript ... I need to inject it after the page loads. I came up with this but again nothing but silence ... – Steve Jpbs May 13 '13 at 19:19
  • view.loadUrl("javascript:function toArray(obj) { var array = []; for (var i = obj.length >>> 0; i--;) { array[i] = obj[i]; } return array;}"); view.loadUrl("javascript:window.HTMLOUT.processHTML(toArray(document.getElementsByTagName('p')));"); – Steve Jpbs May 13 '13 at 19:20
  • Also tried this ... view.loadUrl("javascript:window.HTMLOUT.processHTML(Array.prototype.slice.call(document.getElementsByTagName('p'),0));"); the processHTML function is called but again there is a nullpointer in the loop. – Steve Jpbs May 13 '13 at 19:21
  • Just seems weird because I can do this no problem ... view.loadUrl("javascript:window.HTMLOUT.processHTML(document.getElementsByTagName('p')[0].innerHTML);"); and change the String[] parameter in processHTML to just a String ... but it's only one <p> ... I need them all. If I use the original code ... the String[] does have 100 Strings but when the loop starts and I try to access them I get a nullpointer ... how can there be 100 strings in the array but the array be null?? – Steve Jpbs May 13 '13 at 19:28
  • It is not actually an array, it is a JavaScript `NodeList` pretending to be a `String[]`. It knows how many items it has, but it does not get accessed in the same way an array does, so it will blow up when you try to access it at a given index. – Jon May 13 '13 at 19:55
  • I believe there are ways to add JavaScript to a webpage after it is already loaded, that would be the first place I would start. Still though, this seems kinda sketchy that you are trying to manipulate a webpage that isn't yours and you have no server side control over. While I am sure there are legitimate use cases for this, it sounds just as much like you could be a hacker trying to present the user with a false version of the page. (No offense if you are not actually a hacker.) – Jon May 13 '13 at 20:00
  • No no no ... not doing anything like that, no offence taken. It's just a simple news reader ... rather than using BufferedReader and going over the whole page line by line till I find the data I need, it would be nice to just grab all the elements at once, then go over them and forget about the rest of the page. – Steve Jpbs May 13 '13 at 20:12
  • This works but it seems inefficient because there could be up to 100 calls to processHTML rather than 1. for(int x = 0; x<=10;x++) view.loadUrl("javascript:window.HTMLOUT.processHTML(document.getElementsByTagName('p')[" +x+ "].innerHTML);"); – Steve Jpbs May 13 '13 at 20:21
  • Check this interface out, you may be able to process the NodeList in Java after all: http://developer.android.com/reference/org/w3c/dom/NodeList.html – Jon May 13 '13 at 20:49
  • I actually did have a look at that interface some time at the beginning of this whole conversation ... built a class that implemented NodeList ... set up the required methods. But given that getElementsByTagName returns a NodeList and not the class I built, I was at a loss once again. NullPointer !!! – Steve Jpbs May 13 '13 at 22:36
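
One way to avoid both the null entries and the repeated per-element calls discussed in the comments might be to send everything across in a single String. The sketch below serializes the innerHTML of every <p> element to one JSON string; paragraphsAsJson and fakeDocument are hypothetical names used only for illustration, and the fake document is there so the snippet runs outside a WebView. It has not been tested on a device.

```javascript
// Hypothetical helper: collect the innerHTML of every <p> in `doc`
// and serialize the result as a single JSON string.
function paragraphsAsJson(doc) {
  var elements = doc.getElementsByTagName('p');
  var html = [];
  for (var i = 0; i < elements.length; i++) {
    html.push(String(elements[i].innerHTML));
  }
  // JSON.stringify escapes quotes and newlines inside the paragraphs.
  return JSON.stringify(html);
}

// Minimal stand-in for the browser's document object (assumption for this sketch).
var fakeDocument = {
  getElementsByTagName: function (tag) {
    return tag === 'p' ? [{ innerHTML: 'one' }, { innerHTML: 'two "quoted"' }] : [];
  }
};
var payload = paragraphsAsJson(fakeDocument);
```

Inside the WebView, the injected call would presumably look like window.HTMLOUT.processHTML(paragraphsAsJson(document)), with processHTML changed to take a single String. On the Java side that String could then be parsed with org.json.JSONArray, which ships with Android. Since a single String argument is already reported to work in the comments above, this seems like the least risky thing to pass.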