0

I'm trying to parse some ints from a webpage and have run into two problems:

1. The webpage is generated using JavaScript.

This sample code (credit to Oracle.com; Stack Overflow will not let me link to it) prints the HTML as served, before any JavaScript is executed.

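import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

public class URLConnectionReader {
    public static void main(String[] args) throws Exception {
        URL oracle = new URL("http://www.oracle.com/");
        URLConnection yc = oracle.openConnection();
        // try-with-resources closes the reader even if reading fails.
        // Decoding as UTF-8 is an assumption about the page's encoding.
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(yc.getInputStream(), StandardCharsets.UTF_8))) {
            String inputLine;
            // Prints the HTML exactly as served; no JavaScript is executed.
            while ((inputLine = in.readLine()) != null) {
                System.out.println(inputLine);
            }
        }
    }
}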

Q: How can I get the generated HTML?

2. The webpage is not rendered correctly when going directly to the link: this direct link renders as an empty "shell". Going to this link and clicking Vis utskriftsside ("Show print page", at the bottom left) opens a new, correctly rendered window.

Q: What is the difference between the two links and how can I access the correctly rendered webpage using the direct link?

EDIT

This is the HTML/JavaScript generating the numbers I'm trying to scrape:

 <div id="drawNumbers" class="drawn-numbers">
 <script type="text/javascript">
    var tableData ='';
    if (opener.draw_numbers) {
        for(var i = 0; i<opener.draw_numbers.length;i++){
            tableData += '<div class="number" style="left:'+(i*28+8)+'px;">';
            tableData += '<img width="23" height="23" alt="" src="/nt-keno/result/images/res_keno_tallramme_print.gif">';
            tableData += '</div>';
            tableData +=  '<div class="number" style="left:'+(i*28+9)+'px; top:9px; z-index: 30;">' +opener.draw_numbers[i]+ '</div>';
        }
    }
    document.writeln(tableData);
</script>
</div>

Can I import this array into Java?

opener.draw_numbers[i]
PalSivertsen

2 Answers

1

What you are doing is termed 'scraping', where dynamic pages often cause problems:

How do you scrape AJAX pages?

Scraping dynamically generated html inside Android app

Best web scraping Ruby on Rails library that handles dynamic HTML produced by javascript

There are no simple solutions.
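The hard part is getting the HTML after the scripts have run. Once you have that rendered markup, pulling the numbers out is comparatively easy. Below is a minimal sketch that assumes the rendered HTML is already in a string (how you obtain it is not shown) and that it matches the markup built by the script in the question. The class and method names are made up for illustration, and a regex is only a stopgap here; a real HTML parser would be more robust.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RenderedNumbersSketch {

    // Matches <div class="number" ...>digits</div>. The wrapper divs that
    // hold only the frame image contain an <img> tag instead of digits,
    // so they do not match.
    private static final Pattern NUMBER_DIV =
            Pattern.compile("<div class=\"number\"[^>]*>\\s*(\\d+)\\s*</div>");

    // Returns the drawn numbers in document order (hypothetical helper).
    static List<Integer> extractNumbers(String renderedHtml) {
        List<Integer> numbers = new ArrayList<>();
        Matcher m = NUMBER_DIV.matcher(renderedHtml);
        while (m.find()) {
            numbers.add(Integer.parseInt(m.group(1)));
        }
        return numbers;
    }

    public static void main(String[] args) {
        // Hand-written sample shaped like the output of the page's script.
        String sample =
                "<div class=\"number\" style=\"left:8px;\"><img alt=\"\" src=\"frame.gif\"></div>"
              + "<div class=\"number\" style=\"left:9px; top:9px; z-index: 30;\">7</div>"
              + "<div class=\"number\" style=\"left:36px;\"><img alt=\"\" src=\"frame.gif\"></div>"
              + "<div class=\"number\" style=\"left:37px; top:9px; z-index: 30;\">42</div>";
        System.out.println(extractNumbers(sample));
    }
}
```

This does nothing for the raw page source, where the numbers only exist as `opener.draw_numbers` in the script.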

ColinE
  • I assume the data I'm trying to access is stored in a database. Is it possible to find where it is stored? – PalSivertsen Sep 11 '12 at 20:28
  • @Bøtteknotten Unless you have the ability to query the database directly (or they make an API available), knowing where the database was located would not help. – David B Sep 11 '12 at 22:05
0

The page has a frame which contains:

https://www.norsk-tipping.no/nt-keno/result/keno_result_info.jsp?drawID=1771&bet=10&keno_level=10

The data comes from a JSON URL. You can see this with tools such as Firebug:

https://www.norsk-tipping.no/api-keno/getResultInfo.json
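If that endpoint is reachable from your program, you can skip the HTML entirely and read the numbers from the JSON. Below is a rough sketch, assuming Java 11+ for `java.net.http`. I have not checked the response format, so the key name `drawNumbers` is purely a placeholder, and the endpoint may need query parameters or cookies that are not shown. The regex extraction only handles a flat array of integers; a proper JSON library would be the safer choice for real use.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class KenoJsonSketch {

    // Downloads the response body as text. Not called from main, so the
    // example runs offline; the real endpoint may need extra parameters.
    static String fetch(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    // Returns the integers of the first array stored under `key`, or an
    // empty list if the key is absent. Handles flat integer arrays only.
    static List<Integer> extractIntArray(String json, String key) {
        List<Integer> numbers = new ArrayList<>();
        Pattern array = Pattern.compile(
                "\"" + Pattern.quote(key) + "\"\\s*:\\s*\\[([^\\]]*)\\]");
        Matcher m = array.matcher(json);
        if (m.find()) {
            Matcher digits = Pattern.compile("-?\\d+").matcher(m.group(1));
            while (digits.find()) {
                numbers.add(Integer.parseInt(digits.group()));
            }
        }
        return numbers;
    }

    public static void main(String[] args) {
        // Hypothetical response shape; the key name is an assumption.
        String sample = "{\"drawNumbers\": [3, 14, 27, 41]}";
        System.out.println(extractIntArray(sample, "drawNumbers"));
    }
}
```

To try it against the live service, pass the result of `fetch(...)` with the URL above to `extractIntArray`, replacing the key with whatever the actual response uses.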

fgb