0

I am confused in how to fetch the paginated data from bigquery in Java,

I followed the docs from bigquery but it's not clear for me

// Identify the table itself
        TableId tableId = TableId.of("dataset", "table");

        // Page over 100 records. If you don't need pagination, remove the pageSize parameter.
        TableResult result = bigQuery.listTableData(tableId, BigQuery.TableDataListOption.pageSize(100));

        // Print the records
        result
            .iterateAll()
            .forEach(
                    row -> {
                        row.forEach(fieldValue -> System.out.print(fieldValue.toString() + ", "));
                        System.out.println();
                    });

I have doubt this query will be giving only first 100 rows, whereas I want to fetch all the data from table, for example in page 1 -> 100 rows, page 2 -> 100 rows and so on untill all data is fetched from table.

Please help me understand where I am wrong.. in the table there is 20 Million records

Rupesh
  • 840
  • 1
  • 12
  • 26
  • [This](https://stackoverflow.com/questions/14807752/how-do-i-request-paginated-bigquery-query-results-using-pagetokens-with-the-goog) answer can help you. – fatih Dec 27 '22 at 08:31
  • Can you have a try with your current code? It's possible that the iterator automatically fetch the next page for you (I forgot in which language this transparent fetch is implemented!). Else, you have to get the next page token from the result, if any, perform a new listTableData with the pageToken value as option https://cloud.google.com/java/docs/reference/google-cloud-bigquery/latest/com.google.cloud.bigquery.BigQuery.TableDataListOption#com_google_cloud_bigquery_BigQuery_TableDataListOption_pageToken_java_lang_String_ – guillaume blaquiere Dec 27 '22 at 13:13
  • Thanks guys, yeah using listTableData and result.nextPage() it worked – Rupesh Dec 27 '22 at 19:24

1 Answers1

1

I have doubts this query will be giving only the first 100 rows.

No, it will give all the rows which are returned by your query.

The function iterateAll() iterates through all the records at once and gives you all the rows from the result. Whereas getValues() function will give only paged records of the current page.

You can try the below code snippet for testing and understanding:

        // Page over 100 records. If you don't need pagination, remove the pageSize
        // parameter.
        TableResult results = bigquery.listTableData(tableId, TableDataListOption.pageSize(100));

        // .iterateAll() will iterate through the all the records at once
        int rowCount=0;
        for (FieldValueList row : results.iterateAll()) {
            rowCount++;
        }
        System.out.println("Total Fetched "+ rowCount +" records");
        
        // .getValues() will only iterate through the all the records of the current page
        int pageCount=0;
        rowCount=0;
        do{
            System.out.println("page number: "+(++pageCount));                
            for (FieldValueList row : results.getValues()) {
                rowCount++;
            }
            System.out.println("Fetched "+rowCount+" records");
            results = results.getNextPage();
            rowCount=0;
        }while(results.hasNextPage());

        //for the last page
        System.out.println("page number: "+(++pageCount));
        for (FieldValueList row : results.getValues()) {
            rowCount++;
        }
        System.out.println("Fetched "+rowCount+" records");

        System.out.println("Query ran successfully");

Result:

enter image description here

You can follow this cloud doc for more information.

kiran mathew
  • 1,882
  • 1
  • 3
  • 10