I believe the pageToken is abstracted away for you by the Python client library. If you go down to the end of the search_jobs method in the source, you will see it builds an iterator that is aware of the pageToken and nextPageToken fields:
iterator = google.api_core.page_iterator.GRPCIterator(
    client=None,
    method=functools.partial(
        self._inner_api_calls["search_jobs"],
        retry=retry,
        timeout=timeout,
        metadata=metadata,
    ),
    request=request,
    items_field="matching_jobs",
    request_token_field="page_token",
    response_token_field="next_page_token",
)
return iterator
So all you should need to do is the following (copied from the docs at https://googleapis.github.io/google-cloud-python/latest/talent/gapic/v4beta1/api.html):
from google.cloud import talent_v4beta1

client = talent_v4beta1.JobServiceClient()

parent = client.tenant_path('[PROJECT]', '[TENANT]')

# TODO: Initialize `request_metadata`:
request_metadata = {}

# Iterate over all results
for element in client.search_jobs(parent, request_metadata):
    # process element
    pass

# Alternatively:

# Iterate over results one page at a time
for page in client.search_jobs(parent, request_metadata).pages:
    for element in page:
        # process element
        pass
The default page size is apparently 10; you can modify this with the page_size parameter (pageSize in the REST API), as in the sketch below the links. Page iterator documentation can be found here:
Doco: https://googleapis.github.io/google-cloud-python/latest/core/page_iterator.html
Source: https://googleapis.github.io/google-cloud-python/latest/_modules/google/api_core/page_iterator.html#GRPCIterator
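For example, a minimal sketch of requesting larger pages, assuming the page_size keyword argument described in the API docs linked above:

per_page = 100

for element in client.search_jobs(parent, request_metadata, page_size=per_page):
    # the iterator still handles page tokens for you; it just fetches
    # 100 results per underlying request instead of 10
    pass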
Probably the simplest way to deal with this is to consume all results using
all_results = list(results_iterator)
If you have massive amounts of data and don't want to page through it in one go, I would do the following. The .pages property just returns a generator that you can work with as usual.
results_iterator = client.search_jobs(parent, request_metadata)
pages = results_iterator.pages

current_page_iter = next(pages)
# do work with the page
current_item = next(current_page_iter)

current_page_iter = next(pages)
# etc...
You would need to catch the StopIteration exception for when you run out of items or pages:
https://anandology.com/python-practice-book/iterators.html
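Putting that together, a minimal sketch of manual paging with the StopIteration handling (assuming the same parent and request_metadata as above):

results_iterator = client.search_jobs(parent, request_metadata)
pages = results_iterator.pages

while True:
    try:
        page = next(pages)  # raises StopIteration when no pages remain
    except StopIteration:
        break
    for item in page:
        # process each item on this page
        pass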
This is why:
def _page_iter(self, increment):
    """Generator of pages of API responses.

    Args:
        increment (bool): Flag indicating if the total number of results
            should be incremented on each page. This is useful since a page
            iterator will want to increment by results per page while an
            items iterator will want to increment per item.

    Yields:
        Page: each page of items from the API.
    """
    page = self._next_page()
    while page is not None:
        self.page_number += 1
        if increment:
            self.num_results += page.num_items
        yield page
        page = self._next_page()
See how it calls _next_page after the yield? This will check whether more pages exist and, if so, perform another request for you.
def _next_page(self):
    """Get the next page in the iterator.

    Returns:
        Page: The next page in the iterator or :data:`None` if
            there are no pages left.
    """
    if not self._has_next_page():
        return None

    if self.next_page_token is not None:
        setattr(self._request, self._request_token_field, self.next_page_token)

    response = self._method(self._request)

    self.next_page_token = getattr(response, self._response_token_field)
    items = getattr(response, self._items_field)
    page = Page(self, items, self.item_to_value)

    return page
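Because _page_iter is a generator, nothing after the yield runs until you ask for the next page, so each next(pages) call drives at most one API request. A toy illustration of that resume behaviour:

def pages_demo():
    n = 0
    while n < 3:
        yield n   # control returns to the caller here
        n += 1    # runs only when the caller asks for the next value

gen = pages_demo()
print(next(gen))  # 0 -- body runs up to the first yield
print(next(gen))  # 1 -- resumes after the yield, like fetching the next page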
If you want a sessionless option, you can use offset + page size and pass the current offset back to the user on each AJAX request:
offset (int) –
    Optional. An integer that specifies the current offset (that is,
    starting result location, amongst the jobs deemed by the API as
    relevant) in search results. This field is only considered if
    page_token is unset.

    For example, 0 means to return results starting from the first
    matching job, and 10 means to return from the 11th job. This can be
    used for pagination, (for example, pageSize = 10 and offset = 10 means
    to return from the second page).
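A minimal sketch of that approach, assuming the offset and page_size keyword arguments behave as documented; client_offset is a hypothetical value echoed back by the browser on each AJAX call:

page_size = 10
client_offset = 20  # hypothetical: supplied by the browser on each request

results = client.search_jobs(
    parent,
    request_metadata,
    offset=client_offset,
    page_size=page_size,
)

jobs = list(next(results.pages, []))  # just this one page of results
next_offset = client_offset + page_size  # hand this back to the browser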