I'm working on a project where we need to display BigQuery results in a table within a web application.
We’ve built the feature by paging, sorting and searching directly in BigQuery, but the performance isn’t what you would expect of a modern web application: it takes several seconds to apply a search term or change page.
I can't really share much code, but this is a general question that applies to any large result set generated in BigQuery.
For a little bit of context: we create a view in BigQuery by joining a product catalog to orders.
WITH Catalog AS (
  SELECT
    productId
  FROM `CatalogTable`
),
Orders AS (
  SELECT
    p.productId,
    SUM(p.qty) AS qty
  FROM `OrdersView` AS o, UNNEST(o.products) AS p
  GROUP BY p.productId
)
SELECT
  c.productId,
  IFNULL(o.qty, 0) AS qty,
  ROW_NUMBER() OVER (ORDER BY IFNULL(o.qty, 0) DESC) AS salesRank
FROM Catalog AS c
LEFT JOIN Orders AS o
  ON c.productId = o.productId
And the view is queried like so:
SELECT ...
FROM `catalog` c
LEFT JOIN `catalogView` cv
  ON c.productId = cv.productId
WHERE c.name LIKE '%searchTerm%'
LIMIT 10 OFFSET 0
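For completeness, once the grid applies a sort, the call ends up shaped roughly like this (the selected columns, search term, sort column and page offset all come from the grid, so treat them as placeholders):

SELECT cv.productId, cv.qty, cv.salesRank
FROM `catalog` c
LEFT JOIN `catalogView` cv
  ON c.productId = cv.productId
WHERE c.name LIKE '%searchTerm%'
ORDER BY cv.salesRank
LIMIT 10 OFFSET 20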
What are the options for making this grid-view perform as it would if it were built on a traditional SQL database (or at least close to it)?
I've considered clustering, but I don't believe this is an option since I'm not partitioning the table:
https://medium.com/google-cloud/bigquery-optimized-cluster-your-tables-65e2f684594b
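For reference, this is the shape I understood clustering to require (a partition column first), so the table name, partition field and cluster key below are just placeholders:

CREATE TABLE `myDataset.catalogSalesRank`
PARTITION BY snapshotDate
CLUSTER BY productId
AS
SELECT CURRENT_DATE() AS snapshotDate, cv.*
FROM `catalogView` AS cv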
NOTES:
It's acceptable for the results to be slightly delayed if streaming them into another database is an option (see the sketch after these notes).
The query is called via a WebApi endpoint and displayed in an Angular grid-view.
New orders are imported every 15 minutes, so the results of this query aren't entirely static; they change periodically.
The data-grid must support paging, sorting and searching, and could contain 10,000+ results.
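To illustrate what I mean by streaming into another database: even something as crude as periodically exporting the view and loading the files into another engine would be acceptable. The bucket path below is a placeholder, and a scheduled query or Dataflow job would do just as well:

EXPORT DATA OPTIONS(
  uri = 'gs://my-bucket/catalog-export/*.csv',
  format = 'CSV',
  overwrite = true,
  header = true
) AS
SELECT *
FROM `catalogView`

The load into the other database would then run outside BigQuery on the same 15-minute cadence as the order imports.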