2

I'm trying to query Google spanner with query parameters using node.js client library. However, response is very slower with query parameter than without query parameter. Query has LIKE(forward match) statement. I couldn't find recommended way to use query parameters with LIKE statement.

Additionally, I tested with equal statement, there is no difference between query with parameter and query without parameter.

Table has more than 20 million rows. And instance is 1 node.

Is there any solution? or Is this bug with Google spanner?

Part of Schema(actually more than 40 columns):

CREATE TABLE props (
    props__id STRING(MAX) NOT NULL,
    props__address_quadkey STRING(MAX),
    ...
) PRIMARY KEY (props__id)

Index:

CREATE INDEX props__address_quadkey 
ON props (
    props__address_quadkey
)

Test code:

const Spanner = require('@google-cloud/spanner');
const spanner = new Spanner();

const db = spanner
  .instance('instance_name')
  .database('database_name');

(async () => {
  // Make connection
  await db.run({ sql: 'SELECT 1' });


  console.time('Without param');
  const r1 = (await db.run({
    sql: `
      SELECT
        props__id
      FROM props@{FORCE_INDEX=props__address_quadkey}
      WHERE
        (props__address_quadkey LIKE '1330020303011010200%')
    `
  }))[0];
  console.log(r1.length); // 121
  console.timeEnd('Without param'); // Without param: 277.223ms

  console.time('with param 1');
  const r2 = (await db.run({
    sql: `
      SELECT
        props__id
      FROM props@{FORCE_INDEX=props__address_quadkey}
      WHERE
        (props__address_quadkey LIKE @quadkey)
    `,
    params: { quadkey: '1330020303011010200%' },
    types: { quadkey: 'string' },
  }))[0];
  console.log(r2.length); // 121
  console.timeEnd('with param 1'); // with param 1: 9240.822ms
})();

Thank you for your help!

tahomatx
  • 45
  • 4
  • Have you performed the query using the Google Cloud Console UI and looked at [the query plan execution](https://cloud.google.com/spanner/docs/sql-best-practices#how-execute-queries)? – eb80 Aug 09 '18 at 02:56
  • Thanks comment eb80! In Console UI, query has ran very fast. And execution looks like, Index scan → FilterScan → Serialize Result → Local distributed union → Distributed union. Execution plan shows index worked properly, rows are reduced to 168 rows. So I thought query has no huge problem... – tahomatx Aug 09 '18 at 03:42
  • It sounds like a client issue, as it performs poorly only with parameter in LIKE but not others. You might get better help opening a issue in the client lib repo here https://github.com/googleapis/nodejs-spanner/ – rkansola Aug 20 '18 at 17:49
  • 1
    One other way to debug is to use request.queryMode in https://github.com/googleapis/nodejs-spanner/blob/1ad5797a237ae0bc60254f68d5ffb4be891e821a/src/v1/spanner_client.js#L607 to see how are queries actually doing as issued from client. – rkansola Aug 20 '18 at 18:15

1 Answers1

4

This is currently a limitation of Cloud Spanner. With a constant value for the LIKE pattern, Cloud Spanner is able to optimize the lookup expression based on the LIKE pattern during query compilation. For example, in this case, Cloud Spanner will be able to generate a query plan with a lookup expression that is basically

STARTS_WITH(props__address_quadkey, 1330020303011010200)

which will be able to efficiently search the index for entries that match the prefix in the LIKE pattern.

But with a parameterized LIKE pattern, that is not possible as the parameter is not evaluated until execution time and could contain any LIKE expression. As a result, instead of being able to efficiently lookup the matching rows, Cloud Spanner must read all rows and evaluate them against the LIKE pattern in the parameter to filter out non-matching rows.

This limitation however does not affect simpler predicates like the equality predicate where Cloud Spanner is able to do efficient lookups based on the value of the parameter.

adi
  • 580
  • 2
  • 12