
I use node-influx, and influx.query(...) uses too much heap.

In my application I have something like

const data = await influx.query(`SELECT "f_power" as "power", "nature", "scope", "participantId"
    FROM "power"
    WHERE
    "nature" =~  /^Conso$|^Prod$|^Autoconso$/ AND
    "source" =~ /^$|^One$|^Two$/ AND
    "participantId" =~ /${participantIdFilter.join('|')}/ AND
    "time" >= '${req.query.dateFrom}' AND
    "time" <= '${req.query.dateTo}'
    GROUP BY "timestamp"
`);

const wb = dataToWb(data);

XLSX.writeFile(wb, filename);

data is a result set of about 50 MB (measured with this code), and the heap used by this call is about 350 MB (measured with process.memoryUsage().heapUsed).

I'm surprised by the difference between these two values. Is it possible to make this query less resource-intensive?

I use data to build an xlsx file, and generating this file makes the node process run out of memory. XLSX.writeFile(wb, filename) uses about 100 MB, which on its own is not enough to fill my 512 MB of RAM. So I suspect that the heap used by the influx query is never collected by the GC.

I don't understand why generating the file triggers this error. Why can't V8 free the memory used by one method when a later method, running in another context, needs it?

Cédric

1 Answer

node-influx (the 1.x client) reads the whole response, parses it into JSON, and transforms it into results (the data array). While the response is being processed, many more intermediate objects live on the heap than the final result alone. To get a better estimate of how much heap the query really takes, run the node garbage collector before and after it. For now, the only way to control the memory used by the result is to reduce its columns or rows (with a limit, a narrower time range, or aggregations). You can also combine several queries with smaller results to lower the peak heap usage caused by temporary objects. The price is more execution time and more complex code.
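A minimal sketch of both suggestions. It assumes node is started with --expose-gc; without that flag global.gc is undefined and no collection is forced. The helper names measureHeap and splitTimeRange are hypothetical and not part of node-influx.

```javascript
// Hypothetical helpers, not part of node-influx.

// Runs fn and reports how much heap is still in use once it has finished.
// global.gc only exists when node is started with --expose-gc.
async function measureHeap(fn) {
  const collect = typeof global.gc === 'function' ? global.gc : () => {};
  collect();
  const before = process.memoryUsage().heapUsed;
  const result = await fn();
  collect();
  const after = process.memoryUsage().heapUsed;
  return { result, retainedBytes: after - before };
}

// Splits [from, to) into consecutive windows of at most stepMs milliseconds,
// so that each query returns a smaller result set.
function splitTimeRange(from, to, stepMs) {
  const windows = [];
  const end = new Date(to).getTime();
  for (let start = new Date(from).getTime(); start < end; start += stepMs) {
    windows.push({
      from: new Date(start).toISOString(),
      to: new Date(Math.min(start + stepMs, end)).toISOString(),
    });
  }
  return windows;
}
```

Each window can then be queried on its own with "time" >= '${w.from}' AND "time" < '${w.to}', and its rows can be appended to the workbook before the next query runs, so only one window's temporary objects are alive at a time.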

You can expect lower memory usage with the 2.x client (https://github.com/influxdata/influxdb-client-js). It does not create intermediate JSON objects, it processes results internally in smaller chunks, and it has an observable-like API that lets the user decide how to represent a result row. It uses Flux as its query language and requires InfluxDB 2.x or InfluxDB 1.8 (with the 2.x API enabled).
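A rough sketch of row-by-row processing with the 2.x client, assuming the @influxdata/influxdb-client package is installed. The url, token, org and Flux query are placeholders you must supply, and rowObserver and streamRows are hypothetical helper names.

```javascript
// Sketch only: requires `npm install @influxdata/influxdb-client`.
// url, token, org and fluxQuery are placeholders supplied by the caller.

// Builds an observer that hands each row to onRow as soon as it is parsed,
// so the client never has to hold the full result set in memory.
function rowObserver(onRow, onDone) {
  return {
    next(row, tableMeta) {
      // toObject maps the raw string values to an object keyed by column name.
      onRow(tableMeta.toObject(row));
    },
    error(err) {
      onDone(err);
    },
    complete() {
      onDone(null);
    },
  };
}

// Runs a Flux query and resolves once every row has been passed to onRow.
function streamRows({ url, token, org }, fluxQuery, onRow) {
  // Loaded lazily so the package is only needed when a query actually runs.
  const { InfluxDB } = require('@influxdata/influxdb-client');
  const queryApi = new InfluxDB({ url, token }).getQueryApi(org);
  return new Promise((resolve, reject) => {
    const done = (err) => (err ? reject(err) : resolve());
    queryApi.queryRows(fluxQuery, rowObserver(onRow, done));
  });
}
```

Memory only stays low if onRow does not collect every row into an array again. Whether the xlsx step can consume rows incrementally depends on the writer you use.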

sranka
  • I have edited my question. If the memory used by the query function consists of intermediate objects, it should be cleared by the GC during the execution of my next method once memory reaches the V8 limit. – Cédric Mar 25 '21 at 13:05