
Hi, I’m fairly new to Snowflake and I’m trying to insert a 2D array into a table using a stored procedure.

The array looks like this: [[name, last, age], [name, last, age], …] with a couple thousand items.

I tried INSERT INTO in a loop, but it just takes too long. I need a faster way to insert the data. Any idea how to approach this?

Red_viper

1 Answer


You can try to batch a number of rows together into a single INSERT statement.

The following JavaScript snippet illustrates one way of doing this by generating a sequence of bind parameters over chunks of rows. I haven't tested it, so you may need to make adjustments when fitting it into your existing procedure.

var dataArray = [["name", "last", "age"], ["name", "last", "age"] /* … thousands of rows … */];

// Generates (?,?,?) if there are 3 cols per row
var row_bind = "(" + Array(dataArray[0].length).fill("?").join(",") + ")";

while (dataArray.length > 0) {

  // Consume up to 100 rows at a time
  var dataChunk = dataArray.splice(0, 100);

  // Generate (?,?,?),(?,?,?),[...] up to 100 times
  var params = Array(dataChunk.length).fill(row_bind).join(",");

  // Place the bind params into the query; fill in your target
  // table and column list where the […] placeholder sits
  var statement = "INSERT […] VALUES " + params;

  // Prepare a statement with a flat stream of row/col data sent along
  var stmt = snowflake.createStatement({
    sqlText: statement,
    binds: dataChunk.flat()
  });

  // Execute the batched insert, then loop around for the next chunk
  stmt.execute();
}
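
One thing to watch with batching: Snowflake caps the number of expressions allowed in a single VALUES clause (16,384 at the time of writing), so pick a chunk size that keeps rows × columns under that limit; 100 rows of 3 columns is comfortably within it.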

Note: Is what you are trying to do impossible from regular SQL? Stored procedures are effectively single-threaded and will fall behind in performance if you're iterating over datasets that a SQL query can handle far better.

It can look attractive to use a familiar, imperative programming language via Snowflake's stored procedures, but they aren't designed for data loading or data transformation, and they do not scale with warehouse size when used this way.
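
For example, if the whole array can be handed to SQL as JSON, one statement can expand and insert it with no procedural loop at all. Here is a minimal sketch, assuming the same dataArray as above; the target table PEOPLE and its NAME/LAST/AGE columns are hypothetical names used for illustration:

var stmt = snowflake.createStatement({
  sqlText: "INSERT INTO PEOPLE (NAME, LAST, AGE) " +
           "SELECT value[0]::VARCHAR, value[1]::VARCHAR, value[2]::NUMBER " +
           "FROM TABLE(FLATTEN(input => PARSE_JSON(?)))",
  binds: [JSON.stringify(dataArray)]  // the whole 2D array as one JSON string
});
stmt.execute();  // one statement, one round trip, no loop

For anything much larger, the usual route is to land the data in a stage file and COPY INTO the table, which is what Snowflake's loading path is built for.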