1

I'm trying to transfer data from SQL server to DynamoDb using Node.js

 sql.connect(config, function (err) {

    if (err) console.log(err);

    // create Request object
    var request = new sql.Request();

    // query to the database and get the records
    request.query('SELECT Client, Brand FROM Client WHERE Brand = 6',
        function (err, recordset) {

            if (err) console.log(err);

            console.log('request exectued');
            var itemProcessed = 0;
            // send records as a response
            recordset.forEach(function (record) {

                var params = {
                    TableName: 'ClientBrandLookUp',
                    Item: {
                        'ClientID': record.Client,
                        'Brand': record.Brand
                    }
                };

                docClient.put(params, function (err, data) {
                    if (err) {
                        console.log(err);
                    } else {
                        console.log('Added');
                        itemProcessed++;
                        if (itemProcessed === recordset.length) { alldone(); }
                    }
                });
            });
        });
});

I am new to Node.js and I think I am doing this wrong. I have 150000 thousands records in my sql table.

The process is using a lot of memory and I'm getting a Javascript heap out of memory. I checked the Provisioned Throughput of my dynamo table and set it up to 4000 writes per seconds. If I reduce the number of line coming from SQL this code works fine.

Any help or advice will be greatly appreciated.

Theo
  • 473
  • 8
  • 20
  • I completely recognize this wasn't your question, but maybe node isn't the best tool for the job. You might look at dedicated ETL software (ETL = Extract Transform Load. Just fancy talk for moving lots of data). – Xedni Jan 20 '17 at 23:36

1 Answers1

2

The problem is it's trying to load all 150,000 records into memory at once. What you should do is adjust your query to request only parts of the dataset at once (pages). There are a number of answers on SO regarding how to implement paging in T-SQL, but if you're using a newish version, this answer should do the trick. That said, it will be up to your code to figure out how many total records there are and keep track of the current page.

Chris Thompson
  • 35,167
  • 12
  • 80
  • 109