
I've come across a problem uploading a large CSV file to Azure's Table Storage: it appears to stream the data from the file so fast that the inserts don't complete properly, or it throws a lot of timeout errors.

This is my current code:

var fs = require('fs');
var csv = require('csv');
var azure = require('azure');
var AZURE_STORAGE_ACCOUNT = "my storage account";
var AZURE_STORAGE_ACCESS_KEY = "my access key";
var tableService = azure.createTableService(AZURE_STORAGE_ACCOUNT, AZURE_STORAGE_ACCESS_KEY);
var count = 150000;
var uploadCount = 1;
var counterror = 1;
tableService.createTableIfNotExists('newallactorstable', function(error){
    if(!error){
        console.log("Table created / located");
    }
    else
    {
        console.log(error);
    }
});

csv()
.from.path(__dirname+'/actorsb-c.csv', {delimiter: '\t'})
.transform( function(row){
    row.unshift(row.pop());
    return row;
})
.on('record', function(row,index){
    //Output plane carrier, arrival delay and departure delay
    //console.log('Actor:' + row[0]);

    var actorsUpload = {
    PartitionKey : 'actors'
    , RowKey : count.toString()
    , Actors : row[0]
    };

    tableService.insertEntity('newallactorstable', actorsUpload, function(error){
        if(!error){
            console.log("Added: " + uploadCount);
            uploadCount++;
        }
        else
        {
            console.log(error);
        }
    });
    count++;
})
.on('close', function(lineCount){
    console.log('Number of lines: ' + lineCount);
})
.on('error', function(error){
    console.log(error.message);
});

The CSV file is roughly 800 MB.

I know that to fix it, I probably need to send the data in batches, but I have literally no idea how to do this.
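
Since the code above writes everything under the single PartitionKey 'actors', batching is possible. Below is a minimal sketch, assuming the newer azure-storage package (the storage successor to the azure package used above); the helper name insertInBatches is mine, not part of the SDK. Table storage batches are capped at 100 operations and must share a PartitionKey, so the sketch flushes 100 entities at a time and waits for each batch to finish before sending the next, which also throttles the request rate.

// Sketch only: uses the 'azure-storage' package rather than 'azure', and the
// helper name insertInBatches is made up for this example.
var azure = require('azure-storage');
var entGen = azure.TableUtilities.entityGenerator;
var tableService = azure.createTableService(AZURE_STORAGE_ACCOUNT, AZURE_STORAGE_ACCESS_KEY);

// Table storage accepts at most 100 operations per batch, all in one PartitionKey.
function insertInBatches(tableName, entities, done) {
    var batchSize = 100;
    var index = 0;

    function nextBatch() {
        if (index >= entities.length) { return done(null); }

        var batch = new azure.TableBatch();
        entities.slice(index, index + batchSize).forEach(function (entity) {
            batch.insertEntity(entity, { echoContent: false });
        });
        index += batchSize;

        // Waiting for each batch before starting the next keeps the request
        // rate low enough to avoid the timeouts seen with one insert per row.
        tableService.executeBatch(tableName, batch, function (error) {
            if (error) { return done(error); }
            nextBatch();
        });
    }

    nextBatch();
}

// Example entity shape for azure-storage (property values go through entityGenerator):
// { PartitionKey: entGen.String('actors'), RowKey: entGen.String(count.toString()), Actors: entGen.String(row[0]) }

You would still need to collect the parsed rows into entities (for example in the 'record' handler) and call the helper once parsing finishes, or pause the parser between flushes, so the whole 800 MB file is not buffered in memory at once.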

dmbll

2 Answers


I have no knowledge of the azure package or the CSV package, but I would suggest uploading the file using a stream. If you have the file saved to your drive, you can create a read stream from it and then use that stream to upload to Azure using createBlockBlobFromStream. That question redirects me here. I suggest you take a look at that, as it handles the encoding. The code there converts the file to a base64 string, but I have the feeling this can be done more efficiently in Node. I will have to look into that, though.
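
If you did want to follow this suggestion, a minimal sketch of the stream upload might look like the following. It assumes the blob APIs of the azure package already used in the question; the container name 'csvuploads' is a placeholder.

// Sketch of the stream upload described above; 'csvuploads' is a placeholder container name.
var fs = require('fs');
var azure = require('azure');

var blobService = azure.createBlobService(AZURE_STORAGE_ACCOUNT, AZURE_STORAGE_ACCESS_KEY);
var filePath = __dirname + '/actorsb-c.csv';
var fileSize = fs.statSync(filePath).size;

blobService.createContainerIfNotExists('csvuploads', function (error) {
    if (error) { return console.log(error); }

    // createBlockBlobFromStream reads the file in blocks, so the whole
    // 800 MB never has to be held in memory at once.
    var readStream = fs.createReadStream(filePath);
    blobService.createBlockBlobFromStream('csvuploads', 'actorsb-c.csv', readStream, fileSize,
        function (error) {
            if (error) { console.log(error); }
            else { console.log('Blob uploaded'); }
        });
});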

MarijnS95
  • Question is about uploading data into table storage and not blob storage :) – Gaurav Mantri May 10 '14 at 19:24
  • @GauravMantri oh, then I hope there is also some function to stream data to the table storage. (I still hope this points the asker in the right direction though) – MarijnS95 May 10 '14 at 19:28

Hmm, what I would suggest is to upload your file to blob storage, and you can keep a reference to the blob URI in your table storage. The block blob option gives you an easy way to do a batch upload.
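
A minimal sketch of that pattern, using only the insertEntity call from the question: the blob holds the CSV, and the table entity carries a pointer to it. The 'csvuploads' container, the 'actorfiles' PartitionKey, and the RowKey below are placeholders I made up; the URI is built from the standard blob endpoint format.

// Sketch only: the entity stores a pointer to the blob instead of the row data itself.
var azure = require('azure');
var tableService = azure.createTableService(AZURE_STORAGE_ACCOUNT, AZURE_STORAGE_ACCESS_KEY);

// Standard blob URI format: https://<account>.blob.core.windows.net/<container>/<blob>
var blobUri = 'https://' + AZURE_STORAGE_ACCOUNT + '.blob.core.windows.net/csvuploads/actorsb-c.csv';

var actorsFileEntity = {
    PartitionKey: 'actorfiles',
    RowKey: 'actorsb-c',
    BlobUri: blobUri
};

tableService.insertEntity('newallactorstable', actorsFileEntity, function (error) {
    if (!error) { console.log('Stored reference to ' + blobUri); }
    else { console.log(error); }
});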