I have a script that is pulling 25,000 records from AWS Athena which is basically a PrestoDB Relational SQL Database. Lets say that I'm generating a request for each one of these records, which means I have to make 25,000 requests to Athena, then when the data comes back I have to make 25,000 requests to my Redis Cluster.
What would be the ideal amount of requests to make at one time from node to Athena?
The reason I ask is because I tried to do this by creating an array of 25,000 promises and then calling Promise.all(promiseArray)
on it, but the app just hanged forever.
So I decided instead to fire off 1 at a time and use recursion to splice the first index out and then pass the remaining records to the calling function after the promise has been resolved.
The problem with this is that it takes forever. I took about an hour break and came back and there were 23,000 records remaining.
I tried to google how many requests Node and Athena can handle at once, but I came up with nothing. I'm hoping someone might know something about this and be able to share it with me.
Thank you.
Here is my code just for reference:
As a sidenote, what I would like to do differently is instead of sending one request at a time I could send 4, 5, 6, 7 or 8 at a time depending on how fast it would execute.
Also, how would a Node cluster effect the performance of something like this?
exports.storeDomainTrends = () => {
return new Promise((resolve, reject)=>{
athenaClient.execute(`SELECT DISTINCT the_column from "the_db"."the_table"`,
(err, data) => {
var getAndStoreDomainData = (records) => {
if(records.length){
return new promise((resolve, reject) => {
var subrecords = records.splice(0, )[0]
athenaClient.execute(`
SELECT
field,
field,
field,
SUM(field) as field
FROM "the_db"."the_table"
WHERE the_field IN ('Month') AND the_field = '`+ record.domain_name +`'
GROUP BY the_field, the_field, the_field
`, (err, domainTrend) => {
if(err) {
console.log(err)
reject(err)
}
redisClient.set(('Some String' + domainTrend[0].domain_name), JSON.stringify(domainTrend))
resolve(domainTrend);
})
})
.then(res => {
getAndStoreDomainData(records);
})
}
}
getAndStoreDomainData(data);
})
})
}