
I'm using Resque in a Heroku app for background jobs and I'm wondering if there's a way to get a given job processed faster.

For instance, there are a few jobs that involve taking a large file (20GB+), reading the contents of it, and splitting it into database entries (millions of entries).

That happens in a single job.

Throwing more workers at it doesn't help, since only one worker processes the job. So is there a way to make a given worker work faster?

Shpigford

2 Answers


Without seeing any code it's hard to say, but perhaps there's an opportunity to optimise the UPDATEs or INSERTs it's creating.

If it's in a loop like

csv_file.each_line do |line|
  Record.create ...
end

You could improve the performance of this by batching it so that it reads, say, 1,000 lines from the file and then does a single INSERT of 1,000 rows into the DB.

This SO answer shows how to use create for batch inserts.
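A minimal sketch of the batching approach, in plain Ruby (the `Record`, column names, and `BATCH_SIZE` here are hypothetical; with Rails 6+ you'd pass each batch to `Record.insert_all` as shown in the comment):

```ruby
require "csv"

BATCH_SIZE = 1_000

# Read the CSV in slices of BATCH_SIZE rows and hand each slice off
# as one batch, instead of touching the database once per line.
def import_in_batches(path)
  CSV.foreach(path, headers: true).each_slice(BATCH_SIZE) do |batch|
    rows = batch.map { |line| { name: line["name"], value: line["value"] } }
    # With Rails 6+ this one call issues a single multi-row INSERT:
    # Record.insert_all(rows)
    yield rows if block_given?
  end
end
```

A 20GB file still streams line by line (`CSV.foreach` doesn't load it into memory), but the database round-trips drop by roughly a factor of 1,000.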

ChrisJ

One option would be to use Heroku's 2X dynos, which provide double the memory and CPU share for your worker.

John Beynon