
I have an application that does a lot of computation on a few pages (requests). The web interface sends an AJAX request. The computation sometimes takes about 2-5 minutes. The problem is that by that time the AJAX request times out.

We can certainly increase the timeout on the web portal, but that doesn't sound like the right solution. Also, to improve performance I have:

  • Removed N+1/Duplicate queries
  • Implemented Caching

What else could be done here to reduce the calculation time?

Also, if it still takes longer, I was thinking of the following solutions:

  • Do the computation beforehand and store it in the DB, so when the actual request comes there is no need for calculation. (I'm apprehensive about this approach, since we would have to modify or erase-and-recalculate this data whenever there is an application logic change.)
  • Load the whole data set into the cache when the application starts or the data gets modified. But the computation still has to be done the first time, and we can't keep the whole data set in the cache from startup, so it would need to be stored in the cache on demand.
  • Maybe do something like an Angular promise, where the promise gets fulfilled when the response comes from the server.
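The compute-on-demand caching idea above can be sketched in plain Ruby. This is a minimal sketch under assumptions: the `ReportCache` class and the `expensive_computation` lambda are hypothetical stand-ins; in a Rails app you would typically back this with `Rails.cache.fetch` instead of a Hash.

```ruby
# Hypothetical on-demand cache: compute a result only the first time
# a key is requested, then serve the stored copy on later requests.
class ReportCache
  def initialize
    @store = {}
  end

  # Return the cached value for +key+, running the block only on a
  # cache miss. (Note: ||= would recompute falsy results; fine here
  # because reports are hashes.)
  def fetch(key)
    @store[key] ||= yield
  end

  # Invalidate a single entry when the underlying data changes.
  def invalidate(key)
    @store.delete(key)
  end
end

cache = ReportCache.new
calls = 0
expensive_computation = -> { calls += 1; { total: 42 } } # stands in for the slow report

result1 = cache.fetch(:report_1) { expensive_computation.call } # computed
result2 = cache.fetch(:report_1) { expensive_computation.call } # served from cache
```

The first request still pays the full computation cost, which is exactly the drawback noted above; only repeat requests become fast.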

Do we have any alternative to do this efficiently?

UPDATE: Depending on user input, the calculation might take a few seconds, or it might take 2-5 minutes. The scenario: the user imports an Excel file, which is parsed and saved to the DB by a background job. Then, on another page, the user wants to see a report/analytics graph derived from a few calculations on the imported data. The calculation depends on many factors, so I don't want to save the results in the DB (as noted above). Also, when the user requests the report/analytics graph, it would be a bad experience to tell them the graph will be shown after some time via email/notification etc.

Indyarocks
  • Can any of the computations be processed in parallel? Ruby's concurrent gem is relatively easy to make use of and enables the use of concurrent threads to improve performance. – margo Apr 10 '16 at 13:05
  • Have to check that, but all the computation results have to be merged and then sent as the response. Another thing: how can we optimize queries, since we have to fetch data from many tables? Will each parallel process query separately? – Indyarocks Apr 10 '16 at 14:51

2 Answers


The extremely typical solution is to enqueue a job for background processing, and return a job ID to the front-end. Your front-end can then poll for completion using that job ID, or you can trigger a notification such as an email to be sent to the user when the job completes.

There are a multitude of gems for this, and it is such a popular and accepted solution that Rails introduced its own ActiveJob for this exact purpose.
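The enqueue-then-poll pattern can be sketched in plain Ruby. This is a sketch under assumptions: the `JobQueue` class and its API are hypothetical stand-ins (in Rails you would enqueue an ActiveJob with a backend such as Sidekiq, and the front-end would poll a status endpoint by job ID).

```ruby
require 'securerandom'

# Hypothetical in-process job queue: enqueue returns an ID immediately,
# the work runs on a background thread, and callers poll for status.
class JobQueue
  def initialize
    @jobs  = {}
    @mutex = Mutex.new
  end

  # Start the work in the background and return a job ID right away,
  # so the HTTP request can respond without waiting for the result.
  def enqueue(&work)
    id = SecureRandom.hex(8)
    @mutex.synchronize { @jobs[id] = { status: :running, result: nil } }
    Thread.new do
      result = work.call
      @mutex.synchronize { @jobs[id] = { status: :done, result: result } }
    end
    id
  end

  # What a polling status endpoint would return for a given job ID.
  def status(id)
    @mutex.synchronize { @jobs[id] }
  end
end

queue  = JobQueue.new
job_id = queue.enqueue { (1..1_000).sum } # stands in for the 2-5 minute report

# The front-end polls until the job finishes, then fetches the result.
sleep 0.01 until queue.status(job_id)[:status] == :done
report = queue.status(job_id)[:result]
```

From the user's point of view the graph page loads immediately, shows a spinner while polling, and renders the data as soon as the job reports `:done`.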

user229044
  • Thanks. I was aware of background jobs, but we need to show the data in real time. :( Another alternative could be polling-based: once the job succeeds, fetch the data with another request. – Indyarocks Apr 10 '16 at 14:48
  • You're not showing "real time" data now, if your requests take 5 minutes then time out. Delayed jobs are the right solution here. – user229044 Apr 10 '16 at 16:43
  • Thanks. I thought you'd get the idea, anyway! By "real time" I meant I need to show the data to the user and can't rely on email/notification. Updating the question! – Indyarocks Apr 10 '16 at 20:26
  • @Indyarocks Yes, and you can use delayed jobs for this: *"Your front-end can then poll for completion"*. – user229044 Apr 11 '16 at 11:46
  • Can elasticsearch help here? – Indyarocks Apr 12 '16 at 14:19

Here are a few possible solutions:

  1. Add indexes to your tables to reduce data-fetching time.

  2. Preload all the rows you'll be dealing with at the beginning, so you don't issue a query each time you calculate something; it can be faster/easier to filter in memory with @things.select { |r| r.blah } than to hit the database with Thing.where(conditions) repeatedly.

  3. Instead of all that, do the computing in PL/SQL on the database side. Sure, it's not the same as writing Ruby code, but it could be faster.

  4. And yes, cache the whole result set in Memcached or Redis or similar (and expire it when something changes).

  5. Run the calculation in the background (a cron job?) and store the results as JSON somewhere, or cache the entire HTML file (if you're not localizing or anything).
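Point 5 can be sketched in plain Ruby: a background/cron step does the heavy calculation and persists the results as JSON, and the request path only reads the stored file. The file name and the `build_report` helper below are hypothetical.

```ruby
require 'json'
require 'tmpdir'

# Hypothetical precomputation step, e.g. run from cron: do the heavy
# calculation once and return the results as a plain hash.
def build_report
  { totals: { goals: 12, assists: 7 }, generated_at: Time.now.to_i }
end

report_path = File.join(Dir.tmpdir, 'report.json')

# Cron/background side: serialize and persist the computed report.
File.write(report_path, JSON.generate(build_report))

# Request side: just read and parse - no computation at request time.
cached = JSON.parse(File.read(report_path), symbolize_names: true)
```

The same write/read split works with Memcached or Redis in place of a file; the key point is that the expensive step never runs inside the request/response cycle.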

PS: I'm doing 1, 2, and 3 combined with 5 (caching JSON results in Memcached, then pulling the array and formatting/localizing) for a few million records from about 12 tables... sports data, mainly.

Nick M