I have three questions:

1) file_or_folder and dataset each have many metainstances. Given the following query:

p= Metainstance.find(:first, :conditions=>["file_or_folder_id=? AND dataset_id=?", some.id, dataset_id],:include=>[:file_or_folder,:dataset])

Does eager loading apply on file_or_folder and dataset? Also, what is the best way of writing this query?

2) If I need to retrieve a huge amount of data, is it more efficient to write queries using the joins option, the includes option, or scopes?

3) I cannot use page caching, as I have dynamic content that keeps on changing. How else can I improve the performance of a Rails app?

MrTheWalrus
Vinay
  • Please ask one question per post. – user229044 Jun 12 '13 at 18:15
  • You're using an ancient syntax. Throw out `.find` and use `Metainstance.where(file_or_folder_id: some.id, dataset_id: dataset_id).includes(:file_or_folder, :dataset).first` – user229044 Jun 12 '13 at 18:16
  • How "huge"? Can you use progressive loading? – Dave Newton Jun 12 '13 at 18:18
  • About 50,000 records, which have associations to other tables. This number is not fixed, though, and will increase in the future. – Vinay Jun 12 '13 at 18:32
  • @meagar Does `Metainstance.where(file_or_folder_id: some.id, dataset_id: dataset_id).includes(:file_or_folder, :dataset).first` eager load file_or_folder and dataset in this query? Will it improve the performance of data retrieval? – Vinay Jun 12 '13 at 18:34

2 Answers


1) First of all, find(:first) has been deprecated for a long time. It's actually finally going away in Rails 4. Here's how this query would look in the modern era (shamelessly copied from meagar's comment):

Metainstance.
  where(:file_or_folder_id => some.id, :dataset_id => dataset_id).
  includes(:file_or_folder, :dataset)

So, on to the question: Eager loading in this way means that the following will happen:

  • First, Rails will load the Metainstances that match the conditions of the query.
  • Second, it will load all of the FileOrFolders that are associated with the Metainstances fetched in the first query (not any others).
  • Finally, it will load all of the Datasets associated with those Metainstances.

I think this means that the answer to your question is "Yes, eager loading applies to both file_or_folder and dataset, and only the records associated with the rows matched by the where clause are loaded."
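
The three steps above can be mimicked in plain Ruby. This is only a sketch of the batching idea, not ActiveRecord's implementation: the `TinyDb` class, its `query` method, and the sample rows are all hypothetical, and in-memory arrays stand in for the database tables.

```ruby
# Plain-Ruby sketch of batched eager loading (NOT ActiveRecord itself).
# TinyDb, its tables, and the sample rows are hypothetical stand-ins.
class TinyDb
  attr_reader :log

  def initialize(tables)
    @tables = tables
    @log = [] # one entry per simulated query
  end

  # Return the rows of `table` for which the block is true, logging the call.
  def query(table, &filter)
    @log << table
    @tables.fetch(table).select(&filter)
  end
end

db = TinyDb.new(
  metainstances: [
    { id: 1, file_or_folder_id: 10, dataset_id: 100 },
    { id: 2, file_or_folder_id: 11, dataset_id: 100 },
    { id: 3, file_or_folder_id: 12, dataset_id: 101 }
  ],
  file_or_folders: [{ id: 10 }, { id: 11 }, { id: 12 }],
  datasets: [{ id: 100 }, { id: 101 }]
)

# Step 1: load the metainstances that match the conditions.
metas = db.query(:metainstances) { |m| m[:dataset_id] == 100 }

# Step 2: one query for the file_or_folders those rows point at (no others).
fof_ids = metas.map { |m| m[:file_or_folder_id] }.uniq
fofs = db.query(:file_or_folders) { |f| fof_ids.include?(f[:id]) }

# Step 3: one query for the datasets those rows point at.
dataset_ids = metas.map { |m| m[:dataset_id] }.uniq
datasets = db.query(:datasets) { |d| dataset_ids.include?(d[:id]) }

puts "queries: #{db.log.inspect}"
puts "file_or_folders loaded: #{fofs.map { |f| f[:id] }.inspect}"
puts "datasets loaded: #{datasets.map { |d| d[:id] }.inspect}"
```

The point of the sketch is that the number of simulated queries stays at three no matter how many metainstances match, and that row 12 and dataset 101 are never touched because no matching metainstance refers to them.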

2) I think we covered this with the above discussion of finder methods. I don't think the old finders are actually less efficient, per se, just uglier and deprecated. The above code is the correct way to run a query like this.

3) There are literally entire books on improving Rails app performance. You're going to have to be much more specific about the query you're running and how you're using the results from it before anyone can give you meaningful advice on this.

MrTheWalrus

a) Yes, it does perform eager loading. I would write it like this:

p= Metainstance.where(:file_or_folder_id => some.id, :dataset_id => dataset_id).includes([:file_or_folder, :dataset]).first

This also does eager loading.

b) If you are going to use file_or_folder and dataset later on, then it is best to use includes, which avoids the N+1 query problem. If you are not using them and only need to join the tables (for example, to filter on them), then joins is the faster way.
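
To make the N+1 problem concrete, here is a minimal plain-Ruby sketch that counts simulated queries for lazy, per-row access versus batched access. It does not use ActiveRecord; the `QueryCounter` class and the sample data are hypothetical and exist only for illustration.

```ruby
# Plain-Ruby sketch of the N+1 pattern versus batched loading.
# QueryCounter and the sample data are hypothetical; no database is involved.
class QueryCounter
  attr_reader :count

  def initialize
    @count = 0
  end

  # Pretend to run one query and return whatever the block computes.
  def run
    @count += 1
    yield
  end
end

dataset_table = { 100 => "alpha", 101 => "beta" }
meta_table = [
  { id: 1, dataset_id: 100 },
  { id: 2, dataset_id: 101 },
  { id: 3, dataset_id: 100 }
]

# Lazy access: one query for the parent rows, then one more per row.
lazy = QueryCounter.new
rows = lazy.run { meta_table }
lazy_names = rows.map { |m| lazy.run { dataset_table[m[:dataset_id]] } }

# Batched access: one query for the parent rows, one for all needed datasets.
eager = QueryCounter.new
rows = eager.run { meta_table }
ids = rows.map { |m| m[:dataset_id] }.uniq
by_id = eager.run { dataset_table.select { |id, _name| ids.include?(id) } }
eager_names = rows.map { |m| by_id[m[:dataset_id]] }

puts "lazy:  #{lazy.count} queries"
puts "eager: #{eager.count} queries"
puts "same data: #{lazy_names == eager_names}"
```

Both approaches end up with the same data; the lazy version's query count grows with the number of parent rows, while the batched version's does not.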

c) There are many ways to improve performance of your application and you can find some of these methods in Scaling Rails Screencast series.

Ermin Dedovic
  • With the suggestions provided, I changed my query as below. Given `inode = FileOrFolder.find_by_id(params[:id])` and `dataset = Dataset.find_by_id(params[:datasetid])`, I now run `meta = MetricInstance.where(dataset_id: dataset.id, file_or_folder_id: inode.id).includes(:file_or_folder, :dataset, :qeinbat, :num_of_test, :bp, :clearall, :closeall, :clearclass, :clearmax, :pause, :mlint, :single_use, :test_time, :user).first`. I am not using file_or_folder and dataset later on, but I am using the others in the includes list. Do you think my query is right now? – Vinay Jun 12 '13 at 19:05
  • If you are not using the file_or_folder and dataset, you can do .joins(:file_or_folder,:dataset).includes(:qeinbat, :num_of_test,:bp,:clearall,:closeall,:clearclass,:clearmax,:pause, :mlint,:single_use,:test_time,:user) – Ermin Dedovic Jun 12 '13 at 19:29
  • Thanks for the input. I somehow feel that using `joins(:file_or_folder, :dataset)` slows down my query processing; it is fast without the joins. Correct me if I am wrong. – Vinay Jun 12 '13 at 21:09
  • Joins does not fetch the associated rows, so it is faster if you don't need that extra data. Includes is faster when you do need the associated rows, because it fetches them up front in a small, fixed number of queries; if you use joins and then access the associations, ActiveRecord makes N+1 queries to fetch all the data. This problem is explained here: http://guides.rubyonrails.org/active_record_querying.html#eager-loading-associations. – Ermin Dedovic Jun 12 '13 at 21:34
  • Are `Dataset.find_by_id(params[:datasetid])` and `Dataset.where(id: params[:datasetid])` the same? I get an error when I use the where version like this; I had to write `Dataset.where(id: params[:datasetid]).first` to run it without any error. – Vinay Jun 13 '13 at 13:47
  • @Vinay They are not the same. find_by_id returns a single record from the database (or nil if none matches), while where returns an ActiveRecord::Relation, which represents a set of records. If you append first to where, they become equivalent. – Ermin Dedovic Jun 13 '13 at 20:38
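  • And the dynamic finders are changing in Rails 4: find_by_something itself stays, but most of the other dynamic finders are deprecated and a new find_by(something: value) form is added. See http://railscasts.com/episodes/400-what-s-new-in-rails-4?view=asciicast – Ermin Dedovic Jun 13 '13 at 20:42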
  • Can you please answer my query at .... [link_to_question](http://stackoverflow.com/questions/17109031/unexpected-error-while-processing-request-failed-to-allocate-memory) – Vinay Jun 14 '13 at 14:00
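
The single-record versus collection distinction discussed in the comments above has a rough analogue in plain Ruby arrays, sketched below. This is an analogy only: the `datasets` array is hypothetical sample data, and `Array#find` and `Array#select` merely stand in for the roles of `find_by_id` and `where`.

```ruby
# Analogy only: a plain array standing in for a table (hypothetical data).
datasets = [{ id: 1, name: "alpha" }, { id: 2, name: "beta" }]

# Like find_by_id: a single record, or nil when nothing matches.
one = datasets.find { |d| d[:id] == 2 }
none = datasets.find { |d| d[:id] == 99 }

# Like where: always a collection, even when only one row matches.
many = datasets.select { |d| d[:id] == 2 }

puts one[:name]        # `one` is a single record, so it can be used directly
puts many.first[:name] # a collection needs .first before record-level access
puts none.inspect
```

Calling a record-level method directly on the collection is what typically produces the kind of error described in the comments, which is why appending `.first` resolves it.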