I know that when you utilize includes
and you specify a where
clause on the joined table, you should use .references
example:
# will error out or throw deprecation warning in logs
users = User.includes(:orders).where("Orders.cost < ?", 20)
In rails 4 or later, you will get an error like the following:
Mysql2::Error: Unknown column 'Orders.cost' in 'where clause': SELECT customers.* FROM customers WHERE (Orders.cost < 100)
Or you will get a deprecation warning:
DEPRECATION WARNING: It looks like you are eager loading table(s) (one of: users, addresses) that are referenced in a string SQL snippet. For example:
Post.includes(:comments).where("comments.title = 'foo'") Currently, Active Record recognizes the table in the string, and knows to JOIN the comments table to the query, rather than loading comments in a separate query. However, doing this without writing a full-blown SQL parser is inherently flawed. Since we don't want to write an SQL parser, we are removing this functionality. From now on, you must explicitly tell Active Record when you are referencing a table from a string:
Post.includes(:comments).where("comments.title = 'foo'").references(:comments)
If you don't rely on implicit join references you can disable the feature entirely by setting config.active_record.disable_implicit_join_references = true. (
SELECT "users"."id" AS t0_r0, "users"."name" AS t0_r1, "users"."email" AS t0_r2, "users"."created_at" AS t0_r3, "users"."updated_at" AS t0_r4, "addresses"."id" AS t1_r0, "addresses"."user_id" AS t1_r1, "addresses"."country" AS t1_r2, "addresses"."street" AS t1_r3, "addresses"."postal_code" AS t1_r4, "addresses"."city" AS t1_r5, "addresses"."created_at" AS t1_r6, "addresses"."updated_at" AS t1_r7 FROM "users" LEFT OUTER JOIN "addresses" ON "addresses"."user_id" = "users"."id" WHERE (addresses.country = 'Poland')
so we do this:
# added .references(:orders)
users = User.includes(:orders).where("Orders.cost < ?", 20).references(:orders)
And it executes just fine:
SELECT "users"."id" AS t0_r0,
"users"."name" AS t0_r1,
"users"."created_at" AS t0_r2,
"users"."updated_at" AS t0_r3,
"orders"."id" AS t1_r0,
"orders"."cost" AS t1_r1,
"orders"."user_id" AS t1_r2,
"orders"."created_at" AS t1_r3,
"orders"."updated_at" AS t1_r4
FROM "users"
LEFT OUTER JOIN "orders"
ON "orders"."user_id" = "users"."id"
WHERE ( orders.cost < 20 )
I know that .includes
is just a wrapper for two methods: eager_load
and preload
. I know that since my query above is doing a filter on a joined table (orders
in this example), includes
is smart and knows to pick the eager_load
implementation over preload
because preload
cannot handle doing this query since preload
does not join tables.
Here is where I am confused. Ok: So on that query above: under the hood includes
will utilize the eager_load
implementation. But notice how when I explicitly use eager_load
for this same query (which is what includes
is essentially doing): I do not need to use .references
! It runs the query and loads the data just fine. No error and no deprecation warning:
# did not specify .references(:orders), and yet no error and no deprecation warning
users = User.eager_load(:orders).where("Orders.cost < ?", 20)
And it executes the exact same process with no problem:
SELECT "users"."id" AS t0_r0,
"users"."name" AS t0_r1,
"users"."created_at" AS t0_r2,
"users"."updated_at" AS t0_r3,
"orders"."id" AS t1_r0,
"orders"."cost" AS t1_r1,
"orders"."user_id" AS t1_r2,
"orders"."created_at" AS t1_r3,
"orders"."updated_at" AS t1_r4
FROM "users"
LEFT OUTER JOIN "orders"
ON "orders"."user_id" = "users"."id"
WHERE ( orders.cost < 20 )
That seems odd. Why does .references
need to be specified for the includes
version of the query, whereas .references
does not need to be specified for the eager_load
version of the query? What am I missing here?