12

I have a long chain of associations, joins, order by, etc., that is ultimately selecting from one of my rails models. At the end of the day I need the results to be unique and sorted. I don't care what columns are used in the SELECT statement, what appears in the ORDER BY, etc. (these all change based on the filtering options the user has selected), I just care that the top level model/table in the query is unique (based on id).

For background, widgets is the main table, and we are joining with widget_steps, and this is in Rails 3 (company is trying to upgrade ASAP but that's what they're stuck with at the moment)

Here is the query and error that is being generated:

PG::InvalidColumnReference: ERROR:  for SELECT DISTINCT, ORDER BY expressions must appear in select list
LINE 1: ...completed_at" IS NULL)) ORDER BY sequential DESC, widget_s...
                                                             ^
: SELECT  DISTINCT "widgets".* FROM "widgets" INNER JOIN "widget_steps" ON "widget_steps"."widget_id" = "widgets"."id" INNER JOIN "widget_steps" "active_steps_widgets" ON "active_steps_widgets"."id" = "widgets"."active_widget_step_id" WHERE "widgets"."account_id" = 1 AND "widgets"."completed_at" IS NULL AND (("widgets"."sequential" = 't' AND "widget_steps"."assigned_to" = 5 AND "widget_steps"."id" = "widgets"."active_widget_step_id" AND "widget_steps"."completed_at" IS NULL) OR ("widgets"."sequential" = 'f' AND "widget_steps"."assigned_to" = 5 AND "widget_steps"."completed_at" IS NULL)) ORDER BY sequential DESC, widget_steps.name ASC LIMIT 10 OFFSET 0
Completed 500 Internal Server Error in 52.3ms

ActiveRecord::StatementInvalid - PG::InvalidColumnReference: ERROR:  for SELECT DISTINCT, ORDER BY expressions must appear in select list
LINE 1: ...completed_at" IS NULL)) ORDER BY sequential DESC, widget_s...
                                                             ^
: SELECT  DISTINCT "widgets".* FROM "widgets" INNER JOIN "widget_steps" ON "widget_steps"."widget_id" = "widgets"."id" INNER JOIN "widget_steps" "active_steps_widgets" ON "active_steps_widgets"."id" = "widgets"."active_widget_step_id" WHERE "widgets"."account_id" = 1 AND "widgets"."completed_at" IS NULL AND (("widgets"."sequential" = 't' AND "widget_steps"."assigned_to" = 5 AND "widget_steps"."id" = "widgets"."active_widget_step_id" AND "widget_steps"."completed_at" IS NULL) OR ("widgets"."sequential" = 'f' AND "widget_steps"."assigned_to" = 5 AND "widget_steps"."completed_at" IS NULL)) ORDER BY sequential DESC, widget_steps.name ASC LIMIT 10 OFFSET 0:

Why is this a thing? What does postgres think is so ambiguous about this? Why do queries like this always work fine in MySQL but make postgres choke.

I have tried:

  1. specifying .select([everything mentioned in the order by]).uniq at the end of the chain
  2. specifying .uniq at the end of the chain without doing a custom select
  3. writing some custom AREL to try to embed all of this in a sub query and then do the .uniq or the .order outside of this (can't get this working)
  4. doing the .uniq outside of postgres (this breaks because of pagination ... you can end up with some pages that only have 1 or 2 items on them because of duplicates being removed)
  5. crying
Sam Johnson
  • 624
  • 1
  • 7
  • 20
  • What is `sequential` here? Is it column of some table or something else? – Ilya Lavrov Mar 17 '17 at 00:48
  • it is a column on widget – Sam Johnson Mar 17 '17 at 00:51
  • Think about it: when there are (at least) two rows of `widget_steps`, which satisfies your condition for a single row of `widget`, which one's `name` should be used in the `ORDER BY`? PostgreSQL won't decide this instead of you, that's why it says that put `widget_steps.name` into the `SELECT` clause (to make that one also `DISTINCT`). OFC, MySQL doesn't bother with such kind of nonsenses: it allows you to run whatever wrong query you come up with. – pozs Mar 17 '17 at 09:45

3 Answers3

13

You need to add widget_steps.name to list of selected columns:

SELECT DISTINCT "widgets".*,
                "widget_steps.name"
FROM "widgets"
INNER JOIN "widget_steps" ON "widget_steps"."widget_id" = "widgets"."id"
INNER JOIN "widget_steps" "active_steps_widgets" ON "active_steps_widgets"."id" = "widgets"."active_widget_step_id"
WHERE "widgets"."account_id" = 1
  AND "widgets"."completed_at" IS NULL
  AND (("widgets"."sequential" = 't'
        AND "widget_steps"."assigned_to" = 5
        AND "widget_steps"."id" = "widgets"."active_widget_step_id"
        AND "widget_steps"."completed_at" IS NULL)
       OR ("widgets"."sequential" = 'f'
           AND "widget_steps"."assigned_to" = 5
           AND "widget_steps"."completed_at" IS NULL))
ORDER BY sequential DESC,
         widget_steps.name ASC
LIMIT 10
OFFSET 0

This should not change logic of your query and will work fine.

In Rails you may use select method to set list of selected columns:

Widget.select('"widgets".*, "widget_steps.name"')

Hope this would help.

Roman Mkrtchian
  • 2,548
  • 1
  • 17
  • 24
Ilya Lavrov
  • 2,810
  • 3
  • 20
  • 37
3

Another option with this error that may work for some cases, such as a has_and_belongs_to_many type relation, is to unscope order:

class Resource < ApplicationRecord
  has_and_belongs_to_many :things
  scope :with_blue_things, -> { joins(:things).where(:things => {:color => :blue}).unscope(:order).distinct }

This should let you do all these

Resource.with_blue_things
Resource.with_blue_things.count
Resource.with_blue_things.order(:name).count
genkilabs
  • 2,966
  • 30
  • 36
0

Suggestion:

You can add has_many :steps in Widget class, And use Widget.includes(:steps) to exec a query with Order clause which order by widget_steps columns