0

We have list of category products having duplicate names. How to get a list of products which should not have duplicate product name in Postgres?

We are searching for min product ids with group by name. then searching the products in ids.

category = Category.first
ids = Product.select("MIN(id) as id").where(deleted: false).group(:name).collect(&:id)
category.products.where("products.id IN (?)", ids).find_active

How can we optimize the queries?

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Debadatt
  • 5,935
  • 4
  • 27
  • 40

2 Answers2

0

You can do Product.all.pluck(:name).uniq to get just the product names in an array.

But I think you're solving the wrong problem, in that this problem has a bad 'smell'. If you have products that have identical names, how do you differentiate them from a UX perspective? And why only get the first created product by that name vs. the most 'popular' product? I'm trying to imagine how this solution would work for the user and I'm coming up blank, perhaps because I don't know enough about the context.

Edit: Also, could you clarify what you mean by 'should not have duplicate product name'? Is it to get a list of products, but only the first product if there's multiple products with the same name? Or are you looking for items to correct?

Rich Seviora
  • 1,789
  • 12
  • 16
  • Category has many products .How to find products under a category without duplicate name ? We should not show duplicate products . If there are duplicate products with same name, then we should list one of them . – Debadatt Mar 16 '15 at 14:07
0

The simple solution in Postgres is with DISTINCT ON:

SELECT DISTINCT ON (name)
       id, name  -- add more columns if you need
FROM   Product
WHERE  deleted = FALSE
ORDER  BY name, id;

Returns unique product names (sorted alphabetically). From each set of dupes the one with the smallest id. Details:

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228