
I have this Rails migration:

class CreateCertificatesUsers < ActiveRecord::Migration[5.2]
  def change
    create_table :certificates_users do |t|
      t.references :user, foreign_key: true
      t.references :certificate, foreign_key: true
    end
  end
end

In both the Certificate model and the User model, I have has_many :certificates_users.

This allows me to add a certificate to a user's profile. Currently, I have 2 users, and I have added certificates for both of them.

2.5.3 :010 > CertificatesUser.all
  CertificatesUser Load (0.6ms)  SELECT  "certificates_users".* FROM "certificates_users" LIMIT $1  [["LIMIT", 11]]
 => #<ActiveRecord::Relation [#<CertificatesUser id: 8, user_id: 1, certificate_id: 1, expiry_date: "2020-01-14", certificate_number: "1122", renewal_date: "2019-01-14", ispublic: 1>, #<CertificatesUser id: 9, user_id: 2, certificate_id: 1, expiry_date: "2020-01-16", certificate_number: "123", renewal_date: "2019-01-16", ispublic: 1>, #<CertificatesUser id: 10, user_id: 2, certificate_id: 2, expiry_date: "2019-02-28", certificate_number: "123", renewal_date: "2019-01-16", ispublic: 1>]>  

Here is the certificates_users table for clarity.

id | user_id | certificate_id | ...
---+---------+----------------+----
 8 |       1 |              1 |
 9 |       2 |              1 |
10 |       2 |              2 |

One User has 2 certificates and the other has one.

My goal is to be able to list out Users who have a certain type of certificate.

  1. List out users who have a certificate with the id of 1 (should list out both users). - Dropdown 1
  2. List out only users who have both certificates with the id of 1 and certificate with the id of 2 (should list out only 1 user). - Dropdown 2
  3. When a user is selected from either dropdown, list that user's certificates in another dropdown.

I have tried both group and group_by to achieve this result, but neither worked. My idea was to group the CertificatesUser records by user_id. Calling group_by returned this error:

2.5.3 :011 > CertificatesUser.group_by(&:user_id)
Traceback (most recent call last):
        1: from (irb):11
NoMethodError (undefined method `group_by' for #<Class:0x00007fa8dda7c668>)
Did you mean?  group

Then I used the group method and got this:

2.5.3 :012 > CertificatesUser.group('users.id')
  CertificatesUser Load (7.2ms)  SELECT  "certificates_users".* FROM "certificates_users" GROUP BY users.id LIMIT $1  [["LIMIT", 11]]
Traceback (most recent call last):
ActiveRecord::StatementInvalid (PG::UndefinedTable: ERROR:  missing FROM-clause entry for table "users")
LINE 1: ...cates_users".* FROM "certificates_users" GROUP BY users.id L...
                                                             ^
: SELECT  "certificates_users".* FROM "certificates_users" GROUP BY users.id LIMIT $1

Any idea how I can make this work?

3limin4t0r
mayorsanmayor
  • `#group` is a query method that creates a SQL GROUP BY clause. `#group_by` on the other hand is from the [enumerable mixin](https://ruby-doc.org/core-2.6/Enumerable.html) and iterates through a collection in Ruby. You can't call `.group_by` on a model class as it's not a collection. You would need to do `CertificatesUser.all.group_by(&:user_id)`. But that would be really inefficient compared to using `#group`. – max Jan 15 '19 at 13:48
  • "List out only users who have both certificates with the id of 1 and certificate with the id of 2 (should list out only 1 user)." This can be done by using something like `User.joins(:certificates_users).where(certificates_users: { certificate_id: [1, 2] }).group('users.id').having('count(*) = 2')` on Postgres but this really is a whole question on its own. https://stackoverflow.com/questions/36131803/sql-where-joined-set-must-contain-all-values-but-may-contain-more – max Jan 15 '19 at 13:56

2 Answers


My goal is to be able to list out Users who have a certain type of certificate.

You can do it using joins:

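certificate_ids = [1, 2] # or any other ids
# Returns users who hold at least one of the given certificates;
# distinct avoids listing a user once per matching certificate.
User.joins(:certificates_users).where('certificates_users.certificate_id' => certificate_ids).distinct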

Regarding group and group_by.

group_by is defined on Enumerable, so you cannot call it on an AR model class directly. You could do it like this:

CertificatesUser.all.group_by(&:user_id)

but it's terribly inefficient: you load the whole table into memory.
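
To make the Enumerable point concrete, here is a minimal plain-Ruby sketch that needs no database. The `Row` struct, the `ROWS` constant, and the `certificate_ids_by_user` helper are hypothetical stand-ins for the rows of certificates_users; they are not part of the original code.

```ruby
# Plain-Ruby stand-in for the rows of certificates_users (no database needed).
# Row, ROWS and certificate_ids_by_user are illustrative names only.
Row = Struct.new(:id, :user_id, :certificate_id)

ROWS = [
  Row.new(8, 1, 1),
  Row.new(9, 2, 1),
  Row.new(10, 2, 2)
].freeze

# Enumerable#group_by works on any collection, such as an array of rows.
# Returns a hash that maps each user_id to the certificate ids in its group.
def certificate_ids_by_user(rows)
  rows.group_by(&:user_id)
      .transform_values { |group| group.map(&:certificate_id) }
end

p certificate_ids_by_user(ROWS)

# A class is not a collection, so it does not respond to group_by.
p Row.respond_to?(:group_by)
```

The grouping happens entirely in Ruby after the rows are already in memory, which is why doing the same on a full table is wasteful.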

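To use group you need to specify a correct column name. You used users.id, but certificates_users has no such column (its column is user_id). Note that PostgreSQL also rejects selecting every column while grouping by user_id alone, so you will probably want to combine group with an aggregate, for example to count the certificates per user:

CertificatesUser.group(:user_id).count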
mrzasa
  • Even better would be to use a symbol if possible inside the *group* method. When using a string it will be seen as SQL and placed inside the resulting query. When using a symbol an attribute is assumed and it will be replaced with the namespaced version (`` `certificates_users`.`user_id` `` for MySQL for example). This approach is less likely to have name collisions. This would happen when joining two tables that both have the same column name. – 3limin4t0r Jan 15 '19 at 11:21
  • @mrzasa thanks for your answer. This didn't work `User.joins(:certificates_users).where('certificate_users.certificate_id' => certificate_ids)`. ` Traceback (most recent call last): 1: from (irb):14 NameError (undefined local variable or method `certificate_ids' for main:Object) ` – mayorsanmayor Jan 15 '19 at 11:32
  • what was the result? – mrzasa Jan 15 '19 at 11:33
  • added in the comment above. – mayorsanmayor Jan 15 '19 at 11:36
  • well you need to define `certificate_ids` variable. See edited version of the post. – mrzasa Jan 15 '19 at 11:40
  • still didn't work @mrzasa `User Load (230.2ms) SELECT "users".* FROM "users" INNER JOIN "certificates_users" ON "certificates_users"."user_id" = "users"."id" WHERE "certificate_users"."certificate_id" IN ($1, $2) LIMIT $3 [["certificate_id", 1], ["certificate_id", 2], ["LIMIT", 11]] ActiveRecord::StatementInvalid (PG::UndefinedTable: ERROR: missing FROM-clause entry for table "certificate_users") LINE 1: ...ertificates_users"."user_id" = "users"."id" WHERE "certifica...` – mayorsanmayor Jan 15 '19 at 12:33
  • fixed a typo in a table name, try now – mrzasa Jan 15 '19 at 12:44

Short Answer

I fail to see why you need to group anything at all. I would solve this issue with the use of sub-queries.

certificate_1_user_ids = CertificatesUser.select(:user_id).where(certificate_id: 1)
certificate_2_user_ids = CertificatesUser.select(:user_id).where(certificate_id: 2)

# List out users who have a certificate with the id of 1.
users_with_certificate_1 = User.where(id: certificate_1_user_ids)

# List out only users who have both certificates with the
#   id of 1 and certificate with the id of 2.
users_with_certificate_1_and_2 = User.where(id: certificate_1_user_ids)
                                     .where(id: certificate_2_user_ids)

The third item on your list might be worth a separate question. However, I will mention that you can do the following.

# with the following associations present in app/models/user.rb
has_many :certificates_users
has_many :certificates, through: :certificates_users

# you can eager load associations to avoid the N+1 query problem
# (users is any User relation, such as users_with_certificate_1)
users.includes(:certificates).each do |user|
  # do stuff with user

  user.certificates.each do |certificate|
    # do stuff with certificate
  end
end

Dynamic Certificate IDs

In the comments you asked how to make this answer work for dynamic certificate IDs. For the answer given above this can be done in the following way.

certificate_ids = [1, 2]

users = certificate_ids.reduce(User) do |scope, certificate_id|
  user_ids = CertificatesUser.select(:user_id).where(certificate_id: certificate_id)
  scope.where(id: user_ids)
end

The resulting query should look something like this (for MySQL).

SELECT `users`.*
FROM `users`
WHERE
  `users`.`id` IN (
    SELECT `certificates_users`.`user_id`
    FROM `certificates_users`
    WHERE `certificates_users`.`certificate_id` = 1
  )
  AND `users`.`id` IN (
    SELECT `certificates_users`.`user_id`
    FROM `certificates_users`
    WHERE `certificates_users`.`certificate_id` = 2
  )
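
The narrowing that reduce performs can also be sketched in plain Ruby, without a database. In this sketch `PAIRS`, `user_ids_with`, and `user_ids_with_all` are hypothetical names, and array intersection (`&`) stands in for each chained `where(id: ...)`. One assumption differs from the query above: the sketch starts from the users present in the pairs, not from every row in users.

```ruby
# [user_id, certificate_id] pairs copied from the question's table.
# PAIRS and both helper methods are illustrative names only.
PAIRS = [[1, 1], [2, 1], [2, 2]].freeze

# Ids of users holding one given certificate (the role of each sub-query).
def user_ids_with(pairs, certificate_id)
  pairs.select { |_, cert_id| cert_id == certificate_id }.map(&:first)
end

# Start from every known user id and keep only those that also appear in
# each per-certificate list, like chaining where(id: ...) per certificate.
def user_ids_with_all(pairs, certificate_ids)
  all_user_ids = pairs.map(&:first).uniq
  certificate_ids.reduce(all_user_ids) do |remaining, certificate_id|
    remaining & user_ids_with(pairs, certificate_id)
  end
end

p user_ids_with_all(PAIRS, [1])
p user_ids_with_all(PAIRS, [1, 2])
```

Each pass through the block shrinks the candidate list, which mirrors how every extra certificate ID adds one more `IN (...)` condition to the SQL.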

Optimizing the Query

Although the above example shows how to make the whole thing dynamic, doing this for a larger number of certificate IDs doesn't produce the most efficient query.

Another way of approaching this does involve grouping and is slightly more difficult to execute. It searches the certificates_users table for records that match one of the given IDs, then counts the number of records for each user. If that count equals the number of certificate IDs given, the user has all of those certificates.
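
As a rough illustration of that filter, group, and count idea, here is a plain-Ruby sketch over in-memory pairs. It is not the database query itself; `CERTIFICATE_PAIRS` and `user_ids_holding_all` are hypothetical names, and the comments only indicate which SQL clause each step loosely corresponds to.

```ruby
# [user_id, certificate_id] pairs copied from the question's table.
# CERTIFICATE_PAIRS and user_ids_holding_all are illustrative names only.
CERTIFICATE_PAIRS = [[1, 1], [2, 1], [2, 2]].freeze

def user_ids_holding_all(pairs, certificate_ids)
  # Keep only rows for the requested certificates (WHERE ... IN (...)).
  matching = pairs.select { |_, cert_id| certificate_ids.include?(cert_id) }

  # Collect the matching rows per user (GROUP BY user_id).
  grouped = matching.group_by(&:first)

  # Keep users whose row count equals the number of requested ids (HAVING).
  grouped.select { |_, rows| rows.size == certificate_ids.size }.keys
end

p user_ids_holding_all(CERTIFICATE_PAIRS, [1])
p user_ids_holding_all(CERTIFICATE_PAIRS, [1, 2])
```

However many certificate IDs are requested, the data is scanned once and only the comparison value changes.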

Prerequisites

This method does however come with prerequisites.

  • The combination of user_id and certificate_id must be unique in the certificates_users table, because a duplicate row would inflate a user's count. You can make sure that this is the case by placing a unique constraint over those columns.

    add_index :certificates_users, %i[user_id certificate_id], unique: true
    
  • An optional recommendation is to make the columns user_id and certificate_id in the certificates_users table NOT NULL. You can make sure this is the case by creating the columns using

    t.references :user, foreign_key: true, null: false
    

    or by changing the column using

    change_column_null :certificates_users, :user_id, false
    

    Of course the above example must be repeated for the certificate_id column as well.
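
To see why the uniqueness prerequisite matters for a count-based check, consider this plain-Ruby sketch. The data in `DUPLICATE_PAIRS` is hypothetical (user 1 has certificate 1 recorded twice), and `matching_row_counts` is an illustrative helper, not part of the original code.

```ruby
# Counts matching rows per user, much like COUNT(...) after GROUP BY user_id.
# matching_row_counts and both constants are illustrative names only.
def matching_row_counts(pairs, certificate_ids)
  pairs
    .select { |_, cert_id| certificate_ids.include?(cert_id) }
    .group_by(&:first)
    .transform_values(&:size)
end

# [user_id, certificate_id] pairs copied from the question's table.
UNIQUE_PAIRS = [[1, 1], [2, 1], [2, 2]].freeze

# Hypothetical data: user 1 has certificate 1 recorded twice.
DUPLICATE_PAIRS = [[1, 1], [1, 1], [2, 1], [2, 2]].freeze

p matching_row_counts(UNIQUE_PAIRS, [1, 2])
p matching_row_counts(DUPLICATE_PAIRS, [1, 2])
```

With the duplicate row, user 1 reaches a count of 2 while holding only certificate 1, so a check of the form "count equals the number of requested IDs" would wrongly include that user.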

Optimization

With the prerequisites out of the way, let's have a look at this solution.

certificates_users = CertificatesUser.arel_table
certificate_ids = [1, 2]

users = User.joins(:certificates_users)
            .where(certificates_users: { certificate_id: certificate_ids })
            .group(:id)
            .having(certificates_users[:id].count.eq(certificate_ids.size))

# Nested sections in a where always require the table name, not the
#   association name. Here is an example that makes it more clear.
#
#     Post.joins(:user).where(users: { is_admin: true })
#                  ^           ^- table name
#                  +-- association name

The above example should produce the following query (for MySQL).

SELECT `users`.*
FROM `users`
  INNER JOIN `certificates_users`
    ON `users`.`id` = `certificates_users`.`user_id`
WHERE `certificates_users`.`certificate_id` IN (1, 2)
GROUP BY `users`.`id`
HAVING COUNT(`certificates_users`.`id`) = 2

When the certificate_ids array grows, the query stays the same apart from the following two parts.

  • IN (1, 2) becomes IN (1, 2, 4, 7, 8)
  • COUNT(...) = 2 becomes COUNT(...) = 5

References

Most of the code used speaks for itself. If you're not yet familiar with reduce, I recommend taking a look at the documentation. Since this method is present in a lot of other programming languages, it's a great tool to have in your arsenal.

The arel gem API isn't really meant for use outside of the internals of Rails. For this reason its documentation is pretty bad. However you can find most of the methods used here.

3limin4t0r
  • Wow I’m convinced that this will work even without trying it out yet. I’ll give it a shot and drop a comment once I get it to work. Thank you very much. – mayorsanmayor Jan 15 '19 at 13:31
  • Since `users_with_certificate_1_and_2` is just a subset of `users_with_certificate_1` you could also do `users_with_certificate_1_and_2 = users_with_certificate_1.where(id: certificate_2_user_ids)`. Use whatever has your preference. – 3limin4t0r Jan 15 '19 at 16:41
  • @mayorsanmayor I've updated the answer to include a dynamic certificate IDs section. – 3limin4t0r Jan 16 '19 at 13:45
  • Thanks for the update on your answer @Johan I really appreciate it. Thank you. – mayorsanmayor Jan 16 '19 at 15:00