0

I've stumbled across this snippet of code:

users = User.where([...])

users.group!(:character_id, :age)

user_zip = {
  character_id: :character_id,         (=> has one character)
  age:          :age,
  amount:       Arel.sql('SUM(amount)')
}

user_cleaned = users.pluck(*user_zip.values)

The output then is an array with the "transformed"(?) user data.

What exactly is happening here? I've never seen this before. and is it possible to use this for a single record?

marcHoll90
  • 27
  • 5
  • What SQL server do you use? Is there any grouping in `User.where([...])`? – Jan Vítek Feb 28 '23 at 08:10
  • Postgres and yes there is grouping afterwards. forgot to add it, sorry. will edit it now – marcHoll90 Feb 28 '23 at 08:12
  • Does this answer your question? [What is the difference between pluck and collect in Rails?](https://stackoverflow.com/questions/12176102/what-is-the-difference-between-pluck-and-collect-in-rails) – JBallin Feb 28 '23 at 08:46
  • But it's not being called on a single record. It's being called on the relation returned by `users = User.where([...])`. – max Feb 28 '23 at 09:24

4 Answers4

3

There are several noteworthy things happening in this snippet:

  • Array destructuring using *
  • pluck
  • Arel.sql
  • Single record

Destructuring (*)

Pluck is defined as

pluck(*column_names)

which means it takes a variable amount of arguments.

If you'd call it with

pluck([:age, :name])

the method would see it as an array of arrays ([[:age, :name]]) so you have to "destructure" the array firs:

pluck(*[:age, :name])

or call it with single values in the first place:

pluck(:age, :name)

(which might not be possible if you receive the values as array, as in your case)

See this Thoughtbot post for more details

pluck

pluck does two things:

  • it modifies the selectclause of the generated SQL statement
  • and it bypasses creation of the ActiveRecord Model and returns the data only.

So the following lines return the same data

User.all.map { |user| [user.age, user.name] }
User.all.select(:age, :name).map { |user| [user.age, user.name] }
User.pluck(:age, :name)

The first is the least efficient: it selects ALL columns and creates the models just to discard them when calling map

The second is a bit more efficient in that it only selects the required columns. Still creates the models though.

pluck is without the overhead of selecting all columns and creating the models. So it usually is faster and more memory efficient.

On the SQL level it boils down to the difference between

select * from users

and

select age, name from users

So IMHO pluck is especially useful for APIs or when exporting to CSVs and similar.

Arel.sql

Rails wants you to wrap unsafe SQL in Arel.sql. It is a security measure (that i personally find annoying and useless) and lets you know that you have a potentially unsafe operation that Arel/ActiveRecord can not check for you.

Single record

You can not call it on a single record (and it would also not make much sense, since the single record is already the result of a DB query that was executed and there a Model which has been constructed).

If you want something similar for a single record that you have already loaded from the DB:

user.attributes.values_at('age', 'name')

or you can limit the query to just return results for one record

User.where(id: 123).pluck(:age, :name) # only one record since PK

or User.where(some conditions).limit(1).pluck(:age, :name)

Note that pluck is somewhat inconsistent in its return values:

If you only pluck one attribute, then it returns an array of values

User.pluck(:name) => ["Carl", "Mike"]

but when you pluck multiple attributes it returns an array of arrays of values

User.pluck(:name, :age) => [["Carl", 21], ["Mike", 35]]

Pascal
  • 8,464
  • 1
  • 20
  • 31
  • 1
    Ironically though half the time when I encounter pluck its used in way that makes the code less efficient - for doing `pluck(:id)` when its not actually needed and where it prevents AR from using a subquery. – max Feb 28 '23 at 10:17
  • yeah,one should not do something like this `User.where(category_id: Category.where(....).pluck(:id))` I most often use it for exports, reports, APIs or when providing limited amount of data for a chart or similar. – Pascal Feb 28 '23 at 10:42
  • If you're actually doing any work with the results use `select_all` instead. IMHO pluck is only actually useful when you want an ordered list of the columns. – max Feb 28 '23 at 10:48
1

Let's break it down starting at the end:

user_zip.values is saying "take the Hash user_zip and return an array of the values from every key." For example, this hash:

user_zip = {
  character_id: :character_id,
  age:          :age,
  amount:       Arel.sql('SUM(amount)')
}

Would return:

[
  :character_id,
  :age,
  Arel.sql('SUM(amount)')
]

Moving one step up, user_cleaned = users.pluck(*user_zip.values) is using the splat operator to destructure the above array into arguments that get passed into pluck() so that you are actually running:

users.pluck(
  :character_id,
  :age,
  Arel.sql('SUM(amount)')
)

As for your question about using this for a single record, the purpose of pluck is to retrieve values from multiple records. If you wish to retrieve only a single record then you don't use pluck (and it isn't a method for a single record). However you can limit your where() query so that it returns a single record, for example with .limit(1), and then you can use pluck on it as described:

users = User.limit(1).where([...])
anothermh
  • 9,815
  • 3
  • 33
  • 52
1

That code is a very overcomplicated way of doing:

user_cleaned = users.group(:character_id, :age).pluck(
  :character_id,
  :age,
  Arel.sql('SUM(amount)')
)

Composing the hash and extracting the values from it is clever but completely pointless - it doesn't make the code more readible or maintainable.

What exactly is happening here?

pluck selects the list of columns and immediately fires a database query. The result from the query is an array of arrays instead of model instances.

You would get something like:

[[1, 25, 100], [5, 61, 2000], [5, 16, 50]]

This is a handy low level tool for very simple cases. But its also wildly overused.

Half of the time its not actually needed in the first place - like almost every time someone does pluck(:id).

select_all does the same thing - returns raw query values but it returns the resulting rows as a hash instead which is far less messy to work with.

is it possible to use this for a single record?

Yes - if you apply a limit(1) or where(id: x). But that is kind of silly. The overhead for creating a single model instance is negible.

max
  • 96,212
  • 14
  • 104
  • 165
  • whether it is pointless to construct the hash depends on how/if it is used in other places – Pascal Feb 28 '23 at 10:40
  • @Pascal maybe I'm making a bit too many assumptions but I'm really just going off the code in the question. There it doesn't actually do anything meaningful and if I was to take a guess its not used at all or used to zip the results into hashes again - which is a bad use of `pluck`. – max Feb 28 '23 at 10:45
0

I'll start from the end. Since pluck is executed on database level as SELECT pluck_columns FROM ... it cannot be executed on single record and it cannot be executed on Array of records. It has to be called on 'ActiveRecord::Relation'.

If you run this in development environment console you will see that it creates a query like:

 SELECT character_id, age, SUM(amount) FROM users GROUP BY character_id, age;

So the "transformed" data are amounts held by groups of users of the same character_id and age.

If you need explanation of the ruby part of code like *user_zip.values please refer to the other answer by @anothermh

Jan Vítek
  • 671
  • 4
  • 11
  • As of rails 5 `pluck` is an `Enumerable` method, as well as an `ActiveRecord::Calculations` method, so it can be called on an `Array` (or any other object that includes Enumerable), although this is an awkward implementation unless the `Array` is comprised of `Hash` objects (or another object that logically responds to `#[]`). For instance: `a = [{name: 'engineersmnky'}]` then `a.pluck(:name) #=> ["engineersmnky"]` however `a = ['engineersmnky']` could also to be called like `a.pluck(/.*(?=m)/,/m.*/) #=> [["engineers","mnky"]]` – engineersmnky Feb 28 '23 at 14:17
  • 1
    I know about `pluck` implementation for Array and other Enumerables however it is very different from `pluck` on ActiveRecord Relation and since this case used Arel as an attribute it was obvious it wouldn't work on anything but Relation. Anyway I should have been more specific. – Jan Vítek Feb 28 '23 at 20:40