5

I'm using Rails 5 with PostGres 9.5. I have a table that tracks prices ...

                                         Table "public.crypto_prices"
       Column       |            Type             |                         Modifiers
--------------------+-----------------------------+------------------------------------------------------------
 id                 | integer                     | not null default nextval('crypto_prices_id_seq'::regclass)
 crypto_currency_id | integer                     |
 market_cap_usd     | bigint                      |
 total_supply       | bigint                      |
 last_updated       | timestamp without time zone |
 created_at         | timestamp without time zone | not null
 updated_at         | timestamp without time zone | not null

I would like to get the latest price per currency (where last_updated is greatest) for a select currencies. I can find all the prices related to certain currencies like so

current_prices = CryptoPrice.where(crypto_currency_id: CryptoIndexCurrency.all.pluck(:crypto_currency_id).uniq)

Then I can sort them by currency into arrays, looping through each until I find the one with the greatest last_updated value, but how can I write a finder that will return exactly one row per currency with the greatest last_updated date?

Edit: Tried Owl Max's suggestion like so

ids = CryptoIndexCurrency.all.pluck(:crypto_currency_id).uniq
crypto_price_ids = CryptoPrice.where(crypto_currency_id: ids).group(:crypto_currency_id).maximum(:last_updated).keys
puts "price ids: #{crypto_price_ids.length}"
@crypto_prices = CryptoPrice.where(crypto_currency_id: crypto_price_ids)
puts "ids: #{@crypto_prices.size}"

Although the first "puts" only reveals a size of "12" the second puts reveals over 38,000 results. It should only be returning 12 results, one for each currency.

Dave
  • 15,639
  • 133
  • 442
  • 830
  • Just throwing out a thought - Have you considered ordering by` last_updated` and grabbing the first instance? – gwalshington Sep 01 '17 at 18:38
  • That would only return one result, wouldn't it? – Dave Sep 01 '17 at 21:59
  • If you want more than one, you can do something like `.first(3)` for 3 (fill in whatever number). – gwalshington Sep 04 '17 at 15:35
  • That's not what I mean. I only what the latest price per currency for certain currencies (e.g. crypto_currency_id values I specify). Where would I speicfy the specific IDs using "first(3)"? – Dave Sep 04 '17 at 15:55

4 Answers4

1

We can write a finder that will return exactly one row per currency with the greatest last_updated date in such a way like

current_prices = CryptoPrice.where(crypto_currency_id: CryptoIndexCurrency.all.pluck(:crypto_currency_id).uniq).select("*, id as crypto_price_id, MAX(last_updated) as last_updated").group(:crypto_currency_id)

I hope that this will took you closer to your goal. Thank you.

Ahmad Ali
  • 556
  • 2
  • 8
  • 1
    When I run this I get the error, "ActiveRecord::StatementInvalid: PG::GroupingError: ERROR: column "crypto_prices.id" must appear in the GROUP BY clause or be used in an aggregate function" – Dave Sep 04 '17 at 16:57
  • Can you please give it a try by updating select by `("*, id as crypto_price_id, MAX(last_updated) as last_updated")`. I have also read somewhere that we also need to explicitly specify ordering too in case of PG. So if above solution doesn't workout for you, also add ordering clause, that may help out. Thank You. – Ahmad Ali Sep 05 '17 at 04:27
  • Please edit your question to include the code that would generate the answer. I'm unclear about what you want me to do. – Dave Sep 05 '17 at 14:32
  • unfortunately, this query does not work with postgreSQL, you need to include the id in the aggregate function like `.group(:crypto_currency_id, :id)` but then you don't have anymore unique crypto_currency_ids. – Kruupös Sep 07 '17 at 08:14
0

Using this code you can fetch the latest updated row here from particular table.

  CryptoPrice.order(:updated_at).pluck(:updated_at).last

This Should be help for you.

Pasupathi Thangavel
  • 942
  • 1
  • 10
  • 17
  • THis only returns one row always, right? Also you're not taking account of the fact taht I only want to return the latest price for certain currencies (identified by the ids in my question) – Dave Sep 04 '17 at 14:35
0

Only works with Rails5 because of or query method

specific_ids = CryptoIndexCurrency.distinct.pluck(:crypto_currency_id)
hash = CryptoPrice.where(crypto_currency_id: specific_ids)
                  .group(:crypto_currency_id)
                  .maximum(:last_updated)
hash.each_with_index do |(k, v), i|
  if i.zero?
    res = CryptoPrice.where(crypto_currency_id: k, last_updated: v)
  else
    res.or(CryptoPrice.where(crypto_currency_id: k, last_updated: v))
  end
end

Explanation:

You can use group to regroup all your CryptoPrice object by each CryptoIndexCurrency presents in your table.

Then using maximum (thanks to @artgb) to take the biggest value last_updated. This will output a Hash with keys: crypto_currency_id and value last_updated.

Finally, you can use keys to only get an Array of crypto_currency_id.

CryptoPrice.group(:crypto_currency_id).maximum(:last_updated)
=> => {2285=>2017-06-06 09:06:35 UTC,
       2284=>2017-05-18 15:51:05 UTC,
       2267=>2016-03-22 08:02:53 UTC}

The problem with this solution is that you get the maximum date for each row without getting the whole records.

To get the the records, you can do a loop on the hash pairwise. with crypto_currency_id and last_updated. It's hacky but the only solution I found.

Kruupös
  • 5,097
  • 3
  • 27
  • 43
  • Thanks. Here you're finding the latest price for every currency, but what I was going for was only get teh latest price for certain currencies (e.g. "where crypto_currency_id in (ids)"). How would I add that condition into what you have listed? – Dave Sep 04 '17 at 14:40
  • Just add a `where` clause before `group`ing ! e.g `CryptoPrice.where(crypto_currency_id: specific_ids). group(:crypto_currency_id).maximum(:last_updated).keys` it will do the same thing, except it will not take in account the `CryptoIndexCurrency` you don't need. – Kruupös Sep 04 '17 at 14:44
  • I edited my question to include your suggestion. THis line "@crypto_prices = CryptoPrice.where(crypto_currency_id: crypto_price_ids)" is returning tens of thousands of results, which is incorrect. It shoudl only be returning one result per currency. – Dave Sep 04 '17 at 16:29
  • I think it's because your table allows to have multiple rows with the same `crypto_currency_id`. That mean you have to select `cryptoprice.id` and not only `crypto_currency_id` whithin the group. I will show you an example in few hours! You are close to the solution. – Kruupös Sep 04 '17 at 20:17
  • @Dave very nice question! I think I found a solution but it's hacky, let me know if it works, if it doesn't, I'm sorry but that mean I'm not good enough to find a solution and I will stop trying ^^ – Kruupös Sep 05 '17 at 15:54
0

This is currently not easy to do in Rails in one statement/query. If you don't mind using multiple statements/queries than this is your solution:

cc_ids = CryptoIndexCurrency.distinct.pluck(:crypto_currency_id)

result = cc_ids.map do |cc_id|
  max_last_updated = CryptoPrice.where(crypto_currency_id: cc_id).maximum(:last_updated)
  CryptoPrice.find_by(crypto_currency_id: cc_id, last_updated: max_last_updated)
end

The result of the map method is what you are looking for. This produces 2 queries for every crypto_currency_id and 1 query to request the crypto_currency_ids.

If you want to do this with one query you'll need to use OVER (PARTITION BY ...). More info on this in the following links:

But in this scenario you'll have to write some SQL.

EDIT 1:

If you want a nice Hash result run:

cc_ids.zip(result).to_h

EDIT 2:

If you want to halve the amount of queries you can shove the max_last_updated query in the find_by as sub-query like so:

cc_ids = CryptoIndexCurrency.distinct.pluck(:crypto_currency_id)

result = cc_ids.map do |cc_id|
  CryptoPrice.find_by(<<~SQL.squish)
    crypto_currency_id = #{cc_id} AND last_updated = (
      SELECT MAX(last_updated) 
      FROM crypto_prices 
      WHERE crypto_currency_id = #{cc_id})
  SQL
end

This produces 1 queries for every crypto_currency_id and 1 query to request the crypto_currency_ids.

3limin4t0r
  • 19,353
  • 2
  • 31
  • 52
  • I don't mind running multiple statements, but how many queries is this generating? I want to minimize the number of queries. – Dave Sep 06 '17 at 15:06
  • 2 queries per `crypto_currency_id` + 1 for fetching the `crypto_currency_ids`. Keep in mind that the query to fetch `max_last_updated ` only returns 1 value and `CryptoPrice.find_by(crypto_currency_id: cc_id, last_updated: max_last_updated)` only returns one record. There is never a whole set of records initialized in Rails. If you want to know the exact number of queries, run: `CryptoIndexCurrency.distinct.pluck(:crypto_currency_id).count * 2 + 1` – 3limin4t0r Sep 06 '17 at 15:12
  • Thanks but I think Owl max's solution only has 2 queries -- one to get the IDs and anotehr to get the data . – Dave Sep 06 '17 at 16:16
  • That's correct, keep in mind that he doesn't fetch `crypto_currency_id` from `CryptoIndexCurrency`. If your `CryptoPrice` table contains `nil` values, or values not present in `CryptoIndexCurrency` those will be returned as well. – 3limin4t0r Sep 06 '17 at 16:40
  • 1
    @JohanWentholt, you are right, I skipped the first `where` clause in the first query because I though it was not necessary to understand the idea. But you are right it add another query. I'm sure the solution is possible to write in one query, but I'm not sure it is possible with rails ORM (in POSTGRESQL anyway, it is possible in MySQL for sure). I think Vakiliy approches is the best that far. I haven't test the query but writing this is row SQL is a good idea. Your idea is good too but not scalable on huge DB because it produces N+1 query problem. – Kruupös Sep 07 '17 at 07:59