0

to get only data with latest timestamp i use the mysql query from this answer:

fetch-the-row-which-has-the-max-value- for-a-column

my query is following:

SELECT stock_id,timestamp,price FROM market m1
                        WHERE timestamp = 
                            (SELECT MAX(timestamp) FROM market m2 WHERE m1.stock_id = m2.stock_id)

but it takes 10 minutes to execute.

what are the different options to optimize it? (mysql)

market has following schema: (sqlalchemy)

class Market(db.Model):
    stock_id=db.Column(db.Integer,db.ForeignKey('stock.id'),primary_key=True)
    timestamp=db.Column(db.Integer,primary_key=True)
    price=db.Column(db.Float)
Rick James
  • 135,179
  • 13
  • 127
  • 222
alexprice
  • 394
  • 4
  • 12

2 Answers2

1

You are using a so-called dependent subquery. These can sometime be very inefficient.

You can use this subquery to retrieve the max timestamp for each stock_id value.

                   SELECT MAX(timestamp) timestamp, stock_id
                     FROM market
                    GROUP BY stock_id

Your compound primary key should make this quite fast because MySQL will do a "loose index scan" to satisfy it.

Then you can use this as a subquery, as follows:

  SELECT m.stock_id, m.timestamp, m.price
    FROM market m
    JOIN (
                   SELECT MAX(timestamp) timestamp, stock_id
                     FROM market
                    GROUP BY stock_id
         ) maxt on m.stock_id = maxt.stock_id AND m.timestamp = maxt.timestamp

This, too, should exploit your primary key.

O. Jones
  • 103,626
  • 17
  • 118
  • 172
0

You are using a so-called dependent subquery. These can sometimes be very inefficient, because the query planner repeats them for every value from the outer query (m1.stock_id in your case).

You can use this subquery to retrieve the max timestamp for each stock_id value.

                   SELECT MAX(timestamp) timestamp, stock_id
                     FROM market
                    GROUP BY stock_id

Your compound primary key should make this quite fast because MySQL will do a "loose index scan" to satisfy it.

Then you can use it as a subquery, as follows:

  SELECT m.stock_id, m.timestamp, m.price
    FROM market m
    JOIN (
                   SELECT MAX(timestamp) timestamp, stock_id
                     FROM market
                    GROUP BY stock_id
         ) maxt on m.stock_id = maxt.stock_id AND m.timestamp = maxt.timestamp

This, too, should exploit your primary key. https://use-the-index-luke.com/

O. Jones
  • 103,626
  • 17
  • 118
  • 172