2

Suppose I have a table

STOCKCODE, DATETIME, PRICE

I want to perform a basic query such as finding the maximum price between 12.00 PM and 3.00 PM on a given date for a given stock. Currently I index by:

STOCKCODE, DATETIME, PRICE

DATETIME is granular down to the millisecond level in other words there is basically no duplicate data. For the type of query that I wish to perform is there a better way to index? Or more generally is there a better schema that I could use?

Itay Moav -Malimovka
  • 52,579
  • 61
  • 190
  • 278
deltanovember
  • 42,611
  • 64
  • 162
  • 244

1 Answers1

3

I would say your approach is good, except the index on price, since datetime is granular enough, you won't have more than one (or two or maybe maybe three) entries of price per one stockcode+datetime combinations.

This is only to support the action you described above, if you have other actions, you might need (most probably) other things.

Itay Moav -Malimovka
  • 52,579
  • 61
  • 190
  • 278
  • +1. Including `price` to make it a covering index could be beneficial, though – Michael Haren Jul 07 '11 at 01:16
  • Does this mean if my time frame includes one million trades, the database needs to scan one million rows sequentially? – deltanovember Jul 07 '11 at 01:21
  • @deltanovember - absolutly, yes. Think of this, how would you do it if it was on a piece of paper (use a smaller number, say 20), first you would take all the entries inside the time frame (here you used the index), now you need to go trough the hip of entries to find the highest one. Since the list is already ordered by time, it is not ordered by price. – Itay Moav -Malimovka Jul 07 '11 at 01:23