How do these 4 types of queries take advantage of indexes? What does the scan look like?
WHERE status = "foo"
WHERE id IN (1, 2, 3)
WHERE id IN (1, 2, 3) AND status = "foo"
WHERE id IN (1, 2, 3) OR status = "foo"
In the first case, I think this is a B+tree with the key being the status. Easy enough. But wait, it needs to store multiple items per status, so maybe it has an array (generally speaking) of the records for each status.
But for the second query, it seems you would just have the index be for id
and just fetch from the B+tree each key one id
at a time, so it would do tree.get(id)
for each id
. But that is already seeming less than ideal. How is it actually done?
Then take it further and combine the two query types, you can only use one of the indexes now (say the id
index, not the status
index). Then you get the subset of records matching these IDs, and iterate through them and check the status.
Now we are starting to seem really inefficient.
Same with the OR query.
How are these typically implemented in a database, generally or ideally speaking?
I am asking because I would like to implement a basic version of this in JavaScript for the browser. Basically, what the best way is to have multiple (potentially multi-columned) indexes on a table. So I can store a record in this "table", it gets stored in every index, and then on a query it fetches from the "best" index. I am not really sure how this works at a high level (high level yet very deep in terms of data-structure/algorithm implementation) to get started.
This is the template I am basically starting with:
class Index {
constructor(fields = ['id']) {
this.fields = fields
this.tree = new Tree
}
insert(record) {
this.tree.insert(this.getKey(record), block)
}
remove(record) {
this.tree.remove(this.getKey(record))
}
check(record) {
return this.tree.check(this.getKey(record))
}
getKey(record) {
return this.fields.map(field => record[field]).join('')
}
}
class Table {
constructor() {
this.index = []
}
insert(record) {
this.index.forEach(index => index.insert(record))
}
select(query) {
// query processing
}
remove(id) {
}
}
So basically, for each table you create several indexes. When you insert a record, it gets the key for each index and inserts it into a Tree
(the B+tree that acts like a key/value store). From there I don't know how to properly use the indexes, or if I'm even on the right track. I would ask how an ideal relational database would implement this, but that would likely get downvoted as being too general :/ but that's what I'm actually trying to build.
I have this B+tree as an example to work with.