LevelDB is a low-level database abstraction: you can "only" query by exact key match or by key-prefix range.
You cannot query by value without some kind of duplication.
The pattern I adopted in my graphdb project is to follow the EAV model, with a secondary "table" that stores the index.
In Python, plyvel lets you emulate a "table" using prefixed databases. You can also look at how FoundationDB does it in its Subspace implementation. Basically, every key-value pair of a given "table" or "space" is prefixed with a particular byte sequence; that is all.
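To make the prefix idea concrete, here is a minimal sketch. It uses a plain dict as an in-memory stand-in for leveldb (an assumption made so the example runs without any dependency), and the prefixes `EAV` and `INDEX` as well as the helpers `put` and `scan` are hypothetical names chosen for illustration. With plyvel, `db.prefixed_db(b"...")` should give you the same kind of isolation.

```python
# In-memory stand-in for an ordered key-value store such as leveldb.
store = {}

EAV = b"\x00"    # hypothetical prefix for the first "table"
INDEX = b"\x01"  # hypothetical prefix for the index "table"

def put(table, key, value):
    # Every key of a "table" carries that table's byte prefix.
    store[table + key] = value

def scan(table, prefix=b""):
    # Emulates an ordered prefix range scan inside one "table";
    # the table prefix is stripped from the keys that are yielded.
    start = table + prefix
    for key in sorted(store):
        if key.startswith(start):
            yield key[len(table):], store[key]

put(EAV, b"k1", b"v1")
put(INDEX, b"k1", b"other")
rows = list(scan(EAV))
```

The same key `k1` lives in both "tables" without conflict, because the stored keys differ by their prefix.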
The first table looks like the following:
(Entity, Attribute) → (Value)
Where Entity is a (random) identifier, Attribute is the byte representation of the field name and, last but not least, Value is the byte-serialized value associated with Attribute for the given Entity.
The table schema is designed this way so that you can quickly fetch every Attribute and Value of a given Entity with a single prefix range query over that Entity.
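As a rough sketch of that first table, again with a dict standing in for leveldb: the key is the entity identifier followed by the attribute name. The fixed-length hex identifier and the helper names `new_entity`, `add` and `get_entity` are assumptions for illustration, not something taken from my project.

```python
import uuid

store = {}  # in-memory stand-in for leveldb; keys and values are bytes

def new_entity():
    # Random, fixed-length identifier (32 hex characters as bytes).
    return uuid.uuid4().hex.encode()

def add(entity, attribute, value):
    # (Entity, Attribute) -> Value
    store[entity + attribute.encode()] = value.encode()

def get_entity(entity):
    # Prefix range scan over the entity; sorted() emulates leveldb key order.
    out = {}
    for key in sorted(store):
        if key.startswith(entity):
            out[key[len(entity):].decode()] = store[key].decode()
    return out

alice = new_entity()
bob = new_entity()
add(alice, "name", "Alice")
add(alice, "city", "Paris")
add(bob, "name", "Bob")
```

Calling `get_entity(alice)` collects only the attributes stored under Alice's identifier. Because the identifiers have a fixed length, no separator is needed between the entity and the attribute.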
The index table uses the following schema:
(Attribute, Value) → Entity
That is, it is a shuffled version of the first table.
This is done to make it possible to quickly fetch every Entity that matches a particular Attribute == Value, which is what you are looking for.
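Here is a hedged sketch of the index lookup, with a dict once more standing in for leveldb. Note one deviation from the schema as written: the entity is appended to the index key, so that several entities sharing the same Attribute and Value do not overwrite each other. The separator byte, the prefixes and the helper names are assumptions for illustration.

```python
store = {}  # in-memory stand-in for leveldb

EAV, INDEX = b"e", b"i"  # hypothetical table prefixes
SEP = b"\x00"            # assumes attributes and values never contain this byte

def add(entity, attribute, value):
    # First table: (Entity, Attribute) -> Value
    store[EAV + entity + SEP + attribute] = value
    # Index table: (Attribute, Value, Entity) -> empty value
    store[INDEX + attribute + SEP + value + SEP + entity] = b""

def find(attribute, value):
    # Prefix scan over (Attribute, Value); what remains of each key is the entity.
    prefix = INDEX + attribute + SEP + value + SEP
    return [key[len(prefix):] for key in sorted(store) if key.startswith(prefix)]

add(b"e1", b"city", b"Paris")
add(b"e2", b"city", b"Paris")
add(b"e3", b"city", b"Rome")
matches = find(b"city", b"Paris")
```

Every write goes to both tables, which is the duplication mentioned at the start: the index only stays correct if it is updated together with the first table.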
There are alternative implementations for what you are looking for. Look up my answers about leveldb and key-value stores, e.g. Expressing multiple columns in berkeley db in python?
Good luck!