For example, I have 1000 users. Each user's data is not big: at most 1 GB. So I have two strategies for indexing.
- Big indexing: I will have a single index. Then every time a user searches some data, I will add a `user_id` filter to the query.
- Small indexing: Every user gets their own Elasticsearch index. Because the data is not huge, each index only needs 1-2 shards.
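To make the two strategies concrete, here is a minimal sketch of the search request each one would send. It only builds the request bodies as plain Python dictionaries, so no cluster is needed. The index names `all-users` and `user-<id>` and the field name `content` are hypothetical; only `user_id` comes from my setup.

```python
import json


def shared_index_request(user_id, text):
    """Strategy 1: one shared index, restricted to a user by a term filter."""
    body = {
        "query": {
            "bool": {
                # Full-text match on a hypothetical "content" field.
                "must": [{"match": {"content": text}}],
                # The extra clause that strategy 1 adds to every query.
                "filter": [{"term": {"user_id": user_id}}],
            }
        }
    }
    return "all-users", body  # hypothetical shared index name


def per_user_index_request(user_id, text):
    """Strategy 2: one index per user, so the query has no user_id clause."""
    index = f"user-{user_id}"  # hypothetical naming scheme
    body = {"query": {"match": {"content": text}}}
    return index, body


for index, body in (
    shared_index_request(42, "invoice"),
    per_user_index_request(42, "invoice"),
):
    print(f"GET /{index}/_search")
    print(json.dumps(body, indent=2))
```

The only difference on the query side is the `term` filter in the first request; the second strategy moves the user separation into the index name instead.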
My opinion is that the second method is a lot faster because we don't need to add `user_id` to the query. The first method might be slower because the search goes to many shards and, at the same time, has to filter on `user_id`.
However, some references (ref1, ref2) recommend keeping the total number of shards relatively small.
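To see how far apart the two strategies are in shard count, here is a rough back-of-the-envelope calculation. The 1000 users, the 1 GB per user, and the 1-2 shards per index come from my numbers above; the one replica per primary and the 25 GB target shard size for the single index are assumptions I picked for illustration, not recommendations.

```python
import math


def per_user_shards(users, primaries_per_index, replicas):
    """Total shards (primaries + replica copies) with one index per user."""
    return users * primaries_per_index * (1 + replicas)


def single_index_shards(total_gb, target_shard_gb, replicas):
    """Total shards for one shared index sized by a target shard size."""
    primaries = math.ceil(total_gb / target_shard_gb)
    return primaries * (1 + replicas)


users = 1000
max_gb_per_user = 1
replicas = 1          # assumption: one replica copy per primary shard
target_shard_gb = 25  # assumption: illustrative target shard size

print(per_user_shards(users, 1, replicas))  # 2000
print(per_user_shards(users, 2, replicas))  # 4000
print(single_index_shards(users * max_gb_per_user, target_shard_gb, replicas))  # 80
```

Under these assumptions, the per-user strategy ends up with thousands of shards, while the single index needs well under a hundred, which is why the shard-count recommendation worries me.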
In a practical environment, what is a good solution for my situation?