
We have two indexes: posts and users. We'd like to search for a post in the "posts" index, then look up the matching user in the "users" index, and finally return a single result that combines the post with the user info.

Let me clarify it a bit with an example:

posts: 
[
  {
    post: "this is a post about stack overflow",
    username: "james_bond",
    user_id: "007"
  },
  {...}
]

users: 
[
  {
    username: "james_bond",
    user_id: "007",
    bio: "My name's James. James Bond.",
    nb_posts: "7"
  },
  {...}
]

I want to search for all the posts that contain "stack overflow", and then display the users who are talking about it together with their info (from the "users" index). It could look something like this:

result: {
  username: "james_bond",
  user_id: "007",
  post: "this is a post about stack overflow",
  bio: "My name's James. James Bond."
}

I hope this is clear enough. I'm sorry if this question has already been answered, but I honestly couldn't find an answer anywhere.

So, is it possible to do this with only the Elasticsearch JS client?

mab
I also have this same curiosity. Obviously we can make multiple separate queries to ES from our app, and then manage the results in our app. The question is whether ES has any features specific to this use case, which may be more efficient. – Kalnode Mar 13 '21 at 14:01

1 Answer


I don't believe it is possible to do exactly what you are asking, as it would be very costly to join across two indexes that are potentially sharded across different nodes (this is not a main use case for Elasticsearch). But if you have control of the data within Elasticsearch, you could structure it so that you can achieve a different type of join.

You can either use:

nested query

Where documents may contain fields of type nested. These fields are used to index arrays of objects, where each object can be queried (with the nested query) as an independent document.
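As a rough sketch of this option, assume each document in "users" carries that user's posts in a field called posts that is mapped as type nested. The field name and the helper buildNestedPostQuery are made up for illustration and are not part of the question's data model. The search body could then be built like this:

```javascript
// Hypothetical helper: builds a search body for a "users" index whose
// documents hold their posts in a field assumed to be mapped as `nested`.
function buildNestedPostQuery(text) {
  return {
    query: {
      nested: {
        path: "posts", // assumed name of the nested field
        query: { match: { "posts.post": text } },
        // inner_hits asks Elasticsearch to also return the nested posts that matched
        inner_hits: {}
      }
    }
  };
}

const nestedBody = buildNestedPostQuery("stack overflow");
console.log(JSON.stringify(nestedBody, null, 2));
```

The resulting object would be sent as the body of a search request against the "users" index, so each hit is a user and the matching posts come back alongside it.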

has_child and has_parent queries

A join field relationship can exist between documents within a single index. The has_child query returns parent documents whose child documents match the specified query, while the has_parent query returns child documents whose parent document matches the specified query.
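A minimal sketch of what this could look like, assuming users and posts are moved into a single index with a join field. The field name relation, the relation names user and post, and the helper buildHasChildQuery are assumptions for illustration only:

```javascript
// Assumed mapping for a single combined index: "user" is the parent, "post" the child.
// Child documents must be indexed with routing set to the parent's id so that
// parent and child live on the same shard.
const joinMapping = {
  mappings: {
    properties: {
      relation: { type: "join", relations: { user: "post" } }
    }
  }
};

// Hypothetical helper: returns users (parents) that have at least one
// post (child) matching the given text.
function buildHasChildQuery(text) {
  return {
    query: {
      has_child: {
        type: "post",
        query: { match: { post: text } },
        // inner_hits returns the matching child posts with each parent user
        inner_hits: {}
      }
    }
  };
}

console.log(JSON.stringify(buildHasChildQuery("stack overflow"), null, 2));
```

Swapping has_child for has_parent (with parent_type: "user") would go the other way and return posts whose author matches a query.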

Denormalisation

Alternatively, you could store the user denormalised within the post document when you insert the document into the index. This becomes a balancing act between the cost of doing multiple reads every time a post is viewed (fully normalised) and the cost of updating all posts from user 007 every time his details change (denormalised). You don't need to denormalise everything, and as it stands you have already denormalised the username from users to posts.

Here is a Question/Answer that gives more details on the options.
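To make the two ends of that tradeoff concrete, here is a small sketch in plain JavaScript using the sample documents from the question. The function names embedUser and joinPostsWithUsers are hypothetical, and the arrays stand in for documents that would really come back from two separate searches:

```javascript
// Write-time (denormalised): copy the user fields you need into the post
// before indexing it, so a single search on "posts" returns everything.
function embedUser(post, user) {
  return { ...post, bio: user.bio };
}

// Read-time (normalised): after searching "posts", fetch the matching users
// and merge the two result sets in the application by user_id.
function joinPostsWithUsers(posts, users) {
  const usersById = new Map(users.map((u) => [u.user_id, u]));
  return posts.map((p) => {
    const user = usersById.get(p.user_id);
    return {
      username: p.username,
      user_id: p.user_id,
      post: p.post,
      bio: user ? user.bio : null // null when no matching user was found
    };
  });
}

const posts = [
  { post: "this is a post about stack overflow", username: "james_bond", user_id: "007" }
];
const users = [
  { username: "james_bond", user_id: "007", bio: "My name's James. James Bond.", nb_posts: "7" }
];

console.log(embedUser(posts[0], users[0]));
console.log(joinPostsWithUsers(posts, users));
```

With embedUser the join is paid once per write (and again whenever the bio changes); with joinPostsWithUsers it is paid on every read, at the price of a second query to "users".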

Damo