I don't know how to define in the schema a field obj1 with type "object" or something similar.
You can't (at least not in the way you have in mind).
Solr is not designed that way: the unit of information is a document composed of fields. Fields may have different types but, in short, only primitive ones (strings, numbers, booleans); a field cannot be a complex object. Take a look at "How Solr Sees the World" in the documentation.
Does that mean you can't manage nested documents? No. You can manage them, with some caveats.
How to define the schema
First of all, you need to define the internal _root_ field like this:
<field name="_root_" type="string" indexed="true" stored="false" docValues="false" />
Then you need to merge all the "primitive" fields of your parent and child objects into a single list of fields. This has some drawbacks, which are also mentioned in the Solr documentation:
- you have to define an id field that exists for both parent and child objects, and you have to guarantee that it is globally unique
- only fields that exist in both parent and child objects can be declared as "required"
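One way to meet the global-uniqueness requirement is to derive each child id from its parent id. The sketch below is only an illustration: `ChildIds`, `childId` and `childIds` are hypothetical names, not part of Solr or SolrJ, and the scheme assumes that parent ids are already unique and never contain the ".C" separator themselves.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Solr or SolrJ.
// Assumption: parent ids are unique and never contain ".C".
final class ChildIds {

    /** Builds the id of the child at the given 1-based position, e.g. "P1" and 2 give "P1.C2". */
    static String childId(String parentId, int position) {
        if (parentId == null || parentId.isEmpty()) {
            throw new IllegalArgumentException("parentId must not be empty");
        }
        if (position < 1) {
            throw new IllegalArgumentException("position is 1-based");
        }
        return parentId + ".C" + position;
    }

    /** Builds the ids for the first {@code count} children of a parent. */
    static List<String> childIds(String parentId, int count) {
        List<String> ids = new ArrayList<>();
        for (int position = 1; position <= count; position++) {
            ids.add(childId(parentId, position));
        }
        return ids;
    }
}
```

Any other scheme (UUIDs, for instance) works just as well, as long as no two documents in the core, parent or child, ever share an id.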
For example, let's look at a slightly more complex case where multiple comments can be nested under a blog post:
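import java.util.List;
import org.apache.solr.client.solrj.beans.Field;

public class BlogPost {
    @Field
    String id;
    @Field
    String title;
    // child = true tells SolrJ to index these beans as nested child documents
    @Field(child = true)
    List<Comment> comments;
}

public class Comment {
    @Field
    String id;
    @Field
    String content;
}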
Then you need a schema like this:
<?xml version="1.0" encoding="UTF-8" ?>
<schema name="${solr.core.name}" version="1.5">
  <types>
    <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
    <fieldType name="long" class="solr.LongPointField" positionIncrementGap="0"/>
  </types>
  <fields>
    <field name="_version_" type="long" indexed="true" stored="true" />
    <field name="_root_" type="string" indexed="true" stored="false" docValues="false" />
    <field name="id" type="string" indexed="true" stored="true" multiValued="false" required="true" />
    <field name="title" type="string" indexed="true" stored="true" multiValued="false" required="false" />
    <field name="content" type="string" indexed="true" stored="true" multiValued="false" required="false" />
  </fields>
  <uniqueKey>id</uniqueKey>
</schema>
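The field list in the schema above follows mechanically from the two rules: the union of the parent and child fields gives the fields to declare, and the intersection gives the ones that may be required. A minimal plain-Java sketch of that reasoning (`FlatSchemaSketch` and its methods are hypothetical names for illustration, not a Solr API):

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Solr or SolrJ: it only illustrates the merge rule.
final class FlatSchemaSketch {

    /** Union of all field names: each of them needs a <field> entry in the schema. */
    static Set<String> allFields(List<Set<String>> documentTypes) {
        Set<String> union = new LinkedHashSet<>();
        for (Set<String> fields : documentTypes) {
            union.addAll(fields);
        }
        return union;
    }

    /** Intersection of all field names: only these can be declared required="true". */
    static Set<String> requiredCandidates(List<Set<String>> documentTypes) {
        if (documentTypes.isEmpty()) {
            return new LinkedHashSet<>();
        }
        Set<String> common = new LinkedHashSet<>(documentTypes.get(0));
        for (Set<String> fields : documentTypes) {
            common.retainAll(fields);
        }
        return common;
    }

    public static void main(String[] args) {
        Set<String> blogPost = new LinkedHashSet<>(List.of("id", "title"));
        Set<String> comment = new LinkedHashSet<>(List.of("id", "content"));
        List<Set<String>> types = List.of(blogPost, comment);
        System.out.println(allFields(types));          // [id, title, content]
        System.out.println(requiredCandidates(types)); // [id]
    }
}
```

This is why only id is marked required="true" in the schema: title exists only on posts and content only on comments.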
How to index documents
Using SolrJ, it is pretty straightforward: simply create your nested objects in Java and the library will take care of building the correct request when you add them:
final BlogPost myPost = new BlogPost();
myPost.id = "P1";
myPost.title = "My post";
final Comment comment1 = new Comment();
comment1.id = "P1.C1";
comment1.content = "My first comment";
final Comment comment2 = new Comment();
comment2.id = "P1.C2";
comment2.content = "My second comment";
myPost.comments = List.of(comment1, comment2);
...
solrClient.addBean("my_core", myPost);
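To get a feeling for what ends up in the index, it can help to picture the post and its comments as one block of flat documents that all carry the parent id in _root_. The following is a simplified plain-Java model of that idea under my own assumptions, not SolrJ code and not what the library literally executes; `NestedBlockModel` and `flatten` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified, hypothetical model of a nested-document block; this is not SolrJ code.
final class NestedBlockModel {

    /** Flattens one parent and its children into the flat documents of a single block. */
    static List<Map<String, String>> flatten(Map<String, String> parent,
                                             List<Map<String, String>> children) {
        String rootId = parent.get("id");
        List<Map<String, String>> block = new ArrayList<>();
        for (Map<String, String> child : children) {
            block.add(withRoot(child, rootId)); // children come first in the block
        }
        block.add(withRoot(parent, rootId));    // the parent closes the block
        return block;
    }

    /** Copies a document and tags it with the id of the root (parent) document. */
    private static Map<String, String> withRoot(Map<String, String> doc, String rootId) {
        Map<String, String> copy = new LinkedHashMap<>(doc);
        copy.put("_root_", rootId);
        return copy;
    }
}
```

In this model the post P1 and the comments P1.C1 and P1.C2 become three flat documents with _root_ set to P1, which is also why every one of them needs its own unique id.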
How to retrieve documents
This is a little trickier: to rebuild the original object and its children, you have to use the child doc transformer in your request (query.addField("[child]")):
final SolrQuery query = new SolrQuery("*:*");
query.addField("*");
query.addField("[child]");
try {
final QueryResponse response = solrClient.query("my_core", query);
final List<BlogPost> documents = response.getBeans(BlogPost.class);