I am working on a project whose specification requires a parent-child relationship within the Solr data collection, i.e. a user and the collection of languages they speak (each of which is made up of multiple data fields). My production system runs Solr 4.10, but I also have a 5.5 installation at my disposal. So far I have not been able to get this working on either one, and I have yet to find a complete documentation source on how to implement it.
The goal is to get a resulting document from Solr that looks like this:
{
  "id": 123,
  "firstName": "John",
  "lastName": "Doe",
  "languagesSpoken": [
    {
      "id": 243,
      "abbreviation": "en",
      "name": "English"
    },
    {
      "id": 442,
      "abbreviation": "fr",
      "name": "French"
    }
  ]
}
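To make the target shape concrete, here is a minimal sketch in plain Python (not Solr code) of the nesting being asked for. The function name `nest_client` and the sample row dictionaries are hypothetical, made up for illustration; they stand in for rows from the `clients` and `languages` tables, with `clientId` assumed to be the foreign key linking a language row to its client.

```python
from typing import Any, Dict, List


def nest_client(client: Dict[str, Any], languages: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Attach the matching language rows to a client row under `languagesSpoken`."""
    doc = dict(client)  # copy so the input row is not mutated
    doc["languagesSpoken"] = [
        # Keep only the three fields wanted in the nested output.
        {"id": row["id"], "abbreviation": row["abbreviation"], "name": row["name"]}
        for row in languages
        if row["clientId"] == client["id"]  # assumed foreign key
    ]
    return doc


# Hypothetical sample rows mirroring the example document.
client_row = {"id": 123, "firstName": "John", "lastName": "Doe"}
language_rows = [
    {"id": 243, "clientId": 123, "abbreviation": "en", "name": "English"},
    {"id": 442, "clientId": 123, "abbreviation": "fr", "name": "French"},
]

result = nest_client(client_row, language_rows)
```

The point is only that each language stays a self-contained object under its parent, rather than being merged into the parent's fields or emitted as an unrelated document.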
In my schema.xml, I have flattened out all of the fields as follows:
<field name="id" type="int" indexed="true" stored="true" required="true" multiValued="false" />
<field name="firstName" type="text_general" indexed="true" stored="true" />
<field name="lastName" type="text_general" indexed="true" stored="true" />
<field name="languagesSpoken" type="string" indexed="true" stored="true" multiValued="true"/>
<field name="languagesSpoken_id" type="int" indexed="true" stored="true" />
<field name="languagesSpoken_abbreviation " type="text_general" indexed="true" stored="true" />
<field name="languagesSpoken_name" type="text_general" indexed="true" stored="true" />
The latest rendition of my db-data-config.xml looks like this:
<dataConfig>
  <dataSource driver="com.microsoft.sqlserver.jdbc.SQLServerDriver" url="jdbc:...." />
  <document name="clients">
    <entity name="client"
            query="SELECT * FROM clients"
            deltaImportQuery="SELECT * FROM clients WHERE id = ${dih.delta.id}"
            deltaQuery="SELECT id FROM clients WHERE updateDate > '${dih.last_index_time}'">
      <field column="id" name="id" />
      <field column="firstName" name="firstName" />
      <field column="lastName" name="lastName" />
      <entity name="languagesSpoken" child="true"
              query="SELECT id, abbreviation, name FROM languages WHERE clientId = ${client.id}">
        <field name="languagesSpoken_id" column="id" />
        <field name="languagesSpoken_abbreviation" column="abbreviation" />
        <field name="languagesSpoken_name" column="name" />
      </entity>
    </entity>
  </document>
...
On the 4.10 server, the data comes out of Solr as one flat document, with the fields for a single language inline with firstName and lastName, like this:
{
  "id": 123,
  "firstName": "John",
  "lastName": "Doe",
  "languagesSpoken_id": 243,
  "languagesSpoken_abbreviation ": "en",
  "languagesSpoken_name": "English"
}
On the 5.5 server, I get separate documents for the root client document and for each child language document, with no relationship between them, like this:
{
  "id": 123,
  "firstName": "John",
  "lastName": "Doe"
},
{
  "languagesSpoken_id": 243,
  "languagesSpoken_abbreviation": "en",
  "languagesSpoken_name": "English"
},
{
  "languagesSpoken_id": 442,
  "languagesSpoken_abbreviation": "fr",
  "languagesSpoken_name": "French"
}
I have spent several days trying to figure out what is going on here, to no avail. Can anybody point me to what I am missing?
Thanks, -- Jeff