I'm using Apache Solr for the matching functionality of my webapp, and I ran into a problem with the following scenario:

I have three programmers. The skill field lists their skills, and "weight" indicates how good he/she is at that skill:

{
    name: "John",
    skill: [
        {name: "java", weight: 90},
        {name: "oracle", weight: 90},
        {name: "linux", weight: 70}
    ]
},
{
    name: "Sam",
    skill: [
        {name: "C#", weight: 98},
        {name: "java", weight: 75},
        {name: "oracle", weight: 70},
        {name: "tomcat", weight: 70}
    ]
},
{
    name: "Bob",
    skill: [
        {name: "oracle", weight: 90},
        {name: "java", weight: 85}
    ]
}

And I have a job looking for a programmer:

{
    name: "webapp development",
    skillRequired: [
        {name: "java", weight: 85},
        {name: "oracle", weight: 85}
    ]
}

I want to use the job's "skillRequired" to match those programmers (to find the best people for the job). In this case, the result should be John and Bob; Sam is excluded because his java and oracle skills are not good enough. John should score higher than Bob because he knows java better (both have oracle at 90).
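To make the intended behaviour concrete, here is a minimal sketch in plain Python, outside Solr. The function name `match` and the scoring rule (sum of the weights of the required skills) are assumptions for illustration; the question only says that John should rank above Bob.

```python
# Sample data copied from the question.
programmers = [
    {"name": "John", "skill": [
        {"name": "java", "weight": 90},
        {"name": "oracle", "weight": 90},
        {"name": "linux", "weight": 70},
    ]},
    {"name": "Sam", "skill": [
        {"name": "C#", "weight": 98},
        {"name": "java", "weight": 75},
        {"name": "oracle", "weight": 70},
        {"name": "tomcat", "weight": 70},
    ]},
    {"name": "Bob", "skill": [
        {"name": "oracle", "weight": 90},
        {"name": "java", "weight": 85},
    ]},
]

job = {
    "name": "webapp development",
    "skillRequired": [
        {"name": "java", "weight": 85},
        {"name": "oracle", "weight": 85},
    ],
}


def match(job, programmers):
    """Return (name, score) pairs for programmers who meet every requirement, best first."""
    required = job["skillRequired"]
    results = []
    for programmer in programmers:
        # Map skill name -> weight for quick lookup.
        weights = {s["name"]: s["weight"] for s in programmer["skill"]}
        # A missing skill counts as weight 0, so it never meets a requirement.
        if all(weights.get(r["name"], 0) >= r["weight"] for r in required):
            # Assumed scoring rule: sum of the weights of the required skills.
            score = sum(weights[r["name"]] for r in required)
            results.append((programmer["name"], score))
    return sorted(results, key=lambda pair: pair[1], reverse=True)


print(match(job, programmers))
```

With this scoring rule John gets 180 and Bob gets 175, while Sam is dropped because his java weight (75) is below the required 85.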

The problem is that Solr can't index nested objects. The best format I think I can get is:

name: "John",
skill-name: ["java", "oracle", "linux"],
skill-weight: [90, 90, 70]

and so on. So I don't know whether it is possible to construct a query that handles this scenario.

Is there a better schema structure for it? Or should I use index-time or query-time boosts?

I have read almost all of the Solr wiki and googled around with no luck. Any tips and workarounds are welcome.

Problem solved. I'm logging my solution here in case it helps:

First, my data is in JSON, so I needed Solr 4.8.0, which supports indexing nested documents from JSON. If the data were in XML, Solr 4.7.2 would still work.

Second, Solr 4.8.0 needs Java 7 (update 55 is officially recommended).

Third, nested documents/objects should be submitted to Solr under the "_childDocuments_" key. To identify the type of each parent/child document, I added a "type" field. With the example above, the data looks like this:

{
    type: "programmer",
    name: "John",
    _childDocuments_: [
        {type: "skill", name: "java", weight: 90},
        {type: "skill", name: "oracle", weight: 90},
        {type: "skill", name: "linux", weight: 70}
    ]
},
{
    type: "programmer",
    name: "Sam",
    _childDocuments_: [
        {type: "skill", name: "C#", weight: 98},
        {type: "skill", name: "java", weight: 75},
        {type: "skill", name: "oracle", weight: 70},
        {type: "skill", name: "tomcat", weight: 70}
    ]
},
{
    type: "programmer",
    name: "Bob",
    _childDocuments_: [
        {type: "skill", name: "oracle", weight: 90},
        {type: "skill", name: "java", weight: 85}
    ]
}
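As a rough illustration of this step, the sketch below converts a programmer record from the question's original shape into the parent/child shape and serializes it as JSON. The helper names `to_solr_doc` and `post_docs`, the URL, and the core name `collection1` are assumptions, not taken from the original setup. A real schema will probably also require a unique key field on every parent and child document, which this sketch leaves out.

```python
import json
import urllib.request


def to_solr_doc(programmer):
    """Convert a programmer record into a parent document with nested skill children."""
    return {
        "type": "programmer",
        "name": programmer["name"],
        "_childDocuments_": [
            {"type": "skill", "name": s["name"], "weight": s["weight"]}
            for s in programmer["skill"]
        ],
    }


def post_docs(docs, url="http://localhost:8983/solr/collection1/update?commit=true"):
    """Send a list of documents to Solr's JSON update handler (URL is hypothetical)."""
    body = json.dumps(docs).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return response.status


john = {
    "name": "John",
    "skill": [
        {"name": "java", "weight": 90},
        {"name": "oracle", "weight": 90},
        {"name": "linux", "weight": 70},
    ],
}

doc = to_solr_doc(john)
print(json.dumps([doc], indent=2))
# post_docs([doc])  # uncomment only when a Solr instance is reachable at the URL above
```

Note that the payload sent to Solr is a JSON array of parent documents, with quoted keys; the listing above omits the quotes and the enclosing brackets for readability.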

Fourth, after submitting and committing to Solr, I can match the job with block join queries (as filter queries):

fq={!parent which='type:programmer'}type:skill AND name:java AND weight:[85 TO *]&
fq={!parent which='type:programmer'}type:skill AND name:oracle AND weight:[85 TO *]
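The two filter queries above can be generated from the job's "skillRequired" list. The following is a minimal sketch; the function name `build_filter_queries` and the `q=*:*` main query are assumptions, and skill names are inserted without any query-syntax escaping.

```python
from urllib.parse import urlencode


def build_filter_queries(job, parent_type="programmer", child_type="skill"):
    """Build one block join filter query per required skill."""
    filter_queries = []
    for requirement in job["skillRequired"]:
        filter_queries.append(
            "{!parent which='type:%s'}type:%s AND name:%s AND weight:[%d TO *]"
            % (parent_type, child_type, requirement["name"], requirement["weight"])
        )
    return filter_queries


job = {
    "name": "webapp development",
    "skillRequired": [
        {"name": "java", "weight": 85},
        {"name": "oracle", "weight": 85},
    ],
}

fqs = build_filter_queries(job)
# URL-encode the parameters; q=*:* is an assumed match-all main query.
query_string = urlencode([("q", "*:*")] + [("fq", fq) for fq in fqs])

for fq in fqs:
    print(fq)
print(query_string)
```

Because each requirement is a separate fq, a programmer must satisfy all of them to be returned. Filter queries only restrict the result set and do not contribute to the score, so ranking John above Bob would need additional scoring logic.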
Hetfield Joe
  • Could you please provide the schema.xml for this particular case? – frankie Aug 11 '15 at 11:07
  • Did you have to add the _root_ field to your schema? I was following the guidelines from http://yonik.com/solr-nested-objects/, and before adding a nested document, I had to update the schema: $ curl http://localhost:8983/solr/nested_demo/schema -X POST -H 'Content-type:application/json' --data-binary '{ "add-field" : { "name":"_root_", "type":"string", "indexed":true, "stored":false } }' – alisa Mar 04 '16 at 21:28
  • Can you please provide schema? How did you declare this field in schema? – Pratik Patel Oct 04 '16 at 13:46
  • @PratikPatel Sorry, I left that company a long time ago, and all the knowledge stayed there. Maybe you can try Elasticsearch? It seems much more popular. – Hetfield Joe Oct 08 '16 at 05:24

1 Answer

You can try BlockJoinQuery. Refer here

sidgate
  • Nice! Very useful clue! I found this, which finally solved my problem: http://heliosearch.org/solr-4-8-features/ – Hetfield Joe May 13 '14 at 02:15
  • The site is not reachable! Can you please update your answer? @HetfieldJoe – Tim Long Jan 20 '16 at 14:29
  • @TimLong The link works fine for me. Please try again. You can also google for block join query. Another resource is http://yonik.com/solr-nested-objects/ – sidgate Jan 21 '16 at 08:56
  • @TimLong As of today, you might want to take a look at the newer Solr 5.3 features: http://yonik.com/solr-nested-objects/ – alisa Mar 04 '16 at 21:26