I want to do a Solr delta import, but I don't want to update the whole document. Is there a way to instruct Solr to update only certain fields during the delta import?
-
Possible duplicate of [Can Solr DIH do atomic updates?](https://stackoverflow.com/questions/21006045/can-solr-dih-do-atomic-updates) – MatsLindh Jan 04 '18 at 15:18
-
@MatsLindh No, atomic updates and in-place updates are different concepts. In-place updates have been available since Solr 6.5 for integer fields only. Under the hood, an atomic update reindexes the whole document, so this answer https://stackoverflow.com/questions/21006045/can-solr-dih-do-atomic-updates is absolutely wrong. – Ivan Mamontov Jan 04 '18 at 19:25
-
Yes, I know that they are different concepts. The _messages_ sent are however identical, and thus the way to instruct DIH to generate a message that performs an update rather than a full insert should be the same. Whether an atomic update or an in-place update takes place is decided by the set of parameters you provided in your post, but that's only the configuration, not how to make DIH do the actual in-place update. In what way is the linked answer wrong about how to make DIH do this? – MatsLindh Jan 04 '18 at 19:42
1 Answer
Theory
This feature is known as an in-place update. An in-place update is performed only when the field to be updated meets all of these conditions:
- non-indexed (indexed="false")
- non-stored (stored="false")
- single valued (multiValued="false")
- numeric docValues (docValues="true") fields
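As a minimal sketch, the four conditions above can be written as a single predicate. The `FieldDef` class and the `qualifiesForInPlaceUpdate` method are hypothetical names made up for illustration; they are not part of Solr's API, and Solr performs its own check internally.

```java
class InPlaceUpdateCheck {
    /** Hypothetical, simplified view of one <field> declaration in the schema. */
    static final class FieldDef {
        final boolean indexed, stored, multiValued, docValues, numeric;

        FieldDef(boolean indexed, boolean stored, boolean multiValued,
                 boolean docValues, boolean numeric) {
            this.indexed = indexed;
            this.stored = stored;
            this.multiValued = multiValued;
            this.docValues = docValues;
            this.numeric = numeric;
        }
    }

    /** True only if every condition from the list above holds. */
    static boolean qualifiesForInPlaceUpdate(FieldDef f) {
        return !f.indexed && !f.stored && !f.multiValued && f.docValues && f.numeric;
    }

    public static void main(String[] args) {
        // A numeric, docValues-only field (assumed example)
        FieldDef count = new FieldDef(false, false, false, true, true);
        // A typical indexed and stored text field (assumed example)
        FieldDef title = new FieldDef(true, true, false, false, false);

        System.out.println(qualifiesForInPlaceUpdate(count)); // true
        System.out.println(qualifiesForInPlaceUpdate(title)); // false
    }
}
```

If any single flag is off, the predicate is false, which corresponds to the case where the update cannot be done in place.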
In other words, this feature is based on a special data structure, DocValues, so you cannot update a non-DocValues field without reindexing the whole document. You can read more details about updatable DocValues in the following jira issues:
Practice
Here is an example via SolrJ:
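import java.util.HashMap;
import java.util.Map;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

// Point the client at the collection itself ("library", as in the curl example)
HttpSolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr/library").build();
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id","1");
// A map value with an "inc" modifier marks this as an update of "count", not a full document
Map<String,Object> fields = new HashMap<>();
fields.put("inc", "-1");
doc.addField("count", fields);
client.add(doc);
client.close();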
Or via CURL:
curl http://localhost:8983/solr/library/update -H 'Content-Type: application/json' -d '
[
{"id" : "1",
"count" : {"inc":"-1"}
}
]'
Where field count is declared as:
<field name="count" type="int" indexed="false" stored="false" docValues="true"/>
Please note that if the field is configured incorrectly, an "Atomic Update" will be applied instead.
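The JSON body from the curl example can also be generated programmatically, for instance when the update messages are produced by your own import code. This is only a sketch: `UpdateMessage` and `incMessage` are hypothetical names, and the method assumes that the id and field name are plain strings that need no JSON escaping.

```java
import java.util.Locale;

class UpdateMessage {
    /**
     * Builds a one-document JSON update body with an "inc" modifier,
     * in the same shape as the curl example.
     * Assumption: id and field contain no characters that require JSON escaping.
     */
    static String incMessage(String id, String field, long delta) {
        return String.format(Locale.ROOT,
                "[{\"id\":\"%s\",\"%s\":{\"inc\":%d}}]", id, field, delta);
    }

    public static void main(String[] args) {
        // prints [{"id":"1","count":{"inc":-1}}]
        System.out.println(incMessage("1", "count", -1));
    }
}
```

The message only names the document id and the modifier for one field. Whether Solr then applies it in place or as an atomic update depends on how the field is declared in the schema.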
"Atomic Updates"
You can "update" any field in a document without any restrictions by using "Atomic Updates". An atomic update does not actually perform an in-place update: it deletes the old document and then indexes a new document with the update applied to it, in one shot. Under the hood, it requires that all fields in your schema be configured as stored and that copy fields be configured as not stored (keep nested documents in mind), and it tries to reconstruct the whole document from the stored fields. In case of any misconfiguration you will lose a large part of the document without any notification. In general, atomic updates have the following drawbacks:
- Reindexing of entire documents and passing them through all analysis chains consumes a lot of CPU cycles
- Index size is increased by storing original documents data
- New index segments are created and old documents are marked as deleted in existing segments, causing segment merge policies to kick in and use additional CPU and build up I/O pressure
- Most importantly, searchers have to be reopened after commits so that changes become visible. This wipes all accumulated filter caches, document caches, and query result caches
- Index commits, which make changes visible, wipe filter and field caches as additional segments are added to the index
- In the case of a block index structure, whole blocks of the documents have to be reindexed, significantly increasing overhead
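The silent data loss mentioned above, where the document is reconstructed from stored fields only, can be illustrated with a toy model. This is not Solr's implementation: `AtomicUpdateToyModel` and `atomicSet` are hypothetical names, the field names are made up, and the model ignores details such as docValues being used as stored values.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class AtomicUpdateToyModel {
    /**
     * Rebuilds a document from its stored fields only, then applies a "set"
     * to one field. Fields that were indexed but not stored cannot be recovered.
     */
    static Map<String, Object> atomicSet(Map<String, Object> original,
                                         Set<String> storedFields,
                                         String field, Object newValue) {
        Map<String, Object> rebuilt = new HashMap<>();
        for (Map.Entry<String, Object> e : original.entrySet()) {
            if (storedFields.contains(e.getKey())) {
                rebuilt.put(e.getKey(), e.getValue());
            }
        }
        rebuilt.put(field, newValue);
        return rebuilt;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("id", "1");
        doc.put("title", "Old title");
        doc.put("body", "text that was indexed but not stored");

        // Only id and title are stored in this assumed schema
        Set<String> stored = new HashSet<>(Arrays.asList("id", "title"));

        Map<String, Object> updated = atomicSet(doc, stored, "title", "New title");
        System.out.println(updated.get("title"));        // New title
        System.out.println(updated.containsKey("body")); // false: the field is gone
    }
}
```

In this model the `body` field disappears without any error, which is the failure mode to watch for when not every field is stored.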
