We have several web pages on our internal network that live at different URLs but serve identical content (such as UPS management consoles). Solr always keeps only one of them because they all have the same digest. The crawl log shows:
Indexer: finished at 2013-11-18 01:21:28, elapsed: 00:00:02
SolrDeleteDuplicates: starting at 2013-11-18 01:21:28
SolrDeleteDuplicates: Solr url: http://localhost:8983/solr/collection_test5
SolrDeleteDuplicates: deleting 4 duplicates
SolrDeleteDuplicates: finished at 2013-11-18 01:21:29, elapsed: 00:00:01
crawl finished: crawl
All 4 deleted duplicates have different URLs. I want to keep all of them in Solr, while still letting Solr delete other kinds of duplicate content. I guess the URL isn't used to generate the digest by default, so is there any way to configure the digest to include the URL? What other options do I have?
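To illustrate what I think is happening, here is a minimal sketch in Python. It assumes the digest is an MD5 hash of the page content only, which is my guess and not something I have confirmed in the code. The function names and the example URLs are hypothetical, made up for this illustration.

```python
import hashlib

def content_digest(content: bytes) -> str:
    # Assumed default behaviour: hash the page content only.
    return hashlib.md5(content).hexdigest()

def url_aware_digest(url: str, content: bytes) -> str:
    # Hypothetical alternative: mix the URL into the hash so that
    # identical content served from different URLs gets different digests.
    h = hashlib.md5()
    h.update(url.encode("utf-8"))
    h.update(b"\0")  # separator so the URL and content cannot run together
    h.update(content)
    return h.hexdigest()

# Two made-up consoles serving byte-identical pages.
pages = {
    "http://ups-a.example/": b"<html>UPS console</html>",
    "http://ups-b.example/": b"<html>UPS console</html>",
}

content_only = {content_digest(c) for c in pages.values()}
with_url = {url_aware_digest(u, c) for u, c in pages.items()}

print(len(content_only), "distinct digest(s) from content only")
print(len(with_url), "distinct digest(s) when the URL is included")
```

With content-only hashing both pages collapse to a single digest, which would explain why only one survives. Including the URL keeps them apart, but then no two pages at different URLs would ever share a digest, so I am not sure that alone gives me the selective behaviour I am after.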