
I have a Solr 5.4.1 core that I allow users to make comments on. Through PHP, the user runs the following function:

function solr_update($id, $comments) {
  // Atomic update: set only the "comments" field on the given document.
  $ch = curl_init("http://url:8983/solr/asdf/update?commit=true");

  $data = array(
    "id"       => $id,
    "comments" => array("set" => $comments),
  );

  // Solr's JSON update handler expects an array of documents.
  $data_string = json_encode(array($data));

  curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
  curl_setopt($ch, CURLOPT_POST, TRUE);
  curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-type: application/json'));
  curl_setopt($ch, CURLOPT_POSTFIELDS, $data_string);

  echo curl_exec($ch);

  curl_close($ch);
}
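
For what it's worth, here is a rough sketch of how the response could be checked at the call site instead of just being echoed (it assumes solr_update() is changed to return curl_exec($ch) rather than echo it, which is not what the function above currently does):

$raw = solr_update($id, $comments);   // assumes the function returns the response body
$response = json_decode($raw, true);

if ($response === null || $response['responseHeader']['status'] !== 0) {
  // Keep a record of the exact document and payload that Solr rejected,
  // so the failing comment can be inspected later.
  error_log("Solr update failed for id $id: " . $raw);
}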

Most of the time the function works and I get this response:

{"responseHeader":{"status":0,"QTime":83}}

Recently, though, I've been running into times when this is the response I get from curl_exec($ch) instead:

{"responseHeader":{"status":400,"QTime":6},
"error":{"msg":"Exception writing document id 376299 
to the index; possible analysis error.","code":400}}

I'm not sure what's causing this, but when it happens the entire core effectively dies and I have to use a restore point to get it back (http://url:8983/solr/asdf/replication?command=restore&name=solr_backup_20161028).

If I try to load the core in the Solr admin UI (http://url:8983/solr/asdf), no records come up and it says "Luke is not configured". While I can still run a query (http://url:8983/solr/asdf/select?q=*:*), I can't see the record count or modify the index at all.

Am I doing something wrong that is causing my index to become corrupted?

Edit

Bounty time. I really need some help solving this!

Edit2 - Server Logs

d:\solr\solr-5.4.1\server\logs\solr.log.2

    2016-11-15 13:59:49.997 ERROR (qtp434176574-19) [   x:invoice] o.a.s.s.HttpSolrCall
    null:org.apache.lucene.index.IndexNotFoundException: no segments* file found in
    NRTCachingDirectory(MMapDirectory@D:\solr\solr-5.4.1\server\solr\asdf\data\restore.snapshot.solr_backup_20161104
    lockFactory=org.apache.lucene.store.NativeFSLockFactory@17df8f10; maxCacheMB=48.0 maxMergeSizeMB=4.0):
    files: [_pej4.fdt, _pej4_Lucene50_0.doc, _pej4_Lucene50_0.tim, _wxvy.fdt, _wxvy_Lucene50_0.doc, _wxvy_Lucene50_0.tim]

Also, I now have several "solr_backup_YYYYMMDD" folders in my asdf/data folder. Can I manually delete these without causing problems? The only thing that I believe runs against Solr after 5pm is a Python script I wrote to back up Solr each night; as part of that script, it currently deletes any folder with a YYYYMMDD string in its name that is older than 7 days (so I don't run out of space), roughly as sketched below. I've taken that part OUT of the script as of yesterday, in case that might be what is causing the issue. I'm just trying to think of everything.
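
For illustration, the cleanup step of that nightly script is roughly equivalent to the sketch below (written in PHP here only to match the code above; the real script is Python, and the exact folder-name pattern and path are assumptions based on the folder names mentioned earlier):

// Sketch only: remove backup folders older than 7 days from the data directory.
// "solr_backup_*" and the data path are assumptions; the real script is a Python cron job.
$dataDir = 'D:/solr/solr-5.4.1/server/solr/asdf/data';   // forward slashes also work on Windows
$cutoff  = time() - 7 * 24 * 60 * 60;

foreach (glob($dataDir . '/solr_backup_*', GLOB_ONLYDIR) as $dir) {
  if (filemtime($dir) < $cutoff) {
    echo "Would delete: $dir\n";   // the real script deletes the folder tree at this point
  }
}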

[screenshot: Solr core overview showing num docs, deleted docs, and max docs]

Brian Powell
  • Mostly guessing here, but roughly equivalent to "invalid SQL" from an SQL server? GIGO, double check all submitted data? Something's null, required attribute is missing? Bad quote? (that sounds a tad too SQL, :P ) – Kevin_Kinsey Oct 31 '16 at 17:06
  • Incidentally, "num docs" plus "deleted docs" equals "max docs" in your report. – Kevin_Kinsey Oct 31 '16 at 17:07
  • Is there anything specific about "comments" on the ones that fail (e.g. NULL, string vs. array etc) or do they contain a character that doesn't appear in others (e.g. comma or higher-order character such as é or multi-byte characters?) – Robbie Nov 15 '16 at 23:42
  • Seconding what @Robbie is saying - . and ' can be particularly painful. Any chance you could test it and find one where it always fails and one that passes, and add those examples + the logs to the question? (I appreciate it's a production system, so this is just a tentative request.) – Lefty G Balogh Nov 16 '16 at 04:30
  • Can you post an example of a query that triggers this error, and perhaps the stack trace from solr as well? – ChristianF Nov 16 '16 at 07:29
  • @everyone - I don't know what exactly is causing it, if it is a user writing a specific comment or not. I leave at 5pm, and solr is up. I come in at 7 the next day and it has this error - sometimes. I doubt anyone has written anything to the database during that time, but it's **possible**. I found the logs folder (I hadn't looked there before). I've updated my OP with some of the "ERROR" items in the log. `solr.log.2` is the most recent file to `solr.log`, which I can't open because solr is up right now. – Brian Powell Nov 16 '16 at 15:11
  • Does this help: http://stackoverflow.com/questions/11204602/how-to-fix-indexnotfoundexception-no-segments-file-found – Robbie Nov 17 '16 at 01:08
  • @Robbie kind of? I mean, it's the same error, that solr can't find something that it needs (`segments`) - but WHY that's happening, when I don't think any changes are being made over the night, is still the core of the issue. – Brian Powell Nov 17 '16 at 20:55
  • Apart from the fact that you're assuming your *data to insert* is correct, it does sound like the data is somehow corrupted. Perhaps you should scan your drive for cluster errors. – Xorifelse Nov 21 '16 at 20:07

1 Answer


It seems that your problem is described here: http://lucene.472066.n3.nabble.com/Exception-writing-document-to-the-index-possible-analysis-error-td4174845.html

As there is a whole discussion there, I will quote two lines that capture its conclusion:

"I think you have tried to index an empty string into a numeric field. That's not going to work. It must be a valid number, or you need to leave the field completely out".

"The field _collection_id is required in the schema and not filled in my update request".

Your second problem, "IndexNotFoundException: no segments* file found in", happens because after the exception the IndexWriter dies, leaving write.lock and temporary index files in place. This problem is described here https://wilsonericn.wordpress.com/2011/12/14/my-first-5-lucene-mistakes under the "#3: Leaving the IndexWriter open" heading.

SergeyLebedev