One note before I start in earnest: Alfresco uses Solr, which in turn uses Lucene for indexing, so I wouldn't manage the Lucene indexes directly on Alfresco. Instead, manage your indexes through the Solr tooling Alfresco provides.
I, too, have found that the Lucene/Solr index tends to "drift" in this version of Alfresco (4.2.0). Having engaged Alfresco support on this many times, we've found no solid root cause; they say it may be attributed to "certain customizations" we've made, but they haven't been more specific than that.
So while we've not found a solution, there are proactive steps we take to mitigate the issue.
There is a Solr report we check daily (https://your-alfresco-server.com:8443/solr/report/). It includes a value labeled "Count of transactions in the index but not the DB" (a very misleading label, in my experience). The higher this value, the more out of sync our index seems to be, so as it climbs we schedule a re-index for a time when no one will be impacted.
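If you'd rather not eyeball the report every day, the check can be scripted. The following is only a rough sketch: it assumes the report text contains the label followed somewhere by the number (I haven't verified the exact markup across versions), and the function names and the threshold of 100 are made up for illustration. Tune the threshold to whatever level of drift you've seen cause trouble.

```python
import re
from urllib.request import urlopen

# Label as it appears on the Solr report page.
LABEL = "Count of transactions in the index but not the DB"


def fetch_report(url):
    """Download the report as text. Assumes the URL is reachable without
    extra authentication or certificate setup, which may not hold for you."""
    with urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


def extract_count(report_text, label=LABEL):
    """Return the first integer that follows the label, or None if not found.
    Assumption: the value is the first run of digits after the label."""
    match = re.search(re.escape(label) + r"\D*?(\d+)", report_text)
    return int(match.group(1)) if match else None


def needs_reindex(count, threshold=100):
    """Hypothetical rule: flag a re-index once the count reaches the threshold."""
    return count is not None and count >= threshold
```

For example, `needs_reindex(extract_count(fetch_report("https://your-alfresco-server.com:8443/solr/report/")))` could run from a daily cron job and send a mail when it returns `True`.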
The Alfresco server exposes services to fix and reindex Solr. (Full disclosure: I have not found them very effective, but they come recommended by Alfresco Support.)
Solr re-index service:
http://your-alfresco-server.com:8080/solr/admin/cores?action=REINDEX&txid=
Solr "Fix" service:
http://your-alfresco-server.com:8080/solr/admin/cores?action=FIX
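Both services are plain HTTP GETs, so they are easy to call from a script. Here is a minimal sketch using only the Python standard library. The helper names are mine, the transaction id in the usage note is a placeholder, and it assumes the endpoint is reachable from where the script runs without extra authentication, which may not be true of your setup.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL of the Solr admin endpoint shown above.
BASE = "http://your-alfresco-server.com:8080/solr/admin/cores"


def admin_url(action, base=BASE, **params):
    """Build the admin URL for an action such as FIX or REINDEX."""
    query = {"action": action}
    query.update(params)
    return base + "?" + urlencode(query)


def call_admin(action, **params):
    """Issue the GET request and return (HTTP status, response body)."""
    with urlopen(admin_url(action, **params), timeout=60) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")
```

`call_admin("FIX")` hits the fix service, and `call_admin("REINDEX", txid=12345)` asks for a re-index of one transaction, where 12345 stands in for a real transaction id from your own report.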
- Purging stale content can reduce the time to re-index (this includes transfer reports and similar items that Alfresco generates and that tend to accumulate but, in my case at least, aren't important).
Unfortunately, the true solution often comes down to re-indexing on a scheduled, rotating basis to minimize downtime.