Is there a way to configure Solr to ignore large files while indexing?

I'm trying to index a network drive but can't figure out how to ignore large files (>20MB).

Thanks

1 Answer

Try filtering by size before the files ever reach Solr, along these lines:

$ find /mnt -type f -size -20M -exec /opt/solr/bin/post -c wizbang {} \;
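One caveat with the command above: GNU find rounds sizes up to whole units before comparing, so `-size -20M` matches files of at most 19 MiB rather than everything under 20 MiB. If the exact cutoff matters, a byte-based test avoids the rounding. Here is a minimal sketch of that idea; the function name `list_small_files` and the variable `MAX_BYTES` are made up for illustration, and it assumes file names without embedded newlines.

```shell
#!/bin/sh
# Sketch only: list regular files strictly smaller than a byte limit.
# list_small_files and MAX_BYTES are hypothetical names, not part of Solr.

MAX_BYTES=$((20 * 1024 * 1024))   # 20 MiB expressed in bytes

list_small_files() {
    # $1 = root directory to scan
    # The "c" suffix makes find compare exact byte counts, so there is
    # no rounding: -size -Nc means "fewer than N bytes".
    find "$1" -type f -size "-${MAX_BYTES}c"
}

# Example use, with the path and collection name from the command above
# (left commented out so that sourcing this file has no side effects):
# list_small_files /mnt | while IFS= read -r f; do
#     /opt/solr/bin/post -c wizbang "$f"
# done
```

Keeping the listing step separate from the posting step also lets you review which files would be indexed before sending anything to Solr.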

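If you use Tika, note that it limits how much text it extracts from a document. That is a character limit rather than a file size limit, and it truncates the content instead of ignoring the file: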

Apache Tika and character limit when parsing documents

How to read large files using TIka?
