Questions tagged [standardanalyzer]

This tag is used for everything regarding the "StandardAnalyzer" in Apache's "Lucene" library.

17 questions
6
votes
1 answer

standard.StandardAnalyzer not found in lucene 4.7.0

I'm a newbie to Lucene. I'm trying to follow the tutorial here: http://www.lucenetutorial.com/lucene-in-5-minutes.html The site imports: import org.apache.lucene.analysis.standard.StandardAnalyzer; however, in IntelliJ, I can't find any standard…
user2773013
  • 3,102
  • 8
  • 38
  • 58
4
votes
0 answers

Lucene 4 - How to discard numeric terms in index?

I'm using Apache Tika to parse an XML document before indexing it with Apache Lucene. This is the Tika part: BodyContentHandler handler = new BodyContentHandler(10*1024*1024); Metadata metadata = new Metadata(); FileInputStream inputstream = new…
tommy
  • 139
  • 9
2
votes
2 answers

How to make the letter "A" an exception in Lucene's StandardAnalyzer?

I've created a medical dictionary in Android using Lucene. The words and definitions are Danish, and I'm using StandardAnalyzer to index and search for the entries. The idea is that when I click on an entry in my ListView, another Activity shows up…
Matthew Quiros
  • 13,385
  • 12
  • 87
  • 132
1
vote
2 answers

What is the appropriate Lucene analyzer to use?

I have problems indexing item names that contain numbers and symbols. A sample of my data is shown below: ANGLE BARS ORANGE - 4.0MM 2 - 1/2" B.I SQUARE TUBING 2" X 3" B.I. PIPE S-40 10MM 3/8" B.I SQUARE TUBING 1" X 2" PLYWOOD …
maccramers
  • 125
  • 2
  • 6
1
vote
1 answer

StandardAnalyzer - Apache Lucene

I'm developing a system where you input some text files to a StandardAnalyzer, and the contents of each file are then replaced by the output of the StandardAnalyzer (which tokenizes and removes all the stop words). The code I've developed…
svaurag
  • 71
  • 5
1
vote
1 answer

How to use Elasticsearch standard analyser without lower case

I'm trying to create an analyser in Elasticsearch using the presets of the "standard" analyser but with one change: no lowercasing of words. I've tried chaining the whitespace and standard analysers like so: PUT /standard_uppercase { "settings":…
EHarpham
  • 602
  • 1
  • 17
  • 34
1
vote
1 answer

Change StandardAnalyzer Lucene

I'm trying to search documents by title with the StandardAnalyzer of Lucene 4.10.3. I read the quotes from a file and then I add the double quotation marks for constructing the query with this: Query query = parser.parse("\""+doc.get("title")+"\""); The…
CodeSniffer
  • 83
  • 1
  • 10
1
vote
1 answer

Lucene search using StopWords in StandardAnalyzer

I have the following issue using Lucene.NET 3.0.3. My project analyzes documents using StandardAnalyzer with a stop word list (combined German and English words). While searching, I create my search term by hand and parse it using MultiFieldQueryParser.…
user1723639
1
vote
1 answer

Lucene is not matching strings having upper characters

I am using Lucene Search Engine (v36), with the StandardAnalyzer. I use the MultiFieldQueryParser. One of my fields is set as NOT_ANALYZED, because it's a version name containing alphanumeric characters and points. When this field contains an upper…
daiquiri33
  • 65
  • 2
  • 7
0
votes
1 answer

Lucene StandardAnalyzer using Hunspell TokenFilter in C#?

How can I add a TokenFilter to StandardAnalyzer in Lucene? Or is there another Analyzer that does the same thing, only allows me to also use a TokenFilter? I have a TokenFilter for Hunspell in C# which I am not sure where/how to plug in the process…
Vladan Strigo
  • 551
  • 1
  • 7
  • 19
0
votes
1 answer

Lucene QueryParser inconsistent behaviour

The following program: import java.util.Arrays; import java.util.List; import org.apache.lucene.analysis.standard.StandardAnalyzer; import org.apache.lucene.queryParser.ParseException; import org.apache.lucene.queryParser.QueryParser; import…
HenryR
  • 8,219
  • 7
  • 35
  • 39
0
votes
1 answer

How to create and add values to a standard lowercase analyzer in Elasticsearch

I've been around the houses with this for the past few days, trying things in various orders, but can't figure out why it's not working. I am trying to create an index in Elasticsearch with an analyzer which is the same as the "standard" analyzer but…
EHarpham
  • 602
  • 1
  • 17
  • 34
0
votes
1 answer

Lucene BooleanQuery wrong result

I created a Lucene RAMDirectory to collect data from different sources and make them quickly searchable. I spent many hours trying to understand the different analyzers and index strategies, but in some cases the query result is not what I expected. Here is a…
Dimnox
  • 441
  • 1
  • 6
  • 13
0
votes
1 answer

Preserving emails while tokenizing based on . with Lucene

I would like to tokenize strings based on . , ; etc. but preserve email addresses, IP addresses and the like. How do I use an analyzer with Lucene to do this? The following code, which I found on Stack Overflow, does not preserve…
STEMExchanger
  • 31
  • 1
  • 6
0
votes
3 answers

Duke - org.apache.lucene.analysis.standard.StandardAnalyzer

https://github.com/larsga/Duke - I am using Duke for data deduplication. I have set up Duke (the Duke jar as well as the Lucene jars are added to the classpath). Sample example in the github-…
Soundarya Thiagarajan
  • 574
  • 2
  • 13
  • 31