This post is a reading note on Lucene in Action, Second Edition, Chapter 3: Search.
Mind Map Summary
General Search Workflow

One important thing to remember is that the terms in the index are processed content, whereas the query entered by the user is raw content. If you construct queries entirely programmatically, you must ensure that the terms in your queries match the tokens produced by the analyzer used during indexing.
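To make the mismatch concrete, here is a minimal sketch in plain Java. It does not use Lucene's API: `analyze`, `matchesRawTerm`, and `matchesAnalyzed` are hypothetical names, and the toy analyzer (split on non-alphanumeric characters, then lowercase) is an assumption made only for illustration.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class AnalysisMismatchSketch {
    // Toy analyzer (an assumption, not Lucene's): split on non-alphanumerics, then lowercase.
    static Set<String> analyze(String text) {
        Set<String> tokens = new HashSet<>();
        for (String token : text.split("[^\\p{Alnum}]+")) {
            if (!token.isEmpty()) {
                tokens.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // Exact lookup with no analysis, like a term query built by hand.
    static boolean matchesRawTerm(Set<String> indexedTerms, String term) {
        return indexedTerms.contains(term);
    }

    // Analyze the user's text first, then require every resulting token to be indexed.
    static boolean matchesAnalyzed(Set<String> indexedTerms, String userText) {
        Set<String> queryTerms = analyze(userText);
        return !queryTerms.isEmpty() && indexedTerms.containsAll(queryTerms);
    }
}
```

With the title "Lucene in Action" indexed through this toy analyzer, the raw term "Lucene" finds nothing because the index holds "lucene", while the same text passed through `matchesAnalyzed` does match.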
Syntax Sugar Used in QueryParser

Note
- Lucene's Primary Searching API
- Classes
- IndexSearcher
- Query
- QueryParser
- TopDocs
- ScoreDoc
- Lucene's Scoring
- The score measures how similar the document is to the query, with higher scores reflecting stronger similarity and thus stronger matches.
- Search Queries
- TermQuery
- TermQuery is especially useful for retrieving documents by a key (used together with Index.NOT_ANALYZED).
- TermRangeQuery
- NumericRangeQuery
- PrefixQuery
- BooleanQuery
- BooleanQuery is powerful. It can be used to construct nested clauses.
- MUST -> AND
- SHOULD -> OR
- MUST_NOT -> NOT
- WildcardQuery
- FuzzyQuery
- There is syntax sugar for boolean query operators.
- During the indexing process, the raw contents are processed and then indexed. Therefore, the search query should be consistent with what is indexed.
- QueryParser is the only searching piece that uses an analyzer.
- If you construct queries entirely programmatically, you must ensure the terms included in all of your queries match the tokens produced by the analyzer used during indexing.
- This is one of the reasons why QueryParser is super convenient.
- IndexReader always searches a point-in-time snapshot of the index as it existed when the IndexReader was created.
- IndexReader has a method called reopen(). Creating an IndexReader instance is an expensive operation, so reopen() is the preferred way to pick up changes made to the index.
- TopDocs
- By default, results are sorted by score, in descending order.
- We can use other sorting criteria.
- Paging through results
- There is nothing special about paging. When we move to the next page, we increase the number of results requested from the search API and display only the hits that belong to that page.
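
The paging approach from the last item can be sketched in plain Java. This is only an illustration under stated assumptions: `searchTop` is a hypothetical stand-in for a search call that returns the best n hits, and the ranked list of document ids plays the role of the index. Neither is Lucene's API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class PagingSketch {
    // Hypothetical stand-in for a search call: returns the ids of the top-n hits, best first.
    static List<Integer> searchTop(List<Integer> rankedDocIds, int n) {
        int end = Math.min(n, rankedDocIds.size());
        return new ArrayList<>(rankedDocIds.subList(0, end));
    }

    // Returns the hits for a 1-based page by asking for more top hits and keeping the tail.
    static List<Integer> page(List<Integer> rankedDocIds, int pageNumber, int pageSize) {
        if (pageNumber < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber and pageSize must be positive");
        }
        // Request enough hits to cover every page up to and including this one.
        List<Integer> top = searchTop(rankedDocIds, pageNumber * pageSize);
        int start = (pageNumber - 1) * pageSize;
        if (start >= top.size()) {
            return Collections.emptyList(); // past the last page
        }
        // Skip the hits already shown on earlier pages.
        return new ArrayList<>(top.subList(start, top.size()));
    }
}
```

The search is simply re-run with a larger result count for each later page, which keeps the code stateless at the cost of repeating work for deep pages.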