Course Syllabus
As improvements are made to the course, the actual syllabus may differ in part from what is listed below.
-
Principles of Search
- Sample search applications and use cases
-
Solr Terminology
- Key terms and definitions
-
An Introduction to Solr Search
-
Request handlers and response writers; sample searches; Solr’s
Admin application; a survey of what happens in a running Solr
application. We will also take a look at the VelocityResponseWriter to see an
example of one of the many ways that Solr can return results.
-
Solr Features Overview
- Main Solr features and characteristics
-
Going Further with Solr Search
-
An overview of search parameters, Solr query syntax, filter and
range queries, result sets
-
Executing searches using the SolrJ API to query the Solr server
from within a Java application
- Dismax Query Parser
-
Lucene as the Underlying Search Library
-
Lucene internals, Nutch and Mahout, the Luke utility: understanding
the internals of a Lucene index
-
Basic Solr Architecture
-
Details of the file system structure of the Solr home directory,
update handlers, admin console, and replication
-
Designing Solr Applications
-
Project definition, data model, data sources, application
requirements, designing a Solr schema, and using qualifiers to create
controlled entry points for searching
-
Solr Configuration
-
Using Solr config files and schema.xml files, how to use the admin
console’s Schema Browser Tool
-
Introduction to Analysis
-
Solr’s analysis process and how to use analyzers and token filters;
use of the admin analysis tool to see how the analysis stack—created to
tokenize content—affects a query
-
Findability: How configuration files and decisions will impact the
findability and relevancy of your search results; filters used in the
analysis stack and their direct impact on the relevance of search
results.
-
Understanding Relevance
-
Determining relevance quality, using scoring models, payloads, and
debugging relevance issues
-
How to use a query-time field boost to improve relevance
-
Findability and Domain Knowledge
- How to plan and implement for best findability
-
Compare text and string field types for facets: designing a field
for faceting to provide a way to navigate the index
-
Building a query filter: adding a query filter that will only
return Microsoft Word documents as results
- Using and configuring synonyms to improve findability
-
Solr Indexing
-
Indexing document files into the Solr index; how Solr XML documents
are POSTed to Solr, and the details of the XML structure
- Batch Indexing – using a simple command-line indexing application
-
Using Solr’s Data Import Handler
-
Use of the DataImportHandler as a workflow engine for fetching and
indexing data, applying the Data Import Handler to import an RSS feed
into an index
-
Faceted Searching with Solr
- Faceted Search – how to implement faceting in Solr, using SolrJ
-
Planning for Faceting - evaluating your domain-specific needs for
faceted search; configuring index fields in schema.xml so they are
available for faceted searches
-
Other Search Features and Solr Contrib
-
Enable and configure highlighting, processing highlighted fragments
that are returned
- Query elevation and boosting
-
Clustering content dynamically to organize collections of documents
into thematic categories
-
Building a spell-checking index: adding fields and a field type to the
schema to use the SpellCheckComponent
-
Japanese Support (Basis Technology-created supplemental course material to
the Lucid Imagination course material)
- N-gram and Morphological Analysis
- Solr Integration with the Rosette linguistics platform
-
Introduction to Solr Administration and Operations
- Index maintenance and backup
- Replication, monitoring, and scaling Solr
-
Using JConsole to monitor the JVM for a running indexing job and
the Solr server
-
Troubleshooting Your Search Application
-
Diagnosing and fixing common problems, performance and user search
result issues
-
Open Source Community
-
How it works, how to participate, and how to leverage
resources