Application Areas: Apache Lucene/Solr

Course Syllabus

As improvements are made to the course, the actual syllabus may differ in part from what is listed below.

  • Principles of Search
    • Sample search applications and use cases
  • Solr Terminology
    • Key terms and definitions
  • An Introduction to Solr Search
    • Request handlers and response writers; sample searches; Solr’s Admin application; a survey of what happens in a running Solr application. We will also look at the VelocityResponseWriter to see an example of one of the many ways that Solr can return results.
  • Solr Features Overview
    • Main Solr features and characteristics
  • Going Further with Solr Search
    • An overview of search parameters, Solr query syntax, filter and range queries, result sets
    • Executing searches using the SolrJ API to query the Solr server from within a Java application
    • Dismax Query Parser
  • Lucene as the Underlying Search Library
    • Lucene internals; Nutch and Mahout; using the Luke utility to understand the internals of a Lucene index
  • Basic Solr Architecture
    • Details of the file system structure of the Solr home directory, update handlers, admin console, and replication
  • Designing Solr Applications
    • Project definition, data model, data sources, application requirements, designing Solr schema, and using qualifiers to create controlled entry points for searching
  • Solr Configuration
    • Using Solr config files and schema.xml files, how to use the admin console’s Schema Browser Tool
  • Introduction to Analysis
    • Solr’s analysis process and how to use analyzers and token filters; use of the admin analysis tool to see how the analysis stack (created to tokenize content) affects a query
    • Findability: How configuration files and decisions will impact the findability and relevancy of your search results; filters used in the analysis stack and the direct impact on relevance of search results.
  • Understanding Relevance
    • Determining relevance quality, using scoring models, payloads, and debugging relevance issues
    • How to use a query-time field boost to improve relevance
  • Findability and Domain Knowledge
    • How to plan and implement for best findability
    • Comparing text and string field types for facets; designing a field for faceting that provides a way to navigate the index
    • Building a query filter: adding a query filter that will only return Microsoft Word documents as results
    • Using and configuring synonyms to improve findability
  • Solr Indexing
    • Indexing document files into the Solr index; how Solr XML documents are POSTed to Solr, and the details of the XML structure
    • Batch Indexing – using a simple command-line indexing application
  • Using Solr’s Data Import Handler
    • Use of the DataImportHandler as a workflow engine for fetching and indexing data; applying the Data Import Handler to import an RSS feed into an index
  • Faceted Searching with Solr
    • Faceted Search – how to implement faceting in Solr, using SolrJ
    • Planning for Faceting – evaluating your domain-specific needs for faceted search; configuring index fields in schema.xml so they are available for faceted searches
  • Other Search Features and Solr Contrib
    • Enable and configure highlighting, processing highlighted fragments that are returned
    • Query elevation and boosting
    • Clustering content dynamically to organize collections of documents into thematic categories
    • Building a spell-checking index; adding fields and a field type to the schema to use the SpellCheckComponent
  • Japanese Support (Basis Technology-created supplemental course material to the Lucid Imagination course material)
    • N-gram and Morphological Analysis
    • Solr Integration with the Rosette linguistics platform
  • Introduction to Solr Administration and Operations
    • Index maintenance and back-up
    • Replication, monitoring, and scaling Solr
    • Using JConsole to monitor the JVM for a running indexing job and the Solr server
  • Troubleshooting Your Search Application
    • Diagnosing and fixing common problems, performance and user search result issues
  • Open Source Community
    • How it works, how to participate, and how to leverage resources
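
As a rough illustration of a few of the topics listed above (filter queries, the DisMax query parser, query-time field boosts, faceting, and the XML format used to POST documents to Solr), the sketch below builds a search URL and an update message using only the Python standard library. It is not taken from the course material. The host, the core name `example_core`, and the field names `title`, `body`, `author`, and `content_type` are all hypothetical, and the filter assumes that Word documents were indexed with the MIME type `application/msword` in `content_type`.

```python
from urllib.parse import urlencode, parse_qs, urlsplit
import xml.etree.ElementTree as ET

# Hypothetical endpoint: host, port, and core name are assumptions.
SOLR_SELECT_URL = "http://localhost:8983/solr/example_core/select"


def build_select_url(user_query, base_url=SOLR_SELECT_URL):
    """Build a Solr search URL with a filter query, a field boost, and a facet."""
    params = [
        ("q", user_query),
        ("defType", "dismax"),        # select the DisMax query parser
        ("qf", "title^2.0 body"),     # query-time boost: matches in title weigh more
        # Filter query: restrict results to Word documents (assumed field/value).
        ("fq", 'content_type:"application/msword"'),
        ("facet", "true"),            # turn faceting on
        ("facet.field", "author"),    # facet on a hypothetical author field
        ("rows", "10"),
        ("wt", "json"),               # ask for a JSON response
    ]
    return base_url + "?" + urlencode(params)


def build_add_xml(fields):
    """Build a Solr XML <add> message for one document from a dict of fields."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields.items():
        field = ET.SubElement(doc, "field", name=name)
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    print(build_select_url("apache solr"))
    print(build_add_xml({"id": "doc-1", "title": "Solr & Lucene"}))
```

Neither function contacts a server. The URL could be fetched with any HTTP client, and the XML string could be POSTed to the core's update handler, which is where the schema and analysis settings covered in the course would actually take effect.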