Topic Extractor
Identify keywords and significant phrases in your text, along with topics that are not explicitly named
Overview
Quickly get the gist of your content with topic extraction
Topic extraction discovers the keywords in documents or databases that capture the essence of the text. However, unlike categorization or entity extraction, topic extraction is not constrained by a finite list of recognized entity types or categories. Instead, the topic endpoint identifies “keyphrases” and “concepts” for the given input based on frequency and linguistic patterns in the text, ranking them according to their relative importance.
Topic extraction quickly lists the keyphrases and concepts to give you the gist of an article or document. On a macro level, the same principle can be applied to a corpus of documents to understand the major ideas. Knowing the keyphrases and concepts in each document enables users to automatically tag, sort, and organize their data, making it more useful to analysts and database managers.
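The tagging idea described above can be sketched as a small inverted index. This is only an illustration, assuming each document's result has the same "keyphrases" and "concepts" shape as the /topics example on this page. The function name `build_tag_index` and the document IDs are hypothetical and not part of Rosette.

```python
from collections import defaultdict


def build_tag_index(topics_by_doc):
    """Map each keyphrase or concept (lowercased) to the documents it tags.

    topics_by_doc: dict of document ID -> parsed topics result, where each
    result may contain "keyphrases" and "concepts" lists of {"phrase": ...}.
    """
    index = defaultdict(set)
    for doc_id, topics in topics_by_doc.items():
        for kind in ("keyphrases", "concepts"):
            for item in topics.get(kind, []):
                index[item["phrase"].lower()].add(doc_id)
    # Sort the document IDs so the output is stable and easy to read.
    return {tag: sorted(doc_ids) for tag, doc_ids in index.items()}


if __name__ == "__main__":
    # Hypothetical corpus of two documents; the results are hand-written here.
    sample = {
        "keats.txt": {
            "keyphrases": [{"phrase": "English Romantic poet"}],
            "concepts": [{"phrase": "John Keats", "conceptId": "Q82083"}],
        },
        "other.txt": {
            "keyphrases": [{"phrase": "English Romantic poet"}],
            "concepts": [],
        },
    }
    for tag, doc_ids in sorted(build_tag_index(sample).items()):
        print(tag, doc_ids)
```

An index like this lets you look up every document that shares a tag, which is one simple way to sort or group a corpus by topic.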
Keyphrases versus concepts
Keyphrases are significant phrases or words quoted from the text that Rosette® deems to be representative of the content. Rosette identifies them based on frequency, taking into account common stop words like “and” or “that,” as well as language-specific statistical patterns of where keywords are likely to appear. Concepts are themes detected within the text that may not be explicitly named. For example, an article about the Super Bowl may have the concept of “sports” or “American football,” even if neither term appears in the text.
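To make the role of frequency and stop words concrete, here is a toy sketch that ranks single words by how often they occur after stop words are removed. It is not Rosette's algorithm, which also relies on linguistic and statistical patterns. The stop-word list and the function name `rank_terms` are assumptions made for illustration.

```python
import re
from collections import Counter

# A tiny, hand-picked stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "of", "that", "the", "to"}


def rank_terms(text, top_n=3):
    """Return the top_n most frequent non-stop-words, most frequent first."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(word for word in words if word not in STOP_WORDS)
    return [word for word, _ in counts.most_common(top_n)]


if __name__ == "__main__":
    text = (
        "The team won the game and the team celebrated the game that night. "
        "The team is happy."
    )
    print(rank_terms(text, top_n=2))
```

Without the stop-word filter, “the” would dominate the ranking, which is why frequency alone is not a useful signal.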
Topic extraction is currently only available in English, but our on-premise tools can be custom-trained for new languages.
Product highlights
- English only
- Extracts keyphrases
- Identifies concepts
- Cloud or on-premise deployments
- Fast and scalable
- Industrial-strength support
- Constantly stress-tested and improved
Tech Specs
Availability and platform support
Deployment availability: | |
Bindings: |
Supported languages: English
/topics endpoint
Request:
{"content": "To Sleep John Keats, 1795 - 1821 O soft embalmer of the still midnight! Shutting with careful fingers and benign Our gloom-pleased eyes, embower’d from the light, Enshaded in forgetfulness divine; O soothest Sleep! if so it please thee, close, In midst of this thine hymn, my willing eyes, Or wait the amen, ere thy poppy throws Around my bed its lulling charities; Then save me, or the passèd day will shine Upon my pillow, breeding many woes; Save me from curious conscience, that still lords Its strength for darkness, burrowing like a mole; Turn the key deftly in the oilèd wards, And seal the hushèd casket of my soul. - John Keats This poem is in the public domain. John Keats Born in 1795, John Keats was an English Romantic poet and author of three poems considered to be among the finest in the English language."}

Response:
{"keyphrases": [{"phrase": "lulling charities"}, {"phrase": "O soothest Sleep"}, {"phrase": "John Keats"}, {"phrase": "O soft embalmer"}, {"phrase": "hushèd casket"}, {"phrase": "English Romantic poet"}, {"phrase": "forgetfulness divine"}, {"phrase": "pleased eyes"}, {"phrase": "passèd day"}, {"phrase": "oilèd wards"}], "concepts": [{"phrase": "John Keats", "conceptId": "Q82083"}]}
Topic Extraction Example
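A minimal sketch of calling the endpoint from Python, using only the standard library, might look like the following. The request and response shapes come from the example above. The base URL and the API-key header name are assumptions and should be checked against the Rosette documentation before use, and the function names are hypothetical.

```python
import json
import urllib.request

# Assumed values, not confirmed by this page: verify both in the Rosette docs.
ASSUMED_TOPICS_URL = "https://api.rosette.com/rest/v1/topics"
ASSUMED_KEY_HEADER = "X-RosetteAPI-Key"


def build_topics_request(content, api_key):
    """Build (but do not send) a POST request carrying {"content": ...}."""
    body = json.dumps({"content": content}).encode("utf-8")
    return urllib.request.Request(
        ASSUMED_TOPICS_URL,
        data=body,
        headers={
            ASSUMED_KEY_HEADER: api_key,
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


def parse_topics_response(raw):
    """Return (keyphrases, concepts) from a response shaped like the example.

    keyphrases is a list of strings; concepts is a list of
    (phrase, conceptId) tuples, with None when no conceptId is present.
    """
    data = json.loads(raw)
    keyphrases = [item["phrase"] for item in data.get("keyphrases", [])]
    concepts = [
        (item["phrase"], item.get("conceptId"))
        for item in data.get("concepts", [])
    ]
    return keyphrases, concepts


def extract_topics(content, api_key):
    """Send the request over the network and parse the JSON reply."""
    request = build_topics_request(content, api_key)
    with urllib.request.urlopen(request) as response:
        return parse_topics_response(response.read().decode("utf-8"))
```

Keeping request building and response parsing separate from the network call makes both easy to check offline, for example by feeding `parse_topics_response` the sample response shown above.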
Deployment
Rosette Cloud
Sign up today for a free 30-day trial
The SaaS version of Rosette is quick to implement, low maintenance, and ideal for users who wish to pay based on monthly call volume. It supports numerous bindings through a RESTful API.
Rosette Server Edition
This on-premise private cloud deployment puts all the functionality of Rosette Cloud behind your secure firewall, and enables advanced user settings, access to custom profiles (user-specific configuration setups), and deployment of custom models.
Rosette Java Edition
For on-premise systems that need the low-latency, high-speed integration of an SDK, Rosette Java is the way to go. It has been deployed in the most demanding, high-transaction environments, including web search engines, financial compliance, and border security.
Rosette Plugins
Just plug in Rosette for instant high-accuracy multilingual search and fuzzy name search for Elasticsearch or Apache Solr.
Quality documentation and support
Our support team responds to customers in less than a business day, and is committed to a satisfactory resolution. Users have access to in-depth documentation describing all the features, with code examples and a searchable knowledge base.
Visit our GitHub for bindings and documentation.
Questions?
Email: info@basistech.com
Phone: +1-617-386-2000