Filters: k≤, p≤, DF≤, FR≥, Cluster≥

Moby Dick by Herman Melville

Download dataset Download text


Help and Tutorial

Each dot is a word. Point to a dot to view its word. Rotate the plot; the controls appear in the top right corner. Press Heroes, then Background, and then Background again. Press Reduce to see less common words by reducing DF. Use Search to look for a whole word or part of one. Use the Filters Control to filter words. Press All to load all data. Reload the page for a hard reset.

DF is Document Frequency. Parameters k and p take decimal values. Use the Cluster field to enter a cluster number.
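To make the DF value concrete, here is a minimal Python sketch of how document frequency is commonly computed: the number of documents in which a word occurs at least once. The page does not say how the book was split into documents, so the fixed-size chunking and the names `split_into_documents` and `document_frequency` are assumptions made for illustration only.

```python
import re
from collections import Counter


def split_into_documents(text, words_per_doc=1000):
    """Split a text into consecutive chunks of words (hypothetical chunking scheme)."""
    words = re.findall(r"[a-z']+", text.lower())
    return [words[i:i + words_per_doc] for i in range(0, len(words), words_per_doc)]


def document_frequency(documents):
    """For each word, count the documents that contain it at least once."""
    df = Counter()
    for doc in documents:
        df.update(set(doc))  # set() so repeats within one document count once
    return df


# Tiny made-up sample; a real run would use the full downloaded text.
sample = "Call me Ishmael. The whale, the white whale! Ishmael saw the sea."
docs = split_into_documents(sample, words_per_doc=4)
df = document_frequency(docs)
print(df.most_common(3))
```

Under this definition, lowering the DF threshold keeps only words that appear in fewer documents, which is consistent with Reduce revealing less common words.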

Words are clustered automatically, based on their estimated distribution parameters. Details are in our research. The data was prepared with corpus_utils.


You can mine keywords from any text with our Keyword Extractor and Text Analyzer.


Paste text into Keyword Extractor, run the tests, and download the results. Then upload the CSV file to Semascope, the 3D plot viewer.
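A downloaded results file could also be inspected locally before uploading. The sketch below reads a CSV and applies thresholds in the same direction as the filter fields (upper bounds on k, p, and DF; lower bounds on FR and Cluster). The column names, the sample values, and the `filter_words` helper are all hypothetical; the real file may use different headers, and FR is treated here as an opaque numeric column because the page does not define it.

```python
import csv
import io

# Made-up sample rows standing in for a downloaded results file.
SAMPLE_CSV = """word,k,p,DF,FR,Cluster
whale,0.8,0.05,120,1200,3
ahab,0.4,0.10,90,510,2
the,5.2,0.01,135,14000,1
harpoon,0.3,0.20,40,75,3
"""


def filter_words(rows, k_max=None, p_max=None, df_max=None,
                 fr_min=None, cluster_min=None):
    """Return the words whose rows pass every threshold that is set."""
    kept = []
    for row in rows:
        if k_max is not None and float(row["k"]) > k_max:
            continue
        if p_max is not None and float(row["p"]) > p_max:
            continue
        if df_max is not None and float(row["DF"]) > df_max:
            continue
        if fr_min is not None and float(row["FR"]) < fr_min:
            continue
        if cluster_min is not None and int(row["Cluster"]) < cluster_min:
            continue
        kept.append(row["word"])
    return kept


# For a real file, replace io.StringIO(...) with open(path, newline="").
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(filter_words(rows, df_max=100, fr_min=100))
print(filter_words(rows, cluster_min=3))
```

Leaving a threshold as `None` skips that filter, which mirrors leaving a filter field empty.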

You can download the entire text analyzed here by pressing Download text. Copy the text and paste it into Keyword Extractor and Text Analyzer to mine the book interactively.

Text mining is a computational technique for analyzing large bodies of text and extracting information that is not immediately evident. By applying algorithms and statistical models, text mining can help uncover hidden knowledge: patterns, relationships, and insights not apparent to the human reader. Read how to mine religious and literary texts.
