What is this website?

In this Post

There was a break, of course, in the development of this application, which lasted over ten years. But now my “bad habit” of counting words has returned. For one thing is certain – it is something that keeps you busy.


Some things take a while to build, and completing the project you are now looking at took me almost 20 years. It started as research that I did back in my university days.

I was intrigued by the rare, obscure, uncommon and problematic words that are sometimes found in the oldest Indian poetry. What can you really do with them, these sacred riddles? Well, I thought, how about we start gathering some statistics about hapax legomena? And so it went, on and on, counting words in Sanskrit texts. I got kind of accustomed to it.

After the University got rid of me, I continued that pursuit for a while in France, and my “quantitative Vedic studies”, let us say, only caused smiles. With the exception of one person, Dr. Nina Alexeyeva, who generously shares her vast and unbelievable knowledge of Mathematical Statistics with her numerous students, nobody was interested in what I was doing. This is how my research became her research, and so it still is.

What we have here

Free online Keyword Extractor and Text Analyzer

Anonymous ChatGPT detector


  1. Python and R scripts to mine collections of texts for keywords, based on a linguistically motivated statistical method. The scripts are called corpus_utils.

  2. A Plotly JavaScript viewer app to interactively view word properties that were mined with the help of corpus_utils. This JS viewer is called semascope.

Alexander Sotov

Text: Alexandre Sotov
Comments or Questions? Contact me on LinkedIn

