Download dataset Download text
This is an analysis of about 2000 posts generated by ChatGPT. You can download the files from the links below. For comments and observations, see the blog post «The Intricate Tapestry of ChatGPT Posts: Why LLM overuses some words at the expense of the others?».
Help and Tutorial
Each dot is a word. Hover over a dot to view its word. Rotate the plot by dragging it; the plot controls appear in the top right corner. Press Heroes, then Background, and then Background again. Press Reduce to see less common words by reducing DF. Use Search to look for a whole word or part of a word. Use the Filters control to filter words. Press All to load all data. Reload the page for a hard reset.
DF is Document Frequency: the number of documents (here, posts) in which a word appears at least once. Parameters k and p are decimal numbers. Use the Cluster field to enter a cluster number.
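To make DF concrete, here is a minimal Python sketch that counts document frequency over a few sample posts. The function name document_frequency, the sample posts, and the simple lowercase tokenization are assumptions made for illustration; they are not the pipeline used to prepare this dataset, and corpus_utils may tokenize differently.

```python
import re
from collections import Counter


def document_frequency(documents):
    """Return a Counter mapping each word to the number of documents containing it."""
    df = Counter()
    for text in documents:
        # Assumed tokenization: lowercase runs of letters and apostrophes.
        # A set keeps each word once per document, so repeats inside
        # a single post do not inflate its DF.
        words = set(re.findall(r"[a-z']+", text.lower()))
        df.update(words)
    return df


# Hypothetical sample posts, for illustration only.
posts = [
    "A rich tapestry of ideas",
    "The tapestry is a tapestry",
    "Ideas delve into the unknown",
]

df = document_frequency(posts)
print(df["tapestry"])  # 2: appears in two posts, though three times in total
print(df["delve"])     # 1
```

Note that DF counts posts, not occurrences: "tapestry" occurs three times in the sample but has a DF of 2. Reducing DF in the viewer therefore surfaces words that appear in fewer posts.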
Words are clustered automatically, based on their estimated distribution parameters. See our research for details. The data was prepared with corpus_utils.
You can mine keywords from any text with our Keyword Extractor and Text Analyzer.
Paste the text into Keyword Extractor, run the tests, and download the results. Then upload the CSV file to Semascope, the 3D plot viewer.
You can download the entire text analyzed here by pressing Download text. Copy the text and paste it into Keyword Extractor and Text Analyzer to mine it interactively.
Text mining is a computational technique for analyzing large bodies of text and extracting information that is not obvious on the surface. By applying algorithms and statistical models, text mining can help uncover hidden knowledge: patterns, relationships, and insights not apparent to the human reader. Read how to mine religious and literary texts.