Text Mining and Visualization: Case Studies Using Open-Source Tools provides an introduction to text mining using some of the most popular and powerful open-source tools: KNIME, RapidMiner, Weka, R, and Python.
The contributors—all highly experienced with text mining and open-source software—explain how text data are gathered and processed from a wide variety of sources, including books, server access logs, websites, social media sites, and message boards. Each chapter presents a case study that you can follow as part of a step-by-step, reproducible example. You can also easily apply and extend the techniques to other problems. All the examples are available on a supplementary website.
The book shows you how to exploit your text data, offering successful application examples and blueprints that help you tackle your own text mining tasks with open, freely available tools. It gets you up to date on the latest and most powerful tools, the data mining process, and specific text mining activities.
"The timing of this book could not be better. It focuses on text mining, text being one of the data sources still to be truly harvested, and on open-source tools for the analysis and visualization of textual data. … Markus and Andrew have done an outstanding job bringing together this volume of both introductory and advanced material about text mining using modern open-source technology in a highly accessible way."
—From the Foreword by Professor Dr. Michael Berthold, University of Konstanz, Germany
RapidMiner for Text Analytic Fundamentals
Empirical Zipf-Mandelbrot Variation for Sequential Windows within Documents
Introduction to the KNIME Text Processing Extension
Social Media Analysis — Text Mining Meets Network Mining
Kilian Thiel, Tobias Kötter, Rosaria Silipo, and Phil Winters
Mining Unstructured User Reviews with Python
Sentiment Classification and Visualization of Product Review Data
Alexander Piazza and Pavlina Davcheva
Mining Search Logs for Usage Patterns
Tony Russell-Rose and Paul Clough
Temporally Aware Online News Mining and Visualization with Python
Text Classification Using Python
Sentiment Analysis of Stock Market Behavior from Twitter Using the R Tool
Nuno Oliveira, Paulo Cortez, and Nelson Areal
Empirical Analysis of the Stack Overflow Tags Network
Christos Iraklis Tsatsoulis