Text Mining and Visualization

Case Studies Using Open-Source Tools, 1st Edition

Edited by Markus Hofmann, Andrew Chisholm

Chapman and Hall/CRC

297 pages | 186 B/W Illus.

Hardback: ISBN 9781482237573, published 2015-12-18
eBook (VitalSource): ISBN 9780429161971, published 2016-01-05


Text Mining and Visualization: Case Studies Using Open-Source Tools provides an introduction to text mining using some of the most popular and powerful open-source tools: KNIME, RapidMiner, Weka, R, and Python.

The contributors—all highly experienced with text mining and open-source software—explain how text data are gathered and processed from a wide variety of sources, including books, server access logs, websites, social media sites, and message boards. Each chapter presents a case study that you can follow as part of a step-by-step, reproducible example. You can also easily apply and extend the techniques to other problems. All the examples are available on a supplementary website.

The book shows you how to exploit your text data, offering successful application examples and blueprints that help you tackle your own text mining tasks with open, freely available tools. It brings you up to date on the latest and most powerful tools, the data mining process, and specific text mining activities.


"The timing of this book could not be better. It focuses on text mining, text being one of the data sources still to be truly harvested, and on open-source tools for the analysis and visualization of textual data. … Markus and Andrew have done an outstanding job bringing together this volume of both introductory and advanced material about text mining using modern open-source technology in a highly accessible way."

—From the Foreword by Professor Dr. Michael Berthold, University of Konstanz, Germany

Table of Contents


RapidMiner for Text Analytic Fundamentals

John Ryan

Empirical Zipf-Mandelbrot Variation for Sequential Windows within Documents

Andrew Chisholm


Introduction to the KNIME Text Processing Extension

Kilian Thiel

Social Media Analysis — Text Mining Meets Network Mining

Kilian Thiel, Tobias Kötter, Rosaria Silipo, and Phil Winters


Mining Unstructured User Reviews with Python

Brian Carter

Sentiment Classification and Visualization of Product Review Data

Alexander Piazza and Pavlina Davcheva

Mining Search Logs for Usage Patterns

Tony Russell-Rose and Paul Clough

Temporally Aware Online News Mining and Visualization with Python

Kyle Goslin

Text Classification Using Python

David Colton


Sentiment Analysis of Stock Market Behavior from Twitter Using the R Tool

Nuno Oliveira, Paulo Cortez, and Nelson Areal

Topic Modeling

Patrick Buckley

Empirical Analysis of the Stack Overflow Tags Network

Christos Iraklis Tsatsoulis

About the Editors

Markus Hofmann is a lecturer at the Institute of Technology Blanchardstown, where he focuses on the areas of data mining, text mining, data exploration and visualization, and business intelligence. Dr. Hofmann has also worked as a technology expert with 20 different organizations, such as Intel. He earned a PhD from Trinity College Dublin, an MSc in computing from the Dublin Institute of Technology, and a BA in information management systems.

Andrew Chisholm is a certified RapidMiner Master who created both basic and advanced RapidMiner video training content for RapidMinerResources.com. He has worked as a software developer, systems integrator, project manager, solution architect, customer-facing presales consultant, and strategic consultant. He earned an MSc in business intelligence and data mining from the Institute of Technology Blanchardstown and an MA in physics from Oxford University.

About the Series

Chapman & Hall/CRC Data Mining and Knowledge Discovery Series

Subject Categories

BISAC Subject Codes/Headings:
COMPUTERS / Programming / Games
COMPUTERS / Database Management / Data Mining
COMPUTERS / Machine Theory