Text Mining and Visualization: Case Studies Using Open-Source Tools provides an introduction to text mining using some of the most popular and powerful open-source tools: KNIME, RapidMiner, Weka, R, and Python.
The contributors—all highly experienced with text mining and open-source software—explain how text data are gathered and processed from a wide variety of sources, including books, server access logs, websites, social media sites, and message boards. Each chapter presents a case study that you can follow as part of a step-by-step, reproducible example. You can also easily apply and extend the techniques to other problems. All the examples are available on a supplementary website.
The book shows you how to exploit your text data, offering successful application examples and blueprints that help you tackle your own text mining tasks with open, freely available tools. It gets you up to date on the latest and most powerful tools, the data mining process, and specific text mining activities.
Markus Hofmann is a lecturer at the Institute of Technology Blanchardstown, where he focuses on the areas of data mining, text mining, data exploration and visualization, and business intelligence. Dr. Hofmann has also worked as a technology expert with 20 different organizations, such as Intel. He earned a PhD from Trinity College Dublin, an MSc in computing from the Dublin Institute of Technology, and a BA in information management systems.
Andrew Chisholm is a certified RapidMiner Master who created both basic and advanced RapidMiner video training content for RapidMinerResources.com. He has worked as a software developer, systems integrator, project manager, solution architect, customer-facing presales consultant, and strategic consultant. He earned an MSc in business intelligence and data mining from the Institute of Technology Blanchardstown and an MA in physics from Oxford University.
"The timing of this book could not be better. It focuses on text mining, text being one of the data sources still to be truly harvested, and on open-source tools for the analysis and visualization of textual data. … Markus and Andrew have done an outstanding job bringing together this volume of both introductory and advanced material about text mining using modern open-source technology in a highly accessible way."
—From the Foreword by Professor Dr. Michael Berthold, University of Konstanz, Germany