An emerging topic in software engineering and data mining, specification mining tackles software maintenance and reliability issues that cost economies billions of dollars each year. The first unified reference on the subject, Mining Software Specifications: Methodologies and Applications describes recent approaches for mining specifications of software systems. Experts in the field illustrate how to apply state-of-the-art data mining and machine learning techniques to address software engineering concerns.
In the first set of chapters, the book introduces a number of studies on mining finite state machines that employ techniques such as grammar inference, partial order mining, source code model checking, and abstract interpretation. The remaining chapters present research on mining temporal rules and patterns, covering techniques that include path-aware static program analyses, lightweight rule and pattern mining, and statistical analysis. Throughout the book, the authors discuss how to employ dynamic analysis, static analysis, and combinations of both to mine software specifications.
According to a 2002 report by the US National Institute of Standards and Technology, software bugs cost the US economy an estimated 59.5 billion dollars a year. This volume shows how specification mining can help find bugs and improve program understanding, thereby reducing unnecessary financial losses. The book encourages industry adoption of specification mining techniques and their integration into standard integrated development environments (IDEs).
Table of Contents
Specification Mining: A Concise Introduction, David Lo, Siau-Cheng Khoo, Chao Liu, and Jiawei Han
Mining Finite-State Automata with Annotations, Leonardo Mariani, Fabrizio Pastore, Mauro Pezzè, and Mauro Santoro
Adapting Grammar Inference Techniques to Mine State Machines, Neil Walkinshaw and Kirill Bogdanov
Mining API Usage Protocols from Large Method Traces, Michael Pradel and Thomas R. Gross
Static API Specification Mining: Exploiting Source Code Model Checking, Mithun Acharya and Tao Xie
Static Specification Mining Using Automata-Based Abstractions, Eran Yahav, Sharon Shoham, Stephen Fink, and Marco Pistoia
DynaMine: Finding Usage Patterns and Their Violations by Mining Software Repositories, Benjamin Livshits and Thomas Zimmermann
Automatic Inference and Effective Application of Temporal Specifications, Jinlin Yang and David Evans
Path-Aware Static Program Analyses for Specification Mining, Muralikrishna Ramanathan, Ananth Grama, and Suresh Jagannathan
Mining API Usage Specifications via Searching Source Code from the Web, Suresh Thummalapenta, Tao Xie, and Madhuri R. Marri
Merlin: Specification Inference for Explicit Information Flow Problems, Benjamin Livshits, Aditya V. Nori, Sriram K. Rajamani, and Anindya Banerjee
Lightweight Mining of Object Usage, Andrzej Wasylkowski and Andreas Zeller
David Lo is an assistant professor in the School of Information Systems at Singapore Management University. His research interests include specification mining, dynamic program analysis, automated debugging, code search, and pattern mining.
Siau-Cheng Khoo is an associate professor in the Department of Computer Science at the National University of Singapore. His research interests include specification mining, program analysis, program transformation, functional programming, domain-specific languages, and aspect-oriented programming.
Jiawei Han is a professor in the Department of Computer Science at the University of Illinois at Urbana-Champaign. He is editor-in-chief of the ACM Transactions on Knowledge Discovery from Data and co-editor of Geographic Data Mining and Knowledge Discovery, Second Edition (CRC Press, 2009) and Next Generation of Data Mining (CRC Press, 2009). His research interests include information network analysis, knowledge discovery, pattern discovery, data streams, and multidimensional analysis.
Chao Liu is a researcher in the Internet Service Research Center at Microsoft Research. His research interests include data mining for software engineering, statistical debugging, and machine learning and its use in web applications.