An emerging topic in software engineering and data mining, specification mining tackles software maintenance and reliability issues that cost economies billions of dollars each year. The first unified reference on the subject, Mining Software Specifications: Methodologies and Applications describes recent approaches for mining specifications of software systems. Experts in the field illustrate how to apply state-of-the-art data mining and machine learning techniques to address software engineering concerns.
In the first set of chapters, the book introduces a number of studies on mining finite state machines that employ techniques such as grammar inference, partial order mining, source code model checking, and abstract interpretation. The remaining chapters present research on mining temporal rules and patterns, covering techniques that include path-aware static program analyses, lightweight rule/pattern mining, and statistical analysis. Throughout the book, the authors discuss how to employ dynamic analysis, static analysis, and combinations of both to mine software specifications.
According to a 2002 study by the US National Institute of Standards and Technology, software bugs cost the US economy an estimated 59.5 billion dollars a year. This volume shows how specification mining can help find bugs and improve program understanding, thereby reducing unnecessary financial losses. The book encourages industry adoption of specification mining techniques and their integration into standard integrated development environments (IDEs).
Table of Contents
Specification Mining: A Concise Introduction
Mining Finite-State Automata with Annotations
Adapting Grammar Inference Techniques to Mine State Machines
Mining API Usage Protocols from Large Method Traces
Static API Specification Mining: Exploiting Source Code Model Checking
Static Specification Mining Using Automata-Based Abstractions
DynaMine: Finding Usage Patterns and Their Violations by Mining Software Repositories
Automatic Inference and Effective Application of Temporal Specifications
Path-Aware Static Program Analyses for Specification Mining
Mining API Usage Specifications via Searching Source Code from the Web
Merlin: Specification Inference for Explicit Information Flow Problems
Lightweight Mining of Object Usage
David Lo is an assistant professor in the School of Information Systems at Singapore Management University. His research interests include specification mining, dynamic program analysis, automated debugging, code search, and pattern mining.
Siau-Cheng Khoo is an associate professor in the Department of Computer Science at the National University of Singapore. His research interests include specification mining, program analysis, program transformation, functional programming, domain-specific languages, and aspect-oriented programming.
Jiawei Han is a professor in the Department of Computer Science at the University of Illinois at Urbana-Champaign. He is editor-in-chief of the ACM Transactions on Knowledge Discovery from Data and co-editor of Geographic Data Mining and Knowledge Discovery, Second Edition (CRC Press, 2009) and Next Generation of Data Mining (CRC Press, 2009). His research interests include information network analysis, knowledge discovery, pattern discovery, data streams, and multidimensional analysis.
Chao Liu is a researcher in the Internet Service Research Center at Microsoft Research. His research interests include data mining for software engineering, statistical debugging, and machine learning and its use in web applications.