1st Edition
Empirical Research in Software Engineering: Concepts, Analysis, and Applications
Empirical research has become an essential component of software engineering, yet software practitioners and researchers often lack an understanding of how empirical procedures and practices are applied in the field. Empirical Research in Software Engineering: Concepts, Analysis, and Applications shows how to implement empirical research processes, procedures, and practices in software engineering.
Written by a leading researcher in empirical software engineering, the book describes the necessary steps to perform replicated and empirical research. It explains how to plan and design experiments, conduct systematic reviews and case studies, and analyze the results produced by the empirical studies.
The book balances empirical research concepts with exercises, examples, and real-life case studies, making it suitable for a course on empirical software engineering. The author discusses the process of developing predictive models, such as defect prediction and change prediction, on data collected from source code repositories. She also covers the application of machine learning techniques in empirical software engineering, includes guidelines for publishing and reporting results, and presents popular software tools for carrying out empirical studies.
Introduction
What Is Empirical Software Engineering?
Overview of Empirical Studies
Types of Empirical Studies
Empirical Study Process
Ethics of Empirical Research
Importance of Empirical Research
Basic Elements of Empirical Research
Some Terminologies
Concluding Remarks
Systematic Literature Reviews
Basic Concepts
Case Study
Planning the Review
Methods for Presenting Results
Conducting the Review
Reporting the Review
SRs in Software Engineering
Software Metrics
Introduction
Measurement Basics
Measuring Size
Measuring Software Quality
OO Metrics
Dynamic Software Metrics
System Evolution and Evolutionary Metrics
Validation of Metrics
Practical Relevance
Experimental Design
Overview of Experimental Design
Case Study: Fault Prediction Systems
Research Questions
Reviewing the Literature
Research Variables
Terminology Used in Study Types
Hypothesis Formulation
Data Collection
Selection of Data Analysis Methods
Mining Data from Software Repositories
Configuration Management Systems
Importance of Mining Software Repositories
Common Types of Software Repositories
Understanding Systems
Version Control Systems
Bug Tracking Systems
Extracting Data from Software Repositories
Static Source Code Analysis
Software Historical Analysis
Software Engineering Repositories and Open Research Data Sets
Case Study: Defect Collection and Reporting System for Git Repository
Data Analysis and Statistical Testing
Analyzing the Metric Data
Attribute Reduction Methods
Hypothesis Testing
Statistical Testing
Example—Univariate Analysis Results for Fault Prediction System
Model Development and Interpretation
Model Development
Statistical Multiple Regression Techniques
ML Techniques
Concerns in Model Prediction
Performance Measures for Categorical Dependent Variable
Performance Measures for Continuous Dependent Variable
Cross-Validation
Model Comparison Tests
Interpreting the Results
Example—Comparing ML Techniques for Fault Prediction
Validity Threats
Categories of Threats to Validity
Example—Threats to Validity in Fault Prediction System
Threats and Their Countermeasures
Reporting Results
Reporting and Presenting Results
Guidelines for Masters and Doctoral Students
Research Ethics and Misconduct
Mining Unstructured Data
Introduction
Steps in Text Mining
Applications of Text Mining in Software Engineering
Example—Automated Severity Assessment of Software Defect Reports
Demonstrating Empirical Procedures
Abstract
Introduction
Related Work
Experimental Design
Research Methodology
Analysis Results
Discussion and Interpretation of Results
Validity Evaluation
Conclusions and Future Work
Appendix
Tools for Analyzing Data
WEKA
KEEL
SPSS
MATLAB
R
Comparison of Tools
Appendix
References
Index
Exercises and Further Reading appear at the end of most chapters.
Biography
Ruchika Malhotra is an assistant professor in the Department of Software Engineering at Delhi Technological University (formerly Delhi College of Engineering). She was awarded the prestigious UGC Raman Fellowship for pursuing post-doctoral research in the Department of Computer and Information Science at Indiana University–Purdue University. She received her master’s and doctorate degrees in software engineering from the University School of Information Technology of Guru Gobind Singh Indraprastha University. She received the IBM Best Faculty Award in 2013 and has published more than 100 research papers in international journals and conferences. Her research interests include software testing, improving software quality, statistical and adaptive prediction models, software metrics, neural network modeling, and the definition and validation of software metrics.
"In this book, Dr. Malhotra uses her breadth of software engineering experience and expertise to give the reader coverage of many aspects of empirical software engineering. She covers the essential techniques and concepts needed for a researcher to get started on empirical software engineering research, including metrics, experimental design, analysis and statistical techniques, threats to the validity of any research findings, and methods and tools for empirical software engineering research. … The book provides the reader with an introduction and overview of the field and is also backed by references to the literature, allowing the interested reader to follow up on the methods, tools, and concepts described."
—From the Foreword by Mark Harman, University College London