Modern biological databases comprise not only data but also sophisticated query facilities and bioinformatics data analysis tools. This book explores the world of bioinformatics database systems.
The book summarizes the popular and innovative bioinformatics repositories currently available, including primary genetic and protein sequence databases, phylogenetic databases, structure and pathway databases, microarray databases, and boutique databases. It also explores the data quality and information integration issues involved in managing bioinformatics databases, including quality problems that have been observed in practice and efforts in the data cleaning field.
Biological data integration issues are also covered in depth, and the book demonstrates how data integration can create new repositories to address the needs of the biological communities. It also presents typical data integration architectures employed in current bioinformatics databases.
The latter part of the book covers biological data mining and biological data processing approaches using cloud-based technologies. General data mining approaches are discussed, as well as specific data mining methodologies that have been successfully deployed in biological data mining applications. Two biological data mining case studies are also included to illustrate how data, query, and analysis methods are integrated into user-friendly systems.
Aimed at researchers and developers of bioinformatics database systems, the book is also useful as a supplementary textbook for a one-semester upper-level undergraduate course, or an introductory graduate bioinformatics course.
Table of Contents
OVERVIEW OF BIOINFORMATICS DATABASES. Introduction. Sequence Databases. Phylogenetic Databases. Structure And Pathway Databases. Microarray And Boutique Bioinformatics Databases.
BIOLOGICAL DATA CLEANING. Introduction. General Data Cleaning. A Case Study In Biological Data Cleaning.
BIOLOGICAL DATA INTEGRATION. Introduction. General Data Integration. Three Areas In Biological Data Integration.
BIOLOGICAL DATA SEARCHING. Introduction. Biological Data Searching Using BLAST. Biological Data Searching Using The UCSC Genome Browser And BLAT. A Case Study In Phylogenetic Tree Database Search. A Case Study In RNA Pseudoknot Database Search.
BIOLOGICAL DATA MINING. Introduction. General Data Mining. Biological Data Mining. A Case Study In Biological Pattern Discovery. A Case Study In Biological Data Mining.
BIOLOGICAL NETWORK INFERENCE. Introduction. Gene Regulatory Network Inference.
CLOUD-BASED BIOLOGICAL DATA PROCESSING. Introduction. Data Processing In The Cloud. Biological Data Processing In The Cloud.
Kevin Byron is a PhD candidate in the Department of Computer Science at the New Jersey Institute of Technology.
Katherine G. Herbert is Associate Professor of Computer Science at Montclair State University.
Jason T.L. Wang is Professor of Bioinformatics and Computer Science at the New Jersey Institute of Technology.