Modern biological databases comprise not only data, but also sophisticated query facilities and bioinformatics data analysis tools. This book explores the world of Bioinformatics Database Systems.
The book summarizes the popular and innovative bioinformatics repositories currently available, including primary genetic and protein sequence databases, phylogenetic databases, structure and pathway databases, microarray databases, and boutique databases. It also explores the data quality and information integration issues involved in managing bioinformatics databases, including data quality problems that have been observed and efforts in the data cleaning field.
Biological data integration issues are also covered in depth, and the book demonstrates how data integration can create new repositories to address the needs of the biological communities. It also presents typical data integration architectures employed in current bioinformatics databases.
The latter part of the book covers biological data mining and biological data processing approaches using cloud-based technologies. General data mining approaches are discussed, as well as specific data mining methodologies that have been successfully deployed in biological data mining applications. Two biological data mining case studies are also included to illustrate how data, query, and analysis methods are integrated into user-friendly systems.
Aimed at researchers and developers of bioinformatics database systems, the book is also useful as a supplementary textbook for a one-semester upper-level undergraduate course, or an introductory graduate bioinformatics course.
Table of Contents
OVERVIEW OF BIOINFORMATICS DATABASES
Structure And Pathway Databases
Microarray And Boutique Bioinformatics Databases
BIOLOGICAL DATA CLEANING
General Data Cleaning
A Case Study In Biological Data Cleaning
BIOLOGICAL DATA INTEGRATION
General Data Integration
Three Areas In Biological Data Integration
BIOLOGICAL DATA SEARCHING
Biological Data Searching Using BLAST
Biological Data Searching Using The UCSC Genome Browser And BLAT
A Case Study In Phylogenetic Tree Database Search
A Case Study In RNA Pseudoknot Database Search
BIOLOGICAL DATA MINING
General Data Mining
Biological Data Mining
A Case Study In Biological Pattern Discovery
A Case Study In Biological Data Mining
BIOLOGICAL NETWORK INFERENCE
Gene Regulatory Network Inference
CLOUD-BASED BIOLOGICAL DATA PROCESSING
Data Processing In The Cloud
Biological Data Processing In The Cloud
Kevin Byron is a PhD candidate in the Department of Computer Science at the New Jersey Institute of Technology.
Katherine G. Herbert is Associate Professor of Computer Science at Montclair State University.
Jason T.L. Wang is Professor of Bioinformatics and Computer Science at the New Jersey Institute of Technology.