1st Edition

Advances in Machine Learning and Data Mining for Astronomy

    744 Pages 33 Color & 177 B/W Illustrations
    by Chapman & Hall

    Advances in Machine Learning and Data Mining for Astronomy documents numerous successful collaborations among computer scientists, statisticians, and astronomers who illustrate the application of state-of-the-art machine learning and data mining techniques in astronomy. Due to the massive amount and complexity of data in most scientific disciplines, the material discussed in this text transcends traditional boundaries between various areas in the sciences and computer science.

    The book’s introductory part provides context to issues in the astronomical sciences that are also important to health, social, and physical sciences, particularly probabilistic and statistical aspects of classification and cluster analysis. The next part describes a number of astrophysics case studies that leverage a range of machine learning and data mining technologies. In the last part, developers of algorithms and practitioners of machine learning and data mining show how these tools and techniques are used in astronomical applications.

    With contributions from leading astronomers and computer scientists, this book is a practical guide to many of the most important developments in machine learning, data mining, and statistics. It explores how these advances can solve current and future problems in astronomy and looks at how they could lead to the creation of entirely new algorithms within the data mining community.

    Part I: Foundational Issues
    Classification in Astronomy: Past and Present, Eric Feigelson
    Searching the Heavens: Astronomy, Computation, Statistics, Data Mining, and Philosophy, Clark Glymour
    Probability and Statistics in Astronomical Machine Learning and Data Mining, Jeffrey D. Scargle

    Part II: Astronomical Applications
    Source Identification
    Automated Science Processing for the Fermi Large Area Telescope, James Chiang
    CMB Data Analysis, Paniez Paykari and Jean-Luc Starck
    Data Mining and Machine Learning in Time-Domain Discovery and Classification, Joshua S. Bloom and Joseph W. Richards
    Cross-Identification of Sources: Theory and Practice, Tamás Budavári
    The Sky Pixelization for CMB Mapping, O.V. Verkhodanov and A.G. Doroshkevich
    Future Sky Surveys: New Discovery Frontiers, J. Anthony Tyson and Kirk D. Borne
    Poisson Noise Removal in Spherical Multichannel Images: Application to Fermi Data, Jérémy Schmitt, Jean-Luc Starck, Jalal Fadili, and Seth Digel

    Galaxy Zoo: Morphological Classification and Citizen Science, Lucy Fortson, Karen Masters, Robert Nichol, Kirk D. Borne, Edd Edmondson, Chris Lintott, Jordan Raddick, Kevin Schawinski, and John Wallin
    The Utilization of Classifications in High-Energy Astrophysics Experiments, Bill Atwood
    Database-Driven Analyses of Astronomical Spectra, Jan Cami
    Weak Gravitational Lensing, Sandrine Pires, Jean-Luc Starck, Adrienne Leonard, and Alexandre Réfrégier
    Photometric Redshifts: 50 Years After, Tamás Budavári
    Galaxy Clusters, Christopher J. Miller

    Signal Processing (Time-Series) Analysis
    Planet Detection: The Kepler Mission, Jon M. Jenkins, Jeffrey C. Smith, Peter Tenenbaum, Joseph D. Twicken, and Jeffrey Van Cleve
    Classification of Variable Objects in Massive Sky Monitoring Surveys, Przemek Woźniak, Lukasz Wyrzykowski, and Vasily Belokurov
    Gravitational Wave Astronomy, Lee Samuel Finn

    The Largest Data Sets
    Virtual Observatory and Distributed Data Mining, Kirk D. Borne
    Multitree Algorithms for Large-Scale Astrostatistics, William B. March, Arkadas Ozakin, Dongryeol Lee, Ryan Riegel, and Alexander G. Gray

    Part III: Machine Learning Methods
    Time–Frequency Learning Machines for Nonstationarity Detection Using Surrogates, Pierre Borgnat, Patrick Flandrin, Cédric Richard, André Ferrari, Hassan Amoud, and Paul Honeine
    Classification, Nikunj Oza
    On the Shoulders of Gauss, Bessel, and Poisson: Links, Chunks, Spheres, and Conditional Models, William D. Heavlin
    Data Clustering, Kiri L. Wagstaff
    Ensemble Methods: A Review, Matteo Re and Giorgio Valentini
    Parallel and Distributed Data Mining for Astronomy Applications, Kamalika Das and Kanishka Bhaduri
    Pattern Recognition in Time Series, Jessica Lin, Sheri Williamson, Kirk D. Borne, and David De Barr
    Randomized Algorithms for Matrices and Data, Michael W. Mahoney



    Michael J. Way, PhD, is a research scientist at the NASA Goddard Institute for Space Studies in New York and the NASA Ames Research Center in California. He is also an adjunct professor in the Department of Physics and Astronomy at Hunter College. His research focuses on understanding the multiscale structure of our universe, modeling the atmospheres of exoplanets, and applying kernel methods to new areas in astronomy.

    Jeffrey D. Scargle, PhD, is an astrophysicist in the Space Science and Astrobiology Division of the NASA Ames Research Center. His main interests encompass the variability of astronomical objects, including the Sun, sources in the Galaxy, and active galactic nuclei; cosmology; plasma astrophysics; planetary detection; and data analysis and statistical methods.

    Kamal M. Ali, PhD, is a research scientist in machine learning and data mining. He has a consulting practice and is cofounder of the start-up Metric Avenue. He has carried out research at IBM Almaden, Stanford University, Vividence, Yahoo, and TiVo, where he worked on the TiVo Collaborative Filtering Engine. His current research focuses on combining machine learning in conditional random fields with linguistically rich features to make machines better at reading web pages.

    Ashok N. Srivastava, PhD, is the principal scientist for Data Mining and Systems Health Management and leader of the Intelligent Data Understanding group at NASA Ames Research Center. His research includes the development of data mining algorithms for anomaly detection in massive data streams, kernel methods in machine learning, and text mining algorithms.

    "The volume is a well-organised collection of articles presenting the importance of modern data mining and machine learning techniques in application to analysis of astronomical data. … A major strength of the volume is its very impressive collection of real examples that can be both inspirational and educational. … The book is particularly successful in showing how collaboration between computer scientists and statisticians on one side and astronomers on the other is needed to search for a scientific discovery in the abundance of data generated by instrumentation and simulations."
    —Krzysztof Podgorski, International Statistical Review, 2014