1st Edition

Stochastic Optimization for Large-scale Machine Learning

By Vinod Kumar Chauhan Copyright 2022
    176 Pages 25 B/W Illustrations
    by CRC Press

    Advancements in technology and the availability of data sources have led to the 'Big Data' era. Working with large data offers the potential to uncover more fine-grained patterns and make timely and accurate decisions, but it also creates challenges such as slow training and poor scalability of machine learning models. One of the major challenges in machine learning is to develop efficient and scalable learning algorithms, i.e., optimization techniques to solve large-scale learning problems.

    Stochastic Optimization for Large-scale Machine Learning identifies different areas of improvement and recent research directions to tackle this challenge. The optimisation techniques it develops are also explored as ways to improve machine learning algorithms, based on data access and on first- and second-order optimisation methods.

    Key Features:

    • Bridges machine learning and optimisation.
    • Bridges theory and practice in machine learning.
    • Identifies key research areas and recent research directions to solve large-scale machine learning problems.
    • Develops optimisation techniques to improve machine learning algorithms for big data problems.

    The book will be a valuable reference for practitioners and researchers, as well as students, in the field of machine learning.

    List of Figures
    List of Tables
    Preface 


    Section I BACKGROUND

    Introduction
    1.1 LARGE-SCALE MACHINE LEARNING 
    1.2 OPTIMIZATION PROBLEMS 
    1.3 LINEAR CLASSIFICATION
    1.3.1 Support Vector Machine (SVM) 
    1.3.2 Logistic Regression 
    1.3.3 First and Second Order Methods
    1.3.3.1 First Order Methods 
    1.3.3.2 Second Order Methods 
    1.4 STOCHASTIC APPROXIMATION APPROACH 
    1.5 COORDINATE DESCENT APPROACH 
    1.6 DATASETS 
    1.7 ORGANIZATION OF BOOK 

    Optimisation Problem, Solvers, Challenges and Research Directions
    2.1 INTRODUCTION 
    2.1.1 Contributions 
    2.2 LITERATURE 
    2.3 PROBLEM FORMULATIONS 
    2.3.1 Hard Margin SVM (1992) 
    2.3.2 Soft Margin SVM (1995) 
    2.3.3 One-versus-Rest (1998) 
    2.3.4 One-versus-One (1999) 
    2.3.5 Least Squares SVM (1999) 
    2.3.6 ν-SVM (2000) 
    2.3.7 Smooth SVM (2001) 
    2.3.8 Proximal SVM (2001) 
    2.3.9 Crammer Singer SVM (2002) 
    2.3.10 Eν-SVM (2003) 
    2.3.11 Twin SVM (2007) 
    2.3.12 Capped ℓp-norm SVM (2017) 
    2.4 PROBLEM SOLVERS 
    2.4.1 Exact Line Search Method 
    2.4.2 Backtracking Line Search 
    2.4.3 Constant Step Size 
    2.4.4 Lipschitz & Strong Convexity Constants 
    2.4.5 Trust Region Method 
    2.4.6 Gradient Descent Method 
    2.4.7 Newton Method 
    2.4.8 Gauss-Newton Method 
    2.4.9 Levenberg-Marquardt Method 
    2.4.10 Quasi-Newton Method 
    2.4.11 Subgradient Method 
    2.4.12 Conjugate Gradient Method 
    2.4.13 Truncated Newton Method 
    2.4.14 Proximal Gradient Method 
    2.4.15 Recent Algorithms 
    2.5 COMPARATIVE STUDY 
    2.5.1 Results from Literature 
    2.5.2 Results from Experimental Study 
    2.5.2.1 Experimental Setup and Implementation Details 
    2.5.2.2 Results and Discussions 
    2.6 CURRENT CHALLENGES AND RESEARCH DIRECTIONS 
    2.6.1 Big Data Challenge 
    2.6.2 Areas of Improvement 
    2.6.2.1 Problem Formulations 
    2.6.2.2 Problem Solvers 
    2.6.2.3 Problem Solving Strategies/Approaches 
    2.6.2.4 Platforms/Frameworks 
    2.6.3 Research Directions 
    2.6.3.1 Stochastic Approximation Algorithms 
    2.6.3.2 Coordinate Descent Algorithms 
    2.6.3.3 Proximal Algorithms 
    2.6.3.4 Parallel/Distributed Algorithms 
    2.6.3.5 Hybrid Algorithms 
    2.7 CONCLUSION 


    Section II FIRST ORDER METHODS
    Mini-batch and Block-coordinate Approach 
    3.1 INTRODUCTION 
    3.1.1 Motivation 
    3.1.2 Batch Block Optimization Framework (BBOF) 
    3.1.3 Brief Literature Review 
    3.1.4 Contributions 
    3.2 STOCHASTIC AVERAGE ADJUSTED GRADIENT (SAAG) METHODS
    3.3 ANALYSIS 
    3.4 NUMERICAL EXPERIMENTS 
    3.4.1 Experimental setup 
    3.4.2 Convergence against epochs 
    3.4.3 Convergence against Time 
    3.5 CONCLUSION AND FUTURE SCOPE 

    Variance Reduction Methods 
    4.1 INTRODUCTION 
    4.1.1 Optimization Problem 
    4.1.2 Solution Techniques for Optimization Problem 
    4.1.3 Contributions 
    4.2 NOTATIONS AND RELATED WORK 
    4.2.1 Notations 
    4.2.2 Related Work 
    4.3 SAAG-I, II AND PROXIMAL EXTENSIONS 
    4.4 SAAG-III AND IV ALGORITHMS 
    4.5 ANALYSIS 
    4.6 EXPERIMENTAL RESULTS 
    4.6.1 Experimental Setup 
    4.6.2 Results with Smooth Problem 
    4.6.3 Results with non-smooth Problem 
    4.6.4 Mini-batch Block-coordinate versus mini-batch setting 
    4.6.5 Results with SVM 
    4.7 CONCLUSION 

    Learning and Data Access 
    5.1 INTRODUCTION 
    5.1.1 Optimization Problem 
    5.1.2 Literature Review 
    5.1.3 Contributions 
    5.2 SYSTEMATIC SAMPLING 
    5.2.1 Definitions 
    5.2.2 Learning using Systematic Sampling 
    5.3 ANALYSIS 
    5.4 EXPERIMENTS 
    5.4.1 Experimental Setup 
    5.4.2 Implementation Details 
    5.4.3 Results 
    5.5 CONCLUSION 

    Section III SECOND ORDER METHODS

    Mini-batch Block-coordinate Newton Method 
    6.1 INTRODUCTION 
    6.1.1 Contributions 
    6.2 MBN 
    6.3 EXPERIMENTS 
    6.3.1 Experimental Setup 
    6.3.2 Comparative Study 
    6.4 CONCLUSION 

    Stochastic Trust Region Inexact Newton Method 
    7.1 INTRODUCTION 
    7.1.1 Optimization Problem 
    7.1.2 Solution Techniques 
    7.1.3 Contributions 
    7.2 LITERATURE REVIEW 
    7.3 TRUST REGION INEXACT NEWTON METHOD 
    7.3.1 Inexact Newton Method 
    7.3.2 Trust Region Inexact Newton Method 
    7.4 STRON 
    7.4.1 Complexity 
    7.4.2 Analysis 
    7.5 EXPERIMENTAL RESULTS 
    7.5.1 Experimental Setup 
    7.5.2 Comparative Study 
    7.5.3 Results with SVM 
    7.6 EXTENSIONS 
    7.6.1 PCG Subproblem Solver
    7.6.2 Stochastic Variance Reduced Trust Region Inexact Newton Method 
    7.7 CONCLUSION 


    Section IV CONCLUSION
    Conclusion and Future Scope 
    8.1 FUTURE SCOPE

    Bibliography

    Index

    Biography

    Dr. Vinod Kumar Chauhan is a Research Associate in Industrial Machine Learning in the Institute for Manufacturing, Department of Engineering, at the University of Cambridge, UK. He has a PhD in Machine Learning from Panjab University, Chandigarh, India. His research interests are in Machine Learning, Optimization and Network Science. He specializes in solving large-scale optimization problems in Machine Learning, handwriting recognition, flight delay propagation in airlines, robustness and nestedness in complex networks, and supply chain design, using mathematical programming, genetic algorithms and reinforcement learning.