1st Edition

Stochastic Optimization for Large-scale Machine Learning

176 Pages 25 B/W Illustrations

by CRC Press

Also available as an eBook on Taylor & Francis eBooks (Institutional Purchase)

Description

Advancements in technology and in the availability of data sources have led to the "Big Data" era. Working with large data offers the potential to uncover more fine-grained patterns and make timely and accurate decisions, but it also creates challenges such as slow training and poor scalability of machine learning models. One of the major challenges in machine learning is to develop efficient and scalable learning algorithms, i.e., optimization techniques to solve large-scale learning problems.

Stochastic Optimization for Large-scale Machine Learning identifies different areas of improvement and recent research directions to tackle this challenge. It also explores optimisation techniques developed to improve machine learning algorithms, based on data access and on first- and second-order optimisation methods.

Key Features:

Bridges machine learning and optimisation.
Bridges theory and practice in machine learning.
Identifies key research areas and recent research directions to solve large-scale machine learning problems.
Develops optimisation techniques to improve machine learning algorithms for big data problems.

The book will be a valuable reference for practitioners and researchers, as well as students, in the field of machine learning.

Table of Contents

List of Figures
List of Tables
Preface

Section I BACKGROUND

Introduction
1.1 LARGE-SCALE MACHINE LEARNING
1.2 OPTIMIZATION PROBLEMS
1.3 LINEAR CLASSIFICATION
1.3.1 Support Vector Machine (SVM)
1.3.2 Logistic Regression
1.3.3 First and Second Order Methods
1.3.3.1 First Order Methods
1.3.3.2 Second Order Methods
1.4 STOCHASTIC APPROXIMATION APPROACH
1.5 COORDINATE DESCENT APPROACH
1.6 DATASETS
1.7 ORGANIZATION OF BOOK

Optimisation Problem, Solvers, Challenges and Research Directions
2.1 INTRODUCTION
2.1.1 Contributions
2.2 LITERATURE
2.3 PROBLEM FORMULATIONS
2.3.1 Hard Margin SVM (1992)
2.3.2 Soft Margin SVM (1995)
2.3.3 One-versus-Rest (1998)
2.3.4 One-versus-One (1999)
2.3.5 Least Squares SVM (1999)
2.3.6 ν-SVM (2000)
2.3.7 Smooth SVM (2001)
2.3.8 Proximal SVM (2001)
2.3.9 Crammer Singer SVM (2002)
2.3.10 Eν-SVM (2003)
2.3.11 Twin SVM (2007)
2.3.12 Capped ℓp-norm SVM (2017)
2.4 PROBLEM SOLVERS
2.4.1 Exact Line Search Method
2.4.2 Backtracking Line Search
2.4.3 Constant Step Size
2.4.4 Lipschitz & Strong Convexity Constants
2.4.5 Trust Region Method
2.4.6 Gradient Descent Method
2.4.7 Newton Method
2.4.8 Gauss-Newton Method
2.4.9 Levenberg-Marquardt Method
2.4.10 Quasi-Newton Method
2.4.11 Subgradient Method
2.4.12 Conjugate Gradient Method
2.4.13 Truncated Newton Method
2.4.14 Proximal Gradient Method
2.4.15 Recent Algorithms
2.5 COMPARATIVE STUDY
2.5.1 Results from Literature
2.5.2 Results from Experimental Study
2.5.2.1 Experimental Setup and Implementation Details
2.5.2.2 Results and Discussions
2.6 CURRENT CHALLENGES AND RESEARCH DIRECTIONS
2.6.1 Big Data Challenge
2.6.2 Areas of Improvement
2.6.2.1 Problem Formulations
2.6.2.2 Problem Solvers
2.6.2.3 Problem Solving Strategies/Approaches
2.6.2.4 Platforms/Frameworks
2.6.3 Research Directions
2.6.3.1 Stochastic Approximation Algorithms
2.6.3.2 Coordinate Descent Algorithms
2.6.3.3 Proximal Algorithms
2.6.3.4 Parallel/Distributed Algorithms
2.6.3.5 Hybrid Algorithms
2.7 CONCLUSION

Section II FIRST ORDER METHODS

Mini-batch and Block-coordinate Approach
3.1 INTRODUCTION
3.1.1 Motivation
3.1.2 Batch Block Optimization Framework (BBOF)
3.1.3 Brief Literature Review
3.1.4 Contributions
3.2 STOCHASTIC AVERAGE ADJUSTED GRADIENT (SAAG) METHODS
3.3 ANALYSIS
3.4 NUMERICAL EXPERIMENTS
3.4.1 Experimental Setup
3.4.2 Convergence against Epochs
3.4.3 Convergence against Time
3.5 CONCLUSION AND FUTURE SCOPE

Variance Reduction Methods
4.1 INTRODUCTION
4.1.1 Optimization Problem
4.1.2 Solution Techniques for Optimization Problem
4.1.3 Contributions
4.2 NOTATIONS AND RELATED WORK
4.2.1 Notations
4.2.2 Related Work
4.3 SAAG-I, II AND PROXIMAL EXTENSIONS
4.4 SAAG-III AND IV ALGORITHMS
4.5 ANALYSIS
4.6 EXPERIMENTAL RESULTS
4.6.1 Experimental Setup
4.6.2 Results with Smooth Problem
4.6.3 Results with Non-smooth Problem
4.6.4 Mini-batch Block-coordinate versus Mini-batch Setting
4.6.5 Results with SVM
4.7 CONCLUSION

Learning and Data Access
5.1 INTRODUCTION
5.1.1 Optimization Problem
5.1.2 Literature Review
5.1.3 Contributions
5.2 SYSTEMATIC SAMPLING
5.2.1 Definitions
5.2.2 Learning using Systematic Sampling
5.3 ANALYSIS
5.4 EXPERIMENTS
5.4.1 Experimental Setup
5.4.2 Implementation Details
5.4.3 Results
5.5 CONCLUSION

Section III SECOND ORDER METHODS

Mini-batch Block-coordinate Newton Method
6.1 INTRODUCTION
6.1.1 Contributions
6.2 MBN
6.3 EXPERIMENTS
6.3.1 Experimental Setup
6.3.2 Comparative Study
6.4 CONCLUSION

Stochastic Trust Region Inexact Newton Method
7.1 INTRODUCTION
7.1.1 Optimization Problem
7.1.2 Solution Techniques
7.1.3 Contributions
7.2 LITERATURE REVIEW
7.3 TRUST REGION INEXACT NEWTON METHOD
7.3.1 Inexact Newton Method
7.3.2 Trust Region Inexact Newton Method
7.4 STRON
7.4.1 Complexity
7.4.2 Analysis
7.5 EXPERIMENTAL RESULTS
7.5.1 Experimental Setup
7.5.2 Comparative Study
7.5.3 Results with SVM
7.6 EXTENSIONS
7.6.1 PCG Subproblem Solver
7.6.2 Stochastic Variance Reduced Trust Region Inexact Newton Method
7.7 CONCLUSION

Section IV CONCLUSION

Conclusion and Future Scope
8.1 FUTURE SCOPE

Bibliography

Index

Author(s)

Biography

Dr. Vinod Kumar Chauhan is a Research Associate in Industrial Machine Learning at the Institute for Manufacturing, Department of Engineering, University of Cambridge, UK. He has a PhD in Machine Learning from Panjab University, Chandigarh, India. His research interests are in Machine Learning, Optimization and Network Science. He specializes in solving large-scale optimization problems in Machine Learning, handwriting recognition, flight delay propagation in airlines, robustness and nestedness in complex networks, and supply chain design, using mathematical programming, genetic algorithms and reinforcement learning.
