1st Edition

An Introduction to Optimization with Applications in Machine Learning and Data Analytics

By Jeffrey Paul Wheeler Copyright 2024
473 Pages 94 B/W Illustrations
by Chapman & Hall

473 Pages 94 B/W Illustrations
by Chapman & Hall

The primary goal of this text is a practical one. Equipping students with enough knowledge and creating an independent research platform, the author strives to prepare students for professional careers. Providing students with a marketable skill set requires topics from many areas of optimization. The initial goal of this text is to develop a marketable skill set for mathematics majors as well as... Read more

I Preliminary Matters                                                             

1  Preamble                                                                                                               

1.1Introduction                                                                                  

1.2Software                                                                                                

1.3AboutthisBook

      1.3.1Presentation

1.3.2 Contents                                                                                   

      1.4 One Semester Course Material        

1.5Acknowledgements

2 The Language of Optimization                                                               

2.1 Basic Terms Defined                                                                    
2.2 When a Max or Min is not in the Set                                            

2.3 Solving an Optimization Problem                                                   

2.4 Algorithms and Heuristics                                                             

2.5RuntimeofanAlgorithmoraHeuristic

2.6ForFurtherStudy

2.7 Keywords                                                                                    

2.8 Exercises                                                                                     

3 Computational Complexity                                                                        

3.1 Arithmetic Complexity                                                               

3.2 Asymptotic Notation                                                                  

3.3 Intractability

3.4 Complexity Classes                                                                    

      3.4.1Introduction

3.4.2        Time, Space, and Big O Notation

3.4.3        The Complexity Class P

3.4.4        The Complexity Class NP

3.4.5        Utility of Complexity Classes

3.5       For Further Study

3.6       Keywords

3.7       Exercises

4      Algebra Review                                                                                                      

4.1       Systems of Linear Inequalities in Two Variables - Geometric Solutions

4.1.1        Examples

4.2       Solving Systems of Linear Equations Using Linear Algebra

4.2.1        Gauss-Jordan Elimination

4.2.2        Gaussian Elimination compared with Gauss-Jordan Elimination

4.3       Linear Algebra Basics

4.3.1        Matrices and Their Multiplication

4.3.2        Identity Matrices, Inverses, and Determinants of Matrices

4.3.3        Solving Systems of Linear Equations via Cramer’s Rule       

4.3.4        Vector and Matrix Norms

4.3.5        Vector Spaces

4.4       Matrix Properties Important to Optimization

4.4.1        Eigenvalues

4.4.2        Unimodular Matrices

4.5       Keywords

4.6       Exercises

5       Matrix Factorization                                                                                         

5.1       LU Factorization

5.2       Cholesky Decomposition

5.3       Orthogonality

5.4       Orthonormal Matrices

5.5       The Gram-Schmidt Process

5.6       QR Factorization

5.7       Keywords

5.8       For Further Study

5.9       Exercises

II          Linear Programming                                                        

6      Linear Programming                                                                                         

6.1       A Geometric Approach to Linear Programming in Two Dimen-

sions

6.1.1        Example

6.1.2        Summary

6.1.3        Keywords

6.2      ≤

  The Simplex Method: Maximization Problems with Problem Constraints of the form

6.2.1        Introduction

6.2.2        Slack Variables

6.2.3        The Method

6.2.4        Summary

6.2.5        Keywords

6.3       The Dual: Minimization with Problem Constraints of the form

6.3.1        How it Works

6.3.2        Why it Works

6.3.3        Keywords

6.4       The Big M Method: Maximum and Minimization Problems ≥ ≤

  with Mixed Constraints ( , , or =)

6.4.1        Maximization Problems with the Big M Method

6.4.2        Minimization Problems with the Big M Method

6.5       Summary of Applying the Simplex Method to Linear Program-

ming Problems

6.6       Degeneracy and Cycling in the Simplex Method

6.7       Exercises

7       Sensitivity Analysis                                                                                         

7.1       Motivation

7.2       An Excel Example

7.2.1        Solver’s Answer Report

7.2.2        Solver’s Sensitivity Report

7.2.3        Solver’s Limits Report

7.3       Exercises

8      Integer Linear Programming                                                                     

8.1       Introduction

8.2       Dakin’s Branch and Bound

8.3       Gomory Cut-Planes

8.4       For Further Study

8.5       Exercises

III Nonlinear (Geometric) Programming

4      Calculus Review                                                                                                  

4.1       Derivatives and Continuity

4.2       Taylor Series for Functions of a Single Variable

4.3       Newton’s Method

4.4       Vectors

4.5       Partial Derivatives

4.6       The Taylor Series of a Function of Two Variables

4.7       Exercises

5       A Calculus Approach to Nonlinear Programming                       

5.1       Using Derivatives to Find Extrema of Functions of a Single Variable

5.2       Calculus and Extrema of Multivariable Functions

5.3       Exercises

6      Constrained Nonlinear Programming: Lagrange Multipliers and the KKT Conditions                                                                             

6.1       Lagrange Multipliers

6.2       The KKT Conditions

6.3       Exercises

7       Optimization involving Quadratic Forms                                           

7.1       Quadratic Forms

7.2       Definite and Semidefinite Matrices and Optimization

7.3       The Role of Eigenvalues in Optimization

7.4       Keywords

7.5       Exercises

8      Iterative Methods                                                                                              

8.1       Newton’s Method for Optimization

8.1.1        Single-Variable Newton’s Method for Optimization

8.1.2        Multivariable Newton’s Method for Optimization

8.2       Steepest Descent (or Gradient Descent)

8.2.1        Generalized Reduced Gradient

8.3       Additional Geometric Programming Techniques

8.4       Exercises

9      Derivative-Free Methods                                                                               

9.1       The Arithmetic Mean-Geometric Mean Inequality (AGM)

9.2       Weight Finding Algorithm for the AGM

9.3       The AGM, Newton’s Method, and Reduced Gradient Com- pared

9.4       Exercises

10   Search Algorithms                                                                                            

10.1    Evolutionary Algorithms

10.2    Ant Foraging Optimization

II    Convexity and the Fundamental Theorem of Lin- ear Programming                                                                       

11     Important Sets for Optimization                                                             

11.1    Special Linear Combinations

11.2    Special Sets

11.3    Special Properties of Sets

11.4    Special Objects

11.5    Exercises

12    The Fundamental Theorem of Linear Programming                  

12.1    The Finite Basis Theorem

12.2    The Fundamental Theorem of Linear Programming

12.3    For Further Study

12.4    Exercises

13    Convex Functions                                                                                             

13.1    Convex Functions of a Single Variable

13.2    Concave Functions

13.3    Graphs of Convex and Concave Functions

13.4    Multivariable Convex Functions

13.5    Mathematical Results from Convexity

13.6    Exercises

14    Convex Optimization                                                                                      

Jourdain Lamperski

14.1    Convex optimization and applications

14.2    Duality

14.3    Subgradient descent

14.4    Exercises

III      Combinatorial Optimization                                           

15    An Introduction to Combinatorics                                                         

15.1    Introduction

15.2    The Basic Tools of Counting

15.2.1    When We Add, Subtract, Multiply or Divide

15.2.2    Permutations and Combinations

15.3    The Binomial Theorem and Binomial Coefficients

15.3.1    Pascal’s Triangle

15.3.2    Binomial Coefficients

15.3.3    The Binomial Theorem

15.3.4    Another Counting Argument

15.3.5    The Multinomial Theorem

15.4    Counting When Objects are Indistinguishable

15.4.1    Permutations with Indistinguishable Objects

15.4.2    Summary of Basic Counting Techniques

15.5    The Pigeonhole Principle

15.6    Exercises

16    An Introduction to Graph Theory                                                          

16.1    Basic Definitions

16.2    Special Graphs

16.2.1    Empty Graphs and the Trivial Graph

16.2.2    Walks, Trails, Paths, and Cycles

16.2.3    Trees

16.2.4    Complete and Bipartite Graphs

16.3    Vertex and Edge Cuts

16.3.1    Graph Connectivity

16.3.2    Notation for Removing a Vertex or Edge from a Graph     

16.3.3    Cut Vertices and Vertex Cuts

16.3.4    Edge Cuts and Bridges

16.4    Some Useful and Interesting Results

16.5    Exercises

17    Network Flows                                                                                                    

17.1    Basic Definitions

17.2    Maximum Flows and Cuts

17.3    The Dinitz-Edmonds-Karp-Ford-Fulkerson Algorithm

17.4    Max Flow as a Linear Programming Problem

17.5    Application to a Major League Baseball Pennant Race

17.6    Exercises

18   Minimum-Weight Spanning Trees and Shortest Paths               

18.1    Weighted Graphs and Spanning Trees

18.2    Minimum-Weight Spanning Trees

18.2.1    Kruskal’s Algorithm

18.2.2    Prim’s Method

18.2.3    Kruskal’s and Prim’s Compared

18.3    Shortest Paths

18.3.1    Dijkstra’s Algorithm

18.3.2    A Linear Programming Approach to Shortest Paths

18.4    Exercises

19    Network Modeling and the Transshipment Problem                  

19.1    Introduction of the Problem

19.2    The Guarantee of Integer Solutions in Network Flow Problems     

19.3    Exercises

20   The Traveling Salesperson Problem                                                       

20.1    History of the Traveling Salesperson Problem

20.2    The Problem

20.3    Heuristic Solutions

20.3.1    Nearest Neighbor

20.3.2    Insertion Algorithms

20.3.2.1      Closest Insertion

20.3.2.2      Cheapest Insertion

20.3.3    The Geometric Heuristic

20.3.4    2-Opt and 3-Opt

20.3.5    Christophides Algorithm

20.4    Concorde

20.5    For Further Study

20.6    Exercises

IV Optimization for Data Analytics and Machine Learning                                                                                

21    Probability                                                                                                             

21.1    Introduction

21.2    Set Theory

21.2.1    The Vocabulary of Sets and Sample Spaces

21.2.2    The Algebra of Sets

21.3    Foundations of Probability

21.3.1    Borel Sets

21.3.2    The Axioms and Basic Properties of Probability

21.4    Conditional Probability

21.4.1    Naive Probability

21.4.2    Conditional Probability

21.4.3    Bayes’ Theorem

21.4.3.1      Bayesian Spam Filters

21.4.4    Independence

21.5    Random Variables and Distributions

21.5.1    Random Variables

21.5.2    Probability Mass and Probability Density Functions

21.5.2.1      Expectation and Variance of a Discrete Random Variable

21.5.3    Some Discrete Random Variable Probability Distributions

21.5.3.1      The Binomial Distribution

21.5.3.2      The Geometric Distribution

21.5.3.3      The Negative Binomial Distribution

21.5.3.4      The Hypergeometric Distribution

21.6    Exercises

22   Regression Analysis via Least Squares                                                 

John McKay and Suren Jayasuria

22.1    Introduction

22.2    Formulation

22.3    Linear Least Squares

22.3.1    Pseudo-Inverse

22.3.2    Brief Discussion of Probabilistic Interpretation

22.4    Regularized Linear Least Squares

23   Forecasting                                                                                                           

Joseph “Nico” Gabriel

23.1    Smoothing

23.1.1    Exponential Smoothing

23.1.2    Trends

23.1.3    Seasonality

23.2    Stationary Data and Differencing

23.2.1    Autocorrelation

23.3    ARIMA models

23.3.1    Autoregressive models

23.3.2    Moving Average models

23.3.3    ARIMA model structure

23.4    Partial

23.4.1    Goodness-of-fit metrics

23.5    Exercises

24   Introduction to Machine Learning                                                         

Suren Jayasuria and John McKay

24.1    Introduction

24.2    Nearest Neighbors

24.3    Support Vector Machines

24.4    Neural Networks

24.4.1    Artificial Neural Networks

24.4.2    Exercises

A Techniques of Proof                                                                                          

A.1      Introduction to Propositional Logic

A.2      Direct Proofs

A.3      Indirect Proofs

A.3.1      Proving the Contrapositive

A.3.2      Proof by Contradiction

A.4      The Principle of Mathematical Induction

A.5      Exercises

B     Useful Tools from Analysis and Topology                                          

B.1      Definitions

B.2      Useful Theorems

Bibliography

Bibliography                                                                                                                       

Index

Notation                                                                                                                                

List of Notation


Biography

Jeffrey Paul Wheeler earned his PhD in Combinatorial Number Theory from the University of Memphis by extending what had been a conjecture of Erdős on the integers to finite groups. He has published, given talks at numerous schools, and twice been a guest of Trinity College at the University of Cambridge. He has taught mathematics at Miami University (Ohio), the University of Tennessee-Knoxville, the University of Memphis, Rhodes College, the University of Pittsburgh, Carnegie Mellon University, and Duquesne University. He has received numerous teaching awards and is currently in the Department of Mathematics at the University of Pittsburgh. He also occasionally teaches for Pitt’s Computer Science Department and the College of Business Administration. Dr. Wheeler’s Optimization course was one of the original thirty to participate in the Mathematical Association of America’s NSF-funded PIC Math program.