1st Edition
Statistical Analysis of Financial Data With Examples In R
Statistical Analysis of Financial Data covers the use of statistical analysis and the methods of data science to model and analyze financial data. The first chapter is an overview of financial markets, describing the market operations and using exploratory data analysis to illustrate the nature of financial data. R is the software used to obtain the data for the examples in the first chapter, to perform all computations, and to produce the graphs. However, discussion of R is deferred to an appendix to the first chapter, where the basics of R, especially those most relevant in financial applications, are presented and illustrated. The appendix also describes how to use R to obtain current financial data from the internet.
Chapter 2 describes the methods of exploratory data analysis, especially graphical methods, and illustrates them on real financial data. Chapter 3 covers probability distributions useful in financial analysis, especially heavy-tailed distributions, and describes methods of computer simulation of financial data. Chapter 4 covers basic methods of statistical inference, especially the use of linear models in analysis, and Chapter 5 describes methods of time series analysis, with special emphasis on models and methods applicable to the analysis of financial data.
Features
* Covers statistical methods for analyzing models appropriate for financial data, especially models with outliers or heavy-tailed distributions.
* Describes both the basics of R and advanced techniques useful in financial data analysis.
* Driven by real, current financial data, not just stale data deposited on some static website.
* Includes a large number of exercises, many requiring the use of open-source software to acquire real financial data from the internet and to analyze it.
1. The Nature of Financial Data
Financial Time Series
Autocorrelations
Stationarity
Time Scales and Data Aggregation
Financial Assets and Markets
Markets and Regulatory Agencies
Interest
Returns on Assets
Stock Prices; Fair Market Value
Splits, Dividends, and Return of Capital
Indexes and "the Market"
Derivative Assets
Short Positions
Portfolios of Assets: Diversification and Hedging
Frequency Distributions of Returns
Location and Scale
Skewness
Kurtosis
Multivariate Data
The Normal Distribution
Q-Q Plots
Outliers
Other Statistical Measures
Volatility
The Time Series of Returns
Measuring Volatility: Historical and Implied
Volatility Indexes: The VIX
The Curve of Implied Volatility
Risk Assessment and Management
Market Dynamics
Stylized Facts about Financial Data
Notes and Further Reading
Exercises and Questions for Review
Appendix A: Accessing and Analyzing Financial Data in R
R Basics
Data Repositories and Inputting Data into R
Time Series and Financial Data in R
Data Cleansing
Notes, Comments, and Further Reading on R
Exercises in R
2. Exploratory Financial Data Analysis
Data Reduction
Simple Summary Statistics
Centering and Standardizing Data
Simple Summary Statistics for Multivariate Data
Transformations
Identifying Outlying Observations
The Empirical Cumulative Distribution Function
Nonparametric Probability Density Estimation
Binned Data
Kernel Density Estimator
Multivariate Kernel Density Estimator
Graphical Methods in Exploratory Analysis
Time Series Plots
Histograms
Boxplots
Density Plots
Bivariate Data
Q-Q Plots
Graphics in R
Notes and Further Reading
Exercises
3. Probability Distributions in Models of Observable Events
Random Variables and Probability Distributions
Discrete Random Variables
Continuous Random Variables
Multivariate Distributions
Measures of Association in Multivariate Distributions
Copulas
Transformations of Multivariate Random Variables
Distributions of Order Statistics
Asymptotic Distributions; The Central Limit Theorem
The Tails of Probability Distributions
Sequences of Random Variables; Stochastic Processes
Diffusion of Stock Prices and Pricing of Options
Some Useful Probability Distributions
Discrete Distributions
Continuous Distributions
Multivariate Distributions
General Families of Distributions Useful in Modeling
Constructing Multivariate Distributions
Modeling of Data-Generating Processes
R Functions for Probability Distributions
Simulating Observations of a Random Variable
Uniform Random Numbers
Generating Nonuniform Random Numbers
Simulating Data in R
Notes and Further Reading
Exercises
4. Statistical Models and Methods of Inference
Models
Fitting Statistical Models
Measuring and Partitioning Observed Variation
Linear Models
Nonlinear Variance-Stabilizing Transformations
Parametric and Nonparametric Models
Bayesian Models
Models for Time Series
Criteria and Methods for Statistical Modeling
Estimators and Their Properties
Methods of Statistical Modeling
Optimization in Statistical Modeling; Least Squares and Other Applications
The General Optimization Problem
Least Squares
Maximum Likelihood
R Functions for Optimization
Statistical Inference
Confidence Intervals
Testing Statistical Hypotheses
Prediction
Inference in Bayesian Models
Resampling Methods; The Bootstrap
Robust Statistical Methods
Estimation of the Tail Index
Estimation of VaR and Expected Shortfall
Models of Relationships among Variables
Principal Components
Regression Models
Linear Regression Models
Linear Regression Models: The Regressors
Linear Regression Models: Individual Observations and Residuals
Linear Regression Models: An Example
Nonlinear Models
Specifying Models in R
Assessing the Adequacy of Models
Goodness-of-Fit Tests; Tests for Normality
Cross Validation
Model Selection and Model Complexity
Notes and Further Reading
Exercises
5. Discrete Time Series Models and Analysis
Basic Linear Operations
The Backshift Operator
The Difference Operator
The Integration Operator
Summation of an Infinite Geometric Series
Linear Difference Equations
Trends and Detrending
Cycles and Seasonal Adjustment
Analysis of Discrete Time Series Models
Stationarity
Sample Autocovariance and Autocorrelation Functions; Estimators
Statistical Inference in Stationary Time Series
Autoregressive and Moving Average Models
Moving Average Models; MA(q)
Autoregressive Models; AR(p)
The Partial Autocorrelation Function (PACF)
ARMA and ARIMA Models
Simulation of ARMA and ARIMA Models
Statistical Inference in ARMA and ARIMA Models
Selection of Orders in ARIMA Models
Forecasting in ARIMA Models
Analysis of ARMA and ARIMA Models in R
Robustness of ARMA Procedures; Innovations with Heavy Tails
Financial Data
Linear Regression with ARMA Errors
Conditional Heteroscedasticity
ARCH Models
GARCH Models and Extensions
Unit Roots and Cointegration
Spurious Correlations; The Distribution of the Correlation Coefficient
Unit Roots
Cointegrated Processes
Notes and Further Reading
Exercises
Biography
James E. Gentle is University Professor Emeritus at George Mason University. He is a Fellow of the American Statistical Association (ASA) and of the American Association for the Advancement of Science. He is the author of Random Number Generation and Monte Carlo Methods and of Matrix Algebra.
"The book is very well written, and fills an important need for an up-to-date textbook about statistical techniques applied to finance. The book explains the theory behind the statistical techniques very well, with good detail. The mathematical notation is appealing and elegant."
~Jerzy Pawlowski, New York University Tandon School of Engineering
"I thoroughly enjoyed reading the first two chapters of the book. Often, the first couple of chapters of a book provide a 'boilerplate' discussion of the characteristics of the data and R. Here, the first two chapters are very well developed, to the point that they provide a good general resource to readers approaching the analysis of financial data from several different perspectives. For example, students in statistics usually approach the entire analysis of time series having in mind the potential application to the analysis of financial data, but they know nothing about the characteristics of the data and the financial markets... Just like the previous chapters, I broadly enjoyed reading this chapter. Prof. Gentle explains the topics clearly and often uses simulations to convey the intuition. That's also the way I like to teach these concepts and I think it enhances understanding among economics and finance students. I also commend the way he discusses the lag and difference operators and how they are implemented in R. He devotes quite some space to them, and I believe that is good as many texts go over these concepts too quickly for many students. Likewise, the discussion of the AR(I)MA models is very detailed and clear."
~Jan Annaert, University of Antwerp and Antwerp Management School