3rd Edition

Statistical and Machine-Learning Data Mining: Techniques for Better Predictive Modeling and Analysis of Big Data, Third Edition

By Bruce Ratner Copyright 2017
724 Pages 200 B/W Illustrations
by Chapman & Hall

690 Pages 200 B/W Illustrations
by Chapman & Hall

Interest in predictive analytics of big data has grown exponentially in the four years since the publication of Statistical and Machine-Learning Data Mining: Techniques for Better Predictive Modeling and Analysis of Big Data, Second Edition. In the third edition of this bestseller, the author has completely revised, reorganized, and repositioned the original chapters and produced 13 new...

Preface to Third Edition



Preface of Second Edition



Acknowledgments



Author




1. Introduction



2. Science Dealing with Data: Statistics and Data Science



3. Two Basic Data Mining Methods for Variable Assessment



4. CHAID-Based Data Mining for Paired-Variable Assessment



5. The Importance of Straight Data: Simplicity and Desirability for Good Model-Building Practice



6. Symmetrizing Ranked Data: A Statistical Data Mining Method for Improving the Predictive Power of Data



7. Principal Component Analysis: A Statistical Data Mining Method for Many-Variable Assessment



8. Market Share Estimation: Data Mining for an Exceptional Case



9. The Correlation Coefficient: Its Values Range between Plus and Minus 1, or Do They?



10. Logistic Regression: The Workhorse of Response Modeling



11. Predicting Share of Wallet without Survey Data



12. Ordinary Regression: The Workhorse of Profit Modeling



13. Variable Selection Methods in Regression: Ignorable Problem, Notable Solution



14. CHAID for Interpreting a Logistic Regression Model



15. The Importance of the Regression Coefficient



16. The Average Correlation: A Statistical Data Mining Measure for Assessment of Competing Predictive Models and the Importance of the Predictor Variables



17. CHAID for Specifying a Model with Interaction Variables



18. Market Segmentation Classification Modeling with Logistic Regression



19. Market Segmentation Based on Time-Series Data Using Latent Class Analysis



20. Market Segmentation: An Easy Way to Understand the Segments



21. The Statistical Regression Model: An Easy Way to Understand the Model



22. CHAID as a Method for Filling in Missing Values



23. Model Building with Big Complete and Incomplete Data



24. Art, Science, Numbers, and Poetry



25. Identifying Your Best Customers: Descriptive, Predictive, and Look-Alike Profiling



26. Assessment of Marketing Models



27. Decile Analysis: Perspective and Performance



28. Net T-C Lift Model: Assessing the Net Effects of Test and Control Campaigns



29. Bootstrapping in Marketing: A New Approach for Validating Models



30. Validating the Logistic Regression Model: Try Bootstrapping



31. Visualization of Marketing Models: Data Mining to Uncover Innards of a Model



32. The Predictive Contribution Coefficient: A Measure of Predictive Importance



33. Regression Modeling Involves Art, Science, and Poetry, Too



34. Opening the Dataset: A Twelve-Step Program for Dataholics



35. Genetic and Statistic Regression Models: A Comparison



36. Data Reuse: A Powerful Data Mining Effect of the GenIQ Model



37. A Data Mining Method for Moderating Outliers Instead of Discarding Them



38. Overfitting: Old Problem, New Solution



39. The Importance of Straight Data: Revisited



40. The GenIQ Model: Its Definition and an Application



41. Finding the Best Variables for Marketing Models



42. Interpretation of Coefficient-Free Models



43. Text Mining: Primer, Illustration, and TXTDM Software



44. Some of My Favorite Statistical Subroutines





Index

Biography

Bruce Ratner, The Significant Statistician™, is President and Founder of DM STAT-1 Consulting, the ensample for Statistical Modeling, Analysis and Data Mining, and Machine-learning Data Mining in the DM Space. DM STAT-1 specializes in all standard statistical techniques, and methods using machine-learning/statistics algorithms, such as its patented GenIQ Model, to achieve its clients' goals – across industries including Direct and Database Marketing, Banking, Insurance, Finance, Retail, Telecommunications, Healthcare, Pharmaceutical, Publication & Circulation, Mass & Direct Advertising, Catalog Marketing, e-Commerce, Web-mining, B2B, Human Capital Management, Risk Management, and Nonprofit Fundraising. Bruce holds a doctorate in mathematics and statistics, with a concentration in multivariate statistics and response model simulation. His research interests include developing hybrid-modeling techniques, which combine traditional statistics and machine learning methods. He holds a patent for a unique application in solving the two-group classification problem with genetic programming.

"I bought your book as it seemed to have the right mixture of statistical theory, practice, and common sense – finally! You can find the first often; the second occasionally; but the third, esp. in combination with the first two – never. I cannot thank you enough, Bruce! You are brilliant at assimilating, stating the underlying principles of analyses." ~Sandra Hendren, Sr. Lecturer, Harvard

"Bruce Ratner’s recent 3rd edition of 'Statistical and Machine-Learning Data Mining' is the best I’ve seen in my long career. It provides insightful methods for data mining, and innovative techniques for predictive analytics. The book is a valuable resource for experienced and newbie data scientists. Bruce’s book is my new data science bible. It is written in a clear style, and is an enjoyable read as it includes historical notes, which flow with the material." ~Jack Theurer, President, G. Theurer Assoc. Inc.

"Your book has been very helpful when I was reviewing the manual for the Automatic Linear Modeling (ALM) in SPSS. It offers many insightful perspectives to use for future ALM features and improvements. This book is an excellent contribution to the literature of statistics, data mining, and machine learning. Thank you, Bruce." ~Patrick Yan, PhD, Professor, Arizona State Univ.

"I heard one of my instructors in Coursera mention Bruce Ratner’s new book 'Statistical and Machine-Learning Data Mining' during an online chat when he became tired of answering questions." ~Mike Richardson, Head of Hardware, Smartfrog, Inc.