An Introduction to the Science and Applications of Unstructured Information Analysis
Text Analytics: An Introduction to the Science and Applications of Unstructured Information Analysis is a concise and accessible introduction to the science and applications of text analytics (or text mining), which enables automatic knowledge discovery from unstructured information sources, for both industrial and academic purposes. The book introduces the main concepts, models, and computational techniques that enable the reader to solve real decision-making problems arising from textual and/or documentary sources.
- Easy-to-follow step-by-step concepts and methods
- Each chapter opens with a gentle, intuitive introduction so that students can work out the WHYs, WHAT-IFs, WHAT-IS-THIS-FORs, and HOWs by themselves
- Practical programming exercises in Python for each chapter
- Theory and practice in every chapter, with summaries, practical coding exercises for target problems, and Q&A; sample code and data are available for download at https://www.routledge.com/Atkinson-Abutridy/p/book/9781032249797
Table of Contents
1 TEXT ANALYTICS 1.1 INTRODUCTION 1.2 TEXT MINING AND TEXT ANALYTICS 1.3 TASKS AND APPLICATIONS 1.4 THE TEXT ANALYTICS PROCESS 1.5 SUMMARY 1.6 QUESTIONS
2 NATURAL-LANGUAGE PROCESSING 2.1 INTRODUCTION 2.2 THE SCOPE OF NATURAL-LANGUAGE PROCESSING 2.3 NLP LEVELS AND TASKS 2.3.1 Phonology 2.3.2 Morphology 2.3.3 Lexicon 2.3.4 Syntax 2.3.5 Semantic 2.3.6 Reasoning and Pragmatics 2.1 SUMMARY 2.2 EXERCISES 2.2.1 Morphological Analysis 2.2.2 Lexical Analysis 2.2.3 Syntactic Analysis
3 INFORMATION EXTRACTION 3.1 INTRODUCTION 3.2 RULE-BASED INFORMATION EXTRACTION 3.3 NAMED-ENTITY RECOGNITION 3.3.1 N-Gram Models 3.4 RELATION EXTRACTION 3.5 EVALUATION 3.1 SUMMARY 3.2 EXERCISE 3.2.1 Regular Expressions 3.2.2 Named-Entity Recognition
4 DOCUMENT REPRESENTATION 4.1 INTRODUCTION 4.2 DOCUMENT INDEXING 4.3 VECTOR SPACE MODELS 4.3.1 Boolean Representation Model 4.3.2 Term Frequency Model 4.3.3 Inverse Document Frequency Model 4.1 SUMMARY 4.2 EXERCISES 4.2.1 TFxIDF Representation Model
5 ASSOCIATION RULES MINING 5.1 INTRODUCTION 5.2 ASSOCIATION PATTERNS 5.3 EVALUATION 5.3.1 Support 5.3.2 Confidence 5.3.3 Lift 5.4 ASSOCIATION RULES GENERATION 5.1 SUMMARY 5.2 EXERCISES 5.2.1 Extraction of Association Rules
6 CORPUS-BASED SEMANTIC ANALYSIS 6.1 INTRODUCTION 6.2 CORPUS-BASED SEMANTIC ANALYSIS 6.3 LATENT SEMANTIC ANALYSIS 6.3.1 Creating Vectors with LSA 6.4 WORD2VEC 6.4.1 Embedding Learning 6.4.2 Prediction and Embeddings Interpretation 6.1 SUMMARY 6.2 EXERCISES 6.2.1 Latent Semantic Analysis 6.2.2 Word Embedding with Word2Vec
7 DOCUMENT CLUSTERING 7.1 INTRODUCTION 7.2 DOCUMENT CLUSTERING 7.3 K-MEANS CLUSTERING 7.4 SELF-ORGANIZING MAP 7.4.1 Topological Maps Learning 7.1 SUMMARY 7.2 EXERCISES 7.2.1 K-means Clustering 7.2.2 Self-Organizing Maps
8 TOPIC MODELING 8.1 INTRODUCTION 8.2 TOPIC MODELING 8.3 LATENT DIRICHLET ALLOCATION 8.4 EVALUATION 8.1 SUMMARY 8.2 EXERCISES 8.2.1 Modeling Topics with LDA
9 DOCUMENT CATEGORIZATION 9.1 INTRODUCTION 9.2 CATEGORIZATION MODELS 9.3 BAYESIAN TEXT CATEGORIZATION 9.4 MAXIMUM ENTROPY CATEGORIZATION 9.5 EVALUATION 9.1 SUMMARY 9.2 EXERCISES 9.2.1 Naïve Bayes Categorization 9.2.2 MaxEnt Categorization
John Atkinson-Abutridy has been a university professor and researcher for the last 25 years. He received a PhD in Artificial Intelligence (AI) from the University of Edinburgh (UK) and has led scientific and technological projects at both national and international levels on several AI topics, including Natural-Language Processing, Machine Learning, Evolutionary Computation, and Text Mining. He has published almost 100 peer-reviewed scientific articles in journals and conferences. He has also worked as an AI consultant and has transferred some intelligent system technologies to industry. Dr. Atkinson-Abutridy has been a visiting researcher/professor at several universities and research centers worldwide, such as the University of Cambridge (UK), MIT (USA), IBM T.J. Watson Labs (USA), and INRIA (France). He is also a professional member of the AAAI and a senior member of the ACM.