1st Edition

Deep Learning Approach for Natural Language Processing, Speech, and Computer Vision: Techniques and Use Cases

By L. Ashok Kumar, D. Karthika Renuka
Copyright 2023
246 Pages, 149 B/W Illustrations
Published by CRC Press

    Deep Learning Approach for Natural Language Processing, Speech, and Computer Vision provides an overview of general deep learning methodology and its applications to natural language processing (NLP), speech, and computer vision tasks. It presents the concepts of deep learning in a simple yet comprehensive manner, with complete worked examples of deep learning models, and aims to bridge the gap between theory and application through case studies with code, experiments, and supporting analysis.


    • Covers the latest developments in deep learning techniques as applied to audio analysis, computer vision, and natural language processing.
    • Introduces contemporary applications of deep learning techniques to audio, textual, and visual processing.
    • Explores deep learning frameworks and libraries for NLP, speech, and computer vision in Python.
    • Gives insights into using Python tools and libraries for real-world applications.
    • Offers easily accessible tutorials and real-world case studies with code for hands-on experience.

    This book is aimed at researchers and graduate students in computer engineering and in image, speech, and text processing.

    1 Introduction: 1.1 Introduction; 1.2 Machine Learning Methods for NLP, Computer Vision (CV), and Speech; 1.3 Tools, Libraries, Datasets, and Resources for the Practitioners; 1.4 Summary
    2 Natural Language Processing: 2.1 Natural Language Processing; 2.2 Generic NLP Pipeline; 2.3 Text Pre-processing; 2.4 Feature Engineering; 2.5 Modeling; 2.6 Evaluation; 2.7 Deployment; 2.8 Monitoring and Model Updating; 2.9 Vector Representation for NLP; 2.10 Language Modeling with n-grams; 2.11 Vector Semantics and Embeddings; 2.12 Summary
    3 State-of-the-Art Natural Language: 3.1 Introduction; 3.2 Sequence-to-Sequence Models; 3.3 Recurrent Neural Networks; 3.4 Attention Mechanisms; 3.5 Transformer Model; 3.6 Summary
    4 Applications of Natural Language Processing: 4.1 Introduction; 4.2 Word Sense Disambiguation; 4.3 Text Classification; 4.4 Sentiment Analysis; 4.5 Spam Email Classification; 4.6 Question Answering; 4.7 Chatbots and Dialog Systems; 4.8 Summary
    5 Fundamentals of Speech Recognition: 5.1 Introduction; 5.2 Structure of Speech; 5.3 Basic Audio Features; 5.4 Characteristics of Speech Recognition System; 5.5 The Working of a Speech Recognition System; 5.6 Audio Feature Extraction Techniques; 5.7 Statistical Speech Recognition; 5.8 Speech Recognition Applications; 5.9 Challenges in Speech Recognition; 5.10 Open-source Toolkits for Speech Recognition; 5.11 Summary
    6 Deep Learning Models for Speech Recognition: 6.1 Traditional Methods of Speech Recognition; 6.2 RNN-based Encoder–Decoder Architecture; 6.3 Encoder; 6.4 Decoder; 6.5 Attention-based Encoder–Decoder Architecture; 6.6 Challenges in Traditional ASR and the Motivation for End-to-End ASR; 6.7 Summary
    7 End-to-End Speech Recognition Models: 7.1 End-to-End Speech Recognition Models; 7.2 Self-supervised Models for Automatic Speech Recognition; 7.3 Online/Streaming ASR; 7.4 Summary
    8 Computer Vision Basics: 8.1 Introduction; 8.2 Image Segmentation; 8.3 Feature Extraction; 8.4 Image Classification; 8.5 Tools and Libraries for Computer Vision; 8.6 Applications of Computer Vision; 8.7 Summary
    9 Deep Learning Models for Computer Vision: 9.1 Deep Learning for Computer Vision; 9.2 Pre-trained Architectures for Computer Vision; 9.3 Summary
    10 Applications of Computer Vision: 10.1 Introduction; 10.2 Optical Character Recognition; 10.3 Face and Facial Expression Recognition; 10.4 Visual-based Gesture Recognition; 10.5 Posture Detection and Correction; 10.6 Summary

