1st Edition

Phishing Detection Using Content-Based Image Classification

By Shekhar Khandelwal, Rik Das
Copyright 2022
    130 Pages 35 B/W Illustrations
    by Chapman & Hall

    Phishing Detection Using Content-Based Image Classification is an invaluable resource for deep learning and cybersecurity professionals and scholars who want to solve cybersecurity tasks with modern technologies such as Deep Learning and Computer Vision. Phishers can bypass many of the rule-based phishing detection techniques currently in use; this book provides a step-by-step approach to solving the problem with Computer Vision and Deep Learning techniques, with significant accuracy.

    The book offers comprehensive coverage of the most essential topics, including:

    • Programmatically reading and manipulating image data
    • Extracting relevant features from images
    • Building statistical models using image features
    • Using state-of-the-art Deep Learning models for feature extraction
    • Building a robust phishing detection tool even with limited data
    • Dimensionality reduction techniques
    • Class imbalance treatment
    • Feature Fusion techniques
    • Building performance metrics for multi-class classification tasks

    Another unique aspect of this book is that it comes with a completely reproducible code base, developed by the author and shared via Python notebooks so readers can get up and running quickly. The notebooks can be used to further enhance the provided models with new advances in computer vision and more advanced algorithms.

    Chapter 1. Phishing and Cybersecurity. Basics of Phishing in Cybersecurity. Phishing Detection Techniques. List (whitelist/blacklist) based. Heuristics (pre-defined rules) based. Visual similarity based. Race between Phishers and Anti-Phishers.
    Chapter 2. Image Processing based Phishing Detection Techniques. Image processing based phishing detection techniques. Challenges in Phishing Detection using website images. Comparison of Techniques. Summary of Phishing detection using image processing techniques.
    Chapter 3. Implementing CNN for classifying phishing websites. Data Selection and Pre-Processing. Classification using CNN. CNN implementation. Performance metrics. Building a Convolutional Neural Network Model.
    Chapter 4. Transfer Learning Approach in Phishing Detection. Classification using Transfer Learning. Transfer Learning python implementation. Performance assessment of CNN models.
    Chapter 5. Feature Extraction and Representation Learning. Classification using Representation Learning. Data Preparation. Feature Extraction using CNN off-the-shelf architectures. Handling class imbalance using SMOTE. SMOTE python implementation. Machine learning Classifier. Performance assessment of various experimentations.
    Chapter 6. Dimensionality Reduction Techniques. Basics of dimensionality reduction. PCA implementation using python. Performance assessment of various experimentations.
    Chapter 7. Feature Fusion Techniques. Basics of feature fusion technique. Different combinations of image representations. Different feature fusion approaches. Performance assessment of various experimentations.
    Chapter 8. Comparison of Phishing detection approaches. Classification Approaches. Evaluation of Classification Experiments. Comparison of the best performing model with the State-of-the-art.
    Chapter 9. Basics of Digital Image Processing. Basics of digital image processing. Basics of extracting features using OpenCV.

    Biography

    Shekhar Khandelwal is a Data Scientist on the Data & Analytics team at Ernst & Young (EY). He has around 15 years of industry experience and has worked across every sphere of the Software Development Lifecycle: as a product developer, industry solutions developer, data engineer, data scientist, and cloud developer. Previously, he worked for IBM Software Labs, where he contributed to the development and client deployment of IBM's industrial IoT-based cognitive products using various Watson tools and technologies. He is an industry leader in solving challenging Computer Vision, NLP, and Predictive Analytics problems using Machine Learning and Deep Learning.

    Dr. Rik Das is currently a Lead Software Engineer in Computer Vision Research at Siemens Advanta, India. Previously, he was with Xavier Institute of Social Service, Ranchi, as an Assistant Professor for the Post Graduate Program in Information Technology. Dr. Das has over 17 years of experience in industrial and academic research. He has been professionally associated with many leading universities and institutes in India, including Narsee Monjee Institute of Management Studies (NMIMS) (deemed-to-be-university), Globsyn Business School, and Maulana Abul Kalam Azad University of Technology. Dr. Das holds a Ph.D. (Tech.) in Information Technology from the University of Calcutta. He also received his M.Tech. (Information Technology) from the University of Calcutta, after his B.E. (Information Technology) from the University of Burdwan, West Bengal, India.