296 Pages 143 B/W Illustrations
    by Chapman & Hall

    Fundamentals of Data Science is designed for students, academicians and practitioners. It offers a complete walkthrough, from the foundational groundwork to the concepts, techniques and tools required to understand Data Science.

    Data Science is an umbrella term for the non-traditional techniques and technologies required to collect, aggregate, process, and gain insights from massive datasets. This book covers the processes, methodologies, and steps involved, including data acquisition, pre-processing, mining, prediction, and visualization, along with the tools for extracting insights from vast amounts of data using scientific methods and algorithms.

    Readers will learn the steps necessary to create applications with SQL, NoSQL, Python, R, MATLAB, Octave and Tableau.

    This book provides a stepwise approach to building solutions for data science applications, from understanding the fundamentals and performing data analytics to writing source code. All concepts are discussed in simple English to help the community become Data Scientists without much prerequisite knowledge.

     

    Features:

    • Simple strategies for developing statistical models that analyze data and detect patterns, trends, and relationships in data sets.
    • Complete roadmap to the Data Science approach, with dedicated sections on Fundamentals, Methodology and Tools.
    • Focused approach to learning and practicing various Data Science tools, with sample code and examples for practice.
    • Information is presented in an accessible way for students, researchers, academicians and professionals.

    Part I: Data Science Introduction

      1. Importance of Data Science

      1.1 Need for Data Science
      1.2 What is Data Science
      1.3 Data Science Process
      1.4 Business Intelligence and Data Science
      1.5 Prerequisite for Data Scientist
      1.6 Components of Data Science
      1.7 Tools and Skills Needed
      1.8 Summary

    Exercise

    References

    2. Statistics and Probability

    2.1 Data Types

    2.2 Variable Types

    2.3 Statistics

    2.4 Sampling Techniques and Probability

    2.5 Information Gain and Entropy

    2.6 Probability Theory

    2.7 Probability Types

    2.8 Probability Distribution

    2.9 Bayes Theorem

    2.10 Inferential Statistics

    2.11 Summary

    Exercise

    References

    3. Databases for Data Science

    3.1 SQL-Tool for Data Science

    3.1.1 Basic Statistics with SQL

    3.1.2 Data Munging with SQL

    3.1.3 Filtering, Joins and Aggregation

    3.1.4 Window Functions and Ordered Data

    3.1.5 Preparing Data for Analytics Tool

    3.2 NoSQL for Data Science

    3.2.1 Why NoSQL

    3.2.2 Document Databases for Data Science

    3.2.3 Wide-Column Databases for Data Science

    3.2.4 Graph Databases for Data Science

    3.3 Summary

    Exercise

    References

    Part II: Data Modelling and Analytics

    Chapter 4: Data Science Methodology

    4.1 Analytics for Data Science

    4.2 Data Analytics Examples

    4.3 Data Analytics Life Cycle

    4.3.1 Data Discovery

    4.3.2 Data Preparation

    4.3.3 Model Planning

    4.3.4 Model Building

    4.3.5 Communicate Results

    4.3.6 Operationalization

    4.4 Summary

    Exercise

    References

    Chapter 5: Data Science Methods and Machine Learning

    5.1 Regression Analysis

    5.1.1 Linear Regression

    5.1.2 Logistic Regression

    5.1.3 Multinomial Logistic Regression

    5.1.4 Time Series Models

    5.2 Machine Learning

    5.2.1 Decision Trees

    5.2.2 Naïve Bayes

    5.2.3 Support Vector Machines

    5.2.4 Nearest Neighbour Learning

    5.2.5 Clustering

    5.2.6 Confusion Matrix

    5.3 Summary

    Exercise

    References

    Chapter 6: Data Analytics and Text Mining

    6.1 Text Mining

    6.1.1 Major Text Mining Areas

    6.2 Text Analytics

    6.2.1 Text Analysis Subtasks

    6.2.2 Basic Text Analysis Steps

    6.3 Natural Language Processing

    6.3.1 Major Components of NLP

    6.3.2 Stages of NLP

    6.3.3 Statistical Processing of Natural Language

    6.3.4 Applications of NLP

    6.4 Summary

    Exercise

    References

    Part III: Platforms for Data Science

    Chapter 7: Data Science Tool: Python

      7.1 Basics of Python
      7.2 Python Libraries: Data Frame Manipulation with Pandas, NumPy
      7.3 Data Analysis Exploration with Python
      7.4 Time Series Data
      7.5 Clustering with Python
      7.6 ARCH & GARCH
      7.7 Dimensionality Reduction
      7.8 Python for Machine Learning
      7.9 Algorithms: KNN, Decision Tree, Random Forest, SVM
      7.10 Python IDEs for Data Science
      7.11 Summary

    Exercise

    References

     

    Chapter 8: Data Science Tool: R

    8.1 Reading and Getting Data into R

    8.1.1 Reading Data into R

    8.1.2 Writing Data into File

    8.1.3 scan() Function

    8.1.4 Built-in Datasets

    8.2 Ordered and Unordered Factors

    8.3 Arrays and Matrices

    8.3.1 Arrays

    8.3.2 Matrices

    8.4 Lists and Data Frames

    8.4.1 Lists

    8.4.2 Data Frames

    8.5 Probability Distributions

    8.5.1 Normal Distribution

    8.6 Statistical Models in R

    8.6.1 Model Fitting

    8.6.2 Marginal Effects

    8.7 Manipulating Objects

    8.7.1 Viewing Objects

    8.7.2 Modifying Objects

    8.7.3 Appending Elements

    8.7.4 Deleting Objects

    8.8 Data Distribution

    8.8.1 Visualizing Distributions

    8.8.2 Statistics in Distributions

    8.9 Summary

    Exercise

    References

     

    Chapter 9: Data Science Tool: MATLAB

    9.1 Data Science Workflow and MATLAB

    9.2 Importing Data

    9.2.1 How Data Is Stored

    9.2.2 How MATLAB Represents Data

    9.2.3 MATLAB Data Types

    9.2.4 Automating the Import Process

    9.3 Visualizing and Filtering Data

    9.3.1 Plotting Data Contained in Tables

    9.3.2 Selecting Data from Tables

    9.3.3 Accessing and Creating Table Variables

    9.4 Performing Calculations

    9.4.1 Basic Mathematical Operations

    9.4.2 Using Vectors

    9.4.3 Using Functions

    9.4.4 Calculating Summary Statistics

    9.4.5 Correlations between Variables

    9.4.6 Accessing Subsets of Data

    9.4.7 Performing Calculations by Category

    9.5 Summary

    Exercise

    References

     

    Chapter 10: GNU Octave as a Data Science Tool

    10.1 Vectors and Matrices

    10.2 Arithmetic Operations

    10.3 Set Operations

    10.4 Plotting Data

    10.5 Summary

    Exercise

    References

    Chapter 11: Data Visualization using Tableau

    11.1 Introduction to Data Visualization

    11.2 Tableau Basics

    11.3 Dimensions, Measures and Descriptive Statistics

    11.4 Basic Charts

    11.5 Dashboard Design & Principles

    11.6 Special Chart Types

    11.7 Integrate Tableau with Google Sheets

    11.8 Summary

    Exercise

    References

    Index

    Biography

    Sanjeev Wagh is Professor and Head of the Department of Information Technology at Govt. College of Engineering, Karad. He completed his BE (1996), ME (2000) and PhD (2009) in Computer Science & Engineering at Govt. College of Engineering, Pune & Nanded. He was a full-time postdoctoral fellow at the Center for TeleInfrastructure, Aalborg University, Denmark, during 2013-14. He also completed an MBA (IT) at NIBM, Chennai (2015). He has a total of 24 years of experience in academics and research. His research interests are Natural Science Computing, Internet Technologies and Wireless Sensor Networks, and Data Sciences and Analytics. He has 100+ research papers to his credit, published in international and national journals and conferences. Four research scholars have completed their PhDs under his supervision at Pune University, and three research scholars are currently pursuing PhDs under his supervision at various Indian universities. He is a fellow member of ISTE and IETE, and a member of IEEE, ACM and CSI. He is a co-editor for international journals in engineering and technology. He has visited Denmark (Aalborg University, Aalborg & Copenhagen), Sweden (Gothenburg University, Gothenburg), Germany (Hamburg University, Hamburg), Norway (University of Oslo), France (University of London Institute in Paris), China (Shanghai Technology Innovation Center, Shanghai, delegation visit), Thailand (Kasetsart University, Bangkok) and Mauritius (University of Technology, Port Louis) for academic and research purposes.

    Manisha S. Bhende is an Associate Professor at Dr D Y Patil Institute of Engineering Management and Research, Pune. She completed her BE (1998), ME (2007) and PhD (2017) in Computer Engineering from the University of Pune, and her bachelor's degree from Government College of Engineering, Amravati, India. Her research interests include IoT and Wireless Networks, Network Security, Cloud Computing, Data Science and Machine Learning, and Data Analytics. She has 39 research papers and book chapters in international and national conferences and journals. She has delivered expert talks in various domains such as Wireless Communication, Wireless Sensor Networks, Data Science, Cyber Security, IoT, Embedded and Real Time Operating Systems, and IPR and Innovation. She has published 4 patents and received 3 copyrights. She is a reviewer and examiner of PhD theses and ME dissertations for state and national universities, and is associated with PhD research centers. She works as an editor and reviewer for various journals and conferences of national and international repute. She is the coordinator of the IQAC, IPR Cell, IIP Cell and Research Cell at institute level. She serves as Subject Chairman for various Computer Engineering subjects under Savitribai Phule Pune University (SPPU) and has contributed to SPPU syllabus content design and revision. She received the Regional Young IT Professional Award from CSI in 2006. She is a member of ISTE, ACM, CSI, IAENG and the Internet Society.

    Anuradha D. Thakare received her PhD in Computer Science and Engineering from SGB Amravati University, her ME degree in Computer Engineering from Savitribai Phule Pune University and her BE degree in Computer Science and Engineering from Sant Gadge Baba Amravati University, Amravati, India. She is a Professor in the Computer Engineering Department of Pimpri Chinchwad College of Engineering, Pune. Dr. Anuradha is Secretary of the Institution of Engineering & Technology Pune LN, and a member of IEEE and ACM. She is a PhD guide in Computer Engineering at SPPU, Pune. She has been General Chair of the IEEE International Conference ICCUBEA 2018 and an advisory member for international conferences. She has worked as a reviewer for the Journal of International Blood Research, IEEE transactions and Scopus journals. She is a reviewer and examiner for PhD defences at state and national universities. She has published 78 research papers in reputed conferences and journals indexed in Scopus, IEEE, Web of Science, Elsevier and Springer. She has received a research project grant and workshop grants from AICTE-AQIS, QIP-SPPU, BCUD-SPPU Pune and the Maharashtra State Commission for Women. She received the Best Women Researcher Award and the Best Faculty Award from the International Forum on Science, Health & Engineering, and has received best paper awards at international conferences. She has delivered 20 expert talks on Machine Learning, Evolutionary Algorithms, Outcome Based Education and other topics, and has worked with industries such as DRDO and NCL on research projects. She serves as Subject Chairman for various Computer Engineering subjects under Savitribai Phule Pune University (SPPU) and has contributed to SPPU syllabus content design and revision.