Hands-On Data Science for Librarians
- Available for pre-order on March 28, 2023. Item will ship after April 18, 2023
Librarians understand the need to store, use, and analyze data related to their collection, patrons, and institution, and over the last 10 years there has been consistent interest within the profession in improving data management, analysis, and visualization skills. However, librarians find it difficult to move from out-of-the-box proprietary software applications to the skills necessary to perform the range of data science actions in code. This book focuses on teaching R through relevant examples and skills that librarians need in their day-to-day work, covering visualizations but going much further to include web scraping, working with maps, creating interactive reports, machine learning, and more. While there’s a place for theory, ethics, and statistical methods, librarians need a tool to help them acquire enough facility with R to use data science skills in their daily work, no matter what type of library they work in (academic, public, or special). By explaining each skill and its application to library work before walking the reader through each line of code, this book supports librarians who want to apply data science in their daily work. Hands-On Data Science for Librarians is intended for librarians (and other information professionals) in any library type (public, academic, or special) as well as graduate students in library and information science (LIS).
- Only data science book available geared toward librarians that includes step-by-step code examples
- Examples include all library types (public, academic, special)
- Relevant datasets
- Accessible to non-technical professionals
- Focused on job skills and their applications
Table of Contents
1. Introduction
2. Using RStudio’s IDE
3. Tidying data with dplyr
4. Visualizing your project with ggplot2
5. Webscraping with rvest
6. Mapping with tmap
7. Textual Analysis with tidytext
8. Creating Dynamic Documents with rmarkdown
9. Creating a flexdashboard
10. Creating an interactive dashboard with shiny
11. Using tidymodels to Understand Machine Learning
12. Conclusion
Appendix A. Dependencies
Appendix B. Additional Skills
Sarah Lin manages the Enterprise Information Management team at Posit, PBC. A graduate of the University of Illinois iSchool, Sarah worked as a technical services librarian in many different library types (academic, special, medical & law) before moving into corporate librarianship and information management. Her professional interests include findability and metadata, and how metadata enables findability. Sarah is a certified Carpentries instructor and received her undergraduate degree in African/African-American Studies & Anthropology from the University of Chicago. She didn’t know anything about coding in R before joining Posit.
Dorris Scott is currently the Academic Director of Data Studies at Washington University in St. Louis - University College. As Academic Director of Data Studies, Dorris develops new curriculum, certificate, and degree programs related to data analytics, data science, and geographic information systems (GIS), and forges local and regional partnerships in alignment with University College’s strategic vision. Previously, Dorris was Geographic Information Systems (GIS) Librarian and Social Science Data Curator at University Libraries at Washington University, where she consulted on projects that use geospatial data and provided training in various GIS software, programming applications of geospatial data, and data management. As part of this position, Dorris also served as a liaison between Washington University Libraries and social science departments, assisting faculty with data needs such as data management and data curation. Dorris received a PhD in Geography from the University of Georgia, with a specialization in GIS applications for public health.