Introduction to Data Technologies

1st Edition

By Paul Murrell

Chapman and Hall/CRC

418 pages | 179 B/W Illus.

Available Formats:
Paperback: 9781138118027
pub: 2017-05-31
Hardback: 9781420065176
pub: 2009-02-23
eBook (VitalSource): 9780429139697
pub: 2009-02-23


Providing key information on how to work with research data, Introduction to Data Technologies presents ideas and techniques for performing the critical, behind-the-scenes tasks that take up so much time and effort yet typically receive little attention in formal education. With a focus on computational tools, the book makes readers aware of which tasks can be achieved and describes the correct approach to performing them.

Practical examples demonstrate the most important points

The author first discusses how to write computer code, using HTML as a concrete example. He then covers a variety of data storage topics, including different file formats, XML, and the structure and design issues of relational databases. After illustrating how to extract data from a relational database using SQL, the book presents tools and techniques for searching, sorting, tabulating, and manipulating data. It also introduces some very basic programming concepts as well as the R language for statistical computing. Each of these topics has supporting chapters that offer reference material on HTML, CSS, XML, DTD, SQL, R, and regular expressions.

One-stop shop of introductory computing information

Written by a member of the R Development Core Team, this resource shows readers how to apply data technologies to tasks within a research setting. Collecting material otherwise scattered across many books and the web, it explores how to publish information via the web, how to access information stored in different formats, and how to write small programs to automate simple, repetitive tasks.


Paul Murrell, best known for his R Graphics book, has delivered a second masterpiece for people who have the difficult task to clean and prepare raw data for further use in common statistical software packages. … provides the perfect basis for a course on data literacy … Moreover, the book also is an excellent basis for advanced M.S. and Ph.D. students as well as practitioners in academia and industry who are confronted with the task to clean and preprocess their own or their colleagues’ data.

—Jürgen Symanzik, Technometrics, May 2011

Introduction to Data Technologies introduces various computer-related topics, including markup languages, statistical computing languages, coding, storage, and querying, in a systematic manner. … the book may serve as an introduction to readers with general interest who plan to supplement their knowledge in specific computer-related topics, in addition to R programming.

Journal of the American Statistical Association, Vol. 105, No. 492, December 2010

This is a very gentle book. It enables students and statisticians, particularly those just entering the profession, to begin to familiarize themselves with important concepts and tools from the world of databases … it is encouraging that such topics are finding their way into statistics courses at all. … I found the style of the book very engaging … . It has the Paul Murrell light touch, first evident to me in his eminently readable and comprehensive book on R graphics. Like that one, the present book has interesting, occasionally slightly unusual examples and an easy and elegant writing style. The book does not hesitate to offer plain, direct advice in contexts in which other authors might simply let readers exercise their personal preferences. For students, particularly, I think this is a good thing. …

—Bill Venables, CSIRO, Australian & New Zealand Journal of Statistics, 2010

Table of Contents


Case Study: Point Nemo

Writing Computer Code

Case Study: Point Nemo (continued)



Writing Code

Checking Code

Running Code

The DRY Principle

HTML Reference

HTML Syntax

HTML Semantics

CSS Reference

CSS Syntax

CSS Semantics

Linking CSS to HTML

CSS Tips

Data Storage

Case Study: YBC 7289

Plain Text Formats

Binary Formats




XML Reference

XML Syntax

Document Type Definitions

Data Queries

Case Study: The Data Expo (continued)

Querying Databases

Querying XML

SQL Reference

SQL Syntax

SQL Queries

Other SQL Commands

Data Processing

Case Study: The Population Clock

The R Environment

The R Language

Data Types and Data Structures


More on Data Structures

Data Import/Export

Data Manipulation

Text Processing

Data Display


Other Software

R Reference

R Syntax

Data Types and Data Structures


Getting Help


Searching for Functions

Regular Expressions Reference







Further Reading appears at the end of each chapter.

About the Author/Editors

Paul Murrell is a Senior Lecturer in the Department of Statistics at the University of Auckland, New Zealand. Author of the bestselling R Graphics (2006), he is also part of the development team for the R and Omegahat statistical computing projects. Dr. Murrell’s research interests include computational and graphical statistics.

About the Series

Chapman & Hall/CRC Computer Science & Data Analysis


Subject Categories

BISAC Subject Codes/Headings:
COMPUTERS / Database Management / General
COMPUTERS / Database Management / Data Mining
MATHEMATICS / Probability & Statistics / General
MEDICAL / Biostatistics