Organizations today capture far more data than ever before, so now is the time for them to rethink how they digest it. With advanced algorithms and analytics techniques, organizations can harness this data, discover hidden patterns, and use the newly acquired knowledge to gain competitive advantages.
Presenting the contributions of leading experts in their respective fields, Big Data: Algorithms, Analytics, and Applications bridges the gap between the vastness of Big Data and the appropriate computational methods for scientific and social discovery. It covers fundamental issues about Big Data, including efficient algorithmic methods to process data, better analytical strategies to digest data, and representative applications in diverse fields, such as medicine, science, and engineering. The book is organized into five main sections:
- Big Data Management—considers the research issues related to the management of Big Data, including indexing and scalability aspects
- Big Data Processing—addresses the problem of processing Big Data across a wide range of resource-intensive computational settings
- Big Data Stream Techniques and Algorithms—explores research issues regarding the management and mining of Big Data in streaming environments
- Big Data Privacy—focuses on models, techniques, and algorithms for preserving Big Data privacy
- Big Data Applications—illustrates practical applications of Big Data across several domains, including finance, multimedia tools, biometrics, and satellite Big Data processing
Overall, the book reports on state-of-the-art studies and achievements in algorithms, analytics, and applications of Big Data. It provides readers with the basis for further efforts in this challenging scientific field that will play a leading role in next-generation database, data warehousing, data mining, and cloud computing research. It also explores related applications in diverse sectors, covering technologies for media/data communication, elastic media/data storage, cross-network media/data fusion, and SaaS.
Table of Contents
Scalable Indexing for Big Data Processing; Hisham Mohamed and Stephane Marchand-Maillet
Scalability and Cost Evaluation of Incremental Data Processing using Amazon's Hadoop Service; Xing Wu, Yan Liu, and Ian Gorton
Singular Value Decomposition, Clustering, and Indexing for Similarity Search for Large Data Sets in High-Dimensional Spaces; Alexander Thomasian
Multiple Sequence Alignment and Clustering with Dot Matrices, Entropy, and Genetic Algorithms; John Tsiligaridis
Approaches for High-Performance Big Data Processing: Applications and Challenges; Ouidad Achahbar, Mohamed Riduan Abid, Mohamed Bakhouya, Chaker El Amrani, Jaafar Gaber, Mohammed Essaaidi, and Tarek A. El Ghazawi
The Art of Scheduling for Big Data Science; Florin Pop and Valentin Cristea
Time-Space Scheduling in the MapReduce Framework; Zhuo Tang, Lingang Jiang, Ling Qi, Kenli Li, and Keqin Li
The Graph Engine for Multithreaded Systems Graph Database System for Commodity Clusters; Alessandro Morari, Vito Giovanni Castellana, Oreste Villa, Jesse Weaver, Greg Williams, David Haglin, Antonino Tumeo, and John Feo
KSC-net: Community Detection for Big Data Networks; Raghvendra Mall and Johan A.K. Suykens
Making Big Data Transparent to the Software Developers' Community; Yu Wu, Jessica Kropczynski, and John M. Carroll
Key Technologies for Big Data Stream Computing; Dawei Sun, Guangyan Zhang, Weimin Zheng, and Keqin Li
Streaming Algorithms for Big Data Processing on Multicore Architecture; Marat Zhanikeev
Organic Streams: A Unified Framework for Personal Big Data Integration and Organization Towards Social Sharing and Individualized Sustainable Use; Xiaokang Zhou and Qun Jin
Managing Big Trajectory Data: Online Processing of Positional Streams; Kostas Patroumpas and Timos Sellis
Personal Data Protection Aspects of Big Data; Paolo Balboni
Privacy-Preserving Big Data Management: The Case of OLAP; Alfredo Cuzzocrea
Big Data in Finance; Taruna Seth and Vipin Chaudhary
Semantic-Based Heterogeneous Multimedia Big Data Retrieval; Kehua Guo and Jianhua Ma
Topic Modeling for Large-Scale Multimedia Analysis and Retrieval; Juan Hu, Yi Fang, Nam Ling, and Li Song
Big Data Biometrics Processing: A Case Study of an Iris Matching Algorithm on Intel Xeon Phi; Xueyan Li and Chen Liu
Storing, Managing, and Analyzing Big Satellite Data: Experiences and Lessons Learned from a Real-World Application; Ziliang Zong
Barriers to the Adoption of Big-Data Applications in the Social Sector; Elena Strange
Kuan-Ching Li is a professor in the Department of Computer Science and Information Engineering at Providence University, Taiwan. He was department chair in 2009, has been special assistant to the university president since 2010, and was appointed vice dean for the Office of International and Cross-Strait Affairs (OIA) in 2014. He earned a PhD in 2001 from the University of São Paulo, Brazil.
Dr. Li is a recipient of awards from NVIDIA, the Ministry of Education (MOE)/Taiwan, and the Ministry of Science and Technology (MOST)/Taiwan. He also received guest professorships at universities in China, including Xiamen University (XMU), Huazhong University of Science and Technology (HUST), Lanzhou University (LZU), Shanghai University (SHU), Anhui University of Science and Technology (AUST), and Lanzhou Jiaotong University (LZJTU). He has been actively involved in conferences and workshops as a program, general, or steering chair and, in numerous others, as a program committee member; he has also organized numerous conferences related to high-performance computing and computational science and engineering.
Dr. Li is the editor in chief of the technical publications International Journal of Computational Science and Engineering (IJCSE), International Journal of Embedded Systems (IJES), and International Journal of High Performance Computing and Networking (IJHPCN), all published by Inderscience. He also serves on the editorial boards of a number of journals and as a guest editor. In addition, he has been editor or coeditor of several technical professional books published by CRC Press and IGI Global. His topics of interest include networked computing, GPU computing, parallel software design, and performance evaluation and benchmarking. Dr. Li is a member of the Taiwan Association of Cloud Computing (TACC), a senior member of the IEEE, and a fellow of the IET.
Hai Jiang is an associate professor in the Department of Computer Science at Arkansas State University, United States. He earned a BS at Beijing University of Posts and Telecommunications, China, and MA and PhD degrees at Wayne State University. His research interests include parallel and distributed systems, computer and network security, high-performance computing and communication, big data, and modeling and simulation.
Dr. Jiang has published one book and several research papers in major international journals and conference proceedings. He has served as a US National Science Foundation proposal review panelist and a US Department of Energy (DoE) Smart Grid Investment Grant (SGIG) reviewer multiple times. He serves as an editor for the International Journal of High Performance Computing and Networking (IJHPCN); a regional editor for the International Journal of Computational Science and Engineering (IJCSE) as well as the International Journal of Embedded Systems (IJES); an editorial board member for the International Journal of Big Data Intelligence (IJBDI), The Scientific World Journal (TSWJ), the Open Journal of Internet of Things (OJIOT), and the GSTF Journal on Social Computing (JSC); and a guest editor for the IEEE Systems Journal, International Journal of Ad Hoc and Ubiquitous Computing, Cluster Computing, and The Scientific World Journal for multiple special issues. He has also served as a general chair or program chair for several major conferences and workshops (CSE, HPCC, ISPA, GPC, ScalCom, ESCAPE, GPU-Cloud, FutureTech, GPUTA, FC, SGC). He has been involved in 90 conferences and workshops as a session chair or as a program committee member, including major conferences such as AINA, ICPP, IUCC, ICPADS, TrustCom, HPCC, GPC, EUC, ICIS, SNPD, TSP, PDSEC, SECRYPT, and ScalCom. He has reviewed six cloud computing–related books (Distributed and Cloud Computing, Virtual Machines, Cloud Computing: Theory and Practice, Virtualized Infrastructure and Cloud Services Management, Cloud Computing: Technologies and Applications Programming, The Basics of Cloud Computing) for publishers such as Morgan Kaufmann, Elsevier, and Wiley.
Dr. Jiang serves as a review board member for a large number of international journals (TC, TPDS, TNSM, TASE, JPDC, Supercomputing, CCPE, FGCS, CJ, and IJPP). He is a professional member of ACM and the IEEE Computer Society. Locally, he serves as US NSF XSEDE (Extreme Science and Engineering Discovery Environment) Campus Champion for Arkansas State University.
Dr. Laurence T. Yang is a professor in the Department of Computer Science at St. Francis Xavier University, Canada. His research includes parallel and distributed computing, embedded and ubiquitous/pervasive computing, cyber–physical–social systems, and big data.
Dr. Yang has published more than 200 refereed international journal papers in the above areas; about one-third are in IEEE/ACM transactions and journals, and most of the rest are in Elsevier, Springer, and Wiley journals. He has been involved in conferences and workshops as a program/general/steering conference chair and as a program committee member. He served as the vice chair of the IEEE Technical Committee of Supercomputing Applications (2001–2004), the chair of the IEEE Technical Committee of Scalable Computing (2008–2011), and the chair of the IEEE Task Force on Ubiquitous Computing and Intelligence (2009–present).
Dr. Yang was on the steering committee of the IEEE/ACM Supercomputing Conference series (2008–2011) and on the National Resource Allocation Committee (NRAC) of Compute Canada (2009–2013).
In addition, Dr. Yang is the editor in chief or an editor of several international journals. He is the author/coauthor or editor/coeditor of more than 25 books. Mobile Intelligence (Wiley 2010) received an Honorable Mention from the American Publishers Awards for Professional and Scholarly Excellence (the PROSE Awards). He has won several best paper awards (including IEEE best and outstanding conference awards, such as the IEEE 20th International Conference on Advanced Information Networking and Applications [IEEE AINA-06]), one best paper nomination, the Distinguished Achievement Award (2005, 2011), and the Canada Foundation for Innovation Award (2003). He has given 30 keynote talks at various international conferences and symposia.
Alfredo Cuzzocrea is a senior researcher at the Institute of High Performance Computing and Networking of the Italian National Research Council, Italy, and an adjunct professor at the University of Calabria, Italy. He holds the Italian National Scientific Habilitation as an associate professor in computer science engineering from the Italian Ministry of Education, University and Research (MIUR). He also obtained habilitation as an associate professor in computer science from Aalborg University, Denmark, and from the University of Rome Tre, Italy. He is an adjunct professor at the University of Catanzaro "Magna Graecia," Italy, the University of Messina, Italy, and the University of Naples "Federico II," Italy. Previously, he was an adjunct professor at the University of Naples "Parthenope," Italy. He holds 35 visiting professor positions worldwide (Europe, United States, Asia, and Australia). He serves as a Springer Fellow Editor and as an Elsevier Ambassador. He holds several roles in international scientific societies, steering committees for international conferences, and international panels, some of them with leadership responsibility. He serves as a panel leader and moderator at international conferences. He has been an invited speaker at several international conferences worldwide (Europe, United States, and Asia). He is a member of the scientific boards of several PhD programs worldwide (Europe and Australia). He serves as an editor for the Springer series Communications in Computer and Information Science. He holds a large number of roles in international journals, such as editor in chief, associate editor, and special issue editor (including JCSS, IS, KAIS, FGCS, DKE, INS, and Big Data Research). He has edited more than 30 international books and conference proceedings.
He is a member of the editorial advisory boards of several international books. He has held a large number of roles in international conferences, such as general chair, program chair, workshop chair, local chair, liaison chair, and publicity chair (including ODBASE, DaWaK, DOLAP, ICA3PP, ICEIS, APWeb, SSTDM, IDEAS, and IDEAL). He has served as a session chair at a large number of international conferences (including EDBT, CIKM, DaWaK, DOLAP, and ADBIS). He serves as a review board member for a large number of international journals (including TODS, TKDE, TKDD, TSC, TIST, TSMC, THMS, JCSS, IS, KAIS, FGCS, DKE, and INS). He also serves as a review board member for a large number of international books and as a program committee member for a large number of international conferences (including VLDB, ICDE, EDBT, CIKM, IJCAI, KDD, ICDM, PKDD, and SDM). His research interests include multidimensional data modeling and querying, data stream modeling and querying, data warehousing and OLAP, OLAM, XML data management, web information systems modeling and engineering, knowledge representation and management models and techniques, Grid and P2P computing, privacy and security of very large databases and OLAP data cubes, models and algorithms for managing uncertain and imprecise information and knowledge, models and algorithms for managing complex data on the web, and models and algorithms for high-performance distributed computing and architectures. He is the author or coauthor of more than 330 papers in international conferences (including EDBT, CIKM, SSDBM, MDM, DaWaK, and DOLAP), international journals (including JCSS, IS, KAIS, DKE, and INS), and international books (mostly edited by Springer). He is also involved in several national and international research projects, in which he also holds positions of responsibility.
The collection presented in the book covers fundamental and realistic issues about Big Data, including efficient algorithmic methods to process data, better analytical strategies to digest data, and representative applications in diverse fields. ... This book is required understanding for anyone working in a major field of science, engineering, business, and financing.
—Jack Dongarra, University of Tennessee
The editors have assembled an impressive book consisting of 22 chapters written by 57 authors from 12 countries across America, Europe, and Asia. ... This book has great potential to provide fundamental insight and privacy to individuals, long-lasting value to organizations, and security and sustainability to the cyber–physical–social ecosystem ....
—D. Frank Hsu, Fordham University
These editors are active researchers and have done a lot of work in the area of Big Data. They assembled a group of outstanding chapter authors. ... Each section contains several case studies to demonstrate how the related issues are addressed. ... I highly recommend this timely and valuable book. I believe that it will benefit many readers and contribute to the further development of Big Data research.
—Dr. Yi Pan, Georgia State University