About this book

"It’s not easy to find such a generous book on big data and databases. Fortunately, this book is the one." Feng Yu. Computing Reviews. June 28, 2016.

This is a book for enterprise architects, database administrators, and developers who need to understand the latest developments in database technologies. It is the book to help you choose the correct database technology at a time when concepts such as Big Data, NoSQL and NewSQL are making what used to be an easy choice into a complex decision with significant implications.

The relational database (RDBMS) model completely dominated database technology for over 20 years. Today this "one size fits all" stability has been disrupted by a relatively recent explosion of new database technologies. These paradigm-busting technologies are powering the "Big Data" and "NoSQL" revolutions, as well as forcing fundamental changes in databases across the board.

Deciding to use a relational database was once truly a no-brainer, and the various commercial relational databases competed on price, performance, reliability, and ease of use rather than on fundamental architectures. Today we are faced with choices between radically different database technologies. Choosing the right database today is a complex undertaking, with serious economic and technological consequences.

Next Generation Databases demystifies today’s new database technologies. The book describes what each technology was designed to solve. It shows how each technology can be used to solve real-world application and business problems. Most importantly, this book highlights the architectural differences between technologies that are the critical factors to consider when choosing a database platform for new and upcoming projects.

Introduces the new technologies that have revolutionized the database landscape

Describes how each technology can be used to solve specific application or business challenges

Reviews the most popular new wave databases and how they use these new database technologies

Table of Contents

Next Generation Databases

Frontmatter

Chapter 1. Three Database Revolutions

This book is about a third revolution in database technology. The first revolution was driven by the emergence of the electronic computer, and the second revolution by the emergence of the relational database. The third revolution has resulted in an explosion of nonrelational database alternatives driven by the demands of modern applications that require global scope and continuous availability. In this chapter we’ll provide an overview of these three waves of database technologies and discuss the market and technology forces leading to today’s next generation databases.
Guy Harrison

Chapter 2. Google, Big Data, and Hadoop

In the history of computing, nothing has raised the profile of data processing, storage, and analytics as much as the concept of Big Data. We have considered ourselves to be an information-age society since the 1980s, but the concentration of media and popular attention on the role of data in our society has never been greater than in the past few years, thanks to Big Data.
Guy Harrison

Chapter 3. Sharding, Amazon, and the Birth of NoSQL

The last time we saw a major new brand of relational database was around 1995, with the first release of MySQL. In 1995, the World Wide Web in the United States was barely two years old—the Netscape browser had been released only the year before. In terms of computer systems, it was a different era.
Guy Harrison

Chapter 4. Document Databases

A document database is a nonrelational database that stores data as structured documents, usually in XML or JSON formats. The “document database” definition doesn’t imply anything specific beyond the document storage model: document databases are free to implement ACID transactions or other characteristics of a traditional RDBMS, though the dominant document databases provide relatively modest transactional support.
Guy Harrison

Chapter 5. Tables are Not Your Friends: Graph Databases

Proponents of key-value stores, document databases, and relational systems disagree about practically every aspect of database design, but they do agree in one respect: databases are about storing information about “things,” be those things represented by JSON, tables, or binary values. But sometimes it’s the relationships between things, rather than the things themselves, that are of primary interest. This is where graph database systems shine.
Guy Harrison

Chapter 6. Column Databases

Those of us raised in Western cultures have been conditioned to think of data as arranged in rows. The way data is presented in ledgers, tables, spreadsheets, and even in the left to right, top to bottom organization of European languages has programmed us to visualize data in row format. It’s not surprising, therefore, that the first digital files were created with each record represented as a row. But no matter how convenient and familiar this format may be, it is not always the best way to organize data physically.
Guy Harrison

Chapter 7. The End of Disk? SSD and In-Memory Databases

Ever since the birth of the first database systems, database professionals have strived to avoid disk IO at all costs. IO to magnetic disk devices has always been many orders of magnitude slower than memory or CPU access, and the situation has only grown worse as Moore’s law accelerated the performance of CPU and memory while leaving mechanical disk performance behind.
Guy Harrison

The Gory Details

Frontmatter

Chapter 8. Distributed Database Patterns

Administrators of web applications have traditionally had two choices when the application demand exceeds database capacity: scaling up by increasing the power of individual servers, or scaling out by adding more servers. For most of the relational database era, scaling up was the more practical option. Early relational databases did not provide a clustering option, whereas the CPU and memory supplied by a single server were constantly and exponentially increasing in line with Moore’s law. Consequently, scaling out was neither practical nor necessary.
Guy Harrison

Chapter 9. Consistency Models

One of the biggest factors powering the nonrelational database revolution is a desire to escape the restrictions of strict ACID consistency. It’s widely believed that the new breed of nonrelational databases provides only weak or at best eventual consistency, and that the underlying consistency mechanisms are simplistic. This belief represents a fundamental misunderstanding of nonrelational database systems. Nonrelational systems offer a range of consistency guarantees, including strict consistency, albeit at the single-object level. And in fact, there are some complex architectures required to balance an acceptable degree of consistency when we lose the strict and predictable rules provided by the ACID transaction model.
Guy Harrison

Chapter 10. Data Models and Storage

The relational database model provided a strong theoretical foundation for the representation of data that eliminated redundancy and maximized flexibility of data access. Many next-generation database systems, particularly those of the NewSQL variety, continue to embrace the relational model: their innovations generally focus on the underlying physical storage of data.
Guy Harrison

Chapter 11. Languages and Programming Interfaces

Crucial to the dominance of the relational database was the almost universal adoption of the SQL language as the mechanism for querying and modifying data. SQL is not a perfect language, but it has demonstrated sufficient flexibility to meet the needs of both non-programming database users and professional database programmers. Programmers embed SQL in programming languages, while non-programmers use SQL either explicitly within a query tool or implicitly when a BI tool uses SQL under the hood to talk to the database. Prior to the introduction of SQL, most IT departments labored with a backlog of requests for reports; SQL allowed the business user to “self-serve” these requests.
Guy Harrison

Chapter 12. Database of the Future

This book is the story of how a revolution in database technology saw the “one size fits all” traditional relational SQL database give way to a multitude of special-purpose database technologies. In the past 11 chapters, we have reviewed the major categories of next-generation database systems and have taken a deep dive into some of the internal architectures of those systems.
Guy Harrison

Chapter 13. Database Survey

In this appendix I’ve provided a short description for the major database systems covered in this book.
Guy Harrison