
An Exploration of ACID and BASE NoSQL Databases and Challenges to the CAP Theorem

Mohamed Sayed Ghoneim
Department of Computer Science and Engineering
The American University in Cairo
Cairo, Egypt

Abstract—This paper is a short guide that helps NoSQL newcomers quickly understand the potential of some of the major NoSQL database systems. It discusses the 4 major types of NoSQL systems: wide-column stores, graph databases, key-value databases, and key-document databases. It then discusses 3 successful and widespread NoSQL implementations: HBase, Apache Cassandra, and Neo4j. The comparison between the implementations focuses mainly on the components of the CAP theorem, together with an analysis that identifies each database as ACID or BASE.

Keywords—NoSQL; Comparison; CAP; ACID; BASE; Graph; Column Store

I. INTRODUCTION

For developers who are used to working on small-scale applications, whose user base does not exceed hundreds or thousands of users and whose data amounts to thousands or millions of records, the idea that some databases are not ACID can be hard to grasp. ACID (Atomicity, Consistency, Isolation, Durability) databases are mostly SQL databases, and they have been around for decades. It turns out that in large-scale computing, when working with terabytes and petabytes of data, the programmer has to give up some of these requirements in favor of others. Applications at that scale must be scalable, which may mean adopting BASE (Basic Availability, Soft State, Eventual Consistency) semantics instead. NoSQL databases provide scalability out of the box and come in different flavors: key-value, wide-column, document, and graph databases. These flavors prioritize nonfunctional requirements differently: some favor consistency over availability or vice versa, and some try to be strongly ACID, strongly BASE, or somewhere in between. In the end they are all bound by the CAP theorem. The CAP (Consistency, Availability, Partition tolerance) theorem states that a distributed system can achieve only 2 of the 3 CAP guarantees at the same time [2].

II. PROPER INTRODUCTION TO NOSQL

A. Definition

NoSQL means "not only SQL". NoSQL databases do not follow the relational database model. They represent a newer paradigm for storing and retrieving data that scales easily, unlike the familiar RDBMS. Unlike conventional SQL databases, NoSQL databases are schema-less, which is part of what makes them scalable. NoSQL databases are very common nowadays for several reasons. Unstructured data has become far more widespread than structured data, and schema-less databases store it efficiently (Figure 1). They are distributed and scalable by design, which makes them a very popular choice for cloud-computing vendors [5]. NoSQL databases fit most of the promises of cloud computing well [1]. They also shorten development time: because they are schema-less, the developer does not need to be an RDBMS expert to create data stores, which makes development more pleasant [4].

Fig. 1. Unstructured vs. structured data from 2000 to 2011

Nothing is perfect. Although NoSQL solves many of the toughest problems faced when developing large applications, it falls short in other areas, such as writing complex queries, because it is schema-less. Before digging further into NoSQL drawbacks, it helps to define what ACID and BASE mean. Both are acronyms that describe properties of database transactions (logical operations performed on the data).

B. ACID stands for [2]:
• Atomic: Either the entire transaction succeeds, or everything is rolled back.
• Consistent: The database is always in a consistent state.
• Isolated: Transactions execute in isolation so that they do not interfere with one another, which solves the concurrency problem.
• Durable: Committed transactions persist despite system failures.
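To make atomicity concrete, the following is a minimal sketch using Python's built-in sqlite3 module. The accounts table, the transfer function, and the balances are hypothetical and serve only as an illustration; the point is that a failed transfer leaves no partial update behind.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst` inside one transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (src,)
            ).fetchone()
            if balance < 0:
                # The debit above is undone by the rollback.
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False


def balances(conn):
    """Return the current balances as a dictionary."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))


# Hypothetical starting data: two accounts in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("a", 100), ("b", 50)])
conn.commit()
```

Calling transfer(conn, "a", "b", 30) moves the money and commits, whereas transfer(conn, "a", "b", 500) fails and both balances stay exactly as they were before the call.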

C. BASE stands for [2]:
• Basic Availability: The system is always available, but it might be in an inconsistent state.
• Soft State: The system state may keep changing even when the system is idle, as updates propagate.
• Eventual Consistency: The system eventually becomes consistent once the transactions have propagated everywhere. Relaxing consistency in this way makes availability easier to achieve.

D. CAP Theorem

ACID and BASE databases are both subject to the CAP theorem, which states that a distributed system cannot simultaneously provide all 3 of these non-functional requirements:
• Consistency: All nodes have the same, latest data.
• Availability: All users can access the data at all times.
• Partition Tolerance: The system keeps working when failures, such as lost network connections between nodes, split it into parts that cannot communicate.

According to the CAP theorem, no NoSQL system can provide all 3 guarantees at once. This is why many different systems have emerged, each trying to work around this constraint with a different implementation approach. Figure 2 shows what different NoSQL implementations achieve.
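The trade-off between consistency and availability during a partition can be sketched with a toy store of two replicas. This is an illustrative model only, not how any real database is implemented: the class and method names are hypothetical, and the single shared logical clock is a simplification that a real partitioned system would not have.

```python
class Unavailable(Exception):
    """Raised when a CP store refuses a write during a partition."""


class TwoReplicaStore:
    """Toy store with two replicas; `mode` is "CP" or "AP" (hypothetical names)."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.replicas = [{}, {}]  # each replica maps key -> (version, value)
        self.partitioned = False  # True while the replicas cannot communicate
        self.clock = 0            # simplification: one shared logical clock

    def write(self, replica, key, value):
        if self.partitioned and self.mode == "CP":
            # Consistency over availability: refuse rather than diverge.
            raise Unavailable("replicas cannot communicate")
        self.clock += 1
        # During a partition an AP store updates only the reachable replica.
        targets = [self.replicas[replica]] if self.partitioned else self.replicas
        for target in targets:
            target[key] = (self.clock, value)

    def read(self, replica, key):
        return self.replicas[replica][key][1]

    def heal(self):
        """End the partition and reconcile: the highest version wins."""
        self.partitioned = False
        merged = {}
        for rep in self.replicas:
            for key, versioned in rep.items():
                if key not in merged or versioned[0] > merged[key][0]:
                    merged[key] = versioned
        for rep in self.replicas:
            rep.update(merged)
```

In "CP" mode a write during a partition raises Unavailable, so both replicas keep agreeing. In "AP" mode the write succeeds on one replica, the two replicas temporarily return different values, and heal() brings them back into agreement, which is the eventual consistency described under BASE.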

Fig. 2. CAP theorem Triangle

III. DIFFERENT NOSQL SYSTEMS

There are 4 major types of NoSQL databases:

A. Column Oriented Databases

Columnar databases store data column by column, unlike traditional databases, which store data row by row. As an example, imagine this is the database [2]:

EmployeeID   FirstName   LastName   Salary
E1           Mohamed     Ghoneim    7000
E2           Mohamed     Mohsen     8000
E3           Mohamed     Hammad     8000

Fig. 3. Column Oriented Database Example

In an RDBMS the data is stored like this:

E1,Mohamed,Ghoneim,7000
E2,Mohamed,Mohsen,8000
E3,Mohamed,Hammad,8000

But in a columnar database it is stored like this:

E1,E2,E3
Mohamed,Mohamed,Mohamed
Ghoneim,Mohsen,Hammad
7000,8000,8000

With columnar databases it is easy to add new columns to the data. This suits unstructured data well, where the developer wants to add information that did not appear before. Columnar databases excel at aggregates, because they group the data of each column together, which makes computing a sum, average, minimum, or maximum straightforward [2]. Columnar data also compresses easily, because of the way values are grouped and because values in the same column are often repeated [3]. The example above can be compressed like this:

E1,E2,E3
3;Mohamed
Ghoneim,Mohsen,Hammad
7000,2;8000

Such compression techniques make columnar databases a good choice for data warehouses, which typically store large amounts of data and want to store it efficiently at the lowest cost.

Examples: Apache Cassandra, HBase, and Google BigTable.

B. Graph Store Databases

A graph database looks like a network in which nodes are the main parts and edges represent the relationships. It has 3 constructs [2]:
• Node: an independent object that is the main object in a graph database.
• Edge: a relationship between nodes.
• Property: an attribute of a node.
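The three graph constructs (node, edge, property) can be sketched with plain Python dictionaries and tuples. This is only an illustration and not how a graph database stores data internally. The node values follow the graph of Figure 4, while the neighbors helper and the HAS_MEMBER label (shorthand for "has as a member") are assumptions made for the example.

```python
# Nodes: id -> properties. Different nodes may carry different properties.
nodes = {
    1: {"Name": "Alice", "Age": 18},
    2: {"Name": "Bob", "Age": 22},
    3: {"Type": "Group", "Name": "Chess"},
}

# Edges: (source id, relationship, target id).
edges = [
    (1, "KNOWS", 2),
    (2, "IS_MEMBER", 3),
    (3, "HAS_MEMBER", 1),  # shorthand label chosen for this sketch
]


def neighbors(node_id, relationship):
    """Return the properties of the nodes reached from node_id via relationship."""
    return [
        nodes[dst]
        for src, rel, dst in edges
        if src == node_id and rel == relationship
    ]
```

For instance, neighbors(1, "KNOWS") follows the KNOWS edge from the first node and returns the properties of the second node. Following a relationship is a direct lookup over the edges, with no join between tables.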

Fig. 4. Graph Database Example

Figure 4 is an example of a graph database. There are 3 nodes. The first node, with Id: 1, has 2 attributes, Name: Alice and Age: 18. The second node, Id: 2, has 2 attributes, Name: Bob and Age: 22. The third node, Id: 3, has 2 attributes, Type: Group and Name: Chess. There are several relationships between the nodes: for example, the first node KNOWS* the second node, the second node IS_MEMBER of the third node, and the third node HAS AS A MEMBER the first node. It is also clear that different nodes can have different attributes and that there is no fixed schema for the nodes.

Unlike other NoSQL databases, graph databases handle relationships easily, because relationships are part of their design. Graph databases are mainly used for relationship-heavy data. Another advantage of graph databases is the easy representation, manipulation, and retrieval of relationships between nodes [2]. It is also common practice to store the data in a key-document database and the relationships in a graph database [2].

Examples: Neo4j, Google Pregel, and Twitter FlockDB.

C. Key-Value NoSQL Databases

Key-value databases work like maps or associative arrays. Their functionality is very basic: they store a key associated with a value, and the basic operation is to look up a key. A store can hold multiple values under the same key, which allows it to emulate a relational database: to retrieve a row, look up the values associated with its key. This leads to a lot of redundancy and data duplication [3]. Key-value databases are great as in-memory caches; they essentially work as a larger-scale hash table that is saved to persistent storage. It is also possible to query a range of keys and get useful results.

Examples: Redis, Memcached, Amazon's SimpleDB, and Tokyo Cabinet.

D. Document Oriented Databases

Document-oriented databases are also known as key-document databases. They are similar to key-value databases, except that they store a whole document instead of a single value. The document type might be XML, JSON, or BLOB. A document can be seen as a row in an RDBMS, where each attribute in the document corresponds to a cell in the RDBMS table. Unlike in an RDBMS, a document need not follow a schema; each document may hold different types of data in different formats. This adds great flexibility when dealing with unstructured data.
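A minimal sketch of the key-document idea, using plain Python dictionaries in place of a real document database: the keys, the find and fields_in_use helpers, and the children entries are hypothetical placeholders, while the names, addresses, and hobby follow the employee documents of Figure 5.

```python
# Key -> document. The two documents deliberately have different fields.
documents = {
    "emp1": {"name": "Bob", "address": "5 Oak St.", "hobby": "sailing"},
    "emp2": {
        "name": "Jonathan",
        "address": "15 Wanamassa Point Road",
        # Placeholder values; the real children are not known here.
        "children": [{"name": "child-1"}, {"name": "child-2"}],
    },
}


def find(field, value):
    """Return keys of documents whose `field` equals `value`.

    Documents that lack the field are simply skipped, so no schema is needed.
    """
    return [key for key, doc in documents.items() if doc.get(field) == value]


def fields_in_use():
    """Collect every field name that appears in at least one document."""
    return sorted({field for doc in documents.values() for field in doc})
```

Neither document has to declare the fields it lacks: find("hobby", "sailing") matches only the first document, and the second document is skipped rather than causing an error.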

Fig. 5. Document Database Example: (a,b)

*The words marked by an underline are relationships.

In Figure 5, there are 2 example documents from an employee database. Figure 5a contains a description of Bob, whose address is 5 Oak St. and whose hobby is sailing, while Figure 5b contains a description of Jonathan, whose address is 15 Wanamassa Point Road and who has no hobby. On the other hand, it contains a description of his children. It is clear that document databases have no schema; it is also unnecessary to mention every missing property in a document. For example, Figure 5b has no hobby, and Figure 5a has no children.

The main advantage of document-based databases is being schema-less (loosely defined), which adds a great deal of flexibility when dealing with unstructured data. Searching in a document-based database requires only a keyword, which is automatically checked against every property of every document in the database. This makes searching for a value across different entity types trivial, unlike in an RDBMS [2]. Indexes can be added to key-document databases for faster retrieval of information.

Examples: MongoDB, CouchDB.

IV. NOSQL USE CASES

The following use cases give some examples of what NoSQL databases can provide. It should be clear by now that each NoSQL system has its advantages, but none of them has it all. Each succeeds in a different domain and serves different needs. NoSQL is also not meant to replace conventional SQL databases. It is meant to coexist with them, with NoSQL databases succeeding in areas where SQL databases would struggle. The main idea is that a company may use several NoSQL or SQL systems at once, each satisfying a different requirement of the company. The next few paragraphs take a closer look at HBase, Apache Cassandra, and Neo4j.

A. HBase

Apache HBase is the Hadoop database: a distributed, scalable, big table store [6]. It is modeled after Google BigTable, which is described in [7]; the main difference is that HBase is open source. HBase is a wide-column NoSQL database.
According to the CAP theorem, HBase is partition tolerant and consistent but not highly available, and here is why:
• Partition Tolerance: HBase is built on the Hadoop file system, HDFS, which is highly reliable and fault tolerant [7]. HDFS replication copies each data block 3 times to guarantee fault tolerance. The 3 copies are placed as follows: 2 replicas on the same rack to improve performance, and the 3rd replica on a different rack to guarantee partition tolerance [7].
• Consistency: Because HBase is built on Hadoop, its data replication behavior makes strong consistency easier to achieve. When committing a write, HBase waits for the Hadoop ACK indicating whether the data was written successfully. If HBase detects the failure of a single HDFS replica, it automatically obtains new blocks to store the data [9].
• Availability: Because HBase is strongly consistent and operates mainly as a distributed system, high availability is hard to achieve. Upon a system failure, the consistency constraint forces the system to pause while it heals before it continues. There are many scenarios in which some nodes fail. The authors of [9] observed that HBase regions* went offline at different points in time due to different causes. On the other hand, the main reason for cluster failure was system maintenance, because servers took minutes to restart [9].

*Region: A region comprises a subset of a table's rows [8].

B. Apache Cassandra

"Apache Cassandra is an open source, distributed, decentralized, elastically scalable, highly available, fault-tolerant, tuneably consistent, column-oriented database that bases its distribution design on Amazon’s Dynamo and its data model on Google’s BigTable. Created at Facebook, it is now used at some of the most popular sites on the Web."

According to the CAP theorem, Apache Cassandra is partition tolerant and highly available but not strongly consistent, and here is why:
• Partition Tolerance: Without going into the details of how Cassandra works internally, it is enough to say that each data block is replicated to N nodes [10]. Cassandra provides different replication policies: "Rack Aware", "Rack Unaware", and "Datacenter Aware" [10].
• Consistency: According to [3], Cassandra is eventually consistent, or, as the source prefers to call it, tuneably consistent. It is tuneably consistent in the sense that the developer can decide how many replicas of a block of data must be written before the data is made available (for example, each data block may be replicated 5 times, but the developer decides that it becomes available once the first 2 replicas have been written) [3]. With only 1 replica required, the system is very highly available but may be highly inconsistent at some stages, and vice versa. In all cases the system eventually becomes consistent again (eventual consistency).
• Availability: As noted above, availability and consistency are tunable.

C. Neo4j

Neo4j is a graph database built by Neo Technology. Neo4j is described as an intuitive, reliable, durable, fast, massively scalable, highly available, expressive, and simple database [11]. Neo4j is ACID in nature, because it can operate on a single node. Its basic idea is a master-slave architecture. At any point in time the master and the slaves should each contain a copy of the whole graph [12]. Whenever a new transaction arrives, it is executed by the master and then propagated to all the slaves. The main function of the slaves is election: there is always a slave ready to replace the master if it fails.

According to the CAP theorem, and assuming the system operates on multiple nodes, Neo4j is partition tolerant and highly available, but its consistency is not strong (although this depends heavily on the developer's choices), and here is why:
• Partition Tolerance: Neo4j is heavily redundant: a copy of the whole graph is stored on multiple nodes, which provides strong fault tolerance. If the system must tolerate 5 node failures, it should contain at least 6 nodes [12].
• Availability: The system remains available as long as at least one node is still functioning, which provides strong availability. When only a single node is functioning, it behaves like a normal RDBMS, so the database can be considered ACID in that case [13].
• Consistency: Consistency across multiple nodes is not guaranteed. In a typical scenario, the transaction is executed on the master in an ACID manner and then propagated to the other nodes in the system [13].

V. CONCLUSION

Although RDBMSs work well for small-scale applications, they struggle with large databases (millions or billions of records). NoSQL databases can be the solution for large data, but unfortunately there is no complete and perfect NoSQL system that solves all problems. Different systems are available, and each succeeds in a different direction. It is the developer's responsibility to decide which system to use according to his or her own use case.

REFERENCES

[1] Han, Jing; Song, Meina; Song, Junde, "A Novel Solution of Distributed Memory NoSQL Database for Cloud Computing," Computer and Information Science (ICIS), 2011 IEEE/ACIS 10th International Conference on, pp. 351-355, 16-18 May 2011.
[2] G. Vaish, Getting Started with NoSQL, 1st ed. Birmingham, UK: Packt Pub., 2013, p. 11, p. 26-47, p. 117-118.
[3] E. Hewitt, Cassandra: The Definitive Guide, 1st ed. Sebastopol, CA: O'Reilly, 2011, p. 265, 256.
[4] G. Vaish, Getting Started with NoSQL, 1st ed. Birmingham, UK: Packt Pub., 2013, p. 11.
[5] Han, Jing; Song, Meina; Song, Junde, "A Novel Solution of Distributed Memory NoSQL Database for Cloud Computing," Computer and Information Science (ICIS), 2011 IEEE/ACIS 10th International Conference on, pp. 351-355, 16-18 May 2011.
[6] HBase. Home. https://hbase.apache.org/
[7] Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber, "Bigtable: A Distributed Storage System for Structured Data," Seventh Symposium on Operating System Design and Implementation (OSDI), Seattle, WA: Usenix Association, November 2006.
[8] Vora, M.N., "Hadoop-HBase for large-scale data," Computer Science and Network Technology (ICCSNT), 2011 International Conference on, vol. 1, pp. 601-605, 24-26 Dec. 2011. doi: 10.1109/ICCSNT.2011.6182030
[9] Dhruba Borthakur, Jonathan Gray, Joydeep Sen Sarma, Kannan Muthukkaruppan, Nicolas Spiegelberg, Hairong Kuang, Karthik Ranganathan, Dmytro Molkov, Aravind Menon, Samuel Rash, Rodrigo Schmidt, and Amitanand Aiyer, "Apache Hadoop goes realtime at Facebook," in Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data (SIGMOD '11). ACM, New York, NY, USA, 2011, pp. 1071-1080. DOI=10.1145/1989323.1989438 http://doi.acm.org/10.1145/1989323.1989438
[10] Lakshman, Avinash; Malik, Prashant, “Cassandra, a decentralized structured storage system,” http://www.cs.cornell.edu/projects/ladis2009/papers/lakshmanladis2009.pdf, Cornell University, (unpublished).
[11] Neo4j. Home. http://neo4j.org, 2014.
[12] D. Montag, Understanding Neo4j Scalability, 1st ed. 2013. http://info.neotechnology.com/rs/neotechnology/images/Understanding%20Neo4j%20Scalability(2).pdf
[13] Neo4j. Documentation. http://docs.neo4j.org/chunked/stable/hahow.html