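Back in 2008, I only knew the mainstream commercial and open-source databases: DB2, Oracle, MS SQL Server, Sybase, MySQL, PostgreSQL, Firebird and the like. Then, while reading around on the web, I suddenly ran into a whole school of new names: SQLite, Memcached, FastDB, MongoDB, Solr, Redis, HBase, Cassandra, Teradata, Hive, CouchDB and so on. It left me rather confused.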
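Let me try to sort things out step by step. First, databases divide into relational databases (relational DBMS) and non-relational (NoSQL) databases.
>> Relational databases (relational DBMS) include:
DB2, Oracle, MS SQL Server, Sybase, MySQL, PostgreSQL, Firebird, and also SQLite and Teradata.
>> Non-relational (NoSQL) databases include:
Memcached, MongoDB, Solr, Redis, HBase, Cassandra, CouchDB, Neo4j, Riak
Hive and FastDB do not fit either group cleanly: Hive is a SQL-like data warehouse layer on top of Hadoop, and FastDB is a main-memory database with its own SQL-like query language.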
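A simple comparison of use cases:
Strengths of relational databases: structured data, normalized models, ACID transactions.
Strengths of NoSQL: performance, scalability, flexible schemas and analytics capabilities. It has the edge in applications where:
a. the stored data is semi-structured or loosely structured in nature;
b. a certain level of performance and scalability is required;
c. the applications that access the data can live with eventual consistency.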
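To make the "ACID transactions" point concrete, here is a minimal sketch using Python's standard sqlite3 module. The accounts table, the names and the transfer helper are all made up for illustration. The second transfer fails halfway through, and the transaction rolls the first half back.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            # Credit first, so a failed debit leaves something to roll back.
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance.
        return False

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # both updates are applied
failed = transfer(conn, "alice", "bob", 500)  # debit fails, credit is undone
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)
```

After the failed transfer, bob's balance does not keep the 500 that was credited before the error. That all-or-nothing behaviour is the "A" in ACID.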
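Capabilities that NoSQL (non-relational) databases typically support:
a. Schema flexibility
b. Shared-nothing architecture
c. Sharding as part of the data storage model
d. Asynchronous replication
e. BASE instead of ACID transactions.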
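Two of these capabilities, schema flexibility and sharding, can be sketched in a few lines of Python. This is a toy illustration and not how any particular product does it. The ShardedStore class, the choice of SHA-256 for routing and the fixed shard count are all assumptions made for the example.

```python
import hashlib

class ShardedStore:
    """Toy key-value store that spreads keys over a fixed number of shards."""

    def __init__(self, shard_count):
        # Each shard is an independent dict, standing in for a separate node.
        self.shards = [{} for _ in range(shard_count)]

    def shard_for(self, key):
        # A stable hash (unlike Python's per-process hash()) keeps routing repeatable.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.shards)

    def put(self, key, document):
        self.shards[self.shard_for(key)][key] = document

    def get(self, key):
        return self.shards[self.shard_for(key)].get(key)

store = ShardedStore(shard_count=4)
# Schema flexibility: the two documents do not share the same set of fields.
store.put("user:1", {"name": "Ann"})
store.put("user:2", {"name": "Bob", "tags": ["admin"], "age": 30})
print(store.get("user:2"), store.shard_for("user:2"))
```

Real systems also have to handle changing the number of shards and replicating each shard. Simple modulo routing like this would move most keys whenever shard_count changes.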
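Further classification:
Second, databases can also be divided into disk-based databases and in-memory databases.
Disk-based databases include
>> Relational DBMS: all of them (SQLite included; it stores data in a file by default, though it can also run purely in memory)
>> NoSQL: MongoDB
In-memory databases include
>> Memcached, FastDB, Redis (Redis can optionally persist its data to disk)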
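The line between disk-based and in-memory is not always sharp. SQLite, for instance, can run either way. The sketch below uses Python's standard sqlite3 module; the notes table and the write_and_reopen helper are made up for illustration. It writes one row, closes the connection, reopens it and counts what survived.

```python
import os
import sqlite3
import tempfile

def write_and_reopen(path):
    """Write one row to the database at `path`, reopen it, and count the rows."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS notes (body TEXT)")
    conn.execute("INSERT INTO notes VALUES ('hello')")
    conn.commit()
    conn.close()

    # A second connection sees only what was actually persisted.
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS notes (body TEXT)")
    count = conn.execute("SELECT COUNT(*) FROM notes").fetchone()[0]
    conn.close()
    return count

with tempfile.TemporaryDirectory() as tmp:
    disk_count = write_and_reopen(os.path.join(tmp, "notes.db"))

# ":memory:" creates a fresh private database for every connection.
memory_count = write_and_reopen(":memory:")
print(disk_count, memory_count)
```

The file-backed database still has its row after reopening, while the in-memory one starts empty again.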
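Further classification:
By implementation, NoSQL databases are further divided into the categories listed in the following InfoQ survey: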
http://www.infoq.com/research/nosql-databases?utm_source=infoqresearch&utm_campaign=lr-homepage
NoSQL Database Adoption Trends
by Srini Penchikala on Jul 23, 2013
UPDATE Aug 08 2013: The following new NoSQL database options were added today, after user request and feedback: GridGain, GigaSpaces, Tibco, and MarkLogic.
UPDATE Jul 25 2013: The following options were added today, after user request: Oracle Coherence, Terracotta BigMemory, Couchbase, and Oracle NoSQL Database.
NoSQL databases have been getting a lot of attention over the last few years for their performance, scalability, schema flexibility and analytics capabilities. While relational databases are still a good choice for certain use cases - like structured data and applications that require ACID transactions - NoSQL databases are better suited for use cases where:
· The data stored is semi-structured or unstructured in nature
· The applications that access this data require a certain level of performance and scalability
· The applications that access this data are ok with eventual consistency
Non-relational databases typically support the following capabilities:
· Schema flexibility
· Shared nothing architecture
· Sharding as part of the data storage model
· Asynchronous replication
· BASE instead of ACID Transactions
InfoQ would like to learn what NoSQL databases you are currently using or planning on using in your applications.
Document Databases
· MongoDB: MongoDB is an open-source document oriented database.
· CouchDB: Apache CouchDB is a database that uses JSON for documents, JavaScript for MapReduce queries, and HTTP for an API.
· Couchbase: NoSQL document database based on JSON model.
· RavenDB: RavenDB is a document-oriented database based on .NET language.
· MarkLogic: MarkLogic NoSQL database is used to store XML-based, document-centric information. It supports schema flexibility.
· Other Document Database
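To give a feel for the document model and for MapReduce-style queries over JSON documents, here is a minimal sketch in plain Python. It is not the CouchDB API (CouchDB views are written in JavaScript), and the sample documents, map_doc and reduce_values are invented for illustration.

```python
import json
from collections import defaultdict

# Documents in one collection need not share the same fields.
raw_docs = [
    '{"type": "post", "author": "ann", "words": 120}',
    '{"type": "post", "author": "bob", "words": 80}',
    '{"type": "post", "author": "ann", "words": 200, "draft": true}',
    '{"type": "comment", "author": "bob"}',
]

def map_doc(doc):
    """Map step: emit (key, value) pairs for one document."""
    if doc.get("type") == "post":
        yield doc["author"], doc["words"]

def reduce_values(values):
    """Reduce step: combine all values emitted for one key."""
    return sum(values)

def map_reduce(docs):
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_doc(doc):
            grouped[key].append(value)
    return {key: reduce_values(values) for key, values in grouped.items()}

words_per_author = map_reduce(json.loads(raw) for raw in raw_docs)
print(words_per_author)
```

The comment document is skipped by the map step, and the extra "draft" field on one post needs no schema change.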
Graph Databases
· Neo4j: Neo4j is a property graph database; supports ACID transactions.
· InfiniteGraph: Graph database used to persist and traverse relationships between objects; supports distributed data stores.
· AllegroGraph: AllegroGraph is a graph database that uses memory utilization in combination with disk-based storage for scalability, supports SPARQL, RDFS++, and Prolog reasoning.
· Other Graph Database
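The property graph idea (nodes and typed relationships, both of which can carry properties) and the kind of traversal a graph database is optimized for can be sketched as follows. This is a toy in-memory model, not the API of Neo4j or any other product listed here. For brevity only the nodes carry properties, and all names and data are hypothetical.

```python
from collections import deque

# Nodes carry properties; edges are (source, relationship type, target).
nodes = {
    "ann": {"label": "Person", "age": 31},
    "bob": {"label": "Person", "age": 27},
    "cat": {"label": "Person", "age": 45},
    "dan": {"label": "Person", "age": 38},
}
edges = [
    ("ann", "KNOWS", "bob"),
    ("bob", "KNOWS", "cat"),
    ("cat", "KNOWS", "dan"),
    ("ann", "WORKS_WITH", "dan"),
]

def neighbours(node, rel_type):
    """Targets of outgoing edges of the given type."""
    return [dst for src, rel, dst in edges if src == node and rel == rel_type]

def reachable(start, rel_type, max_hops):
    """Breadth-first traversal: nodes within max_hops along rel_type edges."""
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in neighbours(node, rel_type):
            if nxt not in seen:
                seen.add(nxt)
                found.append(nxt)
                queue.append((nxt, hops + 1))
    return found

friends_of_friends = reachable("ann", "KNOWS", max_hops=2)
print(friends_of_friends)
```

The same "friends of friends" question in a relational schema would need a self-join per hop, which is the usual argument for a dedicated graph store.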
Key Value Data Stores
· Riak: Riak is an open source, distributed key value database, supports data replication and fault-tolerance.
· Redis: Redis is an open source key-value store. Supports master-slave replication, transactions, Pub/Sub, Lua scripting, Keys with a limited time-to-live.
· Dynamo: Dynamo is a key-value distributed data store. It is directly implemented as Amazon DynamoDB; used in Amazon S3 product.
· Oracle NoSQL Database: Key-value NoSQL database from Oracle. It supports ACID transactions and JSON.
· Voldemort: Distributed key-value storage system with the data replication and partitioning.
· Aerospike: Aerospike database is a key-value store; supports hybrid memory architecture and data integrity with strong or tunable consistency.
· Other Key Value Data Store
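The key-value model, together with the "keys with a limited time-to-live" feature mentioned for Redis, can be illustrated with a toy store in plain Python. This is not the Redis API. The TTLStore class and its injectable clock are assumptions made so that the example runs deterministically.

```python
import time

class TTLStore:
    """Toy in-memory key-value store whose keys can expire."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (value, expiry time or None)

    def set(self, key, value, ttl=None):
        expires_at = None if ttl is None else self._clock() + ttl
        self._data[key] = (value, expires_at)

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        value, expires_at = item
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]  # lazy expiry: drop the key when it is next read
            return None
        return value

# A fake clock lets the example advance time without sleeping.
now = [0.0]
store = TTLStore(clock=lambda: now[0])
store.set("session", "abc", ttl=10)
store.set("config", "on")  # no TTL: never expires

before = store.get("session")
now[0] = 11.0
after = store.get("session")
print(before, after, store.get("config"))
```

This sketch only expires keys lazily, when they are read. A production store would normally also have to reclaim expired keys that are never read again.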
Columnar Databases
· Cassandra: Cassandra is a column database that supports data replication across multiple data centers. Its data model offers column indexes, log-structured updates, support for denormalization, materialized views, and built-in caching.
· HBase: Apache HBase is an open-source, distributed, versioned, column-oriented store modeled after Google's Bigtable. It provides Bigtable-like capabilities on top of Hadoop and HDFS.
· Amazon SimpleDB: Amazon SimpleDB is a non-relational data store that offloads the work of database administration. Developers store and query data items using web services requests.
· Apache Accumulo: Apache Accumulo is a sorted, distributed key/value data store based on Google's BigTable design and built on top of Apache Hadoop, Zookeeper, and Thrift technologies.
· Hypertable: Hypertable is an open source, scalable database, also modeled after Bigtable; supports sharding.
· Azure Tables: Windows Azure Table Storage Service offers NoSQL capabilities for applications that require storage of large amounts of unstructured data. Tables can auto-scale to store up to several terabytes of data. They are accessible via REST and managed APIs.
· Other Columnar Database
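The Bigtable-style model behind several of these stores (rows sorted by key, sparse columns, multiple timestamped versions per cell) can be sketched as a toy in-memory table. This is not the HBase or Accumulo API. VersionedTable, its methods and the sample keys are hypothetical.

```python
from collections import defaultdict

class VersionedTable:
    """Toy wide-column table: row key -> column name -> {timestamp: value}."""

    def __init__(self):
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, value, timestamp):
        self._rows[row][column][timestamp] = value

    def get(self, row, column, at=None):
        """Return the newest value, or the newest one written at or before `at`."""
        versions = self._rows.get(row, {}).get(column, {})
        candidates = [ts for ts in versions if at is None or ts <= at]
        return versions[max(candidates)] if candidates else None

    def scan(self, start_row, stop_row):
        """Return row keys in sorted order within [start_row, stop_row)."""
        return [r for r in sorted(self._rows) if start_row <= r < stop_row]

table = VersionedTable()
table.put("user#001", "info:name", "Ann", timestamp=1)
table.put("user#001", "info:name", "Ann B.", timestamp=5)
table.put("user#002", "info:city", "Oslo", timestamp=2)  # rows may have different columns
table.put("video#001", "meta:title", "Intro", timestamp=3)

latest = table.get("user#001", "info:name")
earlier = table.get("user#001", "info:name", at=3)
user_rows = table.scan("user#", "user$")
print(latest, earlier, user_rows)
```

Because rows are kept in key order, a prefix range scan such as "all user rows" reads one contiguous range. This is why row key design matters so much in these systems.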
In-Memory Data Grids
· Hazelcast: Hazelcast CE is an open source data distribution platform. It allows the developers to share and partition the data across the database cluster.
· Oracle Coherence: Oracle's in-memory data grid solution that provides fast access to frequently used data. Coherence supports event capabilities and dynamic partitioning of data.
· Terracotta BigMemory: Distributed in-memory management solution from Terracotta. The product includes an Ehcache interface, Terracotta Management Console and BigMemory-Hadoop Connector (early access).
· GemFire: VMware vFabric GemFire is a distributed data management platform and provides elastic in-memory data management, replication, partitioning, data-aware routing, and continuous querying.
· Infinispan: Infinispan is a Java based open source key/value NoSQL datastore and distributed data grid platform. It supports transactions and peer-to-peer as well as client/server architecture.
· GridGain: Distributed, object-based, in-memory, SQL+NoSQL key-value database. Supports ACID transactions.
· GigaSpaces: GigaSpaces in-memory data grid (the Space) serves as the system of record for the applications and supports a variety of caching scenarios.
· Tibco: ActiveSpaces product from Tibco provides an infrastructure to create virtual data caches from the aggregate memory of participating nodes in the cluster and to scale as nodes join and leave.
· Other In-Memory Data Grid
(Chinese translation: "The most commonly used NoSQL databases of 2013 at a glance", http://blog.chedushi.com/archives/7306)
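>> Document databases
a. MongoDB: open source, document-oriented, and currently the most popular NoSQL database.
b. CouchDB: Apache CouchDB is a document database that uses JSON for documents, JavaScript for MapReduce queries, and HTTP for its API.
c. Couchbase: a NoSQL document database based on the JSON model.
d. RavenDB: RavenDB is a document-oriented database built on .NET.
e. MarkLogic: the MarkLogic NoSQL database stores XML-based, document-centric information and supports flexible schemas.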
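>> Graph databases
a. Neo4j: Neo4j is a graph database; it supports ACID transactions (atomicity, consistency, isolation, durability).
b. InfiniteGraph: a graph database used to persist and traverse relationships between objects; supports distributed data stores.
c. AllegroGraph: AllegroGraph is a graph database that combines memory with disk-based storage for high scalability, and supports SPARQL, RDFS++ and Prolog reasoning.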
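>> Key-value data stores
a. Riak: Riak is an open-source, distributed key-value database that supports data replication and fault tolerance.
b. Redis: Redis is an open-source key-value store. It supports master-slave replication, transactions, Pub/Sub, Lua scripting, and keys with a limited time-to-live.
c. Dynamo: Dynamo is a distributed key-value data store. It is directly implemented as Amazon DynamoDB and is used in Amazon's S3 product.
d. Oracle NoSQL Database: a key-value NoSQL database from Oracle. It supports ACID transactions (atomicity, consistency, isolation, durability) and JSON.
e. Voldemort: a distributed key-value storage system with data replication and partitioning.
f. Aerospike: the Aerospike database is a key-value store; it supports a hybrid memory architecture and protects data integrity with strong or tunable consistency.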
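>> Columnar databases
a. Cassandra: Cassandra is a column database that supports data replication across multiple data centers. Its data model offers column indexes, log-structured updates, support for denormalization, materialized views, and built-in caching.
b. HBase: Apache HBase is an open-source, distributed, column-oriented store modeled after Google's Bigtable. It provides Bigtable-like capabilities on top of Hadoop and HDFS.
c. Amazon SimpleDB: Amazon SimpleDB is a non-relational data store that offloads the work of database administration. Developers store and query data items using web service requests.
d. Apache Accumulo: Apache Accumulo is a sorted, distributed key-value data store based on Google's BigTable design and built on top of Apache Hadoop, Zookeeper and Thrift.
e. Hypertable: Hypertable is an open-source, scalable database, also modeled after Bigtable; supports sharding.
f. Azure Tables: Windows Azure Table Storage Service offers NoSQL capabilities for applications that need to store large amounts of unstructured data. Tables can scale automatically to the terabyte range and are accessible via REST and managed APIs.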
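>> In-memory data grids
a. Hazelcast: Hazelcast CE is an open-source data distribution platform. It lets developers share and partition data across the database cluster.
b. Oracle Coherence: Oracle's in-memory data grid solution, which provides fast access to frequently used data. Coherence supports event capabilities and dynamic partitioning of data.
c. Terracotta BigMemory: a distributed in-memory management solution from Terracotta. The product includes an Ehcache interface, the Terracotta Management Console and a BigMemory-Hadoop connector.
d. GemFire: VMware vFabric GemFire is a distributed data management platform that provides in-memory data management, replication, partitioning, data-aware routing and continuous querying.
e. Infinispan: Infinispan is a Java-based open-source key-value NoSQL data store and distributed data grid platform. It supports transactions as well as peer-to-peer and client/server architectures.
f. GridGain: a distributed, object-based, in-memory, SQL+NoSQL key-value database. Supports ACID transactions.
g. GigaSpaces: the GigaSpaces in-memory data grid can serve as the system of record for applications and supports a variety of caching scenarios.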