29

As far as I can work out, CosmosDB adds the ability to make graph queries using the Gremlin query language. Apart from that, the pricing, marketing etc. all seem the same as DocumentDB. It seems strange that they came up with a new product to add Gremlin when they didn't do the same to add MongoDB support. What are the discernible differences between these two products?

Andrew Liu
Muhammad Rehan Saeed

3 Answers

51

Azure Cosmos DB team member here.

Azure Cosmos DB started as “Project Florence” in 2010 to address developer pain-points faced by large scale applications inside Microsoft. Observing that the challenges of building globally distributed apps are not a problem unique to Microsoft, in 2015 we made the first generation of this technology available to Azure developers in the form of Azure DocumentDB. Since that time, we’ve added new features and introduced significant new capabilities. Azure Cosmos DB is the result. It is the next big leap in globally distributed, at scale, cloud databases. As a part of this release of Azure Cosmos DB, DocumentDB customers, with their data, are automatically Azure Cosmos DB customers. The transition is seamless and they now have access to the new breakthrough system and capabilities offered by Azure Cosmos DB.

In the evolution of Cosmos DB, we have added significant new capabilities since 2015 (when DocumentDB was made generally available), but only a subset of these capabilities was available in DocumentDB. These capabilities are in the areas of the core database engine as well as global distribution, elastic scalability and industry-leading, comprehensive SLAs. Specifically, we have evolved the Cosmos DB database engine to be able to efficiently map all popular data models, type systems and APIs to the underlying data model of Cosmos DB. Developers currently experience this work through support for the Gremlin and Table Storage APIs. And this is just the beginning… We will be adding other popular APIs and newer data models over time, with more advances in performance and storage at global scale.

We have also extended the foundation for global and elastic scalability of throughput and storage. One of the very first manifestations of this is RU/m (https://learn.microsoft.com/en-us/azure/cosmos-db/request-units-per-minute), and we will be announcing more capabilities in these areas. The new capabilities will help our customers save cost across various workloads. We have also made several foundational enhancements to the global distribution subsystem. One of the many developer-facing manifestations of this work is the consistent prefix consistency model (bringing the total to five well-defined consistency models). However, there are many more interesting capabilities we will release as they mature.

It is important to point out that we view Azure Cosmos DB as a constantly evolving database service. Typically, we first validate all new capabilities with the large scale applications inside Microsoft, subsequently expose them to key external customers, and finally, release them to the world.

It is also important to point out that DocumentDB’s SQL dialect has always been just one of the many APIs that the underlying Cosmos DB was capable of supporting. For a developer using a fully managed service like Cosmos DB, the only interface to the service is the set of APIs it exposes. To that end, nothing really changes for a DocumentDB customer. Cosmos DB offers exactly the same SQL API that DocumentDB did. However, now (and in the future) you can get access to other capabilities which were previously not accessible.

Andrew Liu
  • So I can now use Gremlin or the Table Storage SDK to query my documents stored using the DocumentDB SDK right? Is the underlying data agnostic to the method of accessing it? BTW, RU/m makes it difficult to use DocumentDB/CosmosDB where you don't know how much load you will get (Basically 99% of workloads), particularly for [spikes in load](https://stackoverflow.com/questions/36714563/handling-request-units-per-second-rus-s-spikes-in-documentdb), please consider an alternative. – Muhammad Rehan Saeed May 16 '17 at 07:57
  • 1
    The long-term goal is to converge and allow mixed use of all APIs. Today, you can use Gremlin with DocumentDB API - however Tables API and MongoDB API are separate. The RU/m is intended to increase the time range you can spread spikes in workload (it is an improvement over only provisioning RU/s). We are also investigating additional improvements on pricing models - including a storage-optimized model. – Andrew Liu May 17 '17 at 01:06
  • 2
    > "Today, you can use Gremlin with DocumentDB API" That sounds awesome. Is this documented? Or is this a possibility, but not recommended type of thing? – Quang Van May 17 '17 at 04:24
  • RU/m certainly helps over RU/s but us devs still have to write code so we don't get overcharged for RU/m we are not using. I would like an Azure App Services model where I can set a range using sliders and the RU/m change based on load in the system (Essentially the code I have to write myself today). The storage-optimized model sounds interesting, can you elaborate on it? – Muhammad Rehan Saeed May 17 '17 at 07:26
  • 1
    Re: Graph + DocumentDB API - Gremlin can be exposed as an extension library on top of the DocumentDB .NET SDK - which makes it very easy and convenient for creating relationships (edges) between documents. Check out: https://learn.microsoft.com/en-us/azure/cosmos-db/tutorial-develop-graph-dotnet – Andrew Liu May 17 '17 at 18:37
  • Re: RU/m feedback - we understand the desire for auto-scale. FWIW, we already expose many of the primitives to roll your own solution (between webhooks on alerts, and programmatically retrieving metrics and adjusting container throughput). We can try to assist you on a sample, as we know we're currently lacking good docs in this space. As for new offers... the current pricing model offers the best-in-class price for performance when you have a high throughput : storage ratio. In other words, it's optimized for throughput. We're working on an additional offer for low throughput : storage ratios. – Andrew Liu May 17 '17 at 18:40
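
The roll-your-own scaling idea from the comments above (read a consumption metric, then adjust container throughput) can be sketched roughly as follows. This is only an illustration of the decision step, not a Cosmos DB API: the function name `next_throughput`, the 400 RU/s floor, the 10,000 RU/s ceiling, the 100 RU/s step and the 70% utilization target are all assumptions chosen for the example. Fetching the metric and applying the new value through the SDK or REST API are deliberately left out.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up, avoiding floating-point error."""
    return -(-a // b)


def next_throughput(consumed_ru: int,
                    min_ru: int = 400,         # assumed lower bound
                    max_ru: int = 10_000,      # assumed upper bound (your budget cap)
                    target_pct: int = 70,      # assumed target utilization, in percent
                    step: int = 100) -> int:   # assumed provisioning increment
    """Suggest a provisioned RU/s value so that the observed consumption
    sits at or below target_pct of what is provisioned."""
    if consumed_ru < 0:
        raise ValueError("consumed_ru must be non-negative")
    # Throughput needed for the observed load to equal target_pct utilization.
    desired = ceil_div(consumed_ru * 100, target_pct)
    # Round up to the next provisioning increment.
    rounded = ceil_div(desired, step) * step
    # Clamp to the configured range.
    return max(min_ru, min(max_ru, rounded))


if __name__ == "__main__":
    for load in (50, 700, 1500, 20_000):
        print(load, "->", next_throughput(load))
```

A webhook handler or a scheduled job could call a function like this with the latest observed RU/s and then update the container's throughput only when the suggested value differs from the current one.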
6

DocumentDB is one of the APIs for CosmosDB. Others include Table Storage, MongoDB, Gremlin.

Think of CosmosDB as the database platform that handles scaling, throughput, consistency, etc., and DocumentDB as one of the types of databases that run on CosmosDB.

Azure Cosmos DB natively supports multiple data models including documents, key-value, graph, and column-family. The core content-model of Cosmos DB’s database engine is based on atom-record-sequence (ARS). Atoms consist of a small set of primitive types like string, bool, and number. Records are structs composed of these types. Sequences are arrays consisting of atoms, records, or sequences.

The database engine can efficiently translate and project different data models onto the ARS-based data model. The core data model of Cosmos DB is natively accessible from dynamically typed programming languages and can be exposed as-is as JSON.

https://learn.microsoft.com/en-us/azure/cosmos-db/introduction
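
To make the atom-record-sequence description more concrete, here is a minimal sketch of what such a model looks like when expressed as plain JSON-compatible values. The type aliases, the `is_ars` helper and the three sample values are hypothetical and only illustrate the idea; the sketch also assumes that records may nest other records and sequences, as JSON objects do. It says nothing about how the engine stores or indexes data internally.

```python
import json
from typing import Dict, List, Union

# Hypothetical type names: the quoted docs only name the three building blocks.
Atom = Union[str, bool, int, float]
Value = Union[Atom, Dict[str, "Value"], List["Value"]]


def is_ars(value) -> bool:
    """Return True if value is built only from atoms, records and sequences."""
    if isinstance(value, (str, bool, int, float)):
        return True  # atom: string, bool or number
    if isinstance(value, dict):
        # record: named fields whose values are themselves ARS values
        return all(isinstance(k, str) and is_ars(v) for k, v in value.items())
    if isinstance(value, list):
        # sequence: array of atoms, records or sequences
        return all(is_ars(item) for item in value)
    return False


# Illustrative shapes for three data models projected onto the same structure.
document = {"id": "order-1", "items": [{"sku": "A1", "qty": 2}], "paid": True}
key_value_row = {"PartitionKey": "users", "RowKey": "42", "Name": "Ada"}
graph_edge = {"id": "e1", "label": "knows", "from": "v1", "to": "v2"}

if __name__ == "__main__":
    for value in (document, key_value_row, graph_edge):
        assert is_ars(value)
        print(json.dumps(value))  # each value serializes to JSON as-is
```

The point of the sketch is that a document, a key-value row and a graph edge can all be written with the same three building blocks, which is what lets one engine expose several APIs over one core model.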

Community
Jakub Konecki
5

CosmosDB is the new DocumentDB for NoSQL solutions.

As Cosmos DB architect Rimma mentioned:

The Azure Cosmos DB DocumentDB API or SQL (DocumentDB) API is now known as Azure Cosmos DB SQL API. You don't need to change anything to continue running your apps built with DocumentDB/DocumentDB API. The functionality remains the same. Thanks.

DocumentDB is one of the APIs for CosmosDB. As of now, if you go to the Azure portal and try to create an Azure Cosmos DB account, you have to select one of the 4 APIs available there:

  • Gremlin (Graph)
  • MongoDB
  • SQL (DocumentDB)
  • Table (key-value)
Sajeetharan