A Performance Comparison of Drivers for NoSQL



The metrics in this article were found using the most up-to-date drivers available as of July 2017. Find new performance metrics in our updated article.

In this article, we compare the performance of the CData Drivers for MongoDB to equivalent products from two other companies (Competitor 1 and Competitor 2), as well as the matching "drivers" produced by MongoDB, Inc. We compared read performance, measuring the time it takes to query MongoDB for data and process the result set.

The test machine specifications are as follows:
Operating System: Windows 10
Processor: Intel® Core™ 2 Quad CPU Q8400 @ 2.66GHz
Installed Memory (RAM): 6.00 GB
System type: 64-bit Operating System

Since the drivers are being compared side-by-side, the performance of the machine itself is relatively unimportant; what matters is how the drivers compare relative to one another.

The Data



In order to provide a reproducible comparison, we copied the sample restaurants dataset (made publicly available by MongoDB, Inc.) and then built successively larger datasets based on the sample data. The relevant details for the table(s) queried are below:

| Table | Number of Rows |
| --- | --- |
| restaurants | 25,360 |
| restaurants_2 | 2,003,362 |
| restaurants_3 | 10,011,962 |

Queries



The main goal of this investigation was to compare the relative performance of the drivers. We did this by running the same queries with each driver. To simulate actual processing of the data beyond simply reading from MongoDB, we stored the values of each row in an array, which was overwritten as each new row was read. The queries are listed below:

  1. SELECT borough, restaurant_id, _id, cuisine, name FROM restaurants
  2. SELECT borough, restaurant_id, _id, cuisine, name FROM restaurants_2
  3. SELECT borough, restaurant_id, _id, cuisine, name FROM restaurants_3
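The read-and-process loop described above can be sketched in plain JDBC. This is a minimal illustration, not the harness used for the published numbers: the class name `ReadBenchmark`, its method names, and the command-line arguments are hypothetical, and the JDBC URL format depends on whichever driver is on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical harness: times one query, including reading every row.
final class ReadBenchmark {

    /** Reads every row, copying its column values into a fresh array; returns the row count. */
    static long consume(ResultSet rs) throws SQLException {
        int columns = rs.getMetaData().getColumnCount();
        long rows = 0;
        while (rs.next()) {
            // The array is replaced for each row, so memory use stays flat.
            Object[] values = new Object[columns];
            for (int i = 0; i < columns; i++) {
                values[i] = rs.getObject(i + 1); // JDBC columns are 1-based
            }
            rows++;
        }
        return rows;
    }

    /** Returns the elapsed seconds for executing the query and consuming its result set. */
    static double timeQuery(Connection conn, String sql) throws SQLException {
        long start = System.nanoTime();
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(sql)) {
            consume(rs);
        }
        return (System.nanoTime() - start) / 1e9;
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 2) {
            System.err.println("usage: ReadBenchmark <jdbc-url> <sql>");
            return;
        }
        // args[0] is a driver-specific JDBC URL (assumption: driver jar is on the classpath).
        try (Connection conn = DriverManager.getConnection(args[0])) {
            System.out.printf("%.1f s%n", timeQuery(conn, args[1]));
        }
    }
}
```

Calling `timeQuery` several times per query and averaging the results is one way to smooth out run-to-run variation.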

Results



Below, you can see the performance of the various queries, based on the driver/platform.

JDBC / Java Drivers

All four of the companies compared produce a JDBC driver or other technology that provides a native experience with MongoDB data in Java applications. The results of processing query results in a simple Java application are below.

JDBC/Java Query Times by Company (in seconds)

| Query | CData Software | Competitor 1 | Competitor 2 | MongoDB, Inc. |
| --- | --- | --- | --- | --- |
| 1 (~25,000 rows) | 0.9 (-56% - +33%) | 0.4 | 1.2 | 0.5 |
| 2 (~2,000,000 rows) | 6.3 (+44% - +138%) | 15.1 | 9.1 | 13.2 |
| 3 (~10,000,000 rows) | 30.6 (+95% - +173%) | 89.2 | 94.1 | 67.3 |

As the results show, the CData driver worked with large result sets faster than the other drivers, in most cases retrieving and processing results more than twice as fast. In the one query where the CData driver is slower (the ~25,000-row dataset), the margin is about half a second or less and is due to performing live schema discovery.

The average runtime for each query (of the larger datasets) is compared in the charts below:

[Chart: JDBC/Java results for ~2,000,000 rows]

[Chart: JDBC/Java results for ~10,000,000 rows]

ADO.NET Provider / C# Drivers

CData Software and MongoDB, Inc. are the two companies that provide native support for connecting to MongoDB data in .NET applications. CData Software has an ADO.NET Provider, and MongoDB, Inc. has its own C#/.NET MongoDB driver. The times required for each product to process the results are in the table below.

ADO.NET/C# Query Times by Company (in seconds)

| Query | CData Software | MongoDB, Inc. |
| --- | --- | --- |
| 1 (~25,000 rows) | 1.4 (-44%) | 0.8 |
| 2 (~2,000,000 rows) | 12.7 (+161%) | 33.1 |
| 3 (~10,000,000 rows) | 60.8 (+215%) | 191.4 |

As the results show, the CData ADO.NET Provider worked with large result sets faster than the MongoDB, Inc. C# driver, processing the largest dataset over three times as fast.
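The percentages in parentheses appear to express how much longer (positive) or shorter (negative) the other driver took relative to the CData time. That reading is an assumption inferred from the table values, not a formula stated in the article, and the helper name `percentVersusBaseline` is hypothetical. A short sketch of the arithmetic:

```java
// Hypothetical helper illustrating the assumed percentage calculation.
final class RelativeDifference {

    /**
     * Percent by which otherSeconds exceeds (positive) or undercuts (negative)
     * baselineSeconds, rounded to the nearest whole percent.
     */
    static long percentVersusBaseline(double baselineSeconds, double otherSeconds) {
        return Math.round((otherSeconds / baselineSeconds - 1.0) * 100.0);
    }

    public static void main(String[] args) {
        // ADO.NET table, query 2: CData 12.7 s vs MongoDB, Inc. 33.1 s
        System.out.println(percentVersusBaseline(12.7, 33.1));  // prints 161
        // ADO.NET table, query 3: CData 60.8 s vs MongoDB, Inc. 191.4 s
        System.out.println(percentVersusBaseline(60.8, 191.4)); // prints 215
    }
}
```

Recomputing from the rounded table values can differ slightly from the published figures (for example, -43% rather than -44% for query 1), which would be expected if the published percentages were derived from unrounded timings.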

The average runtime for each query (of the larger datasets) is compared in the charts below:

[Chart: ADO.NET/C# results for ~2,000,000 rows]

[Chart: ADO.NET/C# results for ~10,000,000 rows]

Conclusion



The CData Software drivers regularly prove to be faster than the competitors' equivalent products, particularly when dealing with large datasets. When the drivers are slower, the difference is barely noticeable (less than half a second in most cases) and is the trade-off for live schema discovery. We realize that speed is only one measurement, but we consider the performance of our drivers a strong indicator of the depth and technical prowess embedded in all of our drivers and data access technologies. Our developers have spent countless hours optimizing how the drivers process the results returned by MongoDB, to the point that the drivers appear to be limited only by network traffic and server processing times.
