Don't Use MongoDB: Actually, It's Fine

Author: dong

There was a time when “Don’t use MongoDB” was a popular saying in the developer community. Developers avoided MongoDB due to its lack of schema enforcement and limited transaction support. But as of 2024, those concerns are mostly a thing of the past.

Related reading: Why You Should Never Use MongoDB

MongoDB has evolved into an enterprise-grade database. Even Stripe, which processed $1 trillion in payments in 2023, relies on it as a core part of its infrastructure. In this post, we’ll explore why MongoDB is a solid choice and how it’s used in real-world, large-scale services.

“Don’t Use MongoDB” – That’s Old News

Criticism of MongoDB in the past mostly stemmed from limitations in its early versions—lack of schema validation, no ACID transactions, and consistency issues. But MongoDB has since resolved most of these problems.

Schema validation support: Since version 3.6, a collection can carry a $jsonSchema validator, so you can enforce required fields, types, and value constraints where you need them, much like column constraints in an RDBMS.
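
As a minimal sketch of what such a rule can look like, the options below attach a $jsonSchema validator to a collection. The users collection and its fields are assumptions made for illustration. The createCollection call only runs inside mongosh, where db is defined.

```javascript
// Hypothetical "users" collection: name and email are required strings,
// phone is optional but must be a string when present.
const userCollectionOptions = {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["name", "email"],
      properties: {
        name: { bsonType: "string" },
        email: { bsonType: "string", pattern: "^.+@.+$" },
        phone: { bsonType: "string" }
      }
    }
  },
  validationLevel: "strict",  // validate every insert and update
  validationAction: "error"   // reject documents that fail validation
};

// `db` exists only in mongosh; in plain Node.js this step is skipped.
if (typeof db !== "undefined") {
  db.createCollection("users", userCollectionOptions);
}
```

With these options, an insert that omits email would be rejected, while a document without phone would still be accepted.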

Transaction support: Multi-document ACID transactions arrived in version 4.0 for replica sets and were extended to sharded clusters in version 4.2. Large transactions can hurt performance, but that is true of most databases.
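
To make this concrete, here is a rough sketch of a multi-document transaction written for mongosh. The bank database, the accounts collection, and the transfer function are hypothetical names. The session argument is assumed to come from db.getMongo().startSession().

```javascript
// Move `amount` between two accounts so that both updates apply or neither does.
// "bank" and "accounts" are hypothetical names used for illustration.
function transfer(session, fromId, toId, amount) {
  const accounts = session.getDatabase("bank").getCollection("accounts");
  session.startTransaction({
    readConcern: { level: "snapshot" },
    writeConcern: { w: "majority" }
  });
  try {
    // Debit only if the balance covers the amount.
    const debit = accounts.updateOne(
      { _id: fromId, balance: { $gte: amount } },
      { $inc: { balance: -amount } }
    );
    if (debit.modifiedCount !== 1) {
      throw new Error("insufficient funds or unknown account");
    }
    accounts.updateOne({ _id: toId }, { $inc: { balance: amount } });
  } catch (err) {
    session.abortTransaction(); // roll back any write made in this transaction
    throw err;
  }
  session.commitTransaction(); // both writes become visible together
}
```

In mongosh this could be called as transfer(db.getMongo().startSession(), "alice", "bob", 100). With the Node.js driver the same calls are asynchronous and would need await.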

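High availability and consistency: Replica sets fail over automatically using an election protocol modeled on Raft. How fast a new primary is elected depends on configuration; with default settings, MongoDB’s documentation says it should typically take no more than about 12 seconds. Read and write concerns let you choose how consistent reads and how durable writes must be, and read preferences let you spread read traffic across secondaries.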
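
A small sketch of how these knobs look in mongosh, assuming a hypothetical orders collection on a replica set. The database calls are skipped when the snippet runs outside mongosh.

```javascript
// Acknowledge the write only after a majority of replica set members have it.
const durableWrite = { writeConcern: { w: "majority", wtimeout: 5000 } };

// `db` and `printjson` exist only in mongosh.
if (typeof db !== "undefined") {
  db.orders.insertOne({ sku: "A1", qty: 1 }, durableWrite);

  // Read only majority-committed data, preferring a secondary to offload the primary.
  const recent = db.orders
    .find({ sku: "A1" })
    .readConcern("majority")
    .readPref("secondaryPreferred")
    .toArray();
  printjson(recent);
}
```

Stronger settings trade some latency for safety, so they are usually chosen per operation, not globally.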

MongoDB is no longer just for quick prototypes. It’s now a stable and scalable solution fit for enterprise environments.

How Stripe’s document databases supported 99.999% uptime with zero-downtime data migrations

MongoDB Validated by TLA+

TLA+ (Temporal Logic of Actions Plus) is a language for formally specifying and verifying distributed systems. Simply put, it helps validate whether a complex system behaves as intended.

MongoDB uses TLA+ to design its distributed algorithms and ensure system correctness. For example, it models and verifies that data replication across nodes works properly and that data consistency is maintained during failures.

This shows MongoDB isn’t just a “quickly built” NoSQL database—it’s an enterprise-grade system backed by formal mathematical verification.

Conformance Checking at MongoDB: Testing That Our Code Matches Our TLA+ Specs

Key Features of MongoDB

Document-Oriented Structure

MongoDB stores data in JSON-like documents, which are processed internally as BSON (Binary JSON). This structure is a perfect fit for modern web applications.

// MongoDB document example
{
  "_id": ObjectId("507f1f77bcf86cd799439011"),
  "name": "Kim Developer",
  "email": "kim@example.com",
  "skills": ["JavaScript", "React", "Node.js"],
  "projects": [
    {
      "name": "E-commerce Project",
      "status": "Completed",
      "technologies": ["React", "MongoDB"]
    }
  ]
}

In an RDBMS, such a nested structure would require multiple tables and complex JOINs. In MongoDB, it’s handled simply within a single document.
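
Querying inside that nested structure needs no JOIN either. The sketch below assumes the document above lives in a hypothetical developers collection, and it only touches the database when run in mongosh.

```javascript
// Match developers with at least one completed project that used MongoDB.
// $elemMatch requires both conditions to hold for the same array element.
const completedMongoProjects = {
  projects: { $elemMatch: { status: "Completed", technologies: "MongoDB" } }
};

// Return the name plus only the first project that matched.
const nameAndMatch = { name: 1, "projects.$": 1 };

// `db` and `printjson` exist only in mongosh.
if (typeof db !== "undefined") {
  printjson(db.developers.find(completedMongoProjects, nameAndMatch).toArray());
}
```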

Flexible Schema

A flexible schema doesn’t mean “no schema.” It means you can enforce or relax schema rules as needed. For instance, in a user profile, some users might have a phone number and others might not. In an RDBMS, this would require nullable columns or extra tables. MongoDB handles it naturally: the field is simply absent from documents that don’t need it.
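
As an illustration of the phone number case, the sketch below stores users with and without the field and adds a partial unique index, so uniqueness is enforced only where a phone number exists. The users collection and the sample data are assumptions.

```javascript
// Index phone numbers uniquely, but only for documents that have one.
const phoneIndexKeys = { phone: 1 };
const phoneIndexOptions = {
  unique: true,
  partialFilterExpression: { phone: { $exists: true } }
};

// `db` and `printjson` exist only in mongosh.
if (typeof db !== "undefined") {
  db.users.insertMany([
    { name: "Kim", email: "kim@example.com", phone: "010-0000-0000" },
    { name: "Lee", email: "lee@example.com" } // no phone field at all
  ]);
  db.users.createIndex(phoneIndexKeys, phoneIndexOptions);

  // Find users who have not provided a phone number.
  printjson(db.users.find({ phone: { $exists: false } }).toArray());
}
```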

Scalability (Sharding)

MongoDB’s sharding is a powerful feature. It distributes data across multiple servers, enabling horizontal scaling. A sharded cluster includes:

  • mongos: routes queries to the correct shard
  • config servers: manage metadata
  • shard nodes: store actual data

Data is distributed via a shard key, enabling efficient queries and scaling. But a poorly chosen shard key can lead to performance issues and data imbalance, so careful design is crucial.
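
For reference, sharding a collection from mongosh might look like the sketch below. The shop.orders namespace and the customerId key are hypothetical, and the calls assume the shell is connected to a mongos router of an existing sharded cluster.

```javascript
const namespace = "shop.orders";            // hypothetical database.collection
const shardKey = { customerId: "hashed" };  // hashed key spreads writes across shards

// `sh` is mongosh's sharding helper; outside mongosh this step is skipped.
if (typeof sh !== "undefined") {
  sh.enableSharding("shop");
  sh.shardCollection(namespace, shardKey);
  sh.status(); // prints shards, chunk distribution, and balancer state
}
```

A hashed key is only one option. Queries that filter on a range of the key may be better served by a ranged shard key.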

Diverse Indexing Options

MongoDB supports various indexes—from simple B-Tree indexes to compound, multikey, geospatial, hashed, and text indexes. The geospatial index, in particular, uses the S2Geometry library for efficient location-based data handling.

// Geospatial index example
db.places.createIndex({ "location": "2dsphere" })

// Search within a 1 km radius ($maxDistance is in meters)
db.places.find({
  location: {
    $near: {
      $geometry: { type: "Point", coordinates: [127.0276, 37.4979] }, // [longitude, latitude]
      $maxDistance: 1000
    }
  }
})

Stripe Actively Uses MongoDB

Stripe is a prime example of MongoDB’s power in an enterprise setting. In 2023, it processed $1 trillion in payments while maintaining 99.999% uptime.

DocDB: A MongoDB-Based Custom Solution

Stripe built its own database infrastructure called DocDB on top of MongoDB Community. DocDB handles over 5 million queries per second and supports all of Stripe’s products.

Why Stripe chose MongoDB:

  1. Flexible document model: Naturally represents complex payment data
  2. Massive real-time data handling: Serves millions of queries per second
  3. Improved developer productivity: Faster development than RDBMS

Zero-Downtime Data Migration

One of Stripe’s most impressive feats is migrating data without downtime. The Data Movement Platform enables live migrations with no service interruptions. It works in the following steps:

  1. Register migration: Intent logged with the chunk metadata service
  2. Bulk data import: Optimized via snapshotting (10× performance improvement)
  3. Asynchronous replication: Real-time sync via Kafka and Amazon S3
  4. Validation: Ensures data integrity and accuracy
  5. Traffic switch: Seamless cutover in under 2 seconds

This lets Stripe split shards during traffic spikes and consolidate thousands of databases during low usage periods.
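
The ordering of the five steps matters: traffic is switched only after validation succeeds. The toy runner below illustrates that control flow. It is a hypothetical sketch, not Stripe’s implementation, and the phase names and handler functions are invented for illustration.

```javascript
// Phase names mirror the five steps listed above; the runner itself is hypothetical.
const PHASES = ["register", "bulkImport", "asyncReplication", "validation", "trafficSwitch"];

// `handlers` maps each phase name to a function that returns true on success.
function runMigration(handlers) {
  const completed = [];
  for (const phase of PHASES) {
    if (!handlers[phase]()) {
      // Stop at the first failure; later phases never run.
      return { status: "aborted", failedAt: phase, completed };
    }
    completed.push(phase);
  }
  return { status: "done", completed };
}

// Example: validation fails, so traffic is never switched.
const result = runMigration({
  register: () => true,
  bulkImport: () => true,
  asyncReplication: () => true,
  validation: () => false,
  trafficSwitch: () => true
});
console.log(result.status, result.failedAt); // aborted validation
```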

Proxy Server for Reliability

Stripe built a custom database proxy server in Go to handle reliability, scalability, and access control—adding enterprise-grade features on top of MongoDB’s capabilities.

Advantages of MongoDB

High Performance

MongoDB excels in certain use cases. Cars24 used MongoDB Atlas to enhance search for 300 million users and cut costs in half.

MongoDB Atlas Search allows executing search queries directly in the database, eliminating the need for a separate search engine and sync mechanism. It delivers real-time results and simplifies architecture.

High Availability

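MongoDB’s replica sets provide automatic failover. If the primary node fails, the remaining members elect a secondary as the new primary, which keeps service disruption short. With default settings, MongoDB’s documentation says an election should typically take no more than about 12 seconds. The election protocol is modeled on the proven Raft algorithm.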
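
On the application side, failover works best when the driver knows about every member and can retry a write once after an election. A sketch of such a connection string, with hypothetical host names and replica set name:

```javascript
// Hypothetical hosts and replica set name.
const hosts = ["db1.example.com:27017", "db2.example.com:27017", "db3.example.com:27017"];

const params = new URLSearchParams({
  replicaSet: "rs0",    // lets the driver discover members and track the primary
  retryWrites: "true",  // retry supported writes once, e.g. after a failover
  w: "majority"         // default write concern for this connection
});

const uri = `mongodb://${hosts.join(",")}/?${params.toString()}`;
console.log(uri);
```

The resulting string can be passed to mongosh or to a driver’s client constructor.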

Ease of Use

Queries in MongoDB Query Language (MQL) are written as JSON-like documents, and the mongosh shell is a JavaScript environment, so the syntax feels familiar to web developers and is easy to learn.

// User search example
db.users.find({
  $and: [
    { "age": { $gte: 18 } },
    { "skills": { $in: ["JavaScript", "React"] } },
    { "location.city": "Seoul" }
  ]
}).sort({ "createdAt": -1 }).limit(10)
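
A query like the one above benefits from a matching index. The sketch below is one possible choice, following the common equality, sort, range ordering for compound index keys. Whether it is the best index depends on the real data, so treat it as an assumption to check with explain.

```javascript
// Equality field first, then the sort field, then the range field.
const userSearchIndex = { "location.city": 1, createdAt: -1, age: 1 };

// `db` and `printjson` exist only in mongosh.
if (typeof db !== "undefined") {
  db.users.createIndex(userSearchIndex);

  // Inspect how the query runs with the index in place.
  const plan = db.users
    .find({
      age: { $gte: 18 },
      skills: { $in: ["JavaScript", "React"] },
      "location.city": "Seoul"
    })
    .sort({ createdAt: -1 })
    .limit(10)
    .explain("executionStats");
  printjson(plan.executionStats);
}
```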

Unified Development Experience

As seen with Cars24, MongoDB Atlas provides both storage and search capabilities on a single platform. Developers can handle everything via one API, without additional tools or sync processes.

This enables teams to focus on app development and product building rather than index management or data syncing.

Fast Developer Onboarding

Many developers already know MongoDB, so new team members can become productive quickly. This is especially valuable for startups and fast-growing organizations.

Considerations When Using MongoDB

Data Duplication Instead of JOINs

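MongoDB has no JOIN in the relational sense. The $lookup aggregation stage performs a left outer join between collections, but it is generally slower than a JOIN in an RDBMS. The usual advice is therefore to duplicate (denormalize) the fields you read together into the document that needs them.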
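
The two approaches can be compared side by side. In this sketch the orders, users, and orders_denormalized collections and all field names are hypothetical.

```javascript
// Option 1: join at read time with $lookup.
const ordersWithUser = [
  { $match: { status: "paid" } },
  { $lookup: { from: "users", localField: "userId", foreignField: "_id", as: "user" } },
  { $unwind: "$user" } // $lookup returns an array; flatten it to one user per order
];

// Option 2: copy the user fields the order page needs into the order itself.
const denormalizedOrder = {
  _id: 1001,
  status: "paid",
  user: { _id: 42, name: "Kim Developer" }, // snapshot taken when the order was placed
  total: 59000
};

// `db` and `printjson` exist only in mongosh.
if (typeof db !== "undefined") {
  printjson(db.orders.aggregate(ordersWithUser).toArray());
  db.orders_denormalized.insertOne(denormalizedOrder);
  printjson(db.orders_denormalized.findOne({ _id: 1001 }));
}
```

The second option reads from a single document, at the cost of updating the copied fields whenever the source user changes.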

Minimize Transaction Use

While transactions are supported, large-scale transactions can hurt performance. Embedded documents help reduce the need for transactions.
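
Embedding helps because a write to a single document is atomic in MongoDB. In the sketch below, a hypothetical carts collection keeps line items inside the cart document, so the quantity and the total change together without a transaction.

```javascript
// Increment one line item's quantity and the cart total in a single atomic update.
// "items.$" refers to the array element matched by the filter.
const addOneItem = {
  filter: { _id: 1001, "items.sku": "A1" },
  update: { $inc: { "items.$.qty": 1, total: 1500 } }
};

// `db` exists only in mongosh.
if (typeof db !== "undefined") {
  db.carts.updateOne(addOneItem.filter, addOneItem.update);
}
```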

Importance of Shard Key Selection

Poor shard key selection can cause performance bottlenecks and uneven data distribution. Choose keys carefully based on query patterns and distribution needs.
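
A toy simulation shows why this matters. It assumes four shards, range boundaries that were fixed from older data, and new writes with monotonically increasing keys such as counters or timestamps. The FNV-1a hash is a stand-in chosen for brevity and is not the function MongoDB uses for hashed shard keys. Real clusters also split and rebalance chunks, which this sketch ignores.

```javascript
// FNV-1a, a stand-in hash for illustration (not MongoDB's hashed-index function).
function fnv1a(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

const SHARDS = 4;

// Ranged key: boundaries were set when keys ran from 0 to 999 (250 per shard).
function rangeShard(key) {
  return Math.min(SHARDS - 1, Math.floor(key / 250));
}

// Hashed key: the shard depends on the hash of the key, not its order.
function hashedShard(key) {
  return fnv1a(String(key)) % SHARDS;
}

// Count how many keys land on each shard under a given placement rule.
function distribute(keys, pickShard) {
  const counts = new Array(SHARDS).fill(0);
  for (const key of keys) counts[pickShard(key)]++;
  return counts;
}

// 10,000 new writes with keys that keep increasing past the old range.
const newKeys = Array.from({ length: 10000 }, (_, i) => 1000 + i);
const byRange = distribute(newKeys, rangeShard);
const byHash = distribute(newKeys, hashedShard);

console.log("ranged:", byRange); // every new write lands on the last shard
console.log("hashed:", byHash);  // writes are spread over all four shards
```

The ranged layout sends every new write to one shard, which is the hot-shard pattern a monotonically increasing key tends to produce. The hashed layout avoids it but gives up efficient range queries on the key.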
