One of the most critical decisions in system design is choosing the right database. The wrong choice can lead to massive refactoring or system-wide performance problems. The two broad categories are Relational (SQL) and Non-Relational (NoSQL) databases.
SQL databases store data in structured tables with predefined schemas (rows and columns). Relationships between tables are enforced via foreign keys. They use Structured Query Language (SQL) for querying.
Examples: MySQL, PostgreSQL, Oracle, SQL Server.
NoSQL databases provide flexible schemas and are designed for specific data models and access patterns. There are four major types:
Data is stored as a collection of key-value pairs. Extremely fast for simple lookups.
Data is stored as JSON-like documents. Each document can have a different structure. Schema-less.
Data is stored in tables, rows, and dynamic columns. Optimized for queries over large datasets.
Data is stored as nodes and edges (relationships). Optimized for traversing relationships.
| Feature | SQL | NoSQL |
|---|---|---|
| Data Model | Tables with fixed schema | Flexible (documents, key-value, etc.) |
| Scaling | Primarily vertical | Primarily horizontal |
| Transactions | Full ACID support | Usually eventual consistency |
| Query Language | Standardized SQL | Database-specific APIs |
| Schema | Rigid, predefined | Dynamic, flexible |
| Best For | Complex queries, relationships | High throughput, flexible data |
Most large-scale systems use both (Polyglot Persistence):