A computer system, like any other physical device, is subject to failure from a variety of causes: disk crashes, power outages, software errors, or fires in the data center.
A DBMS must guarantee the Atomicity and Durability properties of transactions. If the power cord is pulled out of the server halfway through a massive bank transfer, the DBMS must be able to completely recover the system to a consistent state when the power comes back on.
The central component of database recovery is the Write-Ahead Log (WAL), which some systems call the redo log.
To maximize performance, databases do not immediately write updated rows to the actual data files on the hard disk. Instead, they modify the data in memory (RAM). However, RAM is volatile—if the power goes out, the data is lost.
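To guarantee Durability, the DBMS follows the WAL protocol: before a change counts as committed, a log record describing it must be appended to the log file and flushed to stable storage. The modified data pages themselves can be written to the data files later.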
Writing to a sequential log file is fast because it is an "append-only" operation: the disk does not have to seek between scattered locations, as it would when updating pages spread across the large data files. This is why databases write to the log first and defer the updates to the data files.
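The log-first idea can be sketched in a few lines of Python. This is a toy illustration, not how any particular DBMS implements its log: the class name TinyWAL, the JSON record format, and the demo function are all assumptions made for the example. Each update is appended to the log and forced to disk with os.fsync before the in-memory table is changed, so a restart can rebuild the table by replaying the log.

```python
import json
import os
import tempfile


class TinyWAL:
    """Toy key-value store that logs every update before applying it (hypothetical)."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.table = {}      # in-memory data: lost if the process dies
        self._replay()       # rebuild the table from whatever the log holds
        self.log = open(log_path, "a", encoding="utf-8")

    def _replay(self):
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as f:
            for line in f:
                record = json.loads(line)
                self.table[record["key"]] = record["value"]

    def put(self, key, value):
        # Step 1: append the log record and force it to stable storage.
        self.log.write(json.dumps({"key": key, "value": value}) + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())
        # Step 2: only now apply the change to the in-memory table.
        self.table[key] = value

    def close(self):
        self.log.close()


def demo():
    path = os.path.join(tempfile.mkdtemp(), "wal.log")
    db = TinyWAL(path)
    db.put("alice", 50)
    db.put("bob", 150)
    db.close()                   # pretend the server lost power here

    recovered = TinyWAL(path)    # a fresh instance starts with an empty table
    state = dict(recovered.table)
    recovered.close()
    return state
```

Calling demo() returns the table rebuilt purely from the log file. A real system would also log transaction boundaries and write the data pages in the background; both are left out here to keep the sketch short.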
Over time, the WAL grows very large. If a database crashed after running for a year and recovery had to replay the entire year-long log from the beginning, rebuilding the state could take a very long time, potentially days.
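To solve this, the DBMS periodically performs a checkpoint: it flushes the modified pages held in memory to the data files and then writes a <checkpoint> record to the WAL.
If the system crashes, the recovery manager only needs to read the WAL backwards until it reaches the most recent <checkpoint> record. It knows that the changes of any transaction completed before that checkpoint are safely stored in the data files, so those log records can be ignored and only the records after the checkpoint need to be processed.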
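A minimal sketch of checkpoint-based recovery follows, under simplifying assumptions: the log is a Python list, every update is an already-committed ("set", key, value) record, and no transaction is in flight when the checkpoint is taken. The function name recover and the record layout are hypothetical. The backwards scan stops at the most recent checkpoint, and only the records after it are replayed on top of the data files.

```python
def recover(data_files, log):
    """Return (recovered_state, number_of_records_replayed)."""
    # Scan backwards until the most recent checkpoint record is found.
    start = 0
    for i in range(len(log) - 1, -1, -1):
        if log[i] == ("checkpoint",):
            start = i + 1
            break

    # Everything before the checkpoint is already in the data files,
    # so replay only the records that follow it.
    state = dict(data_files)
    replayed = 0
    for _op, key, value in log[start:]:
        state[key] = value
        replayed += 1
    return state, replayed


# Data files as they were flushed at checkpoint time.
data_files = {"a": 1, "b": 2}
log = [
    ("set", "a", 1),
    ("set", "b", 2),
    ("checkpoint",),
    ("set", "a", 3),   # written after the checkpoint, not yet in the data files
]
state, replayed = recover(data_files, log)
```

Here only one of the three update records is replayed, which is the saving a checkpoint buys. Production systems typically use fuzzy checkpoints that allow transactions to keep running, which this sketch does not model.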
ARIES (Algorithm for Recovery and Isolation Exploiting Semantics) is a widely adopted recovery algorithm. It was developed at IBM, and many relational databases use it or a variant of it.
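When a crashed database is rebooted, ARIES performs three distinct passes over the Write-Ahead Log: Analysis (determine which transactions were still active and which pages may have been dirty at the time of the crash), Redo (reapply logged changes so the data files reflect the state at the crash), and Undo (roll back the changes of transactions that never committed).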
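The three ARIES passes (Analysis, Redo, Undo) can be illustrated with a heavily simplified sketch. It assumes a log with no checkpoints, pages that each hold a single value plus the LSN of the last update applied to them, and it omits the compensation log records that real ARIES writes during undo. The function name aries_recover and the record fields are hypothetical.

```python
def aries_recover(log, pages):
    """log: list of dicts ordered by LSN.
    pages: {page_id: {"value": v, "page_lsn": n}} as found on disk.
    Returns the set of loser (uncommitted) transactions."""
    # Pass 1, Analysis: find uncommitted transactions and dirty pages.
    active = set()
    dirty = {}  # page_id -> LSN of the first update that may be missing on disk
    for rec in log:
        if rec["type"] == "update":
            active.add(rec["txn"])
            dirty.setdefault(rec["page"], rec["lsn"])
        elif rec["type"] == "commit":
            active.discard(rec["txn"])

    # Pass 2, Redo: repeat history from the oldest dirty-page LSN onward.
    redo_start = min(dirty.values(), default=None)
    for rec in log:
        if rec["type"] != "update" or redo_start is None or rec["lsn"] < redo_start:
            continue
        page = pages.setdefault(rec["page"], {"value": None, "page_lsn": 0})
        if page["page_lsn"] < rec["lsn"]:   # change never reached the disk
            page["value"] = rec["after"]
            page["page_lsn"] = rec["lsn"]

    # Pass 3, Undo: roll back loser transactions, newest record first.
    for rec in reversed(log):
        if rec["type"] == "update" and rec["txn"] in active:
            pages[rec["page"]]["value"] = rec["before"]
    return active


log = [
    {"lsn": 1, "txn": "T1", "type": "update", "page": "A", "before": 100, "after": 50},
    {"lsn": 2, "txn": "T2", "type": "update", "page": "B", "before": 10, "after": 20},
    {"lsn": 3, "txn": "T1", "type": "commit"},
]
# On disk at crash time: T1's update to A was never flushed, T2's update to B was.
pages = {
    "A": {"value": 100, "page_lsn": 0},
    "B": {"value": 20, "page_lsn": 2},
}
losers = aries_recover(log, pages)
```

In this example the committed update to page A is redone (its value becomes 50) and the uncommitted update to page B is undone (its value returns to 10), which is how the combination of Redo and Undo preserves both Durability and Atomicity.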