[Concurrency Control] Why Do We Need Concurrency Control? - Schedule, Serializability, Recoverability
๐ Schedule & Conflict Serializability
Transaction: A sequence of actions where patients A and B access the database.
R (Read): Retrieve the value of
available_beds
from the Hospitals table.W1 (Write): Decrement the value of
available_beds
in the Hospitals table by 1.W2 (Write): Set the value of
is_sent
in the Reports table totrue
.
Operation: The individual Read/Write actions performed within a transaction.
Schedule: The sequential ordering of operations from multiple transactions.
Schedule
The order in which operations from multiple transactions are executed when running concurrently.
The order of operations within each transaction remains unchanged.
Serial Schedule: Transactions are executed one after another without overlap.
This approach lacks concurrency, resulting in suboptimal performance as only one transaction is processed at a time. In reality, this approach is not practical.
Read and Write operations involve I/O, while handling a new transaction is a CPU task. In a serial schedule, while processing one transaction, the CPU remains idle, leading to decreased performance.
Non-serial Schedule: Transactions are overlapped (interleaved) and executed concurrently.
As transactions are executed concurrently, it increases concurrency and allows processing more transactions in the same timeframe, resulting in better performance.
CPU tasks continue without idle time, as other transactions can be initiated while performing I/O operations for a single transaction.
The problem with a non-serial schedule is that the outcome can be incorrect depending on how transactions are interleaved. => Dilemma: Can non-serial schedules be executed without producing incorrect results? => Idea: Execute the same non-serial schedule as a serial schedule. => To do this, we need to define the meaning of "schedule equivalence."
Conflict
Two operations are considered a conflict if they satisfy the following three conditions:
They belong to different transactions.
They access the same data item.
At least one of them is a Write operation (read-write conflict or write-write conflict).
- Changing the order of conflict operations will alter the result.
Conflict Equivalent
Two schedules are considered conflict equivalent if they satisfy the following two conditions:
Both schedules involve the same set of transactions.
The order of any conflicting operations is the same in both schedules.
Conflict Serializability
A schedule is conflict serializable if it is equivalent to a serial schedule.
- Conflict serializable non-serial schedules can resolve the problem of incorrect outcomes based on how transactions are interleaved.
Conclusion
Concurrency control can ensure that any schedule is serializable.
Practical Implementation
Implement a protocol that allows multiple transactions to execute concurrently while ensuring the resulting schedule is conflict serializable. => This protocol can be implemented using Locks, MVCC (Multi-Version Concurrency Control), or similar techniques.
๐ Recoverability
Key points to remember:
Committed transactions cannot be rolled back due to the durability property.
Rolling back a transaction restores the system to its previous state as per the atomicity property.
Unrecoverable Schedule
A schedule where a transaction reads data that was written by a committed transaction before it was rolled back.
- This kind of schedule is not allowed by a DBMS as rolling back cannot fully recover the previous state.
Recoverable Schedule
A schedule in which no transaction reads data written by an uncommitted transaction until it is committed or rolled back.
- DBMS should allow only such schedules where rollback can fully recover the previous state.
Cascading Rollback
When one transaction rolls back, other
transactions dependent on it must also roll back.
Problem: If multiple transactions roll back in a cascade, it incurs a significant cost.
Solution: Allow only schedules where transactions read data after the writing transaction commits or rolls back. This ensures cascadeless schedules.
Cascadeless Schedule
A schedule in which no transaction reads data written by an uncommitted transaction.
- Such schedules are desired because rolling back a transaction doesn't lead to cascading rollbacks.
Strict Schedule
A schedule in which no transaction reads or writes data that has been written by an uncommitted transaction.
In other words, if two write operations occur, one of them must commit/abort before the next write operation can proceed.
Rolling back is straightforward in terms of recovery; restoring the previous transaction state is sufficient.