[Concurrency Control] Why Do We Need Concurrency Control? - Schedule, Serializability, Recoverability

ยท

4 min read

๐Ÿ’ก
Why Does CodeBLUE Project Need Concurrency Control?

๐Ÿ”— Schedule & Conflict Serializability

  • Transaction: A sequence of actions where patients A and B access the database.

    • R (Read): Retrieve the value of available_beds from the Hospitals table.

    • W1 (Write): Decrement the value of available_beds in the Hospitals table by 1.

    • W2 (Write): Set the value of is_sent in the Reports table to true.

  • Operation: The individual Read/Write actions performed within a transaction.

  • Schedule: The sequential ordering of operations from multiple transactions.

Schedule

The order in which operations from multiple transactions are executed when running concurrently.

  • The order of operations within each transaction remains unchanged.

  • Serial Schedule: Transactions are executed one after another without overlap.

    • This approach lacks concurrency, resulting in suboptimal performance as only one transaction is processed at a time. In reality, this approach is not practical.

    • Read and Write operations involve I/O, while handling a new transaction is a CPU task. In a serial schedule, while processing one transaction, the CPU remains idle, leading to decreased performance.

  • Non-serial Schedule: Transactions are overlapped (interleaved) and executed concurrently.

    • As transactions are executed concurrently, it increases concurrency and allows processing more transactions in the same timeframe, resulting in better performance.

    • CPU tasks continue without idle time, as other transactions can be initiated while performing I/O operations for a single transaction.

  • The problem with a non-serial schedule is that the outcome can be incorrect depending on how transactions are interleaved. => Dilemma: Can non-serial schedules be executed without producing incorrect results? => Idea: Execute the same non-serial schedule as a serial schedule. => To do this, we need to define the meaning of "schedule equivalence."

Conflict

Two operations are considered a conflict if they satisfy the following three conditions:

  1. They belong to different transactions.

  2. They access the same data item.

  3. At least one of them is a Write operation (read-write conflict or write-write conflict).

  • Changing the order of conflict operations will alter the result.

Conflict Equivalent

Two schedules are considered conflict equivalent if they satisfy the following two conditions:

  1. Both schedules involve the same set of transactions.

  2. The order of any conflicting operations is the same in both schedules.

Conflict Serializability

A schedule is conflict serializable if it is equivalent to a serial schedule.

  • Conflict serializable non-serial schedules can resolve the problem of incorrect outcomes based on how transactions are interleaved.

Conclusion

Concurrency control can ensure that any schedule is serializable.

Practical Implementation

Implement a protocol that allows multiple transactions to execute concurrently while ensuring the resulting schedule is conflict serializable. => This protocol can be implemented using Locks, MVCC (Multi-Version Concurrency Control), or similar techniques.

๐Ÿ”— Recoverability

  • Key points to remember:

    • Committed transactions cannot be rolled back due to the durability property.

    • Rolling back a transaction restores the system to its previous state as per the atomicity property.

Unrecoverable Schedule

A schedule where a transaction reads data that was written by a committed transaction before it was rolled back.

  • This kind of schedule is not allowed by a DBMS as rolling back cannot fully recover the previous state.

Recoverable Schedule

A schedule in which no transaction reads data written by an uncommitted transaction until it is committed or rolled back.

  • DBMS should allow only such schedules where rollback can fully recover the previous state.

Cascading Rollback

When one transaction rolls back, other

transactions dependent on it must also roll back.

  • Problem: If multiple transactions roll back in a cascade, it incurs a significant cost.

  • Solution: Allow only schedules where transactions read data after the writing transaction commits or rolls back. This ensures cascadeless schedules.

Cascadeless Schedule

A schedule in which no transaction reads data written by an uncommitted transaction.

  • Such schedules are desired because rolling back a transaction doesn't lead to cascading rollbacks.

Strict Schedule

A schedule in which no transaction reads or writes data that has been written by an uncommitted transaction.

  • In other words, if two write operations occur, one of them must commit/abort before the next write operation can proceed.

  • Rolling back is straightforward in terms of recovery; restoring the previous transaction state is sufficient.

ย