[TIL] Concurrency Control

05/20/23


Concurrency

Concurrency vs Parallelism

Concurrency means structuring work so that multiple tasks make progress during overlapping time periods. On a single core this is done by alternating between threads through context switching, which gives the illusion of simultaneous execution. Parallelism, on the other hand, divides the work and processes it at the same instant on multiple physical cores.

Concurrency: Making multiple tasks appear to be executed "concurrently." Parallelism: Actually processing multiple tasks "in parallel."

  • Reference

    • Multitasking: A concept that is often confused with concurrency. It refers to the execution of multiple tasks simultaneously in a single system. Each task is independent (e.g., tasks: watching YouTube videos, updating KakaoTalk desktop app, opening VSCode, etc.) => Operation of processes

    • Concurrency: Simultaneously processing multiple subtasks within a single task. It refers to dividing a task into multiple threads for processing or asynchronously processing multiple tasks. Each subtask has dependencies (e.g., task: watching YouTube videos, subtasks: loading, recommendations, controls, playback, etc.) => Operation of threads

=> Through concurrent processing, multiple threads (workers) can appear to execute simultaneously.
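To make the "appear to execute simultaneously" part concrete, here is a minimal sketch of two tasks taking turns on a single thread. The names `task` and `runInterleaved` are made up for this note, and `await Promise.resolve()` is used only as a stand-in for a point where a task gives up control.

```typescript
// Each task records a step, then yields so that other queued tasks can run.
async function task(name: string, steps: number, log: string[]): Promise<void> {
  for (let i = 1; i <= steps; i++) {
    log.push(`${name}${i}`);
    await Promise.resolve(); // yield point: control returns to the event loop
  }
}

// Start both tasks together and collect the order in which their steps ran.
async function runInterleaved(): Promise<string[]> {
  const log: string[] = [];
  await Promise.all([task("A", 3, log), task("B", 3, log)]);
  return log; // steps of A and B alternate: A1, B1, A2, B2, A3, B3
}
```

Nothing here runs in parallel: only one step executes at any instant, but the steps of A and B are interleaved, which is concurrency without parallelism.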

Reasons for Concurrency Control

When multiple threads execute concurrently, they can access a shared resource at the same time (e.g., several requests posted for the same diagnosis), leading to unexpected results or data inconsistency (synchronization issues). => Requires appropriate synchronization techniques and concurrency control methods.

But JS is single-threaded, right?

  • JS is single-threaded, meaning the JS engine executes code on a single thread and processes one task at a time.

  • Why is concurrency control needed in JS?

    • Situations where concurrency control is required:

      1. Asynchronous tasks: JavaScript is strong at handling asynchronous tasks. Network requests, file I/O, database queries, etc. are processed asynchronously without blocking the main thread. Several of them can be in flight at once and their callbacks can interleave in any order, so concurrency control is needed to manage them efficiently and keep shared state consistent.

      2. Event handling: JavaScript has an event-driven programming model. When handling events occurring in a web browser (e.g., click, keyboard input) or events occurring on the server-side (e.g., requests, responses), concurrency control allows managing events concurrently and improving responsiveness.

      3. Parallel processing: There are situations in JavaScript where parallel processing is required. For example, when processing a large amount of data, splitting the work across background threads (such as Web Workers) and coordinating them with concurrency control can improve processing speed.

      4. Multi-threaded environment and interoperability: Although JavaScript code runs on a single thread by default, technologies like Web Workers provide additional, real background threads. Concurrency control is then needed to distribute tasks and coordinate the threads.

    • What is Web Worker?

      • Supports multi-threaded processing => fast processing, but with no guaranteed completion order

      • Allows JS code to run in a background thread, separate from the main thread, enabling work to be performed in parallel.

      • If there are 300,000 diagnostic submission requests, they can be divided into small units and assigned to each web worker.

      • Each web worker can process the assigned tasks in parallel, allowing concurrent processing.
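The "divide into small units and assign to each web worker" idea could be sketched roughly as below. This is only an illustration under assumptions: `WorkerLike`, `chunk` and `dispatch` are names invented for this note, the chunk size and round-robin assignment are arbitrary choices, and `WorkerLike` just mirrors the `postMessage` method of a real browser `Worker` so the sketch does not depend on browser typings.

```typescript
// Minimal structural type; a real browser `Worker` satisfies it.
interface WorkerLike {
  postMessage(message: unknown): void;
}

// Split `items` into consecutive chunks of at most `size` elements.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Hand the chunks to the workers round-robin.
// Returns how many chunks each worker received.
function dispatch<T>(
  items: readonly T[],
  workers: readonly WorkerLike[],
  chunkSize: number
): number[] {
  if (workers.length === 0) {
    throw new RangeError("need at least one worker");
  }
  const counts = workers.map(() => 0);
  chunk(items, chunkSize).forEach((part, i) => {
    const w = i % workers.length;
    workers[w].postMessage(part); // the worker processes this chunk in its own thread
    counts[w] += 1;
  });
  return counts;
}
```

In a browser, the `workers` array might be built with something like `new Worker("worker.js")`, where `worker.js` is a hypothetical script that handles each chunk in its `onmessage` handler and posts results back.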

=> Conclusion: JS is single-threaded. However, concurrency control is still needed for asynchronous tasks and event-driven environments.
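A small sketch of why a single thread does not remove the problem: two asynchronous tasks both read a shared value, wait, and then write back a result computed from the stale read. The `Account` shape, `withdraw`, `raceDemo` and the 10 ms `delay` are all made up for illustration; the delay stands in for a network or database call.

```typescript
const delay = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

interface Account {
  balance: number;
}

async function withdraw(account: Account, amount: number): Promise<void> {
  const current = account.balance;    // 1. read the shared value
  await delay(10);                    // 2. simulated I/O: other tasks run here
  account.balance = current - amount; // 3. write based on a possibly stale read
}

async function raceDemo(): Promise<number> {
  const account: Account = { balance: 100 };
  // Both withdrawals read 100 before either one writes.
  await Promise.all([withdraw(account, 30), withdraw(account, 50)]);
  return account.balance; // not 20: one withdrawal overwrites the other
}
```

Calling `raceDemo()` resolves to a balance other than the expected 20, because the second write discards the first (a lost update). No second thread is involved; the interleaving happens at the `await`.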

Concurrency Control in JavaScript

Locking

  1. Mutex Lock: Ensures that only one thread can access a shared resource at a time, allowing multiple tasks to be processed sequentially. It implements mutual exclusion.

  2. Semaphore: Generalizes a mutex lock by allowing up to a fixed number of threads to access a shared resource simultaneously. It uses a counter to track how many more accesses are allowed.

  3. Reader-Writer Lock: Supports concurrent reading and exclusive writing. Multiple threads can read simultaneously, but writing is exclusive.

  4. Compare-and-Swap (CAS): Controls concurrency using atomic operations. It compares the current value of a variable with an expected value and sets a new value only if they match, as a single atomic step. CAS relies on hardware support for atomic operations; in JavaScript it is exposed as Atomics.compareExchange on shared memory.
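For asynchronous JavaScript code, the first two locks can be sketched with promises. This is a minimal illustration, not a production lock: the `Semaphore` class, its `run` method and the two demo functions are names invented here, and "thread" in the definitions above corresponds to an async task. A mutex is treated as a semaphore with a single permit.

```typescript
const delay = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

class Semaphore {
  private available: number;
  private readonly waiters: Array<() => void> = [];

  constructor(permits: number) {
    if (!Number.isInteger(permits) || permits <= 0) {
      throw new RangeError("permits must be a positive integer");
    }
    this.available = permits;
  }

  private acquire(): Promise<void> {
    if (this.available > 0) {
      this.available -= 1;
      return Promise.resolve();
    }
    // No permit left: wait in FIFO order until one is released.
    return new Promise<void>((resolve) => this.waiters.push(resolve));
  }

  private release(): void {
    const next = this.waiters.shift();
    if (next) {
      next(); // hand the permit directly to the next waiter
    } else {
      this.available += 1;
    }
  }

  // Run `task` while holding a permit; always release, even if it throws.
  async run<T>(task: () => Promise<T>): Promise<T> {
    await this.acquire();
    try {
      return await task();
    } finally {
      this.release();
    }
  }
}

// Mutex: one permit, so the read-wait-write sequence is never interleaved.
async function mutexDemo(): Promise<number> {
  const mutex = new Semaphore(1);
  const account = { balance: 100 };
  const withdraw = (amount: number) =>
    mutex.run(async () => {
      const current = account.balance;
      await delay(10); // simulated I/O inside the critical section
      account.balance = current - amount;
    });
  await Promise.all([withdraw(30), withdraw(50)]);
  return account.balance; // both withdrawals are applied
}

// Semaphore: at most two of the five jobs are inside the section at once.
async function semaphoreDemo(): Promise<number> {
  const semaphore = new Semaphore(2);
  let active = 0;
  let maxActive = 0;
  const job = () =>
    semaphore.run(async () => {
      active += 1;
      maxActive = Math.max(maxActive, active);
      await delay(5);
      active -= 1;
    });
  await Promise.all([job(), job(), job(), job(), job()]);
  return maxActive;
}
```

The design choice here is to pass a released permit straight to the next waiter instead of incrementing the counter, which keeps waiters in FIFO order and avoids a newly arriving task jumping the queue.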

Apache Kafka

Apache Kafka is a distributed streaming platform that enables real-time processing of large-scale data. It can handle massive amounts of data asynchronously without the need for explicit locking or web worker mechanisms. Kafka internally distributes and processes data through partitioning and replication, using consumer groups to achieve parallel processing and concurrency control.

By using Kafka as a message queue, you can control concurrency and scale processing according to your needs. Producers send messages to Kafka, while consumers read and process those messages. Kafka organizes data into topics, which are divided into multiple partitions. Within a consumer group, each partition is assigned to exactly one consumer, so different consumers process different partitions in parallel while messages within a single partition are handled in order. This is what provides scalability and performance without explicit locks. Kafka also provides durability and replication features to minimize data loss and ensure high availability.
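The way partitions give concurrency control can be modeled in a few lines. This is a toy model only, not Kafka client code: `hashKey`, `partitionFor` and `assignPartitions` are hypothetical helpers, the hash is a simple stand-in for whatever partitioner a real client uses, and the round-robin assignment is just one possible strategy.

```typescript
// Simple deterministic string hash (illustration only, not Kafka's own hash).
function hashKey(key: string): number {
  let h = 0;
  for (let i = 0; i < key.length; i++) {
    h = (h * 31 + key.charCodeAt(i)) >>> 0; // keep it an unsigned 32-bit value
  }
  return h;
}

// Messages with the same key always land in the same partition,
// so they are processed in order by a single consumer.
function partitionFor(key: string, partitionCount: number): number {
  if (!Number.isInteger(partitionCount) || partitionCount <= 0) {
    throw new RangeError("partitionCount must be a positive integer");
  }
  return hashKey(key) % partitionCount;
}

// Within one consumer group, give each partition to exactly one consumer.
function assignPartitions(
  partitionCount: number,
  consumers: readonly string[]
): Map<string, number[]> {
  if (consumers.length === 0) {
    throw new RangeError("need at least one consumer");
  }
  const assignment = new Map<string, number[]>();
  consumers.forEach((consumer) => assignment.set(consumer, []));
  for (let p = 0; p < partitionCount; p++) {
    const owner = consumers[p % consumers.length];
    assignment.get(owner)!.push(p);
  }
  return assignment;
}
```

For example, `assignPartitions(6, ["c1", "c2", "c3"])` gives each of the three hypothetical consumers two partitions. Since no partition has two owners inside the group, two consumers never process the same keyed message stream at once, which is why application code needs no lock around it.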

Kafka is widely used for handling large-scale data processing, real-time streaming, and building data pipelines. It offers reliable storage on disk, replication for fault tolerance, and real-time stream processing capabilities for various applications such as real-time analytics, log processing, and event sourcing.

In conclusion, Kafka can be used to achieve parallel processing, handle large-scale data, and provide concurrency control without the need for explicit locking or web worker mechanisms.