Synchronization Tools

Zhao Cong

Background

  • We’ve already seen that processes can execute concurrently or in parallel
    • Concurrently: the CPU scheduler switches rapidly between processes, so their execution is interleaved on a single core
    • In parallel: two instruction streams (representing different processes) execute simultaneously on separate processing cores
  • Although the producer and consumer routines are correct when considered separately, they may not function correctly when executed concurrently
  • We arrive at an incorrect state because both processes are allowed to manipulate the variable count concurrently (see the sketch after this list)
  • A situation like this, where several processes access and manipulate the same data concurrently and the outcome of the execution depends on the particular order in which the access takes place, is called a race condition
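
To see why, consider how count++ and count-- are typically implemented in machine language (a minimal sketch; the register names are illustrative). Each statement compiles into a separate load, modify, and store, and the instructions of the two processes may be interleaved:

    /* count++ may be implemented as: */
    register1 = count;
    register1 = register1 + 1;
    count = register1;

    /* count-- may be implemented as: */
    register2 = count;
    register2 = register2 - 1;
    count = register2;

    /* One possible interleaving, starting with count == 5:
       T0: producer  register1 = count             (register1 = 5)
       T1: producer  register1 = register1 + 1     (register1 = 6)
       T2: consumer  register2 = count             (register2 = 5)
       T3: consumer  register2 = register2 - 1     (register2 = 4)
       T4: producer  count = register1             (count = 6)
       T5: consumer  count = register2             (count = 4, incorrect) */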

The Critical-Section Problem

  • Consider a system consisting of n processes. Each process has a segment of code, called a critical section, in which the process may be accessing and updating data that is shared with at least one other process.
  • The important feature of the system is that, when one process is executing in its critical section, no other process is allowed to execute in its critical section. That is, no two processes are executing in their critical sections at the same time.
  • Each process must request permission to enter its critical section
  • The section of code implementing this request is the entry section. The critical section may be followed by an exit section. The remaining code is the remainder section (see the sketch after this list)
  • A solution to the critical-section problem must satisfy the following three requirements:
    • Mutual exclusion. If a process is executing in its critical section, then no other processes can be executing in their critical sections
    • Progress. If no process is executing in its critical section and some processes wish to enter their critical sections, then only those processes that are not executing in their remainder sections can participate in deciding which will enter its critical section next, and this selection cannot be postponed indefinitely.
    • Bounded waiting. There exists a bound, or limit, on the number of times that other processes are allowed to enter their critical sections after a process has made a request to enter its critical section and before that request is granted
  • Two general approaches are used to handle critical sections in operating systems: preemptive kernels and nonpreemptive kernels
  • Obviously, a nonpreemptive kernel is essentially free from race conditions on kernel data structures, as only one process is active in the kernel at a time. We cannot say the same about preemptive kernels.
  • A preemptive kernel may be more responsive and is more suitable for real-time programming, as it allows a real-time process to preempt a process currently running in the kernel.
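
The general structure of a participating process therefore looks like this (a schematic sketch following the usual presentation of the critical-section problem):

    while (true) {
        /* entry section: request permission to enter */

            /* critical section */

        /* exit section: announce that the critical section is finished */

            /* remainder section */
    }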

Peterson’s Solution

  • Next, we illustrate a classic software-based solution to the critical-section problem known as Peterson’s solution
  • Because modern processors and compilers may reorder independent read and write operations, there are no guarantees that Peterson’s solution will work correctly on such architectures
  • Peterson’s solution is restricted to two processes that alternate execution between their critical sections and remainder sections
    int turn;
    boolean flag[2];

    while (true) {
        flag[i] = true;
        turn = j;
        while (flag[j] && turn == j)
            ; /* busy wait */

        /* critical section */

        flag[i] = false;

        /* remainder section */
    }

Hardware Support for Synchronization

Memory Barriers

  • How a computer architecture determines what memory guarantees it will provide to an application program is known as its memory model. In general, a memory model falls into one of two categories:
    • Strongly ordered, where a memory modification on one processor is immediately visible to all other processors
    • Weakly ordered, where modifications to memory on one processor may not be immediately visible to other processors.
  • Computer architectures provide instructions that can force any changes in memory to be propagated to all other processors, thereby ensuring that memory modifications are visible to threads running on other processors. Such instructions are known as memory barriers or memory fences (see the sketch after this list)
  • Note that memory barriers are considered very low-level operations and are typically only used by kernel developers when writing specialized code that ensures mutual exclusion
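
As an illustration, here is a minimal C11 sketch using atomic_thread_fence() from <stdatomic.h> (the variables x and flag and the two thread roles are assumptions made for this example):

    #include <stdatomic.h>
    #include <stdbool.h>

    int x = 0;                     /* data to publish */
    _Atomic bool flag = false;     /* indicates that x is ready */

    /* Writer thread: the release fence forces the store to x to become
       visible to other processors before the store to flag. */
    void writer(void) {
        x = 100;
        atomic_thread_fence(memory_order_release);
        atomic_store_explicit(&flag, true, memory_order_relaxed);
    }

    /* Reader thread: the acquire fence forces the load of flag to complete
       before the load of x, so a reader that observes flag == true is
       guaranteed to observe x == 100. */
    int reader(void) {
        while (!atomic_load_explicit(&flag, memory_order_relaxed))
            ; /* busy wait */
        atomic_thread_fence(memory_order_acquire);
        return x;
    }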

Hardware Instructions

  • test_and_set()
    • An important characteristic of this instruction is that it is executed atomically. Thus, if two test_and_set() instructions are executed simultaneously (each on a different core), they will be executed sequentially in some arbitrary order.
  • The compare_and_swap() instruction (CAS)
    • Likewise, if two CAS instructions are executed simultaneously (each on a different core), they will be executed sequentially in some arbitrary order. Definitions of both instructions are sketched after this list.
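
The semantics of these instructions are commonly described with C-like definitions such as the following (a sketch of the standard presentation; on real hardware the atomicity is provided by the instruction itself, not by this code):

    /* Atomically set *target to true and return its previous value. */
    boolean test_and_set(boolean *target) {
        boolean rv = *target;
        *target = true;
        return rv;
    }

    /* Atomically set *value to new_value, but only if *value == expected;
       always return the original value. */
    int compare_and_swap(int *value, int expected, int new_value) {
        int temp = *value;
        if (*value == expected)
            *value = new_value;
        return temp;
    }

Mutual exclusion can then be built on either primitive, for example with a shared lock variable initialized to 0:

    while (true) {
        while (compare_and_swap(&lock, 0, 1) != 0)
            ; /* busy wait until the lock is free */

        /* critical section */

        lock = 0; /* release the lock */

        /* remainder section */
    }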

Atomic Variables

  • Typically, the compare_and_swap() instruction is not used directly to provide mutual exclusion. Rather, it is used as a basic building block for constructing other tools that solve the critical-section problem. One such tool is an atomic variable, which provides atomic operations on basic data types such as integers and booleans.
  • These functions are often implemented using compare_and_swap() operations, as in the sketch below.
  • Their use is often limited to single updates of shared data such as counters and sequence generators
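
For example, an atomic increment of an integer can be built from compare_and_swap() (a sketch of the usual retry loop; increment() and its parameter v are illustrative names):

    /* Atomically increment *v: retry if another core modified *v
       between our read and our compare_and_swap(). */
    void increment(int *v) {
        int temp;
        do {
            temp = *v;
        } while (temp != compare_and_swap(v, temp, temp + 1));
    }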

Mutex Locks

  • operating-system designers build higher-level software tools to solve the critical-section problem. The simplest of these tools is the mutex lock. (In fact, the term mutex is short for mutual exclusion.)
  • Calls to either acquire() or release() must be performed atomically. Thus, mutex locks can be implemented using the CAS operation
  • The main disadvantage of the implementation given here is that it requires busy waiting (see the sketch after this list)
  • The type of mutex lock we have been describing is also called a spinlock because the process “spins” while waiting for the lock to become available
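
A busy-waiting implementation can be sketched as follows (following the common textbook presentation; available is a boolean indicating whether the lock is free):

    /* acquire() spins until the lock becomes available, then takes it. */
    acquire() {
        while (!available)
            ; /* busy wait */
        available = false;
    }

    /* release() marks the lock as available again. */
    release() {
        available = true;
    }

A process protects its critical section by calling acquire() before it and release() after it; the atomicity of acquire() and release() themselves can be provided by CAS, as noted above.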

Semaphores

  • A semaphore S is an integer variable that, apart from initialization, is accessed only through two standard atomic operations: wait() and signal().
  • The wait() operation was originally termed P; the signal() operation was originally called V
  • The definition of wait() is as follows:
    wait(S) {
        while (S <= 0)
            ; // busy wait
        S--;
    }
  • The definition of signal() is as follows:
    signal(S) {
        S++;
    }

  • When one process modifies the semaphore value, no other process can simultaneously modify that same semaphore value

Semaphore Usage

  • The value of a counting semaphore can range over an unrestricted domain. The value of a binary semaphore can range only between 0 and 1.
  • Each process that wishes to use a resource performs a wait() operation on the semaphore (thereby decrementing the count). When a process releases a resource, it performs a signal() operation (incrementing the count), as in the sketch after this list.
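
For instance, a counting semaphore initialized to the number of available instances of a resource controls access to them (a minimal sketch; the value 5 is illustrative):

    semaphore S = 5;   /* five instances of the resource available */

    /* in each process: */
    wait(S);           /* acquire an instance (blocks when none is left) */
    /* ... use the resource ... */
    signal(S);         /* release the instance */

Semaphores can also enforce ordering: with a semaphore synch initialized to 0, a process P2 that executes wait(synch); S2; will run statement S2 only after process P1 has executed S1; signal(synch);.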

Semaphore Implementation

  • The process is restarted by a wakeup() operation, which changes the process from the waiting state to the ready state. The process is then placed in the ready queue.
    typedef struct {
        int value;
        struct process *list;
    } semaphore;
  • Now, the wait() semaphore operation can be defined as
    wait(semaphore *S) {
        S->value--;
        if (S->value < 0) {
            add this process to S->list;
            sleep();
        }
    }

    and the signal() semaphore operation can be defined as
    signal(semaphore *S) {
        S->value++;
        if (S->value <= 0) {
            remove a process P from S->list;
            wakeup(P);
        }
    }


  • We must guarantee that no two processes can execute wait() and signal() operations on the same semaphore at the same time
    • In a single-processor environment: by temporarily inhibiting interrupts
    • In a multicore environment: with techniques such as compare_and_swap() or spinlocks

Monitors

  • Recall the semaphore solution to the critical-section problem: all processes share a binary semaphore variable mutex, which is initialized to 1. Each process must execute wait(mutex) before entering the critical section and signal(mutex) afterward, as sketched after this list
  • Incorrect use of this pattern can result in timing errors that are difficult to detect; to deal with such errors, researchers have developed one fundamental high-level synchronization construct: the monitor type
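
The semaphore-based protocol mentioned above looks like this (mutex initialized to 1):

    while (true) {
        wait(mutex);

        /* critical section */

        signal(mutex);

        /* remainder section */
    }

If a process swaps the order of wait(mutex) and signal(mutex), or omits one of them, mutual exclusion is violated or the process blocks forever; the monitor construct is designed to make such mistakes much harder to write.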

Monitor Usage

  • An abstract data type—or ADT—encapsulates data with a set of functions to operate on that data that are independent of any specific implementation of the ADT.
  • A monitor type is an ADT that includes a set of programmer-defined operations that are provided with mutual exclusion within the monitor.
  • We need to define additional synchronization mechanisms. These mechanisms are provided by the condition construct (see the sketch after this list)
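
The pseudocode syntax of a monitor, including condition variables, looks roughly like this (a schematic sketch of the usual notation):

    monitor monitor_name {
        /* shared variable declarations */
        condition x, y;

        function P1(...) {
            ...
        }

        function P2(...) {
            ...
        }

        initialization_code(...) {
            ...
        }
    }

The only operations on a condition variable are x.wait(), which suspends the invoking process until another process invokes x.signal(), and x.signal(), which resumes exactly one suspended process (and has no effect if no process is suspended).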

Implementing a Monitor Using Semaphores

Taken together, the three code fragments below show how a monitor, a construct that guarantees safe, mutually exclusive access to shared resources, can be implemented using semaphores. Semaphores and condition variables manage the synchronization between processes so that access to the shared data is ordered and exclusive. The following is a complete explanation of the three fragments.

Background: Monitors and Semaphores

  • A monitor is a high-level synchronization mechanism that encapsulates shared resources and the operations on them, while ensuring that at most one process can be active inside the monitor (its critical section) at any time.
  • Semaphores are used here for two kinds of control: mutually exclusive access to the shared resources, and the suspension and resumption of processes waiting on particular conditions.

Walking through the three fragments:

First fragment: implementing x.wait() for a condition x

    x_count++;
    if (next_count > 0)
        signal(next);
    else
        signal(mutex);
    wait(x_sem);
    x_count--;

  • x_count++: one more process is now waiting on condition x.
  • if (next_count > 0): check whether some process is suspended on the next semaphore (a signaler waiting to resume inside the monitor).
    • If so (next_count > 0), wake it with signal(next);.
    • Otherwise, release mutex so that another process can enter the monitor (signal(mutex);).
  • wait(x_sem): the current process suspends itself on the semaphore x_sem until condition x is signaled.
  • x_count--: once resumed, the process is no longer waiting on x.

Second fragment: the wrapper around each external function F

    wait(mutex);
        // body of F (the monitor operation itself)
    if (next_count > 0)
        signal(next);
    else
        signal(mutex);

  • wait(mutex): before executing a monitor function, a process must acquire the mutual-exclusion lock mutex, ensuring that no other process is currently accessing the shared resources.
  • Execute the body of F (the critical operations of the monitor).
  • On completion, check whether any process is suspended on next:
    • If next_count > 0, resume the next suspended process with signal(next).
    • Otherwise, release the lock with signal(mutex) so that another process can acquire it and enter the critical section.

Third fragment: implementing x.signal() for a condition x

    if (x_count > 0) {
        next_count++;
        signal(x_sem);
        wait(next);
        next_count--;
    }

  • if (x_count > 0): the signal has an effect only if some process is actually waiting on condition x. If so:
    • next_count++: record that the signaling process is about to suspend itself.
    • signal(x_sem): wake one process waiting on condition x.
    • wait(next): the signaling process suspends itself on next until the resumed process leaves the monitor or waits on a condition again.
    • next_count--: once resumed, the signaler is no longer suspended on next.

Putting the pieces together:

  1. Waiting on a condition (x.wait()): the process increments x_count, then either wakes a suspended signaler or releases the mutual-exclusion lock, and finally suspends itself on x_sem until the condition is signaled.

  2. Executing inside the monitor: a process first acquires mutex, performs its operations on the shared resources, and on exit either resumes a suspended signaler via signal(next) or releases the lock.

  3. Signaling a condition (x.signal()): if some process is waiting on x, the signaler increments next_count, wakes one waiter with signal(x_sem), and suspends itself on next; after being resumed it decrements next_count.

Summary:

Together, these three fragments implement monitor control using semaphores: processes enter and leave the critical section in an orderly, mutually exclusive fashion, while the semaphores x_sem and next manage how processes wait on and are resumed from conditions. This guarantees both mutual exclusion and synchronization when multiple processes access shared resources concurrently.

Resuming Processes within a Monitor

  • how do we determine which of the suspended processes should be resumed next? One simple solution is to use a first-come, first-served (FCFS) ordering.
  • the conditional wait construct can be used.
    x.wait(c);
  • The value of c, which is called a priority number, is then stored with the name of the process that is suspended; when the condition is signaled, the process with the smallest priority number is resumed next (see the sketch after this list)
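
As an illustration, the classic ResourceAllocator monitor allocates a single resource to the requesting process that will use it for the shortest time (a sketch of the standard textbook example; time is the priority number passed to the conditional wait):

    monitor ResourceAllocator {
        boolean busy;
        condition x;

        void acquire(int time) {
            if (busy)
                x.wait(time);   /* suspended in order of increasing time */
            busy = true;
        }

        void release() {
            busy = false;
            x.signal();         /* resumes the waiter with the smallest time */
        }

        initialization_code() {
            busy = false;
        }
    }

A process needing the resource calls R.acquire(t), uses the resource, and then calls R.release().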

Liveness

  • Liveness refers to a set of properties that a system must satisfy to ensure that processes make progress during their execution life cycle
  • A process waiting indefinitely for an event that may never occur is an example of a “liveness failure.”
  • A very simple example of a liveness failure is an infinite loop.

Deadlock

  • The implementation of a semaphore with a waiting queue may result in a situation where two or more processes are waiting indefinitely for an event that can be caused only by one of the waiting processes. The event in question is the execution of a signal() operation. When such a state is reached, these processes are said to be deadlocked.
  • We say that a set of processes is in a deadlocked state when every process in the set is waiting for an event that can be caused only by another process in the set, as in the example below
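
For example (the standard two-semaphore scenario), consider processes P0 and P1 and semaphores S and Q, each initialized to 1:

    /* P0 */                 /* P1 */
    wait(S);                 wait(Q);
    wait(Q);                 wait(S);
    ...                      ...
    signal(S);               signal(Q);
    signal(Q);               signal(S);

If P0 executes wait(S) and then P1 executes wait(Q), each process blocks at its second wait(): P0 is waiting for P1 to execute signal(Q), and P1 is waiting for P0 to execute signal(S), and neither signal() can ever be executed.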

Priority Inversion

  • As an example, assume we have three processes—L, M, and H—whose priorities follow the order L < M < H. Assume that process H requires a semaphore S, which is currently being accessed by process L. Ordinarily, process H would wait for L to finish using resource S. However, now suppose that process M becomes runnable, thereby preempting process L. Indirectly, a process with a lower priority—process M—has affected how long process H must wait for L to relinquish resource S.
  • This liveness problem is known as priority inversion, and it can occur only in systems with more than two priorities. Typically, priority inversion is avoided by implementing a priority-inheritance protocol
  • In the example above, a priority-inheritance protocol would allow process L to temporarily inherit the priority of process H, thereby preventing process M from preempting its execution. When process L had finished using resource S, it would relinquish its inherited priority from H and assume its original priority. Because resource S would now be available, process H, not M, would run next.

Evaluation

  • The following guidelines identify general rules concerning performance differences between CAS-based synchronization and traditional synchronization (such as mutex locks and semaphores) under varying contention loads:
    • Uncontended. Although both options are generally fast, CAS protection will be somewhat faster than traditional synchronization.
    • Moderate contention. CAS protection will be faster—possibly much faster —than traditional synchronization.
    • High contention. Under very highly contended loads, traditional synchronization will ultimately be faster than CAS-based synchronization.
  • The hardware solutions are considered very low level and are typically used as the foundations for constructing other synchronization tools, such as mutex locks.