Semaphores

benliao1 edited this page Aug 15, 2020 · 5 revisions

What are Semaphores?

Before reading this page, it helps to have read the wiki page on threads; the section on mutexes is especially important for understanding semaphores.

If you have read that wiki page, here's a one-sentence explanation of what a semaphore is: mutexes are to threads what semaphores are to processes. Simple! (In fact, a semaphore that can only be in two states, "unlocked" (1) and "locked" (0), is called a binary semaphore and is sometimes loosely referred to as a mutex.)

We will explain in more depth below:

A semaphore is basically a piece of memory owned by the operating system that holds a nonnegative integer. The semaphore is accessible to any process that knows its name. A process gains access to the semaphore by opening it once it has been created by some other process. The operating system guarantees that any change to the value of this integer is "atomic", meaning that each increment or decrement happens as a single indivisible step: two processes can never interleave their updates, and every process that has the semaphore open sees the same value.

It sounds a little pointless, but semaphores are actually very useful. Say, for example, that two processes are sharing some data. When one process tries to modify the data, the other process must not try to modify the data at the same time. It needs to wait until the first process is done updating the data before accessing it. Semaphores help ensure that this happens.

You impose the following rule on both processes: if a process wants to modify the data, it must first decrement the semaphore; after it has finished modifying the data, it must increment the semaphore. If you initialize the semaphore with the value 1 at the start of the program, this ensures that no more than one process will be accessing the data at any given time. Recall that the value of the semaphore cannot be negative: if a process is modifying the data and has decremented the semaphore to 0, another process cannot decrement the value further to -1 and must wait instead.

Functions Used When Working With Semaphores

There aren't many functions that are used when working with semaphores in Runtime, and they are all pretty straightforward to understand! The following is a brief overview of the functions. Semaphores are represented by the type sem_t.

sem_t *sem_open(const char *name, int oflag, mode_t mode, unsigned int value);

This function opens a semaphore with the specified name, opening options, access mode, and initial value (if the semaphore is being created). The name of the semaphore is usually something like /<name>-sem (see shm_wrapper.h for many examples). In Runtime, the oflag argument is either 0 (just open an existing semaphore) or O_CREAT (create the semaphore if it does not already exist, then open it). When oflag = O_CREAT, the third and fourth arguments have meaning; when oflag = 0, they are meaningless and are conventionally also set to 0. When oflag = O_CREAT and the semaphore does not exist yet, the mode argument sets the permissions on the semaphore (usually 0660, which is read and write permissions for user and group, no permissions for other; see additional resources for more information), and the value argument sets the initial value of the semaphore (remember that a semaphore is a nonnegative (unsigned) integer in the operating system). If a semaphore with that name already exists, O_CREAT simply opens it and the mode and value arguments are ignored. This function returns a pointer to the newly opened (and perhaps newly created) semaphore, or SEM_FAILED on error.
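As a rough sketch of the two ways of calling `sem_open`, the function below creates a semaphore with `O_CREAT` and then opens it again by name with `oflag = 0`, the way a second process would. The name `/example-open-sem` and the function name are hypothetical; `sem_getvalue` is a standard POSIX call that is used here only to inspect the value.

```c
#include <assert.h>
#include <fcntl.h>      // O_CREAT
#include <semaphore.h>
#include <stdio.h>      // perror

#define OPEN_SEM_NAME "/example-open-sem"  // hypothetical name for illustration

// Creates a semaphore with `initial`, opens it a second time by name,
// and returns the value seen through the second handle (or -1 on error).
int create_then_open(unsigned int initial) {
    sem_unlink(OPEN_SEM_NAME);  // make sure we really create a new semaphore

    // Creating: mode and value are used because the semaphore does not exist yet.
    sem_t *creator = sem_open(OPEN_SEM_NAME, O_CREAT, 0660, initial);
    if (creator == SEM_FAILED) {
        perror("sem_open (create)");
        return -1;
    }

    // Opening an existing semaphore: oflag is 0, last two arguments are placeholders.
    sem_t *opener = sem_open(OPEN_SEM_NAME, 0, 0, 0);
    if (opener == SEM_FAILED) {
        perror("sem_open (open)");
        sem_close(creator);
        sem_unlink(OPEN_SEM_NAME);
        return -1;
    }

    int value = -1;
    if (sem_getvalue(opener, &value) != 0) value = -1;

    sem_close(opener);
    sem_close(creator);
    sem_unlink(OPEN_SEM_NAME);
    return value;
}
```

Both handles refer to the same operating-system object, so `create_then_open(3)` should report 3 through the second handle.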

int sem_close(sem_t *sem); 

This function closes the specified semaphore for the calling process. Closing a semaphore does not change its value, so the caller must ensure that the semaphore is not left locked (with a value of 0) before exiting, unless that is the desired behavior.

int sem_unlink(const char *name);

This function unlinks the semaphore with the specified name. The name is removed immediately, so it can be reused for a new semaphore, and the operating system reclaims the memory used by the old semaphore once every process that has it open has called sem_close() on it. Processes that already have the semaphore open can keep using it until they close it.
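The difference between closing and unlinking can be illustrated with a small sketch. The name `/example-unlink-sem` and the function name are hypothetical. The function unlinks a semaphore while it is still open, then checks that the existing handle still works and that the name can no longer be opened.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>      // O_CREAT
#include <semaphore.h>

#define UNLINK_SEM_NAME "/example-unlink-sem"  // hypothetical name for illustration

// Returns 1 if, after sem_unlink, the open handle is still usable and the
// name can no longer be opened; 0 if not; -1 on setup error.
int unlinked_name_is_gone(void) {
    sem_unlink(UNLINK_SEM_NAME);  // start from a clean state
    sem_t *sem = sem_open(UNLINK_SEM_NAME, O_CREAT, 0660, 1);
    if (sem == SEM_FAILED) return -1;

    if (sem_unlink(UNLINK_SEM_NAME) != 0) {  // remove the name, keep the handle
        sem_close(sem);
        return -1;
    }

    // The semaphore itself lives on until the last handle is closed.
    int still_usable = (sem_wait(sem) == 0 && sem_post(sem) == 0);

    // Opening by name without O_CREAT now fails because the name is gone.
    errno = 0;
    sem_t *again = sem_open(UNLINK_SEM_NAME, 0);
    int name_gone = (again == SEM_FAILED && errno == ENOENT);
    if (again != SEM_FAILED) sem_close(again);

    sem_close(sem);  // last handle closed: the OS can reclaim the memory
    return still_usable && name_gone;
}
```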

int sem_wait(sem_t *sem);

This function decrements the specified semaphore. If the semaphore cannot be decremented (it has a value of 0), this function blocks until the semaphore can be decremented. When the function returns 0, the semaphore has been acquired by the calling process. If it returns -1 (for example, because a signal interrupted the wait and errno is EINTR), the semaphore was not decremented.

int sem_post(sem_t *sem); 

This function increments the specified semaphore. It does not block.
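The following sketch walks one semaphore through a wait and a post. The name `/example-wait-sem` and the helper `wait_retry` are hypothetical. `sem_trywait` is the standard non-blocking variant of `sem_wait`, used here so the example can show the "would block" case without actually blocking.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>      // O_CREAT
#include <semaphore.h>

#define WAIT_SEM_NAME "/example-wait-sem"  // hypothetical name for illustration

// Calls sem_wait again if a signal interrupted it, so that a return value
// of 0 really means the semaphore was acquired.
static int wait_retry(sem_t *sem) {
    int rc;
    do {
        rc = sem_wait(sem);
    } while (rc == -1 && errno == EINTR);
    return rc;
}

// Returns the semaphore's final value, or -1 if any step misbehaves.
int wait_post_demo(void) {
    sem_unlink(WAIT_SEM_NAME);
    sem_t *sem = sem_open(WAIT_SEM_NAME, O_CREAT, 0660, 1);
    if (sem == SEM_FAILED) return -1;

    int value = -1;
    int ok = 1;
    ok = ok && wait_retry(sem) == 0;                       // 1 -> 0: acquired
    // At 0 a plain sem_wait would block; sem_trywait fails with EAGAIN instead.
    ok = ok && sem_trywait(sem) == -1 && errno == EAGAIN;
    ok = ok && sem_post(sem) == 0;                         // 0 -> 1: released
    ok = ok && sem_getvalue(sem, &value) == 0;

    sem_close(sem);
    sem_unlink(WAIT_SEM_NAME);
    return ok ? value : -1;
}
```

After one successful wait and one post, the value is back at its initial 1, which is what `wait_post_demo()` should return.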

POSIX semaphores vs. System V semaphores

This page has talked entirely about POSIX semaphores, which are the newer of the two semaphore interfaces available on Linux systems. They are generally considered to be more lightweight (i.e. faster) and to have a cleaner programming interface than the older System V semaphores. However, you may encounter System V semaphores in your own research on the Internet, so be on the lookout for functions like semget(), semctl(), and semop(). These are System V semaphore functions; it is interesting to read pages comparing the two semaphore systems.

POSIX named semaphores vs. POSIX unnamed semaphores

In Runtime, we exclusively use named semaphores. Every semaphore that is defined or created by the shared memory wrapper has a name. Named semaphores work well when multiple unrelated processes access the same semaphore, but it can be difficult to ensure that they are cleaned up properly. If a process exits without posting a named semaphore that it holds, every other process that waits on that semaphore will wait forever. If a process exits without closing the named semaphore, the program can fail when it is restarted, because the named semaphore is still on the system but is inaccessible to the new process.
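One possible way to guard against a leftover semaphore at startup is sketched below. This is an assumption made for illustration, not necessarily what the shared memory wrapper does. The helper `open_fresh` and the name `/example-stale-sem` are hypothetical; `O_EXCL` is the standard flag that makes `sem_open` fail with `EEXIST` if the name already exists.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>      // O_CREAT, O_EXCL
#include <semaphore.h>

// Creates a brand-new semaphore with the given value. If a semaphore with
// that name is left over from an earlier run, it is unlinked and replaced.
sem_t *open_fresh(const char *name, unsigned int value) {
    sem_t *sem = sem_open(name, O_CREAT | O_EXCL, 0660, value);
    if (sem == SEM_FAILED && errno == EEXIST) {
        sem_unlink(name);  // drop the stale name
        sem = sem_open(name, O_CREAT | O_EXCL, 0660, value);
    }
    return sem;
}

// Simulates a crashed run that left a locked semaphore behind, then checks
// that open_fresh produces an unlocked one. Returns the value seen, or -1.
int stale_value_is_replaced(void) {
    const char *name = "/example-stale-sem";  // hypothetical name
    sem_unlink(name);

    // "Earlier run": semaphore left at 0 (locked) and never unlinked.
    sem_t *stale = sem_open(name, O_CREAT, 0660, 0);
    if (stale == SEM_FAILED) return -1;
    sem_close(stale);  // the name and its value of 0 remain on the system

    // "New run": without O_EXCL, O_CREAT would silently reuse the value 0.
    sem_t *fresh = open_fresh(name, 1);
    if (fresh == SEM_FAILED) return -1;

    int value = -1;
    if (sem_getvalue(fresh, &value) != 0) value = -1;
    sem_close(fresh);
    sem_unlink(name);
    return value;
}
```

This approach is only safe when the caller knows that no other live process is still using the old semaphore. Otherwise the old and new processes would end up synchronizing on two different semaphores.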

Unnamed semaphores are identified by a memory address instead of a name on the file system; as a result, unnamed semaphores are typically used to coordinate threads within the same process (where the semaphore can be initialized once, stored in a global variable, and then accessed by the various threads spawned by that process). However, it is possible to use unnamed semaphores across different processes. One way is to create a block of shared memory (see the wiki page on shared memory), memory-map it, place the semaphore in it, and then fork the process. Another is to create a block of shared memory in one process, memory-map it in the same way in two separate processes, and put the unnamed semaphore at a pre-determined location in the memory-mapped block. The interface for working with unnamed semaphores shares sem_wait and sem_post with named semaphores but uses different functions for creation and cleanup, and is as follows:

| Unnamed Semaphore | Named Semaphore Equivalent |
| --- | --- |
| sem_init | sem_open |
| sem_wait | sem_wait |
| sem_post | sem_post |
| sem_destroy | sem_close + sem_unlink |

Named semaphores were chosen over unnamed semaphores in Runtime because sharing unnamed semaphores among unrelated processes is cumbersome, and we had enough confidence in our code to be able to ensure proper cleanup of semaphores on exit.

Use in Runtime

Semaphores are used in Runtime exclusively in the shared memory wrapper, to control access to the various shared memory blocks shared between the three main Runtime processes (net_handler, dev_handler, and executor). More information on exactly how they are used can be found in the Shared Memory Wrapper wiki.

Additional Resources
