benliao1 edited this page Jul 31, 2020 · 25 revisions

This page will attempt to explain the motivation behind the logger, the expected behavior of the logger, and how it works.

Motivation

There are several reasons why having a logger is important.

  1. In general, in any system, it's a good idea to have some sort of record of the events that happened. Logs allow us to do this.
  2. Not all status messages are created equal. Some messages are useful only when debugging; others are critical, and notify the user of something terrible that has happened. A logger allows us to choose which "class" of messages we see.
  3. When Runtime is running, there is no way for us to tell what's going on if we don't have a logger and are just connected to Runtime through Dawn. The Raspberry Pi does not have a screen attached and runs Runtime as a background service, so under normal operation we can't see any output from the program. Having a logger that emits logs over the network to Dawn for us to view in the Dawn console is an absolutely critical feature for debugging efficiently.

Expected Behavior

The logger is a tool that allows any process within Runtime to submit log messages which are then emitted on standard out (the terminal), to a file, over the network to Dawn, or some combination of the three.

The logger supports five different log levels, which are, in order of least critical to most critical: DEBUG, INFO, WARN, ERROR, and FATAL. Information about how to choose which of these levels to use for a given log message can be found in the style guide (docs/style_guide.c). For each output location, the logger must emit only those logs submitted at or above the level specified for that location.

Finally, changing the settings for the logger should be fast. You shouldn't have to recompile all of Runtime just to get the logger to run with different output settings. So, there is a configuration file that is parsed by the logger when a process starts that configures the settings dynamically. This means that all you need to do in order to change the output settings is to edit logger.config, save it, and restart Runtime.

For an explanation of the different settings in logger.config, see the comments in that file itself.
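As a purely hypothetical sketch of what such a file might look like (the setting names, syntax, and values below are invented for illustration; the real logger.config documents its own format in its comments), one setting per output toggle, one level per output, and one log file path would account for the seven settings:

```
# HYPOTHETICAL sketch only -- see the comments in the real logger.config
LOG_TO_STDOUT: 1
LOG_TO_FILE: 0
LOG_TO_NETWORK: 1
STDOUT_LOG_LEVEL: DEBUG
FILE_LOG_LEVEL: INFO
NETWORK_LOG_LEVEL: INFO
LOG_FILE_PATH: /tmp/runtime.log
```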

Relevant Files

  • logger.h: the (very simple) interface presented to the other Runtime processes for using the logger
  • logger.c: the implementation of the logger interface
  • logger.config: the configuration file for configuring the output settings in the logger

Detailed Explanation

Reading the Config File

All of the logic described below is contained in the function read_config_file() in logger.c.

We first open the configuration file. In order to find it, we construct the absolute path of the config file, which we know will be in the same directory as logger.c (the file that is trying to find the config file). We use the special macro __FILE__ to obtain the absolute path of logger.c, which, on the Raspberry Pi, is /home/pi/c-runtime/logger/logger.c. Then, we find the last / in that path and overwrite everything after it with logger.config, giving us the path of the configuration file.
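The path construction described above can be sketched as follows. This is an illustrative reconstruction, not the exact code in logger.c; the helper name and buffer handling are assumptions.

```c
#include <stdio.h>
#include <string.h>

/* Sketch of how read_config_file() can locate logger.config next to the
 * source file, using the __FILE__ macro. */
static void get_config_path(char *buf, size_t len) {
    /* __FILE__ expands, at compile time, to the path of this source file;
     * in logger.c on the Raspberry Pi it is
     * /home/pi/c-runtime/logger/logger.c. */
    snprintf(buf, len, "%s", __FILE__);
    char *last_slash = strrchr(buf, '/');   /* find the final '/' */
    if (last_slash != NULL) {
        /* overwrite everything after the last '/' with the config name */
        size_t used = (size_t)(last_slash - buf) + 1;
        snprintf(buf + used, len - used, "logger.config");
    } else {
        /* no directory component: look in the current directory */
        snprintf(buf, len, "logger.config");
    }
}
```

Because the path is derived from __FILE__ at compile time, the config file is found no matter what working directory the process was launched from.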

The rest of the function first opens the configuration file, then reads its contents line by line, skipping comment lines and blank lines, until it has parsed all seven of the configuration settings and finished setting the appropriate global variables. It sets the variable OUTPUTS, which is a bitmask representing which of the three output locations logs will be emitted to. It sets the three output levels stdout_level, file_level, and network_level, which set the minimum levels of logs that will be emitted to standard out, a log file, or the network, respectively. Finally, it sets the absolute path of the log file, which will be the location that logs will be written to on the file system if the user has enabled logging to a file.

The PYTHON Log Level

This level is not one of the standard log levels because it is used for one very specific purpose: printing Python tracebacks if and when student code errors. All the other logs contain a header that shows the log level, the process that generated that log, and the timestamp of the log. We don't want this header when printing Python tracebacks; we want the traceback to appear exactly as it would appear in the Python interpreter. That's why we needed a sixth "secret" level in the logger. This level has an internal representation that is between WARN and ERROR.
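The level ordering described above can be sketched as an enum. The exact numeric values here are assumptions; only the relative order, with PYTHON sitting between WARN and ERROR, comes from the text.

```c
/* Sketch of the six log levels; PYTHON is the "secret" sixth level used
 * only for printing Python tracebacks without a log header. */
typedef enum {
    DEBUG,
    INFO,
    WARN,
    PYTHON,   /* internal representation between WARN and ERROR */
    ERROR,
    FATAL
} log_level_t;
```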

Emitting a Log Message (log_printf)

All of the logic described below is contained in the function log_printf() in logger.c, which generates the log message and emits it to the desired outputs.

First, the input format string argument char *format is parsed using va_start, vsprintf, and va_end to generate the fully formatted message string, which is stored in the variable msg. Then, if the log level is PYTHON, the msg string is copied verbatim into the final_msg string with no additional modifications. Otherwise, the log message header is generated. We get the current system time as a formatted clock time, and then convert that into a string called time_str. All of the parts of the message are then put into the final_msg string one after the other:

<log_level_str> @ <process_str> <time_str> <msg>\n
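The assembly of final_msg might look like the following sketch. The field names mirror the format above; the helper name and buffer sizes are assumptions, not the actual logger.c code.

```c
#include <stdio.h>

/* Sketch: join the header fields and the formatted message into
 * final_msg, in the order <log_level_str> @ <process_str> <time_str> <msg>\n */
static void build_final_msg(char *final_msg, size_t len,
                            const char *log_level_str,
                            const char *process_str,
                            const char *time_str,
                            const char *msg) {
    snprintf(final_msg, len, "%s @ %s %s %s\n",
             log_level_str, process_str, time_str, msg);
}
```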

We now have the string that needs to be output contained in the string final_msg. The rest of the function is relatively straightforward:

  1. If the user has specified that logs are to be emitted to standard out and the level of this log is at or above the level specified for standard out, then print final_msg to standard output using printf.
  2. If the user has specified that logs are to be emitted to a file and the level of this log is at or above the level specified for the log file, then print final_msg to the file using fprintf.
  3. If the user has specified that logs are to be emitted to the network and the level of this log is at or above the level specified for the network, then write final_msg to the FIFO pipe connected to net_handler using writen (defined in runtime_util), to be read and processed when a connection with Dawn is made.
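The per-output checks above can be sketched as simple predicates over the OUTPUTS bitmask and level thresholds. The names OUTPUTS, stdout_level, and network_level come from the text; the bit values and function names are assumptions.

```c
/* Hypothetical bit assignments for the OUTPUTS bitmask. */
#define LOG_STDOUT  (1 << 0)
#define LOG_FILE    (1 << 1)
#define LOG_NETWORK (1 << 2)

static int OUTPUTS = LOG_STDOUT | LOG_NETWORK;
static int stdout_level = 1;   /* e.g. INFO  */
static int network_level = 3;  /* e.g. ERROR */

/* Mirrors step 1: output enabled AND log level at or above the threshold. */
static int should_log_stdout(int level) {
    return (OUTPUTS & LOG_STDOUT) && level >= stdout_level;
}

/* Mirrors step 3, with the network threshold. */
static int should_log_network(int level) {
    return (OUTPUTS & LOG_NETWORK) && level >= network_level;
}
```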

The Log FIFO

A FIFO (short for "First-In-First-Out") pipe is exactly what it sounds like: a pipeline of communication between one or more writers and one or more readers where the messages are written into one end of the pipe and read out the other end in the same order that they were written in. Once a reader has pulled data out of the pipe, it is no longer in the pipe.

The logger uses a FIFO pipe to facilitate logging to the network. Every process that initializes the logger becomes a writer to the log FIFO (whose file path is defined in logger.h). The only reader of the FIFO is the network handler thread that is taking care of the connection between Runtime and Dawn, since all logs output over the network are sent to Dawn by that thread.

By default, when a process opens one end of a FIFO pipe, the pipe is in what's called "blocking" mode: the call to open does not return until the other end of the pipe has also been opened. This is a problem for us, since the only reader of the log FIFO is the thread in the network handler managing the connection between Runtime and Dawn. That thread is only spawned when Dawn connects to Runtime, which requires Runtime to already be up and running. Therefore, when each process initializes the logger, we open the log FIFO in write-only, non-blocking mode with the flags O_WRONLY | O_NONBLOCK, so that the call to open returns immediately and the setup routines for all the processes can continue as normal.
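The make-and-open sequence can be sketched as below. The behavior (EEXIST ignored on mkfifo, ENXIO when no reader has the FIFO open) follows from POSIX semantics as described in the text; the function name and fifo_up wiring are illustrative, not the exact logger.c code.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

static int fifo_up = 0;   /* mirrors the global described below */

/* Try to create and open the log FIFO for writing without blocking.
 * Returns the fd on success, -1 if the pipe is not ready yet. */
static int try_open_fifo(const char *path) {
    /* Only the first process's mkfifo succeeds; later calls fail with
     * EEXIST, which we deliberately ignore. */
    if (mkfifo(path, 0666) == -1 && errno != EEXIST) {
        return -1;
    }
    /* Non-blocking, write-only open: returns immediately. If no reader
     * has the FIFO open yet (Dawn not connected), this fails with ENXIO. */
    int fd = open(path, O_WRONLY | O_NONBLOCK);
    if (fd == -1) {
        return -1;        /* stay at fifo_up = 0 and retry later */
    }
    fifo_up = 1;
    return fd;
}
```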

Below is a state machine diagram detailing how the log FIFO moves from one state to the next.

Log-FIFO-State-Machine

fifo_up is a global variable that represents whether the pipe is ready to be written to (fifo_up = 1) or not (fifo_up = 0). When the logger is initialized, fifo_up starts at 0, and logger_init tries to make and open the FIFO pipe for the first time. The first process to call mkfifo will succeed, because the FIFO pipe does not yet exist on the system and will be created. Every subsequent call to mkfifo from any process will fail, because you cannot create two FIFO pipes with the same name; we ignore this error (EEXIST).

Each time we call log_printf after the initial logger_init and fifo_up is still 0, we attempt to make and open the FIFO again. If the read end of the pipe is not yet ready (Dawn has not connected), open will fail with error code ENXIO and we stay at fifo_up = 0. Otherwise, if we successfully opened the FIFO pipe, fifo_up = 1 and we now can write logs to the FIFO to be processed by the network handler thread.

If Dawn suddenly disconnects, the network handler thread will be terminated and the read end of the pipe will be closed. The next time a process tries to write to the pipe, it will result in the signal SIGPIPE being generated (due to attempting to write to a pipe with no read end). We catch that signal, set fifo_up = 0 again, and go back to trying to make and open the pipe on every call to log_printf.

Potential Shortcomings

One small pitfall that the logger has right now is that there is no synchronization between processes to ensure that only one log is emitted at a time. If two processes happen to emit one log each at the exact same time, and the operating system performs a context switch at an unfortunate time, it is possible for logs to get duplicated, interleaved, or even dropped.

However, this is much less likely to happen between processes than between multiple threads in the same process. There is protection against multiple threads writing logs at the same time: a pthread_mutex lock around the entire log generation logic. If we wanted protection between processes, we could replace that pthread mutex with a semaphore instead, but that would require a setup similar to shared memory to create and unlink the semaphore when Runtime starts and terminates. This was deemed unnecessary since, under normal operation, the logger should only emit logs at most once or twice per second. If there are a ton of logs coming out at once, they will be from student code running in executor and not from any other process, so the chances of interference between different processes are small.
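The thread-level protection described above can be sketched as one mutex held across the whole formatting path. The function name and the use of vsnprintf here are stand-ins for the real log generation logic; only the pthread mutex idea comes from the text.

```c
#include <pthread.h>
#include <stdarg.h>
#include <stdio.h>

static pthread_mutex_t log_lock = PTHREAD_MUTEX_INITIALIZER;

/* Sketch: serialize log generation within one process so that two
 * threads cannot interleave while building a message. */
static void format_log(char *out, size_t len, const char *format, ...) {
    pthread_mutex_lock(&log_lock);
    va_list args;
    va_start(args, format);
    vsnprintf(out, len, format, args);   /* stands in for building final_msg */
    va_end(args);
    pthread_mutex_unlock(&log_lock);
}
```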