02. The Very First Skeleton

Guanzhou Hu edited this page Jan 15, 2022 · 18 revisions

In this very first step, we will build a "Hello, world!" kernel skeleton from scratch, running on the Intel x86 IA32 architecture.

James's tutorial is not recommended here because many of its technical choices are obsolete and inappropriate. Instead, follow the Bare Bones page of the OSDev wiki. We should scan through the wiki page (and all the subpages it links to) carefully since a correct, robust environment saves time in later development.

We are making a kernel from scratch, so we need a host OS as the development environment until our own system is self-hosting (which is a long way off, if it ever happens). I am developing on Ubuntu 18.04. Other platforms are fine, but I would recommend a Linux-based distribution since building the required toolchain on it is much easier. (If your main host is non-Linux, try Vagrant, a handy VM management tool that lets you work within a Linux VM.)

Main References of This Chapter

Scan through them before going forth:

Development Environment Setup

The very first step towards any successful project must be setting up a proper development environment. For developing Hux, we will make the following preparations.

QEMU Emulator & GRUB Bootloader

How Does a Kernel Load? Here is a flowchart I made that demonstrates a simplified OS booting procedure:

In this project, we will be using these two wonderful tools to help boot our kernel:

  • QEMU system-i386 >= v6.2.0 as the hardware platform (the computer that runs our OS)
    # If your host OS is Debian/Ubuntu and the apt package is too old, build from source:
    $ sudo apt install ninja-build libpixman-1-dev
    $ wget https://download.qemu.org/qemu-6.2.0.tar.xz
    $ tar xvJf qemu-6.2.0.tar.xz
    $ cd qemu-6.2.0
    $ ./configure --target-list=i386-softmmu
    $ make -j$(nproc)
    # Now, prepend $(pwd)/build to PATH in your shell.
  • GRUB v2 as the bootloader (its Multiboot protocol hands control to the kernel in 32-bit protected mode)
    # If your host OS is Debian/Ubuntu:
    $ sudo apt install grub2-common xorriso
    # We will use grub-mkrescue (a GRUB tool that relies on xorriso) to make the CDROM image.

This means that the only thing left to boot our OS is a small piece of x86 assembly booting code boot.s following GRUB's multiboot specification. Sounds good, it won't be that hard!

Cross-compiler For i686-elf

Our Hux kernel should be compiled into a 32-bit x86 ELF. Hux will definitely be different from your host OS, so we must set up a cross-compiling toolchain to ensure safety and correctness ✭.

  • Your host system's gcc (called the system compiler in the wiki) should be no older than the cross compiler you are building
  • If the official GNU mirror is slow, try the USTC mirror or another one near you

Follow the "Building GCC" page to upgrade your system compiler & other dependencies to the latest release. Then, follow the "GCC Cross-Compiler" page to build a gcc cross-compiler of the same version for the i686-elf target.

You should now have a cross-compiler i686-elf-gcc ready for use (and exported in PATH). It runs on your host OS, yet produces ELF files for our target architecture x86 IA32 and does not rely on any C library for the target. A set of other binary utilities, e.g., i686-elf-objdump, should be ready as well.

"Hello, world!" Kernel

Bootstrapping Assembly Code

We normally need to write this piece of booting code in assembly because we do not yet have any memory abstractions over the physical memory at this point. No higher-level language (including C) will work without a basic stack.

Code @ src/boot/boot.s:

/**
 * Declare the multiboot header. It should contain "magic", "flags", and
 * "checksum" in order. GRUB will search for this header in the first 8KiB of
 * the kernel file, aligned to 32-bit boundary.
 */
.set MAGIC,     0x1BADB002

.set ALIGN,     1<<0    /** Align modules on page boundaries. */
.set MEMINFO,   1<<1    /** Provide memory map. */
.set FLAGS,     ALIGN | MEMINFO

.set CHECKSUM,  -(MAGIC + FLAGS)    /** Add up to 0. */

/** Header section here. */
.section .multiboot
.align 4
.long MAGIC
.long FLAGS
.long CHECKSUM


/**
 * The kernel must provide a temporary stack for code execution at booting,
 * when the entire virtual memory system and user mode execution have not been
 * set up yet. We make a 16KiB stack space here.
 * 
 * It will grow downwards from current 'stack_hi'. The stack must be
 * aligned to 16 Bytes according to the System V ABI standard.
 */
.section .bss
.align 16
stack_lo:
.skip 16384     /** 16 KiB. */
stack_hi:


/**
 * Our linker script will use '_start' as the entry point to the kernel. No
 * need to return from the _start function because the bootloader is gone
 * by this point.
 *
 * We will be in 32-bit x86 protected mode. Interrupts disabled, paging
 * disabled, and processor state as defined in multiboot specification 3.2.
 * The kernel has full control over the machine.
 */
.section .text
.global _start
.type _start, @function
_start:

    /** Setup the kernel stack by setting ESP to our 'stack_hi' symbol. */
    movl $stack_hi, %esp

    /**
     * Other processor state modifications and runtime supports (such as
     * enabling paging) should go here. Make sure your ESP is still 16 Bytes
     * aligned.
     */
    
    /** Jump to the 'kernel_main' function. */
    call kernel_main

    /**
     * Put the computer into infinite loop if it has nothing more to do after
     * the main function. The following is such a trick.
     */
    cli         /** Disable interrupts. */
halt:
    hlt         /** Halt and wait for the next interrupt. */
    jmp halt    /** Jump to 'halt' if any non-maskable interrupt occurs. */


/**
 * Set the size of the '_start' symbol as the current location '.' minus
 * the starting point. Useful when later debugging or implementing call
 * stack tracing.
 */
.size _start, . - _start

It is written in AT&T syntax, so it can be assembled directly by i686-elf-as (the GNU assembler, GAS) installed along with the cross compiler.
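The checksum rule in the header can be checked on the host before the image ever reaches GRUB. The following is a minimal sketch in hosted C (built with the system compiler, not part of the kernel). The helper names `multiboot_checksum` and `multiboot_header_ok` are hypothetical, and the constants simply mirror the `.set` values in boot.s.

```c
#include <stdint.h>

/* Values copied from the .set directives in boot.s. */
#define MB_MAGIC   UINT32_C(0x1BADB002)
#define MB_ALIGN   (UINT32_C(1) << 0)
#define MB_MEMINFO (UINT32_C(1) << 1)

/*
 * Hypothetical helper: the checksum is the value that makes
 * magic + flags + checksum equal 0 in 32-bit unsigned arithmetic.
 */
uint32_t
multiboot_checksum(uint32_t magic, uint32_t flags)
{
    return (uint32_t) (0u - (magic + flags));
}

/* Hypothetical helper: returns 1 if the three fields sum to zero (mod 2^32). */
int
multiboot_header_ok(uint32_t magic, uint32_t flags, uint32_t checksum)
{
    return (uint32_t) (magic + flags + checksum) == 0;
}
```

With the flags used here (ALIGN | MEMINFO = 3), the checksum field works out to 0xE4524FFB, since 0x1BADB002 + 3 + 0xE4524FFB wraps around to 0 in 32 bits.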

Kernel Entry Code

Now comes the actual C kernel code that boot.s kicks the processor into. When writing an OS, we must write freestanding C code (instead of hosted C code) because we do not have a C standard library or other runtime support. The only headers we can include from outside are those shipped with the compiler itself:

  • <stdbool.h> for the bool datatype
  • <stddef.h> for size_t and NULL
  • <stdint.h> for fixed length datatypes, e.g., int32_t
  • <limits.h>, <stdarg.h>, <float.h>, and some more
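Anything beyond these headers has to be written by hand. As a small illustration, and not part of the Hux source, the helpers below (hypothetical names `kstrlen` and `kstreq`) rely only on types from the freestanding headers:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: length of a NUL-terminated string, like strlen(). */
size_t
kstrlen(const char *str)
{
    size_t len = 0;
    while (str[len])
        len++;
    return len;
}

/* Hypothetical helper: true if the two strings are byte-wise equal. */
bool
kstreq(const char *a, const char *b)
{
    /* Walk both strings while they match and 'a' has not ended. */
    while (*a && *a == *b) {
        a++;
        b++;
    }
    /* Equal only if both stopped at the same character (the NUL). */
    return *a == *b;
}
```

Because nothing here touches a C library, the same code compiles unchanged with the host compiler (handy for quick tests) and with the cross compiler.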

Code @ src/kernel.c:

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>


/** Check correct cross compiling. */
#if !defined(__i386__)
#error "The Hux kernel needs to be compiled with an 'ix86-elf' compiler"
#endif


/**
 * Minimal VGA terminal display support.
 * VGA text mode display buffer is at address 0xB8000 by specification.
 */
uint16_t *VGA_BUFFER = (uint16_t *) 0xB8000;

static const size_t VGA_WIDTH  = 80;
static const size_t VGA_HEIGHT = 25;
static const uint8_t VGA_COLOR = 0 << 4 | 15;   /** Black bg | White fg. */

/** A VGA entry (here, a char) = [4bits bg | 4bits fg | 8bits content]. */
static inline uint16_t
vga_entry(unsigned char c)
{
    return (uint16_t) c | (uint16_t) VGA_COLOR << 8;
}

/** Current terminal position. */
size_t terminal_row;
size_t terminal_col;

void
terminal_init(void)
{
    terminal_row = 0;
    terminal_col = 0;

    /** Flush the window to be all spaces. */
    for (size_t y = 0; y < VGA_HEIGHT; ++y) {
        for (size_t x = 0; x < VGA_WIDTH; ++x) {
            VGA_BUFFER[y * VGA_WIDTH + x] = vga_entry(' ');
        }
    }
}

void
terminal_putchar(char c)
{
    VGA_BUFFER[terminal_row * VGA_WIDTH + terminal_col] = vga_entry(c);

    /** When cursor hits the window boundary. */
    if (++terminal_col == VGA_WIDTH) {
        terminal_col = 0;
        if (++terminal_row == VGA_HEIGHT) {
            terminal_row = 0;
        }
    }
}

void
terminal_writestring(const char *str)
{
    /** Calculate string length. */
    size_t len = 0;
    while (str[len])
        len++;

    for (size_t i = 0; i < len; ++i)
        terminal_putchar(str[i]);
}


/** The main function that `boot.s` jumps to. */
void
kernel_main(void)
{
    terminal_init();
    terminal_writestring("Hello, world!");
}

We will stick to the gnu99 C standard (C99 with GNU extensions) throughout this whole project ✭.

You can ignore VGA terminal functions in this piece of code for now. We will talk about special addresses mapping to other devices instead of the main memory, and re-design the VGA text-mode graphics driver in the next chapter.
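Even before the driver is redesigned, the cell encoding and cursor arithmetic used in kernel.c can be tried out on the host. A minimal sketch, assuming hypothetical helper names (`vga_color`, `vga_entry_colored`, `vga_index`, `cursor_advance`) and constants (`TEXT_COLS`, `TEXT_ROWS`) that are not part of kernel.c:

```c
#include <stddef.h>
#include <stdint.h>

/* Text-mode screen size, matching VGA_WIDTH and VGA_HEIGHT in kernel.c. */
enum { TEXT_COLS = 80, TEXT_ROWS = 25 };

/* Hypothetical helper: pack 4-bit background and foreground into one byte. */
uint8_t
vga_color(uint8_t bg, uint8_t fg)
{
    return (uint8_t) ((bg & 0x0F) << 4 | (fg & 0x0F));
}

/* Same layout as vga_entry() in kernel.c, with the color passed in. */
uint16_t
vga_entry_colored(unsigned char c, uint8_t color)
{
    return (uint16_t) ((uint16_t) c | (uint16_t) color << 8);
}

/* Linear index of the cell at (row, col) in the text buffer. */
size_t
vga_index(size_t row, size_t col)
{
    return row * TEXT_COLS + col;
}

/* Advance the cursor by one cell, wrapping the way terminal_putchar() does. */
void
cursor_advance(size_t *row, size_t *col)
{
    if (++*col == TEXT_COLS) {
        *col = 0;
        if (++*row == TEXT_ROWS)
            *row = 0;
    }
}
```

For example, a white-on-black 'H' encodes to 0x0F48, and the last cell of the screen (row 24, column 79) sits at index 1999.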

Linker Script

We may now compile kernel.c and assemble boot.s both into object files. However, they do not combine as a runnable OS ELF until we correctly link them. The GNU linker ld installed during cross compiler setup can do the linking under the guidance of our own linker script kernel.ld. It specifies where to put each section, thus indirectly specifies the kernel part's memory layout in physical memory.

Code @ scripts/kernel.ld:

/** Starts execution at the '_start' symbol as defined in `boot.s`. */
ENTRY(_start)


/** Sections layout. */
SECTIONS
{
    /**
     * Kernel's booting code will be loaded starting at 1MiB address by the
     * bootloader by convention.
     */
    . = 1M;

    .text ALIGN(4K):    /** Align to 4KiB boundary. */
    {
        KEEP(*(.multiboot))     /** Put multiboot header before code. */
        *(.text)
        *(.comment)
    }

    .rodata ALIGN(4K):
    {
        *(.rodata)
    }

    .data ALIGN(4K):
    {
        *(.data)
    }

    .bss ALIGN(4K):
    {
        *(.bss)     /** Includes our 16KiB temporary stack. */
        *(COMMON)
    }
}
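To get a feel for what `ALIGN(4K)` does to the layout, the address arithmetic can be reproduced in a few lines of hosted C. This is only a rough sketch under simplifying assumptions: the section sizes are made-up inputs, padding inside sections is ignored, and the real linker may drop empty output sections. The names `align_up`, `section_layout`, and `layout_sections` are hypothetical.

```c
#include <stdint.h>

#define KERNEL_BASE UINT32_C(0x100000)   /* '. = 1M' in kernel.ld. */
#define PAGE_SIZE   UINT32_C(4096)       /* 'ALIGN(4K)' in kernel.ld. */

/* Round 'addr' up to the next multiple of 'align' (must be a power of two). */
uint32_t
align_up(uint32_t addr, uint32_t align)
{
    return (addr + align - 1) & ~(align - 1);
}

/* Start address of each output section, plus the end of the image. */
struct section_layout {
    uint32_t text, rodata, data, bss, end;
};

/* Place the four sections one after another, each on a 4KiB boundary. */
struct section_layout
layout_sections(uint32_t text_size, uint32_t rodata_size,
                uint32_t data_size, uint32_t bss_size)
{
    struct section_layout l;
    l.text   = align_up(KERNEL_BASE, PAGE_SIZE);
    l.rodata = align_up(l.text + text_size, PAGE_SIZE);
    l.data   = align_up(l.rodata + rodata_size, PAGE_SIZE);
    l.bss    = align_up(l.data + data_size, PAGE_SIZE);
    l.end    = l.bss + bss_size;
    return l;
}
```

The addresses actually chosen by the linker can be compared against this estimate with `i686-elf-objdump -h hux.bin`.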

GRUB Menu Config

Put grub.cfg with the following content @ scripts/grub.cfg.

menuentry "Hux" {
    multiboot /boot/hux.bin    
}

This tells GRUB to show a boot menu entry named "Hux", which boots the kernel image hux.bin (inside the CDROM ISO described below).

Centralized Makefile

The remaining steps are:

  1. Verifying multiboot
  2. Creating a bootable CDROM image
  3. Launching the Hux OS with QEMU from that image

These steps, along with compiling, assembling, and linking the above code, can all be centralized within a Makefile:

#!Makefile


TARGET_BIN=hux.bin
TARGET_ISO=hux.iso

C_SOURCES=$(shell find ./src/ -name "*.c")
C_OBJECTS=$(patsubst %.c, %.o, $(C_SOURCES))

S_SOURCES=$(shell find ./src/ -name "*.s")
S_OBJECTS=$(patsubst %.s, %.o, $(S_SOURCES))

ASM=i686-elf-as
ASM_FLAGS=

CC=i686-elf-gcc
C_FLAGS=-c -Wall -Wextra -ffreestanding -O2 -std=gnu99

LD=i686-elf-gcc
LD_FLAGS=-ffreestanding -O2 -nostdlib

HUX_MSG="[--Hux->]"


#
# Targets for building.
#
all: $(S_OBJECTS) $(C_OBJECTS) kernel verify update

$(S_OBJECTS): %.o: %.s
    @echo
    @echo $(HUX_MSG) "Compiling kernel assembly '$<'..."
    $(ASM) $(ASM_FLAGS) -o $@ $<

$(C_OBJECTS): %.o: %.c
    @echo
    @echo $(HUX_MSG) "Compiling kernel C code '$<'..."
    $(CC) $(C_FLAGS) -o $@ $<

kernel: $(S_OBJECTS) $(C_OBJECTS)
    @echo
    @echo $(HUX_MSG) "Linking kernel image..."  # Remember to link 'libgcc'.
    $(LD) $(LD_FLAGS) -T scripts/kernel.ld -o $(TARGET_BIN) $(S_OBJECTS) $(C_OBJECTS) -lgcc


#
# Verify GRUB multiboot sanity.
#
.PHONY: verify
verify:
    @if grub-file --is-x86-multiboot $(TARGET_BIN); then \
        echo;                                            \
        echo $(HUX_MSG) "VERIFY MULTIBOOT: Confirmed ✓"; \
    else                                                 \
        echo;                                            \
        echo $(HUX_MSG) "VERIFY MULTIBOOT: FAILED ✗";    \
    fi


#
# Update CDROM image.
#
.PHONY: update
update:
    @echo
    @echo $(HUX_MSG) "Writing to CDROM..."
    mkdir -p isodir/boot/grub
    cp $(TARGET_BIN) isodir/boot/$(TARGET_BIN)
    cp scripts/grub.cfg isodir/boot/grub/grub.cfg
    grub-mkrescue -o $(TARGET_ISO) isodir


#
# Launching QEMU/debugging.
#
.PHONY: qemu
qemu:
    @echo
    @echo $(HUX_MSG) "Launching QEMU..."
    qemu-system-i386 -vga std -cdrom $(TARGET_ISO)


#
# Clean the produced files.
#
.PHONY: clean
clean:
    @echo
    @echo $(HUX_MSG) "Cleaning the build..."
    rm -f $(S_OBJECTS) $(C_OBJECTS) $(TARGET_BIN) $(TARGET_ISO)

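Pay attention to the compiling/linking flags; they are important. Explanations can be found on the Bare Bones page. Also remember that recipe lines in a real Makefile must be indented with a tab character, not spaces.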
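One quick way to see the effect of two of these flags is through the macros the compiler predefines. The sketch below is an illustration only, with hypothetical function names `build_mode` and `c_standard_version`: under `-ffreestanding`, GCC defines `__STDC_HOSTED__` as 0, and under `-std=gnu99` it defines `__STDC_VERSION__` as 199901L.

```c
/* __STDC_HOSTED__ is predefined by the compiler: 1 if hosted, 0 if freestanding. */
#if __STDC_HOSTED__
#define BUILD_MODE "hosted"
#else
#define BUILD_MODE "freestanding"
#endif

/* Hypothetical helper: which environment this file was compiled for. */
const char *
build_mode(void)
{
    return BUILD_MODE;
}

/* Hypothetical helper: the C standard version selected by -std=... */
long
c_standard_version(void)
{
    return __STDC_VERSION__;
}
```

Built with the host compiler and default flags, `build_mode()` reports "hosted". Built with the C_FLAGS from the Makefile, it should report "freestanding", and `c_standard_version()` should return 199901L.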

Progress So Far

Current repo structure:

hux-kernel
├── Makefile
├── scripts
│   ├── grub.cfg
│   └── kernel.ld
├── src
│   ├── boot
│   │   └── boot.s
│   └── kernel.c

Run the full build chain (assemble, compile, link, verify, write into the CDROM image) with:

$ make [verify|update]  # If you only want to verify multiboot/update the image.

Clean up the build by:

$ make clean

Launch the Hux kernel on QEMU with:

$ make qemu

You should see the GRUB UI:

Hit [Enter] to select our Hux OS:

Now we have a self-made (though minimal) operating system kernel!

Running QEMU within a desktop VM might be problematic (there are keyboard mapping and networking issues). If you develop within a Linux VM, you could sync the dev folder between your host system and the VM, build the image in the VM, and then use QEMU on your host system to launch Hux.