Lecture 8 - 5 October 2023

IMPORTANT ANNOUNCEMENT

errerewerewrerewwwe... the following is a test of the emergency broadcast system. errrrrwerwererwerwe ... ....

The instructors are willing to spend more time helping students who take initiative, and are less patient when asked questions at 11 PM on a due date whose answers are easily found with a web search. If you have made it this far in computer science, then you should have at least moderate proficiency in searching the internet for the solution to your computer problems.

BEFORE you contact an instructor about a problem you are having, the expectation is that:

IF you are able to solve your own problem, but you think that:

Then, we invite you to contact us and we will appreciate your participation and take it into account. But if you are interested in contributing your change to our open source project that is the entire course, we will guide you through the process of making, testing, and submitting the change. Students who go this extra mile and complete the process will be given extra credit in our course and public recognition for their contribution.

Keep in mind that this policy covers anything in the course, including code, scripts, process, curriculum, formatting, style, security, design, testing, automation, devops, site reliability, IT, or any other aspect that you can imagine and convince us is a part of the course.

HOWEVER, if you are still stuck and you have tried a couple of solutions, include ALL of the following in your request for help:

The more complete this information is when you ask for help, the more generous and flexible the instructors will be in their effort to help you.

You will get out of this course what you put into it.

And now back to your regularly scheduled programming.

Topics covered:

Assignment Questions

A student asked us to clarify the P1 due date.

P1 is NOT due next Tuesday. We want to give you plenty of time to do P1 since it’s a bit more involved than E1. As a reminder, P assignments are the most heavily weighted component of the course outside of the final assignment and presentation combo.

As always, we encourage you to get started early so you have plenty of time to take breaks when you run into problems and later return to your work with a fresh mind.

Next, a student asked when their grades will be available. E0 grades are available upon request via Matrix DM to one of the instructors. P0 grades are not yet available and any new information will be shared as soon as it is ready.

The instructors wish to congratulate our students. This group delivered the all-time strongest average performance on the kernel compilation assignment! Nice job!

C compilation process deep dive: how a C program becomes an executable

Take this C source file hello.c:

#include <stdio.h>

int main(void)
{
	puts("Hello, world!");
	return 0;
}

You should be familiar with the following invocation:

gcc hello.c -o hello

This is the most basic way to compile a C source file into a binary file you can execute.

Most of you are also familiar with breaking this process into two steps:

Compilation:

gcc -c hello.c -o hello.o

Linking:

gcc hello.o -o hello

This has advantages for large projects because the compilation can be done in parallel, and as you edit the code, only the files that you change need to be recompiled.

While two steps are enough for practical purposes (e.g. decreasing build time), it is not the full story. In reality, the C compiler performs at least four distinct processes behind the scenes: preprocessing, compilation, assembly, and linking.

The command gcc -c hello.c -o hello.o encompasses the preprocessing, compilation, and assembly steps, while the command gcc hello.o -o hello encompasses the linking step.

We can invoke each step manually like so:

  1. Preprocessing

     cpp hello.c -o hello.i
    

The C preprocessor removes comments, collapses whitespace, expands #include directives, and resolves macros.

The output is traditionally given the suffix .i which stands for intermediate.

  2. Compilation

     cc -S hello.i -o hello.s
    

This is where C language constructs like variables, types and control-flow are flattened into undifferentiated data and code.

After this point, we have no way to tell with certainty that this assembly output came from C program input. A compiler for a different language could plausibly generate identical assembly output.

  3. Assembly

     as hello.s -o hello.o
    

The instructions are replaced with their machine code equivalents. This part is reversible, but the assembler also rips out the last remnants of structure left over from the original C program source. Static data and functions lose their names and are referred to by only their address, and any exported or imported variables and functions become generic entries in a symbol table. All other labels (e.g. the target of a jump within the same function) are gone without a trace.

After this step, the output is no longer human readable text.

  4. Linking

     ld hello.o -l:crt1.o -lc -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o hello
    

Even though our functions have been compiled into machine code that our CPU could in theory execute, there is still work to be done. The linker collects the dependencies of our program (the C startup runtime -l:crt1.o that provides the _start symbol and the C standard library -lc which provides the puts symbol) and bundles them into one file. The linker makes connections between object files by cross-referencing their symbol tables to resolve previously unresolved symbols with their now known locations.

In reality symbol resolution is an instance of the classic engineering trade-off between execution speed and memory footprint. Our C program, like most, is at least partially dynamically linked at runtime (-dynamic-linker /lib64/ld-linux-x86-64.so.2).

The output is an executable ELF file that the kernel loader can load into memory and execute on a CPU.

Crossing CPU privilege boundaries

In previous lectures, we’ve discussed what CPU privilege levels are and what they mean.

This lecture will round off that discussion by analyzing in detail the final mechanism for crossing into a privileged execution mode: syscalls.

Exceptions review

We’ve previously demonstrated how attempting to execute an invalid instruction or a privileged instruction while in user mode causes a CPU exception. At the hardware level, the CPU immediately switches its privilege mode to kernel mode and jumps to the corresponding kernel function, installed at boot, to handle the exception. In this case, the handler function prints an error message to the kernel ring buffer and kills our program.

More information about exceptions can be found in L05 and L06.

To give a couple more examples: Software conditions such as dividing by zero or accessing an unaligned address trigger CPU exceptions. Hardware can of course also interrupt CPU execution by changing voltage on CPU pins. Finally, attempting to access a pointer (virtual memory address) for which the memory management unit does not have a corresponding physical address triggers a page fault exception, which the kernel may resolve by setting up an appropriate mapping (e.g. memory that was lazily allocated or swapped to disk), or by sending the program the SIGSEGV signal, otherwise known as a “Segmentation Fault”.

Syscalls: end-to-end

All previously discussed exceptions were just that: exceptional.

The kernel handled them behind the scenes, invisibly to the user, or they were the result of a bug, in which case the kernel sent the program a fatal signal (e.g. SIGILL, SIGSEGV, and SIGFPE).

Today we demonstrate the mechanism by which your code can intentionally call a system function running in kernel mode; in other words, a system call.

We want to know how the userspace invocation ultimately connects back to the SYSCALL_DEFINE macro that you hunted down, traced, and wrote a history report on during E1.

Let’s follow the path of a syscall on x86.

First of all, a user program invokes the syscall instruction.

As the instruction's documentation describes, the CPU immediately elevates its privilege level and jumps to the address stored in the LSTAR MSR.

Who set the value of this LSTAR Model Specific Register?

In the syscall_init function, we find this line:

	wrmsrl(MSR_LSTAR, (unsigned long)entry_SYSCALL_64);

This sets the value of LSTAR to the address of the entry_SYSCALL_64 function.

entry_SYSCALL_64 is not C code :)

It is defined in a file with the .S suffix, which indicates this is not just regular assembly code, but code that must be run through cpp in order to resolve macros in the file. The kernel uses these macros to make it easier to write the code correctly and to facilitate interoperability with the rest of the kernel.

At this time, the kernel is executing in a privileged mode, but all of the registers refer to userspace data.

To be continued…