Limited Direct Execution

CPU virtualization recall

To support the illusion of having multiple processes running concurrently on a single physical CPU, we can have the CPU run one process for a while, then run another, and so on.
This is called time sharing.
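
As a rough illustration of time sharing, the sketch below hands out fixed-size slices of one simulated CPU to several processes in turn. The function name time_share and its parameters are hypothetical, and a real OS switches on timer interrupts rather than in a loop like this.

```c
#include <stddef.h>

/* Toy round-robin time sharing on one simulated CPU.
 * remaining[i] is the work left for process i; each turn runs at most
 * `slice` units of it. The index of the process that ran in each turn is
 * written to trace (up to cap entries). Returns the number of turns taken. */
size_t time_share(int *remaining, size_t n, int slice, int *trace, size_t cap)
{
    size_t turns = 0;
    int busy = 1;

    if (slice <= 0)
        return 0;                      /* a zero-length slice would never finish */

    while (busy) {
        busy = 0;
        for (size_t i = 0; i < n; i++) {
            if (remaining[i] <= 0)
                continue;              /* process i has already finished */
            remaining[i] -= (remaining[i] < slice) ? remaining[i] : slice;
            if (turns < cap)
                trace[turns] = (int)i; /* record who ran in this turn */
            turns++;
            busy = 1;
        }
    }
    return turns;
}
```

With three processes needing 3, 1, and 2 units of work and a slice of 1, the recorded order is 0, 1, 2, 0, 2, 0: every process makes progress even though only one runs at any instant.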

Design goals of CPU virtualization

Performance
- The process can be itself and run as fast as possible without frequent interaction with the OS.
Control
- We want to avoid the scenario where a process can run forever and take over all the machine’s resources or a process performs unauthorized actions. This requires interaction with the OS.

The question!

How to efficiently virtualize the CPU with control?

Efficient?

The most efficient way to execute a process is through direct execution.

Problem!

Once the program begins to run, the OS becomes a complete outsider.
No control over the running program.
Problem 1: The program can access anything it wants to, including restricted operations (direct access to hardware devices, especially I/O for unauthorized purposes).
Problem 2: The program may never switch to a different process without explicit instructions in main(), thus defeating the purpose of time sharing.

Problem: working with restricted operations

The process should be able to perform restricted operations, such as doing disk I/O or opening network connections.
But we should not give the process complete control of the system.

Solution: hardware support via processor modes

User mode
Kernel mode

Processor modes

A mode bit is added to hardware to support distinguishing between user mode and kernel mode.
Some instructions are designated as privileged instructions that cannot be run in user mode (only in kernel mode).
A user-mode process that tries to execute a privileged instruction raises a protection fault and is killed by the OS.
How can these instructions be called by a process in user-mode?
- System calls
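
The mode-bit check can be sketched as a toy model in C. Everything here (toy_cpu, exec_privileged) is a hypothetical name for illustration; on real hardware the check is done by the processor itself, not by software.

```c
#include <stdbool.h>

/* Toy model only: a real CPU keeps the mode in a hardware register. */
enum cpu_mode { USER_MODE = 0, KERNEL_MODE = 1 };

struct toy_cpu {
    enum cpu_mode mode;   /* the "mode bit" */
    bool faulted;         /* set when a protection fault is raised */
};

/* Attempt a privileged instruction. Only kernel mode may execute it;
 * in user mode the attempt raises a protection fault instead. */
bool exec_privileged(struct toy_cpu *cpu)
{
    if (cpu->mode != KERNEL_MODE) {
        cpu->faulted = true;   /* the OS would now kill the process */
        return false;
    }
    return true;               /* instruction executed */
}
```

Calling exec_privileged() on a CPU in USER_MODE returns false and sets faulted, while the same call in KERNEL_MODE succeeds.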

System calls

A small set of APIs for restricted operations.
- XV6 system calls
- Linux x86_64 has 335 system calls (numbered 0 to 334) - Last commit June 2, 2018.
- Linux uses a sys_call_table to keep the syscall handlers.
- Syscall_64.tbl
These system calls enable a user-mode process to have restricted operations performed on its behalf without gaining complete control over the system.

How does a system call happen?

To make a system call, the process needs to switch from user mode to kernel mode, do the privileged operation, and then switch back.
This is done through hardware support.
- Requires assembly instructions
- trap: go from user mode to kernel mode.
- return-from-trap: go back from kernel mode to user mode.

System calls versus normal C calls?

Function declarations are the same
System calls
- have a trap instruction in them.
- have an extra level of indirection (movements between modes).
- perform restricted operations.
- have bigger overhead and are slower than equivalent function calls.
- can use the kernel stack.
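
To see the contrast outside xv6, the sketch below assumes a Linux system with glibc, where syscall() from <unistd.h> issues a system call by number. The helper names pid_via_raw_syscall and add_one are made up for this example.

```c
#define _DEFAULT_SOURCE
#include <sys/syscall.h>   /* SYS_getpid */
#include <unistd.h>        /* syscall() */

/* System call: traps into the kernel, which looks up our PID and
 * returns it. This is the same request the getpid() wrapper makes. */
long pid_via_raw_syscall(void)
{
    return syscall(SYS_getpid);
}

/* Ordinary C call: stays in user mode, no trap and no mode switch. */
long add_one(long x)
{
    return x + 1;
}
```

Both functions look alike to the caller, but only the first one crosses into kernel mode, which is where the extra overhead comes from.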

Tracing echo.c

Open Code server (localhost:18088 or 127.0.0.1:18088)
Open the terminal panel
- Clone and build xv6-riscv. Include the make qemu step if this is a new container. Check the folder sidebar for confirmation.

user/usys.S

Generated by user/usys.pl during build.
Contains the trap-generating code (ecall) for each syscall.

ecall is used by a lower-privileged mode to request a service from a higher-privileged mode.
When ecall is executed, the CPU triggers a trap into Supervisor mode.
- Trap cause is recorded in scause.
- Address of the instruction invoking the ecall is saved in sepc.
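
The register updates above can be mimicked with a toy model in C. The struct and function names are hypothetical, and stvec (the RISC-V register holding the trap handler address) is not mentioned in these notes. The model is heavily simplified: for instance, it folds the step of advancing past the ecall into the return, whereas xv6 adds 4 to the saved epc in usertrap().

```c
#include <stdint.h>

#define SCAUSE_ECALL_FROM_U 8   /* RISC-V cause code: environment call from U-mode */

enum priv { PRIV_USER, PRIV_SUPERVISOR };

/* Toy model of the few pieces of hart state involved in a trap. */
struct toy_hart {
    enum priv mode;
    uint64_t pc;       /* address of the instruction being executed */
    uint64_t scause;   /* why the trap happened */
    uint64_t sepc;     /* where the trap happened */
    uint64_t stvec;    /* where the trap handler lives */
};

/* Roughly what the hardware does when user code executes ecall. */
void toy_ecall(struct toy_hart *h)
{
    h->scause = SCAUSE_ECALL_FROM_U;
    h->sepc = h->pc;            /* address of the ecall itself */
    h->mode = PRIV_SUPERVISOR;
    h->pc = h->stvec;           /* jump to the trap handler */
}

/* Return-from-trap: resume user code at the instruction after the ecall. */
void toy_return_from_trap(struct toy_hart *h)
{
    h->pc = h->sepc + 4;        /* ecall is a 4-byte instruction */
    h->mode = PRIV_USER;
}
```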

kernel/trap.c

Handles traps including syscalls and timer interrupts
syscall() is invoked inside void usertrap(void).

kernel/syscall.c

Dispatch table and syscall handler logic
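
A dispatch table of this kind is an array of function pointers indexed by syscall number. The sketch below is a standalone imitation, not the xv6 source: the toy_ numbers and handlers are invented, and only the overall shape (table lookup guarded by a range check) is meant to match.

```c
#include <stddef.h>
#include <stdint.h>

#define NELEM(x) (sizeof(x) / sizeof((x)[0]))

/* Hypothetical syscall numbers and handlers, for illustration only. */
enum { TOY_SYS_getpid = 1, TOY_SYS_uptime = 2 };

static uint64_t toy_sys_getpid(void) { return 42; }
static uint64_t toy_sys_uptime(void) { return 1000; }

/* Table indexed by syscall number; slot 0 stays empty (NULL). */
static uint64_t (*toy_syscalls[])(void) = {
    [TOY_SYS_getpid] = toy_sys_getpid,
    [TOY_SYS_uptime] = toy_sys_uptime,
};

/* Look up the handler for num and run it; unknown numbers yield -1. */
uint64_t toy_dispatch(int num)
{
    if (num > 0 && (size_t)num < NELEM(toy_syscalls) && toy_syscalls[num])
        return toy_syscalls[num]();
    return (uint64_t)-1;
}
```

Here toy_dispatch(1) runs toy_sys_getpid(), while toy_dispatch(0) and toy_dispatch(99) fall through to the error value.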

kernel/sysproc.c and kernel/sysfile.c

Actual implementations like sys_fork, sys_exit, etc.
The user space definition will eventually call the kernel definition.
- Compare write() and sys_write().

Tracing instructions

Edit usertrap() in kernel/trap.c.

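if (r_scause() == 8) {
    // scause == 8: environment call (ecall) from user mode, i.e. a system call
    // a7 holds the syscall number placed there by the stub in user/usys.S
    printf("usertrap: syscall from pid %d, syscall number = %ld\n", p->pid, p->trapframe->a7);
    ...
}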
 

Edit syscall() in kernel/syscall.c:

if(num > 0 && num < NELEM(syscalls) && syscalls[num]) {
    // a0, a1, a2 hold the first three syscall arguments saved in the trapframe
    printf("syscall(): number = %d, a0 = %ld, a1 = %ld, a2 = %ld\n", num, p->trapframe->a0, p->trapframe->a1, p->trapframe->a2);
    ...
}
 

Rebuild and run xv6.

Where is hello printed out?

Problem: switching processes

A free-running process may never stop or switch to another process.
The OS needs to control the process, but how?
- Once a process is running, the OS is NOT running (the OS is but another process)
The question: How can the OS regain control of the CPU from a process so that it can switch to another process?

First approach: cooperative processes

All programmers promise to insert yield() into their code to give up CPU resources to other people’s programs.
We have solved the problem and achieved eternal world peace.
Even in a perfect world, what happens if a process falls into an infinite loop prior to calling yield()?
Cooperative multitasking (Windows 3.1x, Mac PowerPC)

Second approach: non-cooperative processes

As with processor modes, the hardware once again provides assistance, this time via a timer interrupt.
A timer device can be programmed to raise an interrupt periodically.
When the interrupt is raised, the running process stops and a pre-configured interrupt handler in the OS runs.
The OS regains control.
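
A user-space analogy, assuming a POSIX system such as Linux: a periodic SIGALRM stands in for the timer interrupt and the signal handler stands in for the OS interrupt handler. The names on_tick and spin_until_ticks are hypothetical. The busy loop never gives up control on its own, yet the handler still runs.

```c
#define _DEFAULT_SOURCE
#include <signal.h>     /* sigaction, SIGALRM, sig_atomic_t */
#include <string.h>     /* memset */
#include <sys/time.h>   /* setitimer, struct itimerval */

static volatile sig_atomic_t ticks = 0;

/* Plays the role of the interrupt handler: runs on every timer tick. */
static void on_tick(int signo)
{
    (void)signo;
    ticks++;
}

/* Spin in a loop that never yields; return the tick count once the
 * timer has fired at least `wanted` times, or -1 on setup failure. */
int spin_until_ticks(int wanted)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_tick;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    /* First tick after 10 ms, then one tick every 10 ms. */
    struct itimerval every_10ms = { {0, 10000}, {0, 10000} };
    ticks = 0;
    if (setitimer(ITIMER_REAL, &every_10ms, NULL) != 0)
        return -1;

    while (ticks < wanted) {
        /* uncooperative code: no yield() anywhere in this loop */
    }

    struct itimerval off = { {0, 0}, {0, 0} };
    setitimer(ITIMER_REAL, &off, NULL);   /* disarm the timer */
    return (int)ticks;
}
```

The loop exits only because the handler keeps getting control from outside it, which is the same reason a timer interrupt lets the OS take the CPU back from a process that never yields.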

Second approach: non-cooperative processes

The OS first decides which process to switch to (with help from the scheduler).
The OS executes a piece of assembly code (context switch).
- Save register values of the currently running process into its kernel stack.
- Restore register values of the soon-to-run process from its kernel stack.
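
The save and restore steps can be sketched as a toy model in C. The register set and the names toy_context and toy_switch are invented for illustration. A real context switch is written in assembly because it has to manipulate the actual CPU registers, including the stack pointer.

```c
#include <stdint.h>

/* Toy register set; a real context holds the architecture's saved registers. */
struct toy_context {
    uint64_t pc;
    uint64_t sp;
    uint64_t regs[4];
};

struct toy_cpu_state {
    struct toy_context live;   /* registers currently loaded on the CPU */
};

/* Save the live registers into *old, then load the registers from *next. */
void toy_switch(struct toy_cpu_state *cpu,
                struct toy_context *old,
                const struct toy_context *next)
{
    *old = cpu->live;    /* save the outgoing process */
    cpu->live = *next;   /* restore the incoming process */
}
```

Switching from process A to process B and later back to A leaves A's registers exactly as they were, so A resumes as if it had never been interrupted.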

Timer interrupt and regaining control

kernel/trap.c

Identify where in devintr() clockintr() is called, and what devintr() returns in that case.
Identify where in usertrap() the yield() function is invoked.

kernel/proc.c

What is the implementation of yield()?
Identify swtch.S, which contains the implementation of the swtch() function called inside sched().

Edit clockintr() inside kernel/trap.c
- Modify the w_stimecmp call to lengthen the interval between timer interrupts so that each interrupt is easier to observe.
- Insert a printf() to indicate when clockintr() is called.
Rebuild and run xv6, then observe the changes when executing a program.