Process Lifecycle

Some scenarios for process management:

A process would like to create a new process -- a user opens a text editor from the shell.
A process would like to duplicate itself -- the NGINX web server creates multiple worker processes to handle user requests.
A process would like to terminate itself.
A process would like to create a new process and at the same time, terminate itself -- the old process works like a "bootloader" for the target process.

All of the above tasks can be handled by a combination of the fork(), exec(), and exit() calls.
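The fourth scenario (the "bootloader" pattern) relies on exec() replacing the program while keeping the same process. The following is a minimal sketch of how that could be observed on a Linux-like system: a child replaces itself with "sh", and the new program reports its own PID. The helper name pid_reported_after_exec() is made up for this illustration, and the sketch assumes "sh" can be found on the PATH.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that replaces itself with
 * `sh -c 'echo $$'`. Stores the PID returned by fork() in *child_out and
 * returns the PID printed by the new program, or -1 on error. */
long pid_reported_after_exec(pid_t *child_out)
{
    assert(child_out != NULL);

    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t child = fork();
    if (child < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (child == 0) {
        /* Child: send stdout into the pipe, then become "sh". */
        close(fds[0]);
        if (fds[1] != STDOUT_FILENO) {
            dup2(fds[1], STDOUT_FILENO);
            close(fds[1]);
        }
        execlp("sh", "sh", "-c", "echo $$", (char *)NULL);
        _exit(127);                 /* reached only if exec failed */
    }

    /* Parent: read what the replaced program printed. */
    close(fds[1]);
    char buf[32] = {0};
    ssize_t n = read(fds[0], buf, sizeof buf - 1);
    close(fds[0]);
    waitpid(child, NULL, 0);

    *child_out = child;
    return n > 0 ? atol(buf) : -1;
}
```

If the two numbers match, the program image changed but the process (and its PID) did not.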

Process creation in *nix OSes

fork() -- "duplicate" the current process
exec() -- replace the current process with a new executable

The typical way to create a new process in *nix:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int
main(int argc, char ** argv)
{
    pid_t child_pid = fork();

    if (child_pid == 0) { // I'm the child
        printf("I'm the child. pid == %d\n", (int)getpid());
        fflush(stdout); // buffered output would be lost when exec() succeeds
        char * args[] = {"ls", "-al", "/home", NULL};
        execvp("ls", args);
        // execvp() returns only on failure
        perror("exec() failed");
        exit(EXIT_FAILURE); // do not fall through to the code after the if
    } else if (child_pid > 0) { // I'm the parent
        printf("I'm the parent. child_pid == %d\n", (int)child_pid);
        waitpid(child_pid, NULL, 0); // wait for the child to finish
    } else {
        perror("fork() failed");
        return EXIT_FAILURE;
    }
    return 0;
}

The first two processes in xv6

(Read 06: The First Process about the initialization of the first process environment)

userinit()

Create a new process. The process memory is initialized with the content in "initcode"

initcode.S

The first few instructions of the first process.
initcode's task is simple: prepare the arguments and call exec("/init")
exec() replaces the process binary with that in "/init" (the binary of init.c)

init.c

Forks and executes "sh" (the shell) -- when "sh" is up, there are two processes: "init" and "sh".

sh.c

opens the special device file "console" for input/output with keyboard and display
Every time it gets a command from the user, it calls fork(), and the child calls exec() to run the command.
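The fork-per-command loop can be sketched in a few lines of user-space C. This is not the code in sh.c: it is a simplified illustration that only splits on whitespace (no pipes, redirection, or quoting), and the names run_command_line(), shell_loop(), and MAX_ARGS are invented here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_ARGS 16   /* arbitrary limit for this sketch */

/* Splits `line` on whitespace in place, runs it in a child, and returns the
 * command's exit status (-1 if it could not be run or did not exit normally). */
int run_command_line(char *line)
{
    assert(line != NULL);

    char *argv[MAX_ARGS + 1];
    int argc = 0;
    for (char *tok = strtok(line, " \t\n"); tok != NULL && argc < MAX_ARGS;
         tok = strtok(NULL, " \t\n"))
        argv[argc++] = tok;
    argv[argc] = NULL;
    if (argc == 0)
        return 0;                   /* empty line: nothing to do */

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);      /* the child becomes the command */
        _exit(127);                 /* reached only if exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}

/* The read-eval loop: one fork() per command line. */
void shell_loop(FILE *in)
{
    char line[256];
    while (fgets(line, sizeof line, in) != NULL)
        run_command_line(line);
}
```

Calling shell_loop(stdin) from main() would give a very small interactive shell. The shell itself never calls exec(), so it survives to read the next command.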

The fork() system call

allocproc()

First, it acquires a struct proc for the new process.
It creates a private kernel stack for the new process's kernel thread.
It sets up the initial contents of the kernel stack, building a call path by placing return addresses on the stack.
Later, when the new process is switched to for the first time, it executes forkret(), then "returns" to syscall_trapret, and finally uses "sysretq" to enter user space and run the user code.
Why is the call path built from "ret" instructions? This way we only need to place a few values on the stack to make several functions run one after another. Building a forward call path is possible but more complex.
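The idea of chaining functions through return addresses can be imitated with a toy model in plain C. This is not xv6 code: the array kstack, the push() helper, and user_code() are invented for illustration, and a loop stands in for the CPU executing "ret". Only the names forkret and syscall_trapret come from the description above.

```c
#include <assert.h>
#include <string.h>

typedef void (*code_addr)(void);

#define STACK_SLOTS 8

static code_addr kstack[STACK_SLOTS];   /* toy kernel stack, grows downward */
static int sp = STACK_SLOTS;            /* index of the current top of stack */
static char trace[64];                  /* records the order of execution */

static void push(code_addr addr)
{
    assert(sp > 0);
    kstack[--sp] = addr;
}

static void user_code(void)       { strcat(trace, "user "); }
static void syscall_trapret(void) { strcat(trace, "trapret "); }
static void forkret(void)         { strcat(trace, "forkret "); }

/* Lay out the stack: the function that must run last is pushed first. */
void build_call_path(void)
{
    sp = STACK_SLOTS;
    trace[0] = '\0';
    push(user_code);
    push(syscall_trapret);
    push(forkret);
}

/* Stand-in for the CPU: every "ret" pops one address and jumps to it. */
const char *run_call_path(void)
{
    while (sp < STACK_SLOTS) {
        code_addr next = kstack[sp++];
        next();
    }
    return trace;
}
```

After build_call_path(), run_call_path() visits forkret, then syscall_trapret, then user_code, even though none of these functions calls another. In the real kernel the last step is a privilege-level switch driven by the trapframe rather than a plain return.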

copyuvm()

creates a new page table and sets up the kernel part. Note that only one entry (pml4[256]) is copied: the kernel mappings are effectively immutable at this point, so the child can point at the same subtree instead of recursively duplicating it.
duplicates the user-side program memory, page by page.
copies the trapframe, which holds the user-side execution state. Now picture both processes as being in the middle of the same call to fork().
sets the child process's %rax to 0, as its fork() should return 0.
duplicates the file descriptors, which is the standard behavior in *nix systems.

After setting its state to RUNNABLE, the child process is ready to run!
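The effect of these steps is visible from user space. The sketch below (with an invented helper name, fork_and_compare()) shows that the child starts with a copy of the parent's memory, and that a write in the child does not reach the parent.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int counter = 1;

/* Forks once. The child sets its copy of `counter` to 2 and reports it
 * through its exit status, which the parent stores in *child_saw.
 * Returns the parent's own `counter` afterwards, or -1 on error. */
int fork_and_compare(int *child_saw)
{
    assert(child_saw != NULL);
    counter = 1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* fork() returned 0: this is the child, working on its own copy. */
        counter = 2;
        _exit(counter);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    *child_saw = WEXITSTATUS(status);
    return counter;                 /* the parent's copy is untouched */
}
```

The child reports 2 while the parent still sees 1, which is the behavior the page-by-page copy is meant to provide.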

Linux's fork:

copy_process(): https://github.com/torvalds/linux/blob/master/kernel/fork.c#L1821
_do_fork(): https://github.com/torvalds/linux/blob/master/kernel/fork.c#L2380

The exec() system call

searches for the executable in the file system with namei().
reads the ELF header and checks its magic number, then walks the program headers (ph).
creates a new page table with setupkvm().
on the new page table, loads or initializes each segment of the program.
sets up the user stack with the arguments (**argv) for calling the program's first function.
uses elf.entry as the address of the first instruction of the new process.
discards the old page table.
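The magic-number check and the entry-point lookup can be tried from user space on Linux with the definitions in <elf.h>. This is only a sketch of those two steps, not the xv6 loader: the function name elf64_entry() is invented, and it assumes a 64-bit ELF file.

```c
#include <assert.h>
#include <elf.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Reads the ELF header of `path` and stores its entry point in *entry.
 * Returns 0 on success, -1 if the file is not a readable 64-bit ELF file. */
int elf64_entry(const char *path, uint64_t *entry)
{
    assert(path != NULL && entry != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    Elf64_Ehdr hdr;
    ssize_t n = read(fd, &hdr, sizeof hdr);
    close(fd);
    if (n != (ssize_t)sizeof hdr)
        return -1;                              /* too short to be ELF */

    /* The magic number is the first four bytes: 0x7f 'E' 'L' 'F'. */
    if (memcmp(hdr.e_ident, ELFMAG, SELFMAG) != 0)
        return -1;
    if (hdr.e_ident[EI_CLASS] != ELFCLASS64)
        return -1;

    *entry = hdr.e_entry;                       /* first instruction to run */
    return 0;
}
```

For a position-independent executable the stored entry point is an offset that the loader relocates, so the value printed for such a file is not the final run-time address.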

After a successful call to exec() from user space, the original execution flow of the process is destroyed and replaced by the fresh process context set up in exec().

Open file descriptors stay open across exec() by default, but a program can change this for each descriptor.

See "man 2 open" and search for O_CLOEXEC.

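One common use of the close-on-exec flag is to let a parent find out whether the child's exec() worked. The sketch below assumes a POSIX system; spawn_checked() is an invented name. It sets FD_CLOEXEC with fcntl(), which has the same effect on an existing descriptor as opening it with O_CLOEXEC.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs argv[0] in a child. Returns 0 if exec() succeeded, otherwise the
 * errno value of the failing call (exec(), pipe(), fcntl(), or fork()). */
int spawn_checked(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    int fds[2];
    if (pipe(fds) < 0)
        return errno;

    /* Close-on-exec: a successful exec() closes the write end for us. */
    if (fcntl(fds[1], F_SETFD, FD_CLOEXEC) < 0) {
        int err = errno;
        close(fds[0]);
        close(fds[1]);
        return err;
    }

    pid_t pid = fork();
    if (pid < 0) {
        int err = errno;
        close(fds[0]);
        close(fds[1]);
        return err;
    }
    if (pid == 0) {
        close(fds[0]);
        execvp(argv[0], argv);
        int err = errno;            /* exec failed: tell the parent why */
        if (write(fds[1], &err, sizeof err) < 0) {
            /* nothing more the child can do */
        }
        _exit(127);
    }

    close(fds[1]);
    int err = 0;
    /* End-of-file with no data means the descriptor was closed by exec(). */
    ssize_t n = read(fds[0], &err, sizeof err);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return n == (ssize_t)sizeof err ? err : 0;
}
```

Without the flag, the new program would inherit the write end and the parent's read() would block until that program exits.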
fork() in the real world: Copy-On-Write

Clearly, if we are about to call exec() to run a new program, fork() has done a lot of useless work: everything it copied is erased by exec().

In real systems, fork() uses a technique called "copy-on-write" (COW) to minimize this overhead, so there is no need to add a new system call such as "CreateProcess".

The core idea: the two processes share the memory pages until one of them changes something.

Upon a write, the writing process gets a private copy of the modified memory (one page at a time).

After fork(), if exec() is called immediately, almost nothing is actually duplicated by the fork().

Some systems go further and share page-table pages copy-on-write as well. In that case the real cost of a fork() that is immediately followed by exec() is about duplicating only a few pages -- a new stack and the corresponding page table nodes for that stack.
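The sharing rule can be modeled in user space with reference counts. This is a toy model, not kernel code: all names (struct page, struct addr_space, page_alloc, as_fork, as_write) are invented, and an explicit check in as_write() stands in for the page fault that a real kernel takes when a process writes to a page mapped read-only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE 4096
#define NPAGES    4

struct page {
    int  refcount;               /* how many address spaces map this page */
    char data[PAGE_SIZE];
};

struct addr_space {
    struct page *pages[NPAGES];  /* NULL means "not mapped" */
};

/* Allocates a zero-filled page owned by one address space. */
struct page *page_alloc(void)
{
    struct page *pg = calloc(1, sizeof *pg);
    if (pg != NULL)
        pg->refcount = 1;
    return pg;
}

/* "fork": the child maps the very same pages; no page data is copied. */
void as_fork(const struct addr_space *parent, struct addr_space *child)
{
    for (int i = 0; i < NPAGES; i++) {
        child->pages[i] = parent->pages[i];
        if (child->pages[i] != NULL)
            child->pages[i]->refcount++;
    }
}

/* Writes one byte. If the page is still shared, it is copied first. */
int as_write(struct addr_space *as, int pageno, int offset, char value)
{
    assert(pageno >= 0 && pageno < NPAGES);
    assert(offset >= 0 && offset < PAGE_SIZE);

    struct page *pg = as->pages[pageno];
    if (pg == NULL)
        return -1;                       /* unmapped page */
    if (pg->refcount > 1) {
        struct page *copy = page_alloc();
        if (copy == NULL)
            return -1;
        memcpy(copy->data, pg->data, PAGE_SIZE);
        pg->refcount--;                  /* the other side keeps the original */
        as->pages[pageno] = copy;
        pg = copy;
    }
    pg->data[offset] = value;
    return 0;
}
```

In this model as_fork() costs one pointer copy per page, and only the pages that are actually written after the fork are ever duplicated.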

CreateProcess() in Windows OSes

Windows has CreateProcess() -- why does it need so many parameters?

With fork() and exec(), a lot of setup can be done on demand between the two calls.

With CreateProcess(), the parent gets no chance to run its own code inside the child before the new program starts, so it has to specify everything through the parameters. The side effect is that CreateProcess() needs to check all of those parameters.
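A typical example of work done between the two calls is output redirection, which needs no special parameter on *nix: the child simply rearranges its own file descriptors before calling exec(). The helper name run_with_stdout_to() below is invented for this sketch.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs argv with its standard output redirected to the file `path`.
 * Returns the command's exit status, or -1 on error. */
int run_with_stdout_to(const char *path, char *const argv[])
{
    assert(path != NULL && argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* This runs in the child, before the new program exists. */
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0)
            _exit(126);
        if (fd != STDOUT_FILENO) {
            if (dup2(fd, STDOUT_FILENO) < 0)
                _exit(126);
            close(fd);
        }
        execvp(argv[0], argv);      /* the new program inherits fd 1 */
        _exit(127);                 /* reached only if exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The program being run needs no knowledge of the redirection; it just writes to descriptor 1 as usual. The same slot between fork() and exec() can be used to change the working directory, drop privileges, or close descriptors.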

Threads (POSIX threads)

In Linux, fork() internally calls the clone() system call. The flags passed to clone() specify which resources are shared with the new task and which are duplicated.

POSIX threads, or pthreads, is the standard threading mechanism on Linux.

pthread_create() calls clone() but it does not duplicate the page table. Both "execution flows" share the same VM (and a few other things), which is the behavior of threads.

Read glibc's thread creation code and the man page for details about thread creation.

pthread's use of clone() (note that a sharing flag such as CLONE_VM, CLONE_FS, CLONE_FILES, or CLONE_SIGHAND means the corresponding resource is shared by the two "processes"; don't try too hard to define what a "process" is here):

const int clone_flags = (CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SYSVSEM
                         | CLONE_SIGHAND | CLONE_THREAD
                         | CLONE_SETTLS | CLONE_PARENT_SETTID
                         | CLONE_CHILD_CLEARTID);

https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/createthread.c;hb=HEAD

fork()'s use of clone():

const int flags = CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID | SIGCHLD;

https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/arch-fork.h;hb=HEAD
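The difference that a single sharing flag makes can be tried directly with the glibc clone() wrapper. The following is a rough sketch that assumes Linux with glibc on an architecture whose stack grows downward (such as x86-64); clone_and_read_back(), child_main(), and CHILD_STACK_SIZE are names invented for the example.

```c
#define _GNU_SOURCE                     /* needed for clone() in <sched.h> */
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (64 * 1024)    /* arbitrary size for this sketch */

/* Runs in the new task: store a marker through the pointer we were given. */
static int child_main(void *arg)
{
    *(int *)arg = 42;
    return 0;
}

/* Creates a task with clone(share_flags | SIGCHLD), waits for it, and leaves
 * in *result the value the parent can see afterwards. Returns 0 on success. */
int clone_and_read_back(int share_flags, int *result)
{
    assert(result != NULL);
    *result = 0;

    char *stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* The stack grows downward here, so pass its highest address. */
    pid_t pid = clone(child_main, stack + CHILD_STACK_SIZE,
                      share_flags | SIGCHLD, result);
    if (pid < 0) {
        free(stack);
        return -1;
    }

    int rc = (waitpid(pid, NULL, 0) == pid) ? 0 : -1;
    free(stack);
    return rc;
}
```

With CLONE_VM the new task writes into the parent's memory, so the parent reads 42, which is thread-like behavior. With no sharing flags the task gets its own copy of the address space, as after fork(), and the parent still reads 0.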