This section describes the structure of a process and some of the process data structures used for memory management.
A process is the execution of a program and consists of a pattern of bytes that the CPU interprets as machine instructions (called "text"), data and stack. Many processes appear to execute simultaneously as the kernel schedules them for execution, and several processes may be instances of one program. A process executes by following a strict sequence of instructions that is self-contained and does not jump to that of another process; it reads and writes its own data and stack sections, but it cannot read or write the data and stack of other processes. Processes communicate with other processes and with the rest of the world via system calls.
In practical terms, a process on a UNIX system is the entity that is created by the fork system call. Every process except process 0 is created when another process executes the fork system call. The process that invoked fork is called the parent process, and the newly created process is the child process. Every process has one parent process, but a process can have many child processes. The kernel identifies each process by its process number, called the process ID (PID). Process 0 is a special process that is created "by hand" when the system boots; after forking a child process (process 1), process 0 becomes the swapper process. Process 1, known as init, is the ancestor of every other process in the system and enjoys a special relationship with them.
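A user compiles the source code of a program to create an executable file, which consists of several parts, including the program text, the initialized data and the uninitialized data (bss). Consider the following program, which copies one file to another: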
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

char buffer[2048];
int version = 1;

void copy(int old, int new);

int main(int argc, char *argv[])
{
    int fdold, fdnew;

    if (argc != 3)
    {
        printf("need 2 arguments for copy program\n");
        exit(1);
    }
    fdold = open(argv[1], O_RDONLY); /* open source file read only */
    if (fdold == -1)
    {
        printf("cannot open file %s\n", argv[1]);
        exit(1);
    }
    fdnew = creat(argv[2], 0666); /* create target file rw for all */
    if (fdnew == -1)
    {
        printf("cannot create file %s\n", argv[2]);
        exit(1);
    }
    copy(fdold, fdnew);
    exit(0);
}

/* copy everything readable from old to new, one buffer at a time */
void copy(int old, int new)
{
    int count;

    while ((count = read(old, buffer, sizeof(buffer))) > 0)
        write(new, buffer, count);
}
In the above program, the text of the executable file is the generated code for the functions main and copy, the initialized data is the variable version (put into the program only so that it has some initialized data), and the uninitialized data is the array buffer. System V versions of the C compiler create a separate text section by default but support an option that allows program instructions to be included in the data section, as was done in older versions of the system.
The kernel loads an executable file into memory during an exec system call, and the loaded process consists of at least three parts, called regions: text, data and stack. The text and data regions correspond to the text and data/bss sections of the executable file, but the stack region is created automatically and its size is adjusted dynamically by the kernel at run time. The stack consists of logical stack frames that are pushed when calling a function and popped when returning; a special register called the stack pointer indicates the current stack depth. A stack frame consists of the parameters to the function, its local variables and the data necessary to recover the previous stack frame, including the value of the program counter and stack pointer at the time of the function call. The program code contains instruction sequences that manage stack growth, and the kernel allocates space for the stack as needed. In the copy program above, the parameters argc and argv and the variables fdold and fdnew in the function main appear on the stack when main is called (once in every program, by convention), and the parameters old and new and the variable count in the function copy appear on the stack whenever copy is called.
         User Stack           Direction of stack growth: up        Kernel Stack
+---------------------------+                              +---------------------------+
| Local vars not shown      |                              |                           |
+---------------------------+                              |                           |
| Addr of Frame 2           |                              |                           |
+---------------------------+                              |                           |
| Ret addr after write call |                              |                           |
+---------------------------+                              |                           |
| parms to write   new      | Frame 3              Frame 3 |                           |
|                  buffer   | call write()                 |                           |
|                  count    |                              |                           |
+---------------------------+                              +---------------------------+
| Local vars       count    |                              | Local vars                |
+---------------------------+                              +---------------------------+
| Addr of Frame 1           |                              | Addr of Frame 1           |
+---------------------------+                              +---------------------------+
| Ret addr after copy call  |                              | Ret addr after func2 call |
+---------------------------+                              +---------------------------+
| parms to copy    old      | Frame 2              Frame 2 | parms to kernel func2     |
|                  new      | call copy()     call func2() |                           |
+---------------------------+                              +---------------------------+
| Local vars       fdold    |                              | Local vars                |
|                  fdnew    |                              |                           |
+---------------------------+                              +---------------------------+
| Addr of Frame 0           |                              | Addr of Frame 0           |
+---------------------------+                              +---------------------------+
| Ret addr after main call  |                              | Ret addr after func1 call |
+---------------------------+                              +---------------------------+
| parms to main    argc     | Frame 1              Frame 1 | parms to kernel func1     |
|                  argv     | call main()     call func1() |                           |
+---------------------------+                              +---------------------------+
 Frame 0: Start                                             Frame 0: System Call Interface
Because a process in the UNIX system can execute in two modes, kernel or user, it uses a separate stack for each mode. The user stack contains the arguments, local variables, and other data for functions executing in user mode. The left side of the above figure shows the user stack for a process when it makes the write system call in the copy program. The process startup procedure (included in a library) had called the function main with two parameters, pushing frame 1 onto the user stack; frame 1 contains space for the two local variables of main. Main then called copy with two parameters, old and new, and pushed frame 2 onto the user stack; frame 2 contains space for the local variable count. Finally, the process invoked the system call write by calling the library function write. Each system call has an entry point in a system call library; the system call library is encoded in assembly language and contains special trap instructions, which, when executed, cause an "interrupt" that results in a hardware switch to kernel mode. A process calls the library entry point for a particular system call just as it calls any function, creating a stack frame for the library function. When the process executes the special instruction, it switches mode to kernel, executes kernel code, and uses the kernel stack.
The kernel stack contains the stack frames for functions executing in kernel mode. The function and data entries on the kernel stack refer to functions and data in the kernel, not in the user program, but the kernel stack is constructed the same way as the user stack. The kernel stack of a process is empty when the process executes in user mode. The right side of the above figure depicts the kernel stack for a process executing the write system call in the copy program.
                    per process
                    region table        region table
+--------+          +----------+        +----------+
| u area |          |          |        |          |
+--------+          +----------+        +----------+
    ^               |        o-+------->|        o-+-----+
    |               +----------+        +----------+     |
+---+----+          |        o-+---+    |          |     |
|   o    |--------->+----------+   |    +----------+     |
+--------+          |          |   +--->|        o-+--+  |
 process            +----------+        +----------+  |  |
 table                                                |  |
          +-------------------------------------+     |  |
          |             main memory             |<----+  |
          |                                     |<-------+
          +-------------------------------------+
Each process has an entry in the kernel process table, and each process is allocated a u area that contains private data manipulated only by the kernel. (The u in u area stands for user. Another name for the u area is u block; this paper always refers to it as the u area.) The process table contains (or points to) a per process region table, whose entries point to entries in a region table. A region is a contiguous area of a process's address space, such as text, data or stack. Region table entries describe the attributes of the region, such as whether it contains text or data, whether it is shared or private, and where the "data" of the region is located in memory.
The extra level of indirection (from the per process region table to the region table) allows independent processes to share regions. When a process invokes the exec system call, the kernel allocates regions for its text, data and stack after freeing the old regions the process had been using. When a process issues a fork system call, the kernel duplicates the address space of the old process, allowing processes to share regions when possible and making a physical copy otherwise. When a process issues an exit system call, the kernel frees the regions the process had used. The above figure shows the relevant data structures of a running process. The process table points to a per process region table with pointers to the region table entries for the text, data and stack regions of the process.