There are three steps to creating a child process in UNIX:
fork returns a value to the parent process and, if the child was successfully created, to the child as well. The child receives 0, while the parent receives the child's PID. If the child could not be created, the parent receives -1 instead, usually because system resources were insufficient or the user had reached the per-user process limit.
Besides its own unique PID, every process records the ID of its parent, the PPID. However, there is no easy way to kill all the child and grandchild processes of a given process.
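The two return values can be observed with a minimal sketch such as the following, written in Python, whose os.fork wraps the fork system call. The function name fork_and_report and the exit status 7 are hypothetical, chosen only for illustration. Note one difference from C: where fork would return -1, Python raises OSError.

```python
import os

def fork_and_report():
    """Fork once and return (child_pid_seen_by_parent, child_exit_status)."""
    try:
        pid = os.fork()      # in C a failure is -1; Python raises OSError instead
    except OSError as err:
        print(f"fork failed: {err}")
        return None
    if pid == 0:
        # Child branch: fork returned 0 here.
        os._exit(7)          # leave at once with an arbitrary status
    # Parent branch: fork returned the child's PID here.
    _, status = os.waitpid(pid, 0)
    return pid, os.WEXITSTATUS(status)
```

Calling fork_and_report() in the parent yields a positive PID together with the status the child passed to os._exit.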
umbc9[8]# ps -l
 F S   UID   PID  PPID  C PRI NI P  SZ:RSS    WCHAN TTY   TIME COMD
30 S     0 11660   145  1  26 20 *   66:20 88249f10 ttyq6 0:00 rlogind
30 S 14066 11662 11661  0  26 36 *  129:43 88249f10 ttyq6 0:00 zwgc
30 S     0 11681 11663  0  39 36 *   85:27 88246890 ttyq6 0:00 csh
30 S 14066 11661 11660  0  29 36 *   86:33 8815012c ttyq6 0:00 login.kr
30 S 14066 11663 11661  0  39 36 *   86:27 88246890 ttyq6 0:00 csh
30 R     0 12539 11681 46  98 36 0 207:171          ttyq6 0:01 ps
Figure 1: Process Listing
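Because every process records its PPID, the descendants of a process can at least be reconstructed from a listing like Figure 1. The sketch below is one possible approach, not a standard tool: the helper name descendants is hypothetical, and the (PID, PPID) pairs are copied from the figure.

```python
from collections import defaultdict

# (PID, PPID) pairs copied from the listing in Figure 1.
FIGURE_1 = [(11660, 145), (11662, 11661), (11681, 11663),
            (11661, 11660), (11663, 11661), (12539, 11681)]

def descendants(root, pairs):
    """Return the sorted PIDs of all children, grandchildren, etc. of root."""
    children = defaultdict(list)
    for pid, ppid in pairs:
        children[ppid].append(pid)      # index each process under its parent
    found, stack = [], [root]
    while stack:
        for child in children[stack.pop()]:
            found.append(child)
            stack.append(child)         # visit this child's own children later
    return sorted(found)
```

For the listing above, descendants(11660, FIGURE_1) gives every process started under rlogind. A snapshot like this can be out of date by the time it is used, which is part of why killing a whole process tree is awkward.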
A child process receives copies of the parent's file descriptors, but it does not inherit the parent's file locks, pending alarms or pending signals. Its record of elapsed processor time starts from zero.
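The inherited descriptors can be illustrated with a short sketch, assuming a UNIX system where os.fork is available. The function name and the message text are hypothetical. The child writes through its copy of a descriptor that the parent opened before forking, and the parent then reads the data back.

```python
import os
import tempfile

def child_writes_through_inherited_fd():
    """Return the bytes a child wrote using a descriptor opened by the parent."""
    with tempfile.TemporaryFile() as tmp:
        fd = tmp.fileno()               # opened by the parent before the fork
        pid = os.fork()
        if pid == 0:
            # Child: fd is a copy of the parent's descriptor for the same open file.
            os.write(fd, b"written by child")
            os._exit(0)
        os.waitpid(pid, 0)              # wait until the child has finished writing
        os.lseek(fd, 0, os.SEEK_SET)    # rewind, since both copies share one file offset
        return os.read(fd, 100)
```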
In traditional UNIX implementations, process 0 is the swapper (the scheduler), which moves processes in and out of main memory. Process 1 is init, which starts the daemons that provide system services when the system enters multiuser mode. All user processes are forked from process 1 or can be traced back to it. Process 2 is the page daemon, which supports paging in virtual memory. Process 3 is the system cache flusher.
Once a child process has been created, communication with the parent process or other children becomes an issue. Separate processes can communicate through shared memory or message passing. Shared memory is more efficient.
Problems: Sharing of memory must be carefully coordinated, since the programmer cannot know in what order the communicating processes receive timeslices (CPU time). On the other hand, message passing requires more resources for copying the message and switching between user and kernel modes to make the "send" and "receive" system calls.
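Both mechanisms can be compared side by side in a small sketch. It assumes a UNIX system, where an anonymous mmap created before fork is shared between parent and child. The function name share_and_send is hypothetical. The pipe stands in for message passing, and the waitpid call is the coordination step that makes reading the shared memory safe.

```python
import mmap
import os
import struct

def share_and_send(value):
    """Pass one integer from child to parent via shared memory and via a pipe."""
    shared = mmap.mmap(-1, 8)           # anonymous mapping, shared across fork on UNIX
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        os.close(read_end)
        shared[:8] = struct.pack("q", value)          # shared memory: a plain store
        os.write(write_end, struct.pack("q", value))  # message passing: a system call
        os._exit(0)
    os.close(write_end)
    message = os.read(read_end, 8)      # the "receive" call; blocks until data arrives
    os.waitpid(pid, 0)                  # coordinate: read memory only after the child is done
    from_memory = struct.unpack("q", shared[:8])[0]
    from_pipe = struct.unpack("q", message)[0]
    os.close(read_end)
    shared.close()
    return from_memory, from_pipe
```

Without the waitpid call (or some other synchronization), the parent could read the shared region before the child had been scheduled to write it.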
The UNIX operating system handles multitasking (a.k.a. timeslicing) by forking child processes which can communicate with the parent in various ways. The scheduler (swapper) is responsible for apportioning timeslices.
Problem: Swapping entire processes in and out can be time-consuming; there are several approaches to reducing the overhead of swapping.
One early approach used a system call named vfork, which suspends the parent process while the child runs. Rather than creating a new address space for the child, vfork lets the child use the parent's address space.
Problem: The child process must therefore be very careful about what it does to the parent's variables and other state before it returns the address space to the parent, which happens when the child exits or calls exec. Since the efficiency of fork has been improved, vfork should no longer be used and is often not even implemented on UNIX systems.
Another solution to the problem of process overhead is to implement threads in UNIX. There have been various proprietary implementations, such as Solaris threads, but these efforts led to non-portable code. The IEEE (Institute of Electrical and Electronics Engineers) made POSIX (Portable Operating System Interface) threads a part of the POSIX standards. POSIX threads are implemented through the pthreads library.
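The basic create, lock and join pattern of a threads library can be sketched as follows. The example uses Python's threading module, which in CPython on POSIX platforms is built on top of pthreads. The function name and the counts are hypothetical, and the comments name the pthreads calls that each step roughly corresponds to.

```python
import threading

def count_with_threads(num_threads=4, increments=1000):
    """Have several threads increment one shared counter under a lock."""
    total = 0
    lock = threading.Lock()             # plays the role of a pthread mutex

    def worker():
        nonlocal total
        for _ in range(increments):
            with lock:                  # roughly pthread_mutex_lock / unlock
                total += 1              # all threads share the same address space

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()                       # roughly pthread_create
    for t in threads:
        t.join()                        # roughly pthread_join
    return total
```

Unlike the forked processes shown earlier in the document's discussion, these threads need no shared-memory setup, because they already live in one process.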
Problem: Again, the terminology used in UNIX differs from that commonly used to describe NT threads. A UNIX lightweight process (LWP) corresponds to an NT thread, while the word 'thread' is sometimes used to indicate a user-space implementation of threads.
| UNIX | process | process | LWP | thread |
| NT | process | process & primary thread | thread | fiber |
Figure 2: Process and thread correspondence between UNIX and NT
Recall that UNIX processes and NT threads have a kernel mode and a user mode. In UNIX, POSIX threads can be implemented at the user level, at the kernel level, or both. Each approach has its advantages and disadvantages.
Figure 3: User-space threads
The user-level approach to threads usually doesn't require any changes in the underlying OS. It is similar to implementing threads through Java or Perl. The POSIX threads package treats the processes it uses as so many virtual processors. It distributes its virtual threads among them, requiring no help from the kernel. The underlying OS executes these processes in the normal fashion. This can be a very efficient approach.
Problem: The virtual processors are also real, physical processes. The OS's real-time processing (I/O, paging, etc.) can disrupt the performance of the virtual threads. The virtual threads must compete for the timeslices of the real process which houses them, and are restricted by the priority level of that genuine process. Also, user-space threads cannot take advantage of multiple processors, since the process that houses them runs on only one CPU at a time.
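The idea of scheduling threads entirely in user space can be illustrated with a toy round-robin scheduler. This is only an analogy under stated assumptions, not how any particular POSIX threads package works: Python generators stand in for user threads, each yield is a voluntary switch, and the names user_thread and run_round_robin are hypothetical.

```python
from collections import deque

def user_thread(name, steps, log):
    """A toy user-space thread that records each step and then yields."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield                           # hand control back to the scheduler

def run_round_robin(threads):
    """Run the toy threads in turn inside one process, with no kernel help."""
    ready = deque(threads)
    while ready:
        current = ready.popleft()
        try:
            next(current)               # run until the thread's next yield
            ready.append(current)       # still alive: back of the queue
        except StopIteration:
            pass                        # thread finished: drop it
```

Everything here runs inside a single process, so the kernel sees only one schedulable entity. That is why such threads share one timeslice and cannot spread across CPUs.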
Figure 4: Kernel-space threads
With kernel-space threads, the POSIX library creates a new thread in the kernel corresponding to each user thread. The kernel threads compete for CPU time in the usual way, regulated by the OS's scheduler.
Problem: These kernel threads require more resources, though still not as many as an entire process would. The overhead of creating kernel threads can push a system to its scalability limit.
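The one-to-one mapping can be observed from a program, at least in implementations that use it. The sketch below assumes CPython 3.8 or later on a platform where threading.get_native_id is available. The function name collect_native_ids is hypothetical. Each thread reports the ID that the kernel assigned to it.

```python
import threading

def collect_native_ids(num_threads=3):
    """Return the kernel-assigned thread IDs of several concurrently live threads."""
    ids = []
    lock = threading.Lock()
    barrier = threading.Barrier(num_threads)

    def worker():
        with lock:
            ids.append(threading.get_native_id())  # ID assigned by the kernel
        barrier.wait()                  # stay alive until every thread has reported

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return ids
```

The barrier keeps all threads alive at the same moment, so the kernel cannot reuse an ID and the reported values are distinct.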
Figure 5: Combination of user-space and kernel-space approaches
The best way to implement POSIX threads is through a combination of the user-space and kernel-space approaches. Under this method, a pool of kernel threads is created, and user threads are assigned a kernel thread as needed. The scheduling problem is handled differently under different implementations, and the methods used can affect efficiency.
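A loose analogy for the pooled approach is a fixed set of worker threads serving a larger number of tasks. The sketch below uses Python's ThreadPoolExecutor and is not an implementation of POSIX threads: the tasks merely stand in for user threads, and the name run_on_pool and the sizes are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def run_on_pool(num_tasks=20, pool_size=3):
    """Run many small tasks on a small, fixed pool of worker threads."""
    def task(n):
        return n, threading.get_ident()     # note which worker ran this task

    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        results = list(pool.map(task, range(num_tasks)))
    workers = {ident for _, ident in results}
    return [n for n, _ in results], workers
```

However many tasks are submitted, the set of workers never grows beyond pool_size, which is the property that limits kernel-side overhead in the combined model.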