Child Processes and Threads

I. Child Processes in UNIX

Creation of a Child Process

There are three steps to creating a child process in UNIX:

The fork() system call spawns a new process. The child process continues to run the parent process's program unless an exec function is used to run a different program. A wait function can be used to make the parent process wait for the child to finish.

Fork returns a value to both the parent process and the child. If the child was successfully created, the child receives a value of 0 and the parent receives the child's PID. If the child was not created, the parent receives a value of -1, usually because there were not enough system resources or the user reached the per-user process limit.

Identification of Child Processes

Besides its own unique PID, every process records its parent's ID, the PPID. However, there is no easy way to kill all the child and grandchild processes of a given process.

umbc9[8]# ps -l
 F S  UID   PID  PPID  C PRI NI P   SZ:RSS     WCHAN TTY       TIME COMD
30 S    0 11660   145  1  26 20 *   66:20   88249f10 ttyq6     0:00 rlogind
30 S14066 11662 11661  0  26 36 *  129:43   88249f10 ttyq6     0:00 zwgc
30 S    0 11681 11663  0  39 36 *   85:27   88246890 ttyq6     0:00 csh
30 S14066 11661 11660  0  29 36 *   86:33   8815012c ttyq6     0:00 login.kr
30 S14066 11663 11661  0  39 36 *   86:27   88246890 ttyq6     0:00 csh
30 R    0 12539 11681 46  98 36 0  207:171           ttyq6     0:01 ps

Figure 1: Process Listing

A child process has copies of the parent's file descriptors, but not its file locks, pending alarms or pending signals. It has its own records of elapsed processor time, beginning from zero.

In UNIX, process 0 is the swapper, which moves processes between main memory and swap space. Process 1 is init, which, when entering multiuser mode, starts the daemons that provide system services. All user processes are forked from process 1 or can be traced back to process 1. Process 2 is the page daemon, which reclaims memory pages for the virtual memory system. Process 3 is the system cache flusher.

Communication between Processes

Once a child process has been created, communication with the parent process or other children becomes an issue. Separate processes can communicate through shared memory or message passing. Shared memory is usually the more efficient of the two, because data does not have to be copied from one process to another.

Problems: Sharing of memory must be carefully coordinated, since the programmer cannot know in what order the communicating processes receive timeslices (CPU time). On the other hand, message passing requires more resources for copying the message and switching between user and kernel modes to make the "send" and "receive" system calls.

II. Threads in UNIX

Efficiency and Processes

The UNIX operating system handles multitasking (a.k.a. timeslicing) by forking child processes which can communicate with the parent in various ways. The scheduler (swapper) is responsible for apportioning timeslices.

Problem: Swapping entire processes in and out can be time-consuming; there are several approaches to reducing the overhead of swapping.

One early approach used a system call named vfork, which suspends the parent process when it creates the child process. Rather than creating a new address space for the child, the system allows the child to use the parent's address space.

Problem: The child process must therefore be very careful what it does to the parent's variables, etc., before returning the address space to the parent on exiting. Since the efficiency of fork has been improved, vfork should no longer be used and is often not even implemented in UNIX systems.

Implementations of Threads in UNIX

Another solution to the problem of process overhead is to implement threads in UNIX. There have been various proprietary implementations, such as Solaris threads, but these efforts led to non-portable code. The IEEE (Institute of Electrical and Electronics Engineers) made POSIX (Portable Operating System Interface) threads a part of the POSIX standards. POSIX threads are implemented through the pthreads library.

Problem: Again, the terminology used in UNIX differs from that commonly used to describe NT threads. A UNIX lightweight process (LWP) corresponds to an NT thread, while the word 'thread' is sometimes used to indicate a user-space implementation of threads.

UNIX  -        process                   LWP     thread
NT    process  process & primary thread  thread  fiber

Figure 2: Process and thread correspondence between UNIX and NT

Kernel and User Space Threads

Recall that UNIX processes and NT threads have a kernel mode and a user mode. In UNIX, POSIX threads can be implemented either at the user level or at the kernel level, or both. Each approach has its advantages and disadvantages.

Figure 3: User-space threads

The user-level approach to threads usually doesn't require any changes in the underlying OS. It is similar to implementing threads through Java or Perl. The POSIX threads package treats the processes it uses as so many virtual processors. It distributes its virtual threads among them, requiring no help from the kernel. The underlying OS executes these processes in the normal fashion. This can be a very efficient approach.

Problem: The virtual processors are also real, physical processes. The OS's real-time processing (I/O, paging, etc.) can disrupt the performance of the virtual threads. The virtual threads must compete for the timeslices of the real process which houses them, and are restricted by the priority level of that genuine process. Also, user-space threads cannot take advantage of multiple processors, since the process that houses them runs on only one CPU at a time.

Figure 4: Kernel-space threads

With kernel-space threads, the POSIX library creates a new thread in the kernel corresponding to each user thread. The kernel threads compete for CPU time in the usual way, regulated by the OS's scheduler.

Problem: These kernel threads require more resources, though still not as many as an entire process would. The overhead of creating kernel threads can push a system to its scalability limit.

Figure 5: Combination of user-space and kernel-space approaches

The best way to implement POSIX threads is through a combination of the user-space and kernel-space approaches. Under this method, a pool of kernel threads is created, and user threads are assigned a kernel thread as needed. The scheduling problem is handled differently under different implementations, and the methods used can affect efficiency.

References

The GNU C Library - Child Processes http://www.linuxpowered.com/archive/gnuman/glibc-manual-0.02/library_23.html

Unix Processes http://userpages.umbc.edu/~jack/ifsm498d/processes.html

Three POSIX Threads' Implementations by Bruce McCormick http://www.nswc.navy.mil/cosip/feb99/cots0299-1.shtml


mcdemarco@earthlink.net