A Forking Driver

Matthew Fulkerson
Mar 23, 2021

UNIX has had the fork() system call for many decades; it allows a parent process to spawn independent child processes. These days many people instead use multithreading within a single process, for example with OpenMP. Threads have the advantage that they all share the same memory (address space). Fork differs from multithreading in a major way: a child process forked from a parent process starts with a copy of the parent's variables in memory, but thereafter it proceeds independently, and its writes to memory cannot affect the parent process or other child processes.

Why would you want to do this? Well, before the advent of OpenMP, threading was arguably more difficult to program and made it easier to shoot yourself in the foot. But more importantly, not all programs are thread safe: separate threads of a program may try to modify the same memory at the same time. With fork this cannot happen, because each process works on its own copy of memory, so one can use fork for parallelism even with programs that are not thread safe.

So today I set out to parallelize a program that is not thread safe. But rather than diving straight into that program and figuring out how to use fork there, I first wrote a very simple program to verify the functionality. That C++ program is called fork_loop_example.cpp, and the code is below:

// for fork, getpid, wait and _exit
#include <sys/types.h>
#include <unistd.h>
#include <sys/wait.h>
// cout, cerr
#include <iostream>
// std vector
#include <vector>
using namespace std;

int main(int argc, char **argv)
{
    // get process id
    pid_t pid = getpid();
    cout << "parent pid from getpid is " << pid << endl;
    // iterations
    int numIterations = 13;
    std::vector<int> iteration (numIterations);
    // number of child processes to run concurrently
    int numChildren = 4;
    // loop over iterations
    int child_count = 0;
    for (int n=0; n<numIterations; n++) {
        child_count++;
        // call fork
        pid_t newpid;
        newpid = fork();
        if (newpid == 0) {
            // child process, do something
            cout << "I am a child process with pid = "
                 << getpid() << endl;
            // Now paste your main code here to do the real work.
            // Presumably some calculation, followed by saving
            // the data.
            // exit normally, so the child never continues the loop
            _exit(0);
        } else if (newpid == -1) {
            // parent process, error
            cerr << "child pid is -1, aborting." << endl;
            return -1;
        } else {
            // parent process, wait
            // compute number of active children
            int numActiveChildren = numChildren;
            if (n == numIterations-1) {
                // last iteration: wait for whatever is still running
                numActiveChildren = child_count;
            }
            // wait for children if child_count has reached
            // number of active children
            if (child_count == numActiveChildren) {
                for (int m=0; m<numActiveChildren; m++) {
                    int wait_return;
                    int wait_status;
                    wait_return = wait (&wait_status);
                    cout << "wait returned "
                         << wait_return << endl;
                    cout << "wait status is "
                         << wait_status << endl;
                }
                // reset child_count
                child_count = 0;
            }
        }
    } // end iteration loop
    return 0;
}

This program executes a for loop. In each iteration, fork is called so that a child process launches that actually does the work. (See the comment “Now paste your main code here to do the real work.”) The job of the parent process is to wait, using the UNIX system call wait(), for a batch of numChildren child processes to finish. After waiting, another numChildren processes are forked into existence. The process repeats until the for loop is done. The final batch may be smaller: with 13 iterations and 4 children, the batches are 4, 4, 4 and 1.

If you are a programmer using Linux or any other UNIX, and would like to try out using fork for parallel processing, feel free to adapt this program to your needs!
