This post was NOT originally meant to be about hacking Bash. But it does show that hacking Bash is not that hard, by adding a useful feature to Bash itself: CPU affinity support. Have fun and K.R.K.C.
One of my friends is trying to parallelize a program by chopping an 8G input file into eight 1G files and running the program as 8 independent processes. In principle, this should be faster than a single process handling the 8G file sequentially. However, one big concern is context switching and memory contention, which could eat into the speedup that parallelism is supposed to deliver (a speedup that Amdahl’s law already caps). Fortunately, the machine my friend uses has 8 cores… So it would be better if we could bind each process to a particular core, which could reduce the number of context switches. Can we do it without changing the program? Yes, hack the Shell!
Since Linux kernel 2.6, the GNU C library has provided an extension API for restricting a program's execution to certain CPUs, which is called CPU affinity. For application developers, a call to sched_setaffinity() within the program binds it to certain CPUs/cores. But what if we do not have the source code to play with?
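To make the API concrete, here is a minimal sketch of a process binding itself to one CPU. It uses Python's os.sched_setaffinity() and os.sched_getaffinity(), which wrap the same Linux call as the C function named above; the helper name pin_self_to and the choice of the first allowed CPU are assumptions made purely for illustration.

```python
import os

def pin_self_to(cpu):
    """Restrict the calling process to a single CPU and return the new mask."""
    # pid 0 means "the calling process", just as in the C API.
    os.sched_setaffinity(0, {cpu})
    return os.sched_getaffinity(0)

# CPUs this process is currently allowed to run on.
allowed = sorted(os.sched_getaffinity(0))
# Hypothetical choice for the demo: the first allowed CPU.
target = allowed[0]

print("allowed before pinning:", allowed)
print("allowed after pinning: ", sorted(pin_self_to(target)))
```

This is exactly the kind of change that requires touching the program itself, which is the problem the rest of the post works around.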
The answer is an example of K.I.S.S.: we hack the Shell! Here is why this works. First, user applications started from the command line are forked by a Shell, which means they are child processes of the invoking Shell. Second, because of this, the children inherit not only signal dispositions and file descriptors but also the CPU affinity: the affinity mask is inherited across fork() and preserved across execve(). For example, if the invoking Shell has its CPU affinity set to core1, then any program forked by this Shell will also run only on core1, without being moved to other cores.
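The inheritance claim is easy to check. In the sketch below, the parent process plays the role of the Shell: it pins itself to one CPU and then launches an unmodified child program, which simply reports the CPUs it is allowed to run on. The names REPORT and affinity_seen_by_child are hypothetical, and the child is a second Python interpreter only because that is convenient here.

```python
import os
import subprocess
import sys

# Tiny child program: print the CPUs it is allowed to run on.
REPORT = "import os; print(','.join(map(str, sorted(os.sched_getaffinity(0)))))"

def affinity_seen_by_child(cpu):
    """Pin this process to `cpu`, then ask a freshly launched child for its mask."""
    # The parent stands in for the Shell: it changes only its own affinity.
    os.sched_setaffinity(0, {cpu})
    # subprocess.run creates a new process and executes the child program in it;
    # the child never calls sched_setaffinity itself.
    result = subprocess.run([sys.executable, "-c", REPORT],
                            capture_output=True, text=True, check=True)
    return {int(c) for c in result.stdout.strip().split(",")}

# Hypothetical choice for the demo: the lowest-numbered allowed CPU.
cpu = min(os.sched_getaffinity(0))
print("child may run on:", affinity_seen_by_child(cpu))
```

If the inheritance works as described, the child should report only the CPU the parent pinned itself to.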
Once the background is clear, the hack is trivial (OK, I admit that it took 2 hours of debugging…). Essentially, all we need to do is call sched_setaffinity() within the Shell before it starts executing commands (fork() and execve()). I have hacked Bash to add this new feature and the code is down below.
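To be clear, what follows is not the Bash patch. It is a minimal Python sketch of the same sequence, set affinity first, then fork(), then exec, written as a toy launcher. The function name run_pinned and the exit code 127 for a failed exec are assumptions borrowed from common shell convention, not from the actual hack.

```python
import os
import sys

def run_pinned(cpu, argv):
    """Pin this process to `cpu`, then fork and exec `argv`; return its exit code."""
    # 1. Set the affinity in the launcher itself (the "Shell").
    os.sched_setaffinity(0, {cpu})
    # 2. fork(): the child starts with a copy of the launcher's affinity mask.
    pid = os.fork()
    if pid == 0:
        try:
            # 3. exec: replace the child with the target program.
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec failed; leave without running parent code.
            os._exit(127)
    # Parent: wait for the child and translate its wait status.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)

# Hypothetical usage: run a trivial command on the lowest-numbered allowed CPU.
cpu = min(os.sched_getaffinity(0))
code = run_pinned(cpu, [sys.executable, "-c", "print('hello from a pinned child')"])
print("child exit code:", code)
```

Because the affinity is changed in the launcher rather than in the child, every later command started from the same launcher would be pinned the same way, which is the behaviour the Shell hack relies on.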
Remember, parallelism does not always improve performance as much as you would expect. The ideal setup is one process per core, without extra context switches or memory contention. Also, please note that the CPU affinity API is Linux ONLY!
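For the one-process-per-core layout, a rough sketch (Linux only, like the API itself) might look like the following. It differs from the Shell hack in one respect: each child is pinned individually between fork and exec, using the preexec_fn hook of subprocess.Popen, so that different workers can land on different cores. The function name launch_one_per_core and the dummy worker command are hypothetical.

```python
import os
import subprocess
import sys

def launch_one_per_core(commands):
    """Start each command pinned to its own core; return (cpu, exit_code) pairs."""
    cpus = sorted(os.sched_getaffinity(0))   # cores this process may use
    running = []
    for index, argv in enumerate(commands):
        cpu = cpus[index % len(cpus)]        # wraps around if commands > cores
        # preexec_fn runs in the child after fork() and before exec(), so only
        # the child is pinned; the launcher keeps its original mask.
        proc = subprocess.Popen(
            argv, preexec_fn=lambda c=cpu: os.sched_setaffinity(0, {c}))
        running.append((cpu, proc))
    # Wait for every worker and collect its exit code.
    return [(cpu, proc.wait()) for cpu, proc in running]

# Hypothetical workload: each worker just reports its pid and allowed CPUs.
worker = [sys.executable, "-c",
          "import os; print(os.getpid(), sorted(os.sched_getaffinity(0)))"]
results = launch_one_per_core([worker] * 4)
print(results)
```

With 8 chunk files and 8 cores, the command list would hold one entry per chunk. If there are more commands than cores, the sketch simply reuses cores, which brings back the contention described above.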
GNU C library on CPU affinity – http://www.gnu.org/software/libc/manual/html_node/CPU-Affinity.html
The Linux Programming Interface