parallel in Bash
Created: 2025-01-06
Running commands in parallel in Bash does not take much effort: I've been speeding up builds by running commands simultaneously with an added `&` ampersand:
# stuff can happen concurrently
# use `&` to run in a sub-shell
cmd1 &
cmd2 &
cmd3 &
# wait on sub-processes
wait
# these need to happen sequentially
cmd4
cmd5
echo Done!
Job control is a shell feature: each command is put into a background sub-shell, and they all run at the same time.
Now suppose you want to loop over more than a few commands, e.g. when converting files:
for file in *.jpg; do
# start optimizing every file at once
jpegoptim -m 45 "${file}" &
done
# finish queue
wait
Running a lot of processes this way is still faster than a regular loop. But compared to running just a few jobs concurrently there are no further speed gains – possibly even slowdowns from contended async disk I/O [citation needed].
So you'll want to limit the number of concurrent jobs, either by 1) installing extra tools like parallel or xjobs, or 2) relying on xargs, which is a feature-rich tool but more complicated.
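For option 1), a one-liner might look like this – a minimal sketch, assuming GNU parallel is installed and reusing our jpegoptim example:
# run at most 3 optimizations at once; `{}` is the file placeholder
parallel -j 3 jpegoptim -m 45 {} ::: *.jpg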
Transforming `wait` code to `xargs` is described here: an example for parallel batch jobs. The article notes small differences between POSIX flavours – e.g. different handling of separators on BSD/macOS.
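A comparable `xargs` sketch – hedged, since flag support varies between the POSIX flavours mentioned above, though `-0`, `-n` and `-P` exist in both GNU and BSD xargs:
# print NUL-delimited names (survives spaces in file names),
# pass 1 file per process, run at most 3 processes in parallel
printf '%s\0' *.jpg | xargs -0 -n 1 -P 3 jpegoptim -m 45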
Luckily, we'll be covering option 3): digging into the specs of `wait` and `jobs` to manage processes with the shell's own features.
Quoting this great summary, here are some example commands for job control:
# run child process, save process id via `$!`
cmd3 & pid=$!
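# wait on that specific process id
wait "$pid"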
# get job list
jobs
# get job ids only
# note: not available on zsh
jobs -p
# only wait on job at position `n`
# note: slots may turn up empty while
# newer jobs rest in the queue's tail
wait %n
# wait on last job in list
wait %%
# wait on next finishing process
# note: needs Bash 4.3
wait -n
Taking our example from before, we make sure to only start another job each time a process has finished, using `wait -n`:
for file in *.jpg; do
jpegoptim -m 45 "${file}" &
# still fewer than 3 job lines? continue the loop
if [[ $(jobs|wc -l) -lt 3 ]]; then continue; fi
# with 3 jobs running, wait for the next one to finish, then loop
wait -n
done
# finish queue
wait
Sadly, this won't work on macOS, because its stock Bash is frozen at an old version (3.2). We replace the `wait -n` command with `wait %%` to wait on the 3rd/last job in the queue – an OK compromise for small groups (a 1/3 chance of hitting the fastest/slowest/medium job):
for file in *.jpg; do
jpegoptim -m 45 "${file}" &
# still fewer than 3 job lines? continue the loop
if [[ $(jobs|wc -l) -lt 3 ]]; then continue; fi
# with 3 jobs running, wait for the last in line, then loop
wait %%
done
# finish queue
wait
To develop the code further, one could check the Bash version or detect alternative shells (zsh on macOS) and switch code paths depending on context – see the sketch after the timing commands below. I keep using these:
# sequential, slow
time ( for file in *.jpg; do jpegoptim -m 45 "${file}" ; done )
# concurrent, messy
time ( for file in *.jpg; do jpegoptim -m 45 "${file}" & done; wait )
# concurrent, fast/compatible
time ( for file in *.jpg; do jpegoptim -m 45 "${file}" & if [[ $(jobs|wc -l) -lt 3 ]]; then continue; fi; wait %%; done; wait )
# concurrent, fastest
time ( for file in *.jpg; do jpegoptim -m 45 "${file}" & if [[ $(jobs|wc -l) -lt 3 ]]; then continue; fi; wait -n; done; wait )
As the 20th-birthday post by parallel author Ole Tange explains, the original version leveraged make, because it allows asynchronous processes as well.
--
👉 Found an error or have a suggestion?
Contact me via mail below – thanks!
--
Cover image: Paris bibliothèques, via Clawmarks