Job Control & Traps
& background, wait, jobs, fg/bg, trap on EXIT/ERR/INT — making your scripts behave like good UNIX citizens.
Bash inherits UNIX's process model: every command is a process, processes can be backgrounded, signals can interrupt them, and the parent shell can wait on them. Master this small set of mechanisms and you can write scripts that handle Ctrl-C gracefully, parallelise work, and never leak resources.
Foreground vs background
Append & to run a command in the background:
sleep 30 & # runs in the background; you get the prompt back immediately
echo "spawned: $!" # $! is the PID of the most-recent background job
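In a script the job stays in your shell's process group (an interactive shell with job control gives each job its own), and you are not waiting for it. $! keeps that PID only until the next background command overwrites it, so capture it right away if you need it later.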
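A minimal sketch of capturing $! before it is overwritten. The variable names first_pid and second_pid are illustrative, not from any convention:

```shell
#!/usr/bin/env bash
sleep 0.2 &
first_pid=$!     # save it before starting anything else in the background
sleep 0.1 &
second_pid=$!    # $! now names the second job; without the copy the first PID is lost

# wait accepts several PIDs and blocks until all of them have finished.
wait "$first_pid" "$second_pid"
echo "waited for $first_pid and $second_pid"
```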
wait — synchronise
sleep 1 &
sleep 2 &
sleep 3 &
wait # blocks until ALL background jobs finish
echo "all done"
wait (no args) waits for everything. wait $pid waits for one specific PID and returns its exit code. wait -n (bash 4.3+) waits for the next job to finish (any of them).
The classic parallel-fan-out pattern:
pids=()
for url in "${urls[@]}"; do
  curl -fsS "$url" > "out-${url//[^a-z]/_}.txt" &
  pids+=("$!")
done
for pid in "${pids[@]}"; do
  wait "$pid" || echo "PID $pid failed" >&2
done
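The fan-out above starts every job at once. When the list is long you may want a cap on concurrency, which is where wait -n helps. The following is a sketch only, assuming bash 4.3 or newer; task, max_jobs, running and failures are hypothetical names, and task stands in for real work:

```shell
#!/usr/bin/env bash
max_jobs=2      # at most this many background tasks at a time
running=0
failures=0

task() {        # hypothetical stand-in for real work
  sleep 0.1
  [ "$1" -ne 3 ]   # task 3 fails on purpose
}

for i in 1 2 3 4 5; do
  if (( running >= max_jobs )); then
    # Block until any one job finishes; its exit status becomes wait's status.
    wait -n || failures=$((failures + 1))
    running=$((running - 1))
  fi
  task "$i" &
  running=$((running + 1))
done

# Drain the jobs that are still running.
while (( running > 0 )); do
  wait -n || failures=$((failures + 1))
  running=$((running - 1))
done

echo "failed tasks: $failures"
```

wait -n reports a failure but not which job failed. If you need that, keep the PID array from the fan-out pattern and wait on each PID at the end.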
jobs, fg, bg — interactive controls
In an interactive shell:
sleep 100 &
sleep 200 &
jobs # [1]- Running sleep 100 &
# [2]+ Running sleep 200 &   (+ marks the current, most recent job)
fg %2 # bring job 2 to foreground
# Ctrl-Z (suspend) # → SIGTSTP, returns shell prompt
bg %2 # resume in background
kill %1 # signal job 1
In scripts you usually don't need this — you spawn with &, capture $!, and wait.
nohup and disown
To survive logout:
nohup long-job > out.log 2>&1 &
disown # remove from shell's job table
nohup makes SIGHUP (sent on terminal close) a no-op. disown removes the job from the shell's tracking, so even without nohup it won't be HUP'd on shell exit. Use both for "fire and forget".
A modern alternative for long-running interactive sessions: tmux or screen. Either lets you reattach a session from another terminal — far better than nohup for anything you'd want to come back to.
Signals — the OS interrupt mechanism
Signals are POSIX's way for the kernel (or other processes) to interrupt a running process. The ones you care about:
| Signal | Number | Default action | Catchable | Sent by |
|---|---|---|---|---|
| SIGINT | 2 | terminate | yes | Ctrl-C |
| SIGTERM | 15 | terminate | yes | kill PID, polite shutdown |
| SIGKILL | 9 | terminate | NO | kill -9 PID (last resort) |
| SIGHUP | 1 | terminate | yes | terminal close, also "reload config" idiom |
| SIGUSR1 / SIGUSR2 | 10 / 12 | terminate | yes | user-defined |
| SIGSTOP | 19 | suspend | NO | kill -STOP PID |
| SIGTSTP | 20 | suspend | yes | Ctrl-Z |
| SIGCONT | 18 | resume | yes | bg, fg |
| SIGCHLD | 17 | ignored | yes | when a child exits |
Bash convention: a script killed by signal N exits with code 128+N. Ctrl-C → exit 130, SIGTERM → 143.
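A quick way to see the 128+N rule for yourself, as a sketch: a child shell sends itself SIGTERM and the parent inspects the status. The variable name term_status is illustrative.

```shell
#!/usr/bin/env bash
# The child bash kills itself with SIGTERM (signal 15).
bash -c 'kill -TERM $$'
term_status=$?
echo "child status after SIGTERM: $term_status"

# kill -l translates such a status back into a signal name.
signal_name=$(kill -l "$term_status")
echo "that status corresponds to: $signal_name"
```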
trap — handle signals and exit
trap '<command>' SIGNAL [SIGNAL ...]
The most important pattern — cleanup on any exit:
WORKDIR=$(mktemp -d)
trap 'rm -rf "$WORKDIR"' EXIT
# ... work that might fail or be interrupted ...
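EXIT is a pseudo-signal. In bash it fires on every exit path the shell gets to handle: normal completion, exit N, an error under set -e, or a catchable fatal signal. SIGKILL gives the shell no chance to run it. The cleanup runs once.
Add INT and TERM traps to also clean up when the process is interrupted by Ctrl-C or terminated with kill (the OOM killer sends SIGKILL, which no trap can catch):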
trap 'rm -rf "$WORKDIR"; exit 130' INT
trap 'rm -rf "$WORKDIR"; exit 143' TERM
trap 'rm -rf "$WORKDIR"' EXIT
…but actually, with the EXIT trap registered, INT and TERM also fire EXIT on their way out, so the simpler form is:
trap 'rm -rf "$WORKDIR"' EXIT
trap 'exit 130' INT
trap 'exit 143' TERM
The exit-code traps tell bash what code to return; the EXIT trap does the cleanup.
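To check that the three-trap form behaves as described, one option is to run it in a child script and send that child SIGTERM. This is a sketch under assumptions: the temporary script, the pathfile used to report the directory, and the polling loop are all scaffolding invented for the demonstration.

```shell
#!/usr/bin/env bash
script=$(mktemp)     # hypothetical child script
pathfile=$(mktemp)   # the child reports its temp dir here

cat > "$script" <<'EOF'
WORKDIR=$(mktemp -d)
trap 'rm -rf "$WORKDIR"' EXIT
trap 'exit 130' INT
trap 'exit 143' TERM
echo "$WORKDIR" > "$1"
while :; do sleep 0.1; done    # stand-in for long-running work
EOF

bash "$script" "$pathfile" &
child=$!
until [ -s "$pathfile" ]; do sleep 0.05; done   # wait until the traps are installed
workdir=$(cat "$pathfile")

kill -TERM "$child"
wait "$child"
status=$?
rm -f "$script" "$pathfile"

if [ -e "$workdir" ]; then leftover=yes; else leftover=no; fi
echo "exit status: $status, temp dir left behind: $leftover"
```

Bash runs a trap only after the current foreground command returns, which is why the stand-in work loop uses short sleeps.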
Single quotes vs double quotes in trap
trap "rm -rf $WORKDIR" EXIT # ❌ $WORKDIR expands NOW (at trap-set time)
trap 'rm -rf "$WORKDIR"' EXIT # ✅ $WORKDIR expands LATER (at trap-fire time)
Always use single quotes in trap commands unless you specifically want immediate expansion. The double-quote form bakes in whatever value $WORKDIR had at the time you called trap.
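The difference is easy to observe in a throwaway subshell. In this sketch V is a hypothetical variable that changes after the trap is set; each command substitution captures what its EXIT trap prints:

```shell
#!/usr/bin/env bash
# Double quotes: $V is expanded while the trap is being set, so the trap stores "echo old".
early=$( V=old; trap "echo $V" EXIT; V=new )

# Single quotes: the trap stores the text echo "$V" and expands it when it fires.
late=$( V=old; trap 'echo "$V"' EXIT; V=new )

echo "double-quoted trap printed: $early"
echo "single-quoted trap printed: $late"
```

If you are unsure what a trap captured, trap -p EXIT prints the command string the shell has stored.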
Trap pseudo-signals
| Pseudo-signal | Fires when |
|---|---|
| EXIT | the shell exits, for any reason |
| ERR | any command fails (set -e semantics, same exemptions) |
| DEBUG | before every command |
| RETURN | a function returns |
trap '<cleanup>' ERR is useful for error logging:
trap 'echo "error at line $LINENO" >&2' ERR
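A slightly richer logging trap, as a sketch. It goes beyond the one-liner above by adding set -E (so functions and subshells inherit the ERR trap) and by printing $BASH_COMMAND and $?; the demo function and err_log variable are hypothetical and exist only to show which failures fire the trap:

```shell
#!/usr/bin/env bash
set -E   # errtrace: functions, subshells and command substitutions inherit the ERR trap
trap 'echo "error: [$BASH_COMMAND] exited with $? (line $LINENO)" >&2' ERR

demo() {
  false                   # plain failure: fires the trap
  if false; then :; fi    # exempt: the failure is an if condition
  false || true           # exempt: the failure is on the left of ||
}

# Capture what the trap wrote to stderr while demo ran.
err_log=$(demo 2>&1)
echo "$err_log"
```

Only the first false is reported, which matches the set -e exemptions listed in the table.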
Reaping zombies
When a child process exits, it becomes a "zombie" in the kernel until the parent calls wait on it. Bash automatically reaps children when you use wait. If you don't, the OS cleans them up when your shell exits — but in container entrypoints, you can accumulate zombies forever.
If your bash script is a container entrypoint that spawns long-lived children, either:
- Call wait after each spawn (synchronous), or
- Use tini or dumb-init as the actual PID 1 and have your bash run under it.
tini -- ./entrypoint.sh is the standard.
Backgrounded subshells and signal propagation
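Background jobs don't receive SIGINT from the terminal. An interactive shell puts each job in its own process group, and a script starts & commands with SIGINT ignored:
sleep 100 &
# Ctrl-C goes only to the foreground process group (an interactive shell itself ignores INT).
# The sleep keeps running until it finishes or receives SIGTERM/SIGKILL.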
In a script with set -e, a backgrounded job's failure does NOT abort the parent — you have to wait $pid and check the exit code. This is a common surprise.
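The set -e surprise can be reproduced in a few lines. This is a sketch; bg_pid and bg_status are illustrative names:

```shell
#!/usr/bin/env bash
set -e

false &            # fails in the background; set -e does not see it
bg_pid=$!
echo "parent is still running"

# wait returns the job's exit status. Testing it inside if keeps set -e
# from aborting before the status can be recorded.
if wait "$bg_pid"; then
  bg_status=0
else
  bg_status=$?
fi
echo "background job exited with $bg_status"
```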
Locking and flock
For "only one instance at a time":
exec 9> /var/lock/myscript.lock
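flock -n 9 || { echo "already running" >&2; exit 1; }
# Don't rm the lock file on exit: the lock is released when fd 9 closes,
# and deleting the path lets two instances lock two different files.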
# ... script body ...
flock is the right answer for inter-process locking. Don't roll your own with PID files — they get stale, race, and miss kills.
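To watch the lock being contended and released, here is a sketch that assumes the util-linux flock command is installed. The mktemp lock path is only for the demonstration (a real script uses a fixed, well-known path), and first, second and third are illustrative names:

```shell
#!/usr/bin/env bash
lock=$(mktemp)      # demo-only lock path

exec 9> "$lock"
if flock -n 9; then first=acquired; else first=busy; fi    # this shell holds the lock

# A second process opens the file itself, so it competes with fd 9.
if flock -n "$lock" true; then second=acquired; else second=busy; fi

exec 9>&-           # closing the descriptor releases the lock
if flock -n "$lock" true; then third=acquired; else third=busy; fi

echo "first=$first second=$second third=$third"
rm -f "$lock"       # fine for a throwaway demo file only
```

The flock FILE COMMAND form used for the second and third attempts takes the lock, runs the command, and releases the lock when the command ends.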
Bash vs zsh
Job control is essentially identical: &, wait, jobs, fg, bg, disown. Differences:
- wait takes a job spec (wait %1) as well as a PID in both shells; bash does not require the PID.
- zsh has coproc (coprocess), a bidirectional pipe to a backgrounded command. Bash also has coproc (4.0+), but the syntax differs.
- zsh's TRAPxxx functions: TRAPINT() { ... } is an alternative to trap '...' INT and feels nicer if you have a function.
- Default behaviour: zsh warns when you exit with running jobs ("you have running jobs"). Bash warns only about stopped jobs unless the checkjobs shopt is set, and an interactive login shell sends SIGHUP to its jobs on exit only when the huponexit shopt is on (it is off by default).
Common bugs
Forgot the trap. Script crashes, leaks /tmp/tmp.XXXX. Fix: trap-and-rm at top of script.
Double-quoted trap with $WORKDIR. The variable expanded at trap-set time, capturing the empty string before WORKDIR=$(mktemp -d) ran. Fix: use single quotes.
Backgrounded job's exit silently ignored. set -e doesn't abort on background failures. Fix: wait "$pid" || die "background job failed", where die is your own print-and-exit helper.
Trying to trap SIGKILL. It's uncatchable by design (so misbehaving processes can always be killed). Trap SIGTERM and design your handler to be quick.
Race between trap and concurrent jobs. EXIT runs once, after everything else. Background children that haven't been waited on may still be running when EXIT fires. wait first, then clean up.
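One way to order the cleanup, sketched under assumptions: run_job, cleanup, and the heartbeat worker are hypothetical, and the function body runs in a subshell only so that its EXIT trap fires inside this demonstration.

```shell
#!/usr/bin/env bash
run_job() (                  # ( ) body: the EXIT trap fires when this subshell ends
  WORKDIR=$(mktemp -d)
  pids=()

  cleanup() {
    # Stop the workers, reap them, and only then delete their files.
    (( ${#pids[@]} )) && kill "${pids[@]}" 2>/dev/null
    wait
    rm -rf "$WORKDIR"
  }
  trap cleanup EXIT

  # Hypothetical worker that keeps writing into $WORKDIR.
  ( while :; do date > "$WORKDIR/heartbeat"; sleep 0.05; done ) &
  pids+=("$!")

  sleep 0.2                  # stand-in for the script's real work
  printf '%s\n' "$WORKDIR" "${pids[0]}"
)

{ read -r workdir; read -r worker_pid; } <<< "$(run_job)"

if [ -e "$workdir" ]; then dir_state=present; else dir_state=removed; fi
if kill -0 "$worker_pid" 2>/dev/null; then worker_state=running; else worker_state=gone; fi
echo "temp dir: $dir_state, worker: $worker_state"
```

Tracking PIDs in an array mirrors the fan-out pattern and keeps the handler from signalling processes it did not start.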
trap '' SIGNAL — empty command IGNORES the signal entirely. Different from trap - SIGNAL which RESETS to default. The empty-string form is what nohup does internally for SIGHUP.
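The ignore/reset distinction can be seen by having a child shell signal itself. A sketch, using SIGUSR1 because its default action is to terminate; ignored, reset and reset_status are illustrative names:

```shell
#!/usr/bin/env bash
# Ignored: the signal is discarded and the child carries on.
ignored=$(bash -c 'trap "" USR1; kill -USR1 $$; echo survived')

# Reset: the default action (terminate) is back, so the echo never runs.
reset=$(bash -c 'trap "" USR1; trap - USR1; kill -USR1 $$; echo survived')
reset_status=$?

echo "with trap '':        output='$ignored'"
echo "after trap - USR1:   output='$reset' status=$reset_status"
```

The status in the second case is 128 plus the platform's number for SIGUSR1.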
Tools in the wild
- tini: Tiny init for containers. Reaps zombies and forwards signals; pairs with bash entrypoints.
- supervisord: When your bash entry script becomes a process supervisor, it's time for supervisord (or systemd).
- pstree / htop: Visualise the job tree your script spawned. Indispensable for diagnosing zombies and orphan processes.