Find & Locate
Walk the live filesystem or query an index — and how to feed xargs safely.
Two tools, two philosophies. find walks the filesystem live every time you ask. locate queries a database that's only as fresh as the last nightly index. Knowing when to reach for which is the whole skill.
Analogy
find is asking the librarian to walk every shelf in the library and write down which books match your description. locate is consulting last night's catalogue printout — instant, but missing anything that arrived this morning. The librarian is always right; the printout is almost always right and answers in a second.
find — the live walker
find recursively walks a directory tree, evaluating predicates against each entry. The shape is always:
find <where> <predicates> <action>
The classics:
find /var/log -name '*.log' # by name
find /home -mtime -1 # modified < 24h ago
find /tmp -size +100M # bigger than 100 MB
find . -type f -name '*.py' -path '*/tests/*' # combine — implicit AND
Predicates chain with implicit AND. Add -o for OR (and parentheses, which the shell needs you to escape: \( \)).
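A minimal sketch of why the grouping matters, run against a throwaway tree (the file names here are made up for illustration). Because -o binds more loosely than the implicit AND, leaving out the parentheses changes which entries match:

```shell
# Build a throwaway tree so the example is safe to run anywhere.
sandbox=$(mktemp -d)
touch "$sandbox/notes.md" "$sandbox/readme.txt" "$sandbox/app.py"
mkdir "$sandbox/old.txt"   # a directory whose name also ends in .txt

# Regular files ending in .md OR .txt. The escaped parentheses group the
# OR so that -type f applies to both branches.
grouped=$(find "$sandbox" -type f \( -name '*.md' -o -name '*.txt' \) | sort)

# Without the grouping this reads as:
#   (-type f AND -name '*.md') OR (-name '*.txt')
# so the old.txt directory slips through the second branch.
ungrouped=$(find "$sandbox" -type f -name '*.md' -o -name '*.txt' | sort)

printf 'grouped:\n%s\n\nungrouped:\n%s\n' "$grouped" "$ungrouped"
rm -rf "$sandbox"
```

The grouped form lists two files; the ungrouped form lists three entries, one of them a directory.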
Time predicates
Three timestamps, all measured in 24-hour buckets:
| Flag | Meaning |
|---|---|
| -mtime | Content modified. |
| -atime | Last accessed (often disabled with noatime for perf). |
| -ctime | Inode changed (perms, ownership, rename, as well as content changes). |
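-mtime -7 = modified within the last 7 days. -mtime +30 = more than 30 full days old (find discards fractional days, so in practice at least 31). -mtime 0 = modified within the last 24 hours, which is not the same as "today".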
For finer control: -mmin, -amin, -cmin work in minutes.
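The bucket arithmetic is easy to check on two files with known ages. This sketch assumes nothing beyond touch -t for backdating; the file names are illustrative:

```shell
sandbox=$(mktemp -d)
touch "$sandbox/fresh.log"                    # mtime = now
touch -t 200001010000 "$sandbox/ancient.log"  # mtime = 1 Jan 2000

# Age is counted in whole 24-hour periods, with the fraction discarded.
last_week=$(find "$sandbox" -type f -mtime -7)    # fewer than 7 periods old
over_month=$(find "$sandbox" -type f -mtime +30)  # more than 30 periods old
last_day=$(find "$sandbox" -type f -mtime 0)      # less than one period old

printf 'last week:  %s\nover month: %s\nlast day:   %s\n' \
  "$last_week" "$over_month" "$last_day"
rm -rf "$sandbox"
```

Only fresh.log matches -mtime -7 and -mtime 0; only ancient.log matches -mtime +30.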
Acting on matches with -exec
The textbook example — delete every .tmp older than a week:
find /var/cache -name '*.tmp' -mtime +7 -exec rm {} \;
{} is the placeholder for the matched path. The terminator matters:
- \; runs the command once per match. Slow on big result sets.
- + batches matches into one command (like xargs). Much faster.
find . -name '*.bak' -exec rm {} +
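To see the difference between the two terminators without deleting anything, echo can stand in for rm: every line of output is one invocation of the command. A small sketch with three made-up files:

```shell
sandbox=$(mktemp -d)
touch "$sandbox/a.bak" "$sandbox/b.bak" "$sandbox/c.bak"

# echo stands in for rm: each output line is one invocation of the command.
per_match=$(find "$sandbox" -name '*.bak' -exec echo {} \; | wc -l)
batched=$(find "$sandbox" -name '*.bak' -exec echo {} + | wc -l)

printf 'invocations with \\; : %d\n' "$per_match"
printf 'invocations with +  : %d\n' "$batched"
rm -rf "$sandbox"
```

With three matches, the \; form runs echo three times and the + form runs it once.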
-path and -prune for excluding directories
-name matches the basename. -path matches the whole path string:
find . -path '*/node_modules' -prune -o -name '*.js' -print
Translation: descend, but don't enter node_modules; for everything else, print .js files. Without -prune, find walks the whole node_modules tree even though you don't want it.
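The effect of pruning can be checked on a throwaway tree. This sketch uses -name node_modules to select the directory to prune, which is one of several ways to write the test; the layout is invented for the example:

```shell
sandbox=$(mktemp -d)
mkdir -p "$sandbox/src" "$sandbox/node_modules/pkg"
touch "$sandbox/src/app.js" "$sandbox/node_modules/pkg/index.js"

# Prune any directory named node_modules, otherwise print .js files.
# The explicit -print matters: with no action at all, find wraps the whole
# expression in an implicit -print and would list the pruned directory too.
pruned=$(find "$sandbox" -name node_modules -prune -o -name '*.js' -print)

# For comparison: no pruning, so the walk covers node_modules as well.
unpruned=$(find "$sandbox" -name '*.js' | sort)

printf 'pruned:\n%s\n\nunpruned:\n%s\n' "$pruned" "$unpruned"
pruned_rel=${pruned#"$sandbox"/}   # keep a relative copy for inspection
rm -rf "$sandbox"
```

The pruned walk reports only src/app.js; the unpruned walk also reports node_modules/pkg/index.js.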
locate — the indexed query
locate does no filesystem walking. It searches a flat database of every path on the system, built nightly by updatedb:
locate nginx.conf
# /etc/nginx/nginx.conf
# /usr/share/man/man8/nginx.conf.8.gz
# ...
Sub-second results, regardless of system size. The catch: anything created or deleted since the last updatedb run is invisible. Force a refresh manually:
sudo updatedb
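The staleness problem can be imitated without touching the real database. The sketch below is a toy stand-in, not how locate is implemented: a plain text file of paths plays the role of the index, and grep plays the role of the query.

```shell
sandbox=$(mktemp -d)
touch "$sandbox/old.conf"

# "Nightly" snapshot: a flat list of paths, standing in for locate's database.
find "$sandbox" > "$sandbox.index"

# A file that arrives after the snapshot was taken.
touch "$sandbox/new.conf"

# Query the snapshot versus walking the disk.
# (|| true keeps grep's "no match" exit status from aborting strict shells.)
stale_hits=$(grep -c 'new\.conf$' "$sandbox.index" || true)
live_hits=$(find "$sandbox" -name 'new.conf' | wc -l)

# Rebuilding the snapshot is the toy equivalent of running updatedb.
find "$sandbox" > "$sandbox.index"
fresh_hits=$(grep -c 'new\.conf$' "$sandbox.index" || true)

printf 'stale index: %d  live walk: %d  rebuilt index: %d\n' \
  "$stale_hits" "$live_hits" "$fresh_hits"
rm -rf "$sandbox" "$sandbox.index"
```

The stale snapshot misses the new file, the live walk finds it, and the rebuilt snapshot finds it again.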
When locate wins
- "Where is php.ini?"
- "What were all those config files I installed weeks ago?"
- "Which paths contain the substring letsencrypt?"
When locate is wrong
- Anything time-sensitive ("what changed in the last hour").
- Filtering by size or owner.
- Right after extracting a tarball, before updatedb runs.
- On systems without the index installed (some minimal containers).
The -print0 | xargs -0 pattern
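Filenames can contain almost anything: spaces, newlines, quotes, even Unicode. By default xargs splits its input on whitespace and treats quotes and backslashes specially, so the naive pipeline silently breaks: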
# WRONG — breaks on filenames with spaces or newlines
find . -name '*.log' | xargs rm
The safe form uses NUL-separated records:
find . -name '*.log' -print0 | xargs -0 rm
-print0 writes results separated by \0 (the only byte that can never appear in a path). xargs -0 reads them. This is the one shape you should burn into muscle memory: anywhere you pipe filenames between processes, use it.
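A quick way to convince yourself is to create deliberately awkward names in a scratch directory and delete them with the NUL-separated form. The names below are invented for the test; the count works by counting NUL terminators rather than lines, since one name contains a newline.

```shell
sandbox=$(mktemp -d)
touch "$sandbox/plain.log" "$sandbox/with space.log"
touch "$sandbox/$(printf 'two\nlines').log"   # a newline inside the name

# Count matches by counting NUL terminators, which is safe for any name.
count_logs() { find "$sandbox" -name '*.log' -print0 | tr -cd '\0' | wc -c; }

before=$(count_logs)
find "$sandbox" -name '*.log' -print0 | xargs -0 rm
after=$(count_logs)

printf 'log files before: %d, after: %d\n' "$before" "$after"
rm -rf "$sandbox"
```

All three files are removed, including the one whose name spans two lines.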
xargs flags worth knowing:
- -0: NUL-delimited input (pair with -print0).
- -n 1: one argument per command invocation.
- -P 8: run 8 commands in parallel.
- -I {}: replace {} in the command (rather than appending).
find . -maxdepth 1 -name '*.jpg' -print0 | xargs -0 -P 8 -I {} convert {} -resize 50% small/{} # small/ must already exist
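The flags are easier to compare when the command is a harmless echo. In this sketch a hypothetical items function emits three NUL-separated names, and each output line corresponds to one invocation:

```shell
# Three NUL-separated items, one of them containing a space.
items() { printf 'a.jpg\0b c.jpg\0d.jpg\0'; }

default_runs=$(items | xargs -0 echo | wc -l)            # all arguments in one call
one_each=$(items | xargs -0 -n 1 echo | wc -l)           # -n 1: one call per item
parallel=$(items | xargs -0 -n 1 -P 2 echo | wc -l)      # -P 2: two calls at a time
templated=$(items | xargs -0 -I {} echo "small/{}")      # -I: substitute in place

printf 'default: %d call(s), -n 1: %d calls, -P 2: %d calls\n' \
  "$default_runs" "$one_each" "$parallel"
printf '%s\n' "$templated"
```

With -P the order of the output lines is not guaranteed, which is why the sketch only counts them.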
A decision tree
| Situation | Tool |
|---|---|
| "Just find me X anywhere on this box" | locate |
| "Find recently changed files" | find with -mtime |
| "Filter by size/owner/permissions" | find |
| "Apply a command to every match" | find -exec or pipe to xargs |
| "I want regex by default and .gitignore aware" | fd (a modern find rewrite) |
fd — the modern alternative
If you can install fd, you'll wonder how you tolerated find for so long. Sensible defaults:
fd '\.log$' /var/log # regex matching by default
fd -t f -e py # only files, only .py extension
fd -e bak -X rm # apply command to all matches at once (like find -exec ... +)
fd -H -I # include hidden (-H) and .gitignore'd (-I) files
It's find with the rough edges sanded off. Same core idea — live filesystem walk — just nicer to type.
Common bugs
Forgetting -type f. Without it, find . -name '*.log' will also match a directory named something.log. Add -type f for files, -type d for directories.
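A small sketch of that mix-up, using a scratch directory that contains one log file and one directory whose name happens to end in .log (both names are made up):

```shell
sandbox=$(mktemp -d)
touch "$sandbox/app.log"
mkdir "$sandbox/archive.log"   # a directory that happens to end in .log

any_type=$(find "$sandbox" -name '*.log' | wc -l)            # file and directory
files_only=$(find "$sandbox" -type f -name '*.log' | wc -l)  # just app.log
dirs_only=$(find "$sandbox" -type d -name '*.log' | wc -l)   # just archive.log

printf 'any: %d  files: %d  dirs: %d\n' "$any_type" "$files_only" "$dirs_only"
rm -rf "$sandbox"
```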
Not quoting the pattern. find . -name *.log lets the shell expand *.log first. If exactly one .log file is in the current dir, find silently searches for that one literal name; if several are, it fails with a confusing error. Always quote: -name '*.log'.
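The single-match case is the dangerous one because nothing looks wrong. This sketch reproduces it in a scratch directory with one log file at the top level and one in a subdirectory (hypothetical names), assuming default shell globbing options:

```shell
sandbox=$(mktemp -d)
mkdir "$sandbox/sub"
touch "$sandbox/one.log" "$sandbox/sub/two.log"

# Each command substitution runs in a subshell, so the cd does not leak out.
# Unquoted: the shell rewrites *.log to one.log before find ever starts.
unquoted=$(cd "$sandbox" && find . -name *.log | sort)

# Quoted: find receives the pattern itself and matches at every depth.
quoted=$(cd "$sandbox" && find . -name '*.log' | sort)

printf 'unquoted:\n%s\n\nquoted:\n%s\n' "$unquoted" "$quoted"
rm -rf "$sandbox"
```

The unquoted run reports only ./one.log and exits successfully; the quoted run also reports ./sub/two.log.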
Using -exec ... \; for huge result sets. Runs the command once per match. -exec ... + (or piping through xargs) batches and is orders of magnitude faster.
Trusting locate after a fresh install. The DB is stale until updatedb next runs. Run it manually if the answer matters.
Tools in the wild
- find: POSIX-standard live filesystem walker; every box has it.
- locate / plocate: Indexed filename query; sub-second answers from a nightly-built database.
- fd: Modern find replacement; regex by default, .gitignore-aware, parallel.
- xargs: Turns stdin into command arguments. `-0` for NUL-separated, `-P` for parallel.