Spotlight & mdfind

Querying the Spotlight index from the terminal — kMDItem attributes, exclusions, volume indexing.

Spotlight is the indexing engine behind macOS search: the cmd-space dialog, Finder's search bar, Mail's filter, all of it. Once you can drive it from the terminal with mdfind, you have a fast, metadata-aware file search that has been on the system all along.

Analogy

Spotlight is like having a librarian who has read every page of every book in your library and remembers everything. Open the cmd-space dialog and you're asking her a question — "do you remember a document about Q3 revenue?". She points at the shelf instantly. mdfind is the same librarian, except now you're slipping her a written request through the back door at 3am with very specific filters: "any PDF over 50 pages, authored by Sarah, modified this month, on this drive only." She still answers in milliseconds.

How it works

Three components:

  • mds — the metadata server. Indexes file metadata in real-time as files change. Runs as a daemon launched by launchd.
  • mdworker — workers spawned by mds to extract metadata from specific file types. Each format (PDF, PNG, .docx, .pages) has a metadata importer plugin in /System/Library/Spotlight and /Library/Spotlight.
  • The index — stored under /.Spotlight-V100 on each indexed volume. Contains both metadata and an inverted index of file content text.

Querying:

  • mdfind — terminal interface. Read-only.
  • The cmd-space dialog — UI interface. Same index.
  • Finder search — same index, with a different UI on top.

All three read the same store. The cmd-space dialog and Finder rank and filter what they display, so mdfind will often list more matches for the same query.

The basic mdfind queries

mdfind "Annual Report"                  # full-text search across indexed content
mdfind -name budget.xlsx                # filename only
mdfind -onlyin ~/Documents "draft"      # restrict to a directory
mdfind -live "TODO"                     # streaming results — updates as new matches appear
mdfind -count "TODO"                    # just the count
mdfind -literal 'kMDItemFSName == "notes.txt"'   # take the string as a raw query expression, without interpretation

The -onlyin flag is the one to remember: without it, mdfind searches every indexed volume and can easily return thousands of results.
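
One way to make scoping the default is a small shell function. This is a minimal sketch, not part of mdfind: the name search_here is hypothetical, and it relies only on the -onlyin flag shown above.

```shell
# search_here QUERY [DIR]
# Hypothetical wrapper: run an mdfind query scoped to DIR
# (defaults to the current directory) instead of every indexed volume.
search_here() {
    query=$1
    dir=${2:-$PWD}
    mdfind -onlyin "$dir" "$query"
}
```

With this in a shell profile, search_here "draft" ~/Documents behaves like the -onlyin example above, and search_here "TODO" searches only below the directory you are standing in.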

kMDItem attributes — the metadata schema

Every file Spotlight indexes gets a set of metadata attributes. They're prefixed with kMDItem:

$ mdls ~/Downloads/example.pdf | head -20
_kMDItemDisplayNameWithExtensions = "example.pdf"
kMDItemAuthors                    = ("Sam Bailey")
kMDItemContentCreationDate        = 2026-04-28 14:30:00 +0000
kMDItemContentModificationDate    = 2026-04-28 14:35:00 +0000
kMDItemContentType                = "com.adobe.pdf"
kMDItemContentTypeTree            = ("com.adobe.pdf", "public.composite-content", "public.content", "public.data", "public.item")
kMDItemFSCreationDate             = 2026-04-28 14:30:00 +0000
kMDItemFSName                     = "example.pdf"
kMDItemFSSize                     = 12345
kMDItemKind                       = "Portable Document Format (PDF)"
kMDItemNumberOfPages              = 42
kMDItemPageHeight                 = 792
kMDItemPageWidth                  = 612
...

Hundreds of attributes are defined. Different file types use different subsets — images have kMDItemPixelHeight, audio has kMDItemAudioBitRate, etc.

Querying by attribute

The query language is Apple's file metadata query expression syntax, which resembles NSPredicate format strings but is not identical to them:

# Find every PDF:
mdfind 'kMDItemContentType == "com.adobe.pdf"'

# All images bigger than 4000px wide:
mdfind 'kMDItemPixelWidth >= 4000'

# Everything authored by me, modified in the last week:
mdfind 'kMDItemAuthors == "Sam Bailey" && kMDItemContentModificationDate > $time.today(-7)'

# Combination — PDFs in Documents, larger than 5MB:
mdfind 'kMDItemContentType == "com.adobe.pdf" && kMDItemFSSize > 5000000' -onlyin ~/Documents

Comparisons support ==, !=, <, >, <=, and >=. For prefix or wildcard matching on strings, use == with *:

mdfind 'kMDItemDisplayName == "report*"'   # files starting with "report"
mdfind 'kMDItemDisplayName == "*2026*"'    # files with 2026 anywhere in the name

The c modifier, written directly after the closing quote of the value, makes a comparison case-insensitive:

mdfind 'kMDItemDisplayName == "Report*"c'   # case-insensitive prefix match
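
Queries like these are ordinary strings, so they can be assembled by a helper. The sketch below is illustrative only: recent_of_type is a hypothetical name, and it assumes $time.today(-N), which Apple's query expression documentation describes as an offset in days from today.

```shell
# recent_of_type UTI DAYS
# Hypothetical helper: print a query matching files of exactly the
# given content type that were modified within the last DAYS days.
# The single quotes keep the shell from expanding $time.
recent_of_type() {
    printf 'kMDItemContentType == "%s" && kMDItemContentModificationDate > $time.today(-%d)\n' "$1" "$2"
}

# Print the query for "PDFs touched in the last 7 days".
recent_of_type com.adobe.pdf 7
```

The printed string can be passed straight to mdfind, for example mdfind -onlyin ~/Documents "$(recent_of_type com.adobe.pdf 7)". Printing the query first makes it easy to check the quoting before running it.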

Content type — the most useful attribute

kMDItemContentType is a UTI (Uniform Type Identifier) string. The major ones to know:

UTI                                           What it is
public.image                                  Any image
public.movie                                  Any video
public.audio                                  Any audio
public.text                                   Any text-based file
public.html                                   HTML
com.adobe.pdf                                 PDF
com.microsoft.word.doc                        Word document (.doc)
org.openxmlformats.wordprocessingml.document  Word document (.docx)
com.apple.iwork.pages.pages                   Pages
public.python-script                          .py files
public.shell-script                           .sh files
public.source-code                            Source code in any language

kMDItemContentTypeTree is the inheritance chain — a PDF is ("com.adobe.pdf", "public.composite-content", "public.content", "public.data", "public.item"). Querying against the tree lets you say "any image" with 'kMDItemContentTypeTree == "public.image"'.
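
A tree query combines naturally with -count and -onlyin. The helper below is a sketch under those assumptions; count_of_kind is a hypothetical name, and the default directory is an arbitrary choice.

```shell
# count_of_kind UTI [DIR]
# Hypothetical helper: count files under DIR (default: home directory)
# whose type tree contains UTI, e.g. public.image for "any image".
count_of_kind() {
    uti=$1
    dir=${2:-$HOME}
    mdfind -count -onlyin "$dir" "kMDItemContentTypeTree == \"$uti\""
}
```

For example, count_of_kind public.image ~/Pictures counts PNG, JPEG, HEIC and every other image type at once, because each of them has public.image somewhere in its tree.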

Excluded paths

Spotlight respects user-configured exclusions:

System Settings → Spotlight → Privacy → drop folder onto list

The Privacy list is stored per volume, in the Exclusions key of /.Spotlight-V100/VolumeConfiguration.plist (root-owned; on recent macOS the boot volume's copy lives under /System/Volumes/Data). The per-user file ~/Library/Preferences/com.apple.Spotlight.plist holds search-result category settings, not exclusions. Anything in an excluded path is invisible to mdfind, the cmd-space dialog, and Finder search, by design.

Two filesystem markers also opt content out: a folder whose name ends in .noindex is skipped automatically, and an empty file named .metadata_never_index at the root of a volume tells Spotlight not to index that volume.

If a search returns nothing for a file you can see in Finder, the exclusion list is the first thing to check.
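
The .noindex rule is easy to check mechanically before digging into the Privacy list. The functions below are a minimal sketch; both names are hypothetical, and the check only looks at the path string, not at what Spotlight actually did.

```shell
# is_under_noindex PATH
# Succeeds if any component of PATH has a name ending in ".noindex".
is_under_noindex() {
    case "$1/" in
        *.noindex/*) return 0 ;;
        *)           return 1 ;;
    esac
}

# why_missing PATH
# Hypothetical first-pass diagnosis for a file that search cannot see.
why_missing() {
    if is_under_noindex "$1"; then
        echo "skipped: inside a .noindex folder"
    else
        echo "not under .noindex: check the Privacy list and mdutil -s"
    fi
}
```

Calling why_missing ~/Library/Caches/build.noindex/out.log reports the first case without touching the index at all.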

Volume indexing

Each volume has its own index, controlled with mdutil:

mdutil -s /                            # status of root volume
# /:
#     Indexing enabled.

mdutil -as                             # status of all mounted volumes

sudo mdutil -i off /Volumes/Scratch    # disable indexing for a volume
sudo mdutil -i on  /Volumes/Scratch    # enable
sudo mdutil -E /Volumes/Backup         # erase and rebuild a volume's index

External drives default to indexing-enabled when first connected. If a colleague hands you a USB drive and search returns nothing, mdutil -as will tell you why — often the index is missing or stalled.

A common fix sequence when search "feels stale":

sudo mdutil -i off /
sudo mdutil -i on /
# spotlight reindexes — takes minutes-to-hours depending on disk size
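
For scripts, the status text can be reduced to a single word. This sketch assumes the status line contains the word "enabled" or "disabled", as in the sample output above; the exact wording may differ between macOS versions, and indexing_state is a hypothetical name.

```shell
# indexing_state
# Read `mdutil -s` output on stdin and print enabled, disabled, or unknown.
# Sketch only: a volume path that itself contains one of these words
# would confuse it.
indexing_state() {
    report=$(cat)
    case "$report" in
        *disabled*) echo disabled ;;
        *enabled*)  echo enabled ;;
        *)          echo unknown ;;
    esac
}

# Only query the real tool where it exists (macOS).
if command -v mdutil >/dev/null 2>&1; then
    mdutil -s / | indexing_state
fi
```

A backup script could use this to skip a Spotlight search and go straight to find when the target volume reports disabled.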

Getting metadata on a single file

mdls /path/to/file
# lists every kMDItem* attribute on that file

mdls is the per-file counterpart to mdfind: mdfind goes from a query to files, mdls goes from a file to its attributes. Useful for:

  • Discovering which attributes a particular file format exposes (just mdls an example file).
  • Debugging "why isn't this file matching my query?" — check the actual attribute values.
  • Programmatic access — many of these attributes are richer than stat() provides.

mdls -name <attr> returns just one attribute:

mdls -name kMDItemAuthors example.pdf
# kMDItemAuthors = ("Sam Bailey")
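
In scripts the "name = value" formatting gets in the way. mdls also has a -raw flag that prints only the value. The sketch below wraps it; attr_value is a hypothetical name, and the loop over PDFs in the current directory is just an illustration.

```shell
# attr_value ATTR FILE
# Print only the value of one attribute, without the "name =" prefix.
attr_value() {
    mdls -raw -name "$1" "$2"
}

# Illustration: page count for every PDF in the current directory.
# Guarded so the block is a no-op where mdls is not installed.
if command -v mdls >/dev/null 2>&1; then
    for f in ./*.pdf; do
        [ -e "$f" ] || continue   # no PDFs here: the glob stayed literal
        printf '%s\t%s\n' "$(attr_value kMDItemNumberOfPages "$f")" "$f"
    done
fi
```

Files that lack the attribute typically come back as (null), so treat that value as "not available" rather than as a number.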

Streaming results — -live

mdfind -live "TODO"

This blocks and streams results as they appear or disappear. Useful for "watch this query and tell me when something matches" — drop a file matching the query into Spotlight's reach and the line shows up. CTRL-C to exit.

Common patterns

Find every screenshot taken this week:

mdfind 'kMDItemIsScreenCapture == 1 && kMDItemContentCreationDate > $time.today(-7)'

All movies bigger than 1GB on the external drive:

mdfind -onlyin /Volumes/Media 'kMDItemContentTypeTree == "public.movie" && kMDItemFSSize > 1000000000'

Code files you've touched today:

mdfind 'kMDItemContentTypeTree == "public.source-code" && kMDItemContentModificationDate > $time.today'

The opposite of full-text search — find files NOT containing a string:

mdfind doesn't have a convenient "not containing" operator on full text. Combine mdfind with grep -L to filter post-hoc. The -0 flag on mdfind and xargs keeps paths that contain spaces intact:

mdfind -0 -onlyin ~/projects "imports" | xargs -0 grep -L "vitest"

When mdfind isn't right

  • Files outside indexed volumes. No index, no results. Use find or fd.
  • Very recent files (seconds-old). Index propagation has a small lag.
  • Network filesystems. Spotlight doesn't index NFS / SMB by default; some configurations do, most don't.
  • Excluded paths. Configurable invisibility — see above.

For "find every .py file under this directory tree right now," find or fd is more reliable, because it walks the live filesystem rather than an index. For "any file across my entire system matching this complex predicate," mdfind wins.
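
The two approaches can be combined: ask the index first, then fall back to walking the tree. This is a sketch under stated assumptions; find_named is a hypothetical name, and treating "no results" as "not indexed" is a simplification, since a query can legitimately match nothing.

```shell
# find_named DIR NAME
# Hypothetical helper: look up NAME in the Spotlight index under DIR,
# and fall back to find when mdfind is unavailable or returns nothing.
find_named() {
    dir=$1
    name=$2
    hits=""
    if command -v mdfind >/dev/null 2>&1; then
        hits=$(mdfind -onlyin "$dir" -name "$name" 2>/dev/null)
    fi
    if [ -n "$hits" ]; then
        printf '%s\n' "$hits"
    else
        # Case-insensitive substring match on the file name.
        find "$dir" -iname "*$name*" 2>/dev/null
    fi
}
```

On an excluded path, an unindexed volume, or a file created seconds ago, find_named ~/projects budget still finds the file; it just takes the slower route.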

What to internalise

  • mdfind is the terminal entry point to the same Spotlight index the cmd-space dialog uses.
  • kMDItem* attributes are the metadata schema; mdls <file> shows all of them.
  • mdutil controls indexing per-volume; mdutil -E is the rebuild button.
  • The Privacy list and .noindex folders are the explicit "skip me" mechanisms.
  • For network drives, recent files, and unindexed volumes, fall back to find/fd.

Tools in the wild

5 tools
  • mdfind (free tier)

    Built-in. Spotlight queries from the terminal.

    cli
  • mdls (free tier)

    Built-in. Show every Spotlight metadata attribute on a file.

    cli
  • mdutil (free tier)

    Built-in. Turn indexing on/off per volume; rebuild the index.

    cli
  • Spotlight replacement with workflows, custom searches, clipboard history.

    service
  • Raycast (free tier)

    Modern Spotlight replacement with extensions and team sync.

    service