Traceroute and MTR

How traceroute works, and where it lies.

Every networking debug starts with the same question: "where, exactly, is the traffic going?" Traceroute is the foundational tool. It is also one of the most misread tools in the box — output looks definitive, but it lies in subtle, important ways.

How traceroute works

Each IP packet has a TTL field — Time To Live. Routers decrement TTL by one at every hop. When TTL hits zero, the router drops the packet and sends back an ICMP Time Exceeded (Type 11) message. The source of that ICMP reply is the router's IP — that's the hop traceroute discovered.

Traceroute exploits this:

Probe 1: TTL=1   →  router 1 decrements → 0 → ICMP from router 1
Probe 2: TTL=2   →  routers 1,2 decrement → 0 → ICMP from router 2
Probe 3: TTL=3   →  routers 1,2,3 → 0 → ICMP from router 3
...
Final probe with TTL high enough to reach the destination: the destination
   itself responds, with ICMP Port Unreachable (UDP probes), ICMP Echo Reply
   (ICMP probes), or TCP SYN-ACK/RST (TCP probes).
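The probe sequence above can be modelled without touching the network. The following is a minimal simulation sketch, not a real traceroute: `simulate_traceroute` and the example addresses are hypothetical, and it assumes every router replies and the path never changes.

```python
def simulate_traceroute(path, max_ttl=30):
    """Model TTL expiry along a fixed forward path.

    `path` is a hypothetical list of addresses: routers in order,
    destination last. Returns the address that answers each probe,
    one entry per TTL value tried.
    """
    if not path:
        return []
    last_index = len(path) - 1
    discovered = []
    for initial_ttl in range(1, max_ttl + 1):
        ttl = initial_ttl
        reached_destination = False
        for index, node in enumerate(path):
            if index == last_index:
                # The probe arrived at the destination host, which answers itself.
                discovered.append(node)
                reached_destination = True
                break
            ttl -= 1  # each router decrements TTL by one
            if ttl == 0:
                # TTL expired here: this router sends ICMP Time Exceeded.
                discovered.append(node)
                break
        if reached_destination:
            break  # no need to probe with a larger TTL
    return discovered


if __name__ == "__main__":
    example_path = ["10.0.0.1", "10.0.1.1", "10.0.2.1", "203.0.113.9"]
    for ttl, address in enumerate(simulate_traceroute(example_path), start=1):
        print(ttl, address)
```

Each loop iteration is one probe: the TTL grows by one, so the reply comes from one router further along until the destination answers.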

Each hop is discovered by the SOURCE address of the ICMP Time Exceeded reply. By default traceroute sends 3 probes per hop and records the round-trip time (RTT) of each. A typical line:

  3  72.14.215.182  4.231 ms  4.501 ms  4.412 ms

That's hop 3, IP 72.14.215.182, three RTT samples in milliseconds.
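If you want to post-process output, a line like that splits cleanly into fields. This is a rough sketch that assumes numeric output (`traceroute -n`), where each line is a hop number, one address, and RTTs or asterisks; `parse_hop_line` is a hypothetical helper, and lines that show several addresses per hop keep only the first.

```python
def parse_hop_line(line):
    """Parse one numeric traceroute line into hop, address, and RTT samples.

    A timed-out probe ("*") is recorded as None in the RTT list.
    Returns None for lines that do not start with a hop number.
    """
    tokens = line.split()
    if not tokens or not tokens[0].isdigit():
        return None
    hop = int(tokens[0])
    address = None
    rtts_ms = []
    i = 1
    while i < len(tokens):
        token = tokens[i]
        if token == "*":
            rtts_ms.append(None)
        elif i + 1 < len(tokens) and tokens[i + 1] == "ms":
            rtts_ms.append(float(token))
            i += 1  # skip the "ms" unit
        elif address is None:
            address = token
        i += 1
    return {"hop": hop, "address": address, "rtts_ms": rtts_ms}


if __name__ == "__main__":
    print(parse_hop_line("  3  72.14.215.182  4.231 ms  4.501 ms  4.412 ms"))
```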

Reading the asterisks

  5  10.0.0.5  3.2 ms  3.1 ms  3.4 ms
  6  *  *  *
  7  *  *  *
  8  72.14.215.182  18.2 ms  18.1 ms  18.0 ms
  9  72.14.215.182  18.5 ms  *  17.9 ms

What looks alarming usually isn't. Three asterisks mean the probes did not get a reply within the timeout. Reasons:

  • The router refuses to send ICMP (security policy).
  • The router's ICMP rate limit kicked in (for example, it answers 1/s but you sent 3/s).
  • The router IS forwarding fine; the probe just didn't elicit a reply.
  • The reply was generated but never made it back, for example because it was filtered or because a router inside an MPLS core has no route to your source.

A row of * * * does not mean the path is broken. A later hop replied (hop 8 above), so the packets ARE getting through. Diagnose loss with MTR (continuous probing), not a single traceroute snapshot.
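The "did a later hop reply?" check is mechanical enough to write down. Here is a small sketch, assuming you already have the hops as (hop number, address or None) pairs in TTL order; `classify_silent_hops` and its verdict strings are made up for illustration.

```python
def classify_silent_hops(hops):
    """Label hops that returned only asterisks.

    `hops` is a list of (hop_number, address_or_None) in TTL order.
    A silent hop followed by any responding hop cannot be where the
    path breaks, because probes demonstrably got past it.
    """
    responding = [i for i, (_, address) in enumerate(hops) if address is not None]
    last_responding = responding[-1] if responding else -1
    verdicts = {}
    for i, (hop_number, address) in enumerate(hops):
        if address is not None:
            continue
        if i < last_responding:
            verdicts[hop_number] = "silent router, path continues"
        else:
            verdicts[hop_number] = "no replies from here on, needs more evidence"
    return verdicts


if __name__ == "__main__":
    trace = [(5, "10.0.0.5"), (6, None), (7, None),
             (8, "72.14.215.182"), (9, "72.14.215.182")]
    print(classify_silent_hops(trace))
```

Trailing asterisks with nothing after them stay ambiguous: the destination may be filtering probes, or the path may really end there.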

Where traceroute lies

Asymmetric paths. The path your packet takes to a destination is rarely the same as the path the reply takes back. Traceroute shows the forward path. RTT includes the (unseen) reverse path. If your RTT spikes at hop 7, that might be the forward path slowing down OR the reverse path through some completely different network.

MPLS hop hiding. Backbone ISPs run MPLS: labels are pushed at the edge routers, and the label-switching routers (LSRs) in between forward by label, not IP. Some networks don't decrement the IP TTL inside the LSP, so traceroute sees fewer hops than physically exist. The latency at the LSP exit hop "absorbs" all the intermediate latency.

Variable per-probe paths. Equal-Cost Multi-Path (ECMP) routers hash the 5-tuple to pick an outgoing link. Successive probes differ in that tuple (classic UDP traceroute increments the destination port on each probe), so they can take different paths. Three probes at one hop showing different responding IPs or wildly different RTTs is often ECMP variance, not chaos.
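To see why changing one port can change the path, here is a toy model of ECMP link selection. It is only a sketch: real routers use vendor-specific hash functions and inputs, and SHA-256, `pick_link`, and the addresses below are stand-ins chosen for determinism.

```python
import hashlib


def pick_link(src_ip, dst_ip, protocol, src_port, dst_port, num_links):
    """Map a flow's 5-tuple to one of `num_links` equal-cost links.

    Assumption: a generic hash stands in for the router's real one.
    The same 5-tuple always lands on the same link.
    """
    key = f"{src_ip}|{dst_ip}|{protocol}|{src_port}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % num_links


if __name__ == "__main__":
    # Classic UDP traceroute starts at destination port 33434 and counts up.
    for dst_port in range(33434, 33440):
        link = pick_link("198.51.100.7", "203.0.113.9", "udp", 40000, dst_port, 4)
        print(dst_port, "-> link", link)
```

Holding all five fields constant across probes keeps them on one link, which is why varying ports produces the scatter described above.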

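ICMP rate-limiting. Routers cap how fast they generate ICMP replies; a limit on the order of one reply per second is common, though the exact default varies by vendor and configuration. If you probe aggressively (traceroute -q 10 sends ten probes per hop), the router drops most replies. The drops look like loss; they're not.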
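A token bucket is one common way to implement such a limit, and simulating it shows how much apparent loss it can create. This is an illustrative sketch only: the one-reply-per-second rate and burst of one are assumptions, and `replies_sent` is a hypothetical name.

```python
def replies_sent(probe_times, replies_per_second=1.0, burst=1):
    """Simulate a token-bucket limit on ICMP replies.

    `probe_times` are arrival times in seconds, in ascending order.
    Returns one boolean per probe: True if the router answered it.
    The router forwards transit traffic regardless; only replies are limited.
    """
    tokens = float(burst)
    previous = None
    answered = []
    for t in probe_times:
        if previous is not None:
            # Refill tokens for the time elapsed, capped at the burst size.
            tokens = min(float(burst), tokens + (t - previous) * replies_per_second)
        previous = t
        if tokens >= 1.0:
            tokens -= 1.0
            answered.append(True)
        else:
            answered.append(False)
    return answered


if __name__ == "__main__":
    fast = [i * 0.05 for i in range(10)]   # ten probes within half a second
    slow = [float(i) for i in range(10)]   # one probe per second
    print("fast:", replies_sent(fast).count(True), "of 10 answered")
    print("slow:", replies_sent(slow).count(True), "of 10 answered")
```

In this model the fast burst gets a single answer while the paced probes all get one, even though the router treats forwarded traffic identically in both cases.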

Firewall drops. On Linux and most Unix systems traceroute sends UDP probes by default, and many firewalls drop them. TCP SYN probes (tcptraceroute, or traceroute -T) usually get through because they look like the start of an ordinary HTTP/HTTPS connection.

MTR — continuous traceroute

MTR runs traceroute over and over, accumulating per-hop stats:

                                  Packets               Pings
 Host                          Loss%   Snt   Last   Avg  Best  Wrst StDev
  1. 192.168.1.1                0.0%    50    1.2   1.4   1.0   3.5  0.4
  2. 96.120.10.5                0.0%    50    9.4   9.8   9.0  12.4  0.6
  3. 162.151.5.245              0.0%    50   12.4  13.0  11.9  17.5  1.0
  4. 96.108.155.5              30.0%    50   18.9  19.4  18.0  29.0  2.5
  5. 162.151.5.121              0.0%    50   18.5  19.0  18.0  21.0  0.7

Read it column by column:

  • Loss%: share of probes to that hop that got no reply. Real loss almost always shows up here AND on every hop after it, because packets dropped at one hop never reach the later ones. If only one hop shows loss and downstream hops show 0%, that's ICMP rate-limiting on that one router, not real loss.
  • Last/Avg/Best/Wrst/StDev: round-trip time from the source to that hop, in milliseconds, so values are cumulative along the path. A jump in Wrst at one hop with low StDev on subsequent hops means that ONE hop is occasionally slow to answer, not that the path beyond it is slow.

Sustained loss on hop 4 with hop 5 at 0% loss = router 4 is ICMP-rate-limiting, your traffic is fine. Sustained loss on hop 4 AND hop 5 AND hop 6 = real loss starting at hop 4.
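That rule of thumb can be expressed as a few lines of code. The sketch below assumes you have the Loss% column as a list in hop order; `interpret_loss`, the 1% threshold, and the verdict strings are illustrative choices, not anything MTR provides.

```python
def interpret_loss(loss_by_hop, threshold=1.0):
    """Apply the downstream-persistence rule to an MTR Loss% column.

    `loss_by_hop` holds one Loss% value per hop, in path order.
    Loss that does not persist on every later hop is treated as the
    router limiting its own ICMP replies, not dropping your traffic.
    """
    verdicts = []
    for i, loss in enumerate(loss_by_hop):
        if loss < threshold:
            verdicts.append("ok")
            continue
        downstream = loss_by_hop[i + 1:]
        if not downstream:
            verdicts.append("loss at final hop, ambiguous")
        elif all(later >= threshold for later in downstream):
            verdicts.append("real loss likely")
        else:
            verdicts.append("likely ICMP rate-limiting")
    return verdicts


if __name__ == "__main__":
    # Loss% column from the MTR report above.
    print(interpret_loss([0.0, 0.0, 0.0, 30.0, 0.0]))
```

For the report above, only hop 4 is flagged, and it is flagged as rate-limiting because hop 5 is clean.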

When you actually use it

You're debugging a slow connection from your laptop to api.example.com:

# Quick path discovery
mtr -r -c 100 api.example.com

# If UDP/ICMP are filtered:
sudo mtr -T -P 443 -r -c 100 api.example.com

# Show the AS number for each hop
sudo mtr --aslookup api.example.com

What you're looking for:

  1. Where does latency jump? A 5ms hop followed by a 50ms hop = the long-haul link is between those two routers.
  2. Where does loss start? Real loss persists on every downstream hop.
  3. Whose network is it? --aslookup shows the AS (Autonomous System) per hop. If hop 4 belongs to a transit ISP and hops 5+ belong to the destination's AS, the handoff between those is where you'd escalate.
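The first question, where latency jumps, is just a search for the largest step in the Avg column. Here is a minimal sketch, assuming every hop reported an average; `largest_latency_jump` is a hypothetical helper and hop numbers are 1-based as in MTR output.

```python
def largest_latency_jump(avg_ms_by_hop):
    """Find the pair of adjacent hops with the biggest increase in average RTT.

    Returns (from_hop, to_hop, delta_ms), or None if there are fewer
    than two hops. A large delta suggests a long link between the two
    hops, though a hop that is merely slow to answer can produce the same picture.
    """
    best = None
    for i in range(1, len(avg_ms_by_hop)):
        delta = avg_ms_by_hop[i] - avg_ms_by_hop[i - 1]
        if best is None or delta > best[2]:
            best = (i, i + 1, delta)
    return best


if __name__ == "__main__":
    # Avg column from the MTR report earlier in this lesson.
    from_hop, to_hop, delta = largest_latency_jump([1.4, 9.8, 13.0, 19.4, 19.0])
    print(f"largest jump: hop {from_hop} -> hop {to_hop} (+{delta:.1f} ms)")
```

Confirm the jump persists on later hops before blaming the link; if later hops are faster, the slow hop is only slow at answering probes.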

RIPE Atlas — when one vantage point isn't enough

Sometimes the problem exists only on your network, and a single vantage point can't tell you that. RIPE Atlas is a global network of probe devices (10,000+ in 2026). You can run a traceroute from any of them via its REST API:

curl -X POST https://atlas.ripe.net/api/v2/measurements/ \
  -H "Authorization: Key YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "definitions": [{"target": "api.example.com", "description": "Traceroute to api.example.com", "type": "traceroute", "protocol": "ICMP", "af": 4}],
    "probes": [{"type": "country", "value": "DE", "requested": 5}],
    "is_oneoff": true
  }'

If five probes from Germany see clean paths and your path from Berlin is broken, the problem is local to your transit. If all five show the same break, it's the destination side.

Common patterns

  • High latency on hop 1 = your local router or LAN, not the internet.
  • High latency on hop 2-3 = your ISP's first/second hop. Probably nothing you can fix.
  • High latency at a "well-known" hub (an ix.net, DE-CIX, or equinix.net hostname) = an internet exchange peering point. Probably congested.
  • High latency right before destination = the destination's transit provider or local network.
  • Cloud routing oddness: packets from Europe to AWS us-east-1 sometimes traverse Newark via Equinix and sometimes take direct fiber. RTT can flip between roughly 80 ms and 100 ms depending on which route is chosen.

Limits to remember

  • Traceroute shows the forward path only.
  • A row of asterisks doesn't mean the path is broken.
  • One probe round-trip is noise; MTR's accumulated stats are signal.
  • ICMP rate-limiting is the most common false positive.
  • Firewalls dropping UDP probes look identical to "internet broken" — switch to TCP probes.

Used carefully, traceroute is one of the few tools where you can pinpoint a problem to one router. Used carelessly, it tells confident lies.

Tools in the wild

  • traceroute (cli, free tier): Classic UDP/ICMP/TCP path discovery. -n skips DNS. -T uses TCP.
  • mtr (cli, free tier): My TraceRoute. Continuous probing with per-hop loss and latency stats.
  • tcptraceroute (cli, free tier): Sends TCP SYN probes, which traverse firewalls that drop UDP/ICMP.
  • RIPE Atlas (service, free tier): Global probe network. Run traceroutes from 10,000+ vantage points around the world.