networking · level 3

TCP

Handshake, reliability, slow start.

TCP turns an unreliable datagram network into an ordered, reliable byte stream. Every web request, database query, and API call rides on top of this machinery — understanding it is prerequisite for diagnosing latency, dropped connections, and performance cliffs.

Analogy

TCP is like shipping a novel page by page via a courier that loses roughly one envelope per hundred. Before anything ships, the two mailrooms exchange three notes confirming that each can reach the other, that the return addresses are correct, and that the starting page numbers are agreed (the handshake). Then pages go out numbered in sequence; the receiver mails back receipts saying "I've got everything through page 47, send 48 next". If a receipt never comes, the sender reprints that page from its carbon copy and resends it. The sender also starts cautiously, with only a page or two at a time, and speeds up only once the receipts confirm the route is healthy, cutting back sharply the moment a receipt is missed. Closing the connection is a polite symmetric exchange: each side says "I'm done sending" and waits for the other to acknowledge before the account is actually closed.

The three-way handshake

Before a single byte of application data moves, TCP must establish a connection. The handshake is three segments, which is one and a half round trips end to end, although the client can start sending after just one:

Client                          Server
  │                               │
  │──── SYN (seq=1000) ──────────▶│  Client picks a random initial sequence number
  │                               │
  │◀─── SYN-ACK (seq=5000,        │  Server picks its own ISN, acknowledges client's
  │             ack=1001) ────────│
  │                               │
  │──── ACK (ack=5001) ──────────▶│  Client acknowledges server's ISN
  │                               │
  │══════ data flows ═════════════│

The client waits one round-trip time (RTT) before it can send data: it may transmit as soon as the SYN-ACK arrives, together with or right after its final ACK. This is why connection pooling and keep-alive matter: an RTT of 50 ms to a geographically distant server adds 50 ms to every new connection before the first byte of HTTP is sent.
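The handshake cost can be observed directly by timing connect(). The sketch below is illustrative only; the function name connect_time_ms is made up for this example. Note that socket.create_connection also resolves host names, so passing an IP address keeps DNS time out of the measurement.

```python
import socket
import time


def connect_time_ms(host: str, port: int, timeout: float = 3.0) -> float:
    """Return how long it took to establish a TCP connection, in milliseconds.

    connect() returns once the SYN-ACK has arrived and the final ACK has
    been sent, so for an IP address this approximates one round trip.
    """
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        elapsed = time.perf_counter() - start
    return elapsed * 1000.0
```

Calling connect_time_ms against a nearby host and a distant one should show the difference that a connection pool saves on every reused connection.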

Sequence numbers

Every byte in a TCP stream has a sequence number. The receiver replies with an acknowledgement number that names the next byte it expects: an ACK of N means "I have received every byte before N, send N next." Sequence numbers allow:

  • Ordering: segments that arrive out of order are buffered and reassembled.
  • Deduplication: retransmissions of already-received bytes are silently discarded.
  • Loss detection: if the sender does not receive an ACK within a timeout, it retransmits.

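Initial sequence numbers (ISNs) are chosen unpredictably rather than starting from zero. This reduces the chance that old segments from a previous connection are misinterpreted by a new one, and makes it hard for an off-path attacker to forge segments that fall inside the valid window.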
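The ordering and deduplication roles of sequence numbers can be sketched with a toy receiver. This is a simplified model, not how any real stack is written: the Receiver class and its method names are invented for illustration, and sequence-number wraparound is ignored.

```python
class Receiver:
    """Toy cumulative-ACK receiver (illustrative, ignores wraparound)."""

    def __init__(self, initial_seq: int) -> None:
        self.next_expected = initial_seq  # sequence number of the next in-order byte
        self.delivered = bytearray()      # in-order bytes handed to the application
        self.pending = {}                 # out-of-order segments keyed by start seq

    def on_segment(self, seq: int, payload: bytes) -> int:
        """Accept one segment and return the ACK number to send back."""
        end = seq + len(payload)
        if end <= self.next_expected:
            return self.next_expected     # duplicate: already delivered, discard
        if seq > self.next_expected:
            self.pending[seq] = payload   # gap before it: buffer for later
            return self.next_expected     # this is a duplicate ACK
        # In order (possibly overlapping old data): deliver only the new part.
        self.delivered += payload[self.next_expected - seq:]
        self.next_expected = end
        # Drain buffered segments that have become contiguous.
        while self.pending:
            start = min(self.pending)
            if start > self.next_expected:
                break
            data = self.pending.pop(start)
            if start + len(data) > self.next_expected:
                self.delivered += data[self.next_expected - start:]
                self.next_expected = start + len(data)
        return self.next_expected
```

Feeding it segments at 1001, 1007, 1001 again, and then 1004 (three bytes each) yields ACKs of 1004, 1004, 1004, and 1010: the out-of-order segment is buffered, the repeat is dropped, and the final segment fills the gap.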

Retransmission

TCP detects loss in two ways:

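Timeout: If no ACK arrives within the retransmission timeout (RTO), the sender retransmits. The RTO is derived from measured round-trip times; Linux enforces a floor of about 200 ms, and RFC 6298 specifies 1 second before any RTT sample exists. It doubles on each successive failure (exponential backoff). This is the worst case: a single missed segment can stall the connection for hundreds of milliseconds or more.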
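To see how quickly exponential backoff adds up, here is a small sketch. The function name rto_schedule, the 200 ms starting value, and the 60-second cap are assumptions chosen for illustration; real stacks compute the starting RTO from RTT samples.

```python
def rto_schedule(initial_rto_ms: float, attempts: int,
                 max_rto_ms: float = 60_000.0) -> list:
    """Timeout used before each successive retransmission of one segment."""
    schedule = []
    rto = initial_rto_ms
    for _ in range(attempts):
        schedule.append(min(rto, max_rto_ms))  # cap is an assumed upper bound
        rto *= 2                               # exponential backoff
    return schedule
```

With a 200 ms start, rto_schedule(200, 5) gives 200, 400, 800, 1600, and 3200 ms, so five consecutive losses of the same segment stall the stream for 6.2 seconds in total.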

Fast retransmit: Each segment that arrives after a gap makes the receiver repeat its last ACK. When the sender sees three duplicate ACKs, it knows data past the missing segment has arrived, infers the loss, and retransmits immediately without waiting for the timeout. This is the common case on modern networks.
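The sender-side rule can be sketched as a counter over incoming ACK numbers. DupAckDetector is a hypothetical name, and the model is simplified: real implementations also check that the duplicate ACK carries no data and no window update.

```python
class DupAckDetector:
    """Counts repeated ACK numbers and flags the third duplicate."""

    DUP_ACK_THRESHOLD = 3

    def __init__(self) -> None:
        self.last_ack = None
        self.dup_count = 0

    def on_ack(self, ack: int) -> bool:
        """Return True when this ACK should trigger a fast retransmit."""
        if ack == self.last_ack:
            self.dup_count += 1
            # Fire exactly once, on the third duplicate.
            return self.dup_count == self.DUP_ACK_THRESHOLD
        self.last_ack = ack   # new data acknowledged: reset the counter
        self.dup_count = 0
        return False
```

An ACK stream of 1004, 1004, 1004, 1004 triggers on the fourth ACK, which is the third duplicate of the first.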

Slow start and congestion control

TCP does not know the capacity of the network path. It probes by starting slow:

  1. The sender begins with a small congestion window of a few segments.
  2. Every ACK increases the window, which doubles it each RTT (slow start).
  3. When the window reaches the slow-start threshold (ssthresh), growth slows to additive increase of about one segment per RTT.
  4. On packet loss, the window is cut sharply (multiplicative decrease); classic TCP Reno halves it.
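The four steps above can be sketched as a per-RTT simulation. This is a coarse model under stated assumptions: the function name simulate_cwnd is invented, the window is counted in whole segments, the initial window of 1 is chosen for readability, and loss handling follows a Reno-style halving rather than any specific modern algorithm.

```python
def simulate_cwnd(rtts: int, ssthresh: int, loss_at: set,
                  initial_cwnd: int = 1) -> list:
    """Return the congestion window (in segments) at the start of each RTT."""
    cwnd = initial_cwnd
    history = []
    for rtt in range(rtts):
        history.append(cwnd)
        if rtt in loss_at:
            ssthresh = max(cwnd // 2, 2)     # multiplicative decrease
            cwnd = ssthresh                  # Reno-style halving (assumption)
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)   # slow start: double per RTT
        else:
            cwnd += 1                        # additive increase
    return history
```

For example, simulate_cwnd(10, ssthresh=16, loss_at={7}) returns [1, 2, 4, 8, 16, 17, 18, 19, 9, 10]: exponential growth up to the threshold, linear growth after it, and a halving at the loss.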

The window tells the sender how much unacknowledged data it can have in flight. The send rate is therefore bounded by roughly min(congestion window, receive window) / RTT. A receiver with a nearly full buffer advertises a small receive window, slowing the sender.
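The window-over-RTT bound is easy to evaluate numerically. The helper below is a sketch with an invented name, max_throughput_bps; it gives an upper limit only and ignores loss, header overhead, and link capacity.

```python
def max_throughput_bps(cwnd_bytes: int, rwnd_bytes: int,
                       rtt_seconds: float) -> float:
    """Upper bound on throughput in bits per second for one connection."""
    window = min(cwnd_bytes, rwnd_bytes)  # the smaller window limits data in flight
    return window * 8 / rtt_seconds
```

With a 65,535-byte receive window (the maximum without window scaling) and a 50 ms RTT, the bound is about 10.5 Mbit/s, however fast the underlying link is.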

The four-way close

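Closing a TCP connection is symmetric and takes four segments: one side sends a FIN and the other ACKs it; later the second side sends its own FIN, which the first side ACKs. Because the two directions close independently, TCP allows half-close: one side can finish sending while the other still has data to send.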
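Half-close is visible from application code through shutdown(). The following loopback sketch uses invented helper names (recv_until_eof, half_close_demo) and runs both ends in one process purely for illustration.

```python
import socket


def recv_until_eof(sock: socket.socket) -> bytes:
    """Read until the peer's FIN arrives, which recv() reports as b""."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            return b"".join(chunks)
        chunks.append(chunk)


def half_close_demo() -> tuple:
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname(), timeout=3)
    server, _ = listener.accept()
    server.settimeout(3)
    try:
        client.sendall(b"request")
        client.shutdown(socket.SHUT_WR)      # client FIN: "I'm done sending"
        request = recv_until_eof(server)     # server sees the data, then EOF
        server.sendall(b"response")          # server-to-client direction still open
        server.shutdown(socket.SHUT_WR)      # server sends its own FIN
        response = recv_until_eof(client)
        return request, response
    finally:
        client.close()
        server.close()
        listener.close()
```

The client can still read the response after it has shut down its sending direction, which is exactly what the independent FINs allow.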

The TIME_WAIT state (lasting 2 × MSL, typically 60–120 seconds) is entered by the side that closes first. It prevents confusion if delayed segments from the closed connection arrive on a new connection that reuses the same 4-tuple (src-ip, src-port, dst-ip, dst-port).

What engineers see in practice

Connection refused: The destination host is up but nothing is listening on that port. The OS sends a TCP RST immediately.

Connection timed out: No SYN-ACK arrived. The host is unreachable, a firewall is silently dropping packets, or the route is broken.
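The two failure modes above surface as different exceptions in most socket APIs. A minimal Python sketch follows; the probe function and its return strings are made up for this example.

```python
import socket


def probe(host: str, port: int, timeout: float = 2.0) -> str:
    """Classify the outcome of a single TCP connection attempt."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"      # an RST came back: host up, nothing listening
    except socket.timeout:
        return "timed out"    # no SYN-ACK within the timeout
    except OSError as exc:
        return f"error: {exc}"  # e.g. no route to host, name resolution failure
```

A "refused" result returns almost instantly, while "timed out" takes the full timeout, which is often the quickest way to tell the two apart by feel.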

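ESTABLISHED sockets consuming no CPU: This is normal. An idle TCP connection exchanges no packets at all unless keep-alive is enabled, in which case the kernel sends occasional probes to confirm the peer is still there. Check with ss -tp or netstat -an.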
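Keep-alive is typically opt-in per socket. The sketch below shows one way to enable it from Python; the function name enable_keepalive and the timing values are illustrative assumptions, and the three tuning options are guarded because they are not available under these names on every platform.

```python
import socket


def enable_keepalive(sock: socket.socket, idle_s: int = 60,
                     interval_s: int = 10, probes: int = 5) -> None:
    """Turn on TCP keep-alive and, where supported, tune its timing."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Platform-specific knobs (present on Linux); skipped if unavailable.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
```

With these example values, a dead peer would be detected after roughly idle_s + interval_s × probes seconds of silence.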

High retransmit rate: Indicates packet loss. Check your cloud provider's network metrics or netstat -s | grep retransmit.

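Slow start after a pause: A connection that sits idle for longer than roughly one retransmission timeout has its congestion window reset (on Linux this is controlled by net.ipv4.tcp_slow_start_after_idle), so the first burst after a pause is slow. Brand-new connections also begin in slow start, which is one reason HTTP/2 multiplexing over a single long-lived, busy connection can outperform many short HTTP/1 connections.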

Tools in the wild

  • Wireshark: The packet analyzer. Visualize SYN/SYN-ACK/ACK, retransmits, window scaling, and RSTs.
  • tcpdump: Headless packet capture; pair with Wireshark for offline analysis of captures taken on production hosts.
  • ss: Modern netstat replacement; shows TCP states, queues, retransmits, and congestion window per socket.
  • iperf3: Active TCP/UDP throughput tester between two hosts; great for diagnosing buffer issues.
  • mtr: Combined traceroute + ping with rolling packet-loss stats per hop.
  • tc / netem: Linux traffic shaper; injects latency, loss, and reordering to test TCP behaviour locally.