TCP/IP Protocols – Part 1

This series is mostly for personal reference as I go through W. Richard Stevens’ textbook “TCP/IP Illustrated, Volume 1: The Protocols”. Some of the notes will appear random, and I will most likely skip over large portions that are either very simple or uninteresting.

Seeing how protocols operate in varying circumstances provides a greater understanding of how they work and why certain design decisions were made. This book covers Ping, Telnet, Rlogin, FTP, SMTP, X, Traceroute, DNS, TFTP, BOOTP, SNMP, NFS, and RPC. The majority of those protocols are implemented via TCP and/or UDP on top of IP. The only exception is Ping, which uses ICMP and does not use TCP or UDP. This is a common trick interview question: “In which layer of the OSI model does ICMP reside?” Many will say Transport, but ICMP is really part of the Network layer. ICMP and IGMP messages are encapsulated directly in IP datagrams. A similar trick question could involve ARP and RARP, which are actually part of the Link layer, below IP.
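
To make the “Ping uses ICMP, not TCP or UDP” point concrete, here is a minimal sketch of the ICMP echo request that ping sends. The function names are my own, not from the book. Note that the header has a type, code, checksum, identifier, and sequence number, but no port fields at all.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words, as used by IP, ICMP, TCP and UDP."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(ident: int, seq: int, payload: bytes = b"") -> bytes:
    """Build an ICMP echo request (type 8, code 0). Hypothetical helper name."""
    icmp_echo_request = 8
    # First pass with a zero checksum field, then fill in the real checksum.
    header = struct.pack("!BBHHH", icmp_echo_request, 0, 0, ident, seq)
    csum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", icmp_echo_request, 0, csum, ident, seq) + payload


packet = build_echo_request(ident=0x1234, seq=1, payload=b"hello")
print(len(packet))                # 8-byte ICMP header plus 5 payload bytes
print(internet_checksum(packet))  # a packet with a valid checksum sums to 0
```

Actually sending this requires a raw socket (and usually root), so the sketch stops at building the bytes.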

Chapter 1 – Introduction

The TCP/IP protocol suite has grown far beyond what was originally expected of it. It started as a government-funded research project. There are four major layers to the suite: Link, Network, Transport, and Application. The OSI (“Open Systems Interconnection”) model further expands these layers to offer more granularity. The Link layer contains the device driver and network interface card. The Network layer handles routing. The Transport layer provides data flow control, both reliable and unreliable. The Application layer handles the details of the particular application being used. The layers below Application are the supporting framework; they are application-agnostic. The unit name used in networking is the octet. While today a byte is 8 bits (one octet), that was not always the case. The early development for TCP/IP was done on a DEC-10 machine (aka PDP-10), which didn’t use 8-bit bytes.

Encapsulation

A physical property of an Ethernet frame is that the size of its data must be between 46 and 1500 bytes. Some internet routers allow jumbo frames, but not all. Sending a jumbo frame to an incapable router could result in fragmentation, or the packet may just be dropped (need to confirm).
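
A small sketch of the 46 to 1500 byte rule, assuming the sender pads short payloads with zero bytes. The 46 comes from the 64-byte minimum Ethernet frame minus the 14-byte header and the 4-byte CRC. The function name is made up for illustration.

```python
ETH_MIN_PAYLOAD = 46    # 64-byte minimum frame - 14-byte header - 4-byte CRC
ETH_MAX_PAYLOAD = 1500  # standard (non-jumbo) Ethernet limit


def fit_ethernet_payload(data: bytes) -> bytes:
    """Pad short payloads up to the minimum; refuse anything over the maximum."""
    if len(data) > ETH_MAX_PAYLOAD:
        raise ValueError(f"{len(data)} bytes does not fit in one standard frame")
    # ljust appends zero bytes only when data is shorter than the minimum
    return data.ljust(ETH_MIN_PAYLOAD, b"\x00")


print(len(fit_ethernet_payload(b"x" * 10)))   # padded up to 46
print(len(fit_ethernet_payload(b"x" * 100)))  # left alone at 100
```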

RFC

RFCs (“Requests for Comments”) are the official standards of the internet community. They are living design documents. The Assigned Numbers RFC specifies all the magic numbers and constants that are used in internet protocols. The Router Requirements RFC (RFC 1812) specifies the unique requirements of routers. There are some interesting sections in that particular RFC, for example the “robustness principle.” This particular RFC was last updated in 1995.

1.3.2 Robustness Principle

   At every layer of the protocols, there is a general rule (from
   [TRANS:2] by Jon Postel) whose application can lead to enormous
   benefits in robustness and interoperability:

                      Be conservative in what you do,
                be liberal in what you accept from others.

   Software should be written to deal with every conceivable error, no
   matter how unlikely.  Eventually a packet will come in with that
   particular combination of errors and attributes, and unless the
   software is prepared, chaos can ensue.  It is best to assume that the
   network is filled with malevolent entities that will send packets
   designed to have the worst possible effect.  This assumption will
   lead to suitably protective design.  The most serious problems in the
   Internet have been caused by unforeseen mechanisms triggered by low
   probability events; mere human malice would never have taken so
   devious a course!

   Adaptability to change must be designed into all levels of router
   software.  As a simple example, consider a protocol specification
   that contains an enumeration of values for a particular header field
   - e.g., a type field, a port number, or an error code; this
   enumeration must be assumed to be incomplete.  If the protocol
   specification defines four possible error codes, the software must
   not break when a fifth code is defined.  An undefined code might be
   logged, but it must not cause a failure.
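
The “fifth error code” advice above could look something like this in practice. This is only a sketch: the enum and function name are hypothetical, and the four values are just an illustrative set of known codes.

```python
import logging
from enum import IntEnum


class ErrorCode(IntEnum):
    """The four codes this (hypothetical) parser knows about today."""
    NET_UNREACHABLE = 0
    HOST_UNREACHABLE = 1
    PROTOCOL_UNREACHABLE = 2
    PORT_UNREACHABLE = 3


def describe_code(raw: int) -> str:
    """Name a known code; log and carry on for anything undefined."""
    try:
        return ErrorCode(raw).name
    except ValueError:
        # An undefined code is logged, but it must not cause a failure.
        logging.warning("undefined error code %d, continuing", raw)
        return f"UNKNOWN({raw})"


print(describe_code(3))
print(describe_code(4))  # a "fifth code" that was defined after we shipped
```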

Another interesting RFC to check out is RFC 1000, the “Request For Comments Reference Guide”, which provides a historical account by categorizing and summarizing RFCs 1 through 999, issued between 1969 and 1987.

Chapter 2 – Link Layer

The link layer uses 48-bit hardware addresses, as opposed to the 32-bit addresses used by IPv4. PPP (Point-To-Point Protocol) is still used today; it fixes the shortcomings of the older serial protocol SLIP (Serial Line IP). Each PPP frame begins and ends with a flag byte of 0x7e. The opening flag is followed by an address byte of 0xff, then a control byte of 0x03. Most of these network protocols are just tag-length-value types describing the size of chunks, and/or they have fixed offsets for protocol fields. This design makes it simple and fast to parse the bytes on the wire. Thinking about it, I’m not sure it could have been designed any other way.
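
Here is a rough sketch of what fixed-offset parsing looks like for the PPP frame described above. It assumes the standard layout after the control byte (a 2-byte protocol field, the data, a 2-byte CRC, then the closing flag), ignores byte-stuffing of escaped characters, and does not verify the CRC. The function name and the sample frame are my own.

```python
import struct

PPP_FLAG, PPP_ADDRESS, PPP_CONTROL = 0x7E, 0xFF, 0x03
PPP_PROTO_IP = 0x0021  # protocol value for an IP datagram


def parse_ppp_frame(frame: bytes):
    """Return (protocol, information, raw_crc_bytes) for an unescaped PPP frame."""
    if len(frame) < 8:  # flag + address + control + protocol(2) + crc(2) + flag
        raise ValueError("frame too short")
    # Every header field lives at a fixed offset, so one unpack call is enough.
    flag, address, control, protocol = struct.unpack_from("!BBBH", frame, 0)
    if (flag, address, control) != (PPP_FLAG, PPP_ADDRESS, PPP_CONTROL):
        raise ValueError("unexpected flag/address/control bytes")
    if frame[-1] != PPP_FLAG:
        raise ValueError("missing closing flag")
    information = frame[5:-3]  # everything between the protocol field and the CRC
    raw_crc = frame[-3:-1]     # left undecoded; this sketch does not check it
    return protocol, information, raw_crc


# Made-up frame: the two CRC bytes are placeholders, not a valid checksum.
sample = bytes([0x7E, 0xFF, 0x03, 0x00, 0x21]) + b"IPDATA" + b"\x00\x00" + b"\x7e"
protocol, information, raw_crc = parse_ppp_frame(sample)
print(hex(protocol), information)
```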

MTU

The maximum transmission unit (“MTU”) limits the number of bytes that can be carried in a single Ethernet frame. This number is 1500 bytes. IP will fragment packets larger than this number. Note: this book is quite old, and IP today probably handles jumbo frames differently. Use netstat to see the MTU of a specific interface:

user@ubuntu:~/tcpip_journey$ netstat -in
Kernel Interface table
Iface      MTU Met    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
ens33     1500   0   318426      0      0      0    67091      0      0      0 BMRU
lo       65536   0    14785      0      0      0    14785      0      0      0 LRU

You can see here that my Ethernet interface has an MTU of 1500 bytes, while the loopback interface allows 65536 bytes. This is probably because loopback just memory-maps the “sent” data and passes a pointer around (just a guess).
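
To get a feel for what “IP will fragment packets larger than the MTU” means in numbers, here is a sketch of the arithmetic. It assumes a 20-byte IP header with no options, and relies on the fact that fragment offsets are stored in units of 8 bytes, so every fragment except the last must carry a multiple of 8 bytes. The function name is made up.

```python
def fragment_sizes(total_length: int, mtu: int = 1500, header_len: int = 20):
    """List (offset_in_8_byte_units, payload_bytes, more_fragments) per fragment."""
    payload = total_length - header_len
    # Largest payload per fragment, rounded down to a multiple of 8 bytes.
    max_chunk = (mtu - header_len) // 8 * 8
    fragments = []
    offset = 0
    while offset < payload:
        size = min(max_chunk, payload - offset)
        more_fragments = offset + size < payload
        fragments.append((offset // 8, size, more_fragments))
        offset += size
    return fragments


# A 4000-byte datagram (20-byte header + 3980 bytes of data) over a 1500-byte MTU.
for fragment in fragment_sizes(4000):
    print(fragment)
```

With these inputs the datagram splits into two 1480-byte fragments and a final 1020-byte one, each carried behind its own 20-byte IP header.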

Chapter 3 – IP Routing

The options field in an IP datagram is a variable length list of optional information. The options defined in 1995 were:

security and handling restrictions for military applications

record route - each traversed router would record its IP address into the datagram

timestamp - each router records its IP and timestamp into the datagram

loose source routing - specify a list of IP addresses that must be traversed by the datagram

strict source routing - specify a list of IP addresses that the datagram can traverse. All other addresses are not allowed.
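
Since the options field is a variable-length list, a parser has to walk it one option at a time. The sketch below assumes the usual encoding: a type byte, then a length byte that counts the whole option, with the two single-byte exceptions of end-of-list (0) and no-op (1). The type numbers in the table are the assigned values for the options listed above; the function name is my own.

```python
OPTION_NAMES = {
    7: "record route",
    68: "timestamp",
    130: "security",
    131: "loose source routing",
    137: "strict source routing",
}


def walk_ip_options(options: bytes):
    """Return a list of (name, option_data) tuples from a raw options field."""
    found = []
    i = 0
    while i < len(options):
        opt_type = options[i]
        if opt_type == 0:  # end of option list
            break
        if opt_type == 1:  # no-op, used for padding
            i += 1
            continue
        if i + 1 >= len(options):
            raise ValueError("truncated option")
        length = options[i + 1]  # counts the type and length bytes too
        if length < 2 or i + length > len(options):
            raise ValueError("bad option length")
        name = OPTION_NAMES.get(opt_type, f"unknown({opt_type})")
        found.append((name, options[i + 2 : i + length]))
        i += length
    return found


# no-op, then a record route option with one filled slot, then end-of-list
raw = bytes([1, 7, 7, 8, 192, 168, 1, 198, 0])
print(walk_ip_options(raw))
```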

Record Route looks very interesting to me. Some researchers at Princeton measured real-world support for the RR option in 2017. They found that a large share of the addresses that answer a plain ping also answer a ping with Record Route set (75% according to the abstract). Note that Record Route has a strict nine-hop limit. The paper’s abstract:

The IPv4 Record Route (RR) Option instructs routers to record their
IP addresses in a packet. RR is subject to a nine hop limit and,
traditionally, inconsistent support from routers. Recent changes in
interdomain connectivity—the so-called “flattening Internet”—and
new best practices for how routers should handle RR packets suggest
that now is a good time to reassess the potential of the RR Option.
We quantify the current utility of RR by issuing RR measurements
from PlanetLab and M-Lab to every advertised BGP prefix. We
find that 75% of addresses that respond to ping without RR also
respond to ping with RR, and 66% of these RR-responsive addresses
are within the nine hop limit of at least one vantage point. These
numbers suggest the RR Option is a useful measurement primitive
on today’s Internet.

Finally, to test this yourself, check out the man page for iputils ping, then use the -R switch to enable record-route.

-R     ping  only.   Record  route.  Includes the RECORD_ROUTE option in the ECHO_REQUEST packet and
       displays the route buffer on returned packets.  Note that the IP header is only large  enough
       for nine such routes.  Many hosts ignore or discard this option.
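
The nine-route limit mentioned in the man page falls out of simple arithmetic: the IP header length field is 4 bits counting 32-bit words, so the header tops out at 60 bytes, 20 of which are the fixed header. The sketch below works that out and decodes a record route option, assuming the standard layout of type, length, and a 1-based pointer to the next free slot. The function name and sample bytes are made up.

```python
import ipaddress

MAX_IP_HEADER = 15 * 4  # 4-bit header length field, counted in 32-bit words
FIXED_IP_HEADER = 20    # header without any options
RR_OVERHEAD = 3         # option type, length, and pointer bytes

max_slots = (MAX_IP_HEADER - FIXED_IP_HEADER - RR_OVERHEAD) // 4
print(max_slots)  # room for this many 4-byte IPv4 addresses


def decode_record_route(option: bytes):
    """Return the addresses recorded so far in a record route option."""
    opt_type, pointer = option[0], option[2]
    if opt_type != 7:
        raise ValueError("not a record route option")
    # The pointer is 1-based and names the next free slot; 4 means none used yet.
    filled = option[3 : pointer - 1]
    return [str(ipaddress.IPv4Address(filled[i : i + 4]))
            for i in range(0, len(filled), 4)]


# A 39-byte option (3 + 9 * 4) with two of its nine slots filled in.
sample_rr = bytes([7, 39, 12]) + bytes([192, 168, 1, 198, 192, 168, 1, 1]) + bytes(28)
print(decode_record_route(sample_rr))
```

The 39-byte option plus one byte of padding fills the 40 bytes available for options, which also accounts for the 124 in ping’s “56(124) bytes of data”: 56 bytes of data, 8 bytes of ICMP header, and a full 60-byte IP header.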

My LAN router happened to honor the option, with the following results. I’m running the ping from a VM:

user@ubuntu:~/tcpip_journey$ ping -R 192.168.1.1
PING 192.168.1.1 (192.168.1.1) 56(124) bytes of data.
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=24.7 ms
RR: 	192.168.1.198
  192.168.1.1
  192.168.1.1
  192.168.1.198

64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.594 ms	(same route)
64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=0.547 ms	(same route)

 
