|
"..The
means can be likened to a seed, the end to a tree, and there is just
the same inviolable connection between the means and the end as there
is between the seed and the tree. They say: “Means are, after all,
just means.” I would say: “Means are, after all, everything.”
As the means, so the end.........If we take care of the means, we are
bound to reach the end sooner or later..."
- Mahatma Gandhi
There is a lot of information scattered across the Internet about the
different utilities used for network troubleshooting. Yet, few articles
detail the background processes taking place while using these
utilities. To give a complete and precise idea of how tracing and
troubleshooting work in a network, this article analyzes the internal
working of three network utilities and checks what makes each of them unique.
Before starting with these utilities, let us have a look at one of the most
important protocols used by them - ICMP.
ICMP - Internet Control Message Protocol
Internet Control Message Protocol (ICMP), defined in RFC 792, is a part of
the Internet protocol suite. ICMP is used for sending error messages
like "destination could not be reached", "time to live exceeded" etc. An ICMP
packet has an IP header followed by the ICMP message. The first 32 bits of
the ICMP message contain the Type of the ICMP packet (8 bits), the Code
(8 bits) and the Checksum (16 bits). The Data field contains the payload. Its
size varies depending on the type of the ICMP message.
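The layout described above can be sketched in a few lines of Python. This is only an illustration: the function name parse_icmp_header and the sample bytes are made up for this example, and the input is assumed to be the ICMP message with the IP header already stripped.

```python
import struct

def parse_icmp_header(message: bytes):
    """Split an ICMP message into (type, code, checksum, data).

    Assumes the IP header has already been removed (illustrative helper).
    """
    if len(message) < 4:
        raise ValueError("an ICMP message needs at least 4 bytes")
    # "!BBH": network byte order, 1-byte Type, 1-byte Code, 2-byte Checksum
    icmp_type, code, checksum = struct.unpack("!BBH", message[:4])
    return icmp_type, code, checksum, message[4:]

# Hypothetical sample: Type 3 (Destination Unreachable), Code 1 (Host Unreachable).
# The checksum bytes here are arbitrary and not a valid checksum.
sample = bytes([3, 1, 0xAB, 0xCD]) + b"payload"
print(parse_icmp_header(sample))
```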
The Type of the ICMP packet indicates its function like Destination
Unreachable, Time Exceeded, Echo etc. The Code is a subtype which indicates
further details related to the parent Type like Net Unreachable, Host
Unreachable etc. The commonly seen Types and Codes are mentioned below:
Type | Name                    | Code
-----+-------------------------+---------------------------------------------
0    | Echo Reply              | 0   No Code
3    | Destination Unreachable | 0   Net Unreachable
     |                         | 1   Host Unreachable
     |                         | 2   Protocol Unreachable
     |                         | 3   Port Unreachable
     |                         | 4   Fragmentation Needed and Don't Fragment was Set
     |                         | 5   Source Route Failed
     |                         | 6   Destination Network Unknown
     |                         | 7   Destination Host Unknown
     |                         | 8   Source Host Isolated
     |                         | 9   Communication with Destination Network is Administratively Prohibited
     |                         | 10  Communication with Destination Host is Administratively Prohibited
     |                         | 11  Destination Network Unreachable for Type of Service
     |                         | 12  Destination Host Unreachable for Type of Service
     |                         | 13  Communication Administratively Prohibited
4    | Source Quench           | 0   No Code
5    | Redirect                | 0   Redirect Datagram for the Network (or subnet)
     |                         | 1   Redirect Datagram for the Host
     |                         | 2   Redirect Datagram for the Type of Service and Network
     |                         | 3   Redirect Datagram for the Type of Service and Host
8    | Echo                    | 0   No Code
11   | Time Exceeded           | 0   Time to Live exceeded in Transit
     |                         | 1   Fragment Reassembly Time Exceeded
The "Checksum" field in the ICMP message is used by the receiving host to
verify the integrity of the incoming ICMP packet. The Checksum is the
16-bit one's complement of the one's complement sum of the complete
ICMP message (starting from the Type field to the end of the Data field),
computed with the Checksum field itself set to zero.
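As a minimal sketch of this calculation (the function names icmp_checksum and is_valid are chosen for this example, not taken from any library), the message is summed as 16-bit words, the carries are folded back in, and the result is inverted. A receiver can run the same sum over the whole message, Checksum field included, and expect zero.

```python
def icmp_checksum(message: bytes) -> int:
    """16-bit one's complement of the one's complement sum of the message."""
    if len(message) % 2:
        message += b"\x00"              # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(message), 2):
        total += (message[i] << 8) | message[i + 1]   # big-endian 16-bit word
    while total >> 16:                  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def is_valid(message: bytes) -> bool:
    """A message whose Checksum field is already filled in sums to zero."""
    return icmp_checksum(message) == 0

# Echo request header (Type 8, Code 0) with the Checksum field zeroed.
header = bytes([8, 0, 0, 0])
value = icmp_checksum(header)
filled = bytes([8, 0, value >> 8, value & 0xFF])
print(hex(value), is_valid(filled))
```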
The utilities being discussed below use different ICMP
messages for communication. Let us look into each of them in detail.
ping
The "ping" utility is named after the sound of the sonar used to locate
objects.
It is the basic tool for testing connectivity between two machines running
TCP/IP.
The ping utility sends out ICMP Type 8 Code 0 echo request packets. The
sequence number increases by 1 with every packet, but all of them carry
the same identifier. If the other host is reachable and answers, an ICMP
Type 0 Code 0 echo reply packet carrying the same identifier is received
for each request.
A judgment on whether the link is reliable or not can be made by
checking if all the packets are received back in sequence.
The fields in an ICMP packet are shown below:
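As a rough sketch of the echo request fields described above (Type, Code, Checksum, Identifier, Sequence Number, then data), the following Python builds a few requests that share one identifier and carry increasing sequence numbers. The helper names, the identifier 0x1234 and the 56 zero bytes of payload are assumptions made for this example; nothing is sent on the network.

```python
import struct

ECHO_REQUEST_TYPE = 8
ECHO_REQUEST_CODE = 0

def checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_echo_request(identifier: int, sequence: int, payload: bytes) -> bytes:
    """Type, Code, Checksum, Identifier, Sequence Number, then the payload."""
    # First pass with a zero checksum, second pass with the real value.
    header = struct.pack("!BBHHH", ECHO_REQUEST_TYPE, ECHO_REQUEST_CODE,
                         0, identifier, sequence)
    value = checksum(header + payload)
    header = struct.pack("!BBHHH", ECHO_REQUEST_TYPE, ECHO_REQUEST_CODE,
                         value, identifier, sequence)
    return header + payload

payload = bytes(56)                      # assumed payload: 56 zero bytes
packets = [build_echo_request(0x1234, seq, payload) for seq in range(3)]
for packet in packets:
    icmp_type, code, csum, ident, seq = struct.unpack("!BBHHH", packet[:8])
    print(icmp_type, code, hex(csum), hex(ident), seq, len(packet))
```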
The following is a ping
session which I did to google.com from my console.
chacko@server:~$ping google.com
PING google.com (63.233.167.99): 56 data bytes
64 bytes from 63.233.167.99: icmp_seq=0 ttl=247 time=56.9 ms
64 bytes from 63.233.167.99: icmp_seq=1 ttl=247 time=57.2 ms
64 bytes from 63.233.167.99: icmp_seq=2 ttl=247 time=57.0 ms
64 bytes from 63.233.167.99: icmp_seq=3 ttl=247 time=56.8 ms
64 bytes from 63.233.167.99: icmp_seq=4 ttl=247 time=57.0 ms
64 bytes from 63.233.167.99: icmp_seq=5 ttl=247 time=56.9 ms
64 bytes from 63.233.167.99: icmp_seq=6 ttl=247 time=56.6 ms
64 bytes from 63.233.167.99: icmp_seq=7 ttl=247 time=56.7 ms
64 bytes from 63.233.167.99: icmp_seq=8 ttl=247 time=56.5 ms
64 bytes from 63.233.167.99: icmp_seq=9 ttl=247 time=57.0 ms
--- google.com ping statistics ---
10 packets transmitted, 10 packets received, 0% packet loss
round-trip min/avg/max = 56.5/56.8/57.2 ms
Fields in the ping output:
The ping utility resolved the hostname google.com to the IP 63.233.167.99.
The next field in the output shows the number of data bytes to be sent,
which is 56. Combined with the 8 bytes of ICMP header, this gives the
64 bytes shown at the beginning of each reply line. The
Sequence Number of each request is denoted by the icmp_seq field in the ping
output, which gets incremented by 1 for every request. The "ttl" or "Time To
Live" field in the Internet Protocol (IP) header specifies how many more hops
a packet can travel before being discarded.
The ping program shows the ttl value of the reply packet as it arrives from
the remote host. Remote systems set their own initial TTL value in the reply
(values can be 255, 128, 64, 60 etc). The value seen above is that initial
value minus the number of hops the reply crossed on its way back. The time
shown in milliseconds is the round trip time or round trip delay time (RTT).
RTT is the time required for the echo request to reach the target and for
the echo reply to return to the sender.
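The ttl arithmetic can be illustrated with a small guess-based helper. It assumes the remote host started from one of a few common initial values (64, 128 or 255 here, which is an assumption and not something ping reports) and that each hop on the reply's path lowered the TTL by one. The function name is made up for this example, and the result is only an estimate.

```python
COMMON_INITIAL_TTLS = (64, 128, 255)     # assumed candidates, smallest first

def estimate_reply_hops(observed_ttl: int):
    """Guess (initial_ttl, hops) from the ttl shown in a ping reply line."""
    if not 0 < observed_ttl <= 255:
        raise ValueError("ttl must be between 1 and 255")
    for initial in COMMON_INITIAL_TTLS:
        if observed_ttl <= initial:
            # Each router on the reply's path decremented the TTL once.
            return initial, initial - observed_ttl
    raise AssertionError("unreachable: 255 is the largest possible TTL")

# ttl=247 is the value seen in the ping session above.
print(estimate_reply_hops(247))
```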
At the end of the output, summary statistics are displayed: the number of
packets transmitted and received, the packet loss percentage and the
minimum/average/maximum RTT.
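These summary figures are simple to reproduce. The sketch below recomputes them from the RTT values of the session above; the function name ping_statistics is an assumption for this example, and the average may differ in the last digit from what a given ping implementation prints, since rounding rules vary.

```python
def ping_statistics(sent, rtts_ms):
    """Summarise a ping run from the number of requests sent and the RTTs
    (in milliseconds) of the replies that actually came back."""
    received = len(rtts_ms)
    stats = {
        "transmitted": sent,
        "received": received,
        "loss_pct": 100.0 * (sent - received) / sent if sent else 0.0,
        "min": None, "avg": None, "max": None,
    }
    if rtts_ms:                          # no RTT figures if nothing came back
        stats["min"] = min(rtts_ms)
        stats["avg"] = sum(rtts_ms) / received
        stats["max"] = max(rtts_ms)
    return stats

# RTTs copied from the ping session shown earlier in this article.
rtts = [56.9, 57.2, 57.0, 56.8, 57.0, 56.9, 56.6, 56.7, 56.5, 57.0]
print(ping_statistics(10, rtts))
```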
traceroute
traceroute (tracert in Windows) prints the route which packets take
in a TCP/IP network on their way to the destination.
The command traceroute hostname sends three UDP packets having a TTL
value of 1. On arrival of the packets at the closest router, the router
decreases the TTL value by one, thus making it 0. A router that finds the
TTL of a packet at 0 discards it and responds by sending an ICMP "time
exceeded" packet (Type 11 Code 0, "time to live exceeded in transit"). The
traceroute utility notes the IP address of the router that sends back the
3 ICMP packets and calculates the time taken to receive each of them. It
then sends out three more UDP packets, this time with a TTL value of 2.
These packets expire at the router that is 2 hops away from the sender,
which returns its own "time exceeded" messages. After noting them,
traceroute sends out three more UDP packets with a TTL value of 3. This
process continues until the probes either reach the final destination or
the default maximum of 30 hops has been tried. Since these datagrams try to
access an invalid port at the destination host, the message returned from
there will be ICMP Port Unreachable (Type 3 Code 3). This event signals the
traceroute program that it is finished.
The outgoing packets from traceroute are sent towards the destination using
UDP at very high port numbers, typically in the range of 32,768 and higher.
This is because no one usually runs UDP services on those ports, so when a
packet finally reaches the destination, the host answers with ICMP Port
Unreachable and traceroute knows that the destination has been reached.
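The TTL loop described above can be modelled without touching the network. The following is a simulation only, assuming a hypothetical path given as a list of router names ending with the destination; a real traceroute needs raw sockets and measures timings, which this sketch leaves out.

```python
TIME_EXCEEDED = "time exceeded (Type 11 Code 0)"
PORT_UNREACHABLE = "port unreachable (Type 3 Code 3)"

def send_probe(path, ttl):
    """Walk one probe along the path; return (responder, ICMP message)."""
    for router in path[:-1]:
        ttl -= 1                          # every router decrements the TTL
        if ttl == 0:
            return router, TIME_EXCEEDED  # probe dies here, router reports it
    # The probe survived every router: the destination has no listener on
    # the high UDP port, so it answers with Port Unreachable.
    return path[-1], PORT_UNREACHABLE

def simulate_traceroute(path, max_hops=30, probes=3):
    """Raise the TTL one hop at a time until the destination answers."""
    hops = []
    for ttl in range(1, max_hops + 1):
        replies = [send_probe(path, ttl) for _ in range(probes)]
        responder, message = replies[0]
        hops.append((ttl, responder, message))
        if message == PORT_UNREACHABLE:
            break                         # destination reached, stop probing
    return hops

# Hypothetical three-hop path.
for hop in simulate_traceroute(["router-a", "router-b", "destination"]):
    print(hop)
```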
Sometimes, we see timeout in the traceroute output. For example:
16 ge-0-0-0-p130.msr2.dcn.yahoo.com (216.115.108.13) 283.043 ms 285.184 ms ge-0-0-0-p120.msr1.dcn.yahoo.com (216.115.108.9) 283.043 ms 283.853 ms
17 ge7-1.bas1-m.dcn.yahoo.com (216.109.120.205) 279.207 ms 279.348 ms ge10-2.bas2-m.dcn.yahoo.com (216.109.120.249) 290.857 ms
18 * * *
19 * * *
This means that no reply was received for the probes sent at those hops.
This can be due to a variety of reasons and doesn't necessarily mean that the
host is down. In fact the host might be receiving the packets sent, but
not sending back a reply. The next host might be down or the network
connecting to it may be down. Or, there may be a routing issue on the
way back (which need not be the same route as the forward route). Some ISPs
set policies in their firewalls and routers as security measures such that
these ICMP reply packets are blocked.
Let me paste a traceroute session which I did to google.com from my
console.
chacko@server:~$traceroute google.com
traceroute:Warning: google.com has multiple addresses; using 63.238.197.99
traceroute to google.com (63.238.197.99), 30 hops max, 38 byte packets
1 * * *
2 xxxxxxxxxxxxxxxxxxxx (208.94.33.1) 0.671 ms 0.644 ms 0.767 ms
3 415.ge-5-2-1.mpr2.sfo3.us.above.net (63.123.129.43) 0.997 ms 1.044 ms 0.809 ms
4 so-3-3-0.mpr4.sjc2.us.above.net (63.123.30.213) 2.132 ms 2.218 ms 2.231 ms
5 so-6-0-0.mpr1.lax9.us.above.net (63.123.23.206) 10.766 ms 10.583 ms 10.782 ms
6 so-3-0-0.mpr2.lax9.us.above.net (63.123.31.102) 11.091 ms 10.913 ms 10.628 ms
7 so-4-1-0.mpr1.iah1.us.above.net (63.123.29.106) 40.584 ms 40.605 ms 49.805 ms
8 so-0-0-0.mpr2.iah1.us.above.net (63.123.31.62) 45.071 ms 40.432 ms 40.598 ms
9 so-5-1-0.mpr1.atl6.us.above.net (63.123.29.61) 53.759 ms 53.239 ms 53.203 ms
10 63.123.229.173.google.com (63.123.229.173) 55.633 ms 53.367 ms 53.653 ms
11 63.233.174.86 (63.233.174.86) 54.031 ms 64.233.174.84 (63.233.174.84) 53.647 ms 53.526 ms
12 70.12.236.173 (70.12.236.173) 71.348 ms 54.304 ms 54.479 ms
13 215.233.49.223 (215.233.49.223) 55.993 ms 56.158 ms 57.354 ms
14 jc-in-f99.google.com (64.233.187.99) 54.666 ms 54.179 ms 54.517 ms
Fields in the traceroute output:
Since google.com has multiple IP addresses pointed to it, some versions
of traceroute show the warning message seen in the above traceroute output.
The output shows the maximum number of hops traceroute attempts, which is 30
in this case, and that 38 byte packets are used. The first hop in this output
shows a timeout. In the subsequent hops, we can see 3 fields at the end of
each hop, which denote the RTT of the 3 packets sent with that TTL. In
the 11th hop, we can see that the reply to the 2nd packet came from a
different IP. This is because of the load balancing setup there, which
sends successive packets through different systems.
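To make the field layout concrete, here is a small parsing sketch for one hop line in the format shown above. The regular expressions and the function name parse_hop are assumptions written for this article's sample output; other traceroute versions may print lines differently.

```python
import re

HOP_RE = re.compile(r"^\s*(\d+)\s+(.*)$")          # hop number, then the rest
ADDR_RE = re.compile(r"\((\d+\.\d+\.\d+\.\d+)\)")  # IPv4 address in parentheses
RTT_RE = re.compile(r"(\d+(?:\.\d+)?)\s*ms\b")     # a number followed by "ms"

def parse_hop(line):
    """Extract hop number, responding addresses, RTTs and timeouts."""
    match = HOP_RE.match(line)
    if match is None:
        raise ValueError("not a traceroute hop line: %r" % line)
    rest = match.group(2)
    return {
        "hop": int(match.group(1)),
        "addresses": ADDR_RE.findall(rest),
        "rtts_ms": [float(value) for value in RTT_RE.findall(rest)],
        "timeouts": rest.split().count("*"),        # one "*" per lost probe
    }

line = ("11 63.233.174.86 (63.233.174.86) 54.031 ms "
        "64.233.174.84 (63.233.174.84) 53.647 ms 53.526 ms")
hop = parse_hop(line)
print(hop)
# More than one address on a hop suggests load balancing, as in hop 11 above.
print(len(set(hop["addresses"])) > 1)
```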
mtr (My Traceroute)
mtr combines the functionalities of the 'traceroute' and 'ping'
utilities.
When mtr starts running, it investigates the network connection
between the host on which it runs and a user-specified destination host.
After determining the address of each network hop between these machines, it
sends out a sequence of ICMP ECHO requests to each machine to check the
quality of the link to each of them. mtr uses the ICMP Time Exceeded (Type
11) packets returning from routers, or the ICMP Echo Reply packets returned
once the packets have hit their destination host. Running statistics about
each machine are printed out while the process runs.
The real advantage of mtr over ping or traceroute is that it shows where
exactly the packet loss is happening on the route to the destination host,
in real time. It shows the loss percentage at each host, which can give us
valuable information on which specific provider is having a network issue.
Also, since mtr uses ICMP ECHO requests,
it gets through routers which block UDP packets. So mtr may work
where traceroute does not.
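The per-hop figures that mtr keeps updating (Loss%, Snt, Last, Avg, Best, Wrst, StDev) can be approximated from a list of probe results. This is a sketch under assumptions: a lost probe is represented as None, the function name hop_report is invented here, and the population standard deviation is used, which may not match mtr's own calculation exactly.

```python
import statistics

def hop_report(samples):
    """Summarise one hop from its probe results.

    samples: RTTs in milliseconds, with None standing for a lost probe.
    """
    sent = len(samples)
    answered = [rtt for rtt in samples if rtt is not None]
    report = {
        "Loss%": 100.0 * (sent - len(answered)) / sent if sent else 0.0,
        "Snt": sent,
        "Last": None, "Avg": None, "Best": None, "Wrst": None, "StDev": None,
    }
    if answered:
        report.update({
            "Last": answered[-1],
            "Avg": statistics.mean(answered),
            "Best": min(answered),
            "Wrst": max(answered),
            "StDev": statistics.pstdev(answered),   # assumed: population stdev
        })
    return report

# Hypothetical hop: four probes sent, the second one lost.
print(hop_report([10.0, None, 12.0, 14.0]))
```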
The following is the mtr output to yahoo.com which I did from my
local console.
My traceroute [v0.69]
machine.hostname.com (0.0.0.0)(tos=0x0 psize=64 bitpattern=0x00) Wed Jan 17 12:24:50 2007
Keys: Help Display mode Restart statistics Order of fields quit
Packets Pings
Host Loss% Snt Last Avg Best Wrst StDev
1. 192.168.1.254 0.0% 23 0.2 0.2 0.2 0.5 0.1
2. illekm-static-203.197.145.137.vsnl.net.in 0.0% 23 2.3 5.9 2.3 26.4 5.4
3. 203.200.149.148 0.0% 23 5.5 4.6 2.4 30.6 5.9
4. 59.163.16.146.static.vsnl.net.in 0.0% 23 239.4 242.4 238.8 262.3 5.6
5. 219.64.229.1.mpls-vpn-ny.static.vsnl.net.in 0.0% 23 270.4 278.1 269.9 316.6 11.7
6. ge-2-0-9.p558.pat1.dce.yahoo.com 0.0% 23 268.2 279.8 267.7 353.5 20.2
ge-0-0-9.p815.pat2.dce.yahoo.com
ge-0-0-9.p815.pat2.dce.yahoo.com
ge-2-0-8.p426.pat1.dce.yahoo.com
ge-0-0-8.p170.pat2.dce.yahoo.com
ge-0-0-9.p815.pat2.dce.yahoo.com
7. ge-1-0-0-p111.msr2.dcn.yahoo.com 0.0% 22 272.6 277.7 272.1 299.5 7.8
ge-0-0-0-p110.msr2.dcn.yahoo.com
ge-0-0-0-p110.msr2.dcn.yahoo.com
ge-0-0-0-p111.msr2.dcn.yahoo.com
ge-0-0-0-p100.msr1.dcn.yahoo.com
ge-0-0-0-p111.msr2.dcn.yahoo.com
ge-0-0-0-p100.msr1.dcn.yahoo.com
8. ge9-3.bas2-m.dcn.yahoo.com 0.0% 22 271.6 276.4 271.6 288.5 5.8
ge10-2.bas1-m.dcn.yahoo.com
ge6-1.bas1-m.dcn.yahoo.com
ge7-1.bas2-m.dcn.yahoo.com
ge5-2.bas1-m.dcn.yahoo.com
ge3-1.bas2-m.dcn.yahoo.com
ge10-2.bas2-m.dcn.yahoo.com
9. w2.rc.vip.dcn.yahoo.com 0.0% 22 278.9 278.2 271.6 323.9 11.1
Since the mtr output is dynamic, it is difficult to copy the output from
the console. For this, either run mtr with the --report option or
just type “p” while mtr is running and the output will pause. Note that you
should have root privileges to run mtr.
These three utilities are good enough to get basic information about the
host, the network and reachability. There are a lot of other tools with
specific features, which can be used for advanced data collection and
troubleshooting. The following are some of them.
Tools:
3dtraceroute - http://hlembke.de/prod/3dtraceroute/
Winmtr - http://winmtr.sourceforge.net/
hping - http://hping.org/
tcptraceroute - http://michael.toren.net/code/tcptraceroute/
tcpping - http://www.vdberg.org/~richard/tcpping.html
For remote traceroute - http://www.traceroute.org/
References
http://www.faqs.org/docs/iptables/icmptypes.html
http://www.onlamp.com/pub/a/bsd/2001/04/04/FreeBSD_Basics.html?page=1
http://www.freesoft.org/CIE/Topics/81.htm
http://ftp.arl.mil/~mike/ping.html
http://www.securityfocus.com/infocus/1210
http://linux-ip.net/html/tools-ping.html
http://www.exit109.com/~jeremy/news/providers/traceroute.html
TCP/IP Illustrated - Volume 1 - The protocols : W. Richard Stevens
man pages of ping, traceroute & mtr
About the author: Chacko Cherian Poothicote has been working in Bobcares for more than 4 years. His experience is mainly in data center administration and related remote infrastructure management. He is a passionate advocate of Linux. Apart from his technical expertise, Chacko finds interest in learning and practising system implementations for ISO standards.
|