Troubleshooting TCP/IP Connectivity

This chapter presents protocol-related troubleshooting information for TCP/IP connectivity problems. The emphasis here is on symptoms and problems associated with TCP/IP network connectivity.

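This chapter consists of the following sections:

- TCP/IP Internet Diagnostic Overview
- Problem Isolation in TCP/IP Networks
- TCP/IP Connectivity Symptoms

The problem/solution modules consist of the following sections:

- Symptom statement
- Possible causes and suggested actions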

TCP/IP Internet Diagnostic Overview

Some of the world's largest networks today rely on the TCP/IP suite of networking protocols. With a relatively small kit of basic tools, network managers can learn a lot about what is going on (or wrong) in an internetwork. The following tools (all available via the router's basic management interface) form the core of the administrator's internetwork toolkit:

- ping
- trace
- show commands
- debug commands

Caution Throughout this and other chapters, the use of debug commands is suggested for obtaining information about network traffic and router status. Use these commands with great care. In general, use them only under the direction of your router technical support representative when troubleshooting specific problems. Enabling debugging can disrupt operation of the router when the internetwork is experiencing high load conditions. When you finish using a debug command, remember to disable it with its specific undebug command or with the undebug all command.

When specific diagnostic commands are considered particularly useful for troubleshooting symptoms, they are listed with the specific symptom discussion in this chapter.

Refer to "Using Cisco Diagnostic Tools" in Chapter 1 of this publication for general information about using these tools. The debug commands discussed in this publication are described in Chapter 10, "Debug Command Reference," while the remainder of the diagnostic commands are detailed in the Router Products Configuration and Reference publication.

Problem Isolation in TCP/IP Networks

One important consideration when troubleshooting broken interconnections is that normally not everything breaks at the same time. As a result, when trying to isolate a problem, you can typically work out from an operational node to the point of failure. The following basic steps should help when trying to isolate the source of connection disruption:

Step 1: First, determine whether your local host is properly configured (for instance, correct subnet mask configurations and correct default gateway specifications).

Step 2: Next, use the ping or trace EXEC commands to determine whether the routers through which you must communicate can respond. Start with the most local router and progressively "ping out" through the internet.

Step 3: If you cannot get through a particular node, examine the node's configuration and use the various show commands to determine the router's state.

Step 4: If you can get to all the routers in the path, check the configuration of the remote host (or get someone's help to do so).

Step 5: Check the routers' routing tables (show ip route) and ARP tables (show ip arp) for any anomalies (such as duplicate routes) and to see if the host(s) in question are appearing. Another useful diagnostic command is show ip cache, which shows the routing table cache used to fast-switch IP traffic.
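
The "ping out" procedure in Step 2 can be sketched as a loop that probes each hop from nearest to farthest and stops at the first one that does not answer. This is only an illustration: the addresses, the `first_unreachable` helper, and the `fake_probe` stand-in (which takes the place of a real ping) are all hypothetical.

```python
from typing import Callable, Iterable, Optional


def first_unreachable(path: Iterable[str],
                      is_reachable: Callable[[str], bool]) -> Optional[str]:
    """Probe hops in order, nearest first; return the first hop that fails.

    Returns None when every hop in the path responds.
    """
    for hop in path:
        if not is_reachable(hop):
            return hop
    return None


# Hypothetical path: local router, two transit routers, then the remote host.
PATH = ["10.1.1.1", "10.1.2.1", "10.1.3.1", "10.1.3.25"]

# Hypothetical failure: the last router and everything behind it is silent.
DOWN = {"10.1.3.1", "10.1.3.25"}


def fake_probe(address: str) -> bool:
    """Stand-in for a real ping; answers from the DOWN set above."""
    return address not in DOWN


print(first_unreachable(PATH, fake_probe))
```

The hop returned is where Step 3 applies: examine that node's configuration and state before looking further out.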

TCP/IP Connectivity Symptoms

The symptom modules that follow pertain to TCP/IP internetwork problems. Unless otherwise indicated, each module is presented as a set of general problems. Where there are special considerations associated with a situation, notes are included.

Symptom Summary

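The following TCP/IP connectivity symptoms are discussed in this section:

- Host Cannot Access Offnet Host(s)
- Host Cannot Access Certain Networks
- Connectivity Available to Some Hosts, but Not Others
- Some Services Are Available, Others Are Not
- Users Cannot Make Connections When One Path is Down
- Router Sees Duplicate Routing Updates and Packets
- Routing Works for Some Protocols, Not for Others
- Router/Host Cannot Reach Certain Parts of Its Own Network
- Traffic Is Not Getting Through Router Using Redistribution
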
Note The symptoms that follow are generic in nature. However, when host configuration problems are discussed, they are addressed assuming UNIX end systems. Equivalent kinds of actions may be applicable to non-UNIX hosts as well, but the discussion here does not address non-UNIX end station problems.

Host Cannot Access Offnet Host(s)

Symptom: Host-A is unable to communicate with Host-B on another network. Here, when you attempt to make a connection to some intervening router, you may or may not be able to make a successful connection. In either case, you are unable to connect to the target host on the other side of the router. This situation is illustrated in Figure 6-1.




Figure 6-1: Host-A Cannot Communicate with Host-B over Routers

Possible Causes and Suggested Actions

Table 6-1 outlines possible causes of blocked access to a specific host on a remote network.


Table 6-1: Causes and Actions for Blocked Access to Remote Hosts

Possible Cause: No default gateway specification
Suggested Actions:
Step 1: Determine whether a default gateway is included in the routing table of the host attempting to make a connection (Host-A in Figure 6-1). Use the following UNIX command:
 
Step 2: Inspect the output of this command for a default gateway specification.
Step 3: If the specified default gateway is incorrect, or if it is not present at all, you can change or add a default gateway using the following UNIX command at the local host:
 
Step 4: To automate this as part of the boot process, specify the default gateway's IP address in the following UNIX host file:
 

Possible Cause: Misconfigured subnet mask
Suggested Actions:
Step 1: Check the following two locations for possible subnet mask errors:
- /etc/netmasks
- /etc/rc.local
Step 2: Fix if incorrectly specified or add if not included.

Possible Cause: Router between hosts is down
Suggested Actions:
Step 1: Use the ping command to determine whether the router is reachable.
Step 2: If the router does not respond, isolate the problem and repair the broken interconnection.
Step 3: Refer to the brief discussion at the beginning of this chapter entitled "Problem Isolation in TCP/IP Networks" and to discussions in Chapter 1 for more information.
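
As a rough illustration of the default gateway check, the sketch below scans a routing table listing for a default entry. The sample listing, its column layout, and the `find_default_gateway` helper are assumptions made for this example; real output differs from one UNIX system to another.

```python
from typing import Optional

# Hypothetical listing in the general style of a UNIX routing table display.
SAMPLE_TABLE = """\
Destination     Gateway         Flags
127.0.0.1       127.0.0.1       UH
default         192.31.7.1      UG
192.31.7.0      192.31.7.20     U
"""


def find_default_gateway(listing: str) -> Optional[str]:
    """Return the gateway of the first default route, or None if absent.

    Assumes the destination is the first column and the gateway the second.
    """
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] in ("default", "0.0.0.0"):
            return fields[1]
    return None


print(find_default_gateway(SAMPLE_TABLE))
```

A result of None corresponds to the "not present at all" case in Step 3. A returned address still has to be compared against the intended gateway.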

Host Cannot Access Certain Networks

Symptom: Host cannot access certain other networks on the other side of a router. Some networks might be accessible.

Possible Causes and Suggested Actions

Table 6-2 outlines possible causes of unreachable network problems.


Table 6-2: Causes and Actions for Unreachable Network Problems

Possible Cause: No default gateway
Suggested Actions:
Step 1: Check the host for proper default gateway specification as described in the preceding symptom section "Host Cannot Access Offnet Host(s)."
Step 2: Modify or add default gateway specification as required. Table 6-1 provides more details regarding default gateway specification.

Possible Cause: Bad access list (getting routing information for some routes, but not others)
Suggested Actions:
Step 1: Check the routing table with the show ip route command and protocol exchanges with the appropriate debug command (such as debug ip-igrp or debug ip-rip).
Step 2: Look for information concerning the specific network with which you are unable to communicate.
Step 3: Check the use of access lists on the routers in the path and make sure that a distribute-list or distance command does not filter out the route.
Step 4: Temporarily disable access lists (by removing ip access-group commands) and use the trace command, or the ping command with the record route option set, to determine whether traffic can get through when the access list is removed.

Possible Cause: Discontinuous network addressing due to poor design
Suggested Actions:
Step 1: Use the show ip route command to examine which routes are known and how they are being learned.
Step 2: Use the trace or ping commands to see where traffic is stopping.
Step 3: Fix topology or reassign addresses to include all appropriate network segments in the same major network. Refer to the symptom section "Users Cannot Make Connections When One Path is Down" later in this chapter for additional information.

Possible Cause: Discontinuous network addressing due to link failure
Suggested Actions:
Step 1: Restore disabled link.
Step 2: If the loss of a link occurs and you cannot use a parallel path, examine network address assignments.
Step 3: If link failure results in a discontinuous network because one network has different points of contact with two now isolated subnets of a different major net, assign secondary addresses along the backup path to restore major network connectivity.
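
The idea of a discontinuous major network can be illustrated with a short sketch. It derives the classful major network for each subnet along a path and reports any major network that reappears after a different one has intervened. The addresses and both helper functions are hypothetical, and class D and E addresses are not handled.

```python
import ipaddress
from typing import List, Optional


def major_network(address: str) -> ipaddress.IPv4Network:
    """Return the classful (class A, B, or C) network containing address."""
    first_octet = int(address.split(".")[0])
    if first_octet < 128:
        prefix = 8       # class A
    elif first_octet < 192:
        prefix = 16      # class B
    else:
        prefix = 24      # class C (class D/E are not handled in this sketch)
    return ipaddress.ip_network(f"{address}/{prefix}", strict=False)


def discontiguous_majors(path_subnets: List[str]) -> List[ipaddress.IPv4Network]:
    """List major networks that reappear along the path after another
    major network has separated their subnets."""
    left_behind = set()
    found: List[ipaddress.IPv4Network] = []
    previous: Optional[ipaddress.IPv4Network] = None
    for subnet in path_subnets:
        major = major_network(subnet)
        if major != previous:
            if major in left_behind and major not in found:
                found.append(major)
            if previous is not None:
                left_behind.add(previous)
            previous = major
    return found


# Hypothetical path: two subnets of 131.108.0.0 joined only through 192.31.7.0.
print(discontiguous_majors(["131.108.1.0", "192.31.7.0", "131.108.2.0"]))
```

In the hypothetical path above, 131.108.0.0 is reported because its two subnets are reachable from each other only across a different major network.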

Connectivity Available to Some Hosts, but Not Others

Symptom: Hosts on a network can communicate with specific hosts on the other side of a router, but are unable to communicate with certain other hosts.

Possible Causes and Suggested Actions

Table 6-3 outlines possible causes of selectively blocked access to hosts.


Table 6-3: Causes and Actions for Selectively Blocked Host Access

Possible Cause: Misconfigured subnet mask
Suggested Actions:
Step 1: Check subnet masks on hosts and routers.
Step 2: Look for a mismatch between subnet masks. What is a specific host address to one host may turn into a subnet broadcast address when a different mask is applied at a router.
Step 3: Fix subnet mask on host or router as required.

Possible Cause: Bad access list (host is denied by some router in the path)
Suggested Actions:
Step 1: Ping out through the path to determine where packets are being dropped.
Step 2: If you can identify the router that is stopping traffic, use the write terminal EXEC command to see whether an access list is being used. You can also use the show access-lists and show ip interface commands in combination to determine whether access lists are being used.
Step 3: Temporarily disable the access list.
Step 4: See whether traffic can get through the router (ping or Telnet).
Step 5: If traffic can get through, carefully review the access list and its associated commands for proper authorization.

Possible Cause: Missing default gateway specification on remote host
Suggested Actions:
Step 1: Determine whether you can access any offnet hosts from the inaccessible remote host (you may need a system administrator at the remote end to log in to the inaccessible host).
Step 2: Check the remote host for proper default gateway specification as described in the earlier symptom section "Host Cannot Access Offnet Host(s)."
Step 3: Modify or add default gateway specification as required.
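
The subnet mask mismatch described above, where one device's host address is another device's subnet broadcast, can be checked with Python's standard ipaddress module. The address and masks below are hypothetical, as is the `is_subnet_broadcast` helper.

```python
import ipaddress


def is_subnet_broadcast(address: str, mask: str) -> bool:
    """True if address is the broadcast address of its subnet under mask."""
    iface = ipaddress.ip_interface(f"{address}/{mask}")
    return iface.ip == iface.network.broadcast_address


# Hypothetical case: the host uses an 8-bit host field, the router a 5-bit one.
ADDRESS = "172.16.5.31"
print(is_subnet_broadcast(ADDRESS, "255.255.255.0"))    # host's view
print(is_subnet_broadcast(ADDRESS, "255.255.255.224"))  # router's view
```

When the two views disagree, traffic to that one address can fail even though neighboring addresses on the same segment work, which matches the selective nature of this symptom.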

Some Services Are Available, Others Are Not

Symptom: In some cases you may be able to get through to hosts using some protocols, but cannot get through using others. For instance, you may be able to ping a host and FTP to a host, but Telnet does not get through.

Possible Causes and Suggested Actions

Table 6-4 outlines possible causes of some host services being functional, while others are not.


Table 6-4: Causes and Actions for Selective Service Availability

Possible Cause: Misconfigured extended access list
Suggested Actions:
Step 1: Use the trace command to determine the path taken to reach remote hosts.
Step 2: (Optional) On each router in the path, enable the debug ip-icmp command.
 
Step 3: If you can identify the router that is stopping traffic, use the write terminal EXEC command to see whether an access list is being used. You can also use the show access-lists and show ip interface commands in combination to determine whether access lists are being used.
Step 4: Temporarily disable the access list.
Step 5: See whether traffic can get through the router.
Step 6: If traffic can get through, carefully review the access list and its associated commands for proper authorization.
Step 7: In particular, look for TCP ports configured in extended access lists.
Step 8: If ports are specified, be sure that all needed ports are explicitly permitted by access lists.
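
To see why a port left out of an extended access list blocks one service while others keep working, consider this simplified model of first-match evaluation with an implicit deny at the end. It is a sketch, not router behavior in full: the `Rule` type and `evaluate` function are hypothetical, and source and destination addresses are ignored.

```python
from typing import List, NamedTuple, Optional


class Rule(NamedTuple):
    action: str               # "permit" or "deny"
    protocol: str             # "tcp", "udp", or "icmp"
    dst_port: Optional[int]   # None matches any port


def evaluate(rules: List[Rule], protocol: str, dst_port: Optional[int]) -> str:
    """Return the action of the first matching rule.

    Falls through to "deny" when nothing matches, modeling the implicit
    deny that ends every access list.
    """
    for rule in rules:
        if rule.protocol == protocol and rule.dst_port in (None, dst_port):
            return rule.action
    return "deny"


# Hypothetical list: ICMP and FTP (TCP ports 20 and 21) permitted,
# Telnet (TCP port 23) forgotten.
ACL = [
    Rule("permit", "icmp", None),
    Rule("permit", "tcp", 21),
    Rule("permit", "tcp", 20),
]

print(evaluate(ACL, "icmp", None))  # ping
print(evaluate(ACL, "tcp", 21))     # FTP control connection
print(evaluate(ACL, "tcp", 23))     # Telnet
```

The Telnet lookup falls through every rule and hits the implicit deny, which reproduces the symptom of ping and FTP working while Telnet fails.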

Users Cannot Make Connections When One Path is Down

Symptom: In configurations featuring multiple paths between networks, when one of the parallel links breaks, there is no communication via the alternative routes.

Figure 6-2 illustrates one example of a situation in which this can occur. Here, one major network (Net-B) has two or more access points into another major network (Net-C), while a third link joins two separate subnets of Net-C. Details are provided in the "Possible Causes and Suggested Actions" discussion that follows.




Figure 6-2: Problem Parallel Path Configuration Example

Possible Causes and Suggested Actions

Table 6-5 outlines possible causes of completely blocked connectivity when only one parallel link is lost.


Table 6-5: Causes and Actions for an Inadvertently Blocked Parallel Path

Possible Cause: Discontinuous network due to failure. If Serial-Z is lost, traffic cannot traverse from Net-C1 to Net-C2 through Router-B1
Suggested Actions:
Step 1: Bring link back up.
Step 2: As an alternative, ensure that all interfaces are included in the same major network using a secondary IP address configuration.
 

Possible Cause: Routing has not converged
Suggested Actions:
Step 1: Assuming you have used secondary addresses, examine routing tables for routes that are listed as "possibly down." If this entry is found, the routing protocol has not converged.
Step 2: Wait for the routing protocol to converge. Examine the routing table later.

Possible Cause: Misconfigured access lists or other routing filters
Suggested Actions:
Step 1: Check for access lists in the secondary path.
Step 2: If present, disable them and determine whether traffic is getting through.
Step 3: If traffic is getting through, the access list and its accompanying commands are probably causing the stoppage.
Step 4: Evaluate and fix access lists as necessary.

Possible Cause: Errors on serial link
Suggested Actions:
Step 1: If the link is a serial link, look for input errors on the interface (using the show interfaces serial number command).
Step 2: Refer to the discussions regarding serial debugging in Chapter 7, "Troubleshooting WAN Connectivity," and Chapter 1, "Troubleshooting Overview," for more information.

Possible Cause: Errors on Ethernet link
Suggested Actions:
Step 1: Use a TDR to find any unterminated Ethernet cable.
Step 2: Check host cables and transceiver cables to determine whether any are incorrectly terminated, overly long, or damaged.
Step 3: Look for a jabbering transceiver attached to a host (may require host-by-host inspection).
Router Sees Duplicate Routing Updates and Packets

Symptom: When the router sees duplicate routing updates, your network users are also apt to experience sudden loss of connections and extremely poor performance. Here, the router sees other routers and hosts on multiple interfaces.

Possible Causes and Suggested Actions

Table 6-6 outlines possible causes of routers seeing duplicate routing updates and packets.


Table 6-6: Causes and Actions for Duplicate Routing Updates and Packets

Possible Cause: Bridge or repeater in parallel with router, causing updates and traffic to be seen as coming from both sides of an interface
Suggested Actions:
Step 1: To determine whether this is the problem, use the show ip protocol EXEC command to get a list of routers from which the router is obtaining route information.
Step 2: Look for routers that are known to be remote to the network connected to the router.
Step 3: If you see routers listed but know them not to be attached to any directly connected networks, this is a likely problem.
Step 4: Another test is to use the show ip route command to examine routes for each interface.
Step 5: Look for paths to the same networks with the same cost on multiple interfaces.
Step 6: If you determine that there is a bridge in parallel, remove the bridge or configure access filters (on the bridge) that block routing updates.
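
Step 5 amounts to grouping routes by network and cost and flagging any group that spans more than one interface. The sketch below assumes the routes have already been collected by hand into (network, metric, interface) tuples. The sample data and the `same_cost_on_multiple_interfaces` helper are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Route = Tuple[str, int, str]  # (network, metric, interface)


def same_cost_on_multiple_interfaces(
        routes: Iterable[Route]) -> Dict[Tuple[str, int], List[str]]:
    """Map (network, metric) to its interfaces when there is more than one."""
    seen = defaultdict(set)
    for network, metric, interface in routes:
        seen[(network, metric)].add(interface)
    return {key: sorted(ifs) for key, ifs in seen.items() if len(ifs) > 1}


# Hypothetical routes transcribed from a routing table display.
ROUTES = [
    ("131.108.2.0", 2, "Ethernet0"),
    ("131.108.2.0", 2, "Ethernet1"),
    ("131.108.3.0", 1, "Ethernet0"),
    ("192.31.7.0", 3, "Serial0"),
]

print(same_cost_on_multiple_interfaces(ROUTES))
```

A flagged entry is only a hint. Equal-cost paths can be legitimate in a multipath design, so confirm against the known topology before removing or filtering a bridge.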

Routing Works for Some Protocols, Not for Others

Symptom: Some protocols work across a router while others do not. For instance, Telnet works from a host on one network to a host on another network on the other side of a router, but other protocols fail. Or Domain Name Service (DNS) works within your own domain, but does not work for external domains.

Possible Causes and Suggested Actions

Table 6-7 outlines possible causes for some protocols not working over a TCP/IP internetwork, when other protocols are working.


Table 6-7: Causes and Actions for Some Protocols Not Being Routed

Possible Cause: Misconfigured access list
Suggested Actions:
Step 1: Use ping and trace EXEC commands as described at the beginning of this chapter to isolate the router with the misconfigured access list.
Step 2: Once you determine where traffic is stopping, review the configuration of that router using the write terminal EXEC command.
Step 3: Look for any access list in the configuration.
Step 4: Temporarily disable the access list and monitor traffic to and through the suspect router.
Step 5: If the router is allowing previously blocked traffic through, some kind of error in the access list is probably at fault.
Step 6: Make sure you explicitly permit desired traffic. Otherwise, that traffic will be blocked by the implicit deny statement that ends every access list.

Router/Host Cannot Reach Certain Parts of Its Own Network

Symptom: A router or host is unable to communicate with other routers or hosts known to be directly connected to the same network.

Possible Causes and Suggested Actions

Table 6-8 outlines possible causes of hosts/routers being unable to reach routers/hosts in the same major network.


Table 6-8: Causes and Actions for Unreachable Hosts on Same Major Network

Possible Cause: Subnet mask configuration mismatch between router and host
Suggested Actions:
Step 1: Ping out to destination from your host/router, as discussed in "TCP/IP Internet Diagnostic Overview" at the beginning of this chapter.
Step 2: If you can ping from the local host to the local router (but not to the remote host), and you can ping from the local router to the remote host, there is probably a subnet mask configuration problem on your local host or router.
Step 3: Check host and router configurations for subnet mask mismatch. Make sure that all subnet masks match.
 

Possible Cause: Misconfigured access list
Suggested Actions:
Step 1: Use ping and trace EXEC commands as described at the beginning of this chapter to isolate the router with the misconfigured access list.
Step 2: Once you determine where traffic is stopping, review the configuration of that router using the write terminal EXEC command.
Step 3: Look for any access lists in the configuration.
Step 4: Temporarily disable the access list and monitor traffic to and through the suspect router.
Step 5: If the router is allowing previously blocked traffic through, some kind of error in the access list is probably at fault.

Possible Cause: No default gateway specified
Suggested Actions:
Step 1: Check the remote host for proper default gateway specification as described in the earlier symptom section "Host Cannot Access Offnet Host(s)."
Step 2: Check configuration on hosts and routers for static routes.
 
- Specify a default gateway on your host as described in the symptom section "Host Cannot Access Offnet Host(s)" earlier in this chapter.
- Enable proxy ARP on the host; make the local cable the default network (network 0 for RIP).
- Run the Gateway Discovery Protocol (GDP) on the host (BSD UNIX host only). This allows a dynamically defined default gateway.
- Run a routing protocol (such as RIP) on the host. (Note that there is high host processing overhead associated with this option.)

Note About IP Addresses and Subnet Masks

In most IP networks, routers and hosts should agree on their common subnet mask. If a router and a host disagree on the length of the subnet mask, then packets may not be routed correctly. Consider the situation in Table 6-9.

A host interprets a particular address (192.31.7.49) as being Host 1 on the third subnet (subnet address 48). However, the router interprets this same address as belonging to Host 17 on the first subnet (subnet address 32). The result is that any packet destined for 192.31.7.49 will either be sent out an incorrect interface or dropped, depending on the configuration of the router.


Table 6-9: Comparison of Host and Router Subnet Mask Effects

Routing Info             Host Value                   Router Value
Destination IP Address   192.31.7.49                  192.31.7.49
Subnet Mask              255.255.255.240              255.255.255.224
Interpreted Address      Subnet address 48, host 1    Subnet address 32, host 17
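
The two interpretations in Table 6-9 can be reproduced with Python's standard ipaddress module. The `interpret` helper is a hypothetical name used only for this sketch. It returns the last octet of the subnet address and the value of the host field under a given mask.

```python
import ipaddress
from typing import Tuple


def interpret(address: str, mask: str) -> Tuple[int, int]:
    """Return (last octet of the subnet address, host field value)."""
    iface = ipaddress.ip_interface(f"{address}/{mask}")
    subnet_octet = int(iface.network.network_address) & 0xFF
    host_field = int(iface.ip) & int(iface.network.hostmask)
    return subnet_octet, host_field


print(interpret("192.31.7.49", "255.255.255.240"))  # host's mask
print(interpret("192.31.7.49", "255.255.255.224"))  # router's mask
```

With the host's mask the result is subnet address 48, host 1. With the router's mask it is subnet address 32, host 17, which agrees with the table.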

Traffic Is Not Getting Through Router Using Redistribution

Symptom: Traffic is not getting through a router that is redistributing routes between two different routing domains (typically RIP and IGRP). Observed symptoms range from bad performance (if nonoptimal routes are being used because the best path is blocked by a misconfigured redistribution) to no communication at all.

Possible Causes and Suggested Actions

Table 6-10 outlines possible causes of routing problems stemming from route redistribution.


Table 6-10: Causes and Actions for Route Redistribution Problems

Possible Cause: Missing default-metric command
Suggested Actions:
Step 1: Check the router configuration using the write terminal EXEC command.
Step 2: If the default-metric command is missing, add the appropriate version. Refer to the Router Products Configuration and Reference publication for details.

Possible Cause: Problem with the default administrative distance
Suggested Actions:
Step 1: Determine the policy for identifying how much you trust routes derived from different domains.
 
Step 2: As applicable, use the distance router subcommand to vary the level of trust associated with specific routing information.
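
The effect of administrative distance can be pictured as choosing, among the sources offering a route to the same destination, the one with the lowest distance value. The sketch below uses the customary default distances for connected, static, IGRP, and RIP routes. The `preferred_source` helper and the override value are hypothetical, and the override only stands in for what the distance subcommand would change.

```python
from typing import Dict, Iterable, Optional

# Customary default administrative distances (lower is more trusted).
DEFAULT_DISTANCE = {"connected": 0, "static": 1, "igrp": 100, "rip": 120}


def preferred_source(sources: Iterable[str],
                     overrides: Optional[Dict[str, int]] = None) -> str:
    """Pick the routing source with the lowest administrative distance."""
    distance = {**DEFAULT_DISTANCE, **(overrides or {})}
    return min(sources, key=lambda source: distance[source])


# The same destination learned from both domains on a redistributing router.
print(preferred_source(["rip", "igrp"]))
# Hypothetical policy change that makes RIP-derived routes more trusted.
print(preferred_source(["rip", "igrp"], overrides={"rip": 90}))
```

With the defaults, the IGRP-derived route wins. Lowering the distance for RIP reverses the choice, which is the kind of policy decision Step 1 asks you to settle before changing any configuration.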
 

Copyright 1989-1997 © Cisco Systems Inc.