Troubleshoot Networking
 

Last updated: 18 Feb 2011

This checklist walks through diagnosing common networking problems. After basic network connectivity checks, it starts with IP addresses, which takes domain name service problems out of the picture, and then moves on to troubleshooting with domain names.

  1. Are the two computers (e.g., A and B) that are communicating powered on and operational?

    If you don't have physical access to one of the computers (e.g., a remote server), you will need to continue and eventually check its network status via the network.

  2. Is there a network connection (via cable or wireless) between computer A and computer B?

    If a wired connection, make sure that:

    1. the cable is connected at one end to the computer
    2. the cable is connected at its other end to a hub or switch
    3. the hub/switch has power and is working
    4. the link lights (usually present) at the jacks at both ends of the cable indicate connectivity

    For a wireless connection, make sure that:

    1. your wireless card or built-in chip is working
    2. if the access point or wireless router is yours and should be connected to your ISP:
      • the access point or wireless router has power and is working
      • if the access point/wireless router only accepts devices with known MAC addresses, the MAC address of the wireless device you want to connect is on its list of valid devices
      • the access point or wireless router is connected by an ethernet cable to a cable or DSL modem
      • the cable or DSL modem is powered on and is working
      • a coax/DSL cable is connected from the cable/DSL modem to your internet service provider (ISP)
      • your ISP is providing internet service to you (call them, especially if new service or after a power outage)
    3. your wireless card can contact an access point/wireless router
    4. you have provided any needed security credentials to gain wireless access

    You will need to check this for both computer A and computer B, if you have physical access to computer B.

  3. Presuming from now on that there is some physical way to connect from computer A to computer B, is computer A able to communicate using IP -- does computer A have an IP address?
    • On Windows in a command shell, enter:
      ipconfig
      

      If there are values for "IP Address" (shown as "IPv4 Address" on newer versions of Windows) and "Subnet Mask", you have an IP address. If not, try to get a dynamic one from a DHCP server that your wireless router or ISP is providing:

      ipconfig /renew
      

      If there is not a value for "Default Gateway", you won't be able to communicate outside of your subnet, and you'll have to either ask for another dynamic IP address (ipconfig /renew) or specify a static default gateway (that your ISP should have provided along with your static IP address).

    • On Linux in a terminal session, enter:
      ifconfig
      

      There are likely to be at least two entries (called interfaces). The interface marked "lo" with an "inet addr" of 127.0.0.1 can be ignored. The other interface is likely to be something like "eth0" or "eth1", depending on how many NICs you have. Let's say it is "eth0".

      If there are values for "inet addr" and "Mask" (newer versions of ifconfig label them "inet" and "netmask"), you have an IP address: it is the value after "inet addr:".

      If your IP address is in one of the private ranges 10.0.0.0/8, 172.16.0.0/12, or 192.168.0.0/16, then note that it normally can't be used as a public address for a server that you want the rest of the internet to connect to (though it could be a server address for computers on the same subnet). The same goes for the 169.254.0.0/16 range of "link-local" IP addresses.
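
The range check described above can be sketched as a small POSIX shell function. This is only an illustration under stated assumptions: the name classify_ipv4 and its output labels are made up for this example, and "other" means nothing more than "in none of the ranges named here".

```shell
# classify_ipv4 ADDRESS  (hypothetical helper, not a standard command)
# Prints private, link-local, loopback, other, or invalid.
classify_ipv4() (
    ip=$1
    # Reject empty input, characters other than digits and dots, and empty fields.
    case $ip in
        ''|*[!0-9.]*|.*|*.|*..*) echo invalid; exit 1 ;;
    esac
    # Split the address on dots into the positional parameters.
    IFS=.
    set -- $ip
    [ $# -eq 4 ] || { echo invalid; exit 1; }
    for octet in "$@"; do
        # Each field must be at most three digits and no larger than 255.
        [ ${#octet} -le 3 ] && [ "$octet" -le 255 ] || { echo invalid; exit 1; }
    done
    a=$1
    b=$2
    if [ "$a" -eq 10 ]; then
        echo private                                   # 10.0.0.0/8
    elif [ "$a" -eq 172 ] && [ "$b" -ge 16 ] && [ "$b" -le 31 ]; then
        echo private                                   # 172.16.0.0/12
    elif [ "$a" -eq 192 ] && [ "$b" -eq 168 ]; then
        echo private                                   # 192.168.0.0/16
    elif [ "$a" -eq 169 ] && [ "$b" -eq 254 ]; then
        echo link-local                                # 169.254.0.0/16
    elif [ "$a" -eq 127 ]; then
        echo loopback                                  # 127.0.0.0/8
    else
        echo other
    fi
)
```

For example, classify_ipv4 172.28.244.70 prints private, because 172.28.x.x falls inside 172.16.0.0/12.
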

      If you did not see a value for "inet addr", and presuming you have a static IP address, ensure that the network is up:

      service network restart
      

      and check again (you could just use: ifconfig eth0).

      Check for a default gateway by entering:

      route
      

      The output is a table. In the "Destination" column, look for "default". If the "Gateway" column on that line holds a valid IP address, that is your default gateway address: the address to which packets for any destination not covered by the other entries in the table are sent, so that they can be routed to other subnets.
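
On Linux, the default-gateway lookup can also be done mechanically. The sketch below assumes the classic net-tools table layout (Destination, Gateway, Genmask, Flags, ...); the function name default_gateway is hypothetical.

```shell
# default_gateway: read `route` or `route -n` output on stdin and print the
# gateway of the default route. Exit status is non-zero if there is none.
default_gateway() {
    awk '($1 == "default" || $1 == "0.0.0.0") && $4 ~ /G/ {
             print $2      # second column is the gateway address
             found = 1
             exit
         }
         END { exit !found }'
}
```

Typical use would be `route -n | default_gateway`. With -n, route prints numeric addresses and the default route appears as destination 0.0.0.0 instead of "default", which is why the sketch accepts both spellings.
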

  4. Is computer B pingable?

    Let's say that computer B's address is 172.28.244.70.

    1. Determine if computer B is pingable (i.e., it responds to ICMP echo requests):
      ping 172.28.244.70
      
    2. If the response is (or starts with):
      • Reply from 172.28.244.70 on Windows or 64 bytes from 172.28.244.70 on Linux

        Computer B is up and responds to ICMP echo requests

      • Request timed out on Windows or no response (use Ctrl-C to stop trying) on Linux

        Computer B is not responding to ICMP echo requests because it is either misconfigured (wrong IP address? firewall blocking ICMP echo requests?), there is a break in network connectivity between computer A and computer B (you could try the tracert/traceroute command), computer B is not powered on, or computer B's networking is not (yet? -- could be starting up) operational.

        If you don't have control over computer B, then about the only thing you can do is try to contact whoever does have that control and ask them to fix the problem.

      • Destination host unreachable

        It is likely that computer A can't find a way to route the network packets to computer B, which usually means that you don't have a valid or operational default gateway. You would need to add a default gateway or contact whoever (your ISP, most likely) owns the gateway networking device at the default gateway's address.
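
The three outcomes listed in this step can be told apart from the ping output itself. The following is a minimal sketch, assuming the message wording shown above; the function name classify_ping and its labels are hypothetical. "Destination host unreachable" is tested first because on Windows that message can appear on a line that also begins with "Reply from".

```shell
# classify_ping: read ping output on stdin and print one of
#   unreachable  a "Destination host unreachable" message was seen
#   reachable    an echo reply was seen
#   no-reply     neither of the above (timed out, or no output at all)
classify_ping() (
    out=$(cat)
    case $out in
        *"Destination host unreachable"*|*"Destination Host Unreachable"*)
            echo unreachable ;;
        *"Reply from "*|*" bytes from "*)
            echo reachable ;;
        *)
            echo no-reply ;;
    esac
)
```

On Linux this could be used as `ping -c 3 172.28.244.70 | classify_ping`, where -c 3 stops ping after three requests.
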

  5. Now that computer B is pingable, the question becomes: what are you trying to do between computer A and computer B?
  6. If you are trying to contact a network service (e.g., web service, domain name service, secure shell service, and mysql database service), then you should know the port number (e.g., 80, 53, 22, and 3306, respectively) and protocol (tcp and/or udp) of the network service.
  7. Let's say you are trying to contact the web service on computer B via a browser and the IP address given above:
    http://172.28.244.70
    

    If you don't receive a response:

    1. Is the remote service up?

      It appears that it might not be, but if you don't have control over computer B, about all you can do is try to contact whoever does and ask them to check for you. If you do have control over computer B, then force a start of the service and watch for any errors during startup. For example, using the httpd service on Linux:

      service httpd start
      

      If it was already running, then the service being down was not the problem, and there must be another issue elsewhere (continue on with the other steps). If it was not running, and there are no errors during startup of the service, try to contact it again from computer A. If there were startup errors, you will need to diagnose and fix the problem(s).

    2. If the service is running, then check to see if it is listening for network connections on the expected port.
      • On Linux, you can use this command on computer B (e.g., looking for port 80):

        netstat -nlp | grep ':80 '
        

        If you see a line containing 0.0.0.0:80 (or :::80 when the service listens on IPv6) and LISTEN and httpd, then the httpd service is listening for connections on port 80.

      • On Windows, you can use this command on computer B:

        netstat -anob
        

        Scan through the output to find the desired port and, immediately after that line and in square brackets, the name of the program that is listening for connections on that port.
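
On Linux, a bare text search for "80" also matches port 8080 or a process ID that happens to contain 80. The sketch below compares the port field exactly instead. It assumes the usual `netstat -nlp` column layout; the function name listening_on is hypothetical.

```shell
# listening_on PORT: read `netstat -nlp` output on stdin and print the local
# address and PID/program of every TCP socket listening on exactly PORT.
# Exit status is non-zero if no such socket is found.
listening_on() {
    awk -v port="$1" '
        $1 ~ /^tcp/ && $6 == "LISTEN" {
            # Local address looks like 0.0.0.0:80 or :::80; the port is
            # whatever follows the last colon.
            n = split($4, parts, ":")
            if (parts[n] == port) { print $4, $7; found = 1 }
        }
        END { exit !found }'
}
```

Typical use would be `netstat -nlp | listening_on 80`, run as root so that the program column is filled in for services owned by other users.
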

    3. Is there anything, such as a firewall, blocking the desired port?

      If you still can't connect from computer A to the port on computer B, it could be that a firewall on the route to computer B is blocking that port, or computer B's firewall is blocking it. If you don't have control over computer B, ask the person responsible to check to see if the firewall is open for the desired port. If you do have control, and computer B is running Linux (e.g., checking port 80):

      iptables -nxL | grep 80
      

      If you see a line starting with ACCEPT and containing 0.0.0.0/0 0.0.0.0/0 and state NEW tcp dpt:80, the firewall is open for port 80. If you don't, use (as root):

      system-config-firewall
      

      to modify the firewall's rules to accept connections on the desired port (if not common, you may have to add it explicitly in "Other Ports"), apply the rules, and exit, then check it again using the iptables command above.
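
The iptables check above can be made exact in the same way, so that a rule for dpt:8080 is not mistaken for one for dpt:80. This is a rough sketch with a hypothetical function name, port_accepted. It only looks for a matching ACCEPT rule and does not evaluate rule order, so an earlier DROP or REJECT rule could still block the port.

```shell
# port_accepted PORT: read `iptables -nxL` output on stdin and succeed if
# some ACCEPT rule for protocol tcp names exactly "dpt:PORT".
port_accepted() {
    awk -v port="$1" '
        $1 == "ACCEPT" && $2 == "tcp" {
            for (i = 3; i <= NF; i++)
                if ($i == "dpt:" port) found = 1
        }
        END { exit !found }'
}
```

Typical use would be `iptables -nxL | port_accepted 80 && echo "port 80 is open"`.
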

    4. Is there something else, like a service-specific access control list, that is blocking access?
      • In general, check the service's configuration file for access control settings.
      • For tcpd, check /etc/hosts.allow and /etc/hosts.deny
      • If xinetd is listening for connections and handing them off to the appropriate service, there may be IP address restrictions in /etc/xinetd.conf or in the per-service files in /etc/xinetd.d/
    5. At this point, there is nothing left to check -- it should work.
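
Once the checks in this step pass on computer B, a TCP probe from computer A can confirm the result from the client's side. The sketch below is one possible way to do it and rests on assumptions not made in the checklist: that bash (for its /dev/tcp redirection) and the GNU coreutils timeout command are installed on computer A. The function name probe_tcp is hypothetical.

```shell
# probe_tcp HOST PORT [SECONDS]: try to open a TCP connection and print
#   open       the connection was accepted
#   no-answer  nothing came back within SECONDS (default 3), which is
#              consistent with a firewall silently dropping packets
#   refused    the attempt failed right away; usually a refused connection,
#              but an unreachable network or an unresolvable host name
#              ends up here too
probe_tcp() (
    host=$1
    port=$2
    secs=${3:-3}
    if timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null; then
        echo open
    else
        rc=$?
        # timeout exits with status 124 when it had to kill the command.
        if [ "$rc" -eq 124 ]; then echo no-answer; else echo refused; fi
    fi
)
```

For the example in this step, the call would be `probe_tcp 172.28.244.70 80`. A result of "refused" points back at the service or its listening port, while "no-answer" points at a firewall or the route.
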
  8. Now let's check for domain name issues, given that all is working correctly above.
  9. Try pinging a well-known domain name (e.g., google)
    ping google.com
    

    If it pings, everything is fine with domain names you don't own.

    If you don't receive a reply, then check your DNS server:

    1. On Windows:
      ipconfig /all
      

      Look for one or more lines starting with DNS Servers. Ping each address listed. If there is no response, you may have to contact the DNS server owner (usually, your ISP) and tell them their DNS server is not working, though it is rarely their problem if they've made their networking equipment redundant.

    2. On Linux:
      cat /etc/resolv.conf
      
      • If all you see is lines starting with the comment character "#" or blank lines, you don't have a name server defined.

        It is likely that the NetworkManager service was running (or maybe still is) and originally put a name server in there, but can't figure out what name server works for your current network configuration. To stop NetworkManager from controlling /etc/resolv.conf and other network settings:

        service NetworkManager stop
        chkconfig NetworkManager off
        
      • If you see a line starting with nameserver that is not commented out, then what follows "nameserver" is the IP address of a name server to use. There may be several such lines, one per name server.

        Try to ping it to make sure it is reachable. If not, then you'll have to figure out who owns that IP address (usually, your ISP), and you may need to contact them.

      • If there is no nameserver line, add one with a valid IP address for a DNS server. If you don't know one, ask your ISP.

        For example:

        nameserver 172.28.244.70
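
The Linux checks in this step boil down to finding the uncommented nameserver lines. A minimal sketch, with a hypothetical function name list_nameservers:

```shell
# list_nameservers [FILE]: print the address from every uncommented
# "nameserver" line in FILE (default: /etc/resolv.conf).
# Exit status is non-zero if the file defines no name server.
list_nameservers() {
    awk '$1 == "nameserver" && $2 != "" { print $2; found = 1 }
         END { exit !found }' "${1:-/etc/resolv.conf}"
}
```

Each address it prints can then be pinged in turn, for example with `for ns in $(list_nameservers); do ping -c 1 "$ns"; done`.
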
        
  10. If you control the DNS service behind the IP address (presuming Linux and BIND's named):
    1. Make sure the firewall is open for port 53, for both the UDP and TCP protocols.
      system-config-firewall
      
    2. Make sure the root zone "." in /etc/named.conf is defined correctly.
    3. Make sure any domain names your domain name service is delegated to serve are defined (usually, in /var/named/named.db) and that the serial number has been incremented each time you save the file.
    4. Make sure your name service is running on the computer at the address listed in the nameserver line.
      chkconfig named on
      service named start
      

      Check the log files (usually, /var/log/messages unless it was changed in /etc/named.conf) to see if everything worked right.

    5. Use dig +trace to see if your name service is serving your domain names.

      For example:

      dig +trace iot.insttech.washington.edu
      
  11. At this point, about everything that can be done to specify a working domain name service has been done.

    Try to ping google.com again.

    ping google.com
    

    If that doesn't work:

    • the network could have gone down in the interim (there could have been a hardware error on some network equipment used along the way)
    • google.com could be offline for some reason (try amazon.com)
    • your DNS service or the computer hosting it crashed
    • the cause is unknown -- try checking everything from the first step down to here again.
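
When a ping by name fails, it helps to know whether the name failed to resolve or the resolved address failed to answer. The sketch below separates the two cases on Linux. It assumes the getent command is available (it consults the system resolver, including /etc/hosts); the function names can_resolve and diagnose_name are hypothetical.

```shell
# can_resolve NAME: succeed if the system resolver returns an address.
can_resolve() {
    getent hosts "$1" >/dev/null 2>&1
}

# diagnose_name NAME: report whether a failure is a name-resolution
# problem or a reachability problem.
diagnose_name() {
    if ! can_resolve "$1"; then
        echo "dns: $1 does not resolve"
    elif ping -c 1 "$1" >/dev/null 2>&1; then
        echo "ok: $1 resolves and answers ping"
    else
        echo "network: $1 resolves but does not answer ping"
    fi
}
```

A "dns" result sends you back to the name server checks, while a "network" result sends you back to the IP address checks at the top of this list.
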

Change Log

18 Feb 2011 Original document


