Last updated: 18 Feb 2011
This checklist walks through the tasks needed to diagnose common networking problems.
After basic network connectivity checks, we start by using IP addresses, since that
eliminates domain name service problems, and then move on to troubleshooting problems that
involve domain names.
- Are the two computers (e.g., A and B) that are communicating powered on and operational?
If you don't have physical access to one of the computers (e.g., a remote server), you will
need to continue and eventually check its network status via the network.
- Is there a network connection (via cable or wireless) between computer A and computer B?
If a wired connection, make sure that:
- the cable is connected at one end to the computer
- the cable is connected at its other end to a hub or switch
- the hub/switch has power and is working
- the lights at the jacks where both ends of the cable are connected show a link (most
jacks have lights that indicate connectivity)
For a wireless connection, make sure that:
- your wireless card or built-in chip is working
- if the access point or wireless router is yours and should be connected to your ISP:
- the access point or wireless router has power and is working
- if the access point/wireless router only accepts devices whose MAC addresses it
recognizes, the MAC address of the wireless device that you want to connect is on its list
of valid devices
- the access point or wireless router is connected by an ethernet cable to a cable or DSL
modem
- the cable or DSL modem is powered on and is working
- a coax/DSL cable is connected from the cable/DSL modem to your internet service provider
(ISP)
- your ISP is providing internet service to you (call them, especially if new service or
after a power outage)
- your wireless card can contact an access point/wireless router
- you have provided any needed security credentials to gain wireless access
You will need to check this for both computer A and computer B, if you have physical access
to computer B.
- Presuming from now on that there is some physical way to connect from computer A to
computer B, is computer A able to communicate using IP -- does computer A have an IP address?
- On Windows in a command shell, enter:
ipconfig
If there are values for "IP Address" and "Subnet Mask", you have an IP address. If not, try
to get a dynamic one from some DHCP server that your wireless router or ISP is providing:
ipconfig /renew
If there is not a value for "Default Gateway", you won't be able to communicate outside of
your subnet, and you'll have to either ask for another dynamic IP address (ipconfig /renew) or
specify a static default gateway (that your ISP should have provided along with your static
IP address).
- On Linux in a terminal session, enter:
ifconfig
There are likely to be at least two entries (called interfaces). The interface marked "lo"
with an "inet addr" of 127.0.0.1 can be ignored. The other interface is likely to be something
like "eth0" or "eth1", depending on how many NICs you have. Let's say it is "eth0".
If there are values for "inet addr" and "Mask", you have an IP address after the "inet
addr:" field.
If your IP address is in the 10.0.0.0/8, 172.16.0.0/12, or 192.168.0.0/16 private ranges,
then note that it normally can't be used as a public address for a server that you want
the rest of the internet to connect to (though it can be a server address for clients on the
same subnet). The same applies to the 169.254.0.0/16 range of "link-local" IP addresses.
If you did not see a value for "inet addr", and presuming you have a static IP address,
ensure that the network is up:
service network restart
and check again (you could just use: ifconfig eth0).
Check for a default gateway by entering:
route
The output is a table. In the "Destination" column, look for "default". On the
line containing "default", if the value under the "Gateway" column is a valid IP address,
then that is your default gateway address: the address to which traffic for any destination
outside the subnets listed in the table's other rows is sent to be routed onward.
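The two Linux checks above (whether an address falls in a private or link-local range, and which gateway the "default" row names) can be sketched as small shell helpers. This is a minimal sketch rather than a definitive tool: the function names classify_ipv4 and default_gateway are made up for illustration, the address is assumed to be a well-formed dotted quad, and the routing table is assumed to use the column layout that route prints.

```shell
# classify_ipv4 ADDRESS  (hypothetical helper name)
# Prints "private", "link-local", "loopback", or "public".
# Assumes ADDRESS is a well-formed dotted quad; no validation is done.
classify_ipv4() {
    first=${1%%.*}       # first octet
    rest=${1#*.}         # everything after the first dot
    second=${rest%%.*}   # second octet
    if [ "$first" -eq 10 ]; then
        echo "private"                       # 10.0.0.0/8
    elif [ "$first" -eq 172 ] && [ "$second" -ge 16 ] && [ "$second" -le 31 ]; then
        echo "private"                       # 172.16.0.0/12
    elif [ "$first" -eq 192 ] && [ "$second" -eq 168 ]; then
        echo "private"                       # 192.168.0.0/16
    elif [ "$first" -eq 169 ] && [ "$second" -eq 254 ]; then
        echo "link-local"                    # 169.254.0.0/16
    elif [ "$first" -eq 127 ]; then
        echo "loopback"                      # 127.0.0.0/8
    else
        echo "public"
    fi
}

# default_gateway  (hypothetical helper name)
# Reads `route` or `route -n` output on standard input and prints the
# gateway of the first default route (a row whose Flags column contains "G").
# Prints nothing if no default gateway is defined.
default_gateway() {
    awk '($1 == "default" || $1 == "0.0.0.0") && $4 ~ /G/ { print $2; exit }'
}
```

For example, classify_ipv4 172.28.244.70 prints private, and route -n | default_gateway prints the gateway address if one is defined.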
- Is computer B pingable?
Let's say that computer B's address is 172.28.244.70.
- Determine if computer B is pingable (i.e., it responds to ICMP echo requests):
ping 172.28.244.70
- If the response is (or starts with):
- "Reply from 172.28.244.70" on Windows or "64 bytes from 172.28.244.70" on Linux
Computer B is up and responds to ICMP echo requests.
- "Request timed out" on Windows or no response (use Ctrl-C to stop trying) on Linux
Computer B is not responding to ICMP echo requests. Possible causes: it is misconfigured
(wrong IP address? firewall blocking ICMP echo requests?), there is a break in network
connectivity between computer A and computer B (you could try the tracert/traceroute command),
computer B is not powered on, or computer B's networking is not operational (or not yet, if
it is still starting up).
If you don't have control over computer B, then about the only thing you can do is try to
contact whoever does have that control and ask them to fix the problem.
- Destination host unreachable
It is likely that computer A can't find a way to route the network packets to computer B,
which usually means that you don't have a valid or operational default gateway. You would need
to add a default gateway or contact whoever (your ISP, most likely) owns the gateway
networking device at the default gateway's address.
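The three ping outcomes described above can be told apart mechanically. The following is a rough sketch only: interpret_ping is a hypothetical helper name, and it matches just the message fragments quoted in this checklist (plus the capitalized "Destination Host Unreachable" that Linux ping prints), so other ping implementations or localized output may not be recognized.

```shell
# interpret_ping  (hypothetical helper name)
# Reads ping output on standard input and prints one of:
#   no-route  a "Destination host unreachable" message was seen
#   up        at least one echo reply was seen
#   no-reply  neither of the above (timeouts or no output at all)
interpret_ping() {
    output=$(cat)
    case "$output" in
        # Checked first: Windows prefixes this message with "Reply from".
        *"Destination host unreachable"*|*"Destination Host Unreachable"*)
            echo "no-route" ;;
        *"Reply from"*|*"bytes from"*)
            echo "up" ;;
        *)
            echo "no-reply" ;;
    esac
}
```

On Linux it could be used as ping -c 3 172.28.244.70 2>&1 | interpret_ping, where -c 3 limits ping to three requests (Windows ping stops on its own after four).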
- Now that computer B is pingable, the question becomes: what are you trying to do between
computer A and computer B?
- If you are trying to contact a network service (e.g., web service, domain name service,
secure shell service, or MySQL database service), then you should know the port number (e.g.,
80, 53, 22, or 3306, respectively) and protocol (tcp and/or udp) of the network service.
- Let's say you are trying to contact the web service on computer B via a browser and the IP
address given above:
http://172.28.244.70
If you don't receive a response:
- Is the remote service up?
It appears that it might not be, but if you don't have control over computer B, about all
you can do is try to contact whoever does and ask them to check for you. If you do have control
over computer B, then force a start of the service and watch for any errors during startup.
For example, using the httpd service on Linux:
service httpd start
If it was already running, then the service being down was not the problem, and there must
be another issue elsewhere (continue on with the other steps). If it was not running, and
there are no errors during startup of the service, try to contact it again from computer A. If
there were startup errors, you will need to diagnose and fix the problem(s).
- If the service is running, then check to see if it is listening for network connections on
the expected port.
- On Linux, you can use this command on computer B (e.g., looking for port 80):
netstat -nlp | grep ':80 '
If you see a line containing 0.0.0.0:80 (or :::80 if the service listens over IPv6), LISTEN, and
httpd, then the httpd service is listening for connections on port 80.
- On Windows, you can use this command on computer B (from an administrator command prompt, which the -b option requires):
netstat -anob
Scan through the output to find the desired port and, immediately after that line and in
square brackets, the name of the program that is listening for connections on that port.
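A plain text search of netstat output for a port number can also match other ports (8080 when looking for 80) or a process ID that happens to contain the same digits. As a hedged sketch, the Linux listening check can be made exact by comparing only the port part of the local address column. The name port_listening is hypothetical, and the column layout is assumed to be the one netstat -nl prints for TCP sockets on Linux.

```shell
# port_listening PORT  (hypothetical helper name)
# Reads `netstat -nl` (or `netstat -nlp`) output on standard input.
# Succeeds (exit status 0) if some TCP socket in LISTEN state has PORT as
# the port of its local address (column 4); fails (status 1) otherwise.
port_listening() {
    awk -v port="$1" '
        /LISTEN/ {
            # The local address looks like 0.0.0.0:80 or :::80;
            # the port is whatever follows the last colon.
            n = split($4, parts, ":")
            if (parts[n] == port) { found = 1 }
        }
        END { exit(found ? 0 : 1) }
    '
}
```

It could be used as netstat -nl | port_listening 80 && echo "port 80 is listening".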
- Is there anything, such as a firewall, blocking the desired port?
If you still can't connect from computer A to the port on computer B, it could be that a
firewall on the route to computer B is blocking that port, or computer B's firewall is
blocking it. If you don't have control over computer B, ask the person responsible to check to
see if the firewall is open for the desired port. If you do have control, and computer B is
running Linux (e.g., checking port 80):
iptables -nxL | grep 80
If you see a line starting with ACCEPT and containing 0.0.0.0/0
0.0.0.0/0 and state NEW tcp dpt:80, the firewall is open for port
80. If you don't, use (as root):
system-config-firewall
to modify the firewall's rules to accept connections on the desired port (if not common,
you may have to add it explicitly in "Other Ports"), apply the rules, and exit, then check it
again using the iptables command above.
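The firewall check above can be sketched as an exact match on the rule's destination port instead of a text search for the digits. This is only an illustration under assumptions: firewall_accepts_tcp_port is a hypothetical name, the input is assumed to be in the format that iptables -nL prints, rules that list several ports in one line (multiport) are not recognized, and rule order is ignored, so an ACCEPT rule that sits after a REJECT or DROP rule covering the same traffic would still be reported as open.

```shell
# firewall_accepts_tcp_port PORT  (hypothetical helper name)
# Reads `iptables -nL` (or `iptables -nxL`) output on standard input.
# Succeeds if some ACCEPT rule for the tcp protocol has "dpt:PORT";
# fails otherwise. Rule order and multiport rules are not considered.
firewall_accepts_tcp_port() {
    awk -v port="$1" '
        $1 == "ACCEPT" && $2 == "tcp" {
            for (i = 3; i <= NF; i++) {
                if ($i == ("dpt:" port)) { found = 1 }
            }
        }
        END { exit(found ? 0 : 1) }
    '
}
```

Run as root, it could be used as iptables -nL | firewall_accepts_tcp_port 80 || echo "no ACCEPT rule found for tcp port 80".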
- Is there something else, like a service-specific access control list, that is blocking
access?
- In general, check the service's configuration file for access control settings.
- For tcpd, check /etc/hosts.allow and
/etc/hosts.deny
- If xinetd is listening for connections and handing them off to the
appropriate service, there may be IP address restrictions in
/etc/xinetd.conf or in the per-service files in
/etc/xinetd.d/
- At this point, there is nothing left to check -- it should work.
- Now let's check for domain name issues, given that all is working correctly above.
- Try pinging a well-known domain name (e.g., google)
ping google.com
If it pings, everything is fine with domain names you don't own.
If you don't receive a reply, then check your DNS server:
- On Windows:
ipconfig /all
Look for one or more lines starting with DNS Servers. Ping each address listed; if there is no
response, you may have to contact the DNS server owner (usually, your ISP) and tell them their
DNS server is not working, though it is rarely their problem if they've made
their networking equipment redundant.
- On Linux:
cat /etc/resolv.conf
- If all you see is lines starting with the comment character "#" or blank lines, you don't
have a name server defined.
It is likely that the NetworkManager service was running (or maybe
still is) and originally put a name server in there, but can't figure out what name server
works for your current network configuration. To stop NetworkManager from controlling
/etc/resolv.conf and other network settings:
service NetworkManager stop
chkconfig NetworkManager off
- If you see a line containing nameserver that is not commented, then what
follows "nameserver" is the IP address of a name server to use (there may be several such lines).
Try to ping it to make sure it is reachable. If not, then you'll have to figure out who
owns that IP address (usually, your ISP), and you may need to contact them.
- If there is no nameserver line, add a valid IP address for a DNS server.
If you don't know one, ask your ISP.
For example:
nameserver 172.28.244.70
- If you control the DNS service behind the IP address (presuming Linux and BIND's named):
- Make sure the firewall is open for port 53, for both the UDP and TCP protocols.
system-config-firewall
- Make sure the root zone "." in /etc/named.conf is defined correctly.
- Make sure any domain names your domain name service is delegated to serve are defined
(usually, in /var/named/named.db) and that the serial number has been
incremented each time you save the file.
- Make sure your name service is running on the computer at the address listed in the
nameserver line.
chkconfig named on
service named start
Check the log files (usually, /var/log/messages unless it was changed in
/etc/named.conf) to see if everything worked right.
- Use dig +trace to see if your name service is serving your domain names.
For example:
dig +trace iot.insttech.washington.edu
- At this point, about everything that can be done to specify a working domain name service
has been done.
Try to ping google.com again.
ping google.com
If that doesn't work:
- the network could have gone down in the interim (there could have been a hardware error
on some network equipment used along the way)
- google.com could be offline for some reason (try amazon.com)
- your DNS service or the computer hosting it crashed
- the cause is unknown; try checking everything from the first step down to here again.
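The Linux /etc/resolv.conf check described earlier (ignore comments and blank lines, then look at each nameserver line) can be sketched as a small helper. The name list_nameservers is hypothetical, and the optional file argument exists only so the function can be tried on a copy of the file.

```shell
# list_nameservers [FILE]  (hypothetical helper name)
# Prints the address from every uncommented "nameserver" line, one per line.
# FILE defaults to /etc/resolv.conf. No output means that no name server is
# defined (only comments, blank lines, or other settings were found).
list_nameservers() {
    awk '$1 == "nameserver" && $2 != "" { print $2 }' "${1:-/etc/resolv.conf}"
}
```

Each address it prints can then be pinged in turn, for example: for ns in $(list_nameservers); do ping -c 1 "$ns"; done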
Change Log
18 Feb 2011: Original document