****************************************
* Checklist to get the network working *
*      (assuming ethernet on ec0)      *
****************************************

OVERVIEW:
=========

Proceed through the directions in order, restarting the network after
each step to check whether the problem is fixed.

Commands you should type at the shell begin with "# ". The "# "
represents the root prompt; do not type it. Text in <angle brackets> is
a placeholder; substitute your own value.

To restart the network without rebooting, execute the following
commands:

    # /etc/init.d/network stop
    # /etc/init.d/network start

If this is the first time the network is being set up on this machine,
do the first six steps before trying anything else.

TROUBLESHOOTING:
================

Symptom                                                     Start at step
-------                                                     -------------
Nothing works, no error messages                                  1
Local machines are reachable, but remote machines are not         8
Ping says "no route to host" or similar                           9
No apparent reason for the network to fail                        10
YP (NIS) doesn't work                                             7
Any other problem                                                 1

STEPS:
======

1 - Check to see that the ethernet cable is plugged in.

2 - # chkconfig

    network should be on
    yp may be on, off, or not there at all
    ypmaster and ypserv should be off unless the machine is a yp server

    If any options are incorrect, set them correctly. For example:

    # chkconfig network on

3 - # cat /etc/sys_id

    It should display the system's name, spelled correctly.

4 - Set your hostname with the "hostname" command to fix environment
    variables and the like.

    # hostname <system name>

5 - # more /etc/hosts

    It should list all of the relevant local machines (at least a
    server) and their IP addresses. Check spelling and numbers!

6 - # nvram netaddr

    It should show your IP address. If not, type:

    # nvram netaddr <your IP address>
    # reboot

7 - If running yp (NIS),

    # cat /usr/etc/yp/ypdomain

    It should show the domain for yp (NIS).

8 - # cat /usr/etc/resolv.conf

    It should show something like this:

    domain huntsville.sgi.com
    hostresorder yp local bind
    nameserver 192.48.152.2
    nameserver 192.26.51.194
    nameserver 192.26.51.11

    See the man page on resolv.conf for details.
    Do NOT copy this data unless you're working on a machine in the
    Huntsville SGI office. It's an example only.

9 - # netstat -r

    Does netstat show a route to get to the machine you're interested
    in? If not, add an appropriate route using the route(1M) command,
    for example:

    # route add net default 192.102.114.1 0

    "add"            says to add a route.
    "net"            says to add a route to a network (as opposed to a
                     route to a specific host).
    "default"        says that this route is used to reach ALL networks
                     except those explicitly covered by other routes
                     (such as the local network).
    "192.102.114.1"  is the address of the local router that connects
                     to the other network(s).
    "0"              is the metric, which represents the expense of
                     using that router. Most metrics are 0, but metrics
                     can be tuned so that the system prefers a less
                     expensive (faster, cheaper, whatever) route when
                     one is available.

    If you had to add a route, make sure that it will be added at every
    boot or that routed is doing its job. (Is routed chkconfig'ed on?)
    If you want routed off, you can edit the /etc/init.d/network script
    to add a route for you (just add an "else" clause to the if
    statement that starts routed), or you can create another startup
    script for routes. (Don't forget the symbolic link from the
    /etc/rc2.d directory.)

10- # ifconfig ec0

    This queries the ethernet interface and reports its configuration.
    It should look something like this:

    ec0: flags=c63
         inet 192.102.114.65 netmask 0xffffff00 broadcast 192.102.114.255

    Inet should be followed by your machine's IP address.
    Netmask should almost always be 0xffffff00.
    Broadcast should end in .255 under most circumstances.

    If these numbers are wrong, edit the /etc/config/ifconfig-1.options
    file and set it up something like this:

    netmask 255.255.255.0
    broadcast 192.102.114.255

    Your "netmask" line should look the same. Your "broadcast" line
    should contain the first 3 numbers of your IP address followed by
    .255.
    For example, if your IP address is 102.122.203.4, the "broadcast"
    line should read "broadcast 102.122.203.255".

    Restart the network and repeat this step. If the network still
    doesn't work, see the man pages for ifconfig and network.

11- If the network STILL doesn't work, check /etc/config/network,
    /etc/config/netif.options, and /etc/init.d/network against those on
    a working machine. (5.2's files are slightly modified from
    4.0.5's.) Spot checks should be sufficient to reveal high-level
    problems (and there's probably a high-level problem at this point).
    gdiff is an excellent program for comparing files quickly.

12- Verify that the network's hardware works by plugging in another
    system which is known to work on the network. Plug that system in
    at the SAME place where the faulty system is plugged in. Try
    swapping out cables, connectors, and transceivers to isolate the
    problem.

13- Reload the operating system (eoe1.sw.unix or eoe.sw.unix).

14- Try swapping the ethernet board.

15- Don't suspect the SGI machine is at fault.

16- Try a sledgehammer.

NOTES:
======

This file is not intended to provide a fix for all network problems. It
addresses the problems that appear at the Huntsville office most often.

If you have any comments, steps to add to the list, complaints that it's
too cryptic, or whatever, send them to jims@iname.com. (Please note that
I have finished my term of employment at SGI and no longer work with the
machines daily, so I may be of limited assistance.)

Last modification 1997 May 19
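
EXAMPLE: COMPUTING THE BROADCAST ADDRESS
========================================

The rule in step 10 (first 3 numbers of the IP address followed by .255)
is the special case for a netmask of 255.255.255.0. As a rough sketch of
the general arithmetic, each broadcast number is the IP number OR'ed with
the inverted netmask number. The function name broadcast_for is made up
for this example and is not a system command. It assumes a POSIX shell
with $(( )) arithmetic, which an older Bourne shell may lack, and it does
no input checking.

```shell
#!/bin/sh
# Sketch only: broadcast_for is a hypothetical helper, not part of any OS.
# Usage: broadcast_for <IP address> <netmask>   (both dotted decimal)
broadcast_for() {
    ip=$1
    mask=$2
    # Split both addresses on "." so that $1..$4 hold the IP numbers
    # and $5..$8 hold the netmask numbers.
    old_ifs=$IFS
    IFS=.
    set -- $ip $mask
    IFS=$old_ifs
    # For each position: broadcast = ip OR (255 - mask).
    echo "$(( $1 | (255 - $5) )).$(( $2 | (255 - $6) )).$(( $3 | (255 - $7) )).$(( $4 | (255 - $8) ))"
}

broadcast_for 192.102.114.65 255.255.255.0    # prints 192.102.114.255
broadcast_for 102.122.203.4 255.255.255.0     # prints 102.122.203.255
```

Both results match the examples in step 10. With any other netmask, check
the result against a working machine on the same network before putting
it in /etc/config/ifconfig-1.options.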