Contents:
How Many Name Servers?
Adding More Name Servers
Registering Name Servers
Changing TTLs
Planning for Disasters
Coping with Disaster
"What size do you want to be?" it asked.
"Oh, I'm not particular as to size," Alice hastily replied; "only one doesn't like changing so often, you know..."
"Are you content now?" said the Caterpillar.
"Well, I should like to be a little larger, sir, if you wouldn't mind...."
We set up two name servers in Chapter 4, Setting Up BIND. Two servers are as few as you'll ever want to run. Depending on the size of your network, you may need to run many more than just two servers. It is not uncommon to run from five to seven servers, with one of them off-site. How many name servers are enough? You'll have to decide that based on your network. Here are some guidelines to help out:
Have at least one name server available directly on each network or subnet you have. This removes routers as a point of failure. Make the most of any multihomed hosts you may have, since they're (by definition) attached to more than one network.
If you have a file server and some diskless nodes, run a name server on the file server to serve this group of machines.
Run name servers near, but not necessarily on, large time-sharing machines. The users and their processes probably generate a lot of queries, and, as administrators, you will work harder to keep a multiuser host up. But balance their needs against the risk of running a name server - a security-critical server - on a system that lots of people have access to.
Run one name server off-site. This makes your data available when your network isn't. You might argue that it's useless to look up an address when you can't reach the host. Then again, the off-site name server may be available if your network is reachable, but your other name servers are down. If you have a close relationship with an organization on the Internet - say another university or a business partner - they may consent to run a slave for you.
Figure 8.1 shows a sample topology and a brief analysis to show you how this might work.
Notice that if you follow our guidelines, there are still a number of places you could choose to run a name server. Host d, the file server for hosts a, b, c, and e, could run a name server. Host g, a big, multiuser host, is another good candidate. But probably the best choice is host f, the smaller host with interfaces on both networks. You'll only need to run one name server, instead of two, and it'll run on a closely-watched host. If you want more than one name server on either network, you can also run one on d or g.
In addition to giving you a rough idea of how many name servers you'll need, these criteria should also help you decide where to run name servers (e.g., on file servers, multihomed hosts). But there are other important considerations when choosing the right host.
Other factors to keep in mind are the host's connectivity, the software it runs (BIND and otherwise), and maintaining the homogeneity of your name servers:
It's important that name servers be well connected. Having a name server running on the fastest, most reliable host on your network won't do you any good if the host is mired in some backwater subnet of your network behind a slow, flaky serial line. Try to find a host close to your link to the Internet (if you have one), or find a well-connected Internet host to act as a slave for your zone. And on your own network, try to run name servers near the hubs of your network.
It's doubly important that your primary master name server be well connected. The primary needs good connectivity to all the slaves that update from it, for reliable zone transfers. And, like any name server, it'll benefit from fast, reliable networking.
Another factor to consider in choosing a host for a name server is the software the host runs. Software-wise, the best candidate for a name server is a host running a vendor-supported version of BIND 8.1.2 or 4.9.7 and a robust implementation of TCP/IP (preferably based on 4.3 or 4.4 BSD UNIX's networking - we're Berkeley snobs). You can compile your own 8.1.2 BIND from the sources - it's not hard, and the latest versions are very reliable - but you'll probably have a tough time getting your vendor to support it. If you don't absolutely need a feature of BIND 8.1.2, you may be able to get away with running your vendor's port of older BIND code, like 4.9.4, which will give you the benefit of your vendor's support, for what that's worth.
One last thing to take into account is the homogeneity of your name servers. As much as you might believe in "open systems," hopping between different versions of UNIX can be frustrating and confusing. Avoid running name servers on lots of different platforms, if you can. You can waste a lot of time porting your scripts (or ours!) from one operating system to another, or looking for the location of nslookup or named.conf on three different UNIXes. Moreover, different vendors' versions of UNIX tend to support different versions of BIND, which can cause all sorts of frustration. If you need BIND 8.1.2's security features on all your name servers, for example, choose a platform that supports 8.1.2 for all your name servers.
Since you would undoubtedly prefer that hackers not commandeer your name server to assist them in attacking your own hosts or other networks across the Internet, it's important to run your name server on a secure host. Don't run a name server on a big, multiuser system whose users you can't trust. If you have certain computers that are dedicated to hosting network services, but don't permit general logins, those are good candidates for running name servers. If you only have one or a few really secure hosts, consider running the primary master name server on one of those, since its compromise would be more significant than the compromise of the slaves.
Though these are really secondary considerations - it's more important to have a name server on a given subnet than to have it running on the perfect host - do keep these criteria in mind when making a choice.
If you have heavily populated networks, or users who do a lot of name-server-intensive work, you may find you need more name servers than we've recommended to handle the load. Or our recommendations may be fine for a little while, but as people add hosts to your nets or install new name-server-intensive programs, you may find your name servers bogged down by queries.
Just which tasks are "name-server-intensive"? Surfing the web can be name-server-intensive. Sending electronic mail, especially to large mailing lists, can be name-server-intensive. Programs that make lots of remote procedure calls to different hosts can be name-server-intensive. Even running certain graphical user environments can tax your name server. X Window-based user environments query the name server to check access lists (among other things).
The astute (and precocious) among you may be asking, "But how do I know when my name servers are overloaded? What do I look for?" An excellent question!
Memory utilization is probably the most important aspect of a name server's operation to monitor. named can get very large on a name server that is authoritative for many zones. If named's size, plus the size of the other processes you run, exceeds your real memory, your host may swap furiously ("thrash") and not get anything done. Even if your host has more than enough memory to run all its processes, large name servers are slow to start and slow to spawn new named processes (e.g., to handle zone transfers). Another problem: since named creates new named processes to handle zone transfers, it's quite possible to have more than one named process running at one time - one answering queries, and one or more servicing zone transfers. If your master name server already consumes five or ten megabytes of memory, count on two or three times that amount being used occasionally.
Another criterion you can use to measure the load on your name server is the load the name server process places on the host's CPU. Correctly configured name servers don't use much CPU time, so high CPU usage is often symptomatic of a configuration error. Programs like top can help you characterize your name server's average CPU utilization. Unfortunately, there are no absolute rules when it comes to acceptable CPU utilization. We offer a rough rule of thumb, though: 5% average CPU utilization is probably acceptable; 10% is a bit high, unless the host is dedicated to providing name service.[1]
[1] top is a very handy program, written by Bill LeFebvre, that gives you a continuous report of which processes are sucking up the most CPU time on your host. The most recent version of top is available via anonymous ftp from eecs.nwu.edu as /pub/top/top-3.4.tar.Z.
To get an idea of what normal figures are, here's what top might show for a relatively quiet name server:
last pid: 14299;  load averages:  0.11,  0.12,  0.12    18:19:08
68 processes: 64 sleeping, 3 running, 1 stopped
Cpu states: 11.3% usr, 0.0% nice, 15.3% sys, 73.4% idle, 0.0% intr, 0.0% ker
Memory: Real: 8208K/13168K act/tot  Virtual: 16432K/30736K act/tot  Free: 4224K

  PID USERNAME PRI NICE   SIZE   RES STATE   TIME   WCPU    CPU COMMAND
   89 root       1    0  2968K 2652K sleep   5:01  0.00%  0.00% named
Okay, that's really quiet. Here's what top shows on a busy (though not overloaded) name server:
load averages: 0.30, 0.46, 0.44        system: relay        16:12:20
39 processes: 38 sleeping, 1 waiting
Cpu states: 4.4% user, 0.0% nice, 5.4% system, 90.2% idle, 0.0% unk5, 0.0% unk6, 0.0% unk7, 0.0% unk8
Memory: 31126K (28606K) real, 33090K (28812K) virtual, 54344K free  Screen #1/3

  PID USERNAME PRI NICE   SIZE   RES STATE    TIME   WCPU    CPU COMMAND
21910 root       1    0  2624K 2616K sleep  146:21  0.00%  1.42% /etc/named
Another statistic to look at is the number of queries the name server receives per minute (or second, if you have a busy name server). Again, there are no absolutes here: an HP9000 K460 can handle hundreds of queries per second without breaking into a sweat, while a 386 PC might have problems with more than a few queries a second.
To check the volume of queries your name server is receiving, it's easiest to look at the name server's internal statistics, which you can configure the server to write to syslog at regular intervals.[2] For example, you could configure your name server to dump statistics every hour (actually, that's the default for BIND 8 servers), and compare the number of queries received between hours:
[2] Some older BIND name servers needed coercion to dump their statistics: the ABRT signal (IOT on older systems). BIND 4.9 name servers automatically dumped stats every hour, but 4.9.4 and later name servers, once again, need to be coerced with ABRT.
options {
        statistics-interval 60;
};
You should pay special attention to peak periods. Monday morning is often busy, because many people like to respond to mail they've received over the weekend first thing on Mondays.
You might also want to take a sample starting just after lunch, when people are returning to their desks and getting back to work - all at about the same time. Of course, if your organization is spread across several time zones, you'll have to use your own good judgment to determine a busy time.
Here's a snippet from the syslog file on a BIND 8.1.2 name server:[3]
[3] On a 4.9.4 through 4.9.7 server, you could dump stats like these to the named.stats file by sending named a SIGABRT, then move named.stats to another filename, wait an hour (with sleep 3600, for example), then send SIGABRT again.
Apr 22 07:40:37 denver named[150]: NSTATS 830180437 829791665 A=131686 PTR=8554 MX=187 ANY=339
Apr 22 07:40:37 denver named[150]: XSTATS 830180437 829791665 RQ=140766 RR=4111 RIQ=0 RNXD=2045 RFwdQ=3671 RFwdR=3839 RDupQ=0 RDupR=7 RFail=0 RFErr=0 RErr=0 RTCP=0 RAXFR=0 RLame=0 ROpts=0 SSysQ=285 SAns=137097 SFwdQ=3671 SFwdR=3839 SDupQ=92 SFail=4 SFErr=0 SErr=0 RNotNsQ=140721 SNaAns=7728 SNXD=55787
Apr 22 08:40:37 denver named[150]: NSTATS 830184037 829791665 A=132968 PTR=8633 MX=187 ANY=342
Apr 22 08:40:37 denver named[150]: XSTATS 830184037 829791665 RQ=142130 RR=4144 RIQ=0 RNXD=2062 RFwdQ=3698 RFwdR=3870 RDupQ=0 RDupR=7 RFail=0 RFErr=0 RErr=0 RTCP=0 RAXFR=0 RLame=0 ROpts=0 SSysQ=287 SAns=138434 SFwdQ=3698 SFwdR=3870 SDupQ=92 SFail=4 SFErr=0 SErr=0 RNotNsQ=142085 SNaAns=7778 SNXD=56284
The number of queries received is dumped in the RQ field, the first field after the two timestamps in each XSTATS entry. To calculate the number of queries received in the hour, just subtract the first RQ value from the second one: 142130 - 140766 = 1364.
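If you'd rather not do the subtraction by hand, a few lines of Python can do it. This is only a sketch: the function names are ours, not part of BIND, and the sample text is an abbreviated copy of the two XSTATS entries (most fields omitted for brevity). It assumes the counters appear as FIELD=number, as they do in the syslog output.

```python
import re

# Abbreviated copies of the two XSTATS entries; most fields are omitted.
SAMPLE = """\
Apr 22 07:40:37 denver named[150]: XSTATS 830180437 829791665 RQ=140766 RR=4111 SSysQ=285 SAns=137097
Apr 22 08:40:37 denver named[150]: XSTATS 830184037 829791665 RQ=142130 RR=4144 SSysQ=287 SAns=138434
"""

def counter_values(log_text, field):
    """Return every value of one counter (e.g. "RQ"), in log order."""
    # The lookbehind keeps "RQ" from matching the tail of a longer field name.
    pattern = r"(?<![A-Za-z])" + re.escape(field) + r"=(\d+)"
    return [int(value) for value in re.findall(pattern, log_text)]

def deltas(values):
    """Differences between consecutive samples of a cumulative counter."""
    return [later - earlier for earlier, later in zip(values, values[1:])]

print(deltas(counter_values(SAMPLE, "RQ")))    # queries received per interval
print(deltas(counter_values(SAMPLE, "SAns")))  # answers sent per interval
```

Since the counters are cumulative since the last reset, the differences only make sense between samples taken from the same run of named.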
Even if your host is fast enough to handle the number of queries it receives, you should make sure the DNS traffic isn't placing undue load on your network. On most LANs, DNS traffic will be too small a proportion of the network's bandwidth to worry about. Over slow leased lines or dial-up connections, though, DNS traffic could consume enough bandwidth to merit concern.
To get a rough estimate of the volume of DNS traffic on your LAN, multiply the number of queries received (RQ) plus the number of answers sent (SAns) in an hour by 800 bits (100 bytes, a rough average size for a DNS packet), and divide by 3600 (seconds per hour) to find the bandwidth utilized. This should give you a feeling for how much of your network's bandwidth is being consumed by DNS traffic.[4]
[4] For a nice package that automates the analysis of BIND's statistics, look for Nigel Campbell's bindgraph in the DNS Resources Directory's tools page, URL http://www.dns.net/dnsrd/tools.html.
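As a worked example of that estimate, here is a short Python sketch that uses the RQ and SAns values from the two syslog entries shown earlier. The function name and its parameter names are our own invention; the 800-bit packet size and the one-hour interval are the rough assumptions from the text, not measured values.

```python
def dns_bandwidth_bps(queries, answers, avg_packet_bits=800, interval_secs=3600):
    """Rough DNS bandwidth estimate in bits per second.

    Assumes every query and every answer is one packet of avg_packet_bits.
    """
    return (queries + answers) * avg_packet_bits / interval_secs

# Counter differences between the 07:40:37 and 08:40:37 samples.
queries = 142130 - 140766   # RQ: queries received during the hour
answers = 138434 - 137097   # SAns: answers sent during the hour

bps = dns_bandwidth_bps(queries, answers)
print(f"about {bps:.0f} bits per second")
```

For this name server the estimate comes out to roughly 600 bits per second, which is negligible on a LAN but worth a second look on a slow dial-up link.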
To give you an idea of what's normal, the last NSFNET traffic report (in April, 1995) showed that DNS traffic constituted just over 5% of the total traffic volume (in bytes) on their backbone. The NSFNET's figures are based upon actual traffic sampling, not calculations like ours using the name server's statistics.[5] If you want to get a more accurate idea of the traffic your name server is receiving, you can always do your own traffic sampling with a LAN protocol analyzer.
[5] We're not sure how representative of the current state of the Internet these numbers are, but it's extremely difficult to wheedle equivalent numbers out of the commercial backbone providers that succeeded the NSFNET.
Once you've found that your name servers are overworked, what then? First, it's a good idea to make sure that your name servers aren't being bombarded with queries by a misbehaving program. To do that, you'll need to find out where all the queries are coming from.
If you're running a BIND 4.9 or 8.1.2 name server, you can find out which resolvers and name servers are querying your name server just by dumping the statistics. A modern server keeps statistics on a host-by-host basis, which is really useful in tracking down heavy users of your name server. For example, take these statistics:
+++ Statistics Dump +++ (829373099) Fri Apr 12 23:24:59 1996
970779  time since boot (secs)
471621  time since reset (secs)
0       Unknown query types
185108  A queries
6       NS queries
69213   PTR queries
669     MX queries
2361    ANY queries
++ Name Server Statistics ++
(Legend)
        RQ      RR      RIQ     RNXD    RFwdQ
        RFwdR   RDupQ   RDupR   RFail   RFErr
        RErr    RTCP    RAXFR   RLame   ROpts
        SSysQ   SAns    SFwdQ   SFwdR   SDupQ
        SFail   SFErr   SErr    RNotNsQ SNaAns
        SNXD
(Global)
        257357 20718 0 8509 19677  19939 1494 21 0 0  0 7 0 1 0  824 236196 19677 19939 7643  33 0 0 256064 49269  155030
[15.17.232.4]
        8736 0 0 0 717  24 0 0 0 0  0 0 0 0 0  0 8019 0 717 0  0 0 0 8736 2141  5722
[15.17.232.5]
        115 0 0 0 8  0 21 0 0 0  0 0 0 0 0  0 86 0 1 0  0 0 0 115 0  7
[15.17.232.8]
        66215 0 0 0 6910  148 633 0 0 0  0 5 0 0 0  0 58671 0 6695 0  15 0 0 66215 33697  6541
[15.17.232.16]
        31848 0 0 0 3593  209 74 0 0 0  0 0 0 0 0  0 28185 0 3563 0  0 0 0 31848 8695  15359
[15.17.232.20]
        272 0 0 0 0  0 0 0 0 0  0 0 0 0 0  0 272 0 0 0  0 0 0 272 7  0
[15.17.232.21]
        316 0 0 0 52  14 3 0 0 0  0 0 0 0 0  0 261 0 51 0  0 0 0 316 30  30
[15.17.232.24]
        853 0 0 0 65  1 3 0 0 0  0 2 0 0 0  0 783 0 64 0  0 0 0 853 125  337
[15.17.232.33]
        624 0 0 0 47  1 0 0 0 0  0 0 0 0 0  0 577 0 47 0  0 0 0 624 2  217
[15.17.232.94]
        127640 0 0 0 1751  14 449 0 0 0  0 0 0 0 0  0 125440 0 1602 0  0 0 0 127640 106  124661
[15.17.232.95]
        846 0 0 0 38  1 0 0 0 0  0 0 0 0 0  0 809 0 37 0  0 0 0 846 79  81
-- Name Server Statistics --
--- Statistics Dump --- (829373099) Fri Apr 12 23:24:59 1996
Each host is broken out, after the Global entry, by IP address, in brackets. Looking at the legend, you can see that the first field in each record is RQ, or queries received. That gives us a very good reason to go look at the hosts 15.17.232.8, 15.17.232.16, and 15.17.232.94, which appear to be responsible for about 88% of our queries.
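To see where a figure like that comes from, you can rank the hosts by their RQ counts and express each as a share of the Global RQ count. The Python below is a minimal sketch: the RQ values are copied by hand from the dump above, and the function name top_talkers is ours, not something BIND provides.

```python
# RQ (queries received) per host, copied from the statistics dump.
GLOBAL_RQ = 257357
PER_HOST_RQ = {
    "15.17.232.4": 8736,
    "15.17.232.5": 115,
    "15.17.232.8": 66215,
    "15.17.232.16": 31848,
    "15.17.232.20": 272,
    "15.17.232.21": 316,
    "15.17.232.24": 853,
    "15.17.232.33": 624,
    "15.17.232.94": 127640,
    "15.17.232.95": 846,
}

def top_talkers(per_host, total, n=3):
    """Return (host, queries, percent of total) for the n busiest hosts."""
    ranked = sorted(per_host.items(), key=lambda item: item[1], reverse=True)
    return [(host, count, 100.0 * count / total) for host, count in ranked[:n]]

top = top_talkers(PER_HOST_RQ, GLOBAL_RQ)
for host, count, percent in top:
    print(f"{host:<15} {count:>7} {percent:5.1f}%")
print(f"combined share: {sum(percent for _, _, percent in top):.1f}%")
```

The three busiest hosts together account for 225703 of the 257357 queries, with 15.17.232.94 alone sending almost half of them.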
If you're running an older name server, the only way to find out which resolvers and name servers are sending all those darned queries is to turn on name server debugging. (We'll cover this in depth in Chapter 12, Reading BIND Debugging Output.) All you're really interested in is the source IP addresses of the queries your name server is receiving. When poring over the debugging output, look for hosts sending repeated queries, especially for the same or similar information. That may indicate a misconfigured or buggy program running on the host, or a foreign name server pelting your name server with queries.
If all the queries appear to be legitimate, add a new name server. Don't put the name server just anywhere, though; use the information from the debugging output to help you decide where best to run one. In cases where DNS traffic is gobbling up your Ethernet, it won't help to choose a host at random and create a name server there. You need to consider which hosts are sending all the queries, then figure out how to best provide them name service. Here are some hints to help you decide:
Look for queries from resolvers on hosts that share the same file server. You could run a name server on the file server.
Look for queries from resolvers on large, multiuser hosts. You could run a name server there.
Look for queries from resolvers on another subnet. Those resolvers should be configured to query a name server on their local subnet. If there isn't one on that subnet, create one.
Look for queries from resolvers on the same bridged segment (assuming you use bridging). If you run a name server on the bridged segment, the traffic won't need to be bridged to the rest of the network.
Look for queries from hosts connected to each other via another, lightly loaded network. You could run a name server on the other network.
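When the question is which subnet deserves a name server of its own, it can help to total the per-host query counts by subnet. Here is one way that might be sketched in Python, using the per-host RQ counts from the statistics dump shown earlier. The /27 prefix length is purely an assumption for illustration (we don't claim the sample network is subnetted that way), and the function name is hypothetical; substitute your own subnet mask.

```python
import ipaddress
from collections import Counter

# RQ (queries received) per host, copied from the earlier statistics dump.
PER_HOST_RQ = {
    "15.17.232.4": 8736, "15.17.232.5": 115, "15.17.232.8": 66215,
    "15.17.232.16": 31848, "15.17.232.20": 272, "15.17.232.21": 316,
    "15.17.232.24": 853, "15.17.232.33": 624, "15.17.232.94": 127640,
    "15.17.232.95": 846,
}

def queries_by_subnet(per_host, prefix_len):
    """Total the query counts for each subnet of the given prefix length."""
    totals = Counter()
    for address, count in per_host.items():
        # strict=False lets us pass a host address and get its network back.
        network = ipaddress.ip_network(f"{address}/{prefix_len}", strict=False)
        totals[str(network)] += count
    return totals

# ASSUMPTION: /27 subnets, chosen only to make the example interesting.
for network, count in queries_by_subnet(PER_HOST_RQ, 27).most_common():
    print(f"{network:<18} {count:>7}")
```

A subnet that sends a large share of the queries but has no name server of its own is a natural place to add one.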