So far, we've talked about the theoretical structure of the domain name space and what sorts of data are stored in it, and we've even hinted at the types of names you might find in it with our (sometimes fictional) examples. But this won't help you decode the domain names you see on a daily basis on the Internet.
The Domain Name System doesn't impose many rules on the labels in domain names, and it doesn't attach any particular meaning to the labels at a particular level. When you manage a part of the domain name space, you can decide on your own semantics for your domain names. Heck, you could name your subdomains A through Z and no one would stop you (though they might strongly recommend against it).
The existing Internet domain name space, however, has some self-imposed structure to it. Especially in the upper-level domains, the domain names follow certain traditions (not rules, really, as they can be and have been broken). These traditions help keep domain names from appearing totally chaotic. Understanding these traditions is an enormous asset if you're trying to decipher a domain name.
The original top-level domains divided the Internet domain name space organizationally into seven domains:
com
Commercial organizations, such as Hewlett-Packard (hp.com), Sun Microsystems (sun.com), and IBM (ibm.com)
edu
Educational organizations, such as U.C. Berkeley (berkeley.edu) and Purdue University (purdue.edu)
gov
Government organizations, such as NASA (nasa.gov) and the National Science Foundation (nsf.gov)
mil
Military organizations, such as the U.S. Army (army.mil) and Navy (navy.mil)
net
Networking organizations, such as NSFNET (nsf.net)
org
Noncommercial organizations, such as the Electronic Frontier Foundation (eff.org)
int
International organizations, such as NATO (nato.int)
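The seven organizational categories above fit naturally into a small lookup table. Here is a minimal Python sketch of that idea; the names ORIGINAL_GTLDS and classify_tld are our own illustrative choices, not part of any DNS software, and the table covers only the seven original domains.

```python
# The seven original organizational top-level domains listed above.
# ORIGINAL_GTLDS and classify_tld are hypothetical names used for illustration.
ORIGINAL_GTLDS = {
    "com": "commercial",
    "edu": "educational",
    "gov": "government",
    "mil": "military",
    "net": "networking",
    "org": "noncommercial",
    "int": "international",
}


def classify_tld(domain_name):
    """Return the organizational category of a name's top-level domain, or None."""
    # Drop an optional trailing dot (the root), then take the rightmost label.
    labels = domain_name.rstrip(".").lower().split(".")
    return ORIGINAL_GTLDS.get(labels[-1])


print(classify_tld("berkeley.edu"))  # educational
print(classify_tld("nato.int"))      # international
print(classify_tld("ca.us"))         # None
```

Because the top-level domain is always the rightmost label, the lookup ignores everything to its left. Names outside the seven original domains simply come back as None.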
Another top-level domain called arpa was originally used during the ARPAnet's transition from host tables to DNS. All ARPAnet hosts originally had host names under arpa, so they were easy to find. Later, they moved into various subdomains of the organizational top-level domains. However, the arpa domain remains in use in a way you'll read about later.
You may notice a certain nationalistic prejudice in the examples: all are primarily U.S. organizations. That's easier to understand - and forgive - when you remember that the Internet began as the ARPAnet, a U.S.-funded research project. No one anticipated the success of the ARPAnet, or that it would eventually become as international as the Internet is today.
Today, these original domains are called generic top-level domains, or gTLDs. By the time you read this, we may have quite a few more of these, such as firm, shop, web, and nom, to accommodate the rapid expansion of the Internet and the need for more domain name "space." For more information on a proposal to create new gTLDs, see http://www.gtld-mou.org/.
To accommodate the internationalization of the Internet, the implementers of the Internet name space compromised. Instead of insisting that all top-level domains describe organizational affiliation, they decided to allow geographical designations, too. New top-level domains were reserved (but not necessarily created) to correspond to individual countries. Their domain names followed an existing international standard called ISO 3166.[4] ISO 3166 establishes official, two-letter abbreviations for every country in the world. We've included the current list of top-level domains as Appendix C, Top-Level Domains, of this book.
[4] Except for Great Britain. According to ISO 3166 and Internet tradition, Great Britain's top-level domain name should be gb. Instead, most organizations in Great Britain and Northern Ireland (i.e., the United Kingdom) use the top-level domain name uk. They drive on the wrong side of the road, too.
Within these top-level domains, the traditions and the extent to which they are followed vary. Some of the ISO 3166 top-level domains closely follow the U.S.'s original organizational scheme. For example, Australia's top-level domain, au, has subdomains such as edu.au and com.au. Some other ISO 3166 top-level domains follow the uk domain's lead and have subdomains such as co.uk for corporations and ac.uk for the academic community. In most cases, however, even these geographically oriented top-level domains are divided up organizationally.
That's not true of the us top-level domain, however. The us domain has fifty subdomains that correspond to - guess what? - the fifty U.S. states.[5] Each is named according to the standard two-letter abbreviation for the state - the same abbreviation standardized by the U.S. Postal Service. Within each state's domain, the organization is still largely geographical: most subdomains correspond to individual cities. Beneath the cities, the subdomains usually correspond to individual hosts.
[5] Actually, there are a few more domains under us: one for Washington, D.C., one for Guam, and so on.
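The usual layout of a us domain name described above (host, then city, then state, then us) can be picked apart mechanically. The sketch below assumes that common four-label layout and nothing more; parse_us_name is a hypothetical helper, and gw is a made-up host name used only for the example. It does not check the state code against the real list of abbreviations.

```python
def parse_us_name(domain_name):
    """Split a four-label us name into (host, locality, state), or return None."""
    labels = domain_name.rstrip(".").lower().split(".")
    # Expect exactly host.locality.state.us, with a two-letter state code.
    if len(labels) != 4 or labels[-1] != "us" or len(labels[2]) != 2:
        return None
    host, locality, state, _ = labels
    return host, locality, state


# "gw" is a hypothetical host name; mpk.ca.us is the locality domain.
print(parse_us_name("gw.mpk.ca.us"))  # ('gw', 'mpk', 'ca')
print(parse_us_name("hp.com"))        # None
```

Since the traditions are only "largely" geographical, a name that doesn't fit the pattern is reported as None rather than forced into it.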
Now that you know what most top-level domains represent and how their name spaces are structured, you'll probably find it much easier to make sense of most domain names. Let's dissect a few for practice:
lithium.cchem.berkeley.edu
You've got a head start on this one, as we've already told you that berkeley.edu is U.C. Berkeley's domain. (Even if you didn't already know that, though, you could have inferred that the name probably belongs to a U.S. university because it's in the top-level edu domain.) cchem is the College of Chemistry's subdomain of berkeley.edu. Finally, lithium is the name of a particular host in the domain - and probably one of about a hundred or so, if they've got one for every element.
winnie.corp.hp.com
This example is a bit harder, but not much. The hp.com domain in all likelihood belongs to the Hewlett-Packard Company (in fact, we gave you this earlier, too). Their corp subdomain is undoubtedly their corporate headquarters. And winnie is probably just some silly name someone thought up for a host.
fernwood.mpk.ca.us
Here you'll need to use your understanding of the us domain. ca.us is obviously California's domain, but mpk is anybody's guess. In this case, it would be hard to know that it's Menlo Park's domain unless you knew your San Francisco Bay Area geography. (And no, it's not the same Menlo Park that Edison lived in - that one's in New Jersey.)
daphne.ch.apollo.hp.com
We've included this example just so you don't start thinking that all domain names have only four labels. apollo.hp.com is the former Apollo Computer subdomain of the hp.com domain. (When HP acquired Apollo, it also acquired Apollo's Internet domain, apollo.com, which became apollo.hp.com.) ch.apollo.hp.com is Apollo's Chelmsford, Massachusetts, site. And daphne is a host at Chelmsford.
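The dissections above all work the same way: read the name from right to left, peeling off one label at a time. A short Python sketch of that procedure follows; enclosing_domains is a hypothetical helper of our own, and it only manipulates the text of the name, with no DNS lookups.

```python
def enclosing_domains(domain_name):
    """List a name followed by each domain that contains it, most specific first."""
    labels = domain_name.rstrip(".").split(".")
    # Dropping leading labels one at a time walks up toward the top-level domain.
    return [".".join(labels[i:]) for i in range(len(labels))]


for name in enclosing_domains("daphne.ch.apollo.hp.com"):
    print(name)
# daphne.ch.apollo.hp.com
# ch.apollo.hp.com
# apollo.hp.com
# hp.com
# com
```

Each line of output is a domain you can reason about separately: the host itself, the Chelmsford site, the Apollo subdomain, Hewlett-Packard's domain, and finally the top-level domain.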