WEB DESIGN

We didn't spend hours studiously poring over some reference book before we wrote our first HTML document. You probably shouldn't, either. HTML is simple to read and understand, and it's simple to write. And once you've written an HTML document, you've nearly completed your first XHTML one, too. So let's get started without first learning a lot of arcane rules.

To help you get that quick, satisfying start, we've included this chapter as a brief summary of the many elements of HTML and its progeny, XHTML. Of course, we've left out a lot of details and some tricks that you should know. Read the upcoming chapters to get the essentials for becoming fluent in HTML and XHTML.

Even if you are familiar with the languages, we recommend that you work your way through this chapter before tackling the rest of the book. It not only gives you a working grasp of basic HTML/XHTML and their jargon, but you'll also be more productive later, flush with the confidence that comes from creating attractive documents in such a short time.

2.1 Writing Tools

Use any text editor to create an HTML or XHTML document, as long as it can save your work on disk in ASCII text file format. That's because even though documents include elaborate text layout and pictures, they're all just plain old ASCII text documents themselves. A fancier WYSIWYG editor or a translator for your favorite word processor is fine, too, although it may not support the many nonstandard features we discuss later in this book. You'll probably end up touching up the source text it produces, in any case.

While it's not needed to compose documents, you should have at least one version of a popular browser installed on your computer to view your work, preferably Netscape Navigator or Microsoft Internet Explorer. That's because, unless you use a special editor, the source document you compose won't look anything like what gets displayed by a browser, even though it's the same document. Make sure what your readers actually see is what you intended by viewing the document yourself with a browser. Besides, the popular ones are free over the Internet.

Also note that you don't need a connection to the Internet or the Web to write and view your HTML or XHTML documents. You can compose and view your documents stored on a hard drive or floppy disk that's attached to your computer. You can even navigate among your local documents with the languages' hyperlinking capabilities without ever being connected to the Internet, or any other network, for that matter. In fact, we recommend that you work locally to develop and thoroughly test your documents before you share them with others.

We strongly recommend, however, that you do get a connection to the Internet if you are serious about composing your own documents. You can download and view others' interesting web pages and see how they accomplished some interesting feature — good or bad. Learning by example is fun, too. (Reusing others' work, on the other hand, is often questionable, if not downright illegal.) An Internet connection is essential if you include in your work hyperlinks to other documents on the Internet.

2.2 A First HTML Document

It seems every programming language book ever written starts off with a simple example on how to display the message, "Hello, World!" Well, you won't see a "Hello, World!" example in this book. After all, this is a style guide for the new millennium. Instead, ours sends greetings to the World Wide Web:

<html>

<head>

<title>My first HTML document</title>

</head>

<body>

<h2>My first HTML document</h2>

Hello, <i>World Wide Web!</i>

 <!-- No "Hello, World" for us -->

<p>

   Greetings from<br>

<a href="http://www.ppolite.blogspot.com">Ppolite &amp; Associates</a>

<p>

Composed with care by: 

<cite>(insert your name here)</cite>

<br>&copy;2000 and beyond

</body>

</html>

Go ahead: type in the example HTML source on a fresh word-processing page and save it on your local disk as myfirst.html. Make sure you select to save it in ASCII format; word processor-specific file formats like Microsoft Word's .doc files save hidden characters that can confuse the browser software and disrupt your HTML document's display.

After saving myfirst.html (or myfirst.htm, if you are using archaic DOS- or Windows 3.11-based file-naming conventions) onto disk, start up your browser and locate and open the file from the program's File menu. Your screen should look like Figure 2-1.

Figure 2-1. A very simple HTML document

2.3 Embedded Tags

You probably noticed right away, perhaps in surprise, that the browser displays less than half of the example source text. Closer inspection of the source reveals that what's missing is everything that's bracketed inside a pair of less-than (<) and greater-than (>) characters. [Section 3.3.1]

HTML and XHTML are embedded languages: you insert their directions, or tags, into the same document that you and your readers load into a browser to view. The browser uses the information inside those tags to decide how to display or otherwise treat the subsequent contents of your document.

For instance, the <i> tag that follows the word "Hello" in the simple example tells the browser to display the following text in italics.^[1] [Section 4.5]

^[1] Italicized text is a very simple example and one that most browsers, except the text-only variety (e.g., Lynx), can handle. In general, the browser tries to do as it is told, but as we demonstrate in upcoming chapters, browsers vary from computer to computer and from user to user, as do the fonts that are available and selected by the user for viewing HTML documents. Assume that not all are capable or willing to display your HTML document exactly as it appears on your screen.

The first word in a tag is its formal name, which usually is fairly descriptive of its function, too. Any additional words in a tag are special attributes, sometimes with an associated value after an equals sign (=), which further define or modify the tag's actions.
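To make that anatomy concrete, here is a small sketch that picks apart two tags. The link text is our own invention, and the horizontal rule tag and its attributes are used here only because they show an attribute without a value; treat the fragment as an illustration, not a recommendation.

```html
<!-- Tag name: a. One attribute, href, whose value follows the equals sign. -->
<a href="http://www.kumquat.com/">Kumquat home page</a>

<!-- Tag name: hr. The size attribute has a value;
     the noshade attribute is written with no value at all. -->
<hr size="4" noshade>
```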

2.3.1 Start and End Tags

Most tags define and affect a discrete region of your document. The region begins where the tag and its attributes first appear in the source document (a.k.a. the start tag) and continues until a corresponding end tag. An end tag is the tag's name preceded by a forward slash (/). For example, the end tag that matches the "start italicizing" <i> tag is </i>.

End tags never include attributes. In HTML, most tags, but not all, have an end tag. And, to make life a bit easier for HTML authors, the browser software often infers an end tag from surrounding and obvious context, so you needn't explicitly include some end tags in your source HTML document. (We tell you which are optional and which are never omitted when we describe each tag in later chapters.) Our simple example is missing an end tag that is so commonly inferred and hence not included in the source that some veteran HTML authors don't even know that it exists. Which one?

The XHTML standard is much more rigid, insisting that all tags have corresponding end tags. [Section 16.3.2] [Section 16.3.3]

2.4 HTML Skeleton

Notice, too, that our simple example HTML document starts and ends with <html> and </html> tags. These tags tell the browser that the entire document is composed in HTML.^[2] The HTML and XHTML standards require an <html> tag for compliant documents, but most browsers can detect and properly display HTML encoding in a text document that's missing this outermost structural tag. [<html>]

^[2] XHTML documents also begin with the <html> tag, but they contain additional information to differentiate them from common HTML documents. See Chapter 16 for details.

Like our example, all HTML and XHTML documents have two main structures: a head and a body, each bounded in the source by respectively named start and end tags. You put information about the document in the head and the contents you want displayed in the browser's window inside the body. Except in rare cases, you'll spend most of your time working on your document's body content. [<head>] [<body>]

There are several different document header tags that you can use to define how a particular document fits into a document collection and into the larger scheme of the Web. Some nonstandard header tags even animate your document.

For most documents, however, the important header element is the title. Standards require that every HTML and XHTML document have a title, even though the currently popular browsers don't enforce that rule. Choose a meaningful title, one that instantly tells the reader what the document is about. Enclose yours, as we do for the title of our example, between the <title> and </title> tags in your document's header. The popular browsers typically display the title at the top of the document's window. [<title>]
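Stripped of its content, the skeleton described in this section might be sketched as follows. The title text is a placeholder of our own; substitute one that describes your document.

```html
<html>
<head>
<!-- Information about the document goes in the head -->
<title>Kumquat Growers' Price List</title>
</head>
<body>
<!-- Everything you want displayed in the browser window goes in the body -->
</body>
</html>
```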

2.5 The Flesh on an HTML or XHTML Document

Except for the <html>, <head>, <body>, and <title> tags, the HTML and XHTML standards have few other required structural elements. You're free to include pretty much anything else in the contents of your document. (The web surfers among you know that authors have taken full advantage of that freedom, too.) Perhaps surprisingly, though, there are only three main types of HTML/XHTML content: tags (which we described previously), comments, and text.

2.5.1 Comments

A raw document with all its embedded tags can quickly become nearly unreadable, like computer-programming source code. We strongly recommend that you use comments to guide your composing eye.

Although it's part of your document, nothing in a comment, which goes between the special starting tag <!-- and ending tag --> comment delimiters, gets included in the browser display of your document. You see a comment in the source, as in our simple HTML example, but you don't see it on the display, as evidenced by our comment's absence in Figure 2-1. Anyone can download the source text of your documents and read the comments, though, so be careful what you write. [Section 3.5.3]
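As a quick illustration, with wording we made up for the purpose, comments might sit in a source document like this:

```html
<!-- This note appears in the source but not in the browser window -->
<p>
This sentence does appear in the browser window.
</p>

<!-- A comment may also
     run across several lines -->
```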

2.5.2 Text

If it isn't a tag or a comment, it's text. The bulk of content in most of your HTML/XHTML documents — the part readers see on their browser displays — is text. Special tags give the text structure, such as headings, lists, and tables. Others advise the browser how the content should be formatted and displayed.

2.5.3 Multimedia

What about images and other multimedia elements we see and hear as part of our web browser displays? Aren't they part of the HTML document? No. The data that comprises digital images, movies, sounds, and other multimedia elements that may be included in the browser display are in documents separate from the main HTML/XHTML document. You include references to those multimedia elements via special tags. The browser uses the references to load and integrate other types of documents with your text.

We didn't include any special multimedia references in the previous example simply because they are separate, nontext documents that you can't just type into a text processor. We do, however, talk about and give examples of how to integrate images and other multimedia in your documents later in this chapter, as well as in extensive detail in subsequent chapters.

2.6 Text

Text-related HTML/XHTML markup tags comprise the richest set of all in the standard languages. That's because the original language — HTML — emerged as a way to enrich the structure and organization of text.

HTML came out of academia. What was, and still is, important to those early developers is that their mostly academic, text-oriented documents can be scanned and read without sacrificing the ability to distribute them over the Internet to a wide diversity of computer display platforms. (ASCII text is the only universal format on the global Internet.) Multimedia integration is something of an appendage to HTML and XHTML, albeit an important one.

Also, page layout is secondary to structure. We humans visually scan and decide textual relationships and structure based on how it looks; machines can only read encoded markings. Because documents have encoded tags that relate meaning, they lend themselves very well to computer-automated searches and also to the recompilation of content — features very important to researchers. It's not so much how something is said as what is being said.

Accordingly, neither HTML nor XHTML is a page-layout language. In fact, given the diversity of user-customizable browsers, as well as the diversity of computer platforms for retrieval and display of electronic documents, all these markup languages strive to accomplish is to advise, not dictate, how the document might look when rendered by the browser. You cannot force the browser to display your document in any certain way. You'll hurt your brain if you insist otherwise.

2.6.1 Appearance of Text

For instance, you cannot predict what font and what absolute size — 8- or 40-point Helvetica, Geneva, Subway, or whatever — will be used for a particular user's text display. Okay, so the latest browsers now support standard Cascading Style Sheets and other desktop publishing-like features that let you control the layout and appearance of your documents. But users may change their browser's display characteristics and override your carefully laid plans at will, quite a few of the older browsers out there don't support these new layout features, and some browsers are text-only with no nice fonts at all. What to do? Concentrate on content. Cool pages are a flash in the pan. Deep content will bring people back for more and more.

Nonetheless, style does matter for readability, and it is good to include it where you can, as long as it doesn't interfere with content presentation. You can attach common style attributes to your text with physical style tags, like the italic <i> tag in our simple example. More importantly and truer to the language's original purpose, HTML and XHTML have content-based style tags that attach meaning to various text passages. And you can alter text display characteristics, such as font style, size, color, and so on, with Cascading Style Sheets (CSS).

Today's graphical browsers recognize the physical and content-related text style tags and change the appearance of their related text passages to visually convey meaning or structure. You can't predict exactly what that change will look like.

The HTML 4 standard (and even more so, the XHTML 1.0 standard) stresses that future browsers will not be so visually bound. Text contents may be heard or even felt, for example, not read by viewers. Context clues surely are better in those cases than physical styles.

2.6.1.1 Content-based text styles

Content-based style tags indicate to the browser that a portion of your HTML/XHTML text has a specific usage or meaning. The <cite> tag in our simple example, for instance, means the enclosed text is some sort of citation — the document's author, in this case. Browsers commonly, although not universally, display the citation text in italic, not as regular text. [Content-Based Style Tags]

While it may or may not be obvious to the current reader that the text is a citation, someday someone might create a computer program that searches a vast collection of documents for embedded <cite> tags and compiles a special list of citations from the enclosed text. Similar software agents already scour the Internet for embedded information to compile listings, such as the infamous Google database of web sites.

The most common content-based style used today is that of emphasis, indicated with the <em> tag. And if you're feeling really emphatic, you might use the <strong> content style. Other content-based styles include <code>, for snippets of programming code; <kbd>, to denote text entered by the user via a keyboard; <samp>, to mark sample text; <dfn>, for definitions; and <var>, to delimit variable names within programming code samples. All of these tags have corresponding end tags.
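Here is a rough sketch of those content-based tags at work. The sentences are invented for illustration, and how each tag actually looks on screen is up to the browser.

```html
<p>
<em>Always</em> wash kumquats before eating them, and
<strong>never</strong> serve them unripe.
</p>
<p>
A <dfn>kumquat</dfn> is a small citrus fruit with an edible rind.
</p>
<p>
To list your files, type <kbd>ls -l</kbd>. The computer might respond
with <samp>total 0</samp>. In the statement
<code>total = price * count</code>, <var>price</var> and
<var>count</var> are variables.
</p>
```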

2.6.1.2 Physical styles

Even the barest of barebones text processors conform to a few traditional text styles, such as italic and bold characters. While not word-processing tools in the traditional sense, HTML and XHTML provide tags that explicitly tell the browser to display (if it can) a character, word, or phrase in a particular physical style.

Although you should use related content-based tags, for the reasons we argued earlier, sometimes form is more important than function. Use the <i> tag to italicize text without imposing any specific meaning, the <b> tag to display text in boldface, or the <tt> tag so that the browser, if it can, displays the text in a teletype-style monospaced typeface. [Section 4.5]

It's easy to fall into the trap of using physical styles when you should really be using a content-based style instead. Discipline yourself now to use the content-based styles, because, as we argued earlier, they convey meaning as well as style, thereby making your documents easier to automate and manage.
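The difference is easiest to see side by side. In this small sketch of our own, the first line says only how the text should look, while the second says what the text means and leaves the look to the browser. Graphical browsers commonly, though not universally, render the two lines much alike.

```html
<!-- Physical styles: describe appearance only -->
<i>italic</i>, <b>bold</b>, and <tt>monospaced</tt> text

<!-- Content-based styles: describe meaning; the browser picks the look -->
<em>emphasized</em>, <strong>strongly emphasized</strong>,
and <code>program code</code> text
```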

2.6.1.3 Special text characters

Not all text characters available to you for display by a browser can be typed from the keyboard. And some characters have special meanings, such as the brackets around tags, which if not somehow differentiated when used for plain text — the less-than sign (<) in a math equation, for example — will confuse the browser and trash your document. HTML and XHTML give you a way to include any of the many different characters that comprise the ASCII character set anywhere in your text through a special encoding of its character entity.

Like the copyright symbol in our simple example, a character entity starts with an ampersand (&), followed by its name, and terminated with a semicolon (;). Alternatively, you may also use the character's position number in the ASCII table of characters, preceded by the pound or sharp sign (#), in lieu of its name in the character-entity sequence. When rendering the document, the browser displays the proper character, if it exists in the user's font. [Section 3.5.2]

For obvious reasons, the most commonly used character entities are the greater-than (&gt;), less-than (&lt;), and ampersand (&amp;) characters. Check Appendix F to find out what symbol the character entity &#166; represents. You'll be pleasantly surprised!
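A short sketch, using sentences of our own, shows these entities in source form. The last paragraph writes the copyright symbol both by name and by number; 169 is that symbol's position number.

```html
<p>
If a &lt; b and b &lt; c, then a &lt; c.
</p>
<p>
Fruits &amp; Nuts &gt; Citrus &gt; Kumquats
</p>
<p>
&copy; and &#169; both display the copyright symbol.
</p>
```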

2.6.2 Text Structures

It's not obvious in our simple example, but the common carriage returns we use to separate paragraphs in our source document have no meaning in HTML or XHTML, except in special circumstances. You could have typed the document onto a single line in your text editor, and it would still appear the same in Figure 2-1.^[3]

^[3] We use a computer programming-like style of indentation so that our source HTML/XHTML documents are more readable. It's not obligatory, nor are there any formal style guidelines for source HTML/XHTML document text formats. We do, however, highly recommend that you adopt a consistent style, so that you and others can easily follow your source documents.

You'd soon discover, too, if you hadn't read it here first, that except in special cases, browsers typically ignore leading and trailing spaces, and sometimes more than a few in between. (If you look closely at the source example, the line "Greetings from" looks like it should be indented by leading spaces, but it isn't in Figure 2-1.)
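To see this for yourself, try a fragment along these lines. It is only a sketch, but the two versions should look the same in the browser window, because the extra spaces and carriage returns in the first are treated as single spaces.

```html
<!-- Version 1: spread over several lines, with extra spaces -->
Hello,
      <i>World      Wide
  Web!</i>

<!-- Version 2: all on one line -->
Hello, <i>World Wide Web!</i>
```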

2.6.2.1 Divisions, paragraphs, and line breaks

A browser takes the text in the body of your document and "flows" it onto the computer screen, disregarding any common carriage-return or line-feed characters in the source. The browser fills as much of each line of the display window as possible, beginning flush against the left margin, before stopping after the rightmost word and moving on to the next line. Resize the browser window, and the text reflows to fill the new space, indicating HTML's inherent flexibility.

Of course, readers would rebel if your text just ran on and on, so HTML and XHTML provide both explicit and implicit ways to control the basic structure of your document. The most rudimentary and common ways are with the division (<div>), paragraph (<p>), and line-break (<br>) tags. All break the text flow, which consequently restarts on a new line. The differences are that the <div> and <p> tags define an elemental region of the document and text, respectively, the contents of which you may specially align within the browser window, apply text styles to, and alter with other block-related features.

Without special alignment attributes, the <div> and <br> tags simply break a line of text and place subsequent characters on the next line. The <p> tag adds more vertical space after the line break than either the <div> or <br> tags. [Section 4.1.1] [Section 4.1.2] [Section 4.6.1]

By the way, the HTML standard includes end tags for the paragraph and division tags, but not for the line-break tag.^[4] Few authors ever include the paragraph end tag in their documents; the browser usually can figure out where one paragraph ends and another begins.^[5] Give yourself a star if you knew that </p> even exists.

^[4] With XHTML, <br>'s start and end are between the same brackets: <br />. Browsers tend to be very forgiving and often ignore extraneous things, such as the forward slash in this case, so it's perfectly okay to get into the habit of adding that end-mark.

^[5] The paragraph end tag is being used more commonly now that the popular browsers support the paragraph-alignment attribute.
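The following sketch pulls these structural tags together. The text is made up, and the align attribute values shown are just two of the possibilities; expect the exact spacing to vary from browser to browser.

```html
<div align="center">
A division, centered in the browser window
</div>

<p>
A first paragraph, with its end tag included.
</p>

<p align="right">
A second paragraph, aligned to the right,<br>
with a line break in the middle.
</p>

<!-- The XHTML way to write the same line break -->
One line<br />
and the next
```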

2.6.2.2 Headings

Besides breaking your text into divisions and paragraphs, you can also organize your documents into sections with headings. Just as they do on this and other pages in this printed book, headings not only divide and entitle discrete passages of text, they also convey meaning visually. And headings readily lend themselves to machine-automated processing of your documents.

There are six heading tags, <h1> through <h6>, with corresponding end tags. Typically, the browser displays their contents in, respectively, very large to very small font sizes, and usually in boldface. The text inside the <h4> tag typically is the same size as the regular text. [Section 4.2.1]

The heading tags also break the current text flow, standing alone on lines and separated from surrounding text, even though there aren't any explicit paragraph or line-break tags before or after a heading.
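As an illustration, a document outline built from headings might be sketched like this. The section titles are borrowed from or modeled on the kumquat examples used elsewhere in this chapter.

```html
<h1>Kumquat Archive</h1>
This text starts on its own line, below the heading,
even though no paragraph or line-break tag precedes it.

<h2>Recipes</h2>

<h3>Kumquat Stew Recipes</h3>
And so does this text.
```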

2.6.2.3 Horizontal rules

Besides headings, HTML and XHTML provide horizontal rule lines that help delineate and separate the sections of your document.

When the browser encounters an <hr> tag in your document, it breaks the flow of text and draws a line across the display window on a new line. The flow of text resumes immediately below the rule.^[6] [Section 5.1.1]

^[6] Similar to <br>, with XHTML the formal horizontal rule tag is <hr />.
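A minimal sketch, with placeholder text of our own, shows a rule separating two sections:

```html
Kumquat Stew Recipes
<hr>
Kumquat Pie Recipes

<!-- In an XHTML document, write the rule as <hr /> instead -->
```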

2.6.2.4 Preformatted text

Occasionally, you'll want the browser to display a block of text as-is: for example, with indented lines and vertically aligned letters or numbers that don't change even though the browser window might get resized. The <pre> tag rises to those occasions. All text up to the closing </pre> end tag appears in the browser window exactly as you type it, including carriage returns, line feeds, and leading, trailing, and intervening spaces. Although very useful for tables and forms, <pre> text looks pretty dull; the popular browsers render the block in a monospace typeface. [Section 4.6.5]
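For example, a small price table might be typed as preformatted text like this. The items and prices are invented purely to show the column alignment, which depends on the monospace typeface.

```html
<pre>
Item          Price
-----------   -----
Kumquats      $1.50
Marmalade     $4.25
</pre>
```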

2.7 Hyperlinks

While text may be the meat and bones of an HTML or XHTML document, the heart is hypertext. Hypertext gives users the ability to retrieve and display a different document in their own or someone else's collection simply by a click of the keyboard or mouse on an associated word or phrase (hyperlink) in the document. Use these interactive hyperlinks to help readers easily navigate and find information in your own or others' collections of otherwise separate documents in a variety of formats, including multimedia, HTML, XHTML, other XML, and plain ASCII text. Hyperlinks literally bring the wealth of knowledge on the whole Internet to the tip of the mouse pointer.

To include a hyperlink to some other document in your own collection or on a server in Timbuktu, all you need to know is the document's unique address and how to drop an anchor into your document.

2.7.1 URLs

While it is hard to believe, given the millions, perhaps billions, of them out there, every document and resource on the Internet has a unique address, known as its uniform resource locator (URL; commonly pronounced "you-are-ell"). A URL consists of the document's name preceded by the hierarchy of directory names in which the file is stored (pathname), the Internet domain name of the server that hosts the file, and the software and manner by which the browser and the document's host server communicate to exchange the document (protocol):

protocol://server_domain_name/pathname

Here are some sample URLs:

http://www.kumquat.com/docs/catalog/price_list.html

price_list.html

http://www.kumquat.com/

ftp://ftp.netcom.com/pub/

The first example is an absolute or complete URL. It includes every part of the URL format: protocol, server, and the pathname of the document. While absolute URLs leave nothing to the imagination, they can lead to big headaches when you move documents to another directory or server. Fortunately, browsers also let you use relative URLs and automatically fill in any missing portions with respective parts from the current document's base URL. The second example is the simplest relative URL of all; with it, the browser assumes that the price_list.html document is located on the same server, in the same directory as the current document, and uses the same network protocol.

You can also leave off the document's name if you don't know it. The third URL example, for instance, names only a protocol and a server, and so points to kumquat.com's web home page. It leaves it up to the kumquat server to decide what file to send along. Typically, the server delivers a default file in the directory, often one named index.html, or simply a listing of the directory's contents.

Although appearances may deceive, the last FTP example URL actually is absolute; it points directly at the contents of the /pub directory.
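To show how the browser fills in the missing pieces, here is a sketch that assumes the current document lives at http://www.kumquat.com/docs/catalog/index.html. That base URL, the recipes.html file, and the link text are our own assumptions for illustration; the comments give the absolute URL each relative one would resolve to.

```html
<!-- Assumed URL of the current document:
     http://www.kumquat.com/docs/catalog/index.html -->

<!-- Same server, same directory:
     http://www.kumquat.com/docs/catalog/price_list.html -->
<a href="price_list.html">Price list</a>

<!-- Same server, one directory up:
     http://www.kumquat.com/docs/recipes.html -->
<a href="../recipes.html">Recipes</a>

<!-- Same server, pathname starting at the top directory:
     http://www.kumquat.com/index.html -->
<a href="/index.html">Home page</a>

<!-- Absolute: nothing is filled in from the current document -->
<a href="ftp://ftp.netcom.com/pub/">Public FTP files</a>
```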

2.7.2 Anchors

The anchor (<a>) tag is the HTML/XHTML feature for defining both the source and the destination of a hyperlink.^[7] You'll most often see and use the <a> tag with its href attribute to define a source hyperlink. The value of the href attribute is the URL of the destination.

^[7] The nomenclature here is a bit unfortunate: the "anchor" tag should mark just a destination, not the jumping-off point of a hyperlink, too. You "drop anchor"; you don't jump off one. We won't even mention the atrociously confusing terminology the W3C uses for the various parts of a hyperlink, except to say that someone got things all "bass ackwards."

The contents of the source <a> tag (the words and/or images between it and its </a> end tag) are the portion of the document that is specially activated in the browser display and that users select to take a hyperlink. These anchor contents usually look different from the surrounding content (text in a different color or underlined, images with specially colored borders, or other effects), and the mouse-pointer icon changes when passed over them. The <a> tag contents, therefore, should be text or an image (icons are great) that explicitly or intuitively tells users where the hyperlink will take them. [Section 6.3.1]

For instance, the browser will specially display and change the mouse pointer when it passes over the "Kumquat Archive" text in the following example:

For more information on kumquats, visit 

<a href="http://www.kumquat.com/archive.html">

Kumquat Archive</a>

If the user clicks the mouse button on that text, the browser automatically retrieves from the server www.kumquat.com a web (http:) page named archive.html, then displays it for the user.

2.7.3 Hyperlink Names and Navigation

Pointing to another document in some collection somewhere on the other side of the world is not only cool, it also supports your own web documents. Yet the hyperlink's chief duty is to help users navigate your collection in their search for valuable information. Hence, the concept of the home page and supporting documents has arisen.

None of your documents should run on and on. First, there's a serious performance issue: the value of your work suffers, no matter how rich it is, if the document takes forever to download and if, once it is retrieved, users must endlessly scroll up and down through the display to find a particular section.

Rather, design your work as a collection of several compact and succinct pages, like chapters in a book, each focused on a particular topic for quick selection and browsing by the user. Then use hyperlinks to organize that collection.

For instance, use your home page — the leading document of the collection — as a master index full of brief descriptions and respective hyperlinks to the rest of your collection.

You can also use either the name variant of the <a> tag or the id attribute of nearly all tags to specially identify sections of your document. Tag ids and name anchors serve as internal hyperlink targets in your documents to help users easily navigate within the same document or jump to a particular section within another document. Refer to that id'd section in a hyperlink by appending a pound sign (#) and the section name as the suffix to the URL.

For instance, to reference a specific topic in an archive, such as "Kumquat Stew Recipes" in our example Kumquat Archive, first mark the section title with an id:

...preceding content... 

<h3 id="Stews">Kumquat Stew Recipes</h3>

in the same or another document, then prepare a source hyperlink that points directly to those recipes by including the section's id value as a suffix to the document's URL, separated by a pound sign:

For more information on kumquats, visit 

<a href="http://www.kumquat.com/archive.html">

  Kumquat Archive</a>, 

and perhaps try one or two of

<a href="http://www.kumquat.com/archive.html#Stews"> 

  Kumquat Stew Recipes</a>.

If selected by the user, the latter hyperlink causes the browser to download the archive.html document and start the display at our "Stews" section.
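The same technique works within a single document. As a sketch, assuming both fragments below sit inside archive.html, a hyperlink whose URL is nothing but the pound sign and a section name jumps to that section without loading another document. The "Pies" section is hypothetical and is included only to show the name variant of the <a> tag.

```html
<!-- Near the top of archive.html: links to sections further down -->
<a href="#Stews">Jump to the stew recipes</a>
<a href="#Pies">Jump to the pie recipes</a>

<!-- Further down: a target marked with an id -->
<h3 id="Stews">Kumquat Stew Recipes</h3>

<!-- Another target, marked with the name variant of the anchor tag -->
<h3><a name="Pies">Kumquat Pie Recipes</a></h3>
```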

2.7.4 Anchors Beyond

Hyperlinks are not limited to other HTML and XHTML documents. Anchors let you point to nearly any type of document available over the Internet, including other Internet services.

However, "let" and "enable" are two different things. Browsers can manage the various Internet services, like FTP and Gopher, so that users can download non-HTML documents. They don't yet fully or gracefully handle multimedia.

Today, there are few standards for the many types and formats of multimedia. Computer systems connected to the Web vary wildly in their abilities to display those sound and video formats. Except for some graphics images, standard HTML/XHTML gives you no specific provision for display of multimedia documents except the ability to reference one in an anchor. The browser, which retrieves the multimedia document, must activate a special helper application, download and execute an associated applet, or have a plug-in accessory installed to decode and display it for the user right within the document's display.

Although HTML and most web browsers currently avoid the confusion by sidestepping it, that doesn't mean you can't or shouldn't exploit multimedia in your documents: just be aware of the limitations.
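For instance, anchors to a non-web service and to a multimedia file might be sketched as follows. The movie file harvest.mov is hypothetical, and whether it plays at all depends on the helper applications or plug-ins installed on the user's computer.

```html
<!-- A link to another Internet service -->
<a href="ftp://ftp.netcom.com/pub/">Browse the public FTP directory</a>

<!-- A link to a multimedia document; the browser retrieves it and
     hands it to whatever helper or plug-in the user has installed -->
<a href="harvest.mov">Watch the kumquat harvest (movie file)</a>
```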

2.8 Images Are Special

Image files are multimedia elements that you can reference with anchors in your document for separate download and display by the browser. But, unlike other multimedia, standard HTML and XHTML have an explicit provision for image display "inline" with the text, and images can serve as intricate maps of hyperlinks. That's because there is some consensus in the industry concerning image file formats — specifically, GIF and JPEG — and the graphical browsers have built-in decoders that integrate those image types into your document.^[8]

^[8] Some browsers support other multimedia besides GIF and JPEG graphics for inline display. Internet Explorer, for instance, supports a tag (<bgsound>) that plays background audio. In addition, the HTML 4 and XHTML standards provide a way to display other types of multimedia inline with document text through a general tag (<object>).

2.8.1 Inline Images

The HTML/XHTML tag for inline images is <img>; its required src attribute is the URL of the GIF or JPEG image you want to insert in the document. [<img>]

The browser separately loads images and places them into the text flow as if the image were some special, albeit sometimes very large, character. Normally, that means the browser aligns the bottom of the image to the bottom of the current line of text. You can change that with the special <img> align attribute, whose value you set to put the image at the top, middle, or bottom of adjacent text. Examine Figure 2-2 through Figure 2-4 for the image alignment you prefer. Of course, wide images may take up the whole line and hence break the text flow. You can also place an image by itself, by including preceding and following division, paragraph, or line-break tags.
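A short sketch of an inline image with explicit alignment follows. The file name kumquat.gif is a placeholder, and the alt attribute (alternative text for browsers that cannot show the image) is an addition not discussed above.

```html
<!-- "kumquat.gif" is a hypothetical image file -->
<p>
  <img src="kumquat.gif" align="middle" alt="A kumquat">
  This text lines up with the middle of the image.
</p>
```

Changing the align value to top or bottom moves the adjacent text accordingly. In XHTML, the tag would be closed as <img ... />.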

Experienced HTML authors use images not only as supporting illustrations, but also as quite small inline characters or glyphs, added to aid browsing readers' eyes and to highlight sections of the documents. Veteran HTML authors^[9] commonly add custom list bullets or more distinctive section dividers than the conventional horizontal rules. Images, too, may be included in a hyperlink, so that users may select an inline thumbnail sketch to download a full-screen image. The possibilities with inline images are endless.
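For instance, a thumbnail that links to a larger version of the same picture could be sketched like this. Both file names are assumptions made up for the example.

```html
<!-- Hypothetical file names: a small GIF thumbnail linking to a full-size JPEG -->
<a href="kumquat-full.jpg">
  <img src="kumquat-thumb.gif" alt="Kumquat thumbnail; select for the full-size image">
</a>
```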

^[9] XHTML is too new to call anyone a veteran or experienced XHTML author.

2.8.2 Image Maps

Image maps are images, marked with a special attribute, that may contain more than one hyperlink.

One way to enable an image map is by adding the ismap attribute to an <img> tag placed inside an anchor tag (<a>). When the user clicks somewhere in the image, the graphical browser sends the relative x,y coordinates of the mouse position to the server that is also designated in the anchor. A special server program then translates the image coordinates into some special action, such as downloading another document. [Section 6.5.1.1]
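A server-side image map of this kind might be written roughly as follows. The program path /cgi-bin/imagemap/region and the image file name are hypothetical; the actual URL depends entirely on how your server is set up.

```html
<!-- Hypothetical server program and image; the anchor names the program that receives the click -->
<a href="/cgi-bin/imagemap/region">
  <img src="region-map.gif" ismap alt="Map of the region">
</a>
```

When the user clicks the image, the browser appends the coordinates to the anchor's URL as a query, such as ?x,y, before requesting it. XHTML requires the attribute to be written out as ismap="ismap".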

A good use of an image map might be to locate a hotel while traveling. When the user clicks on a map of the region he intends to visit, your image map's server program might return the names, addresses, and phone numbers of local accommodations.

While they are very powerful and visually appealing, these so-called server-side image maps mean that authors must have some access to the map's coordinate-processing program on the server. Many authors don't even have access to the server, let alone a program on the server. A better solution is to take advantage of client-side image maps.

Rather than depending on a web server, the usemap attribute for the <img> tag, along with the <map> and <area> tags, allows authors to embed all the information the browser needs to process an image map in the same document as the image. Because they reduce network traffic and do not depend on a server program, client-side image maps are popular among document authors and system administrators alike. [Section 6.5.2]
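A client-side version of a regional map could be sketched as shown here. The map name, coordinates, and target documents are all invented for illustration.

```html
<!-- Hypothetical image, map name, regions, and target documents -->
<img src="region-map.gif" usemap="#region" alt="Map of the region">

<map name="region">
  <!-- coords for a rect are left,top,right,bottom in pixels -->
  <area shape="rect" coords="0,0,99,99" href="north.html" alt="Northern hotels">
  <area shape="rect" coords="0,100,99,199" href="south.html" alt="Southern hotels">
</map>
```

The browser matches the click position against each <area> itself and follows the corresponding href, so no coordinate-processing program is needed on the server.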

2.9 Lists, Searchable Documents, and Forms

Thought we'd exhausted text elements? Headers, paragraphs, and line breaks are just the rudimentary text-organizational elements of a document. The languages also provide several advanced text-based structures, including three types of lists, "searchable" documents, and forms. Searchable documents and forms go beyond text formatting, too; they are a way to interact with your readers. Forms let users enter text and click checkboxes and radio buttons to select particular items and then send that information back to the server. Once received, a special server application processes the form's information and responds accordingly; e.g., filling a product order or collecting data for a user survey.^[10]

^[10] The server-side programming required for processing forms is beyond the scope of this book. We give some basic guidelines in the appropriate chapters, but please consult the server documentation and your server administrator for details.

The syntax for these special features and their various attributes can get rather complicated; they're not quick-start grist. We'll mention them here, but we urge you to read on for details in later chapters.

2.9.1 Unordered, Ordered, and Definition Lists

The three types of lists match those we are most familiar with: unordered, ordered, and definition lists. An unordered list — one in which the order of items is not important, such as a laundry or grocery list — gets bounded by <ul> and </ul> tags. Each item in the list, usually a word or short phrase, is marked by the <li> (list-item) tag and, particularly with XHTML, the </li> end tag. When rendered, the list item typically appears indented from the left margin and preceded by a bullet symbol. [<ul>] [<li>]
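A grocery list marked up as an unordered list might look like this minimal sketch; the items are placeholders.

```html
<!-- Placeholder items; end tags included so the fragment also suits XHTML -->
<ul>
  <li>Kumquats</li>
  <li>Sugar</li>
  <li>Canning jars</li>
</ul>
```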

Ordered lists, bounded by the <ol> and </ol> tags, are identical in format to unordered ones, including the <li> tag (and </li> end tag with XHTML) for marking list items. However, the order of items is important — equipment assembly steps, for instance. The browser accordingly displays each item in the list preceded by an ascending number. [<ol>]
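The same markup pattern with <ol> yields numbered steps, as in this sketch with made-up instructions.

```html
<!-- Placeholder steps; the browser supplies the numbers 1, 2, 3 -->
<ol>
  <li>Wash the kumquats.</li>
  <li>Slice them thinly.</li>
  <li>Simmer with sugar until thick.</li>
</ol>
```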

Definition lists are slightly more complicated than unordered and ordered lists. Within a definition list's enclosing <dl> and </dl> tags, each list item has two parts, each with a special tag: a short name or title, contained within a <dt> tag, followed by its corresponding value or definition, denoted by the <dd> tag (XHTML includes respective end tags). When rendered, the browser usually puts the item name on a separate line (although not indented), and the definition, which may include several paragraphs, indented below it. [<dl>]
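A small definition list could be sketched as follows. The terms and definitions are illustrative only.

```html
<!-- Illustrative terms; each <dt> name is followed by its <dd> definition -->
<dl>
  <dt>Anchor</dt>
  <dd>A tag that marks the source or target of a hyperlink.</dd>
  <dt>Inline image</dt>
  <dd>An image the browser places in the text flow of the document.</dd>
</dl>
```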

The various types of lists may contain nearly any type of content normally allowed in the body of the document. So you can organize your collection of digitized family photographs into an ordered list, for example, or put them into a definition list complete with text annotations. The markup language standards even let you put lists inside of lists (nesting), opening up a wealth of interesting combinations.

2.9.2 Searchable Documents

The simplest type of user interaction provided by HTML and XHTML is the searchable document. You create a searchable document by including an <isindex> tag in its header or body. The browser automatically provides some way for the user to type one or more words into a text input box and to pass those keywords to a related processing application on the server.^[11] [<isindex>]

^[11] Few authors have used the tag, apparently. The <isindex> tag has been "deprecated" in HTML Version 4.0 — sent out to pasture, so to speak, but not yet laid to rest.

The processing application on the server uses those keywords to do some special task, such as perform a database search or match the keywords against an authentication list to allow the user special access to some other part of your document collection.
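A bare-bones searchable document might look like the sketch below. The title and prompt text are invented, and the prompt attribute, which replaces the browser's default prompt string, is a detail not covered above.

```html
<!-- Hypothetical title and prompt; <isindex> is deprecated in HTML 4 -->
<html>
<head>
  <title>Kumquat Archive Search</title>
  <isindex prompt="Enter recipe keywords: ">
</head>
<body>
  <p>Type one or more keywords to search the archive.</p>
</body>
</html>
```

The tag itself names no server program; the browser typically sends the keywords back to the document's own URL as a query string, so the server must be configured to handle that request.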

2.9.3 Forms

Obviously, searchable documents are very limited — one per document and only one user-input element. Fortunately, HTML and XHTML provide better, more extensive support for collecting user input through forms.

You can create one or more special form sections in your document, bounded with the <form> and </form> tags. Inside the form, you may put predefined as well as customized text-input boxes allowing for both single and multiline input. You may also insert radio buttons and checkboxes for single- and multiple-choice selections and special buttons that work to reset the form or send its contents to the server. Users fill out the form at their leisure, perhaps after reading the rest of the document, and click a special send button that makes the browser send the form's data to the server. A special server-side program you provide then processes the form and responds accordingly, perhaps by requesting more information from the user, modifying subsequent documents the server sends to the user, and so on. [<form>]

Forms provide everything you might expect of an automated form, including input area labels, integrated contents for instructions, default input values, and so on — except automatic input verification; your server-side program or client-side applets need to perform that function.
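To make the pieces concrete, here is a sketch of a small order form. The action URL /cgi-bin/order and every field name are hypothetical; a real form must point at a program that actually exists on your server.

```html
<!-- Hypothetical action URL and field names -->
<form method="post" action="/cgi-bin/order">
  <p>Name: <input type="text" name="customer" size="30"></p>
  <p>
    <!-- Radio buttons sharing a name allow only one choice -->
    <input type="radio" name="size" value="small" checked> Small
    <input type="radio" name="size" value="large"> Large
  </p>
  <p><input type="checkbox" name="giftwrap" value="yes"> Gift wrap</p>
  <p>Comments:<br>
    <textarea name="comments" rows="3" cols="40"></textarea></p>
  <p>
    <input type="submit" value="Send order">
    <input type="reset" value="Clear form">
  </p>
</form>
```

The checked attribute supplies a default selection. Nothing here verifies what the user typed, so that check remains the job of the server-side program.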

2.10 Tables

For a language that emerged from academia — a world steeped in data — it's not surprising to find that HTML (and now its progeny, XHTML) supports a set of tags for data tables that not only align your numbers but can specially format your text, too.

Five tags enable tables, including the <table> tag itself and a <caption> tag for including a description of the table. Special tag attributes let you change the look and dimensions of the table. You create a table row by row, putting between the table row (<tr>) tag and its end tag (</tr>) either table header (<th>) or table data (<td>) tags and their respective contents for each cell in the table (end tags, too, with XHTML). Headers and data may contain nearly any regular content, including text, images, forms, and even another table. As a result, you can also use tables for advanced text formatting, such as for multicolumn text and sidebar headers (see Figure 2-5). For more information, see Chapter 10.
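A minimal table built row by row might look like the following sketch. The caption and cell values are placeholders, not real data.

```html
<!-- Placeholder contents; border="1" draws visible cell borders -->
<table border="1">
  <caption>Sample order (placeholder values)</caption>
  <tr><th>Item</th><th>Quantity</th></tr>
  <tr><td>Kumquats</td><td>12</td></tr>
  <tr><td>Canning jars</td><td>4</td></tr>
</table>
```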

2.11 Frames

Anyone who has had more than one application window open on her graphical desktop at a time can immediately appreciate the benefits of frames. Frames let you divide the browser window into multiple display areas, each containing a different document.

Figure 2-6 is an example of a frame display. It shows how the document window may be divided into independent windows separated by rule lines and scrollbars. What is not immediately apparent in the example, though, is that each frame displays an independent document, and not necessarily HTML or XHTML ones, either. A frame may contain any valid content that the browser is capable of displaying, including multimedia. If the frame's contents include a hypertext link that the user selects, the new document's contents, even another frame document, may replace that same frame, another frame's content, or the entire browser window.

Frames are defined in a special document, in which you replace the <body> tag with one or more <frameset> tags that tell the browser how to divide its main window into discrete frames. Special <frame> tags go inside the <frameset> tag and point to the documents that go inside the frames.

The individual documents referenced and displayed in the frame document window act independently, to a degree; the frame document controls the entire window. You can, however, direct one frame's document to load new content into another frame. In Figure 2-6, for example, selecting a Chapter hyperlink in the Table of Contents frame has the browser load and display that Chapter's contents in the frame on the right. That way, the Table of Contents is always available to the user as he browses the collection. For more information on frames, see Chapter 11.
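A two-frame layout with a table of contents on the left could be sketched like this. The file names, frame names, and column widths are assumptions chosen for the example.

```html
<!-- Hypothetical frame document: <frameset> takes the place of <body> -->
<html>
<head>
  <title>Kumquat Lore</title>
</head>
<frameset cols="25%,75%">
  <frame src="toc.html" name="contents">
  <frame src="chapter1.html" name="chapter">
</frameset>
</html>
```

Inside the hypothetical toc.html, a link written as <a href="chapter2.html" target="chapter"> would then load its document into the right-hand frame, because the target attribute names that frame.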