HTML 4.0 Formal Public Identifiers (FPI)

HTML 4.0 Strict: Use this when you want really clean markup, free of presentational clutter.
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN" "http://validator.w3.org/sgml-lib/REC-html40-971218/strict.dtd">

HTML 4.0 Transitional: Use this when you need to take advantage of HTML's presentational features because many of your readers don't have the latest browsers that understand CSS.
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://validator.w3.org/sgml-lib/REC-html40-971218/loose.dtd">

HTML 4.0 Frameset: Use this when you want to use HTML Frames to partition the browser window into two or more frames.
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Frameset//EN" "http://validator.w3.org/sgml-lib/REC-html40-971218/frameset.dtd">
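
As a rough sketch, a complete HTML 4.0 Strict document might begin as follows. The title and paragraph text are placeholders invented for illustration; the declaration uses the Strict public identifier and the system identifier given for it above.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN" "http://validator.w3.org/sgml-lib/REC-html40-971218/strict.dtd">
<HTML>
<HEAD>
<TITLE>Placeholder title</TITLE>
</HEAD>
<BODY>
<P>Placeholder content.</P>
</BODY>
</HTML>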


The Differences Between HTML 3.2 and HTML 4.0

A complete description of all elements used in HTML 4.0 can be found at http://www.w3.org/TR/REC-html40/index/elements.html.

The new elements in HTML 4.0 are: ABBR, ACRONYM, BDO, BUTTON, COLGROUP, DEL, FIELDSET, FRAME, FRAMESET, IFRAME, INS, LABEL, LEGEND, NOFRAMES, NOSCRIPT, OBJECT, OPTGROUP, PARAM, SPAN, TBODY, TFOOT, THEAD, and Q.

Deprecated elements

The following elements are deprecated: APPLET, BASEFONT, CENTER, DIR, FONT, ISINDEX, MENU, S, STRIKE, and U.

Obsolete elements

The following elements are obsolete: LISTING, PLAINTEXT, and XMP. For all of them, authors should use the PRE element instead.

Changes to attributes

Almost all attributes that specify the presentation of an HTML document (e.g., colors, alignment, fonts, graphics, etc.) have been deprecated in favor of style sheets. The list of attributes in the appendix indicates which attributes have been deprecated. The id and class attributes allow authors to assign name and class information to elements for style sheets, as anchors, for scripting, for object declarations, for general purpose document processing, etc.

Changes for accessibility

HTML 4.0 features many changes to promote accessibility, including:

  • The title attribute may now be set on virtually every element.
  • Authors may provide long descriptions of tables, images, and frames (see the longdesc attribute).

Changes for meta data

Authors may now specify profiles that provide explanations about meta data specified with the META or LINK elements.

Changes for text

New features for internationalization allow authors to specify text direction and language.

  • The INS and DEL elements allow authors to mark up changes in their documents.
  • The ABBR and ACRONYM elements allow authors to mark up abbreviations and acronyms in their documents.

Changes for links

The id attribute makes any element the destination anchor of a link.
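
For instance, a heading can serve as a link destination without a separate named anchor. This is only a sketch; the id value "section2" and the heading text are made up for illustration.

<H2 id="section2">Section Two</H2>
<P>See <A href="#section2">Section Two</A> for details.</P>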

Changes for tables

The HTML 4.0 table model has grown out of early work on HTML+ and the initial draft of HTML 3.0. The earlier model has been extended in response to requests from information providers as follows:

  • Authors may specify tables that may be incrementally displayed as the user agent receives data.
  • Authors may specify tables that are more accessible to users with non-visual user agents.
  • Authors may specify tables with fixed headers and footers. User agents may take advantage of these when scrolling large tables or rendering tables to paged media.

The HTML 4.0 table model also satisfies requests for optional column-based defaults for alignment properties, more flexibility in specifying table frames and rules, and the ability to align on designated characters. It is expected, however, that style sheets will take over the task of rendering tables in the near future. In addition, a major goal has been to provide backwards compatibility with the widely deployed Netscape implementation of tables. Another goal has been to simplify importing tables conforming to the SGML CALS model. The latest draft makes the align attribute compatible with the latest versions of the most popular browsers. Some clarifications have been made to the role of the dir attribute and recommended behavior when absolute and relative column widths are mixed.

A new element, COLGROUP, has been introduced to allow sets of columns to be grouped with different width and alignment properties specified by one or more COL elements. The semantics of COLGROUP have been clarified over previous drafts, and rules="basic" replaced by rules="groups".

The style attribute is included as a means for extending the properties associated with edges and interiors of groups of cells. Examples include the line style (dotted, double, thin or thick), the color or pattern fill for the interior, cell margins, and font information. This will be the subject of a companion specification on style sheets.

The frame and rules attributes have been modified to avoid SGML name clashes with each other, and to avoid clashes with the align and valign attributes. These changes were additionally motivated by the desire to avoid future problems if this specification is extended to allow frame and rules attributes with other table elements.

Changes for images, objects, and image maps

The OBJECT element allows generic inclusion of objects.

The IFRAME and OBJECT elements allow authors to create embedded documents.

The alt attribute is required on the IMG and AREA elements.

The mechanism for creating image maps now allows authors to create more accessible image maps. The content model of the MAP element has changed for this reason.

Changes for forms

This specification introduces several new attributes and elements that affect forms:

  • The accesskey attribute allows authors to specify direct keyboard access to form controls.
  • The disabled attribute allows authors to make a form control initially insensitive.
  • The readonly attribute allows authors to prohibit changes to a form control.
  • The LABEL element associates a label with a particular form control.
  • The FIELDSET element groups related fields together and, in association with the LEGEND element, can be used to name the group. Both of these new elements allow better rendering and better interactivity. Speech-based browsers can better describe the form and graphic browsers can make labels sensitive.
  • A new set of attributes, in combination with scripts, allow form providers to verify user-entered data.
  • The BUTTON element and INPUT with type set to "button" can be used in combination with scripts to create richer forms.
  • The OPTGROUP element allows authors to group menu options together in a SELECT, which is particularly important for form accessibility.
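
A few of these attributes can be combined in one small form, sketched below. The action URL is borrowed from the BUTTON example in this document; the field names, the account number, and the choice of "U" as the access key are placeholders.

<FORM action="http://somesite.com/prog/adduser" method="post">
<P><LABEL for="user" accesskey="U">User name:</LABEL>
<INPUT type="text" name="user" id="user"><BR>
<LABEL for="account">Account:</LABEL>
<INPUT type="text" name="account" id="account" value="12345" readonly><BR>
<INPUT type="submit" value="Send" disabled></P>
</FORM>

Here the account field can be read but not changed, and the submit button starts out disabled; a script would presumably enable it later.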

Changes for style sheets

HTML 4.0 supports a larger set of media descriptors so that authors may write device-sensitive style sheets.
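
One way to use media descriptors is to link a separate style sheet per medium from the document HEAD. The file names below are hypothetical.

<LINK rel="stylesheet" type="text/css" media="screen" href="screen.css">
<LINK rel="stylesheet" type="text/css" media="print" href="print.css">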

Changes for frames

HTML 4.0 supports frame documents and inline frames.

Changes for scripting

Many elements now feature event attributes that may be coupled with scripts; the script is executed when the event occurs (e.g., when a document is loaded, when the mouse is clicked, etc.).
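
The sketch below attaches scripts to the onload and onclick events. It assumes the user agent treats JavaScript as the default scripting language; the messages and button label are placeholders, and the action URL is borrowed from the BUTTON example in this document.

<BODY onload="alert('Document loaded')">
<FORM action="http://somesite.com/prog/adduser" method="post">
<P><INPUT type="button" value="Greet" onclick="alert('Hello')"></P>
</FORM>
</BODY>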

Changes for internationalization

HTML 4.0 integrates the recommendations of [RFC2070] for the internationalization of HTML. However, this specification and [RFC2070] differ as follows:

  • The accept-charset attribute has been specified for the FORM element rather than the TEXTAREA and INPUT elements.
  • The HTML 4.0 specification makes additional clarifications with respect to the bidirectional algorithm.
  • The use of CDATA to define the SCRIPT and STYLE elements does not preserve the ability to transcode documents, as described in section 2.1 of [RFC2070].

- HTML Reference Specification Appendix A - Changes between HTML 3.2 and HTML 4.0 (http://www.w3.org/TR/REC-html40/appendix/changes.html)


NEW ELEMENTS

This is just a listing of the new elements and some brief examples of their usage. For a more complete description please see The Index of HTML 4.0 Elements and The Index of HTML 4.0 Attributes at the World Wide Web Consortium.

<ABBR> - Abbreviation - </ABBR>
<ACRONYM> - Duh... - </ACRONYM>

The ABBR and ACRONYM elements allow authors to clearly indicate acronyms and abbreviated expressions of various kinds. Western languages make extensive use of acronyms or "initialisms" such as "GmbH", "NATO", and "F.B.I.", as well as abbreviations like "M.", "Inc.", "et al.", "etc.". Both Chinese and Japanese use analogous abbreviation mechanisms, wherein a long name is referred to subsequently with a subset of the Han characters from the original occurrence. All of these expressions can be tagged with ABBR, providing useful information to user agents and tools such as spell checkers, speech synthesizers, translation systems and search-engine indexers. The content of the ABBR element specifies the abbreviated expression itself, as it would normally appear in running text. The title attribute on ABBR may be used to provide the full or expanded form of the expression.

Note that abbreviations and acronyms often have idiosyncratic pronunciations. For example, while "IRS" and "BBC" are typically pronounced letter by letter, "NATO" and "UNESCO" are pronounced phonetically. Still other abbreviated forms (e.g., "URI" and "SQL") are spelled out by some people and pronounced as words by other people. When necessary, authors should use style sheets to specify the pronunciation of an abbreviated form.

<ABBR title="World Wide Web">WWW</ABBR>
<ACRONYM title="Someone Else's Problem">SEP</ACRONYM>
<ABBR lang="fr" title="Soci&eacute;t&eacute; Nationale de Chemins de Fer">SNCF</ABBR>
<ABBR lang="es" title="Do&ntilde;a">D&ntilde;a</ABBR>
<ABBR title="Abbreviation">abbr.</ABBR>


<Q> - Short Inline Quotation, used like BLOCKQUOTE - </Q>

These two elements designate quoted text. BLOCKQUOTE is for long quotations (block-level content) and Q is intended for short quotations (inline content) that don't require paragraph breaks.

This example formats an excerpt from "The Two Towers", by J.R.R. Tolkien, as a blockquote.

<BLOCKQUOTE cite="http://www.servername.com/~tolkien/twotowers.html">
<P>They went in single file, running like hounds on a strong scent, and an eager light was in their eyes. Nearly due west the broad swath of the marching Orcs tramped its ugly slot; the sweet grass of Rohan had been bruised and blackened as they passed.</P>
</BLOCKQUOTE>


Visual user agents generally render BLOCKQUOTE as an indented block.

Visual user agents must add delimiting quotation marks when rendering Q; authors must not put delimiting quotation marks inside a Q element. Furthermore, user agents should add quotation marks in a language-sensitive manner (see the lang attribute). Many languages use different quotation styles for outer and inner quotations, which should be respected by user agents implementing this element.

The World Wide Web Consortium recommends that style sheet implementations provide a mechanism for inserting quotation marks before and after a quotation delimited by BLOCKQUOTE in a manner appropriate to the current language context and the degree of nesting of quotations. However, as some authors have used BLOCKQUOTE merely as a mechanism to indent text, in order to preserve the intention of the authors, user agents should not insert quotation marks in the default style.

The usage of BLOCKQUOTE to indent text is deprecated in favor of style sheets.
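
For comparison with the BLOCKQUOTE example, a short inline quotation marked up with Q might look like this. The quoted proverb is a placeholder; note that the content carries no quotation marks of its own, since the user agent is expected to supply them.

<P>As the proverb goes, <Q lang="en">a stitch in time saves nine</Q>.</P>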

<BDO> - BiDirectional Override - </BDO>

The bidirectional algorithm and the dir attribute generally suffice to manage embedded direction changes. However, some situations may arise when the bidirectional algorithm results in incorrect presentation. The BDO element allows authors to turn off the bidirectional algorithm for selected fragments of text.

The BDO element should be used in scenarios where absolute control over sequence order is required (e.g., multi-language part numbers). The dir attribute is mandatory for this element.

Authors may also use special Unicode characters to override the bidirectional algorithm: LEFT-TO-RIGHT OVERRIDE (hexadecimal 202D) or RIGHT-TO-LEFT OVERRIDE (hexadecimal 202E). The POP DIRECTIONAL FORMATTING (hexadecimal 202C) character ends either bidirectional override.
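
As an illustration of the part-number case, the sketch below forces a mixed Latin and Hebrew identifier to be displayed strictly left to right. The part number is invented; the Hebrew letters are written as numeric character references so that the example does not depend on the document's character encoding.

<P>Part number: <BDO dir="ltr">ABC-&#1488;&#1489;&#1490;-123</BDO></P>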

<BUTTON> - Push button - </BUTTON>

A BUTTON element whose type is "submit" is very similar to an INPUT element whose type is "submit". They both cause a form to be submitted, but the BUTTON element allows richer presentational possibilities. When a BUTTON whose type is "submit" is selected, the name and value are paired and submitted with the form.

A BUTTON element whose type is "submit" and whose content is an image (e.g., the IMG element) is very similar to an INPUT element whose type is "image". They both cause a form to be submitted, but their presentation is different. In this context, a graphical user agent may render an INPUT element as a "flat" image, and render a BUTTON as a button (e.g., with relief and an up/down motion when clicked).

The following example uses BUTTON elements, rather than INPUT elements, to create the submit and reset buttons of a form. The buttons contain images by way of the IMG element.

<FORM action="http://somesite.com/prog/adduser" method="post"><P>
First name: <INPUT type="text" name="firstname"><BR>
Last name: <INPUT type="text" name="lastname"><BR>
Email: <INPUT type="text" name="email"><BR>
<INPUT type="radio" name="sex" value="Male"> Male<BR>
<INPUT type="radio" name="sex" value="Female"> Female<BR>
<BUTTON name="submit" value="submit" type="submit">
Send<IMG src="/icons/wow.gif" alt="wow"></BUTTON>
<BUTTON name="reset" type="reset">Reset<IMG src="/icons/oops.gif" alt="oops"></BUTTON></P>
</FORM>


<FIELDSET> - Form control group - </FIELDSET>
<LEGEND> - Fieldset legend - </LEGEND>

The FIELDSET element allows form designers to group thematically related controls and labels. Grouping controls makes it easier for users to understand their purpose while simultaneously facilitating tabbing navigation for visual user agents and speech navigation for speech-oriented user agents. The proper use of this element makes documents more accessible to people with disabilities.

The LEGEND element allows authors to assign a caption to a FIELDSET. The legend improves accessibility when the FIELDSET is rendered non-visually.
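
A minimal sketch of a grouped set of controls follows. The legend text and field names are placeholders, and the action URL is borrowed from the BUTTON example in this document.

<FORM action="http://somesite.com/prog/adduser" method="post">
<FIELDSET>
<LEGEND>Contact details</LEGEND>
<LABEL for="email">Email:</LABEL> <INPUT type="text" name="email" id="email"><BR>
<LABEL for="tel">Telephone:</LABEL> <INPUT type="text" name="tel" id="tel">
</FIELDSET>
<P><INPUT type="submit" value="Send"></P>
</FORM>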

<COLGROUP> - Column group - </COLGROUP>
<TBODY> - Table body - </TBODY>
<TFOOT> - Table footer - </TFOOT>
<THEAD> - Table header - </THEAD>

Information about extended markup for tables in HTML 4.0 can be found at http://www.w3.org/TR/PR-html40/struct/tables.html.
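
The following sketch shows how these elements might fit together; the regions and figures are invented. Note that TFOOT comes before TBODY in the source, which lets a user agent render the footer before it has received all of the body rows.

<TABLE summary="Quarterly sales by region" border="1" rules="groups">
<COLGROUP>
<COL width="120">
<COLGROUP span="2" align="right">
<THEAD>
<TR><TH>Region<TH>Q1<TH>Q2
<TFOOT>
<TR><TH>Total<TD>30<TD>50
<TBODY>
<TR><TH>North<TD>10<TD>20
<TR><TH>South<TD>20<TD>30
</TABLE>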

<DEL> - Deleted text - </DEL>
<INS> - Inserted text - </INS>

INS and DEL are used to mark up sections of the document that have been inserted or deleted with respect to a different version of a document (e.g., in draft legislation where lawmakers need to view the changes).

These two elements are unusual for HTML in that they may serve as either block-level or inline elements (but not both). They may contain one or more words within a paragraph or contain one or more block-level elements such as paragraphs, lists and tables.

This example could be from a bill to change the legislation for how many deputies a County Sheriff can employ from 3 to 5.

<P>
A Sheriff can employ <DEL>3</DEL><INS>5</INS> deputies.
</P>


The INS and DEL elements must not contain block-level content when these elements behave as inline elements.

ILLEGAL EXAMPLE:

The following is not considered legal HTML.

<P><INS><DIV>...block-level content...</DIV></INS></P>


User agents should render inserted and deleted text in ways that make the change obvious. For instance, inserted text may appear in a special font, deleted text may not be shown at all or be shown as struck-through or with special markings, etc.

The datetime attribute of INS and DEL records the date and time of the change. Both of the following values correspond to November 5, 1994, 8:15:30 am, US Eastern Standard Time.

1994-11-05T13:15:30Z

1994-11-05T08:15:30-05:00

Used with INS, this gives:

<INS datetime="1994-11-05T08:15:30-05:00" cite="http://www.foo.org/mydoc/comments.html">Furthermore, the latest figures from the marketing department suggest that such practice is on the rise.</INS>


The document "http://www.foo.org/mydoc/comments.html" would contain comments about why information was inserted into the document.

Authors may also make comments about inserted or deleted text by means of the title attribute for the INS and DEL elements. User agents may present this information to the user (e.g., as a popup note). For example:

<INS datetime="1994-11-05T08:15:30-05:00" title="Changed as a result of Steve B's comments in meeting.">Furthermore, the latest figures from the marketing department suggest that such practice is on the rise.</INS>


<FRAME> - Subwindow - </FRAME>
<FRAMESET> - Window Subdivision - </FRAMESET>
<IFRAME> - Inline Subwindow - </IFRAME>
<NOFRAMES> - Alternate content for non-frame based rendering - </NOFRAMES>

Information about extended markup for frames in HTML 4.0 can be found at http://www.w3.org/TR/PR-html40/present/frames.html.
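
As a rough sketch, a two-column frameset document with fallback content could look like this. The file names, frame names, and title are hypothetical.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Frameset//EN" "http://validator.w3.org/sgml-lib/REC-html40-971218/frameset.dtd">
<HTML>
<HEAD>
<TITLE>Placeholder frameset</TITLE>
</HEAD>
<FRAMESET cols="25%,75%">
<FRAME src="contents.html" name="contents">
<FRAME src="main.html" name="main">
<NOFRAMES>
<BODY>
<P>This document uses frames. See the <A href="contents.html">table of contents</A>.</P>
</BODY>
</NOFRAMES>
</FRAMESET>
</HTML>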

<LABEL> - Form field label - </LABEL>

The LABEL element may be used to attach information to control elements. Each LABEL element is associated with exactly one form control.

To associate a label with a control explicitly, set the for attribute of the LABEL to the value of that control's id attribute.

This example creates a table that is used to align two INPUT controls and their associated labels. Each label is associated explicitly with one of the INPUT elements.

<FORM action="..." method="post">
<TABLE>
<TR>
<TD><LABEL for="fname">First Name</LABEL>
<TD><INPUT type="text" name="firstname" id="fname">
<TR>
<TD><LABEL for="lname">Last Name</LABEL>
<TD><INPUT type="text" name="lastname" id="lname">
</TABLE>
</FORM>


<NOSCRIPT> - Alternate content for non-script based rendering - </NOSCRIPT>

The NOSCRIPT element allows authors to provide alternate content when a script is not executed. The content of a NOSCRIPT element should only be rendered by a script-aware user agent in the following cases:

  • The user agent is configured not to evaluate scripts.
  • The user agent doesn't support a scripting language invoked by a SCRIPT element earlier in the document.

User agents that do not support client-side scripts must render this element's contents.

In the following example, a user agent that executes the SCRIPT will include some dynamically created data in the document. If the user agent doesn't support scripts, the user may still retrieve the data through a link.

<SCRIPT type="text/tcl">
...some Tcl script to insert data...
</SCRIPT>
<NOSCRIPT>
<P>Access the <A href="http://someplace.com/data">data.</A>
</NOSCRIPT>


<OBJECT> - Generic embedded object - </OBJECT>

Most user agents have built-in mechanisms for rendering common data types such as text, GIF images, colors, fonts, and a handful of graphic elements. To render data types they don't support natively, user agents generally run external applications. The OBJECT element allows authors to control whether data should be rendered externally or by some program, specified by the author, that renders the data within the user agent.

In the most general case, an author may need to specify three types of information:

  • The implementation of the included object. For instance, if the included object is a clock applet, the author must indicate the location of the applet's executable code.
  • The data to be rendered. For instance, if the included object is a program that renders font data, the author must indicate the location of that data.
  • Additional values required by the object at run-time. For example, some applets may require initial values for parameters.

The OBJECT element allows authors to specify all three types of data, but authors may not have to specify all three at once. For example, some objects may not require data (e.g., a self-contained applet that performs a small animation). Others may not require run-time initialization. Still others may not require additional implementation information, i.e., the user agent itself may already know how to render that type of data (e.g., GIF images).

Authors specify an object's implementation and the location of the data to be rendered via the OBJECT element. To specify run-time values, however, authors use the PARAM element, which is discussed in the section on object initialization (http://www.w3.org/TR/PR-html40/struct/objects.html#object-init).
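
In the simplest case, where the user agent already knows how to render the data, only the data location and its type are needed. The sketch below embeds an image this way; the file name and the alternate text are placeholders. The content of the OBJECT element is rendered if the user agent cannot display the image.

<P><OBJECT data="images/family.png" type="image/png">
A photo of my family at the lake.
</OBJECT></P>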

<OPTGROUP> - Option group - </OPTGROUP>

The OPTGROUP element allows authors to group choices into a hierarchy. This is particularly helpful to non-visual user agents when the user has many options to choose from; long flat lists are hard to remember. It is generally easier to grasp hierarchical groupings of choices, for instance by expanding and collapsing levels of detail.
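
A small sketch of grouped menu options follows. The control name, group labels, and option values are invented for illustration.

<SELECT name="food">
<OPTGROUP label="Fruit">
<OPTION value="apple">Apple
<OPTION value="pear">Pear
</OPTGROUP>
<OPTGROUP label="Vegetables">
<OPTION value="carrot">Carrot
<OPTION value="leek">Leek
</OPTGROUP>
</SELECT>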

<PARAM> - Named property value

PARAM elements specify a set of values that may be required by an object at run-time. Any number of PARAM elements may appear in the content of an OBJECT or APPLET element, in any order, but must be placed at the start of the content of the enclosing OBJECT or APPLET element.

The syntax of names and values is assumed to be understood by the object's implementation. The HTML specification does not specify how user agents should retrieve name/value pairs nor how they should interpret parameter names that appear twice.

In the following example, run-time data for the object's "Init_values" parameter is specified as an external resource (a GIF file). The value of the valuetype attribute is thus set to "ref" and the value is a URL designating the resource.

<P><OBJECT classid="http://www.gifstuff.com/gifappli" standby="Loading Elvis...">
<PARAM name="Init_values"
value="./images/elvis.gif"
valuetype="ref">

</OBJECT>


Note that we have also set the standby attribute so that the user agent may display a message while the rendering mechanism loads.

<SPAN> - Generic language/style container - </SPAN>

The DIV and SPAN elements, in conjunction with the id and class attributes, offer a generic mechanism for adding structure to documents. These elements define content to be inline (SPAN) or block-level (DIV) but impose no other presentational idioms on the content. Thus, authors may use these elements in conjunction with style sheets, the lang attribute, etc., to tailor HTML to their own needs and tastes.

Suppose, for example, that we wanted to generate an HTML document based on a database of client information. Since HTML does not include elements that identify objects such as "client", "telephone number", "email address", etc., we use DIV and SPAN to achieve the desired structural and presentational effects. We might use the TABLE element as follows to structure the information:

<!-- Example of data from the client database: -->
<!-- Name: Stephane Boyera, Tel: (212) 555-1212, Email: [email protected] -->
<DIV id="client-boyera" class="client">
<P><SPAN class="client-title">Client information:</SPAN>
<TABLE class="client-data">
<TR><TH>Last name:<TD>Boyera</TR>
<TR><TH>First name:<TD>Stephane</TR>
<TR><TH>Tel:<TD>(212) 555-1212</TR>
<TR><TH>Email:<TD>[email protected]</TR>
</TABLE>
</DIV>
<DIV id="client-lafon" class="client">
<P><SPAN class="client-title">Client information:</SPAN>
<TABLE class="client-data">
<TR><TH>Last name:<TD>Lafon</TR>
<TR><TH>First name:<TD>Yves</TR>
<TR><TH>Tel:<TD>(617) 555-1212</TR>
<TR><TH>Email:<TD>[email protected]</TR>
</TABLE>
</DIV>


Later, we may easily add style sheet declarations to fine-tune the presentation of these database entries.
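
Such declarations might look like the following sketch, placed in the document HEAD. The class names come from the example above; the particular properties chosen are only assumptions about what an author might want.

<STYLE type="text/css">
DIV.client { margin-bottom: 1em }
SPAN.client-title { font-weight: bold }
TABLE.client-data TH { text-align: left }
</STYLE>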