How to Drive Yourself Crazy or Too Many Details About HTML Validation

HTML validation is an automated process in which a program examines your finished HTML code, compares it to all the rules it knows about exactly how HTML is supposed to be written, and then tells you exactly where the errors are in your code. You might think that this sounds like a very useful process, and for the most part it is. The big difficulty arises when you discover that most of the publicly available validation services (Validator.W3.org is the big one) don't do a very good job of telling you exactly why your errors are errors. This is compounded many times over if you don't have a very clear understanding of Document Type Definitions.

In order for an HTML document to be accepted by a validator, it must have a <!DOCTYPE> declaration containing a Formal Public Identifier (FPI). The declaration precedes the opening <HTML> tag, at the very top of the document, before anything else. It specifies exactly which version of HTML is being used in the document, and identifies the location of the Document Type Definition, which is used by the validator. The declaration for the documents in this site looks like this:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/1998/REC-html40-19980424/loose.dtd">

This identifies the "flavour" of HTML that is used in this site as "HTML 4.0 Transitional", and specifies the location of the HTML 4.0 Transitional DTD.

The Document Type Definition (DTD) is basically a set of rules which define exactly how HTML of that specific version is to be used: which attributes are valid, which values can be used with which attributes, and the order in which tags can appear. You don't have to understand exactly what the DTD says, but if you write code which does not follow the rules set down by the DTD, your code will not validate, and the validator will not do more than point out where your errors are.
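As a rough illustration of where the declaration sits, here is a minimal sketch of a complete document that uses the same DOCTYPE as this site. The title and paragraph text are placeholders invented for this example, and the sketch is only meant to show the ordering: the declaration first, then the opening <HTML> tag, then everything else.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/1998/REC-html40-19980424/loose.dtd">
<HTML>
<HEAD>
<!-- Placeholder title; HTML 4.0 requires a TITLE element inside HEAD -->
<TITLE>Example page</TITLE>
</HEAD>
<BODY>
<!-- Placeholder content for illustration only -->
<P>Hello, validator.</P>
</BODY>
</HTML>
```

If the declaration is missing, or if anything appears before it, a validator may not be able to tell which set of rules to check the rest of the document against.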
A very good example of this can be seen in a web site I designed early in my career, which contained a <META> tag identifying the character set being used. The tag looked like this:

<META HTTP-EQUIV="Content-Type" CONTENT="text/html" CHARSET="iso-8859-1">

The response I got from the validation service when I tried to validate this document was "There is no such attribute: CHARSET". Now, I knew that it was possible to do this. I had seen META tags which indicate the character set being used in the document, but apparently I hadn't looked at such a tag closely enough before I tried to use one. In reality, the tag should look like this:

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; CHARSET=iso-8859-1">

The difference is very subtle: the quotation marks are in slightly different positions, so that the character set becomes part of the CONTENT value rather than a separate attribute. The point is that the validation service was correct, there is no such attribute as CHARSET, but it didn't tell me that my quotation marks were in the wrong place, and it took me quite a while to track down exactly why my code was generating such an error.

What it comes down to is that, in general, validation isn't necessary. I don't want to minimize the value of validation, because it is, in fact, very useful, and very educational if nothing else. But you don't need to be able to write 100% perfect code all the time, and while you're learning HTML, validation can be, and often is, more confusing than it is helpful, primarily because the error messages returned by validation services are so incomprehensible.

Here are some good sources of more information about HTML validation: