Below are the criteria by which I judge the quality of a site's HTML/CSS and its
accessibility. The evaluation starts with a numerical analysis of various features and then
weighs the meaning of those numbers within the given context. There are few absolutes, but
neither validation errors nor <font> tags should exist.
The whole purpose of the HTML/CSS distinction is to separate structure from presentation through
judicious use of CSS in combination with well-structured HTML. HTML provides a semantic and
logical structure to the file while the CSS provides the appearance you want to achieve. The
separation also makes it possible to change the appearance of an element throughout a site from
one location, the CSS file(s), rather than having to edit each HTML file or some collection
of include files. This process also simplifies meeting accessibility standards.
For my own part, all code will validate as correct HTML and CSS and will address accessibility
issues. I use XHTML 1.0 Strict unless otherwise required. This most restrictive implementation
path ultimately leads to the most flexible, accessible, and easily maintainable code.
There is a glossary at the end of this document in case any term,
and especially any acronym, is unfamiliar.
HTML
- DOCTYPE
- Is there a DOCTYPE declaration and, more
importantly, does the code conform to it? Very often there is a declaration of HTML 4.01
but then the code includes XHTML tag closers on contentless tags (e.g., link, input)
or vice-versa. This throws the validator into a tizzy, producing spurious
errors and masking real ones.
- Validation errors
- With standards well defined and agreed upon, everyone should strive to meet those standards.
Adherence improves accessibility and uniformity across platforms and
user agents. It also reduces the chance of user agents giving
wildly different interpretations to the applied CSS (I've seen some really screwy looking
material when code violated standards).
- Some recent, sophisticated developments in accessibility use techniques
that don't validate under the present system. Those are the only exceptions to
validation that should be accepted.
- <font> tags
- Totally unnecessary with CSS and a heavy-handed and inflexible way of controlling text.
CSS has more ways to achieve more effects.
- Inter-element spacing achieved by using <br> tags, &nbsp; entities, and spacer images
- Also made obsolete by CSS. Margins and padding can be controlled better and margins can
be negative to achieve visual effects like these outdented paragraphs (effect achieved here
through a semantic/HTML method).
- <table> tags
- Generally superseded by <div> and CSS for layout purposes. CSS is far more flexible and
allows the HTML to be structured in a logical fashion for people who use alternate access
methods (e.g., the handicapped); tables often present material in an illogical
order. Any page with more than five tables for layout is using them badly, and I have also seen
a single table used badly. Tabular data should take full advantage of <th>, <colgroup>, and other
structural markup, though colgroup, thead, etc. are not well supported in CSS by
current browsers. We hope that will change in the new releases.
- For a demonstration of the flexibility provided by abandoning tables for layout purposes
see CSS Zen Garden. In the right-hand column you
will find alternate views of the same material—exactly the same material since there is
no difference in the HTML; the apparent changes are in the applied CSS. There are over 200
designs the webmaster has thought worthy of posting; my guess is that many times that number
were rejected. My favorite is “Mozart”, number 189.
- <div> tags
- The preferred structural element, but can also be over-used with multiple nesting levels
just because someone didn't bother to think out structure (tables redux). Machine
generation (e.g., PHP, ColdFusion) often prompts people to stop thinking and overload
a page (especially when using tables but also with divs).
- <h#> tags
- Should be used to give structure to the page and never used solely to size text. They are
needed only if the page has a relevant structure. If used, they must start with h1 and proceed in
proper nesting order. They serve the same purpose as an outline, except that on the
Web the page includes the text that fills in the outline. Accessibility
standards encourage their use (at least <h1>), and a blind informant says he relies heavily
on heading levels.
- For instance, this page uses “HTML and CSS Standards” as its <h1> and the page title,
“Webpage Analysis Criteria”, as its <h2>.
It then also has several <h3> and <h4> tags to organize subsidiary portions.
The size and other display characteristics are controlled by CSS.
- <ul> tags
- Should be used for all menus and for anything else that looks like a list. The fact that
the menu is horizontal or you don't want bullets is irrelevant; we're talking structure,
not appearance. CSS performs the bulletless horizontal magic (see
Zephyr Press for a horizontal menu—top and bottom—and
MGA or
Wheelchair Mobility for a vertical menu whose buttons are solely CSS creations).
- forms
- Is the form restricted in scope to its place of use? People commit one of two form sins: enclosing
the whole page in a form even though the actual form is only a small part of the page, or
breaking the form across multiple structural elements. Avoiding the latter by committing the
former is not a solution. A form whose only purpose is to present a search box should consist
of little more than the associated input tags. Feedback and data entry forms may require
extensive structural elements within the form, but not the whole page. Placing a form so that
it is syntactically correct also seems to be a challenge.
- Skip links
- Extra credit. Provide on-page navigation aids to the handicapped, especially the blind (see
Zephyr Press).
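The HTML criteria above lend themselves to the numerical analysis mentioned at the start. What follows is only a minimal sketch in Python, using the standard library's html.parser; it is not the tool actually used for these evaluations. The names FeatureCounter, heading_problems, and analyse, the list of watched tags, and the sample markup are all assumptions made for illustration. It counts a few of the tags discussed above and checks that headings start with h1 and do not skip levels.

```python
from collections import Counter
from html.parser import HTMLParser


class FeatureCounter(HTMLParser):
    """Counts start tags and records the sequence of heading levels."""

    def __init__(self):
        super().__init__()
        self.tags = Counter()
        self.heading_levels = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names before calling this method.
        self.tags[tag] += 1
        # Only h1..h6 count as headings; "hr" and "head" fail this test.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.heading_levels.append(int(tag[1]))


def heading_problems(levels):
    """Return messages for headings that break outline order."""
    problems = []
    if levels and levels[0] != 1:
        problems.append(f"first heading is h{levels[0]}, not h1")
    for prev, cur in zip(levels, levels[1:]):
        if cur > prev + 1:
            problems.append(f"h{prev} is followed by h{cur} (level skipped)")
    return problems


def analyse(html):
    """Return (counts of watched tags, list of heading problems)."""
    parser = FeatureCounter()
    parser.feed(html)
    parser.close()
    watched = ("font", "br", "table", "div", "ul", "form")  # illustrative choice
    counts = {name: parser.tags[name] for name in watched}
    return counts, heading_problems(parser.heading_levels)


# Hypothetical sample page with several of the faults described above.
sample = """
<html><body>
<h2>Title</h2>
<font size="2">old style</font><br><br>
<table><tr><td>layout</td></tr></table>
<h4>Details</h4>
</body></html>
"""
counts, problems = analyse(sample)
print(counts)
print(problems)
```

The counts are only the starting point; as noted at the top, their meaning still has to be judged in context (one table holding tabular data is fine, one table holding the whole layout is not).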
CSS
- Validation errors
- Not generally a severe problem, but sometimes people invent values or even properties or
use a value that's not valid for that property.
- Efficiency/readability
- This is where WYSInWYG tools truly shine in their stupidity.
I have seen rules like “elementx { border-top: 1px solid red; padding-bottom: 4px; padding-top:
4px; margin-right: 7px; border-left: 1px solid red; margin-left: 7px; border-bottom: 1px
solid red; margin-top: 7px; border-right: 1px solid red; padding-left: 4px; padding-right:
4px; margin-bottom: 7px; }”. I've seen this sort of thing many times; it is not rare (and
often font or other information is included randomly just to add to the confusion). If someone
actually wants to read the code without having to feed it into some compatible WYSInWYG
tool, it will take many minutes to figure out that the rule says “elementx { margin: 7px;
border: 1px solid red; padding: 4px; }”. And the first way probably isn't even the right
way when there are differences in the TRBL values.
- External vs. page-level vs. inline
- As much as possible CSS should be put in external files for sharing across pages. A typical
page should have no inline styles and only a minimum of page-level styles. Inline styles
should be used only when there is a single usage of that style and it is unlikely to be
needed by any other element. Home pages are often different from the rest of a
site; a few page-level styles are okay but extensive CSS should be moved to a home page-specific
CSS file. Remember, an HTML page can reference as many CSS files as required to do the job
and one CSS file can reference others to help provide some coherent structure. I hate trying
to read through CSS files that are 20+ screens long (there are a lot of them) just to find
the code that applies to some limited portion of a page. Some of those bloated CSS files
are a result of the efficiency issue mentioned above, but not all.
- As an example of multiple CSS files, King's College London
uses a different colour scheme for each major portion of the
Website—Undergraduate,
Graduate, Research, etc. This is effected by the invocation of different external
CSS files whose only function is to control issues surrounding colour (background, border,
associated images, etc.). Other CSS files create a consistent appearance across
the whole Website.
- Text size
- Should be specified relatively rather than absolutely, so it can be resized by users (see
browser standards
for basic type information and two scalable methods).
- Class/id names
- Should be chosen to reflect the function of the matter covered and not its appearance (e.g.,
bad: class="bluebox"; good: id="special-note").
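The clean-up described under Efficiency/readability can be sketched mechanically. The Python below is an illustration only, under the assumption that a rule contains just longhand declarations (no existing shorthand mixed in, where declaration order would matter). The helper names parse_declarations and collapse are hypothetical. It merges the four per-side margin, padding, and border declarations into shorthand, using TRBL order when the sides differ.

```python
SIDES = ("top", "right", "bottom", "left")  # TRBL order


def parse_declarations(text):
    """Split 'name: value; name: value;' into a dict (no comments handled)."""
    declarations = {}
    for chunk in text.split(";"):
        if ":" in chunk:
            name, value = chunk.split(":", 1)
            declarations[name.strip()] = value.strip()
    return declarations


def collapse(declarations):
    """Replace complete sets of per-side longhands with one shorthand."""
    result = dict(declarations)
    for prop in ("margin", "padding", "border"):
        keys = [f"{prop}-{side}" for side in SIDES]
        if not all(key in result for key in keys):
            continue  # leave incomplete sets alone
        top, right, bottom, left = (result[key] for key in keys)
        if top == right == bottom == left:
            short = top
        elif prop == "border":
            continue  # the border shorthand cannot express per-side differences
        elif top == bottom and right == left:
            short = f"{top} {right}"
        elif right == left:
            short = f"{top} {right} {bottom}"
        else:
            short = f"{top} {right} {bottom} {left}"
        for key in keys:
            del result[key]
        result[prop] = short
    return result


# The verbose rule body quoted above.
verbose = (
    "border-top: 1px solid red; padding-bottom: 4px; padding-top: 4px; "
    "margin-right: 7px; border-left: 1px solid red; margin-left: 7px; "
    "border-bottom: 1px solid red; margin-top: 7px; border-right: 1px solid red; "
    "padding-left: 4px; padding-right: 4px; margin-bottom: 7px;"
)
compact = collapse(parse_declarations(verbose))
print("; ".join(f"{name}: {value}" for name, value in compact.items()))
```

Twelve declarations reduce to three, which is the same rule a person can read at a glance.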
Some things, like class and id name choices, cannot be quantified. Another is where to place
rules and yet another is how much repetition to tolerate. These are judgment calls where I choose
the highest level that can easily be controlled. For example, people often specify the same
font-family for all paragraphs, headings, and table data when it could be specified in the body
rule (e.g., the MIT home page specifies it eleven times, yet at no point does it use any other
font-family, even the default; ditto WGBH. Actually, these sites have recently updated their
usage or are in the process of doing so, and I need to find other examples).
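Repetition of this kind is easy to tally before making the judgment call. The following is a rough Python sketch, not a real CSS parser: it assumes a stylesheet without comments or at-rules, and the function name font_family_usage and the sample stylesheet are invented for illustration. It maps each font-family value to the selectors that declare it, so a value declared everywhere stands out as a candidate for the body rule.

```python
import re

# Matches "selector { declarations }" for simple, non-nested rules only.
RULE = re.compile(r"([^{}]+)\{([^{}]*)\}")
FONT_FAMILY = re.compile(r"font-family\s*:\s*([^;}]+)", re.IGNORECASE)


def font_family_usage(css):
    """Map each normalised font-family value to the selectors declaring it."""
    usage = {}
    for selector, body in RULE.findall(css):
        match = FONT_FAMILY.search(body)
        if match:
            # Collapse whitespace and case so trivially different spellings match.
            value = " ".join(match.group(1).split()).lower()
            usage.setdefault(value, []).append(selector.strip())
    return usage


# Hypothetical stylesheet repeating one font-family in every rule.
css = """
p  { font-family: Verdana, sans-serif; margin: 0; }
h1 { font-family: Verdana, sans-serif; font-size: 160%; }
td { font-family: verdana,  sans-serif; }
"""
usage = font_family_usage(css)
for value, selectors in usage.items():
    print(f"{value}: declared {len(selectors)} times in {', '.join(selectors)}")
```

If the tally shows a single value across every rule, one declaration in the body rule would do the same job.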
A large CSS file is not, by itself, an indication of good CSS usage. Very often CSS is over-specified
and underutilized. Bloating occurs from things mentioned already as well as creating many more
classes than needed. The BBC has at least seven CSS files, all large, and I couldn't find the
organizing element. One class name can occur in multiple files, making it frighteningly difficult
to find the rule that applies at any given moment. Yet many such sites still have a <body> tag
that specifies the non-standard attributes of marginheight, marginwidth, leftmargin,
and topmargin—which are correctly handled in CSS.
For the basics of CSS usage check A CSS Quick Reference.
JavaScript
As with CSS, as much JavaScript as possible (all functions) should be in external files to reduce
clutter and load times, leaving only the function invocations (onload, etc.) in HTML. JavaScript
also needs to have a workaround for user agents that don't recognize JavaScript or have it turned
off. See J Korpela
for one example of how to fix a common problem and see the rest of the page for more JavaScript
advice, including the statement in its introduction that “Specifically, one should never rely
on JavaScript alone in the processing of data entered by user” (my emphasis).
That being said, the Web is constantly evolving and where it started out as an alternate publishing
medium, it has recently also acquired the function of an alternate application platform. Instead
of writing a document in MSWord or some other word processor and then sending copies (print
or electronic) to interested parties, it is now possible to write the document on a word processor
accessed through the web, have it immediately available to others, and to allow them to contribute
to or modify/edit it themselves. You can also create and submit forms whose contents vary according
to initial and evolving conditions, where JavaScript changes the page dynamically, without going
back to the server. I don't think the standards have caught up with this situation.
Accessibility
The Web Accessibility Initiative (WAI) is the
W3C set of “Strategies, guidelines, resources to make the Web
accessible to people with disabilities.” The method of achieving accessibility is set out
in the Web Content Accessibility Guidelines (WCAG 1.0
– stable – and 2.0 – under development).
WCAG 1.0 consists of fourteen guidelines, each with several checkpoints which are grouped into
three priority levels. Some of these checkpoints can be checked with automated tools while others
must be checked manually.
The U.S. government has standards set forth in Section 508
that government sites and contractors are supposed to follow. They are similar to WAI, but not
as rigorous. The U.K.
also has its own set of standards, as do other governments.
Typically, simply converting from the old, table-based structure to modern structure and validating
the code will reduce the number of accessibility errors and warnings significantly. For instance,
the “alt” attribute for images is required for both validation and WCAG. In addition,
WCAG wants the contents of the alt attribute to be meaningful; that has to be a manual check.
Eliminating the use of images as spacers thus eliminates all associated errors and warnings.
Several online tools make it relatively easy to find and fix many errors.
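The automated half of the alt check is simple enough to sketch. The Python below is an illustration using the standard library's html.parser, not one of the online tools referred to above; the class name AltChecker and the sample markup are assumptions. It separates images with no alt attribute (a validation and WCAG error) from images whose alt is empty, which still need the manual check for whether the image is purely decorative.

```python
from html.parser import HTMLParser


class AltChecker(HTMLParser):
    """Collects the src of images whose alt is missing or empty."""

    def __init__(self):
        super().__init__()
        self.missing = []  # no alt attribute at all
        self.empty = []    # alt present but empty or whitespace only

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src", "(no src)")
        if "alt" not in attributes:
            self.missing.append(src)
        elif not (attributes["alt"] or "").strip():
            # A bare `alt` with no value is reported by the parser as None.
            self.empty.append(src)


# Hypothetical markup: one good image, one spacer without alt, one empty alt.
sample = (
    '<p><img src="logo.png" alt="Company logo">'
    '<img src="spacer.gif">'
    '<img src="rule.gif" alt=""></p>'
)
checker = AltChecker()
checker.feed(sample)
checker.close()
print("missing alt:", checker.missing)
print("empty alt (review manually):", checker.empty)
```

Whether a non-empty alt text is meaningful remains a manual judgment; no script can decide that.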
Graphs
Page graph
Websites_as_Graphs
This tool graphs the tags on an individual page within a website, despite its name. It creates
a tree of color-coded nodes that gives some idea of how the page is put together. For instance,
lots of red says table-based structure and green indicates div-based. Lots of nodes off the
body tag indicate a probable lack of structure. All elements within a form should be clustered
together (apparently harder than one might think). I've also seen pages where the form tag encloses
everything (<body><form>…</form></body>), even though only a few lines,
if any, are the real form. Lots of images may indicate their use as spacers.
I'd like to see an additional color for lists, since they should be a strong structural element.
Not all table tags get colored red; caption, th, thead, tbody, tfoot, col, and colgroup are omitted.
Unbranched chains of red or green indicate nesting that is probably not well thought out and
therefore unnecessary.
What do the colors mean?
- black: the html tag, the root node
- green: the div tag
- red: tables (table, tr, and td
tags; not caption, th, thead, tbody, tfoot, col, or colgroup)
- orange: paragraphs, line breaks, and block quotes (p,
br, and blockquote tags)
- blue: links (the a tag)
- violet: images (the img tag)
- yellow: forms (form, input,
textarea, select, and option tags)
- gray: all other tags
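The color key above amounts to a lookup table from tag name to color. The Python sketch below is not the graphing tool's own code; it is an assumed, minimal re-creation of the mapping, and the names COLOR_OF and tally_colors and the sample page are invented for illustration. It shows how a tally of node colors can give a first impression of a page's structure even without drawing the graph.

```python
from collections import Counter
from html.parser import HTMLParser

# Tag-to-color mapping taken from the key above; anything else is gray.
COLOR_OF = {
    "html": "black",
    "div": "green",
    "table": "red", "tr": "red", "td": "red",
    "p": "orange", "br": "orange", "blockquote": "orange",
    "a": "blue",
    "img": "violet",
    "form": "yellow", "input": "yellow", "textarea": "yellow",
    "select": "yellow", "option": "yellow",
}


class ColorTally(HTMLParser):
    """Counts how many nodes of each color a page would produce."""

    def __init__(self):
        super().__init__()
        self.colors = Counter()

    def handle_starttag(self, tag, attrs):
        self.colors[COLOR_OF.get(tag, "gray")] += 1


def tally_colors(html):
    parser = ColorTally()
    parser.feed(html)
    parser.close()
    return dict(parser.colors)


# Hypothetical page with a nested layout table and one div.
sample = """<html><head><title>t</title></head><body>
<table><tr><td><table><tr><td><a href="#">x</a></td></tr></table></td></tr></table>
<div><p>text<br></p><img src="a.gif" alt=""></div>
</body></html>"""
colors = tally_colors(sample)
print(colors)
```

In this sample red outnumbers green six to one, the signature of table-based layout described above.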
Site graph
Recommendations for a good one gratefully accepted.
Possible candidates: http://www.touchgraph.com/ (would I have to construct the tool myself?)
and VisVIP (still to be checked out).
Glossary
- <…>
- Material enclosed between angle brackets constitutes one of many HTML tags and associated
attributes which control what appears on the computer screen.
- Accessibility
- The concept that Web pages should be structured and constructed in such a way that they
are available to the widest range of people possible regardless of access method. Some accommodations
are directed at hand-held devices or text-only browsers. Others address disabilities ranging
from color-blindness to cognitive and physical impairments (perhaps 10% of the U.S. population).
- CSS
- Cascading Style Sheets—a tri-level system of applying rules to control the appearance of
Web pages. The rules consist of one or more property/value pairs and can be applied to multiple
pages with external files, to a single page, or to a single tag.
A CSS Quick Reference gives the basic outline for use.
- DOCTYPE
- A formal statement at the beginning of a conforming document of its Document Type Definition
(DTD), a rigorous specification for a language, so the user agent knows how to treat what
follows. Failure to include a DOCTYPE leaves the user agent to guess at what parsing rules
to use and how best to display the document.
- Browsers operate in “standards mode” or “quirks mode”
depending on whether a correct DOCTYPE is present. Quirks mode tries to reproduce
the bad old rendering behavior that older pages rely on and that standards mode displays differently.
- HTML
- HyperText Markup Language—the basic language for writing pages that appear on the World
Wide Web (WWW). This includes XHTML (eXtensible HTML), which is an application of XML (eXtensible
Markup Language), a more rigorous definition of how a computer language should be structured.
Until HTML version 4.0 the language did not have a clear definition that most players accepted
and agreed would be the basis for browser and other user agent development.
- Tag (W3C often uses “element” to refer to a tag)
- The basic structural element of HTML which may include several attribute/value pairs to
more precisely control the presentation of a Web page.
- TRBL
- Top, Right, Bottom, Left (TRouBLe—i.e., stay out of trouble by following this sequence);
the sequence for interpreting CSS shorthand properties. For example, the rule “img { margin-left:
5px; margin-right: 2px; margin-top: 10px; margin-bottom: 0px }” is more simply and clearly
written as “img { margin: 10px 2px 0px 5px; }”.
- User agent
- Any device through which a person accesses the Web, whether it be one of the standard browsers,
a handheld device (PDA, cell phone), a text-only browser, or screen reader or tactile device
for the blind (list not exhaustive).
- Validation
- The process of measuring HTML or other code against a precise syntactic definition or other
specification of a standard.
- W3C
- World Wide Web Consortium—the body responsible for setting
standards for the Web, i.e., HTML, CSS, etc. Its members represent the various
stakeholders in the Web.
- Web or WWW
- Shorthand for World Wide Web. (WWW is sometimes called “dub-dub-dub”
to avoid having to say so many syllables.)
- Website
- The collection of pages (one or many; static or dynamic) which originate at a single page
(generally designated the home page) which is itself uniquely identified by a WWW domain
name (e.g., NPR.org).
- WYSInWYG
- What you see is NOT what you get—my reformulation of the usual WYSIWYG (What You See Is
What You Get) description of a tool that, unlike original computer tools, purported, like
MSWord, to immediately reflect the appearance of the final product. A WYSInWYG tool, on
the other hand, mimics its namesake but has no hope of actually fulfilling that mission
because of internal and external constraints beyond its control. Any visual web authoring
tool is WYSInWYG because it uses an internal browser which is of necessity different from
all outside browsers.
- There is also another similar formulation called WYSINWOG—What You See Is Not What Others
Get. Again, the reason is that each browser interprets the code differently and not everyone
is using the standard visual browsers. That's why disciplined use of modern methods and
standards is necessary.
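The TRBL entry in the glossary describes how a shorthand value is read. As a hedged illustration (the function name expand_trbl is hypothetical, and values are assumed to be simple space-separated tokens), this Python sketch expands a one- to four-value shorthand into its four sides, following the standard CSS rule that a missing left copies right, a missing bottom copies top, and a missing right copies top.

```python
def expand_trbl(value):
    """Expand a 1-4 value CSS box shorthand into top/right/bottom/left."""
    parts = value.split()
    if len(parts) == 1:
        top = right = bottom = left = parts[0]
    elif len(parts) == 2:
        top = bottom = parts[0]
        right = left = parts[1]
    elif len(parts) == 3:
        top, right, bottom = parts
        left = right
    elif len(parts) == 4:
        top, right, bottom, left = parts
    else:
        raise ValueError(f"expected 1 to 4 values, got {len(parts)}")
    return {"top": top, "right": right, "bottom": bottom, "left": left}


# The img margin example from the TRBL glossary entry.
sides = expand_trbl("10px 2px 0px 5px")
for side, size in sides.items():
    print(f"margin-{side}: {size};")
```

Reading the four values clockwise from the top gives back the longhand declarations of the glossary example.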