bestkungfu weblog

Strongly emphasizing semantics

Filed in: accessibility, design, Web, Tue, May 11 2004 18:45 PT

From: Atlanta, GA

I don’t know why the <b> and <i> elements are worth more than one post per year, but, whatever. I’m going for two.

Matthew Thomas responds.

HTML “must also cater for the ~99 percent of Web authors who don’t care about semantics and never will.”

Maybe. It has so far. It excelled in non-semantic implementations in the bad old days of the mid-to-late ’90s. (Hi, David Siegel!) But I don’t advocate jumping off bridges just because everybody else is doing it, either.

Those of us who represent good semantic practices in HTML have found value in it. If you haven’t, we are going to work on you. Of course most people will never get it. But that alone doesn’t destroy its value. Changing the default markup for a couple of elements doesn’t substantially alter the value proposition to people posting photos of their cats.

“What XHTML 1.1 does do is remove the name attribute and… ooh look, it removes the lang attribute! The very attribute Matt May suggested we use with span! So much for that forward-compatibility schtick, eh?”

Uh. So? In the same document Matthew points to, it says, “XHTML 1.1 represents a departure from both HTML 4 and XHTML 1.0.” A departure, see. To the extent that lang is not forward-compatible, I would say that — oh, how does that go? — his statement “is entirely correct, but almost entirely irrelevant.” The lang attribute has been replaced by the very similar xml:lang attribute for compatibility with XML. In any case, language forward-compatibility is not the issue for content creators: the issue is backwards compatibility in the browsers. And that’s reasonably guaranteed for HTML 4.01 and XHTML as long as people are browsing HTML. The lang attribute isn’t going away from HTML 4.01 or XHTML 1.0.

It seems that Matthew’s got a beef with using style based on attributes, instead favoring presentational elements. The thing is, irrespective of how user agents present it, the semantic value is still in the document. It doesn’t matter if it’s <i lang="fr"> or <span lang="fr">, it has the same semantics. It just has different default presentation. In the case of the lang attribute, it means that anything that can identify or translate it (like search engines and screen readers) will be able to do so.
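To make that concrete, here is a minimal sketch of how a tool that cares about language, and not about presentation, might read a page. It uses Python's standard `html.parser`; the `LangCollector` class and `tagged_phrases` function are names invented for this illustration, not anything a real search engine or screen reader is known to use, and the sketch assumes reasonably well-nested markup.

```python
from html.parser import HTMLParser

# Elements that never get a closing tag, so they must not touch the stack.
VOID = {"br", "hr", "img", "input", "link", "meta"}


class LangCollector(HTMLParser):
    """Collect (lang, text) pairs, ignoring which element carries the attribute."""

    def __init__(self):
        super().__init__()
        self._stack = []  # one entry per open element: its lang value, or None
        self.found = []   # (lang, text) pairs in document order

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        self._stack.append(dict(attrs).get("lang"))

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        # The nearest enclosing element with a lang attribute wins.
        for lang in reversed(self._stack):
            if lang is not None:
                self.found.append((lang, text))
                break


def tagged_phrases(markup):
    """Return every language-tagged run of text found in the markup."""
    parser = LangCollector()
    parser.feed(markup)
    parser.close()
    return parser.found


italic = '<p>She has a certain <i lang="fr">je ne sais quoi</i>.</p>'
span = '<p>She has a certain <span lang="fr">je ne sais quoi</span>.</p>'
print(tagged_phrases(italic) == tagged_phrases(span))
```

The collector never looks at the element name, only at the attribute, which is the point: swap `<i>` for `<span>` and a language-aware consumer sees the same thing.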

If he intended in his original post to say, “use the <b> and <i> elements only when <strong> and <em> are not appropriate,” that’s one thing. I may even agree with that, in fact. But that’s not what he said. Or what he intended wasn’t clear enough. That’s not what I read, anyway, and apparently I’m not alone. (See: sidesh0w, James Craig.) It sounded more like “Don’t bother using <strong> and <em>,” which is not what is warranted here.

And his advice to authoring tool vendors is incorrect. Most of the time, those who intend to apply bold or italics are semantically well served by <strong> and <em>. From the W3C’s perspective, authoring tools should make semantic elements easier to use than presentational ones.

The <b> and <i> elements are valid HTML. They’re fine for transitional HTML applications, when they are not used interchangeably with <strong> and <em>. But they are not in the toolkits of semantic coders, and they’re misused 98% of the time. (That’s an underestimation.) If you can tell which 2% is right, go ahead and use them. If you can’t, here’s help.

The Chicago Manual of Style has one approved non-emphasis-related use of bold (abbreviated “dynamics in music”), and eight for italics. Here are the cases where using <i> is not the end of the world, after removing those already covered by other HTML elements. (Note that four of them are practical subsets of the first, which is covered by the lang attribute.)

  • foreign words and phrases
  • enzyme names
  • gene names
  • genus and species
  • unabbreviated dynamics in music
  • rhyme schemes

Everything else should use <strong> and <em>. (As the WCAG techniques tell us.) Authoring tools that don’t know better should assume the semantic elements were what was intended, since almost all of the time they will be correct.
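For authoring tool folks, a rough sketch of what "assume the semantic elements" could look like behind a toolbar button. Everything here is hypothetical: the `wrap` function, its `presentational` flag, and the two lookup tables are made up for illustration and don't describe any actual editor.

```python
from html import escape

# Hypothetical toolbar actions mapped to the element each one emits.
SEMANTIC = {"bold": "strong", "italic": "em"}
PRESENTATIONAL = {"bold": "b", "italic": "i"}


def wrap(action, text, lang=None, presentational=False):
    """Wrap text for a toolbar action, defaulting to the semantic element.

    The author has to opt in to <b> or <i> explicitly, for example for a
    foreign phrase or a genus and species name.
    """
    table = PRESENTATIONAL if presentational else SEMANTIC
    tag = table[action]
    # Carry the language along when the author supplies one.
    lang_attr = f' lang="{escape(lang, quote=True)}"' if lang else ""
    return f"<{tag}{lang_attr}>{escape(text)}</{tag}>"


# The default path: the author just pressed the button.
print(wrap("bold", "Do not unplug"))
# The opt-in path: a foreign phrase, marked up with its language.
print(wrap("italic", "je ne sais quoi", lang="fr", presentational=True))
```

The design choice is simply which path costs the author an extra step. Here it's the presentational one, so the common case lands on <strong> and <em> without anyone having to think about it.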
