Defining Block Level Elements

I know what an HTML block level element is, but I’m damned if I can say it in a concise, correct, obvious way (which it so happens I need to do in Chapter 4 of Refactoring HTML). In HTML, block level elements include p, blockquote, div, table, ul, ol, dl, h1 through h6, and a few others. Generally speaking, a block element has a line break before and after it, but that’s really only true in a particular visual representation. The notion of line breaks doesn’t make a lot of sense in a screen reader, for example.

The HTML 4.01 specification defines block elements thusly:

Certain HTML elements that may appear in BODY are said to be “block-level” while others are “inline” (also known as “text level”). The distinction is founded on several notions:

Content model
Generally, block-level elements may contain inline elements and other block-level elements. Generally, inline elements may contain only data and other inline elements. Inherent in this structural distinction is the idea that block elements create “larger” structures than inline elements.

Formatting
By default, block-level elements are formatted differently than inline elements. Generally, block-level elements begin on new lines, inline elements do not. For information about white space, line breaks, and block formatting, please consult the section on text.

Directionality
For technical reasons involving the [UNICODE] bidirectional text algorithm, block-level and inline elements differ in how they inherit directionality information. For details, see the section on inheritance of text direction.

That’s not a great definition though. These seem to be consequences of being a block level element rather than defining characteristics.
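Of the three notions, the content model is at least something a program can check without assuming any rendering. Here is a minimal Python sketch of that check, under some loudly stated assumptions: the BLOCK set is my reading of the %block entity in the HTML 4.01 Strict DTD, the INLINE set is only a small sample, and the names NestingChecker and find_violations are made up for illustration. It flags a block-level element whose immediate parent is an inline element, and it naively ignores optional end tags, so treat it as an illustration rather than a validator.

```python
from html.parser import HTMLParser

# Assumption: block-level names as listed by %block in the HTML 4.01 Strict DTD.
BLOCK = {
    "p", "h1", "h2", "h3", "h4", "h5", "h6", "ul", "ol", "pre", "dl", "div",
    "noscript", "blockquote", "form", "hr", "table", "fieldset", "address",
}

# Assumption: a deliberately partial sample of inline elements.
INLINE = {"a", "abbr", "b", "cite", "code", "em", "i", "span", "strong"}

# Elements with no end tag; they are never pushed onto the open-element stack.
VOID = {"br", "hr", "img", "input"}


class NestingChecker(HTMLParser):
    """Hypothetical helper: records block elements found directly inside inline ones."""

    def __init__(self):
        super().__init__()
        self.stack = []        # currently open elements, outermost first
        self.violations = []   # (inline_parent, block_child) pairs

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        if tag in BLOCK and self.stack and self.stack[-1] in INLINE:
            self.violations.append((self.stack[-1], tag))
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Drop everything up to and including the matching open tag.
            while self.stack.pop() != tag:
                pass


def find_violations(html):
    """Return (inline_parent, block_child) pairs found in the markup."""
    checker = NestingChecker()
    checker.feed(html)
    checker.close()
    return checker.violations


if __name__ == "__main__":
    print(find_violations("<div><p>Some <em>text</em></p></div>"))
    print(find_violations("<span><div>oops</div></span>"))
```

Note that the specification hedges with “generally” for a reason: p, for instance, is block-level in HTML 4.01 but may contain only inline content, and this sketch does not model exceptions like that.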

Can anyone offer a more precise definition of block element that does not presume a particular rendering? Just what is a block anyway?

5 Responses to “Defining Block Level Elements”

  1. Craig Walker Says:

    It seems to me that “block level” is really a formatting concept and a holdover from back when HTML wasn’t sure if it was (in practice) a presentational language or not. Thus, any attempt to define it in a way that doesn’t presume a particular rendering is bound to fail.

    Consider this: what’s a block for an audio rendering? It’s completely meaningless. It only makes any sense in a visual medium.

    If I were in your position, I’d explain it along these lines: in HTML’s early and late history, it was intended to be a representational language. In the middle part of its history, it picked up many presentational parts, and these parts had to be kept for backwards compatibility. Block and inline elements are part of this, and as such only make sense when considered as a part of the most-common rendering: in a browser window.

  2. Curtis Pew Says:

    If I wanted to be correct and concise, I’d probably focus on the first part (“Content Model”) of the specification’s definition: block elements generally contain other block elements and inline elements, while inline elements generally contain data and other inline elements. But I think the distinction really is about the default value of the element’s CSS display property, so you might as well mention that.

  3. John Cowan Says:

    I’d say that an element is a block element if it indicates the structural organization of the body, provided it is not constrained to appear only within another block element (tr, th, td are structural but not block elements). Inline elements indicate either properties of parts of the text or else non-textual inclusions within the text.

  4. Rob Koberg Says:

    1 boo 2…

    Anyway, I would say block elements should not be in mixed content.

  5. Nicolás Lichtmaier Says:

    Simple: Inline elements are the ones that can’t contain anything but inline elements. Block elements are the rest.
    This is a structural definition, it’s not a presentational thing. And CSS came afterwards.
