Semantic Data Extractor

As Kevin Ryan pointed out at work yesterday, the W3C's Semantic Data Extractor is a pretty sweet tool. I've been steadily promoting Layered Semantic Markup at work -- the importance of meaningful markup as the core of web development. This is a great tool to show that value, and to remind people that the reason you put meaning in is to get meaning out.

The tool tries to extract information from a semantically-rich HTML document. It only uses information available through the good usage of the semantics provided by HTML. “The aim is to show that providing semantically rich HTML gives much more value to your code: using semantically rich HTML allows a better use of CSS, and makes your HTML intelligible to a wider range of user agents (especially search engines bots).”
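As a rough illustration of the kind of markup the tool rewards, here is a small hypothetical page. The title, names, and text are invented for this sketch, and the comments describe what an extractor could plausibly pick up from each element; they are assumptions, not a list of what the W3C tool actually reads.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <!-- Document title: the most obvious piece of extractable metadata -->
  <title>Store Hours and Directions</title>
  <!-- Hypothetical author, exposed as metadata rather than as styled text -->
  <meta name="author" content="Jane Example">
</head>
<body>
  <!-- Headings nest by level, so an outline can be rebuilt from h1/h2 alone -->
  <h1>Store Hours and Directions</h1>

  <h2>Hours</h2>
  <!-- abbr carries the expansion of the abbreviation in its title attribute -->
  <p>We are open every day. <abbr title="Monday through Friday">M-F</abbr> hours are 9 to 5.</p>

  <h2>Directions</h2>
  <!-- blockquote and cite mark a quotation and the work it came from -->
  <blockquote>
    <p>Easy to find, right off the main road.</p>
  </blockquote>
  <p>From <cite>The Local Gazette</cite></p>

  <!-- address marks contact information for the page -->
  <address>Jane Example, jane@example.com</address>
</body>
</html>
```

A page built from generic div and font tags could look identical in a browser, but it would give a tool like this nothing to work with.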

To see it in action, check out the new next.yahoo.com page. The Extractor handles it pretty well, showing a clear document hierarchy.

What is Layered Semantic Markup?

Today’s Wrong Solution is Tomorrow’s Constraint

Layered Semantic Markup (LSM) is not a technology, but a framework composed of HTML, XHTML, Cascading Style Sheets (CSS), JavaScript, the DOM and other Web technologies. LSM lets principles and standards be implemented appropriately.

LSM is a development framework for creating Web documents and experiences. LSM builds for the least capable devices first, then enhances those documents with separate logic for presentation, in ways that do not place an undue burden on baseline devices but which allow a richer experience for those users with modern graphical browser software. LSM supports all user agents, and is inclusive by design. (Progressive Enhancement - Unobtrusive JavaScript)

LSM has structural semantic markup at its core, which provides lean, meaningful, accessible pages. This well-built core and the clear separation of structural, presentational and behavioral layers make this development philosophy superior to many short-sighted approaches.

Today’s wrong solution is tomorrow’s constraint. A holistic vision - an underlying philosophy - must guide technical decisions. LSM provides the strategy for a sound and future-ready approach.

LSM embraces Graded Browser Support by using one markup document, subsequently layered with stylesheets and scripts that provide a gradually enhanced experience across a wide variety of browsers and devices.

This approach has profound advantages over other browser support approaches such as graceful degradation. Graded Browser Support recognizes that advanced technology support is not a guarantee of the future, and that legacy software as well as alternative devices (mobile) must always be considered. Graded Browser Support defines support in terms of current capabilities, not in terms of legacy or obsolete software; it embraces accessibility, universality, and peaceful coexistence with more feature-rich browsers/devices; and it allows for adoption of new technology and strategies without leaving any browser/device behind.
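The layering described above can be sketched with one small document. This is a minimal example, not a prescribed LSM template: the file name base.css, the id hours-link, the URL, and the pop-up behavior are all hypothetical. The markup works alone, and the stylesheet and script are optional layers on top of it. The script is inline only to keep the sketch self-contained; in practice it would likely live in its own file.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Store Hours</title>
  <!-- Presentation layer: hypothetical stylesheet; the page stays readable without it -->
  <link rel="stylesheet" href="base.css">
</head>
<body>
  <!-- Structure layer: plain semantic markup that any user agent can handle -->
  <h1>Store Hours</h1>
  <p><a id="hours-link" href="/hours.html">See this week's hours</a></p>

  <!-- Behavior layer: attached from script, not from inline event attributes -->
  <script>
    // Feature-test first; baseline devices skip the enhancement and keep the plain link.
    if (document.getElementById && document.addEventListener) {
      var link = document.getElementById("hours-link");
      link.addEventListener("click", function (event) {
        // Hypothetical enhancement: show the hours in a small window.
        var popup = window.open(link.href, "hours", "width=400,height=300");
        if (popup) {
          // Cancel the normal navigation only if the window actually opened.
          event.preventDefault();
        }
      });
    }
  </script>
</body>
</html>
```

With scripting off, or in a browser that fails the feature test, the link simply navigates to the hours page, so no device is left behind.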

Credits

This work is heavily influenced by, and contains passages taken directly from, Debra Chamra's "Progressive Enhancement: Paving the Way for Future Web Design", Steven Champeon and Nick Finck's presentation "Inclusive Web Design For the Future with Progressive Enhancement", and Steven Champeon's "Progressive Enhancement and the Future of Web Design", all of which may be found here.

Thanks also to the great people who have endlessly debated and developed these topics with me: James Berry, Sean Imler, Todd Kloots, Jon Koshi, Mike Lee, Thomas Sha, Matt Sweeney, Chanel Wheeler, and Christina Wodtke; and everybody else; and to everybody who puts their ideas online so that others may be inspired. Thanks.

TrackBack

Listed below are links to weblogs that reference Semantic Markup - Create, Support and Extract:

» Today’s Wrong Solution is Tomorrow’s Constraint from Blog-Fu
Nate Koechley has some great ideas. The latest is about Layered Semantic Markup. LSM is a development framework for creating Web documents and experiences. LSM builds for the least capable devices first, then enhances those documents with separate logi... [Read More]

Tracked on Feb 10, 2005 7:20:51 AM

Comments

yes, i know this isn't the most semantic website. you're welcome to leave all the nasty comments you want. It will be more semantic someday, but for now this is just an informal little blog. ok, on with the comments. thanks.

Posted by: Nate Koechley | Feb 9, 2005 4:08:21 PM

Nate, this is an interesting idea. However, I'm confused by the term "layered semantic markup". As you state, there is only one document marked up semantically. This leads me to believe that neither the markup nor the semantics are layered, which would seem contrary to the term itself. Am I missing something?

Posted by: Joshua Porter | Feb 10, 2005 7:53:28 AM

If I had to hazard a guess, I'd say that it's "layered" because we break down the components by function: HTML (Structure), CSS (Presentation), DOM/Javascript (Behavior). The point being that you can strip away the presentation layer, and the document remains semantically the same. Take all those components, stack them on top of each other, and you have the concept of layers.

Posted by: lantzilla | Feb 10, 2005 2:56:05 PM

Hey Joshua, nice to see you here. I was just reading some of your stuff last night!

Lance's guess was pretty close. The idea is a "semantic" foundation on which the rest of the development is "layered". It's not the world's best TLA, but it works (especially for its internal uses here at work).

That said, we're also trying to do some layering of semantics. It's not particularly intuitive, and I'm not sure of the long-term value yet, but it seems like the right way to go. While most web semantics come from longstanding HTML (as it were), I also think it's possible to grant semantics through consistent use. If every required form element we create is class="required", well, I believe that's better than having 10 different conventions (or none at all).

Another example is how I'm coding generic modules. For every module, we're standardizing on head, body, and as necessary footer regions:

<div>
  <div class="hd"></div>
  <div class="bd"></div>
  <div class="ft"></div>
</div>

The cite element is good, but more precision by indicating publisher, dateline, byline, author can only help.

And so while this "granted semantics" isn't the key of LSM, it's an additional way to think about things, and an additional type of value to add.

Again, the main idea was to provide an understanding of meaningful markup and layered development that didn't use any particular technology in its description. Basically, saying "tableless" all the time wasn't helping my cause.

Posted by: Nate Koechley | Feb 11, 2005 3:29:40 PM

Erm ... the links to Progressive Enhancement and Unobtrusive Javascript are busted.

Posted by: Kent Brewster | Feb 14, 2005 7:35:45 PM

Good catch Kent, I fixed the typo. Thanks.

Posted by: Nate Koechley | Feb 14, 2005 9:44:47 PM

Hey Nate,

Good article! In response to your comment on Feb 11, I'm guessing that saying 'tableless' wouldn't help your cause since if a page is using semantic markup, wouldn't it mean that tables should be used for tabular data? So, it wouldn't exactly be 'tableless'...right?

Oh, and btw, my site isn't semantic either and I think it sucks too! Especially the part with the jokes that I try to hide by having a blog (the one that I interviewed with). I don't seem to get any nasty comments about it though. But you can leave some if you'd like! I won't get offended.

Posted by: strimble | Feb 21, 2005 5:07:01 PM
