DocBook includes a set of elements to tag documents for search engine optimization, or SEO. In this article, I’ll show you how to set up a DocBook document so the DocBook stylesheets automatically embed metadata in the resulting HTML web page. Metadata – the encoded information that tells a search engine what your content is about – is a fundamental component of optimizing a web page for indexing by a search engine. Including metadata in the HTML header of a web page can help improve the page’s search engine ranking.
A web page that has been optimized for search engines includes three important pieces of metadata: a title, a description, and a set of keywords. The DocBook XSL stylesheets that transform DocBook articles and books into HTML automatically map the title of a DocBook document to the title tag in the HTML header of a web page, so you don’t have to worry about the title.
The description and the set of keywords, however, must be set explicitly by including the abstract element and the keywordset element in your DocBook XML file. To generate a description, the stylesheets map the content of DocBook’s abstract element to HTML’s meta element for a description. To generate a list of keywords, the stylesheets add each keyword enclosed in DocBook’s keywordset element to HTML’s meta element for keywords. Thus, the key to optimizing your HTML output for SEO is to add the abstract and keywordset elements to your XML.
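The mapping from abstract to the description meta element is governed by a stylesheet parameter, generate.meta.abstract, which to my knowledge is already enabled by default. If you maintain a customization layer, a minimal sketch like the following makes the setting explicit. The import URL is an assumption; replace it with the path to your own copy of the DocBook XSL stylesheets.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: location of the stock HTML stylesheet; point this
       at your local copy of the DocBook XSL distribution. -->
  <xsl:import href="http://docbook.sourceforge.net/release/xsl/current/html/docbook.xsl"/>

  <!-- Non-zero: copy the abstract into <meta name="description">. -->
  <xsl:param name="generate.meta.abstract" select="1"/>

</xsl:stylesheet>
```

Setting the parameter to 0 instead should suppress the description, which can be useful for pages you do not want summarized from the abstract.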
To include a description in the metadata of the HTML page that you generate from your DocBook source, you must add the abstract element to your DocBook article or book. In an article, the abstract element is a child of the articleinfo element (bookinfo in a book), which appears near the beginning of the document. Here’s an example of how to add the abstract element to a DocBook article:
<?xml version="1.0"?>
<!DOCTYPE article SYSTEM "/docbook-xml-4.5/docbookx.dtd">
<article>
<title>Ski Washington on a Budget</title>
<articleinfo>
<abstract>
<para>A concise overview of options available to skiers
traveling to Washington ski areas on a budget.</para>
</abstract>
</articleinfo>
After the code snippet above is transformed into HTML with the DocBook XSL stylesheets, it looks like this:
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Ski Washington on a Budget</title>
<meta name="generator" content="DocBook XSL Stylesheets V1.73.0">
<meta name="description" content="A concise overview of options
available to skiers traveling to Washington ski areas on a budget.">
</head>
Tip: To keep Google from cutting off your description mid-sentence when it displays the description in search results, limit the description to no more than 156 characters, including spaces.
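The length guideline in the tip can be checked before you publish. What follows is a minimal sketch of a standalone XSLT 1.0 stylesheet, not part of the DocBook XSL distribution. It assumes a DocBook 4.x document without namespaces, and the parameter name max.length is my own invention for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical parameter: the limit suggested in the tip. -->
  <xsl:param name="max.length" select="156"/>

  <xsl:template match="/">
    <!-- Collapse line breaks and runs of spaces before counting, since
         whitespace in the source is not meaningful in a description. -->
    <xsl:variable name="description"
                  select="normalize-space((//articleinfo | //bookinfo)[1]/abstract[1])"/>
    <xsl:variable name="length" select="string-length($description)"/>

    <!-- Report the length on standard output. -->
    <xsl:value-of select="concat('Description length: ', $length, '&#10;')"/>

    <!-- Warn when the description exceeds the limit. -->
    <xsl:if test="$length &gt; $max.length">
      <xsl:message>Warning: the abstract is longer than the suggested limit.</xsl:message>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

With an XSLT 1.0 processor such as xsltproc, this could be run as xsltproc check-abstract.xsl article.xml, where both file names are placeholders.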
To include keywords in the metadata of your HTML page, you use DocBook’s keywordset element to enclose several keyword elements. Like abstract, the keywordset element is a child of the articleinfo element. The keywordset should include a few highly relevant words from the page’s text: terms that represent and categorize your document’s content. Here’s an example of how to add keywords to a DocBook article:
<?xml version="1.0"?>
<!DOCTYPE article SYSTEM "/docbook-xml-4.5/docbookx.dtd">
<article>
<title>Ski Washington on a Budget</title>
<articleinfo>
<keywordset>
<keyword>skiing</keyword>
<keyword>budget travel</keyword>
<keyword>Washington state</keyword>
</keywordset>
</articleinfo>
After the keywords in the code snippet above are transformed into HTML with the DocBook XSL stylesheets, they look like this:
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Ski Washington on a Budget</title>
<meta name="generator" content="DocBook XSL Stylesheets V1.73.0">
<meta name="keywords" content="skiing, budget travel, Washington state">
</head>
Of course, to generate both a description and keywords in the HTML, you would want to include both an abstract and a set of keywords in your DocBook XML, like this:
<?xml version="1.0"?>
<!DOCTYPE article SYSTEM "/docbook-xml-4.5/docbookx.dtd">
<article>
<title>Ski Washington on a Budget</title>
<articleinfo>
<abstract>
<para>A concise overview of options available to skiers
traveling to Washington ski areas on a budget.</para>
</abstract>
<keywordset>
<keyword>skiing</keyword>
<keyword>budget travel</keyword>
<keyword>Washington state</keyword>
</keywordset>
</articleinfo>
When the DocBook stylesheets transform this snippet, both the sentence in the abstract and the keywords are rendered in their corresponding metadata fields in the header of the HTML, which helps optimize the web page for search engines.
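The examples above all use an article. For a book, the same pattern should apply with bookinfo in place of articleinfo. Here is a minimal sketch, assuming DocBook 4.5; the chapter title and its paragraph are hypothetical placeholders added only to make the document complete.

```xml
<?xml version="1.0"?>
<!DOCTYPE book SYSTEM "/docbook-xml-4.5/docbookx.dtd">
<book>
  <title>Ski Washington on a Budget</title>
  <bookinfo>
    <!-- Source of the description meta element -->
    <abstract>
      <para>A concise overview of options available to skiers
      traveling to Washington ski areas on a budget.</para>
    </abstract>
    <!-- Source of the keywords meta element -->
    <keywordset>
      <keyword>skiing</keyword>
      <keyword>budget travel</keyword>
      <keyword>Washington state</keyword>
    </keywordset>
  </bookinfo>
  <!-- Hypothetical placeholder chapter -->
  <chapter>
    <title>Lift Tickets</title>
    <para>Placeholder text for the first chapter.</para>
  </chapter>
</book>
```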
–Steve Hoenisch
First Published: May 18, 2010. Last Updated: May 18, 2010.