What Is XML? A Metalanguage for Describing and Structuring Data

Steve Hoenisch

All this talk about XML raises the question: What is it anyway? And what about the jumble of abbreviations that cloaks XML from the curious eyes of an HTML coder or content author? How do XSLT, DTD, XLink and all the rest fit into the XML equation?

Extensible Markup Language is, first and foremost, a metalanguage for describing and structuring data with tags. “Metalanguage” means a language for describing other languages. Like HTML, XML uses tags (words bracketed by “<” and “>”), but unlike HTML, XML has neither a predefined set of tags nor rules for how to use them. XML does, however, have generic rules governing markup; for instance, tags may not overlap.
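The generic rule that tags may not overlap can be checked with any XML parser. The following is a minimal sketch in Python, assuming the standard library's xml.etree.ElementTree module; the helper name is_well_formed and the note, to, and body tags are invented here for illustration and do not come from any standard.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the text parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # The parser rejects markup that breaks XML's generic rules,
        # such as a closing tag that does not match the open element.
        return False


# Each element closes before its parent does, so the tags nest cleanly.
nested = "<note><to>Reader</to><body>Tags nest cleanly.</body></note>"

# <body> opens inside <to> but closes after it, so the tags overlap.
overlapping = "<note><to>Reader<body></to>Tags overlap.</body></note>"

print(is_well_formed(nested))
print(is_well_formed(overlapping))
```

The parser knows nothing about what note, to, or body mean. It enforces only the generic markup rules, which is why the first string is accepted and the second is rejected.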

In XML, tags and the rules for them, or grammar, are defined by the users themselves. XML programmers use tags and their corresponding grammar to describe and structure data in a simple text format like that of HTML (as opposed to a binary format). Because users can define their own tags, XML, as its full name implies, is also extensible: unlike HTML, it can be extended with new tags. You will begin learning how to create your own XML tags today, in the tutorial section below.
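As a rough illustration of user-defined tags, the sketch below invents a small recipe vocabulary and reads it back with Python's standard xml.etree.ElementTree module. The tag names recipe, title, and ingredient are hypothetical, chosen only for this example; no grammar is declared for them, so the parser checks well-formedness and nothing more.

```python
import xml.etree.ElementTree as ET

# A hypothetical vocabulary: these tag names are made up for illustration.
document = """<recipe>
  <title>Pancakes</title>
  <ingredient>flour</ingredient>
  <ingredient>milk</ingredient>
</recipe>"""

# Parse the plain-text document into a tree of elements.
root = ET.fromstring(document)

# The tags describe the data, so values can be looked up by tag name.
title = root.findtext("title")
ingredients = [item.text for item in root.findall("ingredient")]

print(root.tag)
print(title)
print(ingredients)
```

Because the structure is carried by the tags themselves, a program can pull out the title or the list of ingredients without knowing anything else about the document.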

A Family of Technologies

The hodgepodge of abbreviations and acronyms surrounding XML points to another of its characteristics: it is a family of technologies designed to be used over the Internet. Its family members include the technologies named above, among them DTD, XSLT, and XLink.

These are just some of the XML-related technologies relevant to a broadly focused XML tutorial. As you may have guessed, there is a plethora of other XML-related specifications and applications, most of which are still evolving: XML Query, XML Base, XML Signature and Canonicalization, XML Protocol, MathML, SMIL, SVG, and FpML, to name a few. You can visit the World Wide Web Consortium’s (W3C) web site, http://www.w3.org, for the rundown on most of these initiatives.

To get a more complete description of what XML is and what its core related technologies are, read two short pieces: an article on the W3C web site called “XML in 10 Points,” at http://www.w3.org/XML/1999/XML-in-10-points, and the first half of Chapter 1 in Brett McLaughlin’s book, Java and XML, published by O’Reilly. For a slightly more technical introduction to XML, visit http://www.xml.com/pub/a/98/10/guide0.html.

The tutorials in this series proceed as follows:

  1. An Introduction to XML
  2. Structuring Documents in XML
  3. Developing a Document Type Definition
  4. Attributes and Entities in DTDs
  5. An Introduction to XSL
  6. Using XSLT to Separate Content from Presentation