Validating SGML parser
October 3, 1998 Norman Walsh

Author's Note: It is somewhat remarkable to think that this article, which appeared initially in the Winter 1997 edition of the World Wide Web Journal, was out of date by the time the final XML Recommendation was approved in February. And even as this update brings the article back into line with the final spec, a new series of recommendations is under development. When finished, these will bring namespaces, linking, schemas, stylesheets, and more to the table.

This introduction to XML presents the Extensible Markup Language at a reasonably technical level for anyone interested in learning more about structured documents. In addition to covering the XML 1.0 Specification, this article outlines related XML specifications, which are evolving. The article is organized in four main sections plus an appendix.

XML is a markup language for documents containing structured information. Structured information contains both content (words, pictures, etc.) and some indication of what role that content plays (for example, content in a section heading has a different meaning from content in a footnote, which means something different from content in a figure caption or content in a database table, etc.).

A markup language is a mechanism to identify structures in a document. The XML specification defines a standard way to add markup to documents.
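To make the distinction between content and role concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module. The sample document and its element names (`article`, `section`, `heading`, `para`, `footnote`, `figure`, `caption`) are invented for illustration and are not taken from any particular DTD; the helper `roles_and_content` is likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: the element names are illustrative only.
DOC = """\
<article>
  <section>
    <heading>What is XML?</heading>
    <para>XML is a markup language.<footnote>Defined by the W3C.</footnote></para>
    <figure><caption>A sample document tree</caption></figure>
  </section>
</article>
"""


def roles_and_content(xml_text):
    """Return (element name, leading text) pairs in document order.

    The element name stands in for the role the content plays;
    elements whose leading text is empty or whitespace are skipped.
    """
    root = ET.fromstring(xml_text)
    pairs = []
    for elem in root.iter():
        text = (elem.text or "").strip()
        if text:
            pairs.append((elem.tag, text))
    return pairs


for role, content in roles_and_content(DOC):
    print(f"{role}: {content}")
```

The same words would mean something different inside `heading` than inside `footnote`; the markup is what lets a program tell the two apart.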
For our purposes, the word "document" refers not only to traditional documents, like this one, but also to the myriad of other XML "data formats". These include vector graphics, e-commerce transactions, mathematical equations, object metadata, server APIs, and a thousand other kinds of structured information.

In HTML, both the tag semantics and the tag set are fixed.

SGML has been the standard, vendor-independent way to maintain repositories of structured documentation for more than a decade, but it is not well suited to serving documents over the web (for a number of technical reasons beyond the scope of this article). Defining XML as an application profile of SGML means that any fully conformant SGML system will be able to read XML documents. However, using and understanding XML documents does not require a system that is capable of understanding the full generality of SGML. XML is, roughly speaking, a restricted form of SGML.

For technical purists, it's important to note that there may also be subtle differences between documents as understood by XML systems and those same documents as understood by SGML systems.