XML (Extensible Markup Language) is a flexible and extensible markup language derived from the Standard Generalized Markup Language (SGML, ISO 8879). SGML is the ancestor of the most widely used markup languages, and XML can be seen as a simplified form of SGML: it leaves out the parts that are hard to implement in software products and is targeted at Web applications.
The term markup comes from publishing. Historically, an editor scribbled markup on a text to describe how it should be laid out. Nowadays, markup describes the structure and meaning of a text. One of the major benefits of XML is that it separates the structure, content, and layout of documents, as opposed to HTML, which is primarily a formatting language. HTML is loosely based on SGML and can be seen as SGML with exactly one fixed document type definition (DTD), which focuses on the formatting of documents. In XML, the user can specify the document structure in a user-defined DTD, which is extensible. Structure and content are strictly separated, so one DTD can serve as a template for several XML documents. Finally, the presentation is specified in a style sheet. By using different style sheets, the same content can be presented in numerous ways without having to reorganise the content.
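The separation of content and presentation can be sketched in a few lines of Python. This is a minimal illustration and not an XSL implementation: the `note` document and the functions `render_text` and `render_html` are hypothetical names invented for this example, and the two functions merely stand in for two different style sheets applied to the same content.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical content document: it carries structure and content only
# and says nothing about how it should be displayed.
NOTE_XML = """<note>
  <title>Meeting</title>
  <body>Review the XML &amp; schema chapter.</body>
</note>"""


def render_text(note):
    """Stand-in for a plain-text style sheet."""
    title = note.findtext("title")
    body = note.findtext("body")
    return f"{title.upper()}\n{body}"


def render_html(note):
    """Stand-in for an HTML style sheet."""
    title = escape(note.findtext("title"))
    body = escape(note.findtext("body"))
    return f"<h1>{title}</h1><p>{body}</p>"


# The same parsed content is presented in two ways without being changed.
note = ET.fromstring(NOTE_XML)
text_view = render_text(note)
html_view = render_html(note)
print(text_view)
print(html_view)
```

Neither rendering function modifies the parsed document, which is the point of the separation: a new presentation only requires a new rendering rule, not a reorganised document.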
An XML-based solution thus involves three parts: the content in an XML document, a document type definition or schema specifying its structure, and a presentation specification that states how the information in the XML document should be displayed. XML itself is simple, but that simplicity is also its power: instead of being a fixed language (e.g. HTML, which is limited to hypertext documents), XML is a meta-language that opens the possibility to define new languages and new formats. These can be used for many different purposes.
XML is so powerful because it can be used in a wide range of applications, yet the support (in terms of tools and code) can be very generic. For data structuring purposes, XML will be what ASCII is for uniform character encoding: both simplify (or even enable) communication between totally different applications. It is good to keep in mind that XML is often the foundation layer upon which a myriad of higher-level standards have been built. In the W3C family alone, there are already roughly twenty XML-based standards. (An impression is depicted in Figure 6‑2.)
Example
The example below is an XML document specifying a Scalable Vector Graphics (SVG) image. It conforms to the DTD for SVG documents and shows a simple button with the text ‘BehaviourDiagram’.
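As an illustration of what such a document can look like, the following Python sketch embeds a small SVG document and inspects it with the standard library parser. It is a stand-in written for this purpose, not the original listing: the sizes, colours, and coordinates are assumptions, and the DOCTYPE declaration that would reference the SVG DTD is omitted for brevity.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Illustrative SVG document: a rounded rectangle with a centred label.
# Sizes, colours, and coordinates are assumptions made for this sketch.
BUTTON_SVG = f"""<svg xmlns="{SVG_NS}" width="200" height="60">
  <rect x="5" y="5" width="190" height="50" rx="10"
        fill="lightgrey" stroke="black"/>
  <text x="100" y="36" text-anchor="middle">BehaviourDiagram</text>
</svg>"""

root = ET.fromstring(BUTTON_SVG)

# ElementTree reports namespaced tags as "{namespace-uri}local-name",
# so the local name is the part after the closing brace.
shapes = [child.tag.split("}")[1] for child in root]
label = root.findtext(f"{{{SVG_NS}}}text")
print(shapes, label)
```

Because SVG is an XML language, a generic XML parser is enough to read the drawing: the button is just a `rect` element and a `text` element inside an `svg` root.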
In the next sections we will discuss the most important XML-related standards, including DTD, XML Schema, the XSL style sheet language, XLink, SVG, SOAP, and WSDL.
The XML Schema language extends and generalizes the DTD language, which is itself a schema language borrowed from SGML. XML Schema is the W3C's attempt to replace the DTD language with something more expressive. In May 2001, XML Schema 1.0 became a W3C Recommendation.
In short, XML Schema is a better choice than a DTD because XML schemas are themselves expressed in XML. This means that they use elements and attributes to express the semantics of the schema, and that they can be edited and processed with the same tools used to process other XML documents, e.g. authoring tools or XSLT. The expressive power of XML Schema is high: it allows strong data typing (both predefined and user-defined types), reuse of definitions by extension and derivation (inheritance), attribute grouping, and, last but not least, the use of namespaces for distributed environments.
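The claim that a schema can be processed with ordinary XML tools can be sketched as follows. The schema below is a hypothetical example: the namespace `http://example.org/order`, the element names, and the `Quantity` type are invented for illustration. The code only reads the schema as an XML tree; it does not validate documents against it, which the Python standard library does not support.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema: one user-defined simple type and one element
# whose children use predefined and user-defined data types.
ORDER_XSD = f"""<xs:schema xmlns:xs="{XS}"
           xmlns:o="http://example.org/order"
           targetNamespace="http://example.org/order"
           elementFormDefault="qualified">
  <xs:simpleType name="Quantity">
    <xs:restriction base="xs:positiveInteger">
      <xs:maxInclusive value="100"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:element name="item">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="quantity" type="o:Quantity"/>
        <xs:element name="shipDate" type="xs:date"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

# The schema is parsed exactly like any other XML document.
schema = ET.fromstring(ORDER_XSD)

# Collect every element declaration that names its type directly.
declared = {
    el.get("name"): el.get("type")
    for el in schema.iter(f"{{{XS}}}element")
    if el.get("type") is not None
}

# Collect the names of the user-defined simple types.
user_types = [st.get("name") for st in schema.iter(f"{{{XS}}}simpleType")]
print(declared, user_types)
```

The same traversal would not be possible with a DTD, whose declarations use a separate syntax that a generic XML tree API does not expose as elements and attributes.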