 XML Introduction


XML, the eXtensible Markup Language, is just another file format. In a way, there's nothing special about it except that it's standard and it resembles HTML. In an alternate universe, the standard file format could be LISP-based, and that format would offer the same advantages XML offers in ours.

Because XML is a standard, it provides several advantages over a proprietary format:

  • Avoid programmer meetings to decide on a file format.
  • Simplify documentation by using a well-known syntax.
  • Use available parsers instead of writing new ones (SAX and JAXP).
  • Use standard data structures instead of inventing new ones (the DOM).
  • Leverage tools built on the standards (XPath and XSL).

Of course, XML does not solve everything. You still need to create your own tags and attributes for your application.

Parsing

There are two standard XML parsing methods: parsing to the standard DOM (the XML Document Object Model) and parsing to user callbacks (SAX). Parsing to the DOM is usually easier to use, while parsing with SAX is more efficient and uses less memory because no tree is built. The choice depends on the application.
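
To make the callback style concrete, here is a minimal sketch of SAX parsing through the standard JAXP classes. The class name SaxElementCounter, the method countElements, and the sample document are made up for illustration. The handler only counts element names as the parser reports them, so nothing resembling a document tree is kept in memory.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class, not part of Resin or JAXP.
class SaxElementCounter {
    // Counts how often each element name occurs, in order of first appearance.
    static Map<String, Integer> countElements(String xml) throws Exception {
        final Map<String, Integer> counts = new LinkedHashMap<>();

        // The parser calls startElement once for every opening tag it reads.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                counts.merge(qName, 1, Integer::sum);
            }
        };

        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return counts;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<order><item/><item/><note>rush</note></order>";
        System.out.println(countElements(xml)); // prints {order=1, item=2, note=1}
    }
}
```

The same count could be computed from a DOM tree, but the SAX version never holds more than the current event, which is where the memory savings come from.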

Sun's JAXP (Java API for XML Processing) provides an implementation-independent way of parsing XML.
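
As a rough illustration of what implementation independence means in practice, the sketch below parses a string into a DOM using only javax.xml.parsers factory classes. The class name JaxpDomParse and the sample document are invented for the example. The code never names a concrete parser; whichever implementation is configured for the JVM is picked up by the factory.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical example class, not part of Resin or JAXP.
class JaxpDomParse {
    // Parses an XML string into a DOM Document using the configured parser.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<greeting lang='en'>hello</greeting>");
        Element root = doc.getDocumentElement();
        System.out.println(root.getTagName());         // element name
        System.out.println(root.getAttribute("lang")); // attribute value
        System.out.println(root.getTextContent());     // text inside the element
    }
}
```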

Data Structures

The W3C Document Object Model (DOM) is the standard data structure for representing an XML file in memory. The structure is based on nodes, where elements (tags), attributes, comments, and text are all represented as different kinds of nodes.
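
The node kinds described above can be seen by walking a small tree. The following is a sketch only: DomNodeWalker, describe, walk, and the sample document are hypothetical names chosen for the example, and the output format is arbitrary. Note that in the DOM, attributes hang off their element through getAttributes() rather than appearing among its children, so they are listed separately.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical example class, not part of Resin or the DOM API.
class DomNodeWalker {
    private static void indent(int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
    }

    // Appends one line per node, indented by depth, then recurses into children.
    static void describe(Node node, int depth, StringBuilder out) {
        indent(depth, out);
        switch (node.getNodeType()) {
            case Node.ELEMENT_NODE: {
                out.append("element ").append(node.getNodeName()).append('\n');
                // Attributes are nodes too, but are not children of the element.
                NamedNodeMap attrs = node.getAttributes();
                for (int i = 0; i < attrs.getLength(); i++) {
                    Node attr = attrs.item(i);
                    indent(depth + 1, out);
                    out.append("attribute ").append(attr.getNodeName())
                       .append('=').append(attr.getNodeValue()).append('\n');
                }
                break;
            }
            case Node.TEXT_NODE:
                out.append("text \"").append(node.getNodeValue()).append("\"\n");
                break;
            case Node.COMMENT_NODE:
                out.append("comment \"").append(node.getNodeValue()).append("\"\n");
                break;
            default:
                out.append("other ").append(node.getNodeName()).append('\n');
        }
        for (Node child = node.getFirstChild(); child != null;
             child = child.getNextSibling()) {
            describe(child, depth + 1, out);
        }
    }

    // Parses the XML and returns the description of the tree under the root element.
    static String walk(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        StringBuilder out = new StringBuilder();
        describe(doc.getDocumentElement(), 0, out);
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(walk("<book id=\"42\"><!-- draft --><title>XML</title></book>"));
    }
}
```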

Printing

Resin provides a convenient API for printing XML. In addition to printing standard XML, it can handle pretty-printing and printing HTML files. When printing HTML, several tags, such as <img>, are written without a closing tag, which is not well-formed XML.
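
Resin's own printing classes are not shown on this page, so the sketch below does not use them. As a stand-in, it uses the standard javax.xml.transform identity transformer to show the difference between XML and HTML output for an empty <img> element. The class name PrintMethods, the method print, and the sample markup are assumptions made for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example class; this is the standard JAXP transformer, not Resin's printer.
class PrintMethods {
    // Re-serializes the input using the given output method ("xml" or "html").
    static String print(String xml, String method) throws Exception {
        // A Transformer created without a stylesheet copies its input unchanged.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.METHOD, method);
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        identity.transform(new StreamSource(new StringReader(xml)),
                           new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String source = "<p><img src=\"a.png\"/></p>";
        System.out.println(print(source, "xml"));  // img stays a well-formed empty element
        System.out.println(print(source, "html")); // img is written as a bare open tag
    }
}
```

With this API, indentation can be requested by also setting OutputKeys.INDENT to "yes"; how closely that matches Resin's pretty-printing is not something this page specifies.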


Copyright © 1998-2002 Caucho Technology, Inc. All rights reserved.
Resin® is a registered trademark, and HardCore™ and Quercus™ are trademarks of Caucho Technology, Inc.