 SAX Parsing


SAX parsers use a callback mechanism to parse XML. Applications register a ContentHandler to receive the parsing events. This is a little more complicated than walking a parsed document tree, but it is more efficient because the parser does not need to build an in-memory representation of the document.

Because Sun's JAXP API only supports SAX 1.0, the following example instantiates the Resin parser (Xml) directly. To parse HTML through the same SAX API, instantiate Html instead.

The following example prints each element name, indented according to its nesting depth, as it is parsed.

Input File
<top>
  <a/>
  <b>
    <b1/>
    <b2/>
  </b>
  <c/>
</top>

DefaultHandler is a convenience base class that provides empty implementations of the SAX handler interfaces. By extending DefaultHandler, you only need to override the methods your application uses.

The SAX parser calls startElement() when it finishes parsing an open tag. This example ignores the namespace arguments (namespaceURI and localName) and uses only the qualified tag name, qName. After printing the name, it increases depth so that child elements are indented further.

The endElement() callback decrements depth so that the elements following a close tag are printed at the correct indentation.

SAX Handler Class
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class MyHandler extends DefaultHandler {
  // Current indentation in spaces; grows by 2 per nesting level
  private int depth;

  // Called after the parser has read an open tag
  @Override
  public void startElement(String namespaceURI, String localName,
                           String qName, Attributes atts)
  {
    for (int i = 0; i < depth; i++)
      System.out.print(" ");

    System.out.println(qName);

    depth += 2;
  }

  // Called at the close tag; restores the parent's indentation
  @Override
  public void endElement(String namespaceURI, String localName,
                         String qName)
  {
    depth -= 2;
  }
}
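
The handler above relies only on qName. As a rough illustration of what the other two name arguments carry, the sketch below records all three for each open tag. It is not Resin-specific: it assumes a JDK whose bundled JAXP parser supports SAX 2 handlers and namespace-aware parsing, and the class name NameDemo, the method names(), and the urn:example namespace are made up for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of Resin or the original example.
class NameDemo {
  // Returns "namespaceURI | localName | qName" for every open tag in xml.
  static List<String> names(String xml) throws Exception {
    SAXParserFactory factory = SAXParserFactory.newInstance();
    // Assumption: namespace processing is switched on so that
    // namespaceURI and localName are filled in.
    factory.setNamespaceAware(true);

    final List<String> seen = new ArrayList<>();
    DefaultHandler handler = new DefaultHandler() {
      @Override
      public void startElement(String namespaceURI, String localName,
                               String qName, Attributes atts)
      {
        seen.add(namespaceURI + " | " + localName + " | " + qName);
      }
    };

    factory.newSAXParser().parse(new InputSource(new StringReader(xml)),
                                 handler);
    return seen;
  }

  public static void main(String[] args) throws Exception {
    String xml = "<p:top xmlns:p=\"urn:example\"><p:a/></p:top>";
    for (String line : names(xml))
      System.out.println(line);
  }
}
```

For documents that do not use namespaces, qName alone is usually enough, which is why the handler in this section ignores the other two arguments.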

To parse with SAX, you must register your handler using setContentHandler.

Parsing using SAX
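// Create a new parser
Xml xml = new Xml();

// Register the handler that will receive the SAX events
xml.setContentHandler(new MyHandler());

// Parse the file; the handler's callbacks run during this call
xml.parse("test.xml");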

And the result looks like:

top
  a
  b
    b1
    b2
  c
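
To try the same indentation logic without Resin's parser classes on the classpath, the sketch below may help. It assumes a JDK whose bundled JAXP parser accepts a SAX 2 DefaultHandler, reads the XML from a string instead of test.xml, and collects the output in a buffer rather than printing it directly. The names IndentDemo, IndentHandler, and outline() are invented for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of Resin or the original example.
class IndentDemo {
  // Same logic as MyHandler, but appends to a buffer instead of System.out.
  static class IndentHandler extends DefaultHandler {
    private final StringBuilder out = new StringBuilder();
    private int depth;

    @Override
    public void startElement(String namespaceURI, String localName,
                             String qName, Attributes atts)
    {
      for (int i = 0; i < depth; i++)
        out.append(' ');

      out.append(qName).append('\n');

      depth += 2;
    }

    @Override
    public void endElement(String namespaceURI, String localName,
                           String qName)
    {
      depth -= 2;
    }

    String result() {
      return out.toString();
    }
  }

  // Parses xml with the JDK's default SAX parser and returns the outline.
  static String outline(String xml) throws Exception {
    SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
    IndentHandler handler = new IndentHandler();
    parser.parse(new InputSource(new StringReader(xml)), handler);
    return handler.result();
  }

  public static void main(String[] args) throws Exception {
    System.out.print(outline("<top><a/><b><b1/><b2/></b><c/></top>"));
  }
}
```

Collecting the output in a StringBuilder is only a convenience for checking the result programmatically; printing directly, as MyHandler does, works just as well.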


Copyright © 1998-2002 Caucho Technology, Inc. All rights reserved.
Resin® is a registered trademark, and HardCore™ and Quercus™ are trademarks of Caucho Technology, Inc.