The Saving-a-DOM Dilemma
Listing 9.1 JaxpSave.java
01: package org.javapitfalls.item9;
02:
03: import javax.xml.parsers.*;
04: import javax.xml.transform.*;
05: import javax.xml.transform.dom.*;
06: import javax.xml.transform.stream.*;
07: import java.io.*;
08: import org.w3c.dom.*;
09:
10: class JaxpSave
11: {
12:     public static void main(String[] args)
13:     {
14:         try
15:         {
            // ... command-line check omitted for brevity ...
23:             // load the document
24:             DocumentBuilderFactory dbf =
25:                     DocumentBuilderFactory.newInstance();
26:             DocumentBuilder db = dbf.newDocumentBuilder();
27:             Document doc = db.parse(new File(args[0]));
28:             String systemValue = doc.getDoctype().getSystemId();
29:
30:             // save to output file
31:             File f = new File(args[1]);
32:             FileWriter fw = new FileWriter(f);
33:
34:             /* write method USED To be in Sun's XmlDocument class.
35:                The XmlDocument class preceded JAXP. */
36:
37:             // Currently only way to do this is via a transform
38:             TransformerFactory tff = TransformerFactory.newInstance();
39:             // Default transform is a copy
40:             Transformer tf = tff.newTransformer();
41:             tf.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, systemValue);
42:             DOMSource ds = new DOMSource(doc.getDocumentElement());
43:             StreamResult sr = new StreamResult(fw);
44:             tf.transform(ds, sr);
45:             fw.close(); // flush and release the output file
46:         } catch (Throwable t)
47:         {
48:             t.printStackTrace();
49:         }
50:     }
51: }
While JAXP is the "official" API distributed with Sun's JDK, using a transform to save a DOM is not recommended. It is nonintuitive and bears little resemblance to the proposed W3C DOM Level 3 standard discussed next. The save is performed with the XSLT classes in JAXP: a TransformerFactory is created, which in turn is used to create a Transformer. A transform requires a source and a result, and JAXP offers several types of each. Here a DOMSource wraps the document (line 42) and a StreamResult wraps the output writer (line 43).
Line 40 of Listing 9.1 is the key to understanding the code, since it creates the default transformer. Normally an XSLT transform is driven by an XSLT script, but since we are only interested in a node-for-node copy, we don't need one. The no-argument newTransformer() method returns a "default" transformer that provides exactly this copy functionality. An overloaded newTransformer(Source s) method accepts an XSLT script as its source. To sum up, once we have a transformer, a source, and a result, the transform operation performs the copy and thus serializes the DOM.
One extra step is necessary if we want to preserve the Document Type declaration from the source document. Line 41 sets an output property on the transformer (OutputKeys.DOCTYPE_SYSTEM) that adds a Document Type declaration to the result document. We set it to the system identifier read from the source document on line 28.
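The copy-by-transform technique described above can be tried without any input files. The following is a minimal sketch, not the book's listing: the class name IdentitySaveSketch, the sample "note" document, and the "note.dtd" system identifier are all made up for illustration. It builds a small DOM in memory, copies it through the default transformer into a string, and carries the DOCTYPE system identifier across only when the document actually has one (Listing 9.1 would throw a NullPointerException on line 28 for a document without a DOCTYPE).

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;

class IdentitySaveSketch {

    // Builds a tiny in-memory document with a DOCTYPE (hypothetical sample data).
    static Document buildSample() {
        try {
            DOMImplementation impl = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().getDOMImplementation();
            DocumentType doctype = impl.createDocumentType("note", null, "note.dtd");
            Document doc = impl.createDocument(null, "note", doctype);
            doc.getDocumentElement().appendChild(doc.createTextNode("hello"));
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Serializes the DOM by running it through the default (copying) transformer.
    static String serialize(Document doc) {
        try {
            // No stylesheet argument: the transformer performs a node-for-node copy.
            Transformer tf = TransformerFactory.newInstance().newTransformer();

            // Preserve the DOCTYPE system identifier only if the source has one.
            DocumentType doctype = doc.getDoctype();
            if (doctype != null && doctype.getSystemId() != null) {
                tf.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, doctype.getSystemId());
            }

            StringWriter out = new StringWriter();
            tf.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialize(buildSample()));
    }
}
```

Writing to a StringWriter instead of a FileWriter is purely a convenience for experimenting; any Writer or OutputStream can back the StreamResult.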
Why has JAXP not defined the standard for saving a DOM? Simply because the DOM itself is not a Java standard but a W3C standard. JAXP currently implements DOM Level 2. The W3C is developing an extension to the DOM called DOM Level 3 to enhance and standardize the use of the DOM in multiple areas. Table 9.1 details the various parts of the DOM Level 3 specification.
Table 9.1 DOM Level 3 Specifications
DOM LEVEL 3 SPECIFICATION | DESCRIPTION
DOM Level 3 Core | Base set of interfaces describing the document object model. Enhanced in this third version.
DOM Level 3 XPath Specification | A set of interfaces and methods to access a DOM via XPath expressions.
DOM Level 3 Abstract Schemas and Load and Save Specification | This specification defines two subspecifications: the Abstract Schemas specification and the Load and Save specification. The Abstract Schemas specification represents abstract schemas (DTDs and schemas) in memory. The Load and Save specification specifies the parsing and saving of DOMs.
DOM Level 3 Events Specification | This specification defines an event generation, propagation, and handling model for DOM events. It builds on the DOM Level 2 event model.
DOM Level 3 Views and Formatting | This specification defines interfaces to represent a calculated view (presentation) of a DOM. It builds on the DOM Level 2 Views model.
Table 9.2 DOM Level 3 Load and Save Interfaces
W3C DOM LEVEL 3 LS INTERFACE | DESCRIPTION
DOMImplementationLS | A DOMImplementation interface that provides factory methods for creating the DOMWriter, DOMBuilder, and DOMInputSource objects.
DOMBuilder | A parser interface.
DOMInputSource | An interface that encapsulates information about the document to be loaded.
DOMEntityResolver | An interface to provide a method for applications to redirect references to external entities.
DOMBuilderFilter | An interface to allow element nodes to be modified or removed as they are encountered during parsing.
DOMWriter | An interface for serializing DOM Documents.
DocumentLS | An extended document interface with built-in load and save methods.
ParseErrorEvent | Event fired if there is an error in parsing.
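For readers who want to try the Load and Save approach, note that the interface names in Table 9.2 come from the working draft. In the org.w3c.dom.ls package that later shipped with the JDK, DOMWriter, DOMBuilder, and DOMInputSource appear as LSSerializer, LSParser, and LSInput. The following is a minimal sketch under that assumption. The class name LoadSaveSketch and the sample XML are hypothetical, and it relies on the JDK's bundled DOM implementation reporting support for the "LS" feature.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.Document;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;
import org.w3c.dom.ls.LSSerializer;

class LoadSaveSketch {

    // Asks the DOM implementation for its Load and Save factory, if it has one.
    static DOMImplementationLS loadSaveImplementation() {
        try {
            DOMImplementation impl = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().getDOMImplementation();
            Object feature = impl.getFeature("LS", "3.0");
            if (!(feature instanceof DOMImplementationLS)) {
                throw new IllegalStateException("DOM Level 3 Load and Save is not supported");
            }
            return (DOMImplementationLS) feature;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Load: parse XML text into a Document using an LSParser.
    static Document load(String xml) {
        DOMImplementationLS ls = loadSaveImplementation();
        LSParser parser = ls.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);
        LSInput input = ls.createLSInput();
        input.setStringData(xml);
        return parser.parse(input);
    }

    // Save: serialize a Document to a String using an LSSerializer.
    static String save(Document doc) {
        LSSerializer serializer = loadSaveImplementation().createLSSerializer();
        return serializer.writeToString(doc);
    }

    public static void main(String[] args) {
        System.out.println(save(load("<note>hello</note>")));
    }
}
```

Compared with the transform in Listing 9.1, the save here is a single call on an object whose only job is serialization, which is the intuitiveness the Load and Save specification aims for.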