114: } catch (Throwable t)
Listing 8.3 (continued)
The key class in GoodDomLookup is DomUtil, which has three methods. These methods solve the DOM lookup problem in two ways. The first approach is to retrieve the first child element (rather than the first child node) when performing a lookup: the getFirstChildElement() method skips any intermediate nodes that are not of type ELEMENT_NODE. The second approach is to eliminate all "blank" text nodes from the document. While both solutions work, the second may remove some whitespace that is not actually ignorable.
A run of GoodDomLookup.java gives us the following:
e:\classes\org\javapitfalls>java org.javapitfalls.item8.GoodDomLookup myaddresses.xml
Method #1: Skip Ignorable White space...
# of "ADDRESS" elements: 2
This node name is: ADDRESS
This node name is: NAME
Method #2: Normalize document...
# of "ADDRESS" elements: 2
This node name is: ADDRESS
This node name is: NAME
A better way to access nodes in a DOM tree is to use an XPath expression. XPath is a W3C standard for accessing nodes in a DOM tree. Standard API methods for evaluating XPath expressions are part of DOM Level 3. Currently, JAXP supports only DOM Level 2. To demonstrate how easy accessing nodes is via XPath, Listing 8.4 uses the DOM4J open source library (which includes XPath support) to perform the same task as GoodDomLookup.java.
package org.javapitfalls.item8;

import javax.xml.parsers.*;
import java.io.*;
import org.w3c.dom.*;
import org.dom4j.*;
import org.dom4j.io.*;

public class XpathLookup
{
    public static void main(String[] args)
    {
        try
        {
            if (args.length < 1)
            {
                System.out.println("USAGE: " +
                    "org.javapitfalls.item8.XpathLookup xmlfile");
                System.exit(1);
            }
            // Parse the file into a W3C DOM with JAXP.
            DocumentBuilderFactory dbf =
                DocumentBuilderFactory.newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();
            org.w3c.dom.Document doc = db.parse(new File(args[0]));
            // Convert the W3C DOM to a DOM4J document, then query it.
            DOMReader dr = new DOMReader();
            org.dom4j.Document xpDoc = dr.read(doc);
            org.dom4j.Node node = xpDoc.selectSingleNode(
                /* XPath expression (line 30 of the original listing) is missing here */);
            System.out.println("Node name : " + node.getName());
            System.out.println("Node value: " + node.getText());
        } catch (Exception e)
        {
            e.printStackTrace();
        }
    }
}
Listing 8.4 XpathLookup.java
A run of XpathLookup.java on myaddresses.xml produces the following output:
E:\classes\org\javapitfalls>java org.javapitfalls.item8.XpathLookup myaddresses.xml
Node name : NAME
Node value: Joe Jones
The XpathLookup.java program calls the selectSingleNode() method in the DOM4J API with an XPath expression as its argument. The XPath recommendation can be viewed at http://www.w3.org/TR/xpath. It is important to understand that evaluation of XPath expressions will be part of the org.w3c.dom API when DOM Level 3 is implemented by JAXP.
In conclusion, when searching a DOM, remember to handle whitespace nodes, or better, use XPath to search the DOM, since its robust expression syntax allows very fine-grained access to one or more nodes.
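Later JAXP releases (JAXP 1.3, bundled with Java 5 and up) ship a javax.xml.xpath package, so the same kind of lookup can be sketched without a third-party library. The sketch below is an illustration under assumptions, not the listing's code: the class name, the ADDRESS_BOOK root element, the second address entry, and the XPath expression are all hypothetical, since myaddresses.xml is not reproduced in this item.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class StandardXpathLookup {
    // Hypothetical sample data standing in for myaddresses.xml.
    static final String SAMPLE =
        "<ADDRESS_BOOK>\n"
      + "  <ADDRESS>\n"
      + "    <NAME>Joe Jones</NAME>\n"
      + "  </ADDRESS>\n"
      + "  <ADDRESS>\n"
      + "    <NAME>Ann Smith</NAME>\n"
      + "  </ADDRESS>\n"
      + "</ADDRESS_BOOK>";

    // Parses the XML text and returns the NAME element of the first ADDRESS.
    static Node firstName(String xml) throws Exception {
        DocumentBuilder db =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = db.parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        // The expression steps over whitespace text nodes automatically,
        // because it only names element steps.
        return (Node) xpath.evaluate(
            "//ADDRESS[1]/NAME", doc, XPathConstants.NODE);
    }

    public static void main(String[] args) throws Exception {
        Node node = firstName(SAMPLE);
        System.out.println("Node name : " + node.getNodeName());
        System.out.println("Node value: " + node.getTextContent());
    }
}
```

Note that the whitespace between elements in the sample never has to be handled explicitly, which is the point of preferring XPath over manual child traversal.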
Item 9: The Saving-a-DOM Dilemma
One of the motivations for JAXP was to standardize the creation of a DOM. A DOM can be created via parsing an existing file or instantiating and inserting the nodes individually. JAXP abstracts the parsing operation via a set of interfaces so that different XML parsers can be easily used. Figure 9.1 shows a simple lifecycle of a DOM.
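The two routes to a new DOM mentioned above can be sketched with the standard JAXP interfaces. This is a minimal illustration, assuming a hypothetical class name and a tiny ADDRESS/NAME document. It is not code from the DomEditor program.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class DomCreation {
    // Route 1: let the parser that JAXP locates build the DOM from XML text.
    static Document parse(String xml) throws Exception {
        DocumentBuilder db =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return db.parse(new InputSource(new StringReader(xml)));
    }

    // Route 2: start from an empty Document and insert nodes one at a time.
    static Document build() throws Exception {
        DocumentBuilder db =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = db.newDocument();
        Element address = doc.createElement("ADDRESS");
        Element name = doc.createElement("NAME");
        name.appendChild(doc.createTextNode("Joe Jones"));
        address.appendChild(name);
        doc.appendChild(address);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        Document parsed = parse("<ADDRESS><NAME>Joe Jones</NAME></ADDRESS>");
        Document built = build();
        System.out.println(parsed.getDocumentElement().getNodeName());
        System.out.println(
            built.getDocumentElement().getFirstChild().getNodeName());
    }
}
```

Neither route names a concrete parser class. The factory lookup is what lets a different parser be swapped in without changing this code.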
Figure 9.1 focuses on three states of a DOM: New, Modified, and Persisted. The New state can be reached either by instantiating a DOM object via the new keyword or by loading an XML file from disk. This "loading" action invokes a parser to parse the XML file. An edit, insert, or delete action moves the DOM to the Modified state. The Save action transitions the DOM to the Persisted state. The asterisk by the save operation indicates that Save is implemented by saving the DOM to an XML file (the "save" operation without the asterisk).
Figure 9.1 DOM lifecycle.
An enhancement to the DomViewer program demonstrated in Item 8 is to allow editing of the DOM nodes and then saving the modified DOM out to a file (persisting it). Figure 9.2 is a screen shot of the DomEditor program that implements that functionality.
Figure 9.2 DomEditor saving a DOM.
When implementing the save operation in the DomEditor, we run into a dilemma: how should the DOM be saved? Unfortunately, there are too many ways to save a DOM, and each one is different. The current situation is that saving a DOM depends on which DOM implementation you choose. The dilemma is picking an implementation that will not become obsolete with the next version of the parser. In this item, we will demonstrate three approaches to saving a DOM. Each listing performs the same function: it loads a document from a file and saves the DOM to the specified output file.
The toughest part of this dilemma is the nonstandard and nonintuitive way prescribed by Sun Microsystems and implemented in JAXP to perform the save. The JAXP method for saving a DOM is to use a default XSLT transform that copies all the nodes of the source DOM (called a DOMSource) into an output stream (called a StreamResult). Listing 9.1 demonstrates saving an XML document via JAXP.
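Reduced to its essentials, the default-transform save comes down to a handful of calls. The sketch below is a condensed illustration separate from Listing 9.1. The class name and sample document are hypothetical, and it writes to a StringWriter rather than a file so that it is self-contained.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class IdentitySave {
    // Serializes a DOM by running it through the default (identity) transform.
    static String save(Document doc) throws Exception {
        // newTransformer() with no stylesheet argument copies source to result.
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(
                "<ADDRESS><NAME>Joe Jones</NAME></ADDRESS>")));
        System.out.println(save(doc));
    }
}
```

To persist to disk instead, the StreamResult can wrap a java.io.File or an OutputStream in place of the StringWriter.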