
All about JAXP

http://www-128.ibm.com/developerworks/java/library/x-jaxp

Brett McLaughlin

 

Document Object Model (DOM)

Simple API for XML (SAX)

 

JDK1.4

javax.xml.parsers
javax.xml.transform
javax.xml.transform.dom
javax.xml.transform.sax
javax.xml.transform.stream

 

JDK5.0

javax.xml
javax.xml.datatype
javax.xml.namespace
javax.xml.parsers
javax.xml.transform
javax.xml.transform.dom
javax.xml.transform.sax
javax.xml.transform.stream
javax.xml.validation
javax.xml.xpath

 

 

Strictly speaking, JAXP does not provide a new way to parse XML. It makes it easier to use DOM or SAX for parsing tasks, and it lets you use DOM and SAX in a vendor-neutral way.

 

JAXP cannot be compared with SAX, DOM, JDOM, and dom4j (all four of which can parse XML), because it does not provide a new way of parsing XML.

 

The JAXP API is located in the javax.xml.parsers package. All six of its classes sit on top of an existing parser.

 

JDOM and dom4j each provide a different object model for receiving data from SAX/DOM. Internally both of them use SAX, with some modifications.

 

In addition, in Java 1.5 the Xerces package org.apache.xerces was moved to com.sun.org.apache.xerces.internal.

 

First, the com.sun.xml.tree.XMLDocument class is not part of JAXP. It is part of Sun's Crimson parser, which was packaged with earlier versions of JAXP.

 

Second, a major purpose of JAXP is to provide vendor independence when dealing with parsers. With JAXP, you can use the same code with Sun's XML parser, Apache's Xerces XML parser, and Oracle's XML parser.

 

Starting with SAX: we only need to extend DefaultHandler (in the org.xml.sax.helpers package) to get all the callbacks, and then add implementation code only to the methods we need.

 

Here's the typical SAX routine:

1 Create a SAXParser instance using a specific vendor's parser implementation.

2 Register callback implementations (by using a class that extends DefaultHandler, for example).

3 Start parsing and sit back as your callback implementations are fired off.

 

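With plain SAX you must name the XML driver explicitly (for example org.apache.xerces.parsers.SAXParser). JAXP offers a better option: it uses whichever parser implementation we supply (configured on the classpath), with no code changes.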

   

SAXParserFactory.newInstance().newSAXParser() returns a JAXP SAXParser, a class that wraps a SAX parser (an instance of the SAX class org.xml.sax.XMLReader).
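A minimal sketch of the JAXP SAX routine, assuming the XML is held in a string. The class name SaxElementCounter and the method countElements are made up for illustration and do not come from the original article.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, for illustration only.
class SaxElementCounter {

    /** Counts the elements in an XML string using whichever parser JAXP locates. */
    static int countElements(String xml) throws Exception {
        // The factory selects the parser implementation; no vendor class is named here.
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser parser = factory.newSAXParser();

        final int[] count = {0};
        // Extend DefaultHandler and override only the callbacks of interest.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                count[0]++;
            }
        };

        // Parsing fires the callbacks; here the input is an InputStream.
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countElements("<order><item/><item/></order>"));
    }
}
```

Swapping in a different parser implementation should not require touching this code, which is the vendor independence described above.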

 

In Listing 1, you can see that two JAXP-specific problems can occur in using the factory: the inability to obtain or configure a SAX factory, and the inability to configure a SAX parser.

The first of these problems, represented by a FactoryConfigurationError, usually occurs when the parser specified in a JAXP implementation or system property cannot be obtained.

The second problem, represented by a ParserConfigurationException, occurs when a requested feature is not available in the parser being used. Both are easy to deal with and shouldn't pose any difficulty when using JAXP.
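Listing 1 of the original article is not reproduced in these notes. As a rough sketch under that caveat, the two failure modes might be handled as follows; the class SaxParserSetup and the method createParser are hypothetical names.

```java
import javax.xml.parsers.FactoryConfigurationError;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;

// Hypothetical helper, for illustration only.
class SaxParserSetup {

    /** Creates a SAXParser and turns the JAXP-specific failures into clear messages. */
    static SAXParser createParser(boolean namespaceAware) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(namespaceAware);
            return factory.newSAXParser();
        } catch (FactoryConfigurationError e) {
            // Problem 1: the configured factory could not be found or instantiated.
            throw new IllegalStateException("No usable SAX parser factory: " + e.getMessage(), e);
        } catch (ParserConfigurationException e) {
            // Problem 2: the parser cannot satisfy the requested configuration.
            throw new IllegalStateException("Parser does not support the requested configuration", e);
        } catch (SAXException e) {
            // newSAXParser() also declares SAXException for other initialisation errors.
            throw new IllegalStateException("SAX parser could not be initialised", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(createParser(true).isNamespaceAware());
    }
}
```

Note that FactoryConfigurationError is an Error rather than an Exception, so a plain catch (Exception e) block would not catch it.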

 

The SAXParser parse() method accepts a SAX InputSource, a Java InputStream, or a URL in String form.

 

The getXMLReader() method of SAXParser returns the underlying SAX parser (an instance of org.xml.sax.XMLReader), which gives you access to the full set of SAX parser methods. [See Listing 2 of the original article]
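Listing 2 is not included in these notes. A small sketch of the idea, using only standard SAX calls on the underlying XMLReader; the class UnderlyingReaderDemo and the method elementNames are illustrative names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, for illustration only.
class UnderlyingReaderDemo {

    /** Collects element names in document order by driving the raw XMLReader. */
    static List<String> elementNames(String xml) throws Exception {
        // Obtain the SAX XMLReader that the JAXP SAXParser wraps.
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();

        final List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                names.add(qName);
            }
        };

        // With the raw reader, handlers are registered through the SAX API itself.
        reader.setContentHandler(handler);
        reader.setErrorHandler(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(elementNames("<a><b/></a>"));
    }
}
```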

    

 

Using DOM

The only difference between DOM and SAX in this respect is that with DOM you substitute DocumentBuilderFactory for SAXParserFactory, and DocumentBuilder for SAXParser.

 

The major difference is that variations of the parse() method do not take an instance of the SAX DefaultHandler class. Instead they return a DOM Document instance representing the XML document that was parsed. The only other difference is that two methods are provided for SAX-like functionality:

  • setErrorHandler(), which takes a SAX ErrorHandler implementation to handle problems that might arise in parsing
  • setEntityResolver(), which takes a SAX EntityResolver implementation to handle entity resolution
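The DOM side can be sketched in the same way. This example assumes an in-memory XML string; the class DomParseDemo is a hypothetical name, and the entity resolver that returns empty content for every external entity is a choice made for this sketch, not something the article prescribes.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper, for illustration only.
class DomParseDemo {

    /** Parses an XML string into a DOM Document. */
    static Document parse(String xml) throws Exception {
        // Same pattern as SAX, with DocumentBuilderFactory and DocumentBuilder.
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();

        // SAX-style error handling: log warnings, fail on errors.
        builder.setErrorHandler(new ErrorHandler() {
            @Override
            public void warning(SAXParseException e) {
                System.err.println("warning: " + e.getMessage());
            }
            @Override
            public void error(SAXParseException e) throws SAXException {
                throw e;
            }
            @Override
            public void fatalError(SAXParseException e) throws SAXException {
                throw e;
            }
        });

        // SAX-style entity resolution: every external entity resolves to empty content.
        builder.setEntityResolver((publicId, systemId) -> new InputSource(new StringReader("")));

        // parse() takes no handler and returns the Document directly.
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parse("<a><b/></a>").getDocumentElement().getNodeName());
    }
}
```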

Using JAXP

1. Source for input

The javax.xml.transform.Source interface is the basis for all input into JAXP and the transformation API. This interface defines only two methods: getSystemId() and setSystemId(String systemId).

 

 

JAXP provides three classes that implement the Source interface:

  • javax.xml.transform.dom.DOMSource passes a DOM Node (and its children) into JAXP.
  • javax.xml.transform.sax.SAXSource passes the results of SAX callbacks (from an XMLReader) into JAXP.
  • javax.xml.transform.stream.StreamSource passes XML wrapped in a File, InputStream, or Reader into JAXP.

2. Result for output

  javax.xml.transform.Result likewise has two methods, getSystemId() and setSystemId(String systemId), and likewise has three implementing classes:

  • javax.xml.transform.dom.DOMResult passes transformed content into a DOM Node.
  • javax.xml.transform.sax.SAXResult passes the results of a transformation to a SAX ContentHandler.
  • javax.xml.transform.stream.StreamResult passes the transformed *ML into a File, OutputStream, or Writer.
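To show how a Source and a Result pair up, here is a small sketch that feeds a DOMSource into a StreamResult through an identity Transformer (one created without a stylesheet), which serializes a DOM tree to a string. The class DomToString and the method serialize are illustrative names.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only.
class DomToString {

    /** Parses XML into a DOM tree, then writes the tree back out as text. */
    static String serialize(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // newTransformer() with no arguments copies the Source to the Result unchanged.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        // Input: a DOM Node wrapped in DOMSource. Output: a Writer wrapped in StreamResult.
        identity.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize("<a><b>hi</b></a>"));
    }
}
```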

3. Performing transformations with JAXP

1)Getting a Factory     

2)Creating a Transformer   
3)Performing the transformation    
4)Caching XSL stylesheets

Using JAXP this way has two significant limitations:

  • The Transformer object processes the XSL stylesheet each and every time transform() is executed.
  • Instances of Transformer are not thread-safe. You can't use the same instances across multiple threads.

In other words, a Transformer instance is not thread-safe, so each thread needs its own instance.
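The basic usage that these limitations refer to (get a factory, create a Transformer from the stylesheet, perform the transformation) might look like the sketch below. The class SimpleTransform and the sample stylesheet are invented for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, for illustration only.
class SimpleTransform {

    /** Applies an XSL stylesheet (as a string) to an XML document (as a string). */
    static String transform(String xml, String xsl) throws Exception {
        // 1) Get a factory.
        TransformerFactory factory = TransformerFactory.newInstance();
        // 2) Create a Transformer; the stylesheet is processed here, on every call.
        Transformer transformer = factory.newTransformer(new StreamSource(new StringReader(xsl)));
        // 3) Perform the transformation from a Source into a Result.
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xsl =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:output method='text'/>"
          + "<xsl:template match='/'>Hello, <xsl:value-of select='/greeting/@name'/>!</xsl:template>"
          + "</xsl:stylesheet>";
        System.out.println(transform("<greeting name='JAXP'/>", xsl));
    }
}
```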

5)Loading a Template

The relevant interface is javax.xml.transform.Templates.

 

The Templates interface is thread-safe (addressing the second limitation) and represents a compiled stylesheet (addressing the first limitation).

A Templates instance is thread-safe and holds an already compiled XSL stylesheet, which resolves both of the limitations above.
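A sketch of the Templates approach, assuming a small thread pool and an inline stylesheet: the stylesheet is compiled once, and each task creates its own Transformer from the shared Templates object. The class TemplatesDemo, the pool size, and the sample stylesheet are all assumptions made for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, for illustration only.
class TemplatesDemo {

    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'><xsl:value-of select='/n'/>-done</xsl:template>"
      + "</xsl:stylesheet>";

    /** Transforms every input concurrently, sharing one compiled stylesheet. */
    static List<String> transformAll(List<String> inputs) throws Exception {
        // Compile the stylesheet once; the Templates object may be shared across threads.
        Templates templates = TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(XSL)));

        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String xml : inputs) {
                futures.add(pool.submit(() -> {
                    // Each task gets its own Transformer, which is not shared.
                    Transformer transformer = templates.newTransformer();
                    StringWriter out = new StringWriter();
                    transformer.transform(new StreamSource(new StringReader(xml)),
                                          new StreamResult(out));
                    return out.toString();
                }));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> inputs = new ArrayList<>();
        inputs.add("<n>1</n>");
        inputs.add("<n>2</n>");
        System.out.println(transformAll(inputs));
    }
}
```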

6)From Transformer to Templates

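If you only need to perform a single transformation with a single stylesheet, using a Transformer directly is faster and there is no need for a Templates object. Because of the thread-safety issue, however, Templates is still the recommended choice.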

7)Changing the XSL processor

JAXP uses Xalan-J by default. To use a different XSL processor, set the javax.xml.transform.TransformerFactory system property.


java -Djavax.xml.transform.TransformerFactory=[transformer.impl.class] TestTransformations simple.xml simple.xsl
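To check which implementation was actually picked up, a small sketch like the following can help. The class ShowTransformerFactory is a hypothetical name, and the printed class name depends on the JDK and classpath in use.

```java
import javax.xml.transform.TransformerFactory;

// Hypothetical helper, for illustration only.
class ShowTransformerFactory {

    /** Returns the class name of the TransformerFactory implementation JAXP selected. */
    static String factoryClassName() {
        return TransformerFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        // The property is null unless it was set, for example with -D on the command line.
        System.out.println("System property: "
                + System.getProperty("javax.xml.transform.TransformerFactory"));
        System.out.println("Factory in use:  " + factoryClassName());
    }
}
```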

 

posted on 2006-09-08 13:15 Lizzie

  
