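Several Ways to Resolve Namespaces with the Java API

Author: Anonymous, 2009-06-23 14:23:00

This article describes three different ways of providing the mapping from prefixes to namespaces through the Java API. It also includes sample code to help you write your own NamespaceContext.

Prerequisites and examples

All examples in this article use the following XML file: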

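Listing 1. Sample XML

<books:booklist
  xmlns:books="http://univNaSpResolver/booklist"
  xmlns="http://univNaSpResolver/book"
  xmlns:fiction="http://univNaSpResolver/fictionbook">
  <science:book xmlns:science="http://univNaSpResolver/sciencebook">
    <!-- ... -->
    <author>Michael Schmidt</author>
  </science:book>
  <fiction:book>
    <!-- ... -->
    <author>Johann Wolfgang von Goethe</author>
  </fiction:book>
  <fiction:book>
    <!-- ... -->
    <author>Johann Wolfgang von Goethe</author>
  </fiction:book>
</books:booklist>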

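This sample XML declares three namespaces on the root element and one more on an element deeper in the structure. You will see the difference this setup makes.

A second point of interest in this sample is that the element booklist has three children, all named book. The first child, however, is in the namespace science, while the others are in the namespace fiction. This means that, as far as XPath is concerned, these are entirely different elements. The following examples show the consequences of this.

One note on the sample source code: it is optimized for readability, not for maintenance, which means it contains some redundancy. Output is produced in the simplest possible way, through System.out.println(). In this article, the lines of code that deal with output are abbreviated as "...".

Theoretical background

What is a namespace for, and why care about it? A namespace is part of the identifier of an element or attribute. Elements or attributes may share the same local name and still belong to different namespaces, in which case they are completely different. See the sample above (science:book and fiction:book). To combine XML from different sources, you need namespaces to resolve naming conflicts. Take an XSLT file as an example: it contains elements of the XSLT namespace, elements from your own namespace and (usually) elements of the XHTML namespace. With namespaces you avoid the ambiguity that elements with the same local name would otherwise cause.

A namespace is defined by a URI (in this example http://univNaSpResolver/booklist). To avoid repeating this long string, you define a prefix that is associated with the URI (in this example books). Keep in mind that the prefix is like a variable: its name does not matter. If two prefixes refer to the same URI, the prefixed elements are in the same namespace (see example 1 in Listing 5).

XPath expressions use prefixes (for example books:booklist/science:book), and you have to supply the URI associated with each prefix. This is where NamespaceContext comes in; it serves exactly this purpose.

This article shows different ways of providing the mapping between prefixes and URIs.

In the XML file, the mapping is provided by xmlns attributes such as xmlns:books="http://univNaSpResolver/booklist" or xmlns="http://univNaSpResolver/book" (the default namespace).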
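The two points above (a prefix is only an alias for a URI, and the alias is introduced by an xmlns attribute) can be checked with a few DOM calls. The following is a minimal sketch, not part of the original example code: the class name NamespaceAliasDemo, the tiny two-element document and the helper methods are assumptions made for illustration. It also assumes a namespace-aware parser, which DocumentBuilderFactory does not provide by default.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class NamespaceAliasDemo {

    // Hypothetical document: the prefixes "a" and "b" are bound to the same URI.
    static final String XML =
            "<root xmlns:a='http://univNaSpResolver/booklist'"
            + " xmlns:b='http://univNaSpResolver/booklist'>"
            + "<a:item/><b:item/></root>";

    /** Parses the XML with a namespace-aware parser; failures become unchecked. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // namespace awareness is off by default
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample XML", e);
        }
    }

    /** True if the two children have the same URI and local name but different prefixes. */
    static boolean sameExpandedName() {
        Element root = parse(XML).getDocumentElement();
        Element first = (Element) root.getFirstChild();
        Element second = (Element) root.getLastChild();
        return first.getNamespaceURI().equals(second.getNamespaceURI())
                && first.getLocalName().equals(second.getLocalName())
                && !first.getPrefix().equals(second.getPrefix());
    }

    /** Reads the declaration xmlns:a back as an ordinary DOM attribute. */
    static String declaredUriForPrefixA() {
        Element root = parse(XML).getDocumentElement();
        return root.getAttributeNS(XMLConstants.XMLNS_ATTRIBUTE_NS_URI, "a");
    }

    public static void main(String[] args) {
        System.out.println("Differ in prefix only: " + sameExpandedName());
        System.out.println("xmlns:a = " + declaredUriForPrefixA());
    }
}
```

The later resolvers in this article depend on the same parser setting: if the document is built without setNamespaceAware(true), the DOM carries no namespace URIs for them to find.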

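The need to provide namespace resolution

If the XML uses namespaces, XPath expressions fail unless you provide a NamespaceContext. Example 0 in Listing 2 demonstrates this. There, an XPath object is created and evaluated against the loaded XML document. First the expression is written without any namespace prefixes (result1), then with namespace prefixes (result2).

Listing 2. Example 0 without namespace resolution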

private static void example0(Document example)
            throws XPathExpressionException, TransformerException {
        sysout("\n*** Zero example - no namespaces provided ***");

        XPath xPath = XPathFactory.newInstance().newXPath();

...
        NodeList result1 = (NodeList) xPath.evaluate("booklist/book", example,
                XPathConstants.NODESET);
...
        NodeList result2 = (NodeList) xPath.evaluate(
                "books:booklist/science:book", example, XPathConstants.NODESET);
...
    }

The output is shown below.

Listing 3. Output of example 0

*** Zero example - no namespaces provided ***
First try asking without namespace prefix:
--> booklist/book
Result is of length 0
Then try asking with namespace prefix:
--> books:booklist/science:book
Result is of length 0
The expression does not work in both cases.

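In both cases the XPath evaluation returns no nodes, and no exception is thrown. XPath cannot find the nodes because the mapping from prefixes to URIs is missing: an unprefixed name in an XPath 1.0 expression only matches elements that are in no namespace, and the prefixed names cannot be tied to any URI.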
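The first half of that result, the unprefixed expression matching nothing, can be reproduced in isolation. The sketch below is an illustration only: the class name UnprefixedNameDemo, the one-element document and the eval helper are assumptions, and the local-name() comparison is used purely to show that the element is there and only its namespace keeps the plain name test from matching.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class UnprefixedNameDemo {

    // Hypothetical document: a single root element in the booklist namespace.
    static final String XML =
            "<books:booklist xmlns:books='http://univNaSpResolver/booklist'/>";

    /** Evaluates the expression against the sample and returns the string result. */
    static String eval(String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            // No NamespaceContext is set, as in example 0.
            XPath xPath = XPathFactory.newInstance().newXPath();
            return xPath.evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalStateException("Evaluation failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // The name test "booklist" asks for an element in no namespace.
        System.out.println(eval("count(/booklist)"));                      // prints 0
        // Comparing only the local name ignores the namespace entirely.
        System.out.println(eval("count(/*[local-name()='booklist'])"));    // prints 1
    }
}
```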

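Hard-coded namespace resolution

You can also supply the namespaces as hard-coded values, as in the class in Listing 4:

Listing 4. Hard-coded namespace resolution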

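import java.util.Iterator;

import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;

public class HardcodedNamespaceResolver implements NamespaceContext {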

    /**
     * This method returns the uri for all prefixes needed. Wherever possible
     * it uses XMLConstants.
     * 
     * @param prefix
     * @return uri
     */
    public String getNamespaceURI(String prefix) {
        if (prefix == null) {
            throw new IllegalArgumentException("No prefix provided!");
        } else if (prefix.equals(XMLConstants.DEFAULT_NS_PREFIX)) {
            return "http://univNaSpResolver/book";
        } else if (prefix.equals("books")) {
            return "http://univNaSpResolver/booklist";
        } else if (prefix.equals("fiction")) {
            return "http://univNaSpResolver/fictionbook";
        } else if (prefix.equals("technical")) {
            return "http://univNaSpResolver/sciencebook";
        } else {
            return XMLConstants.NULL_NS_URI;
        }
    }

    public String getPrefix(String namespaceURI) {
        // Not needed in this context.
        return null;
    }

    public Iterator<String> getPrefixes(String namespaceURI) {
        // Not needed in this context.
        return null;
    }

}

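Note that the namespace http://univNaSpResolver/sciencebook is bound to the prefix technical (not science, as before). The effect shows up in the example output that follows (Listing 6). In Listing 5, the code that uses this resolver also uses the new prefix.

Listing 5. Example 1 with hard-coded namespace resolution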

private static void example1(Document example)
            throws XPathExpressionException, TransformerException {
        sysout("\n*** First example - namespacelookup hardcoded ***");

        XPath xPath = XPathFactory.newInstance().newXPath();
        xPath.setNamespaceContext(new HardcodedNamespaceResolver());

...
        NodeList result1 = (NodeList) xPath.evaluate(
                "books:booklist/technical:book", example,
                XPathConstants.NODESET);
...
        NodeList result2 = (NodeList) xPath.evaluate(
                "books:booklist/fiction:book", example, XPathConstants.NODESET);
...
        String result = xPath.evaluate("books:booklist/technical:book/:author",
                example);
...
    }

Here is the output of this example.

Listing 6. Output of example 1

*** First example - namespacelookup hardcoded ***
Using any namespaces results in a NodeList:
--> books:booklist/technical:book
Number of Nodes: 1

  
    
    Michael Schmidt
  
--> books:booklist/fiction:book
Number of Nodes: 2

  
    
    Johann Wolfgang von Goethe
  

  
    
    Johann Wolfgang von Goethe
  
The default namespace works also:
--> books:booklist/technical:book/:author
Michael Schmidt

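As you can see, XPath now finds the nodes. The advantage is that you can rename prefixes as you like, which is what I did with the prefix science: the XML file uses the prefix science, while the XPath uses a different prefix, technical. Because the URIs are the same, XPath finds the nodes. The disadvantage is that you have to maintain the namespaces in several places (the XML, the XSD, the XPath expressions and this namespace context).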
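When the list of hard-coded bindings grows, a map can replace the if/else chain, and the two reverse lookups can then be implemented instead of returning null. The class below is a sketch under assumptions, not the article's resolver: MapNamespaceContext, its bind method, the prefix b and the inline copy of the sample document (reduced to the author children) are all made up for illustration, and the fixed xml and xmlns bindings that the NamespaceContext contract describes are not handled.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class MapNamespaceContext implements NamespaceContext {

    // Reduced copy of the sample document (assumption: only the author children).
    static final String XML =
            "<books:booklist xmlns:books='http://univNaSpResolver/booklist'"
            + " xmlns='http://univNaSpResolver/book'"
            + " xmlns:fiction='http://univNaSpResolver/fictionbook'>"
            + "<science:book xmlns:science='http://univNaSpResolver/sciencebook'>"
            + "<author>Michael Schmidt</author></science:book>"
            + "<fiction:book><author>Johann Wolfgang von Goethe</author></fiction:book>"
            + "<fiction:book><author>Johann Wolfgang von Goethe</author></fiction:book>"
            + "</books:booklist>";

    private final Map<String, String> prefixToUri = new HashMap<String, String>();

    /** Registers one binding and returns this object so calls can be chained. */
    public MapNamespaceContext bind(String prefix, String uri) {
        prefixToUri.put(prefix, uri);
        return this;
    }

    public String getNamespaceURI(String prefix) {
        if (prefix == null) {
            throw new IllegalArgumentException("No prefix provided!");
        }
        String uri = prefixToUri.get(prefix);
        return uri != null ? uri : XMLConstants.NULL_NS_URI;
    }

    public String getPrefix(String namespaceURI) {
        Iterator<String> prefixes = getPrefixes(namespaceURI);
        return prefixes.hasNext() ? prefixes.next() : null;
    }

    public Iterator<String> getPrefixes(String namespaceURI) {
        if (namespaceURI == null) {
            throw new IllegalArgumentException("No namespace URI provided!");
        }
        List<String> result = new ArrayList<String>();
        for (Map.Entry<String, String> entry : prefixToUri.entrySet()) {
            if (entry.getValue().equals(namespaceURI)) {
                result.add(entry.getKey());
            }
        }
        return result.iterator();
    }

    /** Evaluates an expression against the sample, using prefixes chosen here. */
    static String evaluate(String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xPath = XPathFactory.newInstance().newXPath();
            xPath.setNamespaceContext(new MapNamespaceContext()
                    .bind("books", "http://univNaSpResolver/booklist")
                    .bind("technical", "http://univNaSpResolver/sciencebook")
                    .bind("b", "http://univNaSpResolver/book"));
            return xPath.evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalStateException("Evaluation failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(evaluate("count(/books:booklist/technical:book)"));
        System.out.println(evaluate("/books:booklist/technical:book/b:author"));
    }
}
```

In this sketch the default namespace of the document is addressed through an ordinary prefix (b) that exists only on the XPath side, which is another instance of the renaming freedom described above.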

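Reading the namespaces from the document

The namespaces and their prefixes are recorded in the XML file itself, so they can be taken from there. The simplest way to do this is to delegate the lookup to the document.

Listing 7. Namespace resolution directly from the document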

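import java.util.Iterator;

import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;

import org.w3c.dom.Document;

public class UniversalNamespaceResolver implements NamespaceContext {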
    // the delegate
    private Document sourceDocument;

    /**
     * This constructor stores the source document to search the namespaces in
     * it.
     * 
     * @param document
     *            source document
     */
    public UniversalNamespaceResolver(Document document) {
        sourceDocument = document;
    }

    /**
     * The lookup for the namespace uris is delegated to the stored document.
     * 
     * @param prefix
     *            to search for
     * @return uri
     */
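    public String getNamespaceURI(String prefix) {
        if (prefix == null) {
            // Required by the NamespaceContext contract.
            throw new IllegalArgumentException("No prefix provided!");
        } else if (prefix.equals(XMLConstants.DEFAULT_NS_PREFIX)) {
            // DOM expects null, not "", when asking for the default namespace.
            return sourceDocument.lookupNamespaceURI(null);
        } else {
            return sourceDocument.lookupNamespaceURI(prefix);
        }
    }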

    /**
     * This method is not needed in this context, but can be implemented in a
     * similar way.
     */
    public String getPrefix(String namespaceURI) {
        return sourceDocument.lookupPrefix(namespaceURI);
    }

    public Iterator getPrefixes(String namespaceURI) {
        // not implemented yet
        return null;
    }

}

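Note the following:

• If the document is changed before the XPath is used, the change is reflected in the namespace lookup, because the delegation happens on demand, against the current state of the document.

• The lookup for namespaces or prefixes is done on the ancestors of the node used, in our case the node sourceDocument. This means that, with the code provided, you only get the namespaces declared on the root node. In our example, the namespace science is not found.

• The lookup is invoked while the XPath is evaluated, so it costs some additional time.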
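The scoping rule in the second note can be observed directly with Node.lookupNamespaceURI. The following sketch is an illustration only: the class LookupScopeDemo, its helpers and the inline copy of the sample document (reduced to the author children) are assumptions. It asks the same question once of the document and once of the science:book element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class LookupScopeDemo {

    // Reduced copy of the sample document (assumption: only the author children).
    static final String XML =
            "<books:booklist xmlns:books='http://univNaSpResolver/booklist'"
            + " xmlns='http://univNaSpResolver/book'"
            + " xmlns:fiction='http://univNaSpResolver/fictionbook'>"
            + "<science:book xmlns:science='http://univNaSpResolver/sciencebook'>"
            + "<author>Michael Schmidt</author></science:book>"
            + "<fiction:book><author>Johann Wolfgang von Goethe</author></fiction:book>"
            + "</books:booklist>";

    static Document parse() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample XML", e);
        }
    }

    /** Lookup starting at the document: sees only declarations on the root element. */
    static String fromDocument(String prefix) {
        return parse().lookupNamespaceURI(prefix);
    }

    /** Lookup starting at science:book: sees its own declarations and its ancestors'. */
    static String fromScienceBook(String prefix) {
        Node book = parse()
                .getElementsByTagNameNS("http://univNaSpResolver/sciencebook", "book")
                .item(0);
        return book.lookupNamespaceURI(prefix);
    }

    public static void main(String[] args) {
        System.out.println("document, fiction: " + fromDocument("fiction"));
        System.out.println("document, science: " + fromDocument("science")); // null
        System.out.println("book, science:     " + fromScienceBook("science"));
        System.out.println("book, books:       " + fromScienceBook("books"));
    }
}
```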

Here is the sample code:

Listing 8. Example 2 with namespace resolution directly from the document

private static void example2(Document example)
            throws XPathExpressionException, TransformerException {
        sysout("\n*** Second example - namespacelookup delegated to document ***");

        XPath xPath = XPathFactory.newInstance().newXPath();
        xPath.setNamespaceContext(new UniversalNamespaceResolver(example));

        try {
...
            NodeList result1 = (NodeList) xPath.evaluate(
                    "books:booklist/science:book", example,
                    XPathConstants.NODESET);
...
        } catch (XPathExpressionException e) {
...
        }
...
        NodeList result2 = (NodeList) xPath.evaluate(
                "books:booklist/fiction:book", example, XPathConstants.NODESET);
...
        String result = xPath.evaluate(
                "books:booklist/fiction:book[1]/:author", example);
...
    }

The output of this example is:

Listing 9. Output of example 2

*** Second example - namespacelookup delegated to document ***
Try to use the science prefix: no result
--> books:booklist/science:book
The resolver only knows namespaces of the first level!
To be precise: Only namespaces above the node, passed in the constructor.
The fiction namespace is such a namespace:
--> books:booklist/fiction:book
Number of Nodes: 2

  
    
    Johann Wolfgang von Goethe
  

  
    
    Johann Wolfgang von Goethe
  
The default namespace works also:
--> books:booklist/fiction:book[1]/:author
Johann Wolfgang von Goethe

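As the output shows, the namespace with the prefix science, which is declared on the book element, is not resolved. The evaluate method throws an XPathExpressionException. To solve this, you would have to extract the node science:book from the document and use that node as the delegate. But that means additional processing of the document, and it is not elegant.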
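The workaround mentioned above, using the node science:book as the delegate, could look roughly like this. It is a sketch under assumptions rather than the article's code: the class NodeNamespaceResolver, its countScienceBooks helper and the inline copy of the sample document (reduced to the author children) are invented for illustration. The delegate node is located with getElementsByTagNameNS, which is the extra step the paragraph calls inelegant.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class NodeNamespaceResolver implements NamespaceContext {

    // Reduced copy of the sample document (assumption: only the author children).
    static final String XML =
            "<books:booklist xmlns:books='http://univNaSpResolver/booklist'"
            + " xmlns='http://univNaSpResolver/book'"
            + " xmlns:fiction='http://univNaSpResolver/fictionbook'>"
            + "<science:book xmlns:science='http://univNaSpResolver/sciencebook'>"
            + "<author>Michael Schmidt</author></science:book>"
            + "<fiction:book><author>Johann Wolfgang von Goethe</author></fiction:book>"
            + "</books:booklist>";

    // Lookups see the declarations on this node and on its ancestors.
    private final Node scope;

    public NodeNamespaceResolver(Node scope) {
        this.scope = scope;
    }

    public String getNamespaceURI(String prefix) {
        if (prefix == null) {
            throw new IllegalArgumentException("No prefix provided!");
        }
        // DOM uses null, not "", to ask for the default namespace.
        String uri = scope.lookupNamespaceURI(prefix.isEmpty() ? null : prefix);
        return uri != null ? uri : XMLConstants.NULL_NS_URI;
    }

    public String getPrefix(String namespaceURI) {
        return scope.lookupPrefix(namespaceURI);
    }

    public Iterator<String> getPrefixes(String namespaceURI) {
        List<String> prefixes = new ArrayList<String>();
        String prefix = getPrefix(namespaceURI);
        if (prefix != null) {
            prefixes.add(prefix);
        }
        return prefixes.iterator();
    }

    /** Counts science:book elements, resolving prefixes from the science:book node. */
    static String countScienceBooks() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            Node scienceBook = doc.getElementsByTagNameNS(
                    "http://univNaSpResolver/sciencebook", "book").item(0);
            XPath xPath = XPathFactory.newInstance().newXPath();
            xPath.setNamespaceContext(new NodeNamespaceResolver(scienceBook));
            return xPath.evaluate("count(/books:booklist/science:book)", doc);
        } catch (Exception e) {
            throw new IllegalStateException("Evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("science:book elements found: " + countScienceBooks());
    }
}
```

Note the circularity in this sketch: to find the delegate node, the code already has to know the URI of the science namespace, which is the information the resolver was meant to supply.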

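Reading the namespaces from the document and caching them

The next version of the NamespaceContext is somewhat better. It reads the namespaces only once, up front, in the constructor. Every request for a namespace is answered from the cache. As a result, later changes to the document have no effect, because the list of namespaces is cached when the Java object is created.

Listing 10. Namespace resolution cached from the document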

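import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;

import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class UniversalNamespaceCache implements NamespaceContext {
    private static final String DEFAULT_NS = "DEFAULT";
    // Typed maps are required: the methods below use the keys and values as String.
    private Map<String, String> prefix2Uri = new HashMap<String, String>();
    private Map<String, String> uri2Prefix = new HashMap<String, String>();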

    /**
     * This constructor parses the document and stores all namespaces it can
     * find. If toplevelOnly is true, only namespaces in the root are used.
     * 
     * @param document
     *            source document
     * @param toplevelOnly
     *            restriction of the search to enhance performance
     */
    public UniversalNamespaceCache(Document document, boolean toplevelOnly) {
        examineNode(document.getFirstChild(), toplevelOnly);
        System.out.println("The list of the cached namespaces:");
        for (String key : prefix2Uri.keySet()) {
            System.out
                    .println("prefix " + key + ": uri " + prefix2Uri.get(key));
        }
    }

    /**
     * A single node is read, the namespace attributes are extracted and stored.
     * 
     * @param node
     *            to examine
     * @param attributesOnly
     *            if true no recursion happens
     */
    private void examineNode(Node node, boolean attributesOnly) {
        NamedNodeMap attributes = node.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Node attribute = attributes.item(i);
            storeAttribute((Attr) attribute);
        }

        if (!attributesOnly) {
            NodeList children = node.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child.getNodeType() == Node.ELEMENT_NODE)
                    examineNode(child, false);
            }
        }
    }

    /**
     * This method looks at an attribute and stores it, if it is a namespace
     * attribute.
     * 
     * @param attribute
     *            to examine
     */
    private void storeAttribute(Attr attribute) {
        // examine the attributes in namespace xmlns
        if (attribute.getNamespaceURI() != null
                && attribute.getNamespaceURI().equals(
                        XMLConstants.XMLNS_ATTRIBUTE_NS_URI)) {
            // Default namespace xmlns="uri goes here"
            if (attribute.getNodeName().equals(XMLConstants.XMLNS_ATTRIBUTE)) {
                putInCache(DEFAULT_NS, attribute.getNodeValue());
            } else {
                // The defined prefixes are stored here
                putInCache(attribute.getLocalName(), attribute.getNodeValue());
            }
        }

    }

    private void putInCache(String prefix, String uri) {
        prefix2Uri.put(prefix, uri);
        uri2Prefix.put(uri, prefix);
    }

    /**
     * This method is called by XPath. It returns the default namespace, if the
     * prefix is null or "".
     * 
     * @param prefix
     *            to search for
     * @return uri
     */
    public String getNamespaceURI(String prefix) {
        if (prefix == null || prefix.equals(XMLConstants.DEFAULT_NS_PREFIX)) {
            return prefix2Uri.get(DEFAULT_NS);
        } else {
            return prefix2Uri.get(prefix);
        }
    }

    /**
     * This method is not needed in this context, but can be implemented in a
     * similar way.
     */
    public String getPrefix(String namespaceURI) {
        return uri2Prefix.get(namespaceURI);
    }

    public Iterator<String> getPrefixes(String namespaceURI) {
        // Not implemented
        return null;
    }

}

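Note that the code contains debugging output. The attributes of each node are examined and stored. The children are not examined, because the boolean toplevelOnly in the constructor is set to true. If the boolean is set to false, the examination of the children starts after the attributes have been stored. One thing to note about this code: in DOM, the first node represents the document as a whole, so to reach the root element, whose attributes carry the namespace declarations, you have to step down to a child node exactly once (the call to getFirstChild() in the constructor).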
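The step from the document node to its first child assumes that the first child is the root element. That holds for the sample file, but not for a document that starts with a comment, in which case getFirstChild() returns the comment and getAttributes() returns null. The sketch below illustrates the difference; the class name FirstChildDemo and the small document are assumptions, and Document.getDocumentElement() is shown as one possible alternative, not as what the article's code does.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class FirstChildDemo {

    // Hypothetical document that starts with a comment instead of the root element.
    static final String XML = "<!-- header --><root xmlns:a='urn:example:a'/>";

    static Document parse() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample XML", e);
        }
    }

    /** True if the first child of the document is the comment, not the root element. */
    static boolean firstChildIsComment() {
        Node first = parse().getFirstChild();
        // Only element nodes have an attribute map; for a comment it is null.
        return first.getNodeType() == Node.COMMENT_NODE && first.getAttributes() == null;
    }

    /** Number of attributes on the root element, reached via getDocumentElement(). */
    static int rootAttributeCount() {
        return parse().getDocumentElement().getAttributes().getLength();
    }

    public static void main(String[] args) {
        System.out.println("First child is a comment: " + firstChildIsComment());
        System.out.println("Attributes on root element: " + rootAttributeCount());
    }
}
```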

In this case, using the NamespaceContext is very simple:

Listing 11. Example 3 with cached namespace resolution (top level only)

private static void example3(Document example)
            throws XPathExpressionException, TransformerException {
        sysout("\n*** Third example - namespaces of toplevel node cached ***");

        XPath xPath = XPathFactory.newInstance().newXPath();
        xPath.setNamespaceContext(new UniversalNamespaceCache(example, true));

        try {
...
            NodeList result1 = (NodeList) xPath.evaluate(
                    "books:booklist/science:book", example,
                    XPathConstants.NODESET);
...
        } catch (XPathExpressionException e) {
...
        }
...
        NodeList result2 = (NodeList) xPath.evaluate(
                "books:booklist/fiction:book", example, XPathConstants.NODESET);
...
        String result = xPath.evaluate(
                "books:booklist/fiction:book[1]/:author", example);
...
    }

This produces the following output:

Listing 12. Output of example 3

*** Third example - namespaces of toplevel node cached ***
The list of the cached namespaces:
prefix DEFAULT: uri http://univNaSpResolver/book
prefix fiction: uri http://univNaSpResolver/fictionbook
prefix books: uri http://univNaSpResolver/booklist
Try to use the science prefix:
--> books:booklist/science:book
The cache only knows namespaces of the first level!
The fiction namespace is such a namespace:
--> books:booklist/fiction:book
Number of Nodes: 2

  
    
    Johann Wolfgang von Goethe
  

  
    
    Johann Wolfgang von Goethe
  
The default namespace works also:
--> books:booklist/fiction:book[1]/:author
Johann Wolfgang von Goethe

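The code above finds only the namespaces of the root element. More precisely, it finds the namespaces of the node that the constructor passes to the method examineNode. This speeds up the constructor, because it does not have to iterate over the whole document. However, as the output shows, the prefix science cannot be resolved. The XPath expression results in an exception (XPathExpressionException).

Reading the namespaces from the document and all of its elements and caching them

This version reads all namespace declarations from the XML file. Now even the XPath that uses the prefix science works. One situation complicates this version: if a prefix is overloaded (declared on nested elements with different URIs), the last one found "wins". In practice, this is usually not a problem.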
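The "last one wins" behavior follows from storing every declaration in one flat map while walking the tree from parent to child. The sketch below reproduces it on a small document; the class LastDeclarationWinsDemo, its collect method and the urn:example URIs are assumptions made for illustration, and the collector is a simplified stand-in for the cache in Listing 10 that skips the default namespace.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class LastDeclarationWinsDemo {

    // Hypothetical document: the prefix "p" is redeclared on the nested element.
    static final String XML =
            "<p:outer xmlns:p='urn:example:outer'>"
            + "<p:inner xmlns:p='urn:example:inner'/></p:outer>";

    static Element parseRoot() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample XML", e);
        }
    }

    /** Stores every xmlns:prefix declaration, parent first, then the children. */
    static void collect(Element element, Map<String, String> cache) {
        NamedNodeMap attributes = element.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Node attribute = attributes.item(i);
            if (XMLConstants.XMLNS_ATTRIBUTE_NS_URI.equals(attribute.getNamespaceURI())
                    && !XMLConstants.XMLNS_ATTRIBUTE.equals(attribute.getNodeName())) {
                // A later declaration of the same prefix overwrites the earlier one.
                cache.put(attribute.getLocalName(), attribute.getNodeValue());
            }
        }
        for (Node child = element.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                collect((Element) child, cache);
            }
        }
    }

    /** URI the flat cache ends up with for the prefix "p". */
    static String cachedUri() {
        Map<String, String> cache = new HashMap<String, String>();
        collect(parseRoot(), cache);
        return cache.get("p");
    }

    /** URI that is actually in scope for "p" on the root element. */
    static String scopedUriAtRoot() {
        return parseRoot().lookupNamespaceURI("p");
    }

    public static void main(String[] args) {
        System.out.println("cache says:        " + cachedUri());
        System.out.println("in scope at root:  " + scopedUriAtRoot());
    }
}
```

With this document the cache holds urn:example:inner for p, so an XPath that uses p can no longer address the outer element. This is the case the paragraph above describes as rarely a problem in practice.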

In this example the NamespaceContext is used in the same way as in the previous one. The boolean toplevelOnly in the constructor must be set to false.

Listing 13. Example 4 with cached namespace resolution (all levels)

private static void example4(Document example)
            throws XPathExpressionException, TransformerException {
        sysout("\n*** Fourth example - namespaces all levels cached ***");

        XPath xPath = XPathFactory.newInstance().newXPath();
        xPath.setNamespaceContext(new UniversalNamespaceCache(example, false));
...
        NodeList result1 = (NodeList) xPath.evaluate(
                "books:booklist/science:book", example, XPathConstants.NODESET);
...
        NodeList result2 = (NodeList) xPath.evaluate(
                "books:booklist/fiction:book", example, XPathConstants.NODESET);
...
        String result = xPath.evaluate(
                "books:booklist/fiction:book[1]/:author", example);
...
    }

The output is as follows:

Listing 14. Output of example 4

*** Fourth example - namespaces all levels cached ***
The list of the cached namespaces:
prefix science: uri http://univNaSpResolver/sciencebook
prefix DEFAULT: uri http://univNaSpResolver/book
prefix fiction: uri http://univNaSpResolver/fictionbook
prefix books: uri http://univNaSpResolver/booklist
Now the use of the science prefix works as well:
--> books:booklist/science:book
Number of Nodes: 1

  
    
    Michael Schmidt
  
The fiction namespace is resolved:
--> books:booklist/fiction:book
Number of Nodes: 2

  
    
    Johann Wolfgang von Goethe
  

  
    
    Johann Wolfgang von Goethe
  
The default namespace works also:
--> books:booklist/fiction:book[1]/:author
Johann Wolfgang von Goethe

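Conclusion

There are several ways to implement namespace resolution, and most of them are better than a hard-coded implementation:

• If the example is small and all namespaces are declared on the top element, delegating to the document works well.

• If the XML file is larger, deeply nested, and evaluated with several XPath expressions, it is better to cache the list of namespaces.

• But if you have no control over the XML file, and anyone can send you arbitrary prefixes, it is better to be independent of their choices. You can code your own namespace resolution, as in example 1 (HardcodedNamespaceResolver), and use your own prefixes in your XPath expressions.

In the other cases, a NamespaceContext that is resolved from the XML file makes your code shorter and more generic.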
