Java: splitting XML with XPath while keeping the parent tags

I have the following XML string:

<Aaaa>
    <Bbbb>
        <GroupC>
            <KeyId>10001</KeyId>
        </GroupC>
        <DetailC>
            <Dddd>
                <Eeee>Eeee 001</Eeee>
                <Ffff>Ffff 001</Ffff>
            </Dddd>
        </DetailC>
        <DetailC>
            <Dddd>
                <Eeee>Eeee 002</Eeee>
                <Ffff>Ffff 002</Ffff>
            </Dddd>
        </DetailC>
    </Bbbb>
</Aaaa>

I want to split it on "DetailC" into smaller XML documents, one per DetailC element:

XML 01:

<Aaaa>
    <Bbbb>
        <GroupC>
            <KeyId>10001</KeyId>
        </GroupC>
        <DetailC>
            <Dddd>
                <Eeee>Eeee 001</Eeee>
                <Ffff>Ffff 001</Ffff>
            </Dddd>
        </DetailC>
    </Bbbb>
</Aaaa>

XML 02:

<Aaaa>
    <Bbbb>
        <GroupC>
            <KeyId>10001</KeyId>
        </GroupC>
        <DetailC>
            <Dddd>
                <Eeee>Eeee 002</Eeee>
                <Ffff>Ffff 002</Ffff>
            </Dddd>
        </DetailC>
    </Bbbb>
</Aaaa>

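How can I achieve this in Java? So far I can only split each DetailC out into a separate XML fragment, but the result lacks the surrounding <Aaaa><Bbbb><GroupC> elements.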

Java code:

package message;

import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.apache.xpath.CachedXPathAPI;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.NodeIterator;
import org.xml.sax.InputSource;

public class mainClass {

    public static void main(String[] args) throws Exception {
        String path = "D:\\abc.xml";
        String xml = readFile(path);

        List<String> xmlList2 = splitXML(xml, "/Aaaa/Bbbb/DetailC");

        for (String xmlC : xmlList2) {
            System.out.println("xmlC: " + xmlC);
        }
    }

    private static List<String> splitXML(String xmlMessage, String xPath) throws Exception {

        List<String> xmlList = new ArrayList<>();

        Transformer xform = TransformerFactory.newInstance().newTransformer();
        xform.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        // Configure the factory before creating the builder; setting namespace
        // awareness on a second, unused factory has no effect on the parse.
        DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
        dbFactory.setNamespaceAware(true);
        DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
        InputSource parameterSource = new InputSource(new StringReader(xmlMessage));
        Document doc = dBuilder.parse(parameterSource);

        CachedXPathAPI cachedXPathAPI = new CachedXPathAPI();
        NodeIterator nl = cachedXPathAPI.selectNodeIterator(doc, xPath);

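        Node node;
        while ((node = nl.nextNode()) != null) {
            // Only the matched node's subtree is serialized here, which is why
            // the ancestors (<Aaaa>, <Bbbb>) and the sibling <GroupC> are missing.
            StringWriter buf = new StringWriter();
            DOMSource dom = new DOMSource(node);
            xform.transform(dom, new StreamResult(buf));
            xmlList.add(buf.toString());
        }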

        return xmlList;
    }

    private static String readFile(String path) {
        String content = "";
        try (Stream<String> lines = Files.lines(Paths.get(path))) {

            content = lines.collect(Collectors.joining(System.lineSeparator()));

        } catch (IOException e) {
            e.printStackTrace();
        }
        return content;
    }
}

1 answer

  1. # Answer 1

    If you use Saxon 9 HE (available on Sourceforge and, for Java, on Maven), you can solve this with XSLT 3. See the approach in "Split XML file into multiple files using XSLT"; you can adapt that code to:

    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xs="http://www.w3.org/2001/XMLSchema" version="3.0"
        exclude-result-prefixes="xs">
    
    <xsl:template match="DetailC">
      <xsl:variable name="pos" as="xs:integer">
         <xsl:number/>
      </xsl:variable>
      <xsl:result-document href="XML{format-number($pos, '000')}.xml">
          <xsl:apply-templates select="/" mode="split">
             <xsl:with-param name="this-detail" select="." tunnel="yes"/>
          </xsl:apply-templates>
      </xsl:result-document>
    </xsl:template>
    
    <xsl:template match="@* | node()" mode="split">
      <xsl:copy>
        <xsl:apply-templates select="@* | node()" mode="#current"/>
      </xsl:copy>
    </xsl:template>
    
    <xsl:template match="DetailC" mode="split">
      <xsl:param name="this-detail" tunnel="yes"/>
      <xsl:if test=". is $this-detail">
        <xsl:next-match/>
      </xsl:if>
    </xsl:template>
    
    </xsl:stylesheet>
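
For readers who would rather not add a dependency, the idea behind the stylesheet above (copy the whole tree once per DetailC and keep only that one DetailC in each copy) can also be sketched with the JDK's built-in DOM and javax.xml.xpath APIs. This is a minimal sketch and not part of the original answer; the class name `DomSplitSketch` and the method `splitKeepingParents` are hypothetical, and it assumes the input is small enough to copy in memory once per match.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
class DomSplitSketch {

    /** Returns one XML string per node matched by xPath; each string keeps the rest of the tree. */
    static List<String> splitKeepingParents(String xml, String xPath) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        DocumentBuilder builder = dbf.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList all = (NodeList) xpath.evaluate(xPath, doc, XPathConstants.NODESET);
        int count = all.getLength();

        // An identity transformer is used only to serialize DOM trees to text.
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        List<String> result = new ArrayList<>();
        for (int keep = 0; keep < count; keep++) {
            // Deep-copy the document so the original stays untouched.
            Document copy = builder.newDocument();
            copy.appendChild(copy.importNode(doc.getDocumentElement(), true));

            // Re-run the XPath on the copy and drop every match except the one to keep.
            NodeList matches = (NodeList) xpath.evaluate(xPath, copy, XPathConstants.NODESET);
            for (int i = 0; i < matches.getLength(); i++) {
                Node match = matches.item(i);
                if (i != keep) {
                    match.getParentNode().removeChild(match);
                }
            }

            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(copy), new StreamResult(out));
            result.add(out.toString());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Aaaa><Bbbb>"
                + "<GroupC><KeyId>10001</KeyId></GroupC>"
                + "<DetailC><Dddd><Eeee>Eeee 001</Eeee><Ffff>Ffff 001</Ffff></Dddd></DetailC>"
                + "<DetailC><Dddd><Eeee>Eeee 002</Eeee><Ffff>Ffff 002</Ffff></Dddd></DetailC>"
                + "</Bbbb></Aaaa>";
        for (String part : splitKeepingParents(xml, "/Aaaa/Bbbb/DetailC")) {
            System.out.println(part);
        }
    }
}
```

Note that removing an element leaves the whitespace text nodes around it in place, so with indented input the output may contain blank lines where the other DetailC elements used to be.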
    

    To run Saxon 9 from Java, you can use either the JAXP transformation API http://saxonica.com/html/documentation/using-xsl/embedding/jaxp-transformation.html or the Saxon 9 specific s9api http://saxonica.com/html/documentation/using-xsl/embedding/s9api-transformation.html

    Keep in mind that a Transformer can transform a file directly through a StreamSource (for example https://docs.oracle.com/javase/8/docs/api/javax/xml/transform/stream/StreamSource.html#StreamSource-java.lang.String- or https://docs.oracle.com/javase/8/docs/api/javax/xml/transform/stream/StreamSource.html#StreamSource-java.io.File-), so there is no need to read the file contents into a string or to build a DOM by hand; you can load any XML file directly as the input to XSLT.
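
The point about StreamSource can be sketched roughly as follows, using only standard JAXP calls. The class name `FileTransformSketch` and its `transform` method are hypothetical. To run the XSLT 3 stylesheet from the answer, Saxon would need to be on the classpath (or `net.sf.saxon.TransformerFactoryImpl` instantiated explicitly), since the XSLT processor bundled with the JDK only supports XSLT 1.0. The demo in `main` therefore falls back to an XSLT 1.0 identity stylesheet, purely so that the sketch runs without Saxon.

```java
import java.io.File;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, for illustration only.
class FileTransformSketch {

    /** Applies the stylesheet to the input file, reading and writing files directly. */
    static void transform(File stylesheet, File input, File output) throws Exception {
        // JAXP picks whichever XSLT processor it finds on the classpath.
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer = factory.newTransformer(new StreamSource(stylesheet));
        // No manual file reading and no DOM building: the processor parses the file itself.
        transformer.transform(new StreamSource(input), new StreamResult(output));
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 3) {
            // Usage: java FileTransformSketch stylesheet.xsl input.xml output.xml
            transform(new File(args[0]), new File(args[1]), new File(args[2]));
            return;
        }

        // Self-contained demo with temporary files and an XSLT 1.0 identity stylesheet.
        Path dir = Files.createTempDirectory("xslt-demo");
        Path xsl = dir.resolve("identity.xsl");
        Path in = dir.resolve("in.xml");
        Path out = dir.resolve("out.xml");

        String identity = "<xsl:stylesheet version=\"1.0\""
                + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
                + "<xsl:template match=\"@*|node()\">"
                + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
                + "</xsl:template>"
                + "</xsl:stylesheet>";
        String sample = "<Aaaa><Bbbb><GroupC><KeyId>10001</KeyId></GroupC></Bbbb></Aaaa>";

        Files.write(xsl, identity.getBytes(StandardCharsets.UTF_8));
        Files.write(in, sample.getBytes(StandardCharsets.UTF_8));

        transform(xsl.toFile(), in.toFile(), out.toFile());
        System.out.println(new String(Files.readAllBytes(out), StandardCharsets.UTF_8));
    }
}
```

Passing a File-based StreamResult also gives the result a system ID. As far as I know, Saxon uses that as the base output URI when resolving relative `xsl:result-document` hrefs such as the `XML001.xml` names produced by the stylesheet above, but that is worth checking against the Saxon documentation for the version in use.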