Converting complex XML to CSV using Python or XSLT

2 answers

User

Answer 1 · edited 2024-06-26 14:04:32

I have handled a case similar to yours. I built a package on top of untangle that parses XML into plain Python objects, so that this:

<?xml version="1.0"?>
<root>
    <child name="child1"/>
</root>

becomes

obj.root.child['name'] # u'child1'

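From there it is easy to write some code that walks the object and collects what you need. For example, you could do something like get_items_by_tag(InvoiceRow). Hope this helps.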
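To illustrate the idea, here is a minimal sketch. It uses the standard library's xml.etree.ElementTree rather than untangle or the package mentioned above, and get_items_by_tag is a hypothetical helper written only for this example, not an API of either library. The sample XML is a trimmed-down invoice.

```python
import xml.etree.ElementTree as ET

# Trimmed-down sample invoice, for illustration only.
XML = """<Invoice>
    <InvoiceDetails><InvoiceNumber>0001</InvoiceNumber></InvoiceDetails>
    <InvoiceRow>
        <ArticleName>Article1</ArticleName>
        <RowAmount AmountCurrencyIdentifier="EUR">10.00</RowAmount>
    </InvoiceRow>
    <InvoiceRow>
        <ArticleName>Article2</ArticleName>
        <RowAmount AmountCurrencyIdentifier="EUR">20.00</RowAmount>
    </InvoiceRow>
</Invoice>"""


def get_items_by_tag(root, tag):
    """Hypothetical helper: return every element named `tag`, at any depth."""
    return list(root.iter(tag))


root = ET.fromstring(XML)
rows = get_items_by_tag(root, "InvoiceRow")

# Pull the fields of interest out of each row.
articles = [(row.findtext("ArticleName"), row.findtext("RowAmount")) for row in rows]
print(articles)
```

Each InvoiceRow ends up as one tuple, which is already close to one CSV line.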

User

Answer 2 · edited 2024-06-26 14:04:32

Consider the following example:

XML

<Invoice>
    <SellerDetails>
        <Identifier>1234-1</Identifier>
        <SellerAddress>
            <SellerStreet>Street1</SellerStreet>
            <SellerTown>Town1</SellerTown>
        </SellerAddress>
    </SellerDetails>
    <BuyerDetails>
        <BuyerIdentifier>1234-2</BuyerIdentifier>
        <BuyerAddress>
            <BuyerStreet>Street2</BuyerStreet>
            <BuyerTown>Town2</BuyerTown>
        </BuyerAddress>
    </BuyerDetails>
    <BuyerNumber>001234</BuyerNumber>
    <InvoiceDetails>
        <InvoiceNumber>0001</InvoiceNumber>
    </InvoiceDetails>
    <InvoiceRow>
        <ArticleName>Article1</ArticleName>
        <RowText>Product Text1</RowText>
        <RowText>Product Text2</RowText>
        <RowAmount AmountCurrencyIdentifier="EUR">10.00</RowAmount>
    </InvoiceRow>
    <InvoiceRow>
        <ArticleName>Article2</ArticleName>
        <RowText>Product Text11</RowText>
        <RowText>Product Text22</RowText>
        <RowAmount AmountCurrencyIdentifier="EUR">20.00</RowAmount>
    </InvoiceRow>
    <InvoiceRow>
        <ArticleName>Article3</ArticleName>
        <RowText>Product Text111</RowText>
        <RowText>Product Text222</RowText>
        <RowAmount AmountCurrencyIdentifier="EUR">30.00</RowAmount>
    </InvoiceRow>
    <EpiDetails>
        <EpiPartyDetails>
            <EpiBfiPartyDetails>
                <EpiBfiIdentifier IdentificationSchemeName="BIC">XXXXX</EpiBfiIdentifier>
            </EpiBfiPartyDetails>
        </EpiPartyDetails>
    </EpiDetails>
    <InvoiceUrlText>Some text</InvoiceUrlText>
</Invoice>

XSLT 1.0

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>

<xsl:template match="Invoice">
    <xsl:variable name="common-head">
        <xsl:value-of select="SellerDetails/Identifier"/>
        <xsl:text>,</xsl:text>
        <xsl:value-of select="BuyerDetails/BuyerIdentifier"/>
        <xsl:text>,</xsl:text>
        <xsl:value-of select="InvoiceDetails/InvoiceNumber"/>
        <xsl:text>,</xsl:text>
        <!-- add more here -->
    </xsl:variable>
    <xsl:variable name="common-tail">
        <xsl:value-of select="EpiDetails/EpiPartyDetails/EpiBfiPartyDetails/EpiBfiIdentifier"/>
        <xsl:text>,</xsl:text>
        <!-- add more here -->
        <xsl:value-of select="InvoiceUrlText"/>
    </xsl:variable>
    <!-- header -->
    <xsl:text>SellerIdentifier,BuyerIdentifier,InvoiceNumber,ArticleName,RowText,RowText,RowAmount,EpiBfiIdentifier,InvoiceUrlText&#10;</xsl:text>
    <!-- data -->
    <xsl:for-each select="InvoiceRow">
        <xsl:copy-of select="$common-head"/>
        <xsl:value-of select="ArticleName"/>
        <xsl:text>,</xsl:text>  
        <xsl:value-of select="RowAmount"/>
        <xsl:text>,</xsl:text>  
        <!-- add more here -->
        <xsl:copy-of select="$common-tail"/>
        <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
</xsl:template>

</xsl:stylesheet>

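Result (the header row names two RowText columns, but the stylesheet as written does not output them, so each data row has two fewer fields than the header):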

SellerIdentifier,BuyerIdentifier,InvoiceNumber,ArticleName,RowText,RowText,RowAmount,EpiBfiIdentifier,InvoiceUrlText
1234-1,1234-2,0001,Article1,10.00,XXXXX,Some text
1234-1,1234-2,0001,Article2,20.00,XXXXX,Some text
1234-1,1234-2,0001,Article3,30.00,XXXXX,Some text

Added in response to:

Is there a way in XSLT to get the same results using loop? For example loop through and output all the elements and the sub-elements except the InvoiceRow elements and then vice versa?

If you prefer, you could try it this way:

XSLT 1.0

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>

<xsl:template match="Invoice">
    <xsl:variable name="invoice-fields" select="//*[not(*) and not(ancestor::InvoiceRow)]" />
    <xsl:variable name="common-data">
        <xsl:for-each select="$invoice-fields">
            <xsl:value-of select="."/>
            <xsl:text>,</xsl:text>  
        </xsl:for-each> 
    </xsl:variable>
    <!-- header -->
    <xsl:for-each select="$invoice-fields">
        <xsl:value-of select="name()"/>
        <xsl:text>,</xsl:text>  
    </xsl:for-each>
    <xsl:for-each select="InvoiceRow[1]/*">
        <xsl:value-of select="name()"/>
        <xsl:if test="position()!=last()">,</xsl:if>
    </xsl:for-each>
    <xsl:text>&#10;</xsl:text>
    <!-- data -->
    <xsl:for-each select="InvoiceRow">
        <xsl:copy-of select="$common-data"/>
        <xsl:for-each select="*">
            <xsl:value-of select="."/>
            <xsl:if test="position()!=last()">,</xsl:if>
        </xsl:for-each> 
        <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
</xsl:template>

</xsl:stylesheet>

The result is:

Identifier,SellerStreet,SellerTown,BuyerIdentifier,BuyerStreet,BuyerTown,BuyerNumber,InvoiceNumber,EpiBfiIdentifier,InvoiceUrlText,ArticleName,RowText,RowText,RowAmount
1234-1,Street1,Town1,1234-2,Street2,Town2,001234,0001,XXXXX,Some text,Article1,Product Text1,Product Text2,10.00
1234-1,Street1,Town1,1234-2,Street2,Town2,001234,0001,XXXXX,Some text,Article2,Product Text11,Product Text22,20.00
1234-1,Street1,Town1,1234-2,Street2,Town2,001234,0001,XXXXX,Some text,Article3,Product Text111,Product Text222,30.00

That is, all of the invoice-level fields are listed before the row fields.
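Since the question also allows Python, the same "invoice fields first, then row fields" layout can be sketched with the standard library. This is only an illustration under assumptions: invoice_to_csv and SAMPLE are names made up for this example, the sample is a shortened invoice, and the csv module is used for output, so values containing commas get quoted (the XSLT versions do not do that).

```python
import csv
import io
import xml.etree.ElementTree as ET


def invoice_to_csv(xml_text):
    """Flatten one <Invoice> into CSV text, one line per <InvoiceRow>."""
    root = ET.fromstring(xml_text)

    # ElementTree has no parent pointers, so collect everything that sits
    # inside an InvoiceRow up front instead of testing ancestors.
    row_elements = {e for row in root.iter("InvoiceRow") for e in row.iter()}

    # Leaf elements outside any InvoiceRow, in document order.
    invoice_fields = [
        e for e in root.iter() if len(e) == 0 and e not in row_elements
    ]

    rows = root.findall("InvoiceRow")
    if not rows:
        return ""

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")

    # Header: invoice-level names, then the child names of the first row.
    writer.writerow([e.tag for e in invoice_fields] + [c.tag for c in rows[0]])

    # Data: the shared invoice values are repeated on every row.
    common = [e.text or "" for e in invoice_fields]
    for row in rows:
        writer.writerow(common + ["".join(c.itertext()) for c in row])
    return out.getvalue()


# Shortened sample invoice, for illustration only.
SAMPLE = """<Invoice>
    <SellerDetails><Identifier>1234-1</Identifier></SellerDetails>
    <InvoiceRow>
        <ArticleName>Article1</ArticleName>
        <RowText>Product Text1</RowText>
        <RowAmount AmountCurrencyIdentifier="EUR">10.00</RowAmount>
    </InvoiceRow>
    <InvoiceRow>
        <ArticleName>Article2</ArticleName>
        <RowText>Product Text11</RowText>
        <RowAmount AmountCurrencyIdentifier="EUR">20.00</RowAmount>
    </InvoiceRow>
    <InvoiceUrlText>Some text</InvoiceUrlText>
</Invoice>"""

csv_text = invoice_to_csv(SAMPLE)
print(csv_text)
```

As in the stylesheet, the header for the row columns is taken from the first InvoiceRow only, so this assumes every row has the same children in the same order.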
