Confusion about SAXParser when parsing an XML file in Java

Given this XML file:

<?xml version="1.0" encoding="UTF-8"?>
<root>
   <data>
      <track clipid="1">
         <url>http://www.emp3world.com/to_download.php?id=33254</url>
         <http_method>GET or POST</http_method>
         <post_body>a=1&amp;b=2&amp;c=3</post_body>
      </track>
   </data>
</root>

What I want is to print the following from this XML file:

ID: 1
URL: http://www.emp3world.com/to_download.php?id=33254
Http method: GET or POST

Currently, this is my basic handler code:

class MyHandler extends DefaultHandler
{
    String str = "";
    StringBuilder s = new StringBuilder();
    public void startElement(String namespaceURI, String sName, String qName, Attributes atts)
    {
        if(qName.equals("track"))
        {
            s.append("ID: ").append(atts.getValue("clipid")).append("\n");
        }
        if(qName.equals("url"))
        {
            s.append("URL: ");
        }
        if(qName.equals("http_method"))
        {
            s.append("Http method: ");
        }
    }

    public void endElement(String uri, String localName, String qName)
    {
        if(qName.equals("url"))
        {
            s.append(str).append("\n");
            str = "";
        }
        if(qName.equals("http_method"))
        {
            s.append(str).append("\n");
            str = "";
        }
        System.out.println(s);
    }

    public void characters(char[] ch, int start, int length) throws SAXException {
        str = new String(ch, start, length);
    }
}

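My problem is that it always prints the result 4 times (and the first time without the Http method field). I guess this is a problem for all SAX parser beginners.
I know what the startElement, endElement, and characters functions do, but as you can see, I don't know how to use them correctly. What should I change in my code to get the correct output?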


1 answer

  1. # Answer 1

    The problem is in your characters method. Replace its body with

    s.append(new String(ch, start, length));
    

    Then add this line at the beginning of startElement:

    s.setLength(0);
    

    You should see some output.

    Here is what the Java tutorial on SAX says about the characters method:

    Parsers are not required to return any particular number of characters at one time. A parser can return anything from a single character at a time up to several thousand and still be a standard-conforming implementation. So if your application needs to process the characters it sees, it is wise to have the characters() method accumulate the characters in a java.lang.StringBuffer and operate on them only when you are sure that all of them have been found.
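
Following the advice quoted above, one possible way to get exactly the three requested lines is to accumulate text in characters() and only use it in endElement(), once the element is known to be complete. This is a minimal sketch, not the answerer's code: the names TrackHandler, TrackSketch, format, and result are hypothetical, a StringBuilder stands in for the StringBuffer the tutorial mentions, and the sample XML escapes the ampersands in post_body as &amp; because a bare & is not well-formed XML.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the formatted lines instead of printing in endElement.
class TrackHandler extends DefaultHandler {
    private final StringBuilder out = new StringBuilder();  // formatted result
    private final StringBuilder text = new StringBuilder(); // text of the current element

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        text.setLength(0); // drop whitespace collected between elements
        if (qName.equals("track")) {
            out.append("ID: ").append(atts.getValue("clipid")).append("\n");
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser may deliver one element's text in several chunks, so append.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Only here is the element's text guaranteed to be complete.
        if (qName.equals("url")) {
            out.append("URL: ").append(text.toString().trim()).append("\n");
        } else if (qName.equals("http_method")) {
            out.append("Http method: ").append(text.toString().trim()).append("\n");
        }
        text.setLength(0);
    }

    String result() {
        return out.toString();
    }
}

// Hypothetical driver class for the sketch.
class TrackSketch {
    static String format(String xml) throws Exception {
        TrackHandler handler = new TrackHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.result();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<root><data><track clipid=\"1\">"
                + "<url>http://www.emp3world.com/to_download.php?id=33254</url>"
                + "<http_method>GET or POST</http_method>"
                + "<post_body>a=1&amp;b=2&amp;c=3</post_body>"
                + "</track></data></root>";
        System.out.print(format(xml)); // printed once, after parsing has finished
    }
}
```

Because the result is printed once after parse() returns, rather than in every endElement call, the repeated output described in the question should not occur. Elements that are not of interest, such as post_body, are simply ignored in endElement.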