<bs-submission participant-id="tagger1" run-id="first annotations with the prospectus tagger" task="book-toc" toc-creation="manual" toc-source="full-content">
<source-files pdf="yes" xml="no"/>
This file contains the **manual** annotations of tagger1 for one single prospectus
<toc-section page="13"/>
<toc-section page="14"/>
<toc-section page="15"/>
<toc-section page="16"/>
<toc-entry title="ESSAY I. On RIDICULE considered as a Test of Truth." page="17">
<toc-entry title="I. VINDICATION of the noble Writer's Zeal for Freedom." page="17"/>
<toc-entry title="II. Of his Method of treating the Question concerning Ridicule." page="23"/>
<toc-entry title="III. Of the different Kinds of Composition; Poetry, Eloquence, and Argument." page="28"/>
<toc-entry title="IV. That Ridicule is a Species of Eloquence." page="57"/>
<toc-entry title="V. A Confirmation of the foregoing Truths by an Appeal to Fact." page="64"/>
<toc-entry title="VI. Of the noble Writer's Arguments in support of his new Theory; particularly the Case of SOCRATES." page="70"/>
<toc-entry title="VII. His" page="80"/>
<toc-entry title="VII. His further Reasonings examined." page="80"/>
<toc-entry title="VIII. Of his main Argument; relating to Protestantism and Christianity" page="90"/>
<toc-entry title="IX. Of the Opinion of GORGIAS quoted by his Lordship from ARISTOTLE." page="97"/>
<toc-entry title="X. The Reasoning of one of his Followers in this Subject, examined." page="104"/>
<toc-entry title="XI. Of the particular Impropriety of applying Ridicule to the Investigation of religious Truth." page="115"/>
<toc-entry title="ESSAY II. On the Motives to VIRTUE, and the Necessity of Religious Principle." page="125">
<toc-entry title="I. Introduction." page="125"/>
<toc-entry title="II. That the Definitions which Lord SHAFTESBURY, and several other Moralists have given of Virtue, are inadequate and defective." page="127"/>
<toc-entry title="III. Of the real Nature of Virtue." page="139"/>
<toc-entry title="IV. Of" page="153"/>
<toc-entry title="IV. Of an Objection urged by Dr. MAN-DEVILLE against the permanent Reality of Virtue." page="153"/>
<toc-entry title="V. Examination and Analysis of The Fable of the Bees." page="162"/>
<toc-entry title="VI. Of the natural Motives to virtuous Action." page="174"/>
<toc-entry title="VII. How far these Motives can in Reality influence all Mankind. The Errors of the Stoic and Epicurean Parties; and the most probable Foundation of these Errors." page="183"/>
<toc-entry title="VIII. The noble Writer's additional Reasonings examined; and shown to be without Foundation." page="203"/>
<toc-entry title="IX. That the religious Principle, or Obedience to the Will of God, can alone produce a uniform and permanent Motive to Virtue. The noble Writer's Objections examined." page="222"/>
<toc-entry title="X. Of the Efficacy of the religious Principle. Conclusion." page="239"/>
<toc-entry title="ESSAY III. On Revealed RELIGION, and CHRISTIANITY." page="257">
<toc-entry title="I. Of the noble Writer's Manner of treating Christianity." page="257"/>
<toc-entry title="II. Of his Objections to the Truths of natural Religion." page="261"/>
<toc-entry title="III. Of the Credibility of the Gospel-History." page="272"/>
<toc-entry title="IV. Of the Scripture-Miracles" page="287"/>
<toc-entry title="V. Of Enthusiasm." page="310"/>
<toc-entry title="VI. Of the religious and moral Doctrines of Christianity." page="330"/>
<toc-entry title="VII. Of several detached Passages in the Characteristics." page="365"/>
<toc-entry title="VIII. Of the Style and Composition of the Scriptures." page="387"/>
<toc-entry title="IX. Of the noble Writer's Treatment of the English Clergy." page="407"/>

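As you can see, the toc-entry elements are hierarchical. In some of the XML files they can be nested 5 or 6 levels deep. I want to write a function that takes this content as input and outputs the same file, keeping only the toc-entry elements whose depth is less than or equal to a given integer.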


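At first I used only xml.etree.ElementTree, but when I needed to compute the depth of a toc-entry I found a function that relies on a second library, so I started using that one as well. The function that computes the depth is as follows (node is an lxml.html object):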

def depth(node):
    # taken from:
    d = 0
    # Walk up the tree, counting the node itself and each of its ancestors.
    while node is not None:
        d += 1
        node = node.getparent()
    # return d
    return d - 4
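For comparison, the depth can also be computed without a second library. ElementTree elements have no getparent(), but a child-to-parent map can be built once and walked upward. This is only a sketch: the name toc_depths and the small sample string are made up for illustration, and it counts only toc-entry ancestors instead of subtracting a fixed offset.

```python
import xml.etree.ElementTree as ET


def toc_depths(root):
    """Map each toc-entry element to its nesting depth (top-level entries are 1)."""
    # ElementTree has no getparent(), so build a child -> parent map first.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    depths = {}
    for entry in root.iter("toc-entry"):
        d, node = 0, entry
        # Count this entry plus every toc-entry ancestor; other tags are skipped.
        while node is not None:
            if node.tag == "toc-entry":
                d += 1
            node = parent_of.get(node)
        depths[entry] = d
    return depths


# Hypothetical, shortened sample in the shape of the file above.
sample = ET.fromstring(
    '<bs-submission>'
    '<toc-entry title="ESSAY I" page="17">'
    '<toc-entry title="I" page="17"/>'
    '</toc-entry>'
    '</bs-submission>'
)
for entry, d in toc_depths(sample).items():
    print(d, entry.get("title"))
```

Because only toc-entry ancestors are counted, the result does not depend on how many wrapper elements sit above the table of contents.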


import xml.etree.ElementTree as ET
import lxml.html
# book is an ET.Element
book_lxml = lxml.html.fromstring(ET.tostring(book))
for toc_entry in book_lxml.iter('toc-entry'):
    if depth(toc_entry) > max_depth:
        try:
            ...  # removal step not shown
        except ValueError:
            pass

# root is an ET.Element containing the bs-submission element
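One way the goal described above might be approached with xml.etree.ElementTree alone is to recurse from the root while tracking the current toc-entry depth, and to remove a child from its direct parent as soon as it is too deep. The following is a minimal sketch, not a tested solution for the real files: prune_toc is a hypothetical name, the sample string is shortened and well-formed, and top-level toc-entry elements are assumed to have depth 1.

```python
import xml.etree.ElementTree as ET


def prune_toc(parent, max_depth, depth=0):
    """Remove, in place, every toc-entry nested deeper than max_depth."""
    # Iterate over a copy: removing children while iterating the live
    # element would skip siblings.
    for child in list(parent):
        if child.tag == "toc-entry":
            if depth + 1 > max_depth:
                # Removing from the direct parent drops the whole subtree.
                parent.remove(child)
            else:
                prune_toc(child, max_depth, depth + 1)
        else:
            # Elements other than toc-entry do not add to the depth.
            prune_toc(child, max_depth, depth)


# Hypothetical, shortened sample in the shape of the file above.
root = ET.fromstring(
    '<bs-submission>'
    '<toc-entry title="ESSAY I" page="17">'
    '<toc-entry title="I" page="17"/>'
    '<toc-entry title="II" page="23"/>'
    '</toc-entry>'
    '</bs-submission>'
)
prune_toc(root, max_depth=1)
print(ET.tostring(root, encoding="unicode"))
```

Element.remove() only succeeds when called on the direct parent of the element being removed, and raises ValueError otherwise, which is why the sketch removes each entry from the element it was found in. The pruned tree could then be written back with ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True), where path is whatever output file is wanted.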



