How to get the text of an element from the HTML below using Selenium with Python

Posted 2024-10-03 11:22:30


Below is the HTML. I want to extract the text "Page 2 of 2" from it.

HTML code

<thead>
        <tr>
            <th scope="col" class="GridHeader_Sonetto"><input id="ctl00_ctl00_ContentPlaceHolderContent_MainCategoriserContent_Map1_SubgroupsAndProducts1_plcCategoryProductsGrid_ctl00_ctl00_ctl02_ctl01_SelectSelectCheckBox" type="checkbox" name="ctl00$ctl00$ContentPlaceHolderContent$MainCategoriserContent$Map1$SubgroupsAndProducts1$plcCategoryProductsGrid$ctl00$ctl00$ctl02$ctl01$SelectSelectCheckBox" onclick="$find(&quot;ctl00_ctl00_ContentPlaceHolderContent_MainCategoriserContent_Map1_SubgroupsAndProducts1_plcCategoryProductsGrid_ctl00&quot;)._selectAllRows(&quot;ctl00$ctl00$ContentPlaceHolderContent$MainCategoriserContent$Map1$SubgroupsAndProducts1$plcCategoryProductsGrid$ctl00$ctl00&quot;, &quot;&quot;, event);setTimeout(&#39;__doPostBack(\&#39;ctl00$ctl00$ContentPlaceHolderContent$MainCategoriserContent$Map1$SubgroupsAndProducts1$plcCategoryProductsGrid$ctl00$ctl00$ctl02$ctl01$SelectSelectCheckBox\&#39;,\&#39;\&#39;)&#39;, 0)" /></th><th scope="col" class="GridHeader_Sonetto" style="display:none;">_InternalID</th><th scope="col" class="GridHeader_Sonetto" style="display:none;">_ID</th><th scope="col" class="GridHeader_Sonetto" style="display:none;">_Name</th><th scope="col" class="GridHeader_Sonetto" style="display:none;">_Order</th><th scope="col" class="GridHeader_Sonetto" style="display:none;">_Source</th><th scope="col" class="GridHeader_Sonetto" style="display:none;">_RootConcept</th><th scope="col" class="GridHeader_Sonetto">TPNB</th><th scope="col" class="GridHeader_Sonetto">Product Name</th>
        </tr>
    </thead><tfoot>
        <tr class="GridPager_Sonetto">
            <td colspan="3"><div class="PagerLeft_Sonetto">
                <span class="items-summary">Items 11 - 15 of 15</span><span class="grid-pages"><span><input type="submit" name="ctl00$ctl00$ContentPlaceHolderContent$MainCategoriserContent$Map1$SubgroupsAndProducts1$plcCategoryProductsGrid$ctl00$ctl00$ctl03$ctl01$ctl02" value=" " title="Previous Page" class="rgPagePrev" /></span>&nbsp;<input type="submit" name="ctl00$ctl00$ContentPlaceHolderContent$MainCategoriserContent$Map1$SubgroupsAndProducts1$plcCategoryProductsGrid$ctl00$ctl00$ctl03$ctl01$ctl03" value=" " onclick="return false;" title="Next Page" class="rgPageNext" />
            </div><div class="PagerRight_Sonetto">
                </span><span class="hide page-summary">Page 2 of 2</span>
            </div></td>
        </tr>
    </tfoot><tbody>

Here is what I have tried

urll = driver.find_element(By.XPATH, "//input[@id='ctl00_ctl00_ContentPlaceHolderContent_MainCategoriserContent_Map1_SubgroupsAndProducts1_plcCategoryProductsGrid_ctl00_ctl00_ctl02_ctl01_SelectSelectCheckBox']")
urll.find_element(By.XPATH,"//span[@class='hide page-summary']").get_attribute("textContent")

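The code above runs, but it returns the text of another element that appears earlier in the page, before this HTML. Please help me get the text "Page 2 of 2".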


2 Answers

Use .text

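from selenium.webdriver.common.by import By

# A class-name locator accepts a single class only, so match both classes with a CSS selector.
# .text returns visible text only; if the span is hidden by CSS, read textContent instead.
element = driver.find_element(By.CSS_SELECTOR, "span.hide.page-summary").text
print(element)

elem = driver.find_elements(By.XPATH, "//span[@class='hide page-summary']")

# Python lists are zero-based, so the second match is index 1.
print(elem[1].get_attribute("textContent"))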

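If there are two matching elements, index the second one.

Also, when searching from a parent element, start the XPath with .// so that only that element's descendants are searched. A path that starts with // searches from the document root, no matter which element you call it on.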
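
The effect of scoping a search to a parent can be checked without a browser. The sketch below is not Selenium code: it uses Python's standard-library xml.etree.ElementTree on a simplified, well-formed stand-in for the page. The first grid, its "Page 1 of 3" text, and both table ids are hypothetical, added only to reproduce the "earlier match" from the question. In Selenium the same scoping applies when the XPath passed to a WebElement's find_element starts with .// .

```python
import xml.etree.ElementTree as ET

# Simplified, well-formed stand-in for a page with two grids.
# The first grid, its text, and the table ids are hypothetical;
# the second footer mirrors the HTML from the question.
PAGE = """
<body>
  <table id="other-grid">
    <tfoot><tr><td>
      <span class="hide page-summary">Page 1 of 3</span>
    </td></tr></tfoot>
  </table>
  <table id="products-grid">
    <tfoot><tr class="GridPager_Sonetto"><td>
      <span class="items-summary">Items 11 - 15 of 15</span>
      <span class="hide page-summary">Page 2 of 2</span>
    </td></tr></tfoot>
  </table>
</body>
"""

root = ET.fromstring(PAGE)
SPAN = ".//span[@class='hide page-summary']"

# Searching from the document root returns every match, in document order.
all_summaries = [span.text for span in root.findall(SPAN)]

# Searching from the parent of interest returns only its own descendants.
grid = root.find(".//table[@id='products-grid']")
scoped_summary = grid.find(SPAN).text

print(all_summaries)   # ['Page 1 of 3', 'Page 2 of 2']
print(scoped_summary)  # Page 2 of 2
```

Scoping to the right parent avoids depending on how many matching elements happen to appear earlier in the page, which is what makes a fixed index fragile.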
