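Java: extracting specific HTML table content with Jsoup

I want to extract individual pieces of content from an HTML table in a response, and I am using Jsoup.

Here is my table structure: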

<table id="main_widget_table" class="table table-striped table-hover table-condensed table-bordered">
                <tbody>
                <!-- ngRepeat: object in currentView --><tr ng-repeat="object in currentView" class="ng-scope">
                    <td>
                        <a id="main_widget_table_object_name_action" href="#//object/" target="_blank">
                            <b class="ng-binding">TASK_BACKUP</b>
                        </a>

                        <p style="font-size:11px">
                            <span class="text-success" ng-show="object.label"><em class="ng-binding"> task backup</em></span>
                            <br ng-show="object.label">
                            <span ng-show="object.session" class="ng-binding" style="display: none;">
                                <span class="label label-default">WORKFLOW</span> &nbsp; <em class="ng-binding">
                                </em>
                            </span>
                            <br ng-show="object.session" style="display: none;">
                            <span ng-hide="object.session" class="ng-binding">
                                <span class="label label-default">JOB</span> &nbsp; <em class="ng-binding"></em>
                            </span>
                            <br ng-hide="object.session">
                            <span class="text-warning ng-binding">Location: TASKMUBACKUP</span>
                            <span ng-show="isShowing('nextPlanified')" class="badge pull-right ng-binding" style="display: none;">

                            </span>
                        </p>
                    </td>
                </tr><tr ng-repeat="object in currentView" class="ng-scope">
                    <td>
                        <a id="main_widget_table_object_name_action" href="#//object/" target="_blank">
                            <b class="ng-binding">TASK_TOTO</b>
                        </a>

                        <p style="font-size:11px">
                            <span class="text-success" ng-show="object.label"><em class="ng-binding"> task toto</em></span>
                            <br ng-show="object.label">
                            <span ng-show="object.session" class="ng-binding" style="display: none;">
                                <span class="label label-default">WORKFLOW</span> &nbsp; <em class="ng-binding">
                                </em>
                            </span>
                            <br ng-show="object.session" style="display: none;">
                            <span ng-hide="object.session" class="ng-binding">
                                <span class="label label-default">JOB</span> &nbsp; <em class="ng-binding"></em>
                            </span>
                            <br ng-hide="object.session">
                            <span class="text-warning ng-binding">Location: TASKMUTOTO</span>
                            <span ng-show="isShowing('nextPlanified')" class="badge pull-right ng-binding" style="display: none;">

                            </span>
                        </p>
                    </td>
                </tr><tr ng-repeat="object in currentView" class="ng-scope">
                    <td>
                        <a id="main_widget_table_object_name_action" href="#//object/" target="_blank">
                            <b class="ng-binding">TASK_FTP</b>
                        </a>

                        <p style="font-size:11px">
                            <span class="text-success" ng-show="object.label"><em class="ng-binding"> task ftp</em></span>
                            <br ng-show="object.label">
                            <span ng-show="object.session" class="ng-binding" style="display: none;">
                                <span class="label label-default">WORKFLOW</span> &nbsp; <em class="ng-binding">
                                </em>
                            </span>
                            <br ng-show="object.session" style="display: none;">
                            <span ng-hide="object.session" class="ng-binding">
                                <span class="label label-default">JOB</span> &nbsp; <em class="ng-binding"></em>
                            </span>
                            <br ng-hide="object.session">
                            <span class="text-warning ng-binding">Location: TASKMUFTP</span>
                            <span ng-show="isShowing('nextPlanified')" class="badge pull-right ng-binding" style="display: none;">

                            </span>
                        </p>
                    </td>
                </tr><tr ng-repeat="object in currentView" class="ng-scope">
                    <td>
                        <a id="main_widget_table_object_name_action" href="#//object/" target="_blank">
                            <b class="ng-binding">TASK_MSSQL</b>
                        </a>

                        <p style="font-size:11px">
                            <span class="text-success" ng-show="object.label"><em class="ng-binding"> task mssql</em></span>
                            <br ng-show="object.label">
                            <span ng-show="object.session" class="ng-binding" style="display: none;">
                                <span class="label label-default">WORKFLOW</span> &nbsp; <em class="ng-binding">
                                </em>
                            </span>
                            <br ng-show="object.session" style="display: none;">
                            <span ng-hide="object.session" class="ng-binding">
                                <span class="label label-default">JOB</span> &nbsp; <em class="ng-binding"></em>
                            </span>
                            <br ng-hide="object.session">
                            <span class="text-warning ng-binding">Location: TASKMUMSSQL</span>
                            <span ng-show="isShowing('nextPlanified')" class="badge pull-right ng-binding" style="display: none;">

                            </span>
                        </p>
                    </td>
                </tr><tr ng-repeat="object in currentView" class="ng-scope">
                    <td>
                        <a id="main_widget_table_object_name_action" href="#//object/" target="_blank">
                            <b class="ng-binding">TASK_ORACLE</b>
                        </a>

                        <p style="font-size:11px">
                            <span class="text-success" ng-show="object.label"><em class="ng-binding"> task oracle</em></span>
                            <br ng-show="object.label">
                            <span ng-show="object.session" class="ng-binding" style="display: none;">
                                <span class="label label-default">WORKFLOW</span> &nbsp; <em class="ng-binding">
                                </em>
                            </span>
                            <br ng-show="object.session" style="display: none;">
                            <span ng-hide="object.session" class="ng-binding">
                                <span class="label label-default">JOB</span> &nbsp; <em class="ng-binding"></em>
                            </span>
                            <br ng-hide="object.session">
                            <span class="text-warning ng-binding">Location: TASKMUORA1</span>
                            <span ng-show="isShowing('nextPlanified')" class="badge pull-right ng-binding" style="display: none;">

                            </span>
                        </p>
                    </td>
                </tr><tr ng-repeat="object in currentView" class="ng-scope">
                    <td>
                        <a id="main_widget_table_object_name_action" href="#//object/" target="_blank">
                            <b class="ng-binding">TASK_TUTU</b>
                        </a>

                        <p style="font-size:11px">
                            <span class="text-success" ng-show="object.label"><em class="ng-binding"> task tutu</em></span>
                            <br ng-show="object.label">
                            <span ng-show="object.session" class="ng-binding" style="display: none;">
                                <span class="label label-default">WORKFLOW</span> &nbsp; <em class="ng-binding">
                                </em>
                            </span>
                            <br ng-show="object.session" style="display: none;">
                            <span ng-hide="object.session" class="ng-binding">
                                <span class="label label-default">JOB</span> &nbsp; <em class="ng-binding"></em>
                            </span>
                            <br ng-hide="object.session">
                            <span class="text-warning ng-binding">Location: TASKMUTUTU</span>
                            <span ng-show="isShowing('nextPlanified')" class="badge pull-right ng-binding" style="display: none;">

                            </span>
                        </p>
                    </td>
                </tr><tr ng-repeat="object in currentView" class="ng-scope">
                    <td>
                        <a id="main_widget_table_object_name_action" href="#//object/" target="_blank">
                            <b class="ng-binding">TASK_TITI</b>
                        </a>

                        <p style="font-size:11px">
                            <span class="text-success" ng-show="object.label"><em class="ng-binding"> task titi</em></span>
                            <br ng-show="object.label">
                            <span ng-show="object.session" class="ng-binding" style="display: none;">
                                <span class="label label-default">WORKFLOW</span> &nbsp; <em class="ng-binding">
                                </em>
                            </span>
                            <br ng-show="object.session" style="display: none;">
                            <span ng-hide="object.session" class="ng-binding">
                                <span class="label label-default">JOB</span> &nbsp; <em class="ng-binding"></em>
                            </span>
                            <br ng-hide="object.session">
                            <span class="text-warning ng-binding">Location: TASKMUTITI</span>
                            <span ng-show="isShowing('nextPlanified')" class="badge pull-right ng-binding" style="display: none;">

                            </span>
                        </p>
                    </td>
                </tr><tr ng-repeat="object in currentView" class="ng-scope">
                    <td>
                        <a id="main_widget_table_object_name_action" href="#//object/" target="_blank">
                            <b class="ng-binding">TASK_WSB</b>
                        </a>

                        <p style="font-size:11px">
                            <span class="text-success" ng-show="object.label"><em class="ng-binding"> task wsb</em></span>
                            <br ng-show="object.label">
                            <span ng-show="object.session" class="ng-binding" style="display: none;">
                                <span class="label label-default">WORKFLOW</span> &nbsp; <em class="ng-binding">
                                </em>
                            </span>
                            <br ng-show="object.session" style="display: none;">
                            <span ng-hide="object.session" class="ng-binding">
                                <span class="label label-default">JOB</span> &nbsp; <em class="ng-binding"></em>
                            </span>
                            <br ng-hide="object.session">
                            <span class="text-warning ng-binding">Location: MUWSB</span>
                            <span ng-show="isShowing('nextPlanified')" class="badge pull-right ng-binding" style="display: none;">

                            </span>
                        </p>
                    </td>
                </tr><tr ng-repeat="object in currentView" class="ng-scope">
                    <td>
                        <a id="main_widget_table_object_name_action" href="#//object/" target="_blank">
                            <b class="ng-binding">TASK_SAP</b>
                        </a>

                        <p style="font-size:11px">
                            <span class="text-success" ng-show="object.label"><em class="ng-binding"> task sap</em></span>
                            <br ng-show="object.label">
                            <span ng-show="object.session" class="ng-binding" style="display: none;">
                                <span class="label label-default">WORKFLOW</span> &nbsp; <em class="ng-binding">
                                </em>
                            </span>
                            <br ng-show="object.session" style="display: none;">
                            <span ng-hide="object.session" class="ng-binding">
                                <span class="label label-default">JOB</span> &nbsp; <em class="ng-binding"></em>
                            </span>
                            <br ng-hide="object.session">
                            <span class="text-warning ng-binding">Location: FRQPMDEV18</span>
                            <span ng-show="isShowing('nextPlanified')" class="badge pull-right ng-binding" style="display: none;">

                            </span>
                        </p>
                    </td>
                </tr><tr ng-repeat="object in currentView" class="ng-scope">
                    <td>
                        <a id="main_widget_table_object_name_action" href="#//object/" target="_blank">
                            <b class="ng-binding">TASK_BATCH</b>
                        </a>

                        <p style="font-size:11px">
                            <span class="text-success" ng-show="object.label"><em class="ng-binding"> task batch</em></span>
                            <br ng-show="object.label">
                            <span ng-show="object.session" class="ng-binding" style="display: none;">
                                <span class="label label-default">WORKFLOW</span> &nbsp; <em class="ng-binding">
                                </em>
                            </span>
                            <br ng-show="object.session" style="display: none;">
                            <span ng-hide="object.session" class="ng-binding">
                                <span class="label label-default">JOB</span> &nbsp; <em class="ng-binding"></em>
                            </span>
                            <br ng-hide="object.session">
                            <span class="text-warning ng-binding">Location: MUFRQPMDE</span>
                            <span ng-show="isShowing('nextPlanified')" class="badge pull-right ng-binding" style="display: none;">

                            </span>
                        </p>
                    </td>
                </tr>
                </tbody>
            </table>

I only need the value between the bold tags, for example TASK_BACKUP for the first td.

Here is my Java code:

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class HtmlParser {

    public static void main(String[] args) throws Exception {
        Document doc = Jsoup.connect("http://frstmwarwebsrv2.orsyptst.com:9000/ui/#/en/search?searchString=TSK&filterchecks=nameSWF").get();
        for (Element table : doc.select("#search_results_table")) {
            for (Element row : table.select("tr")) {
                Elements tds = row.select("td");
                System.out.println(tds.get(0).text());
            }
        }
    }

}

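I am new to Jsoup, and so far my code prints nothing. I am looking the table up by its id.

Thanks for your help.

FYI: my table is generated by AngularJS, so Jsoup is not the best way to extract the table data.

When I use this code instead: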

List<WebElement> resultsDiv = driver.findElements(By.xpath("id('search_results_table')"));
for (int i = 0; i < resultsDiv.size(); i++) {
    System.out.println(resultsDiv.get(i).getText());
    System.out.println(resultsDiv.size());
}

I still cannot get the content to display, and the size comes back as 1! I am not sure what I am doing wrong.


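1 answer

  1. # Answer 1

    According to the HTML snippet you provided, the table's id is main_widget_table, not search_results_table. (The URL in your code is no longer reachable, so I cannot tell whether that page has a separate search_results_table.)

    You can use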

    for (Element e : doc.select("#main_widget_table b"))
        System.out.println(e.text());
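
A related point that may explain why the Jsoup version prints nothing at all: Jsoup does not execute JavaScript, and an HTTP client never sends the part of a URL after `#` to the server. In the URL from the question, the whole search route sits in that fragment, so `Jsoup.connect(...).get()` most likely receives only the base page under `/ui/`, before AngularJS has rendered any rows. The sketch below uses only `java.net.URI` to split the URL; the class name `FragmentDemo` and its two helper methods are made up for illustration and are not part of Jsoup or of the original code.

```java
import java.net.URI;

public class FragmentDemo {

    // Rebuilds the part of the URL that an HTTP client actually requests:
    // scheme, authority, path and query, but never the fragment.
    static String requestedPart(String url) {
        URI uri = URI.create(url);
        String query = uri.getRawQuery() == null ? "" : "?" + uri.getRawQuery();
        return uri.getScheme() + "://" + uri.getRawAuthority() + uri.getRawPath() + query;
    }

    // Returns the fragment (everything after '#'), or null if there is none.
    // In a single-page application this part is interpreted in the browser.
    static String clientSideRoute(String url) {
        return URI.create(url).getRawFragment();
    }

    public static void main(String[] args) {
        String url = "http://frstmwarwebsrv2.orsyptst.com:9000/ui/#/en/search?searchString=TSK&filterchecks=nameSWF";
        System.out.println("Requested from the server: " + requestedPart(url));
        System.out.println("Left to the browser:       " + clientSideRoute(url));
    }
}
```

If that is what happens here, the selector above only helps once the rendered HTML is available, for example by passing the page source obtained from a real browser session to `Jsoup.parse(String)`.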