Why do all getX() calls in Selenium take so long?


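I want to do some web scraping with Selenium. Even after I have put all the WebElements on the site into an ArrayList, the program still takes a short while to complete each call to getText(), getAttribute(), or getTagName().

The page I am scraping uses JavaScript.

Is this because every call has to re-download material from the web page?

I had assumed that Selenium downloads the web page up front, stores the elements in fast memory on my computer, and then walks through that material when I call getText() or getAttribute(). Is that wrong? If so, is there a way to download the whole page, extract the elements into my computer's memory, and process them there? Here is some sample code: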

            List<WebElement> all_elements = driver.findElements(By.cssSelector("*"));
            for (int i = 0; i < all_elements.size() - 2; i++) {
                //If the tag name two elements ahead is "em", then the current
                //element is the title of the paper, and the "em" element is the author.
                //Search up to 10 elements for a "Full Text" link.
                if (all_elements.get(i + 2).getTagName().equalsIgnoreCase("em")) {
                    String title = all_elements.get(i).getText();
                    String author = all_elements.get(i + 2).getText();
                    String full_text = "";
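                    //Stop at the end of the list so that get(i + j) cannot go out of bounds.
                    for (int j = 3; j < 10 && i + j < all_elements.size() && full_text.isEmpty(); j++) {
                        if (all_elements.get(i + j).getText().equalsIgnoreCase("full text")) {
                            //getAttribute returns null when the attribute is absent.
                            String href = all_elements.get(i + j).getAttribute("href");
                            full_text = (href == null) ? "" : href;
                        }
                    }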
                    System.out.println("Title: " + title);
                    System.out.println("Author: " + author);
                    System.out.println("Full text link: " + full_text);
                }
            }
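
A note on the cost, plus a rough sketch of the "work in memory" idea asked about above. A WebElement is a handle to a node in the live browser, not a local copy, so each getText(), getAttribute(), or getTagName() call is sent to the browser as a separate WebDriver command and waits for the reply. One way to avoid that is to copy the data out once, for example with a single `JavascriptExecutor.executeScript(...)` call that returns tag, text, and href for every element, or with `driver.getPageSource()` fed into an HTML parser, and then scan plain Java objects. The sketch below shows only the local scan. The `Node`, `Paper`, and `SnapshotScan` types are hypothetical names introduced for illustration, and how the snapshot list gets filled is left as an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory copy of one DOM element: plain strings, no browser calls.
final class Node {
    final String tag;
    final String text;
    final String href; // null when the element has no href attribute

    Node(String tag, String text, String href) {
        this.tag = tag;
        this.text = text;
        this.href = href;
    }
}

// Hypothetical result type for one paper entry.
final class Paper {
    final String title;
    final String author;
    final String fullTextLink; // empty string when no link was found

    Paper(String title, String author, String fullTextLink) {
        this.title = title;
        this.author = author;
        this.fullTextLink = fullTextLink;
    }
}

final class SnapshotScan {
    // Applies the same heuristic as the loop in the question, but to local data only:
    // an "em" two positions ahead marks the current node as a title and the "em" as
    // the author; the following nodes (offsets 3 to 9) are searched for a "Full Text" link.
    static List<Paper> extract(List<Node> nodes) {
        List<Paper> papers = new ArrayList<>();
        for (int i = 0; i + 2 < nodes.size(); i++) {
            if (!"em".equalsIgnoreCase(nodes.get(i + 2).tag)) {
                continue;
            }
            String link = "";
            for (int j = 3; j < 10 && i + j < nodes.size() && link.isEmpty(); j++) {
                Node candidate = nodes.get(i + j);
                if ("full text".equalsIgnoreCase(candidate.text) && candidate.href != null) {
                    link = candidate.href;
                }
            }
            papers.add(new Paper(nodes.get(i).text, nodes.get(i + 2).text, link));
        }
        return papers;
    }
}
```

Because `extract` only touches local objects, its running time no longer depends on the number of browser round trips. The snapshot does go stale if the page's JavaScript changes the DOM afterwards, so it would need to be taken again in that case.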
