Java cannot access the new page after a submit button click() in HtmlUnit

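The problem: when I run this code, it gets as far as submitButton.fireEvent("onclick").getNewPage() and then appears to end, even though the final System.out.println(pageAfterLogin.getUrl().toString()) never executes. No error is reported during execution.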

import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlElement;
import com.gargoylesoftware.htmlunit.html.HtmlInput;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import java.util.List;

public class WebScraperHTMLUnit2 {

public static void main(String[] args) {
     try{
        WebClient wc = new WebClient();
        HtmlPage page = wc.getPage("https://www.google.com/");

        HtmlInput searchForm = (HtmlInput)page.getFirstByXPath("//input[@name='q']");
        searchForm.setValueAttribute("q");

        HtmlElement submitButton = page.getFirstByXPath("//button[@id='searchButton']");
        HtmlPage pageAfterLogin = (HtmlPage) submitButton.fireEvent("onclick").getNewPage();

        System.out.println(pageAfterLogin.getUrl().toString());   

    } catch (Exception ex) {}       
}    
}

Here is the NetBeans output log:

run:
дек 16, 2016 2:38:16 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
WARNING: CSS error: 'https://www.google.ru/' [1:14018] Error in expression. (Invalid token " ". Was expecting one of: <NUMBER>, "inherit", <IDENT>, <STRING>, <HASH>, <EMS>, <EXS>, <LENGTH_PX>, <LENGTH_CM>, <LENGTH_MM>, <LENGTH_IN>, <LENGTH_PT>, <LENGTH_PC>, <ANGLE_DEG>, <ANGLE_RAD>, <ANGLE_GRAD>, <TIME_MS>, <TIME_S>, <FREQ_HZ>, <FREQ_KHZ>, <RESOLUTION_DPI>, <RESOLUTION_DPCM>, <PERCENTAGE>, <DIMENSION>, <UNICODE_RANGE>, <URI>, <FUNCTION>, "progid:".)
дек 16, 2016 2:38:16 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
WARNING: CSS error: 'https://www.google.ru/' [1:14042] Error in expression. (Invalid token " ". Was expecting one of: <NUMBER>, "inherit", <IDENT>, <STRING>, <HASH>, <EMS>, <EXS>, <LENGTH_PX>, <LENGTH_CM>, <LENGTH_MM>, <LENGTH_IN>, <LENGTH_PT>, <LENGTH_PC>, <ANGLE_DEG>, <ANGLE_RAD>, <ANGLE_GRAD>, <TIME_MS>, <TIME_S>, <FREQ_HZ>, <FREQ_KHZ>, <RESOLUTION_DPI>, <RESOLUTION_DPCM>, <PERCENTAGE>, <DIMENSION>, <UNICODE_RANGE>, <URI>, <FUNCTION>, "progid:".)
дек 16, 2016 2:38:16 PM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Obsolete content type encountered: 'text/javascript'.
СБОРКА УСПЕШНО ЗАВЕРШЕНА (общее время: 3 секунды)

1 answer

  1. # Answer 1

    The XPath for the button is incorrect. The button is:

    <input value="Google Search" aria-label="Google Search" name="btnK" type="submit" jsaction="sf.chk">
    

    Your code should look like this:

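    // Also requires: import com.gargoylesoftware.htmlunit.html.HtmlSubmitInput;
    try {
        final WebClient wc = new WebClient();
        // Do not abort the run when a script on the page throws an error.
        wc.getOptions().setThrowExceptionOnScriptError(false);
    
        HtmlPage page = wc.getPage("https://www.google.com/");
    
        // Fill in the search box.
        HtmlInput searchForm = page.getFirstByXPath("//input[@name='q']");
        searchForm.setValueAttribute("q");
    
        // The search button is an <input type="submit">, not a <button>.
        HtmlSubmitInput submitButton = page.getFirstByXPath("//input[@name='btnK']");
        HtmlPage pageAfterLogin = submitButton.click();
    
        System.out.println(pageAfterLogin.getUrl().toString());
    
    } catch (Exception e) {
        // Print the failure instead of silently discarding it.
        e.printStackTrace();
    }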
    

    You need to set setThrowExceptionOnScriptError to false because a script error is thrown (cause unknown), and you don't want it to stop your code from running.
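
A likely reason the original program ended without printing anything (an inference, not something confirmed in this thread): getFirstByXPath returns null when nothing matches, so calling fireEvent on the result throws a NullPointerException, and the empty catch block in the question discards it. The sketch below reproduces that pattern with plain JDK classes only. PAGE, lookup, swallowing and reporting are hypothetical stand-ins for illustration, not HtmlUnit APIs.

```java
import java.util.HashMap;
import java.util.Map;

public class SilentFailureDemo {

    // Hypothetical stand-in for a page: maps a selector to the matched element's label.
    private static final Map<String, String> PAGE = new HashMap<>();
    static {
        PAGE.put("//input[@name='btnK']", "Google Search");
    }

    // Like a first-match lookup, this returns null when nothing matches.
    public static String lookup(String selector) {
        return PAGE.get(selector);
    }

    // Mirrors the structure of the question's code: any exception is swallowed.
    public static String swallowing(String selector) {
        try {
            String element = lookup(selector);
            return "clicked: " + element.trim(); // NullPointerException if element is null
        } catch (Exception ex) {
            // Empty on purpose: nothing is printed, so the failure is invisible.
        }
        return "ended silently";
    }

    // Same flow, but checks for null and reports the failure.
    public static String reporting(String selector) {
        try {
            String element = lookup(selector);
            if (element == null) {
                throw new IllegalStateException("No element matches " + selector);
            }
            return "clicked: " + element.trim();
        } catch (Exception ex) {
            return "failed: " + ex.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(swallowing("//button[@id='searchButton']"));
        System.out.println(reporting("//button[@id='searchButton']"));
        System.out.println(reporting("//input[@name='btnK']"));
    }
}
```

With the non-matching selector, swallowing gives no hint of what went wrong, while reporting names the selector that failed. In the real program, an ex.printStackTrace() call in the catch block would serve the same purpose.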

    According to this post, the HTML generated on www.google.com changes constantly, so my //input[@name='btnK'] XPath may stop working in the future.
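
Since the markup may change, one way to hedge is to keep several candidate XPaths and use the first one that matches, failing loudly when none does. The sketch below shows the idea with the JDK's javax.xml.xpath package rather than HtmlUnit. SNAPSHOT is a hypothetical, simplified, well-formed stand-in for the search form (not the real Google markup), and firstMatching is an illustrative helper name. With HtmlUnit, the same loop could call page.getFirstByXPath for each candidate and check the result for null.

```java
import java.io.StringReader;
import java.util.Arrays;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathFallbackDemo {

    // Hypothetical, simplified snapshot of the form; the real page differs and changes over time.
    public static final String SNAPSHOT =
        "<form><input name='q' type='text'/>"
        + "<input name='btnK' type='submit' value='Google Search'/></form>";

    // Returns the first candidate XPath that matches at least one node, or null if none do.
    public static String firstMatching(String xml, List<String> candidates) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        for (String candidate : candidates) {
            try {
                NodeList nodes = (NodeList) xpath.evaluate(
                    candidate, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
                if (nodes.getLength() > 0) {
                    return candidate;
                }
            } catch (XPathExpressionException ex) {
                throw new IllegalArgumentException("Cannot evaluate XPath: " + candidate, ex);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> candidates = Arrays.asList(
            "//button[@id='searchButton']", // selector from the question
            "//input[@name='btnK']");       // selector from the answer
        String hit = firstMatching(SNAPSHOT, candidates);
        if (hit == null) {
            throw new IllegalStateException("No candidate XPath matched; the markup may have changed");
        }
        System.out.println("Using " + hit);
    }
}
```

Against this snapshot the selector from the question matches nothing and the one from the answer is chosen. If the page changes again, the explicit exception makes that visible immediately.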