Setting the HtmlUnit WebClient browser version doesn't work
I am using HtmlUnit 2.19 (updating to a newer version did not solve the problem). I want to scrape a web page that is generated dynamically by JavaScript. Even after setting the browser version on the WebClient, the page still returns a message saying that the browser version is unsupported. All the sample code and examples I have seen do the same thing I do, yet they work perfectly.
Is there anything else I need to take into account here?
Here is my HtmlUnit Maven dependency:
<dependency>
    <groupId>net.sourceforge.htmlunit</groupId>
    <artifactId>htmlunit</artifactId>
    <version>2.19</version>
</dependency>
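And the sample code:
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.SilentCssErrorHandler;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

// Emulate Chrome; `url` is the address of the page to scrape (defined elsewhere).
WebClient webClient = new WebClient(BrowserVersion.CHROME);
webClient.getOptions().setUseInsecureSSL(true);
webClient.getOptions().setThrowExceptionOnScriptError(false);
webClient.setCssErrorHandler(new SilentCssErrorHandler());
webClient.getOptions().setCssEnabled(false);
webClient.getOptions().setJavaScriptEnabled(true);
HtmlPage page = webClient.getPage(url);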
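One way to narrow this down might be to check which User-Agent string a server actually receives. The following is a minimal sketch that uses only the JDK, not HtmlUnit: it starts a local endpoint that echoes the User-Agent header back as the response body. The names `UserAgentEcho`, `start`, and `roundTrip` are hypothetical, and it is an assumption (not confirmed anywhere in this question) that the target site bases its "unsupported browser" message on that header rather than on JavaScript feature detection.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: a local endpoint that echoes the User-Agent it receives. */
class UserAgentEcho {

    /** Starts the echo server on a free local port; the caller must stop it. */
    static HttpServer start() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            // Reply with whatever User-Agent header the client sent ("null" if absent).
            String ua = exchange.getRequestHeaders().getFirst("User-Agent");
            byte[] body = String.valueOf(ua).getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    /** Sends one GET with the given User-Agent and returns what the server saw. */
    static String roundTrip(String userAgent) {
        try {
            HttpServer server = start();
            try {
                int port = server.getAddress().getPort();
                URI uri = URI.create("http://127.0.0.1:" + port + "/");
                // Bypass any system proxy so the request reaches the local server.
                HttpURLConnection conn =
                        (HttpURLConnection) uri.toURL().openConnection(Proxy.NO_PROXY);
                conn.setRequestProperty("User-Agent", userAgent);
                try (InputStream in = conn.getInputStream()) {
                    return new String(in.readAllBytes(), StandardCharsets.UTF_8);
                }
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Plain JDK client here; an HtmlUnit WebClient could be pointed at the
        // same local address to see the header it sends.
        System.out.println(roundTrip("Mozilla/5.0 (example)"));
    }
}
```

To apply this to the code in the question, one could call `start()`, load `http://127.0.0.1:<port>/` with the `webClient` configured as shown, and compare the echoed string with what a real Chrome sends. If the header looks correct, the site's check probably depends on something else.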