java - HtmlUnit gets page error
I am trying to parse this page:
http://www.reuters.com/article/2015/07/08/us-china-cybersecurity-iduskcn0pi09020150708
My code looks like this:
    WebClient webClient = new WebClient(BrowserVersion.CHROME);
    final HtmlPage page = webClient.getPage("http://www.reuters.com/article/2015/07/08/us-alibaba-singapore-post-iduskcn0pi03j20150708");
    System.out.println(page.asXml());
It gives me a lot of warnings and a huge call stack, all related to the JavaScript engine. I have tried these options:
    webClient.waitForBackgroundJavaScript(1000000);
    webClient.setJavaScriptTimeout(1000000);
but nothing seems to work. The page executes JavaScript to load its content, so I need to wait for the page to load that content. Any ideas how I can resolve this issue?
You need to wait after getting the page. There is also an error saying "addImpression" is not defined; I don't know where in the JavaScript it is supposed to be defined.
I suspect you are not using a recent version, since with a recent one there are not many warnings.
With the latest snapshot, I get the content using:
    try (WebClient webClient = new WebClient(BrowserVersion.CHROME)) {
        // Do not abort when a script on the page throws (e.g. the undefined "addImpression")
        webClient.getOptions().setThrowExceptionOnScriptError(false);
        final HtmlPage page = webClient.getPage("http://www.reuters.com/article/2015/07/08/us-alibaba-singapore-post-iduskcn0pi03j20150708");
        // Give background JavaScript up to 10 seconds to finish loading the content
        webClient.waitForBackgroundJavaScript(10000);
        System.out.println(page.asText());
    }
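If a fixed wait turns out to be too short or needlessly long, one alternative is to poll until the content you expect has actually appeared. The following is only a minimal sketch of that idea using plain JDK classes: PollingWait and waitUntil are hypothetical names, not part of HtmlUnit, and the condition in main is a stand-in for a real check against the page.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper for illustration only; it is not part of HtmlUnit.
final class PollingWait {

    private PollingWait() {}

    /**
     * Re-checks the condition every pollMillis until it holds or
     * timeoutMillis has elapsed. The condition is always checked at least once.
     * Returns true if the condition held, false on timeout or interruption.
     */
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        final long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out without the condition becoming true
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }

    public static void main(String[] args) {
        final long start = System.currentTimeMillis();
        // Stand-in condition: becomes true roughly 200 ms after start.
        boolean ready = waitUntil(() -> System.currentTimeMillis() - start >= 200, 2_000, 50);
        System.out.println("ready = " + ready);
    }
}
```

With HtmlUnit, the condition could be something like a lambda that checks whether page.asText() contains a piece of text you expect in the article, with the timeout chosen to suit the page. Which text to look for is an assumption you would have to fill in yourself.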