How do I change the text of an Element in jsoup?
Question
I'm iterating over an HTML document and trying to change the text of each element, but jsoup doesn't produce the result I expect. My code is:
// URL
String url = "http://example.com/source.html";
Document doc = Jsoup.connect(url)
        .userAgent("Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:25.0) Gecko/20100101 Firefox/25.0")
        .referrer("http://www.google.com")
        .ignoreHttpErrors(true).get();
// Select all of the elements in the HTML
Elements eles = doc.body().select("*");
// For each element
for (Element ele : eles) {
    String text = ele.ownText();
    System.out.println(text);
    ele.text("newText");
}
File htmlFile = new File("output.html");
PrintWriter pw = new PrintWriter(htmlFile, "UTF-8");
// Write our translated HTML to the output file
pw.println(doc);
pw.close();
And the resulting HTML body that I get is:
<body>
newText
</body>
Answer 1
Score: 0
You're selecting every element in the document with .select("*"), and that selection includes <body> itself. Calling text("newText") on <body> replaces all of its content, child elements included, so everything else is lost. Try to be more specific when selecting elements.
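If the goal is to change the text while keeping the markup, one option is to edit the text nodes that belong directly to each element instead of calling text(...) on the element. The following is only a minimal sketch under that assumption: the class name ReplaceOwnText, the helper replaceOwnText, and the sample HTML are illustrative and not part of the original post. It parses a string rather than fetching a URL so it can be run offline.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.TextNode;

public class ReplaceOwnText {

    // Hypothetical helper: replaces every non-blank text node under <body>
    // while leaving the element tree (tags and nesting) untouched.
    static Document replaceOwnText(String html, String replacement) {
        Document doc = Jsoup.parse(html);
        // select("*") still matches <body> and all descendants, but we no
        // longer call Element.text(String), which would drop child elements.
        for (Element ele : doc.body().select("*")) {
            // textNodes() returns only the text nodes that are direct children.
            for (TextNode node : ele.textNodes()) {
                if (!node.isBlank()) {
                    node.text(replacement);
                }
            }
        }
        return doc;
    }

    public static void main(String[] args) {
        Document doc = replaceOwnText("<div><p>Hello <b>world</b></p></div>", "newText");
        System.out.println(doc.body().html());
    }
}
```

A narrower selector such as select("p, h1") (again only an example) would also avoid touching <body>, but Element.text(String) would still remove any elements nested inside the matched ones, so editing text nodes is the safer choice when elements contain inline markup.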