How do I change the text of an Element in jsoup?

Question

I'm iterating over an HTML document and changing the text of each element, but jsoup doesn't behave as I expect when I set an element's text. My code is:

    // URL
    String url = "http://example.com/source.html";

    Document doc = Jsoup.connect(url)
            .userAgent("Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:25.0) Gecko/20100101 Firefox/25.0")
            .referrer("http://www.google.com")
            .ignoreHttpErrors(true).get();

    // Select all of the elements in the HTML
    Elements eles = doc.body().select("*");
    // For each element
    for (Element ele : eles) {

        String text = ele.ownText();
        System.out.println(text);
        ele.text("newText");

    }

    File htmlFile = new File("output.html");
    PrintWriter pw = new PrintWriter(htmlFile, "UTF-8");
    // Write our translated HTML to the output file
    pw.println(doc);
    pw.close();

And the resulting HTML body that I get is:

    <body>
     newText
    </body>

Answer 1

Score: 0

You're selecting every element in the document with .select("*"), and that includes <body> itself. Element.text(String) clears an element's existing children before setting the new text, so the call on <body> replaces its entire content and everything else is lost. Try to be more specific when selecting elements.
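
A minimal sketch of one way to keep the markup intact, assuming the goal is to replace only the visible text: instead of calling text(...) on each element, it updates each element's direct text nodes. The class name ReplaceOwnText, the helper replaceText, and the sample HTML are illustrative and not from the original post; it parses a string rather than fetching a URL so it can run offline.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.TextNode;

public class ReplaceOwnText {

    // Hypothetical helper: replaces every non-blank text node under <body>
    // with newText and returns the resulting inner HTML of <body>.
    static String replaceText(String html, String newText) {
        Document doc = Jsoup.parse(html);
        for (Element ele : doc.body().select("*")) {
            // textNodes() returns only the direct text children of ele,
            // so child elements are never removed.
            for (TextNode node : ele.textNodes()) {
                if (!node.isBlank()) {
                    node.text(newText);
                }
            }
        }
        return doc.body().html();
    }

    public static void main(String[] args) {
        // Sample input, made up for illustration.
        String html = "<div><p>Hello <b>world</b></p><p>Bye</p></div>";
        System.out.println(replaceText(html, "newText"));
    }
}
```

Because only text nodes are modified, the <div>, <p>, and <b> elements should all survive in the output. If only certain elements should change, a narrower selector such as select("p") in place of select("*") would restrict the loop further.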

Posted by huangapple on May 31, 2020 at 06:41:25
Original link: https://java.coder-hub.com/62109502.html