How to force removal of attributes with implied default values from DTD in Java XML DOM
Question
As I reported elsewhere on Stack Overflow, I'm parsing a legacy modular XHTML 1.1 document and the DTD is adding all sorts of default attributes such as version="-//W3C//DTD XHTML 1.1//EN". Some of these are even inappropriate, such as xml:space="preserve".
I'm writing a utility to clean up the DOM after parsing, but I forgot that the DOM will automatically add back default attributes from the DTD if I remove them. So if I call Element.removeAttributeNS(null, "version") on the document element, for example, it just adds back version="-//W3C//DTD XHTML 1.1//EN" and I'm back where I started.
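To make the behavior concrete, here is a minimal sketch that should reproduce it. The class name DefaultAttributeDemo and the tiny inline DTD are made up for illustration; the internal subset only stands in for the real XHTML 1.1 DTD, and I'm assuming the parser bundled with the JDK.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class DefaultAttributeDemo {

    // Hypothetical stand-in for the XHTML 1.1 DTD: an internal subset that
    // declares a default value for the "version" attribute.
    public static final String SAMPLE =
        "<!DOCTYPE html [\n"
      + "  <!ELEMENT html EMPTY>\n"
      + "  <!ATTLIST html version CDATA \"-//W3C//DTD XHTML 1.1//EN\">\n"
      + "]>\n"
      + "<html/>";

    // Parses the given XML text with a namespace-aware, non-validating parser.
    public static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse the sample document", e);
        }
    }

    public static void main(String[] args) {
        Element root = parse(SAMPLE).getDocumentElement();

        // The attribute is absent from the markup, so it can only come from the DTD.
        Attr version = root.getAttributeNodeNS(null, "version");
        System.out.println("specified in the markup: " + version.getSpecified());

        // Ask the DOM to remove it, then check whether it is still there.
        root.removeAttributeNS(null, "version");
        System.out.println("present after removal: "
                + root.hasAttributeNS(null, "version"));
        System.out.println("value after removal: "
                + root.getAttributeNS(null, "version"));
    }
}
```

Attr.getSpecified() is the standard DOM way to tell a defaulted attribute from one that was actually written in the document, so a cleanup utility can at least detect which attributes came from the DTD.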
How can I force the DOM in Java to remove an attribute, even if the DTD indicates that attribute has an implied value? Or how can I just change/remove the DTD from the DOM tree so that implied attributes don't automatically show up?
I have succeeded in creating a new, empty document without a DTD; importing the old document element and its descendants; and then replacing the root element in the new document with the imported element tree, but this is a lot of overhead and too inefficient. Is there a more efficient workaround or solution?
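For reference, the workaround described above looks roughly like the sketch below. The names DoctypeFreeCopy and copyWithoutDoctype are hypothetical, the inline DTD is again only a stand-in for XHTML 1.1, and because DocumentBuilder.newDocument() starts with no root element, this version appends the imported root rather than replacing an existing one.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class DoctypeFreeCopy {

    // Hypothetical sample: "version" is defaulted by the DTD,
    // while "lang" is written explicitly in the markup.
    public static final String SAMPLE =
        "<!DOCTYPE html [\n"
      + "  <!ELEMENT html EMPTY>\n"
      + "  <!ATTLIST html version CDATA \"-//W3C//DTD XHTML 1.1//EN\">\n"
      + "]>\n"
      + "<html lang=\"en\"/>";

    // Parses the given XML text with a namespace-aware, non-validating parser.
    public static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse the sample document", e);
        }
    }

    // Deep-copies the element tree of 'source' into a fresh document that has
    // no DocumentType node, and therefore no DTD defaults to re-apply.
    public static Document copyWithoutDoctype(Document source) {
        try {
            Document target = DocumentBuilderFactory.newInstance()
                                                    .newDocumentBuilder()
                                                    .newDocument();
            Node importedRoot = target.importNode(source.getDocumentElement(), true);
            target.appendChild(importedRoot);
            return target;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("could not create an empty document", e);
        }
    }

    public static void main(String[] args) {
        Document copy = copyWithoutDoctype(parse(SAMPLE));
        System.out.println("doctype: " + copy.getDoctype());
        System.out.println("has version: "
                + copy.getDocumentElement().hasAttribute("version"));
        System.out.println("lang: " + copy.getDocumentElement().getAttribute("lang"));
    }
}
```

The DOM Level 3 description of Document.importNode says that default attributes of the source element are not copied, only specified ones, so the deep copy is what actually discards the DTD defaults here. The cost is the full traversal and duplication of the tree, which is the overhead in question.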