Remove a few HTML tags from the HTML DOM

Question

<table>
    <colgroup>
        <col style="width: 20%"/>
        <col style="width: 20%"/>
        <col style="width: 50%"/>
        <col style="width: 10%"/>
    </colgroup>
    <tbody>
        <tr>
            <th colspan="1">
                <p>Header1</p>
            </th>
            <th colspan="2">
                <p>Header2</p>
            </th>
            <th colspan="1">
                <p><a><strong>Header3</strong></a></p>
            </th>
        </tr>
        <tr>
            <td colspan="1">
                <p>Value1</p>
            </td>
            <td colspan="2">
                <p>Value2</p>
            </td>
            <td colspan="1">
                <p><a><strong>Value3</strong></a></p>
            </td>
        </tr>
    </tbody>
</table>

I have the HTML table below and want to convert it to XML. My code, shown below, first parses the table into an HTML DOM, which I will later convert to XML. My problem is that I want to keep only the <th>, <tr>, <tbody>, <table> and <p> tags; all other tags should be left out of the document. How can I do that? The goal is to turn the HTML table into an XML table, so that I can then use a list to load the data into a class, which will in turn be converted to XML.

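// Parse the table markup held in tableInString into a DOM document.
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
InputSource is = new InputSource(new StringReader(tableInString));
Document document = builder.parse(is);
document.getDocumentElement().normalize();
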
<table style="width: 100%;">
    <colgroup>
        <col style="width: 20%;"/>
        <col style="width: 20%;"/>
        <col style="width: 50%;"/>
        <col style="width: 10%;"/>
    </colgroup>
    <tbody>
        <tr>
            <th colspan="1">
                <p>Header1</p>
            </th>
            <th colspan="2">
                <span><div>Header2</div></span>
            </th>
            <th colspan="1">
                <p><a><strong>Header3</strong></a></p>
            </th>
        </tr>
        <tr>
            <td colspan="1">
                <div>Value1</div>
            </td>
            <td colspan="2">
                <span><div>Value2</div></span>
            </td>
            <td colspan="1">
                <p><a><strong>Value3</strong></a></p>
            </td>
        </tr>
    </tbody>
</table>
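
One possible way to do this, sketched here rather than taken from the original post, is to walk the parsed DOM and "unwrap" every element that is not on an allow-list: its children are moved up into its parent and the element itself is dropped, so the text inside <a>, <strong>, <span> or <div> survives while the tags disappear. The class name TagFilter, the KEEP set and both method names are hypothetical. The sketch also assumes the markup is well-formed XML, as the DocumentBuilder code in the question requires, and that "removing" a tag means keeping its text. The allow-list follows the question literally, so <td> is unwrapped too; add "td" to KEEP if data cells should be kept.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class TagFilter {
    // Hypothetical allow-list copied from the question; extend it as needed (e.g. "td").
    static final Set<String> KEEP =
            new HashSet<>(Arrays.asList("table", "tbody", "tr", "th", "p"));

    // Unwraps every descendant element of `node` whose tag is not in KEEP.
    // Call it on document.getDocumentElement(), so the root itself is never removed.
    static void unwrapDisallowed(Node node) {
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // remember before the tree changes
            unwrapDisallowed(child);            // clean the subtree first
            if (child instanceof Element
                    && !KEEP.contains(child.getNodeName().toLowerCase(Locale.ROOT))) {
                // Move the (already cleaned) children up, in order, then drop the wrapper.
                while (child.getFirstChild() != null) {
                    node.insertBefore(child.getFirstChild(), child);
                }
                node.removeChild(child);
            }
            child = next;
        }
    }

    // Same parsing steps as in the question, wrapped so callers need no checked exceptions.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }
}
```

With the code from the question, this would be called as TagFilter.unwrapDisallowed(document.getDocumentElement()) right after builder.parse(is). Elements with no content, such as <colgroup> and <col>, simply vanish. Attributes on kept elements (for example colspan) are left untouched.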

huangapple
  • Posted on April 5, 2020 at 22:39:03
  • When reposting, please keep the link to this article: https://java.coder-hub.com/61044313.html