I am currently working on converting Word documents to clean HTML. I have been using Apache POI, but it seems to create messy output similar to MS Word's own HTML saving method. What I really need is a solution like the one offered by . For instance, when converting a table, I prefer not to include any width properties or unnecessary elements, just simple <td> and <tr> tags with maybe some <b> formatting.
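To make the target concrete, here is a rough sketch of the kind of cleanup I have in mind, applied as a post-processing step to whatever HTML a converter emits. It uses only the Java standard library. The class name `HtmlCleaner`, the tag whitelist, and the regex are my own assumptions for illustration and are not part of Apache POI. A regex like this is too naive for arbitrary HTML (for example, attribute values containing `>`), so a real HTML parser would likely be more robust.

```java
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HtmlCleaner {
    // Hypothetical whitelist: the only tags I would like to keep in the output.
    private static final Set<String> ALLOWED =
            Set.of("table", "tr", "td", "th", "p", "b", "i");

    // Matches an opening or closing tag, capturing the optional slash and the tag name.
    // Naive assumption: no '>' characters appear inside attribute values.
    private static final Pattern TAG =
            Pattern.compile("<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>");

    /** Drops all attributes and removes every tag that is not whitelisted (its text is kept). */
    public static String clean(String html) {
        Matcher m = TAG.matcher(html);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String name = m.group(2).toLowerCase();
            // Re-emit whitelisted tags without attributes; replace all others with nothing.
            String replacement = ALLOWED.contains(name)
                    ? "<" + m.group(1) + name + ">"
                    : "";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String messy = "<table width=\"100%\"><tr><td style=\"width:50pt\">"
                + "<span class=\"s1\"><b>Name</b></span></td></tr></table>";
        // prints: <table><tr><td><b>Name</b></td></tr></table>
        System.out.println(clean(messy));
    }
}
```

That is the level of output I am after: bare structural tags, no widths, no styling spans.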
Can anyone suggest a better approach to achieve this? I am open to exploring alternative Java APIs for Word to HTML conversion, as I am not bound to using Apache POI.