Extra spaces appear around a CSS-styled custom tag when converting HTML to docx with docx4j and Jsoup

I am looking for a way to convert an HTML formatted string into a docx file. Currently, I am using Jsoup to clean up the HTML and then docx4j to parse the XHTML into a docx format.

However, I ran into a problem with colors, which I could not get docx4j to apply. As a workaround, I wrapped the text in a custom tag with an arbitrary name ("si") and set its color through a CSS rule. The color now works, but extra spaces are added before and after the colored text.

Below is the code snippet:

import java.io.File;
import java.util.List;

import org.docx4j.Docx4J;
import org.docx4j.convert.in.xhtml.XHTMLImporter;
import org.docx4j.convert.in.xhtml.XHTMLImporterImpl;
import org.docx4j.openpackaging.exceptions.Docx4JException;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class test {

    public static void main(String[] args) throws Docx4JException {
        WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.createPackage();
        String outputfilepath = "test.docx";


        String d = "<html xmlns=\"http://www.w3.org/1999/xhtml\"><head><style type=\"text/css\">body{font-family:Arial; font-size:120%;}si{color:#0000FF;padding-right: 0;margin:0;}si:before { padding-right: 0;margin:0;indent:0; }</style></head><body><p>blabla<si><strong>blaebdqzd</strong>qdzd</si>zdqzdq</p></body></html>";
        String e ="<html xmlns=\"http://www.w3.org/1999/xhtml\"><head><style type=\"text/css\">body{font-family:Arial; font-size:120%;}si{color:#0000FF;padding-right: 0;margin:0;}si:before { padding-right: 0;margin:0;indent:0; }</style></head><body><p><si><strong>blaebdqzd</strong>qdzd</si>zdqzdq</p></body></html>";


        // First sample: the colored <si> text sits in the middle of the paragraph
        XHTMLImporter importer = new XHTMLImporterImpl(wordMLPackage);
        String text = htmlToXhtml(d);
        List<Object> content = importer.convert(text, null);
        wordMLPackage.getMainDocumentPart().getContent().addAll(content);

        // Second sample: the colored <si> text starts the paragraph
        importer = new XHTMLImporterImpl(wordMLPackage);
        text = htmlToXhtml(e);
        content = importer.convert(text, null);
        wordMLPackage.getMainDocumentPart().getContent().addAll(content);


        Docx4J.save(wordMLPackage, new File(outputfilepath), Docx4J.FLAG_NONE);
    }

    private static String htmlToXhtml(final String html) {
        final Document document = Jsoup.parse(html);
        document.outputSettings().syntax(Document.OutputSettings.Syntax.xml);
        return document.html();
    }
}

Can anyone provide assistance with this issue? Thank you!

Answer №1

Here are a couple of suggestions to consider:

  1. Set the display property of your custom element "si" explicitly, to either block or inline
  2. Consider using a span instead
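
As a rough illustration of the second suggestion, the custom tag could be swapped for a span carrying the color as an inline style before the string is handed to the importer. This is only a sketch: the class and method names (SiToSpan, siToSpan) are made up for the example, it assumes the tag always appears as a bare <si> with no attributes, and it has not been confirmed here that the change removes the extra spaces.

```java
// Hypothetical helper, not part of docx4j or Jsoup.
final class SiToSpan {

    // Replaces every bare <si>...</si> pair with a <span> that carries the
    // color as an inline style. Assumes <si> is written without attributes.
    static String siToSpan(String xhtml, String color) {
        String openSpan = "<span style=\"color:" + color + "\">";
        return xhtml.replace("<si>", openSpan).replace("</si>", "</span>");
    }

    public static void main(String[] args) {
        String body = "<p>blabla<si><strong>blaebdqzd</strong>qdzd</si>zdqzdq</p>";
        System.out.println(siToSpan(body, "#0000FF"));
    }
}
```

In the question's code this would be applied to d and e before htmlToXhtml is called, and the si rules in the style block would then no longer be needed. Since a span is an inline element by default, this also covers the first suggestion.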

Additionally, docx4j's XHTML import relies on Flying Saucer (the xhtmlrenderer library), so searching for how Flying Saucer handles this case may provide helpful insights.
