PDF Conversion with Hebrew Content - renders content in the wrong direction

akashkrishnan · April 22, 2026, 6:39am

I have a DOCX file with Hebrew content. When it is converted to PDF, the text renders in the wrong direction.

See the screenshots below. The DOCX and the generated PDF are attached as well.

The sample code we use to convert DOCX to PDF is also included:

import com.groupdocs.conversion.Converter;
import com.groupdocs.conversion.contracts.FontSubstitute;
import com.groupdocs.conversion.options.convert.PdfConvertOptions;
import com.groupdocs.conversion.options.load.WordProcessingLoadOptions;

import java.util.ArrayList;
import java.util.List;

public class ConvertDocxToPdf {
    public static void main(String[] args) {
        String sourceFilePath = "path/to/your/testSEFE03.docx";
        String outputFilePath = "path/to/output/testSEFE03.pdf";

        // Create load options with font substitution
        WordProcessingLoadOptions loadOptions = new WordProcessingLoadOptions();
        loadOptions.setAutoFontSubstitution(false);
        loadOptions.setDefaultFont("Arial"); // Set a default font if needed

        List<FontSubstitute> fontSubstitutes = new ArrayList<>();
        fontSubstitutes.add(FontSubstitute.create("Arial MT", "Arial")); // Map Arial MT to Arial
        fontSubstitutes.add(FontSubstitute.create("Times New Roman", "Times New Roman")); // Ensure Times New Roman is used

        loadOptions.setFontSubstitutes(fontSubstitutes);

        // Initialize the converter
        try (Converter converter = new Converter(sourceFilePath, () -> loadOptions)) {
            PdfConvertOptions options = new PdfConvertOptions();
            converter.convert(outputFilePath, options);
            System.out.println("Conversion completed successfully.");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Screenshot 2026-04-22 at 12.04.05 PM.jpg (74.6 KB)

Screenshot 2026-04-22 at 12.05.11 PM.png (306.6 KB)

Generated Docx.docx (181.8 KB)

Generated PDF.pdf (36.8 KB)

nikola.yankov · April 22, 2026, 1:35pm

@akashkrishnan

Thanks for the report. This is a classic case of malformed RTL markup: the paragraphs are missing <w:bidi/> and the Hebrew runs carry <w:rtl w:val="0"/>. Word and LibreOffice detect the Hebrew characters and correct the direction automatically. GroupDocs.Conversion currently just follows what the XML says, which is why your PDF comes out reversed.
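To make the mismatch concrete, here is a minimal diagnostic sketch in plain Java. It is not GroupDocs code: the class and method names (RtlMarkupCheck, looksMalformed) are hypothetical, and it assumes you have already extracted the XML of one paragraph from word/document.xml. It flags a paragraph that contains right-to-left characters but either lacks an enabled <w:bidi/> or has a run with <w:rtl w:val="0"/>.

```java
import java.util.regex.Pattern;

final class RtlMarkupCheck {
    // Run-level RTL flag explicitly switched off, e.g. <w:rtl w:val="0"/>.
    private static final Pattern RTL_OFF =
            Pattern.compile("<w:rtl\\s+w:val=\"(0|false)\"\\s*/>");
    // Paragraph-level bidi flag that is on: <w:bidi/> or <w:bidi w:val="1"/>.
    private static final Pattern BIDI_ON =
            Pattern.compile("<w:bidi(\\s+w:val=\"(1|true|on)\")?\\s*/>");

    // True if the text holds at least one strongly right-to-left character
    // (Hebrew, Arabic, and similar scripts). XML tag names are Latin, so only
    // the document text can trigger this.
    static boolean containsRtlCharacters(String text) {
        return text.codePoints().anyMatch(cp -> {
            byte d = Character.getDirectionality(cp);
            return d == Character.DIRECTIONALITY_RIGHT_TO_LEFT
                    || d == Character.DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC;
        });
    }

    // Heuristic: RTL text is present, but the markup does not declare it as RTL.
    static boolean looksMalformed(String paragraphXml) {
        boolean hasRtlText = containsRtlCharacters(paragraphXml);
        boolean bidiOn = BIDI_ON.matcher(paragraphXml).find();
        boolean rtlOff = RTL_OFF.matcher(paragraphXml).find();
        return hasRtlText && (!bidiOn || rtlOff);
    }

    public static void main(String[] args) {
        // Hebrew word written with Unicode escapes to avoid source-encoding issues.
        String hebrew = "\u05E9\u05DC\u05D5\u05DD";
        String bad = "<w:p><w:r><w:rPr><w:rtl w:val=\"0\"/></w:rPr><w:t>"
                + hebrew + "</w:t></w:r></w:p>";
        String good = "<w:p><w:pPr><w:bidi/></w:pPr><w:r><w:rPr><w:rtl/></w:rPr><w:t>"
                + hebrew + "</w:t></w:r></w:p>";
        System.out.println("bad paragraph flagged:  " + looksMalformed(bad));
        System.out.println("good paragraph flagged: " + looksMalformed(good));
    }
}
```

The regular expressions are a shortcut for illustration. A real check would parse the XML and account for styles that set direction indirectly.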

We’ve now added an RTL auto-detect pass to the WordProcessing → PDF pipeline that does the same thing Word does: it checks the actual characters, fixes paragraph and run direction, and keeps the visual alignment. It’s exposed as WordProcessingLoadOptions.AutoDetectRtlDirection and is on by default, so your existing code will start working correctly without changes.
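For readers curious what character-based direction detection can look like, the following is a rough sketch using the JDK's java.text.Bidi class. It is an assumption-laden illustration, not the GroupDocs implementation: the class and method names (RtlAutoDetectSketch, paragraphIsRtl, containsRtlRun) are hypothetical. The paragraph direction is taken from the first strong character, and each run is treated as RTL when its embedding level is odd.

```java
import java.text.Bidi;

final class RtlAutoDetectSketch {
    // Paragraph direction from the first strongly directional character;
    // falls back to left-to-right when the text has none (e.g. digits only).
    static boolean paragraphIsRtl(String text) {
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        return !bidi.baseIsLeftToRight();
    }

    // True if any directional run has an odd embedding level, i.e. is RTL.
    static boolean containsRtlRun(String text) {
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        for (int i = 0; i < bidi.getRunCount(); i++) {
            if ((bidi.getRunLevel(i) & 1) == 1) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hebrew word written with Unicode escapes to avoid source-encoding issues.
        String hebrew = "\u05E9\u05DC\u05D5\u05DD";
        String[] samples = { hebrew + " world", "Hello " + hebrew, "Hello" };
        for (String s : samples) {
            System.out.println("paragraph RTL=" + paragraphIsRtl(s)
                    + ", has RTL run=" + containsRtlRun(s));
        }
    }
}
```

In a converter, results like these could decide whether to treat a paragraph as if <w:bidi/> were set and a run as if <w:rtl/> were set, regardless of what the stored XML says.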

It will ship in GroupDocs.Conversion for .NET 26.6, and about a month later in the Java version.

akashkrishnan · April 23, 2026, 8:21am

@nikola.yankov Is it possible to release the update for Java sooner?

vsevolod.orefin · April 23, 2026, 9:17pm

Hi @akashkrishnan, I will introduce it in GroupDocs.Conversion for Java 26.5.