PDF to DOC or DOCX - Text overlapping issue

We are facing an issue when converting from PDF to DOCX: some of the text overlaps in the output. We are using a recent version of GroupDocs.Conversion.

Below is the Java code snippet we tried. The PDF and the converted DOCX are attached for reference.

    String convertedFile = "template_" + templateId + ".doc";
    com.groupdocs.conversion.Converter converter = new com.groupdocs.conversion.Converter(file1.getAbsolutePath());

    WordProcessingConvertOptions options = new WordProcessingConvertOptions();
    options.setFormat(WordProcessingFileType.Doc);
    options.setPdfRecognitionMode(PdfRecognitionMode.FLOW);
    options.setFormat_ConvertOptions_New();
    options.setPageOrientation(PageOrientation.PORTRAIT);
    options.setZoom(50);
    options.setPageSize(PageSize.A4);

    // options.setHeight(100);
    // options.setWidth(100);
    converter.convert(convertedFile, options);

Could you please share the login, or raise these details as a ticket on the GroupDocs support forum?
image.png (89.1 KB)

However, when we convert the same file through the online converter, it is converted properly without any issue.

Please let us know whether any other converter options need to be set to convert without the overlapping issue.

Hi,
I’m currently investigating this bug.
Ticket CONVERSIONJAVA-2358 has been created in our internal tracker.

Best regards

@suhail.thusu I found the cause of this bug and will update you when it is fixed.

Is there any update on this bug?

@suhail.thusu work is still in progress.
I will update you when I have new information.

@vsevolod.orefin can you please respond to this query?

Hi @suhail.thusu, it is still in progress. For now I can’t give an ETA.

@suhail.thusu we found that the text already overlaps in the source PDF.

That is not the case. I have attached the source PDF and the target DOCX: in the PDF the content is laid out properly, but in the target DOCX it is misaligned.

I have also attached sample screenshots of this.
output (3).pdf (928.3 KB)

doc1 (2).docx (777.0 KB)

pdf_has_bulletin_without_overlap.png (72.3 KB)

word_doc_after_conversion_bulletin_got_overlap.png (123.8 KB)

Hi @suhail.thusu ,
Which version of GroupDocs.Conversion are you using? I can’t reproduce this bug in the latest releases.

We are using version 24.1:

<dependency>
   <groupId>com.groupdocs</groupId>
   <artifactId>groupdocs-conversion</artifactId>
   <version>24.1</version>
</dependency>
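
Since the declared version and the version actually loaded at runtime can differ (for example when another dependency pulls in an older JAR), the following is a minimal sketch for checking what is on the classpath. It uses only the standard Java library. The class name `ClasspathVersionCheck` and the helper `describe` are hypothetical names for illustration, and it is an assumption that the GroupDocs JAR declares an `Implementation-Version` in its manifest; if it does not, the JAR location printed alongside still shows which file was loaded.

```java
import java.security.CodeSource;

public class ClasspathVersionCheck {

    /**
     * Reports the manifest version (if declared) and the location
     * from which the given class was loaded.
     */
    static String describe(String className) {
        Class<?> cls;
        try {
            // 'false' avoids running the class's static initializers.
            cls = Class.forName(className, false,
                    ClasspathVersionCheck.class.getClassLoader());
        } catch (ClassNotFoundException | LinkageError e) {
            return className + ": not on the classpath";
        }

        // Implementation-Version comes from the JAR manifest and may be absent.
        Package pkg = cls.getPackage();
        String version = (pkg == null) ? null : pkg.getImplementationVersion();

        // The code source is null for classes loaded by the bootstrap loader.
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        String location = (source == null || source.getLocation() == null)
                ? "unknown location"
                : source.getLocation().toString();

        return className + ": version "
                + (version == null ? "not declared in manifest" : version)
                + ", loaded from " + location;
    }

    public static void main(String[] args) {
        // Class name taken from the snippet earlier in this thread.
        System.out.println(describe("com.groupdocs.conversion.Converter"));
    }
}
```

Running `main` inside the application that performs the conversion should show whether the 24.1 JAR is really the one in use.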

@suhail.thusu, I’m investigating this issue.