PDF to Image conversion process optimization

Hi Team
We are trying to build a PDF to image (JPEG) converter using the GroupDocs.Conversion API for Java.

The pdf has multiple pages and each page should be converted to an image.
It seems the converter.convert() method only converts one page at a time, so converting every page to an image requires a for loop. I believe this also increases the conversion time, because on every iteration the whole PDF is loaded into the converter and only a single page is converted. This is far from ideal.

Is there any way to optimise the conversion process for this case? A faster conversion time is important for our use case.

If possible, please share a code snippet with the optimization. Is there any way to use streams, or anything else, to improve the performance?

Thanks

@shivang2k

Please try this Convert to Image with Advanced Options code.

@Atir_Tahir
I tried this, but it does not solve the problem.

I need something where a single call to converter.convert() converts all the pages of a PDF to images.

@shivang2k

Do you want to convert all pages in a PDF to a single image? If not, how is this solution not working for you?

It doesn’t even use a loop. Do you get any exception?

No, I do not want a single image.

The given solution only converts the first page of the PDF to an image. I want all the pages to be converted to separate image files.
This is the example code. I want to convert the first 3 pages of a PDF to 3 separate image files:

public static void run() {
    String outputFileTemplate = Constants.getConvertedPath("ConvertToImageWithAdvancedOptions-converted-page-%s.png");

    try {
        FileOutputStream getPageStream = new FileOutputStream(String.format(outputFileTemplate, 1));
        Converter converter = new Converter(Constants.SAMPLE_PDF);
        ImageConvertOptions options = new ImageConvertOptions();
        options.setFormat(ImageFileType.Png);
        options.setPageNumber(1);
        options.setPagesCount(3);
        converter.convert(getPageStream, options);
    } catch (IOException e) {
        System.out.println(e.getMessage());
    }


    System.out.print("\nDocument converted successfully. \nCheck output in " + Constants.getConvertedPath(""));
}

According to you, this should work, but I get the following exception:

Exception in thread "main" class com.groupdocs.conversion.exceptions.GroupDocsConversionException: Saving complete multi page document to image is not supported. Please save by page. com.groupdocs.conversion.Converter.convert(Unknown Source)

I want to save page by page, but I am not sure how to do that.

@shivang2k

We have reproduced this issue at our end and logged it in our internal issue tracking system with ticket ID CONVERSIONJAVA-1564. It will now be investigated further, and you’ll be notified of any update.

@shivang2k

Page-by-page conversion will be fixed in GroupDocs.Conversion for Java 22.7. We’ll notify you as soon as the release is available.

Sure, thanks.

What is the expected release date for this new version?

Can you provide me with a preview build of the new version (22.7) if it includes the page-by-page conversion fix?

@shivang2k

API version 22.7 is expected to be released in July. We’ll notify you as soon as we have any further information.

Hi @Atir_Tahir
Any updates on this? We have been stuck on this problem for quite a long time. Also, could you help me get some benchmarks for the time taken by PDF to image conversions?

Thanks

@shivang2k

CONVERSIONJAVA-1564 is planned to be fixed in API version 22.7, which is expected to be released this month.

Please note that the conversion time varies from document to document. For example, given two PDF files of the same size (e.g. 20 MB), the API may take 30 seconds to convert the first and 1 minute to convert the second.
We do not have specific recommendations, because memory/resource usage and conversion time depend on multiple factors, including:

  • Type of the files processed
  • File’s content
  • Number of files processed simultaneously

The first PDF may be a simple text-based file, while the second contains graphs, tables, clip art or images. Therefore, conversion time varies even for PDF (or any other) files of the same size.
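
Since no general benchmark applies, timing your own representative documents is probably the most reliable guide. Below is a minimal sketch of a timing helper that uses only the Java standard library. The class name ConversionTimer, the method timeRuns and the sleeping placeholder workload are hypothetical and not part of the GroupDocs API; the placeholder would be replaced by the real conversion call for one document.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of GroupDocs: times repeated runs of any task.
class ConversionTimer {

    /** Runs the task the given number of times and returns each elapsed time in milliseconds. */
    static List<Long> timeRuns(Runnable task, int runs) {
        List<Long> elapsedMillis = new ArrayList<>();
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            elapsedMillis.add(TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start));
        }
        return elapsedMillis;
    }

    public static void main(String[] args) {
        // Placeholder workload (assumption): swap in the actual conversion of one document.
        Runnable placeholderConversion = () -> {
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        List<Long> timings = timeRuns(placeholderConversion, 3);
        System.out.println("Elapsed ms per run: " + timings);
    }
}
```

Running the same document several times helps separate one-off start-up cost (class loading, JIT warm-up) from the steady-state conversion time.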

The issues you have found earlier (filed as CONVERSIONJAVA-1564) have been fixed in this update. This message was posted using Bugs notification tool by Atir_Tahir

Great, thanks.
@Atir_Tahir can you please provide sample code for the conversion of a multi-page document to multiple images? I will be using the latest GroupDocs version (22.8) now.

Thanks

@shivang2k

Please try the following code:

Converter converter = new Converter("input.pdf");
ConvertOptions convertOptions = new ImageConvertOptions();
convertOptions.setFormat(ImageFileType.Png);
// The SavePageStream callback receives the page number i and returns
// a separate output stream for each page.
converter.convert((SavePageStream) i -> {
    try {
        return new FileOutputStream("output-page" + i + ".png");
    } catch (FileNotFoundException e) {
        throw new RuntimeException(e);
    }
}, convertOptions);
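
The callback above opens one file per page in the working directory. If the pages should go into a dedicated folder with sortable names, the stream-opening logic could be pulled into a small factory, as in the rough sketch below. It uses only the Java standard library: java.util.function.IntFunction<OutputStream> stands in for SavePageStream here (an assumption that the callback takes an int page number, as the lambda above suggests), and the names PageStreams and pngPerPage are hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;
import java.util.function.IntFunction;

// Hypothetical helper, not part of GroupDocs: builds one output stream per page.
class PageStreams {

    /** Returns a factory that opens "<prefix>-NNN.png" inside outputDir for a given page number. */
    static IntFunction<OutputStream> pngPerPage(Path outputDir, String prefix) {
        return page -> {
            try {
                Files.createDirectories(outputDir); // no-op if the directory already exists
                String name = String.format(Locale.ROOT, "%s-%03d.png", prefix, page);
                return Files.newOutputStream(outputDir.resolve(name));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        };
    }

    public static void main(String[] args) throws IOException {
        IntFunction<OutputStream> factory = pngPerPage(Paths.get("converted"), "output-page");
        // Stand-in for the converter: request one stream per page and close it after writing.
        for (int page = 1; page <= 3; page++) {
            try (OutputStream out = factory.apply(page)) {
                out.write(new byte[0]); // the real image bytes would come from the converter
            }
        }
    }
}
```

With the real library, the body of the lambda passed to converter.convert() could delegate to such a factory, for example by returning factory.apply(i). Whether the converter closes the returned streams itself is not stated in this thread, so that is worth checking.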