Hello @anonymousP,
Thank you for your patience.
After investigating on our side, we have prepared several recommendations that may improve rendering performance when converting your PDF document to HTML.
1. Use HtmlViewOptions.forExternalResources(...) instead of HtmlViewOptions.forEmbeddedResources(...)
Whenever possible, we recommend using HtmlViewOptions.forExternalResources(...) to store page resources (CSS, fonts, images) as separate files instead of embedding them directly into the HTML.
This approach often improves rendering and page loading performance, especially for large documents. During our testing with your file, this change improved rendering performance by approximately 25%.
Additionally, using forExternalResources requires significantly less Java heap space.
2. Use a lower image quality setting
You may also consider using a lower image quality value, such as:
ImageQuality.LOW
We understand that this may affect image quality, but if it is acceptable for your use case, it can reduce rendering time by up to 5%.
3. Enable PdfOptions.setWrapImagesInSvg(true)
This option wraps raster images from the PDF page into an SVG container, which helps preserve the precise positioning of elements. It is also particularly useful for scanned documents, technical drawings, and PDFs containing a large number of graphical elements. Using this option may improve rendering performance by up to 10%.
Based on these recommendations, we updated your code example and attached the modified version below. Please try using it and let us know your feedback.
// try-with-resources closes both the input stream and the viewer when rendering finishes.
try (InputStream stream = new FileInputStream(file);
     Viewer viewer = new Viewer(stream)) {
    // Output folder: <working directory>/output/<file name up to the first dot>
    Path outputFolder = Path.of("").toAbsolutePath()
            .resolve("output")
            .resolve(file.getName().split("\\.")[0]);
    String pageExternalFilePathFormat = outputFolder.resolve("page_{0}.html").toString();
    String resourceFilePathFormat = outputFolder.resolve("page_{0}_{1}").toString();
    String resourceUrlFormat = outputFolder.resolve("page_{0}_{1}").toString();
    // Store CSS, fonts, and images as separate files instead of embedding them in the HTML.
    HtmlViewOptions viewOptions = HtmlViewOptions.forExternalResources(
            pageExternalFilePathFormat, resourceFilePathFormat, resourceUrlFormat);
    // PdfOptions are only taken into account when the input file is a PDF.
    PdfOptions pdfOptions = new PdfOptions();
    pdfOptions.setImageQuality(ImageQuality.LOW);
    pdfOptions.setWrapImagesInSvg(true);
    viewOptions.setPdfOptions(pdfOptions);
    // Generic options
    viewOptions.setMinify(false); // Minification can break some layout fidelity (e.g., whitespace-sensitive glyph alignments).
    viewOptions.setForPrinting(false); // Prevents layout changes
    viewOptions.setRemoveJavaScript(false);
    viewer.view(viewOptions);
}
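To make the three format strings in the code above more concrete, here is a minimal sketch of how such placeholders expand into file names. It assumes that {0} stands for the page number and {1} for the resource file name. The class and method names (PathFormatDemo, expand) and the sample folder are hypothetical and only for illustration. They are not part of the GroupDocs.Viewer API, which performs this substitution internally.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class PathFormatDemo {
    // Hypothetical helper: substitutes {0} with the page number and {1} with the resource name.
    static String expand(String format, int pageNumber, String resourceName) {
        return format
                .replace("{0}", Integer.toString(pageNumber))
                .replace("{1}", resourceName);
    }

    public static void main(String[] args) {
        // Illustrative output folder; the real code derives it from the input file name.
        Path outputFolder = Paths.get("output", "sample");
        String pageFormat = outputFolder.resolve("page_{0}.html").toString();
        String resourceFormat = outputFolder.resolve("page_{0}_{1}").toString();

        // Print the file paths that page 13 and one of its resources would map to.
        System.out.println(expand(pageFormat, 13, ""));
        System.out.println(expand(resourceFormat, 13, "styles.css"));
    }
}
```

With these assumptions, expand("page_{0}_{1}", 13, "styles.css") yields page_13_styles.css, so each page and each of its resources ends up in a separate, predictably named file.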
Finally, during our investigation and testing with your document, we noticed that page 13 requires significantly more processing time than the other pages. Rendering this page alone takes more than 50% of the total rendering time for the entire document.
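If you would like to reproduce this kind of per-page measurement on your side, one possible approach is to render pages one at a time and record the elapsed time for each. The sketch below is library-agnostic: PageTimer, timePages, and shareOfTotal are hypothetical names, and the renderPage callback is a stand-in for whatever call renders a single page in your setup. The main method uses a simulated workload, not real rendering.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntConsumer;

class PageTimer {
    // Runs the callback once per page (1-based) and records the elapsed nanoseconds.
    static Map<Integer, Long> timePages(int pageCount, IntConsumer renderPage) {
        Map<Integer, Long> nanosByPage = new LinkedHashMap<>();
        for (int page = 1; page <= pageCount; page++) {
            long start = System.nanoTime();
            renderPage.accept(page);
            nanosByPage.put(page, System.nanoTime() - start);
        }
        return nanosByPage;
    }

    // Returns the fraction (0.0 to 1.0) of the total time spent on the given page.
    static double shareOfTotal(Map<Integer, Long> nanosByPage, int page) {
        long total = 0;
        for (long nanos : nanosByPage.values()) {
            total += nanos;
        }
        if (total == 0) {
            return 0.0;
        }
        return nanosByPage.getOrDefault(page, 0L) / (double) total;
    }

    public static void main(String[] args) {
        // Simulated workload: page 13 is made artificially slow to mimic a heavy page.
        Map<Integer, Long> timings = timePages(15, page -> {
            try {
                Thread.sleep(page == 13 ? 60 : 2);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.printf("Page 13 share of total time: %.0f%%%n", 100 * shareOfTotal(timings, 13));
    }
}
```

In a real measurement, the lambda would be replaced with the call that renders only the given page, and it is worth running the loop more than once, since the first pass usually includes JVM warm-up and library initialization costs.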
Therefore, we have scheduled a more detailed investigation of this case. If possible, we will try to further improve PDF-to-HTML rendering performance in future versions of our product.