Hi guys, I am evaluating the Viewer and have a temporary license.
I am starting with a Java application and using this Maven dependency:
<dependency>
    <groupId>com.groupdocs</groupId>
    <artifactId>groupdocs-viewer</artifactId>
    <version>25.12</version>
</dependency>
I am testing PDFs and generating single-file HTML output (one HTML file per page). These are "self-contained" files (everything is embedded in the HTML). I am also using high image quality to make sure the output looks like the original.
I have noticed that this is quite slow, even on a relatively good machine:
file-sample_150kB.pdf - 5.794 sec
tracemonkey.pdf - 63.05 sec
file-example_PDF_500_kB.pdf - 4.364 sec
file-example_PDF_500_kB.pdf (458.5 KB)
file-sample_150kB.pdf (139.4 KB)
Those files are not very big, but rendering takes a relatively long time. I am afraid that it will not perform well with really big files in production use cases… Could I improve it somehow?
compressed.tracemonkey-pldi-09.pdf (992.5 KB)
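For reference, timings like the ones above can be collected with a small harness along these lines. This is only a sketch: RenderTimer, timeSeconds and placeholderRender are hypothetical names that are not part of GroupDocs.Viewer, and the placeholder task stands in for the real rendering call. Timing the same task twice in one JVM helps separate one-time start-up cost (class loading, JIT warm-up) from the steady-state cost per document.

```java
public class RenderTimer {

    /** Runs the task once and returns the elapsed wall-clock time in seconds. */
    public static double timeSeconds(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000_000.0;
    }

    /** Hypothetical stand-in for a rendering call; it only sleeps for the given time. */
    public static Runnable placeholderRender(long millis) {
        return () -> {
            try {
                Thread.sleep(millis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag so callers can still see it
                Thread.currentThread().interrupt();
            }
        };
    }

    public static void main(String[] args) {
        // Replace the placeholder with the real rendering call for one document
        Runnable render = placeholderRender(50);

        // First run in a fresh JVM includes one-time start-up cost
        double first = timeSeconds(render);
        // Second run is closer to the steady-state cost
        double second = timeSeconds(render);

        System.out.printf("first run:  %.3f sec%n", first);
        System.out.printf("second run: %.3f sec%n", second);
    }
}
```

If the second run is much faster than the first, a large part of the measured time is start-up overhead rather than the cost of the document itself.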
Hello @anonymousP ,
Thank you for your interest in our product.
First of all, we would like to note that the performance of the Viewer when rendering documents to HTML is directly related to the content and structure of the input document. We are continuously working on improving the overall performance of our product.
To better assist you, could you please share the sample code you are using to convert PDF to HTML? We will analyze it along with the files you provided and do our best to offer recommendations for performance optimization, if possible.
We look forward to your response.
Sure,
here is my code:
// One output folder per input file: ./output/<file name up to the first dot>
Path outputFolder = Path.of("").toAbsolutePath()
        .resolve("output")
        .resolve(file.getName().split("\\.")[0]);

// try-with-resources closes both the stream and the viewer after rendering
try (InputStream stream = new FileInputStream(file);
     Viewer viewer = new Viewer(stream)) {
    HtmlViewOptions viewOptions = HtmlViewOptions.forEmbeddedResources(outputFolder.resolve("page_{0}.html"));

    // PdfOptions are only taken into account when the input file is a PDF
    PdfOptions pdfOptions = new PdfOptions();
    pdfOptions.setImageQuality(ImageQuality.HIGH);
    viewOptions.setPdfOptions(pdfOptions);

    // Generic options
    viewOptions.setMinify(false); // minification can break some layout fidelity (e.g. whitespace-sensitive glyph alignment)
    viewOptions.setForPrinting(false); // prevents layout changes
    viewOptions.setRemoveJavaScript(false);

    viewer.view(viewOptions);
}
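A side note on the output folder name, unrelated to speed: split("\\.")[0] keeps only the part of the file name before the first dot, so a name such as compressed.tracemonkey-pldi-09.pdf maps to the folder "compressed". A minimal sketch of a helper that strips only the last extension is shown here; OutputNames and baseName are hypothetical names used for illustration.

```java
public class OutputNames {

    /** Returns the file name without its last extension, e.g. "a.b.pdf" becomes "a.b". */
    public static String baseName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        // dot <= 0 covers names without an extension and dot-files such as ".gitignore"
        return dot > 0 ? fileName.substring(0, dot) : fileName;
    }

    public static void main(String[] args) {
        String name = "compressed.tracemonkey-pldi-09.pdf";
        System.out.println(name.split("\\.")[0]); // prints "compressed"
        System.out.println(baseName(name));       // prints "compressed.tracemonkey-pldi-09"
    }
}
```

With a helper like this, two input files that share the same prefix before the first dot would no longer end up in the same output folder.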
Hello @anonymousP ,
Thank you for the information provided.
We will get back to you with the results once our investigation is complete.