Concatenate Multiple HTML pages from ViewerHtmlHandler

Greetings!

I have just started trying your software to render Office documents to HTML. Using the ViewGenerator (Java) example provided, I noticed that Word documents get converted into multiple separate HTML files.

I was wondering whether there is already an out-of-the-box option to generate a single HTML file containing all the pages of the source document.

As a workaround I could load each page in a separate frame, but I would prefer to avoid that.

P.S.: Please accept my apologies if this question has already been answered in other documents; I am still learning how your great product works.

@benoitl,

Thanks for taking interest in GroupDocs.Viewer.

GroupDocs.Viewer doesn’t provide a built-in function that returns a single HTML file containing the HTML content of all pages of the source document. However, concatenating the HTML content of all the pages into a single HTML file is a simple way to achieve the required functionality. For your reference, the following code snippet shows how to do that.

// Set configuration
ViewerConfig config = new ViewerConfig();
config.setStoragePath("<your storage folder's path>");
config.setCachePath("<your cache folder's path>");
config.setUseCache(true);

// Create ViewerHandler
ViewerHtmlHandler _htmlHandler = new ViewerHtmlHandler(config);
String guid = "calibre.docx";

// Create HtmlOptions
HtmlOptions options = new HtmlOptions();
options.setResourcesEmbedded(true);

// Get Html pages
List<PageHtml> pages = (List<PageHtml>) _htmlHandler.getPages(guid, options);

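// Concatenate the content of all pages into a single buffer
// (StringBuilder avoids copying the whole string on every iteration)
StringBuilder outputHtml = new StringBuilder();
for (PageHtml page : pages) {
	outputHtml.append(page.getHtmlContent());
}

// Save the concatenated content as an HTML file;
// try-with-resources closes the writer even if the write fails
File file = new File("output_all_pages.html");
try (PrintWriter writer = new PrintWriter(file, "UTF-8")) {
	writer.write(outputHtml.toString());
}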

Hope it helps.

Thank you very much. However, this will duplicate the HTML headers, and I am not sure the result would work correctly.

I will give it a try nevertheless.
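
One possible way around the duplicated headers is to keep only the body of each page and wrap everything in a single document. Below is a minimal sketch of that idea using only the JDK. HtmlPageMerger and its merge method are hypothetical names and not part of GroupDocs.Viewer. The regular expressions assume each page is a reasonably well-formed document with one <head> and one <body>; a real HTML parser would be more robust, and CSS class names that collide between pages are not handled here.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: merges several complete HTML documents into one. */
final class HtmlPageMerger {

    // Capture the content between the opening and closing <head> / <body> tags.
    // The (?:\s[^>]*)? part allows attributes but prevents matching e.g. <header>.
    private static final Pattern HEAD = Pattern.compile(
            "<head(?:\\s[^>]*)?>(.*?)</head\\s*>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    private static final Pattern BODY = Pattern.compile(
            "<body(?:\\s[^>]*)?>(.*?)</body\\s*>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    static String merge(List<String> pages) {
        // Head content (styles, meta tags) is kept once per distinct value, in page order.
        Set<String> heads = new LinkedHashSet<>();
        StringBuilder body = new StringBuilder();

        for (int i = 0; i < pages.size(); i++) {
            String page = pages.get(i);
            heads.add(firstGroup(HEAD, page, ""));
            // If a page has no <body> tag, treat the whole string as a fragment.
            String content = firstGroup(BODY, page, page);
            body.append("<div class=\"page\" id=\"page-").append(i + 1).append("\">")
                .append(content)
                .append("</div>\n");
        }

        StringBuilder out = new StringBuilder("<!DOCTYPE html>\n<html>\n<head>");
        for (String head : heads) {
            out.append(head);
        }
        out.append("</head>\n<body>\n").append(body).append("</body>\n</html>\n");
        return out.toString();
    }

    // Returns the first capture group of the first match, or the fallback if nothing matches.
    private static String firstGroup(Pattern pattern, String input, String fallback) {
        Matcher m = pattern.matcher(input);
        return m.find() ? m.group(1) : fallback;
    }

    public static void main(String[] args) {
        List<String> pages = Arrays.asList(
                "<html><head><style>p{color:red}</style></head><body><p>Page one</p></body></html>",
                "<html><head><style>p{color:red}</style></head><body><p>Page two</p></body></html>");
        System.out.println(merge(pages));
    }
}
```

The strings returned by page.getHtmlContent() in the snippet above could be collected into a list and passed to merge, and the result written to the output file in place of the plain concatenation.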

On another subject, I am trying to find where to download the Modern UI Servlet. Is it included in the same GroupDocs.Viewer.jar? (I was expecting a .war file with the various JSPs instead.)

I apologize, I have found where to get it.

Again, I appreciate your support.

@benoitl,

Thanks for your response.

Sure. You can check it at your end and get back to us in case of any issue.

Please note that GroupDocs.Viewer is a back-end API that renders document pages either as HTML pages or as images. The API is available as a JAR file. GroupDocs.Viewer-for-Java-App, on the other hand, is a servlet-based open source document viewer application that uses GroupDocs.Viewer for document rendering at the back end, and it is not included in GroupDocs.Viewer.jar. We also recommend having a look at the API documentation.