Document viewer API implementation in Java

Hello,
I am migrating from version 19.11 to 20.1. As pointed out here, all the classes I have been using are now deprecated. I would like to know/confirm the following:

  • I was using a single ViewerHtmlHandler object for all rendering tasks in all threads. I now need to create a new Viewer instance every time instead, right?
  • Do these methods clearTempFiles and clearCache still exist?
  • Does this exception FileTypeNotSupportedException or any equivalent exist?

Any help is appreciated :)

Jorge

@jorgeeflorez,

We are investigating these points. Your investigation ticket ID is VIEWERJAVA-2174. You'll be notified as soon as there is any further update.

@jorgeeflorez,

Yes, from now on you need one Viewer instance per file. Please don't forget to call the close() method on the Viewer object to release its resources.
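
A minimal sketch of the "one instance per file, always closed" pattern in a multi-threaded setup. To keep it runnable without the library, StubViewer is a hypothetical stand-in for the real Viewer class (which, as code later in this thread shows, can be used in try-with-resources); PerFileViewerSketch and renderAll are illustrative names, not part of the API.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PerFileViewerSketch {

    // Hypothetical stand-in for the real Viewer class, so the sketch runs without the library.
    static class StubViewer implements AutoCloseable {
        static final AtomicInteger OPENED = new AtomicInteger();
        static final AtomicInteger CLOSED = new AtomicInteger();
        private final String path;

        StubViewer(String path) {
            this.path = path;
            OPENED.incrementAndGet();
        }

        void view() {
            // The real Viewer would render the file at 'path' here.
        }

        @Override
        public void close() {
            CLOSED.incrementAndGet();
        }
    }

    // Renders each file on a worker thread, creating one viewer instance per file.
    static void renderAll(List<String> paths, int threads) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (String path : paths) {
            pool.submit(() -> {
                // try-with-resources closes the viewer even if view() throws.
                try (StubViewer viewer = new StubViewer(path)) {
                    viewer.view();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
    }
}
```

Nothing is shared between tasks here, so no synchronization around the viewer is needed; each task owns its instance from creation to close().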

  • clearTempFiles - this method will be added in an upcoming release of the API; we'll notify you when the release is available.
  • clearCache - you should implement this method yourself, according to your needs. To use the caching feature you can start with File Cache and extend it with a clearCache method that handles cache removal.
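
As a rough illustration of the clearCache point, here is a minimal sketch that empties a cache directory. It assumes the cache keeps its entries as files under a single directory; the thread does not describe how File Cache lays out its files, so the directory, the class name CacheCleaner, and the method signature are all hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class CacheCleaner {

    // Deletes everything inside cacheDir but keeps the directory itself.
    // Returns the number of files and subdirectories removed.
    public static int clearCache(Path cacheDir) throws IOException {
        if (!Files.isDirectory(cacheDir)) {
            return 0;
        }
        List<Path> entries;
        try (Stream<Path> walk = Files.walk(cacheDir)) {
            // Reverse order lists children before their parent directories.
            entries = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        int removed = 0;
        for (Path entry : entries) {
            if (!entry.equals(cacheDir)) {
                Files.delete(entry);
                removed++;
            }
        }
        return removed;
    }
}
```

This should only be called when no rendering task is reading or writing the cache, since files are deleted without any locking.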

The API will throw a generic GroupDocsViewer exception. All supported file types can be listed with the FileType.getSupportedFileTypes() method, so you can predict whether a file is supported (provided, of course, that you know what your file type is).
Please let us know if anything is unclear.
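
A small sketch of such a pre-check by file extension. The set of extensions is hard-coded here as a placeholder; in practice it would be built from FileType.getSupportedFileTypes(), whose element accessors are not shown in this thread. SupportedTypeCheck, extensionOf and isSupported are illustrative names.

```java
import java.util.Locale;
import java.util.Set;

public class SupportedTypeCheck {

    // Returns the lower-cased extension including the dot, or "" if there is none.
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        int sep = Math.max(fileName.lastIndexOf('/'), fileName.lastIndexOf('\\'));
        // A dot inside a directory name, or a trailing dot, is not an extension.
        if (dot <= sep || dot == fileName.length() - 1) {
            return "";
        }
        return fileName.substring(dot).toLowerCase(Locale.ROOT);
    }

    // True if the file's extension is in the given set (entries like ".pdf", lower case).
    static boolean isSupported(String fileName, Set<String> supportedExtensions) {
        return supportedExtensions.contains(extensionOf(fileName));
    }
}
```

An extension check is only a guess about the content, so the rendering call should still be wrapped in a catch for the generic exception.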

@atirtahir3, thank you for your reply.

Should I expect many temporary files or large space consumption in the System.getProperty("java.io.tmpdir") directory?

@atirtahir3,
In the previous version I set the encoding on each of the format-specific options objects:

    HtmlOptions options = new HtmlOptions();
    options.setEmbedResources(true);
    options.getCellsOptions().setEncoding(encoding);
    options.getWordsOptions().setEncoding(encoding);
    options.getEmailOptions().setEncoding(encoding);

Is this still how it is done?

After calling view(), how can I get the generated HTML? Do I have to read the generated file? Does PageHtml no longer exist?

@jorgeeflorez,

No, you can call ViewerHtmlHandler.clearTempFiles; this method is still available and works as before. Calling it from time to time will reduce disk space consumption.
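
To decide how often a cleanup is worthwhile, the space used under the temp directory can be measured with plain JDK calls. This is only a sketch: TempDirUsage and sizeOf are illustrative names, and it counts every file under java.io.tmpdir, not only those written by the viewer.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class TempDirUsage {

    // Sums the sizes of all regular files under dir, in bytes.
    public static long sizeOf(Path dir) throws IOException {
        try (Stream<Path> walk = Files.walk(dir)) {
            return walk.filter(Files::isRegularFile)
                    .mapToLong(p -> {
                        try {
                            return Files.size(p);
                        } catch (IOException e) {
                            return 0L; // file disappeared or is unreadable; skip it
                        }
                    })
                    .sum();
        }
    }

    public static void main(String[] args) throws IOException {
        // Walking a shared temp directory may fail on entries the process cannot read.
        Path tmp = Paths.get(System.getProperty("java.io.tmpdir"));
        System.out.println(tmp + " uses " + sizeOf(tmp) + " bytes");
    }
}
```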

You can still set the encoding. The setting has been moved to the LoadOptions class. See the example below:

LoadOptions loadOptions = new LoadOptions();
loadOptions.setCharset(Charset.forName("UTF-8"));

// try-with-resources closes the viewer even if view() throws
try (Viewer viewer = new Viewer("sample.csv", loadOptions)) {
    viewer.view(HtmlViewOptions.forEmbeddedResources());
}

That depends on your use case. By default, pages are saved to disk, but you can also write them to a stream, as shown below:

Viewer viewer = new Viewer("sample.docx");
viewer.view(HtmlViewOptions.forEmbeddedResources(new PageStreamFactory() {

    @Override
    public OutputStream createPageStream(int pageNumber) {
        return new ByteArrayOutputStream();
    }

    @Override
    public void closePageStream(int pageNumber, OutputStream pageStream) {
        ByteArrayOutputStream outputStream = (ByteArrayOutputStream)pageStream;
        byte[] bytes = outputStream.toByteArray();

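        // Decode explicitly rather than with the platform default charset
        // (assumes UTF-8 output; needs java.nio.charset.StandardCharsets).
        String pageHtml = new String(bytes, StandardCharsets.UTF_8);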
        System.out.println("Page " + pageNumber);
        System.out.println("HTML " + pageHtml);
    }

}));     
viewer.close();

Yes, the PageHtml class no longer exists. It was just a wrapper around the output result, so it is quite simple to create one based on the example above.
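
A possible shape for such a wrapper, as a sketch only. The nested PageHtml class here is a hypothetical replacement, not the removed library class. The two stream methods mirror the signatures used in the example above; to plug this into the API the class would additionally declare that it implements PageStreamFactory, which is left out so the sketch compiles without the library. UTF-8 output is assumed.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class PageHtmlCollector {

    // Hypothetical replacement for the removed PageHtml wrapper.
    public static final class PageHtml {
        private final int pageNumber;
        private final String htmlContent;

        PageHtml(int pageNumber, String htmlContent) {
            this.pageNumber = pageNumber;
            this.htmlContent = htmlContent;
        }

        public int getPageNumber() { return pageNumber; }
        public String getHtmlContent() { return htmlContent; }
    }

    private final List<PageHtml> pages = new ArrayList<>();

    // Hands out an in-memory buffer for each page.
    public OutputStream createPageStream(int pageNumber) {
        return new ByteArrayOutputStream();
    }

    // Converts the finished buffer into a PageHtml and stores it.
    public void closePageStream(int pageNumber, OutputStream pageStream) {
        byte[] bytes = ((ByteArrayOutputStream) pageStream).toByteArray();
        pages.add(new PageHtml(pageNumber, new String(bytes, StandardCharsets.UTF_8)));
    }

    // Pages in the order their streams were closed.
    public List<PageHtml> getPages() {
        return pages;
    }
}
```

After view() returns, getPages() would hold one entry per rendered page, in the order the streams were closed.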

Thank you @atirtahir3 for your answers.
Another question: can I get the page width and height when getting the file info?
I am using the following code:

public static void main(String[] args) throws FileNotFoundException {
    File file = new File("C:\\Users\\Jorge Eduardo\\Downloads\\et.png");
    try (Viewer viewer = new Viewer(file.getAbsolutePath())) {
        ViewInfoOptions viewInfoOptions = ViewInfoOptions.forHtmlView();
        ViewInfo documentInfo = viewer.getViewInfo(viewInfoOptions);
        
        int pageCount = documentInfo.getPages().size();
        System.out.println("pageCount " + pageCount);
        System.out.println("pageWidth " + documentInfo.getPages().get(0).getWidth());
        System.out.println("pageHeight " + documentInfo.getPages().get(0).getHeight());
    }
}

It prints 0 for the width and height. I was expecting the image width and height.

@jorgeeflorez,

We’ll investigate this and update you.

@jorgeeflorez,

Please try creating the ViewInfoOptions with ViewInfoOptions.forPngView(); the ViewInfo will then contain the output page size.
When you pass ViewInfoOptions.forHtmlView(), there is no correct value we can return, because HTML pages do not have a fixed width and height.

ViewInfoOptions viewInfoOptions = ViewInfoOptions.forPngView(); 
ViewInfo documentInfo = viewer.getViewInfo(viewInfoOptions);

@atirtahir3 thank you.
If possible, I would like to know what was returned in version 19.11 (in the DocumentInfoContainer object). I was using the width and height of the first page to size the divs that contain the HTML of each rendered page. I guess this is different now, or I was using that information incorrectly. I am not sure how I will adjust this…

@jorgeeflorez,

We’ll investigate and update you.

Hi @atirtahir3,
I was wondering, is there any progress on this?

@jorgeeflorez,

This scenario is still under investigation. You’ll be notified about the outcomes soon.

@jorgeeflorez

Please try the code below:

// try-with-resources closes the viewer even if getViewInfo() throws
try (Viewer viewer = new Viewer("sample.pdf")) {
    ViewInfoOptions options = ViewInfoOptions.forPngView(false);
    ViewInfo viewInfo = viewer.getViewInfo(options);
    System.out.println(viewInfo.getPages().get(0).getWidth());
    System.out.println(viewInfo.getPages().get(0).getHeight());
}
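
For the div-sizing use case raised earlier in the thread, the page size obtained this way could be scaled to the container width. The following is a sketch under assumptions: it treats the width and height as integer pixel values (the thread does not state the unit or type), and PageDivSizer, fitToWidth and divStyle are illustrative names.

```java
public class PageDivSizer {

    // Returns {width, height} scaled to targetWidth, preserving the page's aspect ratio.
    static int[] fitToWidth(int pageWidth, int pageHeight, int targetWidth) {
        if (pageWidth <= 0 || pageHeight <= 0 || targetWidth <= 0) {
            throw new IllegalArgumentException("dimensions must be positive");
        }
        int height = (int) Math.round((double) pageHeight * targetWidth / pageWidth);
        return new int[] { targetWidth, height };
    }

    // Builds an inline CSS style for the div that wraps one rendered page.
    static String divStyle(int pageWidth, int pageHeight, int targetWidth) {
        int[] size = fitToWidth(pageWidth, pageHeight, targetWidth);
        return "width:" + size[0] + "px;height:" + size[1] + "px";
    }
}
```

The explicit check rejects the 0 values that the HTML view info returns, so a wrong options object shows up as an error instead of a zero-sized div.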