HTML-based rendering mode and formatting issues

Hi,



I’ve been using the HTML-based rendering mode (available since GroupDocs Viewer 2.0 for .NET) for its improved performance, and I’ve noticed some fairly obvious formatting issues in previews of doc/docx/pdf documents. The most common side effect is missing whitespace, followed by irregular capitalization. I tried to reproduce the issue in the sandbox provided to me on the GroupDocs site, and the document looks fine there. However, I’m pretty sure the GroupDocs sandbox uses the image-based rendering mode (which is the default behaviour, if I’m not mistaken).



For example, try previewing the following file with UseHTMLBasedEngine(true) and see whether you notice the issues I’m referring to. FYI, I’ve tried Chrome, Firefox and the latest IE, with more or less the same results:






I’ll attach a couple of screenshots of what I’m seeing, just in case. In the second screenshot, some data even appears to be missing from the table (e.g. the last column should contain values like 2000, but the preview shows 000).



P.S. It looks like I don’t have permission to add attachments after all, so I’ve uploaded the screenshots here:



http://tinypic.com/r/25ppqna/8

http://tinypic.com/r/rvd1mx/8



regards,

Greg

Furthermore, when previewing some .doc/.docx files, the fonts and the text positioning/alignment shown in the preview differ from those in the original document.


I also have a couple of documents that exhibit peculiar behaviour when rendered:

* One .doc file produces a preview in which some paragraphs are rendered one letter at a time (instead of one word at a time), so the search functionality doesn’t behave as expected. If I save this document as a .pdf and preview that instead, the problem goes away.

* One .docx file is rendered entirely as an image (.svg, I think) in the preview, even though I have UseHTMLBasedEngine set to ‘true’. This means I’m unable to highlight or select snippets of text in that document.

I’m not sure what is so special about these two documents (both come from a set of randomly googled documents of particular file types) that makes them behave this way. I’ll be happy to share them.

Hello Greg,

We are sorry to hear that you are having this issue. You’re absolutely right: the HTML-based rendering mode in GroupDocs.Viewer is not perfect yet. We can reproduce all the distortions that you described. That’s why we have kept the image-based rendering mode, and why it is enabled by default. Your guess is correct: we use image-based rendering in GroupDocs.Viewer for Cloud (which is what was provided to you on the GroupDocs Web site).

The HTML-based rendering mode is a fairly new feature and is still under development; our developers are working on it. In cases like this, when a document is displayed incorrectly, you should use image-based rendering. Only rasterization can display an exact copy of the original document in a browser.

Thank you for providing the document “n1007-05-2006.pdf”. We will send it to our developers as a sample that is displayed incorrectly, so that they can fix the problem. We would also be very grateful if you could share the other distorted documents that you mentioned in your second post, if that is possible.

Thanks, Denis.

The other two documents I referred to in my last post can be found here:
The first one seems to get previewed as a single image, even though UseHTMLBasedEngine(true) is used. You can tell because you’re not able to highlight or select individual words in it.

The second one has rendering issues (e.g. each letter in the first heading is rendered as a separate word, so searching for ‘Richard’ won’t match the first occurrence of that word).
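
To illustrate the second problem, here is a rough sketch in Python. It is not GroupDocs code: the markup, the class and the function names are all made up for the example, and I’m only assuming that the viewer’s search matches a term against individual text elements, since I haven’t seen how it is actually implemented.

```python
from html.parser import HTMLParser


class SpanTextCollector(HTMLParser):
    """Collects the text of each <span> as a separate searchable unit."""

    def __init__(self):
        super().__init__()
        self.units = []
        self._in_span = False

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            self._in_span = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_span = False

    def handle_data(self, data):
        # Only keep non-empty text that sits inside a <span>.
        if self._in_span and data.strip():
            self.units.append(data.strip())


def span_units(html):
    """Return the list of per-span text units found in the markup."""
    parser = SpanTextCollector()
    parser.feed(html)
    parser.close()
    return parser.units


def unit_level_match(html, term):
    """True if any single span's text equals the search term."""
    return term in span_units(html)


# Hypothetical markup: one span per word vs. one span per letter.
per_word = "<p><span>Richard</span> <span>Example</span></p>"
per_letter = "<p>" + "".join(f"<span>{c}</span>" for c in "Richard") + "</p>"

print(unit_level_match(per_word, "Richard"))    # word-level spans
print(unit_level_match(per_letter, "Richard"))  # letter-level spans
print("".join(span_units(per_letter)))          # letters joined back together
```

With one span per letter, no single unit equals the search term, even though joining the units recovers the word. If the search really works per element, that would explain why the first ‘Richard’ is not found.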
