Hi,
I have a problem with text recognition in specific PDF documents. When I try to select or search for words in these documents, nothing is highlighted in the preview. However, when I copy and paste anything from them, all I get is the same random letters instead of the actual words. It is worth mentioning that the text layer in each document was previously extracted by an OCR application based on the ABBYY FineReader Engine (FREngine). What is the cause of the problem, and can you suggest a solution? I'm using GroupDocs.Viewer 2.15.1.0 and 3.7.0.0. A sample document is attached.
Thanks,
Witold
Hi Witold,
Unfortunately, this error occurs with either image-based or HTML-based rendering. Also, as you suggested, I tested with GroupDocs.Viewer for .NET 16, but the result was the same.
Regards,
Witold
Hi Witold,
Hi,
I would like to ask whether you have figured out what causes the problems with the preview of these specific documents and how to solve them.
Regards,
Witold
Thanks for coming back to us.
The quality of the PDF document may affect its rendering and cause the problem. However, the issue is still under investigation, and we cannot provide any further information until we have the results. Once we have an update, we will notify you here.
Warm Regards
Hi Witold,
Hi,
Thank you for your detailed response. However, I would like to ask another question. If this document contains no text but only a raster image, why am I able to copy, mark, and search for actual words in it when it is opened in Adobe Reader and similar software?
Regards,
Witold
Hi Witold,
Hi,
Thank you for your quick response. I researched what you said and, with all due respect, I don't think that the standard version of Adobe Acrobat Reader uses OCR technology to provide text selection in documents whose text layer was previously extracted by third-party software. OCR functionality is only available in Adobe Acrobat Reader Pro and works as a separate mechanism, but, as I mentioned before, I am able to select and copy text in the standard version of Adobe Acrobat Reader. In my opinion, the text layer is stored not as a raster image but as text. Do you have any other ideas about the cause of the problem?
Regards,
Witold
Hi Witold,