Free Support Forum - groupdocs.com

Scanned PDF file cannot be viewed as text after OCR conversion


#1

Hi,

I have a scanned image saved as a PDF, and I upload it to my application with the following steps:

1) Upload the scanned image.
2) The scanned image is processed with OCR and the document is saved in the database.
3) I preview the document in GroupDocs, but it is displayed as an image instead of text.

However, when I download the same document, it has been converted by OCR and I can search the text in other software such as Adobe Reader.

Please find the document attached below.
Please respond ASAP.



#2

Hello Sumit,

We’ve downloaded and investigated the document that you attached to the forum post. The “scannedfiles.zip” archive contains one document, “Scanned-OCRDoc(1).pdf” (62 067 bytes). This document doesn’t have a text layer - in Adobe Reader, Adobe Acrobat and all other PDF viewers that we’ve tried, it was not possible to select text, run a search and so on.

Please send us the document that you obtain after the OCR process, the one that has a text layer.
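
For a quick check before uploading, a rough sketch like the one below may help to tell whether a PDF carries a text layer at all. It is only a heuristic and not how GroupDocs.Viewer or Adobe Reader work: it scans the PDF streams for the text-showing operators `Tj` and `TJ`. The function name `has_text_layer` is made up for this illustration. It assumes an unencrypted file whose content streams are either uncompressed or Flate-compressed; other filters are not decoded, and raw image data can occasionally produce a false positive.

```python
import re
import zlib

# Matches the body of each "stream ... endstream" section in the file.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)

# A string, hex string, or array operand followed by a text-showing operator.
SHOW_TEXT_RE = re.compile(rb"[)\]>]\s*T[jJ]\b")


def has_text_layer(pdf_bytes: bytes) -> bool:
    """Heuristic: True if any stream appears to draw text (Tj or TJ)."""
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # Most content streams use FlateDecode (zlib). decompressobj
            # tolerates the trailing newline before "endstream".
            data = zlib.decompressobj().decompress(raw)
        except zlib.error:
            # Not Flate data: fall back to scanning the raw bytes.
            data = raw
        if SHOW_TEXT_RE.search(data):
            return True
    return False
```

To try it on a file, read the file in binary mode and pass the bytes to `has_text_layer`. A result of False for the OCR output would suggest that the text layer was not written to the file that was uploaded.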

Thanks, we are waiting for the document.


#3

Hi,

Please find both documents attached: the scanned one and the one with searchable text.


#4

Hello Sumit,

Thank you again for the uploaded documents. Yes, there is a text layer in the “Scanned-OCRDoc.pdf” file. As for GroupDocs.Viewer, the situation is quite interesting. In the new HTML-based rendering mode, GroupDocs.Viewer cannot extract the text: you cannot select it, and search does not work. In the image-based rendering mode, however, both of these functions work, as you can see in the screenshot.

For now, we suggest that you use the image-based mode. On our side, our developers have begun investigating the document. We will notify you in this forum thread when new information becomes available.

Thanks.