We have discovered an issue when we create PDF files. We cannot search the text within the PDF. When we create PDF files using other tools, our documents are searchable.
Attached are two PDF files: one generated by GroupDocs (not searchable) and one generated by the Foxit Reader PDF Printer (searchable). Both were generated from the attached file ‘Atqui electram.docx’.
To generate the PDF, I use the following code:
1) in Global.asax.cs:
GroupdocsConversion.SetLicensePath(HostingEnvironment.ApplicationPhysicalPath + @"\App_data\Licenses\GroupDocs.Total.for.NET.lic");
GroupdocsConversion.SetRootStoragePath(HostingEnvironment.ApplicationPhysicalPath + SiteContext.CurrentSiteName + @"\files\");
2) and the method that converts the file to PDF:
private static string ConvertCommand(string inputFilePath, string inputFileName)
{
    string pdfFilePath = String.Empty;
    FileType fileType = FileType.Pdf;
    var conversion = GroupdocsConversion.Instance();
    string outputFilePath = fileTempStorage + inputFileName + "." + fileType;
    var convertResult = conversion.Convert(inputFilePath, outputFilePath, fileType);
    if (convertResult.State == ConversionState.Completed)
    {
        pdfFilePath = convertResult.ConvertedFileName;
    }
    if (convertResult.State == ConversionState.Failed)
    {
        // Log the conversion error reported by GroupDocs.
        Exception ex = new Exception(convertResult.ErrorMessage);
        EventLogProvider.LogException(typeof(GeneratePDFHelper).Name, MethodBase.GetCurrentMethod().Name, ex);
    }
    return pdfFilePath;
}
We are using Kentico and the PDF files cannot be indexed. We are able to use several other PDF files created using other tools, but those using GroupDocs.Conversion cannot be indexed.
When we upload the document to the SQL database, Kentico will index the document content, making it accessible for full-text search.
Does the search indexing work for other PDFs? If so, it looks like the tool you are using does something special when it creates the documents. I do not want to play table tennis, but I do not see how this is related to Kentico. Our search uses the standard .NET and Lucene engine to index the files. I am not sure how we can control this if the PDF is generated in some possibly non-standard format.
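One way to test the "non-standard format" guess before going back and forth is to compare the two attached PDFs at the file level. The following is only a rough diagnostic sketch, not part of GroupDocs or Kentico: the class `PdfTextLayerProbe` and its method `CountMarker` are hypothetical names introduced here. It counts how often a marker such as `/Font` or `/ToUnicode` appears in the raw file and in any Flate-compressed streams it manages to inflate. It assumes the PDF is not encrypted, and a plain `/Font` count also matches related keys such as `/FontDescriptor`.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class PdfTextLayerProbe
{
    // Latin-1 maps every byte to one char, so string offsets equal byte offsets.
    private static readonly Encoding Latin1 = Encoding.GetEncoding("ISO-8859-1");

    // Counts non-overlapping occurrences of marker in text.
    private static int Count(string text, string marker)
    {
        int count = 0, index = 0;
        while ((index = text.IndexOf(marker, index, StringComparison.Ordinal)) >= 0)
        {
            count++;
            index += marker.Length;
        }
        return count;
    }

    // Tries to inflate a zlib-wrapped (FlateDecode) stream; returns "" on failure.
    private static string TryInflate(byte[] data, int start, int length)
    {
        if (length <= 2) return string.Empty;
        try
        {
            // Skip the 2-byte zlib header; DeflateStream expects raw deflate data.
            using (var input = new MemoryStream(data, start + 2, length - 2))
            using (var deflate = new DeflateStream(input, CompressionMode.Decompress))
            using (var output = new MemoryStream())
            {
                deflate.CopyTo(output);
                return Latin1.GetString(output.ToArray());
            }
        }
        catch (InvalidDataException)
        {
            return string.Empty; // Not a Flate stream (for example, a JPEG image).
        }
    }

    // Heuristic: counts marker in the raw file plus every stream that inflates.
    public static int CountMarker(byte[] pdf, string marker)
    {
        string raw = Latin1.GetString(pdf);
        int total = Count(raw, marker);
        int pos = 0;
        while ((pos = raw.IndexOf("stream", pos, StringComparison.Ordinal)) >= 0)
        {
            // Ignore the "stream" that is part of an "endstream" keyword.
            if (pos >= 3 && string.CompareOrdinal(raw, pos - 3, "end", 0, 3) == 0)
            {
                pos += 6;
                continue;
            }
            int dataStart = pos + 6;
            if (dataStart < raw.Length && raw[dataStart] == '\r') dataStart++;
            if (dataStart < raw.Length && raw[dataStart] == '\n') dataStart++;
            int end = raw.IndexOf("endstream", dataStart, StringComparison.Ordinal);
            if (end < 0) break;
            total += Count(TryInflate(pdf, dataStart, end - dataStart), marker);
            pos = end + 9;
        }
        return total;
    }

    public static void Main(string[] args)
    {
        byte[] pdf = File.ReadAllBytes(args[0]);
        Console.WriteLine("/Font: " + CountMarker(pdf, "/Font"));
        Console.WriteLine("/ToUnicode: " + CountMarker(pdf, "/ToUnicode"));
    }
}
```

Run it once against each of the two PDFs and compare the counts. If the GroupDocs file shows no `/Font` entries at all, its pages are probably stored as images and contain no text for an indexer to read. If fonts are present but `/ToUnicode` is missing, text extraction may still fail for some font encodings, although that alone is not conclusive. Either result would be a useful detail to pass on to GroupDocs support.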
This can be reproduced with a standard Kentico installation. Upload a GroupDocs PDF document and try to search for content in that document.
Do you have an estimate when this will be completed?