I would like to ask whether your library offers a ready-made solution for handling German grammar (singular and plural nouns, irregular verb forms, etc.).
Thank you in advance.
Could you please share a sample Word or PDF file and search query/string? We’ll investigate and guide you accordingly.
Here are some small sample files from our client’s file share. Our end users will search for a word, several words, or phrases within the content of the files, but I can’t provide a sample search string yet because we are still evaluating technical solutions for this.
About 80% of those documents are written in German.
test1.pdf (270.4 KB)
test2.docx (35.0 KB)
Thanks for the details. We are investigating this scenario. Your investigation ticket ID is SEARCHNET-2705.
We ran a search over these two documents using the following code:
using System;
using System.IO;
// The GroupDocs.Search namespaces for your library version are also required.

// Apply the license
License lic = new License();
lic.SetLicense(@"D:/GroupDocs.Search.lic");

// Create the index with raw text extraction disabled, then add the documents
var settings = new IndexSettings();
settings.UseRawTextExtraction = false;
var index = new Index(@"D:/Index", settings);
index.Add(@"D:/folder");

// Getting list of indexed documents
DocumentInfo[] documents = index.GetIndexedDocuments();
for (int i = 0; i < documents.Length; i++)
{
    DocumentInfo document = documents[i];
    Console.WriteLine(document.FilePath);
    // Write the extracted text of each document to an .html file for inspection
    FileOutputAdapter outputAdapter = new FileOutputAdapter(@"D:/" + Path.GetFileName(document.FilePath) + ".html");
    index.GetDocumentText(document, outputAdapter);
}

// Search for a German word containing umlauts
var result = index.Search("Lüftungsgeräte");
Console.WriteLine("DocumentCount: " + result.DocumentCount);
Console.WriteLine("OccurrenceCount: " + result.OccurrenceCount);

Console.WriteLine("Press any key to exit.");
Console.ReadKey();
The search found the document and returned the occurrence count for the word Lüftungsgeräte.
As far as licensing is concerned, you can obtain a temporary license by following these steps.