Searching for a phrase with stop words - GroupDocs Search for .Net

I have created an index with the default stop words. When I search for an exact phrase with stop words, I do not get any results back.

For example, say the phrase I am looking for is “The information contained in these materials”. When I search for the exact phrase (using double quotes), I get no results back. So I tried a few more things. First, I tried removing the stop words and using regex to come up with a query like “information contained”&materials. This time it found matches, but it also returned documents where the phrase “information contained” appeared in one place and the word materials appeared somewhere else in the document. I also tried searching for the phrase without any stop words at all, like “information contained materials”, but that yielded no results.

I would like to keep the default stop words so that the index doesn’t grow huge and performance is not affected by having to index those stop words. Is there any way to handle this situation? Basically, we want to keep the stop words but also let the user search for a phrase as a whole.

Thanks!

@ncodedcode,

Thanks for taking interest in GroupDocs.Search for .NET and posting your concerns.
We would like to help you out. Can you please share the following details with us:

  • API Version (e.g. 18.12, 18.9) that you integrated in the project
  • Sample code or project
  • Sample source file

API version: 18.12
Sample code:
// IndexLocation is just a constant that points to the directory on the drive where we have created a test index.
// The index was created using the default settings and by adding a bunch of test folders using index.AddToIndex.
var index = new Index(IndexLocation);

// This yields no results
var results = index.Search("\"information contained in these materials\"");

// This yields no results as well
var results2 = index.Search("\"information contained materials\"");

// This yields false positives, in the sense that "materials" appears somewhere else in the document
// rather than next to the phrase
var results3 = index.Search("\"information contained\"&materials");

Here are the two files I used for my index:
files.zip (17.9 KB)

Let me know if you need anything else.

@ncodedcode,

Thanks for sharing the details.
We are investigating this scenario. Your investigation ticket ID is SEARCHNET-1845. We’ll notify you as soon as we have an update on it.

@atirtahir3 - Any updates on the ticket? We have to give a recommendation to our client whether or not this product will work for their requirements and this issue is a major showstopper for them.

@ncodedcode,

We have an update on SEARCHNET-1845. Stop words will work correctly in the next release of the API (19.3). We’ll notify you once the release is available.

However, as a workaround in the current version, you can replace the query "The information contained in these materials" with either of the following queries:
"information contained *1 *1 materials" or "information contained *2 materials"
To build such a query in general, look up each word of the phrase in index.Dictionaries.StopWordDictionary and replace every stop word with the string " *1 ".
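
The replacement step described above could be sketched roughly as follows. This is only an illustration, not GroupDocs code: PhraseQueryBuilder, Build and SampleStopWords are hypothetical names, and the hard-coded word list merely stands in for a lookup against index.Dictionaries.StopWordDictionary, whose exact lookup method is not shown in this thread. The sketch also assumes that stop words at the very start or end of the phrase can simply be dropped, as the leading "The" is in the example query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PhraseQueryBuilder
{
    // Hypothetical stand-in for the index's stop-word dictionary (assumption).
    public static readonly HashSet<string> SampleStopWords =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "the", "in", "these", "a", "an", "of"
        };

    // Builds a quoted phrase query in which every inner stop word is
    // replaced by the placeholder "*1". Leading and trailing stop words
    // are dropped. Returns an empty string if nothing but stop words remains.
    public static string Build(string phrase, ISet<string> stopWords)
    {
        var words = phrase.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        // Skip stop words at the start of the phrase.
        int start = 0;
        while (start < words.Length && stopWords.Contains(words[start])) start++;

        // Skip stop words at the end of the phrase.
        int end = words.Length - 1;
        while (end >= start && stopWords.Contains(words[end])) end--;

        if (start > end) return string.Empty;

        // Replace the remaining (inner) stop words with the placeholder.
        var tokens = words
            .Skip(start)
            .Take(end - start + 1)
            .Select(w => stopWords.Contains(w) ? "*1" : w);

        return "\"" + string.Join(" ", tokens) + "\"";
    }
}
```

The resulting string would then be passed to index.Search in place of the original phrase query, for example index.Search(PhraseQueryBuilder.Build(userPhrase, stopWords)), where stopWords is whatever set of stop words the index actually uses.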

@ncodedcode,

Your reported issue SEARCHNET-1845 is now fixed in GroupDocs.Search for .NET 19.3. Please download the latest release of the API.