Searching for “test” in the attached document with GroupDocs.Parser returns hits on page 1 (pageIndex 0) and page 2 (pageIndex 1), which would be correct.
Using HighlightOptions, however, reveals that the results with pageIndex 0 are actually the header and footer.
Results with pageIndex 1 are coming from the actual first page, which should be pageIndex 0.
There are no results from page two, but that seems to be the same bug I already reported here.
HighlightOptions options = new HighlightOptions(20);
var sr = parser.Search("test", new SearchOptions(false, false, false, true, options, options));
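To see which page index each hit is assigned to, the results of the call above can be grouped by PageIndex and printed together with their highlight context. This is only a rough sketch: the file name "sample.docx" is a placeholder (the file inside the attached archive is not named in this thread), and the class name is made up for illustration.

```csharp
using System;
using System.Linq;
using GroupDocs.Parser;
using GroupDocs.Parser.Data;
using GroupDocs.Parser.Options;

class PageIndexCheck
{
    static void Main()
    {
        // Placeholder path (assumption): replace with the document extracted from the attachment.
        using (Parser parser = new Parser("sample.docx"))
        {
            // Up to 20 characters of context on each side of a hit.
            HighlightOptions options = new HighlightOptions(20);
            var results = parser.Search("test",
                new SearchOptions(false, false, false, true, options, options));

            // Search returns null when the format does not support searching.
            if (results == null)
            {
                Console.WriteLine("Search is not supported for this document.");
                return;
            }

            // Group hits by the page index the library reports.
            foreach (var page in results.GroupBy(r => r.PageIndex).OrderBy(g => g.Key))
            {
                Console.WriteLine($"pageIndex {page.Key}: {page.Count()} hit(s)");
                foreach (SearchResult r in page)
                {
                    // The surrounding text shows whether a hit comes from the header, footer or body.
                    Console.WriteLine($"  {r.LeftHighlightItem?.Text}[{r.Text}]{r.RightHighlightItem?.Text}");
                }
            }
        }
    }
}
```

Comparing the printed context with the document makes it easy to tell whether a hit listed under pageIndex 0 is body text or only header/footer text.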
Test Document.zip (15.8 KB)
GroupDocs.Parser for .NET 20.12 was used.
@Clemens_Pestuka
Could you please further elaborate this scenario?
We are getting 2 results on page index 0 (page 1) and 6 results on page index 1 (page 2). Have a look at this screenshot.PNG (10.8 KB). The text “test” is highlighted only once on page one. Should the output be only 1 hit from page index 0 (page 1)?
@Atir_Tahir
I’m getting that as well.
Those two results are just header and footer.
Those are actually all results for index 0 (page 1).
The correct output would be 8 hits [or 6, depending on which page you count the header/footer toward] on page 1 (index 0):
image.png (28.4 KB)
And 4 hits [or 6, depending on which page you count the header/footer toward] on page 2 (index 1):
image.png (29.7 KB)
[Even Word handles search results in the header a bit differently…]
What I can say for sure is that the results from the first page should have page index 0, regardless of which page the header/footer results are assigned to.
@Clemens_Pestuka
We’ve logged this scenario in our internal issue tracking system with ID PARSERNET-1732. You’ll be notified of any update.
The issues you have found earlier (filed as PARSERNET-1732) have been fixed in this update. This message was posted using Bugs notification tool by Atir_Tahir