Feature request: Allow specifying FILE_FLAG_BACKUP_SEMANTICS optionally when opening files

Hello there,

We have a feature request:

We’d like to be able to extract text from files that the current user usually doesn’t have access to, by using the backup privilege. This does not seem to be possible with GroupDocs.Search at the moment.

With Win32, it’s possible to call AdjustTokenPrivileges to enable the backup privilege (SeBackupPrivilege) for the current process.

After that, files you don’t usually have access to can be read by opening them with the flag “FILE_FLAG_BACKUP_SEMANTICS” (Win32 CreateFile).

The file reading happens in GroupDocs code, so we would like to ask whether you could add an option that lets us specify that the flag “FILE_FLAG_BACKUP_SEMANTICS” is used whenever a file is opened during extraction.
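
For reference, the Win32 side described above can be sketched roughly as follows. This is only an illustration of the mechanism, not a proposal for how GroupDocs should implement it: the class BackupFileAccess and its methods EnableBackupPrivilege and OpenForBackupRead are hypothetical names made up for this post, and the sketch assumes Windows and an account that already holds the backup privilege.

```csharp
using System;
using System.ComponentModel;
using System.IO;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

// Illustrative sketch only (Windows). BackupFileAccess and its methods are
// hypothetical names, not part of GroupDocs.Search.
public static class BackupFileAccess
{
    private const uint TOKEN_ADJUST_PRIVILEGES = 0x0020;
    private const uint TOKEN_QUERY = 0x0008;
    private const uint SE_PRIVILEGE_ENABLED = 0x00000002;
    private const int ERROR_NOT_ALL_ASSIGNED = 1300;
    private const uint GENERIC_READ = 0x80000000;
    private const uint FILE_SHARE_READ = 0x00000001;
    private const uint OPEN_EXISTING = 3;
    private const uint FILE_FLAG_BACKUP_SEMANTICS = 0x02000000;

    [StructLayout(LayoutKind.Sequential)]
    private struct LUID { public uint LowPart; public int HighPart; }

    // TOKEN_PRIVILEGES laid out with exactly one LUID_AND_ATTRIBUTES entry.
    [StructLayout(LayoutKind.Sequential)]
    private struct TOKEN_PRIVILEGES
    {
        public uint PrivilegeCount;
        public LUID Luid;
        public uint Attributes;
    }

    [DllImport("kernel32.dll")]
    private static extern IntPtr GetCurrentProcess();

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern bool CloseHandle(IntPtr hObject);

    [DllImport("advapi32.dll", SetLastError = true)]
    private static extern bool OpenProcessToken(IntPtr processHandle, uint desiredAccess, out IntPtr tokenHandle);

    [DllImport("advapi32.dll", SetLastError = true, CharSet = CharSet.Unicode)]
    private static extern bool LookupPrivilegeValueW(IntPtr systemName, string name, out LUID luid);

    [DllImport("advapi32.dll", SetLastError = true)]
    private static extern bool AdjustTokenPrivileges(IntPtr tokenHandle, bool disableAllPrivileges,
        ref TOKEN_PRIVILEGES newState, uint bufferLength, IntPtr previousState, IntPtr returnLength);

    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Unicode)]
    private static extern SafeFileHandle CreateFileW(string fileName, uint desiredAccess, uint shareMode,
        IntPtr securityAttributes, uint creationDisposition, uint flagsAndAttributes, IntPtr templateFile);

    // Enables SeBackupPrivilege in the current process token.
    // The account must already hold the privilege; this only switches it on.
    public static void EnableBackupPrivilege()
    {
        if (!OpenProcessToken(GetCurrentProcess(), TOKEN_ADJUST_PRIVILEGES | TOKEN_QUERY, out IntPtr token))
            throw new Win32Exception(Marshal.GetLastWin32Error());
        try
        {
            if (!LookupPrivilegeValueW(IntPtr.Zero, "SeBackupPrivilege", out LUID luid))
                throw new Win32Exception(Marshal.GetLastWin32Error());

            var state = new TOKEN_PRIVILEGES { PrivilegeCount = 1, Luid = luid, Attributes = SE_PRIVILEGE_ENABLED };
            bool ok = AdjustTokenPrivileges(token, false, ref state, 0, IntPtr.Zero, IntPtr.Zero);
            int error = Marshal.GetLastWin32Error();
            // The call can return true without enabling anything, so also check ERROR_NOT_ALL_ASSIGNED.
            if (!ok || error == ERROR_NOT_ALL_ASSIGNED)
                throw new Win32Exception(error);
        }
        finally
        {
            CloseHandle(token);
        }
    }

    // Opens a file for reading with FILE_FLAG_BACKUP_SEMANTICS and wraps the handle in a FileStream.
    public static FileStream OpenForBackupRead(string path)
    {
        SafeFileHandle handle = CreateFileW(path, GENERIC_READ, FILE_SHARE_READ, IntPtr.Zero,
            OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, IntPtr.Zero);
        if (handle.IsInvalid)
            throw new Win32Exception(Marshal.GetLastWin32Error());
        return new FileStream(handle, FileAccess.Read);
    }
}
```

In our own code we would call EnableBackupPrivilege() once at startup and then read files through OpenForBackupRead(path). Since GroupDocs opens the files itself during extraction, the second step is the part we cannot do from the outside.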

If you have any questions about this, feel free to ask!

Best regards
jamsharp

@jamsharp
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): SEARCHNET-3477

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.

Hello,

I saw this in your change logs:

SEARCHNET-3486 | Implement indexing of files with backup privileges | Feature

Is that what we asked for? Is it also usable when extracting manually?

@jamsharp

We are currently investigating your specific issue under ticket SEARCHNET-3477.

The issues you have found earlier (filed as SEARCHNET-3477) have been fixed in this update.

This message was posted using the Bugs notification tool by atir.tahir

Hello,
I finally got to try the feature and unfortunately encountered the following exception:

Error during text extraction from \path\to\file.pdf
System.UnauthorizedAccessException: Access to the path '\path\to\file.pdf' is denied.

I implemented the feature using manual text extraction like so:

private readonly Extractor m_Extractor = new();

var extractionOptions = new ExtractionOptions { UseRawTextExtraction = false, UseBackupPrivilege = true };
var extractedData = m_Extractor.Extract(document, extractionOptions);

Am I doing something wrong here or is this maybe a problem with the manual text extraction?
If there is any additional information I can provide, please let me know.

Best regards,
jamsharp

@jamsharp

Could you please share the complete code (how you are loading the file, etc.) and the problematic/sample source file?

Please find the (simplified) code below:

using System;
using GroupDocs.Search;
using GroupDocs.Search.Common;
using GroupDocs.Search.Options;

public class ExampleClass
{
    private readonly GroupDocs.Search.Extractor m_Extractor = new();

    public ExampleClass()
    {
        //...
    }

    public bool TryExtract(string pFilePath)
    {
        // UseBackupPrivilege = true is expected to let the extractor read files that are otherwise access-denied.
        var extractionOptions = new ExtractionOptions
        {
            UseRawTextExtraction = false,
            UseBackupPrivilege = true
        };
        var document = Document.CreateFromFile(pFilePath);

        ExtractedData? extractedData;

        try
        {
            extractedData = m_Extractor.Extract(document, extractionOptions);
            // ...
        }
        catch (UnauthorizedAccessException)
        {
            // ...
            return false;
        }
        return true;
    }
}

Regarding problematic files: this does not seem to be tied to a specific file. Rather, the exception occurs with any file that cannot be accessed without “FILE_FLAG_BACKUP_SEMANTICS”.

Thanks in advance,
jamsharp

@jamsharp
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): SEARCHNET-3522
