Hi,
I have a question regarding thread safety and concurrency for indexing. Is it possible to have multiple clients open the same index and add files to it using the Add method?
The reason I ask is that I created a test application that uses multiple threads, and each thread opens the same existing index. I then use the Add method to add files to the index one at a time, to simulate users uploading files to my file storage service, where I want each uploaded file to be indexed. However, I’ve noticed this causes issues with temp files that already exist:
System.IO.IOException: ‘The process cannot access the file ‘C:\SearchIndexes\GroupDocs.Search\1.temp’ because it is being used by another process.’
What’s the best way of allowing users to upload a file to my network drive and then index that file? Multiple users will be accessing the index and a lot of files can be uploaded to the network drive.
@wichanski
We are investigating this scenario. Your investigation ticket ID is SEARCHNET-2626. You’ll be notified in case of any update.
@Atir_Tahir
If your team is curious what I was using for testing, here’s the source code:
The folder I want to index files from: pathToIndex
The path to the index is: INDEX_PATH
I know you support indexing a whole folder, but my requirement is to add one file to the index at a time, because files will arrive at random times (unlike in this code snippet).
var supportedExtension = new string[] { ".doc", ".docx", ".rtf", ".pdf", ".txt", ".csv", ".xml", ".xls" };
var files = Directory
    .EnumerateFiles(pathToIndex, "*", SearchOption.AllDirectories)
    .Where(x => supportedExtension.Any(y => x.EndsWith(y, StringComparison.InvariantCultureIgnoreCase)))
    .ToList();
var index = new GroupDocs.Search.Index(INDEX_PATH);
Parallel.ForEach(files, new ParallelOptions { MaxDegreeOfParallelism = 5 }, (fileToAdd) =>
{
    try
    {
        var sharedIndex = new GroupDocs.Search.Index(INDEX_PATH);
        sharedIndex.Events.ErrorOccurred += (sender, args) =>
        {
            Console.WriteLine(args.Message);
        };
        var doc = Document.CreateFromFile(fileToAdd);
        sharedIndex.Add(new Document[] { doc }, new IndexingOptions { IsAsync = false });
    }
    catch (Exception e)
    {
        Console.WriteLine($"Exception caught: {e.Message}");
    }
});
@wichanski
We have an update on SEARCHNET-2626.
You must have only one instance of the index instantiated for each specific index path.
Only one index change task (Add, Update, Delete, etc.) and an arbitrary number of search tasks can run at a time for each index.
However, the indexing task (Add) can be multi-threaded to increase indexing speed. This is set in the indexing options. Please have a look.
So, if several users need to add documents to the index, you have to implement your own queue that submits their documents for indexing one at a time.
You can also add documents to the index not from disk, but from a stream or structure.
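To illustrate the queue idea, here is a minimal sketch of one possible approach, not an official GroupDocs.Search sample. The `IndexingQueue` class and its `addToIndex` delegate are hypothetical names made up for this example. The delegate stands in for the real call on your single shared index instance, for example the `Add(new Document[] { doc }, new IndexingOptions { IsAsync = false })` call from the snippet earlier in this thread. Any number of upload threads can enqueue paths, while one background worker performs the index changes strictly one after another.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical helper (not part of GroupDocs.Search): funnels all index
// changes through a single worker so only one change task runs at a time.
public sealed class IndexingQueue : IDisposable
{
    private readonly BlockingCollection<string> _pending = new BlockingCollection<string>();
    private readonly Task _worker;

    // addToIndex is an assumption: it should wrap the Add call on the
    // one shared Index instance created for the index path.
    public IndexingQueue(Action<string> addToIndex)
    {
        _worker = Task.Factory.StartNew(() =>
        {
            // Blocks until an item is available; ends after CompleteAdding().
            foreach (var path in _pending.GetConsumingEnumerable())
            {
                try
                {
                    addToIndex(path);
                }
                catch (Exception e)
                {
                    // One failed document should not stop the worker.
                    Console.WriteLine($"Indexing failed for {path}: {e.Message}");
                }
            }
        }, TaskCreationOptions.LongRunning);
    }

    // Safe to call from any thread, e.g. from each upload request.
    public void Enqueue(string path) => _pending.Add(path);

    // Stops accepting new items and waits for the queued ones to finish.
    public void Dispose()
    {
        _pending.CompleteAdding();
        _worker.Wait();
        _pending.Dispose();
    }
}
```

In the test application above, this would mean creating the index and the queue once, outside `Parallel.ForEach`, and calling `queue.Enqueue(fileToAdd)` inside the loop instead of opening a new index per thread. Searches do not need to go through the queue, since any number of search tasks can run alongside one change task.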