We are using GroupDocs.Viewer to create thumbnails for various file types. Recently we have encountered some XLSX files with a small amount of content, but where the user has applied formatting to the maximum number of rows and columns (example: AllRowsFormatted.zip (542.9 KB)). This causes thumbnail generation to run for a very long time (tens of minutes at least; we have never let it complete) and to use many GB of memory (it hits max memory, garbage collects, and repeats). Note: we’re only asking for page 1.
To try to limit the data processed, we’ve added options to split the sheet into pages of 50 rows by 20 columns. This reduces the runtime (to ~350 seconds on my laptop), but the process still uses 10 GB of memory, and the runtime is still too long.
I’ve also added a cancellation token. Whilst control is returned to the caller when the token is cancelled, the underlying task keeps running. Ideally we need the work to actually stop and the memory to be freed.
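To give a sense of the scale involved, here is a rough back-of-the-envelope sketch. It assumes (we have not confirmed this) that the viewer treats every formatted cell as part of the used range, and it uses the documented XLSX limits of 1,048,576 rows and 16,384 columns. The class and method names are made up for illustration only.

```csharp
using System;

public static class SheetSizeEstimate
{
    // Documented XLSX worksheet limits.
    public const long MaxRows = 1_048_576;
    public const long MaxColumns = 16_384;

    // Total cells if the whole grid counts as "used" (assumption).
    public static long TotalCells() => MaxRows * MaxColumns;

    // Number of pages produced when the grid is split into fixed-size blocks.
    public static long PageCount(long rowsPerPage, long columnsPerPage)
    {
        // Ceiling division so that partial blocks still count as a page.
        long rowBlocks = (MaxRows + rowsPerPage - 1) / rowsPerPage;
        long columnBlocks = (MaxColumns + columnsPerPage - 1) / columnsPerPage;
        return rowBlocks * columnBlocks;
    }

    public static void Main()
    {
        Console.WriteLine($"Cells in a fully formatted sheet: {TotalCells():N0}");
        Console.WriteLine($"Pages at 50 x 20 cells per page: {PageCount(50, 20):N0}");
    }
}
```

Under that assumption, asking for page 1 only helps if the library can lay out page 1 without first walking the rest of the grid, which does not appear to be the case here.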
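The behaviour described above can be reproduced without GroupDocs at all. The following is a minimal sketch, assuming the token is only used to abandon the wait (for example via `Task.WaitAsync` on .NET 6+) and that the blocking call itself never observes it. `BlockingWork` is a hypothetical stand-in for `viewer.View()`, and the class name is made up.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class AbandonedWaitDemo
{
    // Hypothetical stand-in for a long, synchronous call that takes no token.
    public static void BlockingWork(ManualResetEventSlim finished)
    {
        Thread.Sleep(500);   // simulates the long-running conversion
        finished.Set();      // signals that the work ran to completion
    }

    // Returns true if the work finished in time, false if the wait was cancelled.
    public static async Task<bool> RunWithTimeoutAsync(Action work, TimeSpan timeout)
    {
        using var cts = new CancellationTokenSource(timeout);
        Task task = Task.Run(work);   // the token is never seen inside 'work'
        try
        {
            await task.WaitAsync(cts.Token);   // cancels the wait, not the work
            return true;
        }
        catch (OperationCanceledException)
        {
            // Control returns to the caller here, but 'work' is still running
            // on a thread-pool thread and still holds its memory.
            return false;
        }
    }

    public static async Task Main()
    {
        var finished = new ManualResetEventSlim(false);
        bool completed = await RunWithTimeoutAsync(
            () => BlockingWork(finished), TimeSpan.FromMilliseconds(50));
        Console.WriteLine($"Completed before timeout: {completed}");
        finished.Wait();   // the abandoned work still finishes later
        Console.WriteLine("Background work ran to completion anyway.");
    }
}
```

In this pattern, cancellation only stops the caller from waiting. A hard stop would need the work to live somewhere that can be torn down as a whole, for example a separate process that can be killed, since .NET does not offer a safe way to abort a running thread.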
Is there a better way to approach this conversion?
We have paid support - details can be provided on request.
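Simplest example code is below; viewer.View() is where the processing time and memory are spent.
using System.IO;
using GroupDocs.Viewer;
using GroupDocs.Viewer.Options;

// 'file' is the path to the source .xlsx file.
var bytes = File.ReadAllBytes(file);
using (var stream = new MemoryStream(bytes))
{
    var pageStreams = new ContentPageStreamFactory();
    using (var viewer = new Viewer(stream))
    {
        // Split each sheet into fixed-size pages (rows per page, columns per page).
        var spreadsheetOptions = SpreadsheetOptions.ForSplitSheetIntoPages(20, 5);
        spreadsheetOptions.SkipEmptyColumns = true;
        spreadsheetOptions.SkipEmptyRows = true;
        var options = new JpgViewOptions(pageStreams)
        {
            Width = 250,
            Height = 400,
            Quality = 50,
            SpreadsheetOptions = spreadsheetOptions
        };
        // Render page 1 only; this call is where the time and memory go.
        viewer.View(options, 1);
    }
    // Write the thumbnail next to the source file, with a .jpg extension.
    File.WriteAllBytes(Path.ChangeExtension(file, "jpg"), pageStreams.GetPageStream(1).ToArray());
}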
The page stream class:
using System.Collections.Generic;
using System.IO;
using GroupDocs.Viewer.Interfaces;

internal class ContentPageStreamFactory : IPageStreamFactory
{
    // Rendered pages, keyed by page number.
    public readonly Dictionary<int, MemoryStream> Pages = new();

    public Stream CreatePageStream(int pageNumber)
    {
        var pageStream = new MemoryStream();
        Pages.Add(pageNumber, pageStream);
        return pageStream;
    }

    public void ReleasePageStream(int pageNumber, Stream pageStream)
    {
        // Do not release page stream as we'll need to keep the stream open
    }

    // Returns the stream for the given page, or null if it was never rendered.
    public MemoryStream GetPageStream(int pageNumber)
        => Pages.TryGetValue(pageNumber, out var pageStream) ? pageStream : null;
}