High memory usage, long processing time creating thumbnail for spreadsheet

We are using GroupDocs.Viewer to create thumbnails for various file types. Recently we have encountered some XLSX files that have only a small amount of content, but where the user has applied formatting to the maximum number of rows and columns (example: AllRowsFormatted.zip (542.9 KB)). This causes thumbnail generation to run for a very long time (tens of minutes at least; we have never let it complete) and to use many GB of memory (it hits the memory limit, garbage collects, and repeats). Note: we’re only asking for page 1.

To try to limit the data processed, we’ve added options to set the page to 50 rows by 20 columns. This reduces the runtime (to ~350 seconds on my laptop), but the process still uses 10 GB of memory, and the runtime is still too long.

I’ve also added a cancellation token, and whilst control is returned, the task is still running. Ideally we need it to be stopped and the memory freed.
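To illustrate the cancellation behaviour described above, here is a minimal sketch that does not use GroupDocs at all. `BlockingWork` is a hypothetical stand-in for a blocking call such as viewer.View() that never checks a token, and `CancellationDemo` and `RunAsync` are names made up for this example. It assumes .NET 6 or later for `Task.WaitAsync`. Cancelling the wait hands control back to the caller, but the work itself carries on until it finishes.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CancellationDemo
{
    private static int _ticks;

    // Hypothetical stand-in for a blocking library call: it takes no
    // CancellationToken, so nothing outside it can interrupt the loop.
    private static void BlockingWork()
    {
        for (var i = 0; i < 20; i++)
        {
            Thread.Sleep(50);
            Interlocked.Increment(ref _ticks);
        }
    }

    public static async Task<(int AtCancel, int AtEnd)> RunAsync()
    {
        using var cts = new CancellationTokenSource(TimeSpan.FromMilliseconds(200));
        var work = Task.Run(() => BlockingWork());

        try
        {
            // WaitAsync cancels only the wait, not the underlying work.
            await work.WaitAsync(cts.Token);
        }
        catch (OperationCanceledException)
        {
            // Control is back with the caller, but BlockingWork is still running.
        }

        var atCancel = Volatile.Read(ref _ticks);
        await work; // let the background work finish so the demo exits cleanly
        return (atCancel, Volatile.Read(ref _ticks));
    }
}
```

Calling `await CancellationDemo.RunAsync()` should return a tick count at cancellation that is lower than the final count, which shows the work continued after the caller regained control. When the work has to stop and its memory has to be returned, a common general approach is to run the conversion in a separate worker process and kill that process on timeout; whether that suits this case is a judgment call.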

Is there a better way to approach this conversion?

We have paid support - details can be provided on request.

This is the simplest example code; viewer.View() is where the processing time and memory are used.

var bytes = File.ReadAllBytes(file);
using (var stream = new MemoryStream(bytes))
{
	var pageStreams = new ContentPageStreamFactory();

	using (var viewer = new Viewer(stream))
	{   
		var spreadsheetOptions = SpreadsheetOptions.ForSplitSheetIntoPages(20, 5);
		spreadsheetOptions.SkipEmptyColumns = true;
		spreadsheetOptions.SkipEmptyRows = true;

		var options = new JpgViewOptions(pageStreams)
		{
			Width = 250,
			Height = 400,
			Quality = 50,
			SpreadsheetOptions = spreadsheetOptions
		};

		viewer.View(options, 1);
	}

	// Write page 1 next to the source file, swapping the extension for .jpg
	File.WriteAllBytes(Path.ChangeExtension(file, ".jpg"), pageStreams.GetPageStream(1).ToArray());
}

The page stream class:

internal class ContentPageStreamFactory : IPageStreamFactory
{
    public readonly Dictionary<int, MemoryStream> Pages = new();

    public Stream CreatePageStream(int pageNumber)
    {
        var pageStream = new MemoryStream();
        Pages.Add(pageNumber, pageStream);
        return pageStream;
    }

    public void ReleasePageStream(int pageNumber, Stream pageStream)
    {
        // Do not release page stream as we'll need to keep the stream open
    }

    public MemoryStream GetPageStream(int pageNumber)
        => Pages.TryGetValue(pageNumber, out var pageStream) ? pageStream : null;
}

@mautv

Thank you for providing the details and the code you’re running. I can reproduce high memory usage locally and I have found the potential issue. We’ll take a look at how it can be fixed and update you.

@mautv
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): VIEWERNET-4356

You can obtain Paid Support Services if you need support on a priority basis, along with direct access to our Paid Support management team.

@mautv

This issue has been fixed in GroupDocs.Viewer for .NET 23.6. The version is available at

Have a nice day!

Thank you, I will check it out shortly.

@mautv

You’re welcome. In case of any issues please let us know.