XLSX to PDF conversion issue System.OutOfMemoryException exception in .NET

Hi,

When I try to convert this file
LargeTable.zip (3.6 MB)
Conversion runs into a System.OutOfMemoryException.

I know the file is not exactly the typical xlsx, but GroupDocs.Conversion consuming 3 GB of memory before crashing with a memory exception is also not great.

I was using GroupDocs.Conversion 20.4 for .Net with the following code:

        using (var conv = new GroupDocs.Conversion.Converter("LargeTable.xlsx"))
        {
            conv.Convert("Output.pdf", new PdfConvertOptions());
        }

I also tried splitting the xlsx up into pages using the following code:

        var options = new GroupDocs.Conversion.Options.Load.SpreadsheetLoadOptions();
        options.OnePagePerSheet = false;

        using (var conv = new GroupDocs.Conversion.Converter("\LargeTable.xlsx", ()=> options))
        {
            conv.Convert("Output.pdf", new PdfConvertOptions());
        }

This consumed way less memory ~300 MB, but ultimately also failed.

The error was different though:
An unhandled exception of type 'GroupDocs.Conversion.Exceptions.GroupDocsConversionException' occurred in GroupDocs.Conversion.dll

1 Like

@Clemens_Pestuka,

We are investigating this issue. Your investigation ticket ID is CONVERSIONNET-3929. As there’s any update, you’ll be notified.

1 Like

@Clemens_Pestuka

In this scenario, converting the complete sheet to one page will not work because the page is too huge and converting to a single PDF page will need a lot of resources. That’s why there’s a memory exception.

Using above approach (setting OnePagePerSheet to false), we cannot reproduce the issue. The file is converted properly with a memory consumption about 300-600MB. Have a look at the output PDF.

There is a backslash in the file name. It’s probably is a typo and probably due to this the file is not found and you get exception.

1 Like

@atirtahir3

Thank you for your help :slight_smile:
After cleaning up my test program, I no longer got the exception when using OnePagePerSheet.

Is there maybe some documentation where we can check / link to our customers regarding such limitations from GroupDocs?
If I understood correctly this is “by design” and not going to change any time soon, right?

1 Like

@Clemens_Pestuka

Good to know that your issue is fixed.

Basically, it’s more dependent on length of the page and available resources (e.g. Memory). It seems that there will be a dependency, if page is huge, more resources will be required.

1 Like

@atirtahir3
Thank you for your reply. I accept that this is a GroupDocs limitation and will change my conversion code to use OnePagePerSheet.

Could you please provide me with some information when to use the OnePagePerSheet property of the SpreadsheetLoadOptions?
As I do not know in advance which file-format will be input for conversion, is there a way to determine that it’s one of the many Excel formats?

1 Like

@Clemens_Pestuka

We are investigating this requirement with ID CONVERSIONNET-3929. You’ll be notified about the outcomes.

1 Like

@atirtahir3

I’ve seen the status has been updated to Resolved, could you please give me an update about the outcome of the investigation?

@Clemens_Pestuka

Unfortunately, there is no way to determine when to set OnePagePerSheet to true and when to set it false. In the current case, the sheet has columns from A to AJX and rows from 1 to 3247. It’s hard to place information on single page. Even if it is placed successfully, it will be unreadable.

@atirtahir3

Sorry but I don’t understand that answer.

I just wanted to know if there is a way to ask GroupDocs “is this file a supported Excel format”?
Currently I have a IF that checks for
“.xls”, “.xlt”, “.xlm”, “.xlsx”, “.xlsm”, “.xltx”, “.xltm”, “.xlsb”, “.xla”, “.xlam”, “.xll”, “.xlw”, “.csv”, “.ods”, “.fods”
and that’s probably not complete and not a great way to do it.

So to make it clearer, I don’t need to know if the OnePagePerSheet option is required, but if the SpreadsheetLoadOptions is required.

@Clemens_Pestuka

Basically, you need to know “before hand” that which specific LoadOptions must be used based on the input/source file.
Currently, API doesn’t support such a feature. However, we’re investigating the possibility to provide such a functionality. Your investigation ticket ID is CONVERSIONNET-4024. As there’s any update, you’ll be notified.

1 Like

@atirtahir3

Yes exactly :slight_smile:
Unless there is a way to specify multiple LoadOptions, that would also be fine for us.
(What ever is easier to implement on your side.)

Thank you for submitting this feature request!

1 Like

@Clemens_Pestuka

You’re welcome.

1 Like

The issues you have found earlier (filed as CONVERSIONNET-4024) have been fixed in this update. This message was posted using Bugs notification tool by Atir_Tahir

@Atir_Tahir

Thank you for implementing this feature and for the notification!
May I kindly ask for a sample on how to use it?

1 Like

@Clemens_Pestuka

Have a look at this screenshot.png (6.0 KB). That’s how you can detect load options at runtime. Learn more about this feature in this article.

1 Like

@Atir_Tahir

Thanks a lot for the link and screenshot :slight_smile:
I gave the new feature a try now and I’m happy with how it works :+1:

1 Like

@Clemens_Pestuka

You are welcome.

1 Like