When I try to convert this file LargeTable.zip (3.6 MB)
Conversion runs into a System.OutOfMemoryException.
I know the file is not exactly the typical xlsx, but GroupDocs.Conversion consuming 3 GB of memory before crashing with a memory exception is also not great.
I was using GroupDocs.Conversion 20.4 for .Net with the following code:
using (var conv = new GroupDocs.Conversion.Converter("LargeTable.xlsx"))
{
conv.Convert("Output.pdf", new PdfConvertOptions());
}
I also tried splitting the xlsx up into pages using the following code:
var options = new GroupDocs.Conversion.Options.Load.SpreadsheetLoadOptions();
options.OnePagePerSheet = false;
using (var conv = new GroupDocs.Conversion.Converter("\LargeTable.xlsx", ()=> options))
{
conv.Convert("Output.pdf", new PdfConvertOptions());
}
This consumed way less memory ~300 MB, but ultimately also failed.
The error was different though: An unhandled exception of type 'GroupDocs.Conversion.Exceptions.GroupDocsConversionException' occurred in GroupDocs.Conversion.dll
In this scenario, converting the complete sheet to one page will not work because the page is too huge and converting to a single PDF page will need a lot of resources. That’s why there’s a memory exception.
Using above approach (setting OnePagePerSheet to false), we cannot reproduce the issue. The file is converted properly with a memory consumption about 300-600MB. Have a look at the output PDF.
There is a backslash in the file name. It’s probably is a typo and probably due to this the file is not found and you get exception.
Thank you for your help
After cleaning up my test program, I no longer got the exception when using OnePagePerSheet.
Is there maybe some documentation where we can check / link to our customers regarding such limitations from GroupDocs?
If I understood correctly this is “by design” and not going to change any time soon, right?
Basically, it’s more dependent on length of the page and available resources (e.g. Memory). It seems that there will be a dependency, if page is huge, more resources will be required.
@atirtahir3
Thank you for your reply. I accept that this is a GroupDocs limitation and will change my conversion code to use OnePagePerSheet.
Could you please provide me with some information when to use the OnePagePerSheet property of the SpreadsheetLoadOptions?
As I do not know in advance which file-format will be input for conversion, is there a way to determine that it’s one of the many Excel formats?
Unfortunately, there is no way to determine when to set OnePagePerSheet to true and when to set it false. In the current case, the sheet has columns from A to AJX and rows from 1 to 3247. It’s hard to place information on single page. Even if it is placed successfully, it will be unreadable.
I just wanted to know if there is a way to ask GroupDocs “is this file a supported Excel format”?
Currently I have a IF that checks for
“.xls”, “.xlt”, “.xlm”, “.xlsx”, “.xlsm”, “.xltx”, “.xltm”, “.xlsb”, “.xla”, “.xlam”, “.xll”, “.xlw”, “.csv”, “.ods”, “.fods”
and that’s probably not complete and not a great way to do it.
So to make it clearer, I don’t need to know if the OnePagePerSheet option is required, but if the SpreadsheetLoadOptions is required.
Basically, you need to know “before hand” that which specific LoadOptions must be used based on the input/source file.
Currently, API doesn’t support such a feature. However, we’re investigating the possibility to provide such a functionality. Your investigation ticket ID is CONVERSIONNET-4024. As there’s any update, you’ll be notified.
The issues you have found earlier (filed as CONVERSIONNET-4024) have been fixed in this update. This message was posted using Bugs notification tool by Atir_Tahir