GroupDocs.Conversion for .NET: PDF conversion causes OOM

We bought a license for the conversion library, but we have experienced very poor performance with it. We tried converting a 50 MB PDF file and our memory usage skyrocketed to 6 GB. Our services crashed with OOM exceptions and we had to revert our use of the library.
We use GroupDocs.Conversion version 25.8 on .NET.
This is the start of the conversion, before memory reaches 6-7 GB while processing only this file:

image.png (26.0 KB)

image.png (17.7 KB)

I am attaching an example PDF here, but you can use any large PDF you have; in our experience the memory jumps regardless of the file.

Adding the code here:
image.png (44.9 KB)
When we call the .Convert() function, memory usage goes up drastically.

Hello @dangilboa ,

We are sorry to hear that you encountered this issue. We have registered your request in our tracking system under the ID CONVERSIONNET-7970 and have started investigating the situation. As soon as we have any updates, we will let you know right away.

Hey Evgen,
In addition to the jump in memory usage (which I see for small PDFs as well), I also see that the memory doesn't come back down after leaving the scope of the conversion, which leads me to believe there might also be a memory leak. Please take a look at that as well.
Thanks,
Dan

@dangilboa ,

Thank you for the clarification, we will investigate this.

Hey Evgen,
Any updates? This is very critical for us.
Thanks,
Dan

Hello @dangilboa ,

Unfortunately, there are still no updates we can share with you. Our team is still investigating the issue you’ve encountered. Could you please clarify one more detail – does the high memory consumption occur only when converting from PDF to MD, or do you also experience it with other conversion types? It would also be very helpful if you could provide a small test project that replicates your implementation, as this could greatly assist us in identifying the cause of the problem.

Hello @dangilboa ,

Our development team has analyzed your sample document, and we would like to provide further clarification on the reasons behind high memory consumption.
There is no exact way to measure how much memory GroupDocs.Conversion actually consumes when processing a specific document file. As you may know, .NET stores data in objects, and each object instance uses a certain amount of memory for CLR internal purposes. Therefore, any paragraph or formatted text (even one consisting of a single character) takes additional memory once loaded into the DOM. Moreover, the .NET garbage collector applies a complex algorithm to determine the best time to collect memory, which makes it difficult to determine the actual memory usage at any given moment.
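For readers who want a rough number despite these caveats, one option is to compare the managed heap before and after a conversion once a full garbage collection has been forced. The following is a minimal sketch that uses only standard .NET calls (`GC.Collect`, `GC.WaitForPendingFinalizers`, `GC.GetTotalMemory`, `Process.WorkingSet64`). The `MemoryProbe` class and its method names are made up for illustration and are not part of GroupDocs.Conversion. It does not account for unmanaged memory, so it can only hint at whether managed objects are being retained.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper for illustration; not part of any library.
public static class MemoryProbe
{
    // Managed heap size in bytes after forcing a full, blocking collection.
    public static long ManagedBytesAfterFullGc()
    {
        GC.Collect();
        GC.WaitForPendingFinalizers(); // let finalizers release what they hold
        GC.Collect();                  // collect objects freed by those finalizers
        return GC.GetTotalMemory(forceFullCollection: true);
    }

    // Runs `work` and returns the change in managed heap size that survives
    // a full collection. A value that keeps growing across repeated runs
    // suggests that managed objects are still being referenced somewhere.
    public static long RetainedManagedBytes(Action work)
    {
        long before = ManagedBytesAfterFullGc();
        work();
        long after = ManagedBytesAfterFullGc();
        return after - before;
    }

    // Working set of the current process in bytes (managed and unmanaged memory).
    public static long WorkingSetBytes()
    {
        using (Process current = Process.GetCurrentProcess())
        {
            return current.WorkingSet64;
        }
    }
}
```

A caller would wrap its own conversion code, for example `MemoryProbe.RetainedManagedBytes(() => RunConversion())`, where `RunConversion` stands in for whatever method performs the conversion. If the working set stays high while the retained managed bytes stay near zero, the memory is more likely held by unmanaged allocations or simply not yet returned to the operating system.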
When you attempt to convert a large multi-page document (such as your 2000-page file), you should understand that our library builds a page-by-page model of the document during loading. If the document contains a large amount of graphical content, memory consumption increases accordingly. The next step is the actual conversion, which also adds to memory usage.
To mitigate this issue, our development team has implemented a short-term solution: fragment-based conversion. This approach ensures reliable results while significantly reducing memory consumption. You can download the test application with the alpha version of GroupDocs.Conversion for .NET 25.9 using the link provided. Please either test your private documents with it or try using this alpha version in your implementation, and share your results with us.
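The thread does not show how the alpha build implements fragment-based conversion internally. As a conceptual illustration only, the general idea of processing a large document in page-range fragments, so that only one fragment's pages need to be held in memory at a time, can be sketched as below. The `PageFragments` class and the `convertFragment` callback are hypothetical names introduced for this sketch and are not GroupDocs.Conversion API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of GroupDocs.Conversion.
public static class PageFragments
{
    // Splits pages 1..totalPages into consecutive (FirstPage, PageCount) fragments
    // of at most fragmentSize pages each.
    public static List<(int FirstPage, int PageCount)> Split(int totalPages, int fragmentSize)
    {
        if (totalPages < 0) throw new ArgumentOutOfRangeException(nameof(totalPages));
        if (fragmentSize <= 0) throw new ArgumentOutOfRangeException(nameof(fragmentSize));

        var fragments = new List<(int FirstPage, int PageCount)>();
        for (int first = 1; first <= totalPages; first += fragmentSize)
        {
            // The last fragment may be shorter than fragmentSize.
            int count = Math.Min(fragmentSize, totalPages - first + 1);
            fragments.Add((first, count));
        }
        return fragments;
    }

    // Calls convertFragment once per fragment, in page order.
    // convertFragment is a placeholder for whatever converts one page range.
    public static void ConvertInFragments(
        int totalPages, int fragmentSize, Action<int, int> convertFragment)
    {
        foreach (var (firstPage, pageCount) in Split(totalPages, fragmentSize))
        {
            convertFragment(firstPage, pageCount);
        }
    }
}
```

For a 2000-page document and a fragment size of 300, `Split` yields seven fragments, the last one starting at page 1801 and covering 200 pages. Smaller fragments lower peak memory at the cost of more per-fragment overhead, so the size is something to tune against real documents.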