Converting certain EML to HTML / PDF takes about half an hour in .NET

Converting the attached file to HTML with GroupDocs.Viewer or to PDF with GroupDocs.Conversion takes about half an hour to complete. Version 21.2 of both libraries was used for testing.
The resulting HTML/PDF file looks fine; the only problem is the long conversion time for such a small file.
It is probably related to the embedded images, which cannot be loaded.

eml conversion time.zip (24.3 KB)

Hi @Clemens_Pestuka

Could you provide the source code that you used for the conversion?

@mikhail.evgrafov.aspose
Sure, but it was just a basic conversion, nothing special.

Viewer:

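        // ViewOpts is presumably a namespace alias: using ViewOpts = GroupDocs.Viewer.Options;
        using (var viewer = new Viewer(documentPath))
        {
            // Render to HTML with embedded resources; {0} is replaced with the page number.
            var options = ViewOpts.HtmlViewOptions.ForEmbeddedResources("output_viewer{0}.html");
            viewer.View(options);
        }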

Conversion:

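        // ConvOpts is presumably a namespace alias: using ConvOpts = GroupDocs.Conversion.Options.Convert;
        using (var converter = new Converter(documentPath))
        {
            // Convert the EML file to PDF with default options.
            var options = new ConvOpts.PdfConvertOptions();
            converter.Convert("output.pdf", options);
        }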

Hi @Clemens_Pestuka

We were able to reproduce this issue on our end. It has been logged in our internal issue tracking system with ID VIEWERNET-3129. As soon as there is an update, you'll be notified.

Hi @Clemens_Pestuka

I have investigated this file. It contains many invalid links: the linked files exist, but the server returns them with the wrong content type.
For example:
https://www.nespresso.com/emailing/NespressoVisuals/common/temp_2020/logo-light.jpg exists, but it is not a JPEG; it is actually a WEBP image. Outlook and Viewer are therefore unable to handle it correctly, because the server www.nespresso.com returns the wrong content type (JPEG instead of WEBP).
We have fixed the long rendering time (we will provide code for you), and the fix will be included in the 21.3 release. However, because of these invalid links, the linked resources will still not be visible (they are not visible in Outlook either).
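
To illustrate the mismatch described above: an image's real format can usually be recognized from its first few bytes, regardless of the file extension in the URL or the content type reported by the server. The following is only a minimal sketch, not code from GroupDocs.Viewer; `ImageSniffer` and `DetectFormat` are hypothetical names introduced for this example.

```csharp
using System;
using System.Text;

public static class ImageSniffer
{
    // Returns a best-guess format name based on the leading bytes, or "unknown".
    public static string DetectFormat(byte[] data)
    {
        if (data == null) return "unknown";

        // JPEG files start with the bytes FF D8 FF.
        if (data.Length >= 3 && data[0] == 0xFF && data[1] == 0xD8 && data[2] == 0xFF)
            return "jpeg";

        // WebP files are RIFF containers: "RIFF", a 4-byte size, then "WEBP".
        if (data.Length >= 12
            && Encoding.ASCII.GetString(data, 0, 4) == "RIFF"
            && Encoding.ASCII.GetString(data, 8, 4) == "WEBP")
            return "webp";

        return "unknown";
    }
}
```

Running a check like this on the downloaded bytes of a `.jpg` link would report `"webp"` for a file such as the one described above, which makes the mismatch easy to confirm.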

@mikhail.evgrafov.aspose

Thanks for the quick and detailed answer!
I understand that the images cannot be viewed because of the wrong content type.
As long as the conversion time is fixed in 21.3, I am happy :smiley:
Thank you :slight_smile:

@Clemens_Pestuka

GroupDocs.Viewer for .NET v21.3, which includes the fix for this issue, has been published. You can find the new version at

Have a nice day!

Hi @Clemens_Pestuka

To set a resource loading timeout and prevent long rendering, please use the following code:

        HtmlViewOptions viewOptions = HtmlViewOptions.ForEmbeddedResources("result_{0}.html");

        // Limit how long the viewer waits for each external resource.
        LoadOptions loadOptions = new LoadOptions();
        loadOptions.ResourceLoadingTimeout = TimeSpan.FromSeconds(1);

        using (Viewer viewer = new Viewer(documentPath, loadOptions))
        {
            viewer.View(viewOptions);
        }

@mikhail.evgrafov.aspose

Thanks a lot for the very fast fix :+1:
I can confirm that it’s working fine with this code and the latest version.

@Clemens_Pestuka

You’re welcome!

@mikhail.evgrafov.aspose

Hi, I have a small follow-up question regarding this resource loading timeout:
Will this timeout only affect the time it takes to establish a connection, or would this timeout also terminate existing, but slow connections?
I’m wondering if it can be safely set to 1 second, or if this might interrupt slowly loading images?

Since the conversion is still rather slow for this document when I set the timeout to 10 seconds,
I was wondering whether, as a future improvement, multiple images could be retrieved at once.
This specific file has 24 images that all can’t be loaded. With the 10 seconds timeout, it took over 4 minutes to render, which strongly suggests that images are currently not loaded in parallel.

@Clemens_Pestuka

I believe it applies to both cases, as we're using WebClient under the hood and it is supposed to drop any connection regardless of the connection's state.

Yes, you're totally right. The resources are loaded sequentially, so the total loading time is the sum of the time taken to load each image. We've created an issue in our bug tracker to investigate whether we can add such a feature. The issue ID is VIEWERNET-3353. We'll notify you in case of any updates.
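
The numbers reported above fit sequential loading: 24 unreachable images with a 10-second timeout each add up to 24 × 10 s = 240 s, or about 4 minutes. For comparison, here is a minimal sketch of what concurrent loading with a per-resource timeout could look like. It is not how GroupDocs.Viewer is implemented; `ParallelResourceLoader`, `TryLoadAsync`, and `LoadAllAsync` are hypothetical names, and `HttpClient` is used here instead of `WebClient` purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelResourceLoader
{
    private static readonly HttpClient Client = new HttpClient();

    // Downloads one resource; returns null on an error status, a network failure, or a timeout.
    public static async Task<byte[]> TryLoadAsync(Uri uri, TimeSpan timeout)
    {
        // The token is cancelled once the timeout elapses. GetAsync buffers the
        // response body by default, so the limit covers both connecting and downloading.
        using (var cts = new CancellationTokenSource(timeout))
        {
            try
            {
                using (var response = await Client.GetAsync(uri, cts.Token))
                {
                    if (!response.IsSuccessStatusCode) return null;
                    return await response.Content.ReadAsByteArrayAsync();
                }
            }
            catch (Exception ex) when (ex is HttpRequestException || ex is OperationCanceledException)
            {
                return null;
            }
        }
    }

    // Starts all downloads at once; results are returned in the same order as the input.
    public static Task<byte[][]> LoadAllAsync(IEnumerable<Uri> uris, TimeSpan timeout)
    {
        return Task.WhenAll(uris.Select(u => TryLoadAsync(u, timeout)));
    }
}
```

With this approach, the total waiting time would in the best case be close to the slowest single resource rather than the sum of all of them, although per-host connection limits can still serialize part of the work.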

@vladimir.litvinchik

Thank you for the detailed description and for creating the feature request :+1:

@Clemens_Pestuka

You’re welcome!

Hi @Clemens_Pestuka

We are unable to load resources in parallel for two reasons:

  • the document has to be built sequentially
  • if we sent multiple requests to the host at once, it could look like a DDoS attack and our IP could get banned

However, we can check the host for availability, and if it is unavailable, we will not load resources from that host.
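
The host availability idea can be sketched roughly as follows. This is only an illustration under the assumption that a host is marked as unavailable after its first failed or timed-out request; how GroupDocs.Viewer actually performs the check is not described in this thread, and `HostAvailabilityFilter` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

public sealed class HostAvailabilityFilter
{
    // Host names are case-insensitive, so compare them that way.
    private readonly HashSet<string> _unavailableHosts =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase);

    // Call this after a request to the host has failed or timed out.
    public void MarkUnavailable(Uri uri)
    {
        _unavailableHosts.Add(uri.Host);
    }

    // Returns false for any resource whose host has already been marked unavailable.
    public bool ShouldLoad(Uri uri)
    {
        return !_unavailableHosts.Contains(uri.Host);
    }
}
```

In a sequential loader, each resource URL would be passed to `ShouldLoad` first, so that when many images point to the same unreachable host, only the first request has to wait for the timeout.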

@mikhail.evgrafov.aspose

Thanks for the update :slight_smile:
Even when building the page sequentially, loading images in parallel should not be a problem.
I’m not certain if loading resources in parallel will really be considered a DDOS attack. I mean browsers are for sure loading images in parallel as well.
The proposed solution sounds good as well :+1:

For the new improvement “Prevent loading resources if the host is unavailable”, I created the request VIEWERNET-3598 in our tracker. I will reply here in case of any updates.

@mikhail.evgrafov.aspose

Thank you :slight_smile:

@Clemens_Pestuka

You are welcome!

@Clemens_Pestuka

Improvement VIEWERNET-3598 will be included in the next release (21.10).
