PDF view problem

I save the pages of a PDF document as PNG images with the code below.
The code works for other PDFs, but the page images come out distorted for the PDF file I have attached.

Can you help troubleshoot the problem?

using GroupDocs.Viewer;
using GroupDocs.Viewer.Options;
using System;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

namespace CA.Imaging6
{
    /// <summary>
    /// Test setup:
    /// console application (.NET Framework 4.7.2),
    /// GroupDocs.Total 24.2.0.0.
    /// </summary>
    internal class ViewTest
    {
        public string SourceFile = "sample.pdf";

        public void Do()
        {            
            ImagingUtil.UnlockGroupDocs();

            #region Init Pages folder
            string pagesDir = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "Pages");
            if (!Directory.Exists(pagesDir))
                Directory.CreateDirectory(pagesDir);
            else
                foreach (var pageFile in Directory.GetFiles(pagesDir, "*.*"))
                    File.Delete(pageFile);
            #endregion

            byte[] content = File.ReadAllBytes(SourceFile);
            var pages = Enumerable.Range(1, 14).ToList();
            Parallel.ForEach(pages, (page) =>
            {
                var options = new LoadOptions(FileType.FromExtension("PDF"));
                using (var ms = new MemoryStream(content))
                {
                    using (var viewer = new GroupDocs.Viewer.Viewer(() => ms, () => options))
                    {
                        var viewOptions = new PngViewOptions(Path.Combine(pagesDir,"page{0}.png"));
                        viewer.View(viewOptions, page);
                    }
                }
            });

            Console.ReadLine();
        }
    }
}

sample.pdf (827.8 KB)

@fsimsek

Can you please share which version of GroupDocs.Viewer you’re using?

GroupDocs.Total 24.2.0.0

@fsimsek

I overlooked that you had included the version in the code. Thank you for attaching the sample file. We’ll take a look and update you.

The sample code was written to reproduce the situation that actually causes the problem. In the real case, the application is a web application, and the pages of the source document are returned asynchronously in response to separate web requests.
Therefore I cannot render all pages in a single call, as follows.

            using (var ms = new MemoryStream(content))
            {
                using (var viewer = new GroupDocs.Viewer.Viewer(() => ms, () => options))
                {
                    var viewOptions = new PngViewOptions(Path.Combine(pagesDir, "page{0}.png"));
                    viewer.View(viewOptions, pages.ToArray());
                }
            }

@fsimsek

I have reproduced the issue with distorted pages when creating pages using Parallel.ForEach. Unfortunately, I can’t provide any workaround for this issue at the moment other than rendering all pages synchronously.

Please check the following code, which you can use to render one page synchronously:

 var fileName = "sample.pdf";
 var fileContent = File.ReadAllBytes(fileName);
 var fileStream = new MemoryStream(fileContent);
 var loadOptions = new LoadOptions(FileType.PDF);
 
 var pageNumber = 1;
 var dstStream = new MemoryStream();
 
 // Dispose the viewer even if rendering throws.
 using (var viewer = new Viewer(fileStream, loadOptions))
 {
     // SavePage is the extension method defined below.
     viewer.SavePage(dstStream, pageNumber);
 }
 
 // Rewind so the caller can read the PNG from the start.
 dstStream.Position = 0;
 
 File.WriteAllBytes($"page-{pageNumber}.png", dstStream.ToArray());
 
 return dstStream;

//

 public static class ViewerExtensions
 {
     public static void SavePage(this Viewer viewer, Stream dstStream, int pageNumber)
     {
         var pageStreamFactory = new SinglePageStreamFactory(dstStream);
         var viewOptions = new PngViewOptions(pageStreamFactory);
         viewer.View(viewOptions, pageNumber);
     }
 }
 
 public class SinglePageStreamFactory 
     : IPageStreamFactory
 {
     private readonly Stream _dstStream;
 
     public SinglePageStreamFactory(Stream dstStream)
     {
         _dstStream = dstStream;
     }
 
     public Stream CreatePageStream(int pageNumber) => _dstStream;
     
     public void ReleasePageStream(int pageNumber, Stream pageStream) {  }
 }

The View() method executes synchronously, so execution in your sync or async method continues only after View() has finished. You can therefore use the code above to create all the pages, or a single page, and return the result.
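For a web application where each page is requested separately, one way to keep rendering sequential is to put a gate around the rendering call. The following is only a minimal sketch: `RenderGate` and `RenderPageAsync` are hypothetical names, not part of GroupDocs, and it assumes that letting one render run at a time within a single process is enough to avoid the distortion, which has not been confirmed for this file.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not a GroupDocs API): allows one render at a time per process.
public static class RenderGate
{
    // A semaphore with a single slot; concurrent requests wait here without blocking a thread.
    private static readonly SemaphoreSlim Gate = new SemaphoreSlim(1, 1);

    // renderPage writes one rendered page into the stream it receives.
    public static async Task<byte[]> RenderPageAsync(Action<Stream> renderPage)
    {
        if (renderPage == null) throw new ArgumentNullException(nameof(renderPage));

        await Gate.WaitAsync().ConfigureAwait(false);
        try
        {
            using (var dst = new MemoryStream())
            {
                renderPage(dst);      // runs synchronously while the gate is held
                return dst.ToArray(); // copy of the rendered bytes
            }
        }
        finally
        {
            Gate.Release();           // always free the slot, even if rendering throws
        }
    }
}
```

A request handler could then call something like `await RenderGate.RenderPageAsync(dst => viewer.SavePage(dst, pageNumber))`, with the viewer created inside the delegate as in the code above. Requests still queue behind each other, so this trades throughput for correct output.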

In future versions we’ll improve our public API with new methods that are less confusing and work similarly to the SavePage method.

Please let us know if it works for you.

Hi,
Rendering PDF pages synchronously is something we would rather avoid for the sake of application performance. As I wrote before, this problem does not occur with other PDF files (or with other formats such as DOC, TIF, etc.). What is the problematic property of this file? If you can share it, maybe I can build an alternative path that renders synchronously only for PDF files with this property.

@fsimsek

At the moment I can’t say what is so special about this file. Possibly the other files you have are not scans but regular PDF files with text. The other formats are rendered by different engines, so in most cases the issues are specific to a format or even to a single file.

Our product is a content management system, and it stores documents in all formats, including every kind of PDF, with or without text.
Since this error occurs specifically for this file, I think you can find the cause of the problem by debugging it. Otherwise, we will have to wait for this problem to be resolved in GroupDocs.Total.

@fsimsek

Sure, as soon as we have any new information we’ll let you know.