Converting an XLSX to TIFF does not observe page breaks

Our customer was having an issue with conversion not observing the page breaks they inserted into their spreadsheet. I created a rather extreme but much smaller version of their document to include here as well as some simplified code to attempt the conversion. In all cases attempted, one page was produced. I could not find any options in SpreadsheetLoadOptions that would seem to help.

  public static void DoesNotObserveSpreadSheetPageBreaks(string sourceFile, string destFile)
  {
     var convertOptions = new ImageConvertOptions
     {
        Format = ImageFileType.Tiff,
        Grayscale = false,
        TiffOptions = { Compression = TiffCompressionMethods.Ccitt4 },
        HorizontalResolution = 400,
        VerticalResolution = 400,
     };

     using (var converter = new Converter(() => File.OpenRead(sourceFile))) 
     {
        converter.Convert(() => new FileStream(destFile, FileMode.OpenOrCreate), convertOptions);
     }
  }

pagebreaks.zip (7.2 KB)

Thanks
-Jonathan

@jisabell

Using the code, source file you shared and API version 23.7, we are getting this output.zip (3.0 KB). Please let us know if this is the expected result.

That’s great to hear. We are using 23.9. Any idea why we’d get different results?

In the meantime, I will double check my test’s GroupDocs.Conversion.dll.

Apologies. My csproj has a HintPath for GroupDocs.Conversion.dll that doesn’t exist in the packages folder, so the build is falling back on the DLL already in the output directory.

Hopefully, once I get paths in order, I’ll be on the version I actually thought I was.
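
One quick way to confirm which copy of a DLL a build actually loads is to print the assembly's version and on-disk location at runtime. The following is a minimal sketch using only standard .NET reflection; `LoadedAssembly.Describe` is a hypothetical helper name, not part of GroupDocs.Conversion.

```csharp
using System;
using System.Reflection;

internal static class LoadedAssembly
{
    // Hypothetical helper: reports the name, version, and file path of the
    // assembly that defines the given type.
    public static string Describe(Type type)
    {
        Assembly assembly = type.Assembly;
        AssemblyName name = assembly.GetName();
        return $"{name.Name} {name.Version} loaded from {assembly.Location}";
    }
}
```

In the test project, calling `Console.WriteLine(LoadedAssembly.Describe(typeof(GroupDocs.Conversion.Converter)));` before converting should show whether the 21.8 or 23.9 DLL was picked up.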

Thanks

Alright. For some reason, when my project targeted net452, NuGet did not put in the right path. I upped it to net462 and now it’s using the new DLL; however, it looks like a few types have disappeared.

Are there new equivalents for these values we were previously using:

  1. ImageFileType.Svg (or do I have to use .Svgz?)
  2. MarkupConvertOptions class
  3. PdfFileType.Ps
  4. PdfFileType.Pcl
  5. PdfFileType.Xps
    image.png (75.0 KB)

Thanks
-Jonathan

Could you please share your resultant file? Is there any evaluation watermark (in case you are not using license)?

Sorry, but we need some clarification here. Are you facing this issue with 23.9 and .NET 4.6.2? If yes, could you please share a simple console application?

Sorry. To clarify, when my project was set to .NET 4.5.2, NuGet was not setting the path correctly. Updating my projects to .NET 4.6.2 resolved this.

Additionally, I was having trouble compiling because of renamed (and perhaps removed) classes in 23.9. For example, it seems MarkupConvertOptions became WebConvertOptions and MarkupFileType became WebFileType.

All that said, I think my mistake was SpreadsheetLoadOptions.OnePagePerSheet. I had this option set to true. I had been told setting it to false did not help but clearly we had a miscommunication here because that does seem to be working for me now.

I will try going back to our current version of GroupDocs.Conversion 21.8.0 and start testing from the beginning being sure OnePagePerSheet is false.

Thank you.

Alright. So they are right: with OnePagePerSheet set to false, their document converts to 2 pages, whereas Excel shows 3. My small test document honors all of its page breaks, but theirs produces 2 pages of TIFF output instead of the expected 3.

Please try this document with the original test code. I am back to using 21.8 but I think the difference is the input document.
RF-38291_3pageFax.zip (20.9 KB)

Thank you and sorry for the confusion.

@jisabell

Please share the problematic output (the one with only two TIFF pages). We still cannot reproduce this issue on our end using the code provided above. Could you please also share a sample application with which the issue can be reproduced?

Here is the output we get.
00000000.zip (79.1 KB)

The code:
using System;
using System.IO;
using GroupDocs.Conversion.FileTypes;
using GroupDocs.Conversion.Options.Convert;
using GroupDocs.Conversion.Options.Load;

namespace TestApp
{
   internal static class Program
   {
      static void Main(string[] args)
      {
         try
         {
            var convertOptions = new ImageConvertOptions
            {
               Format = ImageFileType.Tiff,
               Grayscale = true,
               TiffOptions = { Compression = TiffCompressionMethods.Ccitt4 },
               HorizontalResolution = 203,
               VerticalResolution = 200,
               Brightness = 0,
               Contrast = 0,
               Gamma = 0,
               PageNumber = 1,
               PagesCount = 1000000,
               RotateAngle = 0,
               UsePdf = false
            };
            var loadOptions = new SpreadsheetLoadOptions { OnePagePerSheet = false };
            using (var converter = new GroupDocs.Conversion.Converter(() => File.OpenRead("RF-38291_3pageFax.xls"), () => loadOptions))
            {
               var i = 0;
               converter.Convert(() => new FileStream($"{i++:x8}.tif", FileMode.OpenOrCreate), convertOptions);
            }
         }
         catch (Exception e)
         {
            Console.WriteLine(e);
         }
         finally
         {
            Console.WriteLine("Press a key to exit.");
            Console.ReadKey();
         }
      }
   }
}

When I try to use the 23.9 version, I get this exception. Is there a way to tell what type’s having the issue? It appears to be obfuscated.

GroupDocs.Conversion.Exceptions.GroupDocsConversionException: The type initializer for ' ' threw an exception.
at ?.()
at GroupDocs.Conversion.Converter.()
at GroupDocs.Conversion.Converter.Convert(Func`1 document, ConvertOptions convertOptions)
at Converters.Aspose.AsposeConverter.Convert(ConvertParameters parameters)

Additional info from the debugger. The stack looks like this:

mscorlib.dll!System.DefaultBinder.SelectMethod(System.Reflection.BindingFlags bindingAttr, System.Reflection.MethodBase[] match, System.Type[] types, System.Reflection.ParameterModifier[] modifiers) Line 367 C#
mscorlib.dll!System.RuntimeType.GetMethodImpl(string name, System.Reflection.BindingFlags bindingAttr, System.Reflection.Binder binder, System.Reflection.CallingConventions callConv, System.Type[] types, System.Reflection.ParameterModifier[] modifiers) Line 2614 C#
mscorlib.dll!System.Type.GetMethod(string name, System.Reflection.BindingFlags bindingAttr, System.Reflection.Binder binder, System.Reflection.CallingConventions callConvention, System.Type[] types, System.Reflection.ParameterModifier[] modifiers) Line 922 C#
GroupDocs.Conversion.dll! .(int , ) Unknown
GroupDocs.Conversion.dll! .(int ) Unknown
GroupDocs.Conversion.dll! .( ) Unknown
GroupDocs.Conversion.dll! . ( , ) Unknown
GroupDocs.Conversion.dll! .() Unknown
GroupDocs.Conversion.dll! .(bool ) Unknown
GroupDocs.Conversion.dll! .(object[] , System.Type[] , System.Type[] , object[] ) Unknown
GroupDocs.Conversion.dll! ..MoveNext() Unknown
GroupDocs.Conversion.dll! .() Unknown
GroupDocs.Conversion.dll!GroupDocs.Conversion.Converter.() Unknown
GroupDocs.Conversion.dll!GroupDocs.Conversion.Converter.Convert(System.Func<System.IO.Stream> document, GroupDocs.Conversion.Options.Convert.ConvertOptions convertOptions) Unknown
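
For a "type initializer threw an exception" error, the outer message often hides the real cause, which sits further down the InnerException chain. The sketch below walks that chain using only standard .NET APIs; `ExceptionInspector.DescribeChain` is a hypothetical helper name. With an obfuscated assembly the reported type name may still be unreadable, but the innermost exception's message usually is not.

```csharp
using System;
using System.Collections.Generic;

internal static class ExceptionInspector
{
    // Hypothetical helper: yields one description per level of the
    // InnerException chain, outermost first.
    public static IEnumerable<string> DescribeChain(Exception exception)
    {
        for (Exception current = exception; current != null; current = current.InnerException)
        {
            // TypeInitializationException.TypeName is the fully qualified name
            // of the type whose static initializer failed.
            var typeInit = current as TypeInitializationException;
            yield return typeInit != null
                ? $"{current.GetType().Name}: failing type = '{typeInit.TypeName}'"
                : $"{current.GetType().Name}: {current.Message}";
        }
    }
}
```

In the test app above, replacing `Console.WriteLine(e);` in the catch block with `foreach (var line in ExceptionInspector.DescribeChain(e)) Console.WriteLine(line);` would print each level on its own line.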

Thanks

@jisabell

We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): CONVERSIONNET-6381

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.

Please specify the .NET version where you get this exception. It would be great if you can share a screencast/short video of this issue.

I had to update to .NET 4.6.2 to get the paths to work, so I was running with that framework version. This customer is on an older version of our product, so we need to limit changes in framework, etc.

Since it seems to work with their sample doc if I use SpreadsheetLoadOptions.SkipEmptyRowsAndColumns = false, I’ve moved on to our next defect. But it would be nice if we could use this option (set to true) and not lose a page break.

Additionally, I would like to update to your latest version for general bug fixes and new features so I will try to get a task allocated for me to spend more time on this and hopefully get you that short video.

Thanks

@jisabell

We’ll look into this scenario and notify you in case of any update.

Thanks. We’ll look forward to that.

@jisabell

Please use the Convert overload that converts page by page. You are currently using the overload that converts the whole document, so the result is a single multipage TIFF.

Try the following snippet:

const string source = "pagebreaks.xlsx";

var loadOptions = new SpreadsheetLoadOptions { OnePagePerSheet = false, SkipEmptyRowsAndColumns = true };

using (var converter = new Converter(() => File.OpenRead(source), () => loadOptions))
{
    var convertOptions = new ImageConvertOptions
    {
        Format = ImageFileType.Tiff,
        Grayscale = true,
        TiffOptions = { Compression = TiffCompressionMethods.Ccitt4 },
        HorizontalResolution = 203,
        VerticalResolution = 200,
        Brightness = 0,
        Contrast = 0,
        Gamma = 0,
        PageNumber = 1,
        PagesCount = 1000000,
        RotateAngle = 0,
        UsePdf = false
    };
    converter.Convert((int page) => new FileStream($"converted-{page:x8}.tif", FileMode.OpenOrCreate), convertOptions);
}

Please note the call to the Convert method:

converter.Convert((int page) => new FileStream($"converted-{page:x8}.tif", FileMode.OpenOrCreate), convertOptions);

Let us know if the issue persists.