Encoding problems when converting xls to csv

Hello

I need to convert xls and xlsx files to csv. The encoding is not set properly in the converted csv. If the xlsx is converted to pdf instead, it works well, with no encoding problem.

I tried to find a way of specifying the encoding on SpreadsheetLoadOptions but I could not find one.
I also tried to find a way of setting the encoding on SpreadsheetConvertOptions but I couldn’t.

Here is the code I use:

SpreadsheetLoadOptions option = new SpreadsheetLoadOptions() { CultureInfo = new System.Globalization.CultureInfo("fr-FR"), Format = SpreadsheetFileType.Xlsx };

using (Converter converter = new Converter(fileName, () => option))
{
    SpreadsheetConvertOptions options = new SpreadsheetConvertOptions();
    options.Format = SpreadsheetFileType.Csv;
    options.Separator = ';';

    converter.Convert(outputFileName, options);
}

Is there a way of converting to csv with the correct encoding? Or is this functionality not available?

Thanks for your help

Yannick

@Yannick001

XLS and XLSX files are converted to CSV with the SpreadsheetConvertOptions class, but the current API does not appear to offer a direct way to specify the output encoding. The basic conversion is shown in the example below.

Here’s a sample code snippet that demonstrates how to convert an XLSX file to CSV format:

using System;
using System.IO;
using GroupDocs.Conversion.Options.Convert;

namespace GroupDocs.Conversion.Examples.CSharp.BasicUsage
{
    internal static class ConvertXlsxToCsv
    {
        public static void Run()
        {
            string outputFolder = Constants.GetOutputDirectoryPath();
            string outputFile = Path.Combine(outputFolder, "xlsx-converted-to.csv");

            // Load the source XLSX file
            using (var converter = new GroupDocs.Conversion.Converter("path_to_your_file.xlsx"))
            {
                SpreadsheetConvertOptions options = new SpreadsheetConvertOptions 
                { 
                    Format = GroupDocs.Conversion.FileTypes.SpreadsheetFileType.Csv 
                };
                // Save converted CSV file
                converter.Convert(outputFile, options);
            }

            Console.WriteLine("\nConversion to CSV completed successfully. \nCheck output in {0}", outputFolder);
        }
    }
}

In this example, replace "path_to_your_file.xlsx" with the actual path to your XLSX file. The conversion will create a CSV file in the specified output directory.

If you are experiencing encoding issues, consider checking the default encoding settings of your environment or the application that opens the CSV file, as the conversion process itself does not provide an option to set encoding directly.
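
As a possible workaround while the API has no encoding option, the CSV could be re-encoded right after conversion. The sketch below rests on an assumption that is not confirmed in this thread: that the converter writes UTF-8 without a byte order mark, which Excel may then read as a legacy code page. CsvEncodingFix and RewriteAsUtf8WithBom are hypothetical names, not part of GroupDocs.Conversion; only System.IO and System.Text are used.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not part of GroupDocs.Conversion.
internal static class CsvEncodingFix
{
    // Rewrites the file in place as UTF-8 with a byte order mark (BOM).
    public static void RewriteAsUtf8WithBom(string csvPath)
    {
        // Assumption: the converted file is UTF-8 without a BOM.
        // If it is in another encoding, pass that encoding here instead.
        string text = File.ReadAllText(csvPath, new UTF8Encoding(false));

        // UTF8Encoding(true) makes File.WriteAllText emit the BOM first.
        File.WriteAllText(csvPath, text, new UTF8Encoding(true));
    }
}
```

It would be called once after converter.Convert(outputFileName, options), for example CsvEncodingFix.RewriteAsUtf8WithBom(outputFileName). If the output turns out to be in a different encoding, this sketch would need the matching source encoding to avoid corrupting accented characters.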

For more detailed information about converting XLSX to CSV, you can refer to the official documentation here.

If you continue to face issues, please let me know, and I can assist you further!

Sources:
[1]: ConvertXlsxToCsv.cs

Sorry, but I cannot change the default encoding on the server environment.

Would it be possible to provide such functionality in a future version? I saw it’s available for the SpreadsheetLoadOptions() method.

Thanks,

Yannick

@Yannick001

We are investigating the possibility of adding this feature. Your investigation ticket ID is CONVERSIONNET-7156. You’ll be notified in case of any update.

@Yannick001

Could you please share the problematic XLS file and the resultant CSV?

Hello

Attached is a zip file containing 5 files:

  • original.xlsx : the original xlsx file
  • converted.csv : the converted file (with incorrect encoding)
  • convertedAndEncodingChangedManually.csv : the converted file, opened and manually saved with the correct encoding in Notepad++
  • OpenedInExcelWithIncorrectEncoding.JPG : a screen capture of the file with incorrect encoding opened in Excel
  • OpenedInExcelWithCorrectEncoding.JPG : a screen capture of the file with correct encoding opened in Excel

files.zip (23.4 KB)

@Yannick001

Thanks for the details. We’ll continue our investigation and notify you in case of any update.

The issues you have found earlier (filed as CONVERSIONNET-7156) have been fixed in this update. This message was posted using Bugs notification tool by nikola.yankov