It seems that text encoding is not always detected correctly.
When converting the attached “Text normal.txt” or “Partially working.txt” to HTML with default options, the result is not correct:
image.png (5.9 KB)
image.png (1.3 KB)
When converting to PDF with GroupDocs.Conversion, the attached “Partially working.txt” looks perfectly fine (see “Partially working.txt.pdf”).
However, “Text normal.txt” does not (see “Text normal.txt.pdf”).
When the file starts with a byte order mark (BOM), the encoding is detected correctly (see “Text with BOM.txt”).
Text encoding.zip (40.7 KB)
I know that it’s possible to specify the encoding myself in the LoadOptions, but I don’t know the encoding in advance. Is it possible to improve the Viewer’s encoding detection, at least to the level of Conversion?
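For context, the kind of detection I have in mind can be sketched as follows. This is only a rough illustration in Python, not how GroupDocs detects encodings: the function name `guess_encoding` is hypothetical, and the `cp1252` fallback is an assumption that would need to match the actual source of the files. It checks for a BOM first, then tests whether the bytes are valid UTF-8, and only then falls back to a legacy code page.

```python
import codecs

# BOM signatures; UTF-32 is listed before UTF-16 because the
# UTF-32 LE BOM begins with the same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(data: bytes, fallback: str = "cp1252") -> str:
    """Return a Python codec name for the given raw file content.

    Heuristic only: BOM first, then strict UTF-8 validation, then an
    assumed legacy code page (the fallback is a guess, not a detection).
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    try:
        # Strict decoding fails on byte sequences that are not valid UTF-8.
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return fallback
```

A result like this could then be supplied through the LoadOptions mentioned above, but a heuristic of this sort can still guess wrong for legacy single-byte encodings, which is why built-in detection would be preferable.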
GroupDocs Viewer and Conversion 22.11 were used for testing.