PDF/UA support when converting to HTML in Viewer for .NET

Hi,

I was wondering to what extent PDF/UA features are still supported and usable after conversion to HTML for display in our WebViewer.
We found this article, which describes in detail the challenges of converting a Word file into a PDF:

Here it states:

„Please note that PDF/UA-1 output will also be WCAG 2.0 and Section 508 compliant.”

Is the HTML output from GroupDocs.Viewer also WCAG 2.0 and Section 508 compliant when the original PDF/UA was?
How could we turn on accessibility features during conversion?
For example, increasing the text size and adding alt text for images and hyperlinks.

Best regards,
Clemens

Hi @Clemens_Pestuka

Please note that PDF/UA is a standard that describes PDF documents, not HTML documents. I think that in terms of HTML markup and W3C there is no such thing as “WCAG 2.0 and Section 508 compliance”.

As far as I understand, you’re asking whether the resulting HTML, produced from a PDF/UA input, is editable, so that you can change things in the generated markup such as hyperlinks, text size, and alt text for images. The answer depends on where you want to make these changes. Of course, you can open the resulting HTML file in Notepad or even in the browser developer tools and change anything you want there.
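If you go the manual route, a small script can at least tell you which images need attention before you start editing. The following is only a rough sketch in Python using the standard library; it is not part of GroupDocs.Viewer, and the names `MissingAltFinder` and `find_images_missing_alt` are made up for this illustration. It assumes you have the generated HTML available as a string.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collects the src of every <img> tag that has no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags (<img ... />) also end up here, because the
        # default handle_startendtag delegates to handle_starttag.
        if tag != "img":
            return
        attr_map = dict(attrs)
        if "alt" not in attr_map:
            self.missing.append(attr_map.get("src", "(no src)"))


def find_images_missing_alt(html_text):
    """Return the src values of all images that lack an alt attribute."""
    finder = MissingAltFinder()
    finder.feed(html_text)
    finder.close()
    return finder.missing


if __name__ == "__main__":
    # Hypothetical fragment; real converter output will look different.
    sample = '<div><img src="bg.png"><img src="logo.svg" alt="Company logo"/></div>'
    for src in find_images_missing_alt(sample):
        print("missing alt:", src)
```

This only reports images without an `alt` attribute; deciding what the alt text should say still has to be done by hand.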

But if you’re trying to edit the produced HTML document in a browser-based online WYSIWYG HTML editor like TinyMCE or CKEditor, then there will be problems. The main goal of the GroupDocs.Viewer internal PDF-to-HTML converter is to produce an HTML document that looks exactly the same in the browser as the original PDF document does in any PDF reader. Exactness of visual representation is its main goal and feature. I think you’re already familiar with the “internals” of the produced HTML markup and know what is inside and what we do in order to achieve this goal: a background image, into which all non-standard elements, formatting, layout and so on are baked, tons of small raster and vector images, and so on.

If you need editable and “transparent” HTML that has a meaningful internal structure, that is not “fixed-layout” but a flexible “float-layout”, with document sections and paragraphs, where lists are truly lists and not single-line paragraphs that look like list items, and where tables are truly tables and not a set of lines drawn on a background image around absolutely positioned glyphs, then GroupDocs.Viewer is not the best choice. You should look at GroupDocs.Editor instead.

Unlike GroupDocs.Viewer, the main goal of GroupDocs.Editor is to produce editable HTML markup, where exactness of visual representation is less important.

With best regards,
Denis Gvardionov


@denisgvardionov

Hi Denis,

Thanks a lot for the very detailed answer, much appreciated :+1:
I gave GroupDocs.Editor a try and agree that it makes sense for that scenario.

Best regards,
Clemens
