Merged PDF documents missing Accessibility standards

pankajgupta · July 30, 2025, 8:08am

Hi @AlekseiSemenchenko, I tried your code as well with Merger 25.7. I still see the same accessibility failures.

AlekseiSemenchenko · July 30, 2025, 3:24pm

Hi,

To ensure we’re analyzing the same scenario and understand why you’re seeing different results with the same product version,
let’s align our source code and environments using the shared GitHub repository.

Please download and run the test project from the provided repository on your setup.

Before executing, make sure to copy the following input files with exact file names into the /SampleFiles/pdf folder:

Condizioni_Generali_Fornitura_Estra.pdf
Informativa_Privacy_Qualita_Commerciale.pdf
Lettera_Accompagnamento_Plico_Contr.pdf
Modulo_Dati_Catastali_Estra.pdf
Modulo_Reclamo_Estra.pdf
Modulo_Ripensamento_Estra.pdf
Proposta_Fornitura_Estra_Luce.pdf

We did not include them in the public repository because they may contain sensitive information.
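Since the file names must match exactly, a quick pre-flight check can confirm that everything is in place before the test project runs. This is only a sketch using the JDK: the class `SampleFilesCheck` and its method are hypothetical helpers, not part of the repository, and the folder path is assumed to be relative to the project root.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Pre-flight check: reports which expected input PDFs are missing from a folder. */
final class SampleFilesCheck {

    // File names exactly as listed above; the comparison is case-sensitive on most Linux file systems.
    static final List<String> EXPECTED = List.of(
            "Condizioni_Generali_Fornitura_Estra.pdf",
            "Informativa_Privacy_Qualita_Commerciale.pdf",
            "Lettera_Accompagnamento_Plico_Contr.pdf",
            "Modulo_Dati_Catastali_Estra.pdf",
            "Modulo_Reclamo_Estra.pdf",
            "Modulo_Ripensamento_Estra.pdf",
            "Proposta_Fornitura_Estra_Luce.pdf");

    /** Returns the expected file names that are not regular files inside the given folder. */
    static List<String> missingFiles(Path folder) {
        List<String> missing = new ArrayList<>();
        for (String name : EXPECTED) {
            if (!Files.isRegularFile(folder.resolve(name))) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumption: run from the project root, or pass the folder as the first argument.
        Path folder = Path.of(args.length > 0 ? args[0] : "SampleFiles/pdf");
        List<String> missing = missingFiles(folder);
        System.out.println(missing.isEmpty() ? "All input files found." : "Missing: " + missing);
    }
}
```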

Please note: the first source file passed to the constructor (e.g., Condizioni_Generali_Fornitura_Estra.pdf)
heavily influences the accessibility structure of the resulting document:

Merger merger = new Merger(firstPdfInputStream);

If this initial file contains accessibility issues, they may be propagated into the output file.
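Because the first file carries this much weight, it can help to make the input order explicit, so that the file you trust most is the one passed to the constructor. A minimal sketch, assuming you decide yourself which file that is; `InputOrder` and `withFirst` are hypothetical names and not part of GroupDocs.Merger:

```java
import java.util.ArrayList;
import java.util.List;

/** Reorders merge inputs so that a chosen file comes first. */
final class InputOrder {

    /**
     * Returns a copy of inputs with the preferred entry moved to index 0.
     * All other entries keep their relative order.
     */
    static List<String> withFirst(List<String> inputs, String preferred) {
        if (!inputs.contains(preferred)) {
            throw new IllegalArgumentException("Not in input list: " + preferred);
        }
        List<String> ordered = new ArrayList<>(inputs);
        ordered.remove(preferred);   // removes the first occurrence only
        ordered.add(0, preferred);   // the preferred file now leads the list
        return ordered;
    }
}
```

The first element of the returned list would then be used to construct the Merger, and the remaining files added afterwards in list order.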

The two errors you mentioned are actually present in the original input files and were therefore carried over into the final merged result.
In your earlier attempt, we suspect the mismatch occurred due to either:

an outdated or mismatched library version
differences in input files (e.g., the previously generated output had 19 pages instead of 18), suggesting different source content or structure.
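One way to rule out differing input files is for both sides to compare checksums of each source PDF. The following is a sketch using only the JDK; the class `InputChecksums` is a hypothetical helper and not part of the test project.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Prints SHA-256 checksums so two parties can confirm they use identical input files. */
final class InputChecksums {

    /** Returns the lowercase hex SHA-256 digest of the given bytes. */
    static String sha256Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(data)) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    /** Returns the lowercase hex SHA-256 digest of a file's content. */
    static String sha256Hex(Path file) throws IOException {
        return sha256Hex(Files.readAllBytes(file));
    }

    public static void main(String[] args) throws IOException {
        // Pass the input PDF paths as arguments; one line is printed per file.
        for (String arg : args) {
            System.out.println(sha256Hex(Path.of(arg)) + "  " + arg);
        }
    }
}
```

If the checksums match on both sides, any remaining difference in the output is more likely to come from the library version or the environment.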

Looking ahead, the upcoming version 25.8 will include an optional automatic correction mechanism designed to detect and fix such structural issues during processing.
This feature should improve stability when dealing with input files that contain known accessibility or structure problems.

Let me know once you’ve run the test — we’ll be able to compare results more accurately then.

Thanks!

UPD: Please share as many details about your environment as possible:

Operating system
Java version
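These details can be read from standard JVM system properties. A small sketch (the class name `EnvReport` is hypothetical) that prints them in a form you can paste into a reply:

```java
/** Collects basic environment details from standard JVM system properties. */
final class EnvReport {

    /** Returns a two-line summary of the operating system and the Java runtime. */
    static String describe() {
        return "OS: " + System.getProperty("os.name")
                + " " + System.getProperty("os.version")
                + " (" + System.getProperty("os.arch") + ")\n"
                + "Java: " + System.getProperty("java.version")
                + " (" + System.getProperty("java.vendor") + ")";
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```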

pankajgupta · August 4, 2025, 6:01am

hey @AlekseiSemenchenko
I used the same code you provided, with the same sequence of files. Still, the result is not what you described. It gives me the same output as above: response (2).pdf
Regarding env details:

Environment OS: Mac and Linux
Java version: 17

Would it be possible to have a joint call so we can check this together?

pankajgupta · August 5, 2025, 5:22am

hey @AlekseiSemenchenko : I am now able to merge the files and get the same output as you. However, we feel that relying on the first file to determine the maximum accessibility is not the right strategy. The merge should take the maximum across all files.

yuriy.mazurchuk · August 5, 2025, 6:15pm

Hi @pankajgupta!

You’re correct — the current fix in version 25.7 is not the ideal long-term approach. Our priority was to provide you with a quick solution as soon as possible.

For the upcoming 25.8 release, we’re working on a more general solution that will preserve accessibility artifacts from all source pages, regardless of which files are merged, to achieve the highest possible accessibility compliance. As part of this effort, we’ll also verify whether the PDF standards allow transferring these accessibility artifacts from the source documents to the corresponding pages in the destination document.

We’ll keep you updated as we make progress.

Thank you for your feedback!

pankajgupta · August 11, 2025, 11:14am

Hi @yuriy.mazurchuk , @AlekseiSemenchenko ,
We are encountering:
error: error reading bazel-out/k8-fastbuild/bin/external/com/com/groupdocs/groupdocs-merger/25.7/header_groupdocs-merger-25.7.jar; Unsupported size: 16667311 for JarEntry META-INF/MANIFEST.MF. Allowed max size: 16000000 bytes. You can use the jdk.jar.maxSignatureFileSize system property to increase the default value.

This is similar to: Groupdocs comparison jar SignatureFileSize issue
Can you help us with it? We aren’t able to consume this version due to its size.

AlekseiSemenchenko · August 12, 2025, 6:51am

Hi @pankajgupta,

Thanks for flagging this. We’ve published 25.7.1, which resolves the

Unsupported size … META-INF/MANIFEST.MF (max 16000000)

error. Please update to 25.7.1 and let us know if everything works on your side.
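To check whether a particular jar is affected, the size of its manifest entry can be compared with the limit quoted in the error message. This is a sketch using `java.util.zip` from the JDK. The class `ManifestSizeCheck` is a hypothetical helper, and the 16,000,000-byte default is taken from the error text above rather than from the library.

```java
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

/** Reports the size of META-INF/MANIFEST.MF inside a jar file. */
final class ManifestSizeCheck {

    // Default limit as quoted in the error message ("Allowed max size: 16000000 bytes").
    static final long DEFAULT_LIMIT_BYTES = 16_000_000L;

    /** Returns the uncompressed manifest size in bytes, or -1 if the entry is absent or its size is unknown. */
    static long manifestSize(String jarPath) throws IOException {
        // ZipFile reads the archive directory only; no signature verification is performed here.
        try (ZipFile zip = new ZipFile(jarPath)) {
            ZipEntry entry = zip.getEntry("META-INF/MANIFEST.MF");
            return entry == null ? -1 : entry.getSize();
        }
    }

    /** Assumption: a size strictly greater than the limit is rejected. */
    static boolean exceedsDefaultLimit(long sizeBytes) {
        return sizeBytes > DEFAULT_LIMIT_BYTES;
    }

    public static void main(String[] args) throws IOException {
        long size = manifestSize(args[0]);
        System.out.println("MANIFEST.MF size: " + size + " bytes, exceeds default limit: "
                + exceedsDefaultLimit(size));
    }
}
```

As the error message itself notes, the `jdk.jar.maxSignatureFileSize` system property can raise the limit where upgrading is not immediately possible.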

pankajgupta · August 12, 2025, 8:33am

Thank you @AlekseiSemenchenko for the quick response. I have verified accessibility locally with 25.7.1 and see no side effects. We will pull it into our dev environment and share feedback. Thanks.

pankajgupta · August 29, 2025, 10:56am

hey @AlekseiSemenchenko : When can we expect 25.8 ?

AlekseiSemenchenko · August 30, 2025, 12:34pm

Hello, within two weeks

pankajgupta · September 11, 2025, 8:11am

hey @AlekseiSemenchenko : Can you please confirm the date? It has not been released yet.
cc: @amitdash

AlekseiSemenchenko · September 11, 2025, 9:35am

Tentatively September 14–20. We are putting in maximum effort to release as quickly as possible.

pankajgupta · September 30, 2025, 3:37am

Hi @AlekseiSemenchenko : It’s been too long and we are still waiting for a solution. Can you share when it will be available?

AlekseiSemenchenko · September 30, 2025, 10:02am

Hello!
Sorry for the delay. Version 25.9 is now available.

It includes a new feature — Automatically create PDF document logical structure tags, which allows you to automatically generate logical structure tags in PDFs to improve accessibility.

Here’s an example of how to use it:

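String fileIn = "...";   // path to the source PDF
String fileOut = "...";  // path for the tagged output PDF

// Enable automatic generation of logical structure tags on save
PdfSaveOptions pdfSaveOptions = new PdfSaveOptions();
pdfSaveOptions.getAccessibilitySettings().setEnableAutoTagging(true);

try {
    Merger merger = new Merger(fileIn);
    // Tags are generated while the document is written
    merger.save(fileOut, pdfSaveOptions);
} catch (Exception e) {
    throw new RuntimeException(e);
}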

With this update, the PDF will automatically receive logical structure tags when saved.

pankajgupta · October 7, 2025, 7:31am

hey @AlekseiSemenchenko : Thanks. We are now in a situation similar to:

Could you please treat these as guidelines? Handling this with every new version is becoming difficult.

AlekseiSemenchenko · October 8, 2025, 9:55am

Thank you for your message and for reporting the issue you encountered.

We are pleased to inform you that we have updated our repository with new code examples, and now it includes a new example PdfAutoTaggingExample.java. This example has been specifically developed to demonstrate the functionality of the latest version of our product.

To verify whether the issue you reported has been resolved, please follow these steps:

Check your version: Ensure that you are using the latest version 25.9 of our product.

Update your local repository to the latest version if necessary.

Review the PdfAutoTaggingExample.java example.

Test it in your development environment.

We have conducted thorough testing of version 25.9 and confirmed that the issue you described should no longer occur. However, it is crucial to verify that you are indeed using version 25.9 to ensure the problem does not reproduce.

pankajgupta · October 10, 2025, 7:23am

hey @AlekseiSemenchenko : Is it possible that, if I don’t use pdfSaveOptions and setEnableAutoTagging, I would lose existing tagging with 25.9?
For context: 25.7.* had the limitation that the first file decides the maximum accessibility. Will 25.9 work correctly without pdfSaveOptions? We may not always enable auto-tagging.

AlekseiSemenchenko · October 13, 2025, 10:08am

Hello, this option adds tags if they don’t exist, so it shouldn’t negatively affect the result. If tags already exist, they should remain in the resulting file. If not, it’s better to use the new option.

pankajgupta · October 13, 2025, 10:40am

Thanks @AlekseiSemenchenko : but can you please confirm:

In GroupDocs.Merger 25.7.1, the first file decided the maximum accessibility criteria. Is this true for 25.9 as well? ref: Merged PDF documents missing Accessibility standards - #26 by yuriy.mazurchuk
Or is the maximum after the merge taken across all files, irrespective of their sequence?

AlekseiSemenchenko · October 15, 2025, 9:24pm

Yes, this is also true for 25.9

In version 25.9, only auto-tagging functionality was introduced.