License parsing error and performance issue during DOC/DOCX to PDF conversion on Linux (Java 8)

Hi Team,

We are currently using GroupDocs.Conversion for Java to support a functionality where files in non-PDF formats
(.doc, .docx, .xls, .xlsx) are converted to PDF, and later all PDFs are merged into a single document.

Environment details

OS: Linux

Java version: Java 8

GroupDocs version: 23.6

Issue 1: License parsing error during conversion initialization

While running the application in the dev environment, we are encountering the following exception during conversion:

com.groupdocs.conversion.legacy.exceptions.GroupDocsException:
class com.groupdocs.conversion.internal.c.a.s.exceptions.G: License parsing error
---> java.lang.IllegalArgumentException:
Property 'http://javax.xml.XMLConstants/property/accessExternalDTD' is not recognized.

Relevant stack trace snippet:

com.groupdocs.conversion.legacy.exceptions.GroupDocsException: class com.groupdocs.conversion.internal.c.a.s.exceptions.G: License parsing error ---> java.lang.IllegalArgumentException: Property 'http://javax.xml.XMLConstants/property/accessExternalDTD' is not recognized.
--- End of inner exception stack trace ---
com.groupdocs.conversion.internal.c.a.s.internal.oh.v.a(Unknown Source)
com.groupdocs.conversion.internal.c.a.s.internal.oh.Z.ex(Unknown Source)
com.groupdocs.conversion.internal.c.a.s.internal.oh.Z.ez(Unknown Source)
com.groupdocs.conversion.internal.c.a.s.License.setLicense(Unknown Source)
com.groupdocs.conversion.internal.c.g.f.l.aspose.a.Ex(Unknown Source)
com.groupdocs.conversion.internal.c.g.f.l.aspose.a.L(Unknown Source)
com.groupdocs.conversion.internal.c.g.f.common.d.zyp(Unknown Source)
com.groupdocs.conversion.Converter.init(Unknown Source)
com.groupdocs.conversion.Converter.<init>(Unknown Source)
com.groupdocs.conversion.Converter.<init>(Unknown Source)
org.mit.compliance.attachmentexport.util.AttachmentExportUtil.convertToPdfIfRequired(AttachmentExportUtil.java:78)

This exception occurs when executing the following line of code:

try (Converter converter = new Converter(originalFilePath)) {
    PdfConvertOptions options = new PdfConvertOptions();
}

It appears that the exception is thrown during the initialization of the Converter, before the actual conversion starts.
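
One thing that may be worth ruling out on the application side (an assumption, not something confirmed in this thread): the property named in the exception is the standard JAXP 1.5 constant XMLConstants.ACCESS_EXTERNAL_DTD, and a JAXP implementation that predates JAXP 1.5, such as an old Xerces jar on the classpath, would typically reject it with an IllegalArgumentException of this kind. The sketch below prints which DocumentBuilderFactory implementation the JVM picks up and whether it accepts that attribute. The class name JaxpPropertyCheck is hypothetical, and the check does not call GroupDocs at all.

```java
import java.security.CodeSource;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;

class JaxpPropertyCheck {

    /** Returns true if the factory accepts the JAXP 1.5 accessExternalDTD attribute. */
    static boolean supportsAccessExternalDtd(DocumentBuilderFactory factory) {
        try {
            // An empty value means "no external DTD access"; the value is irrelevant here,
            // we only care whether the property name is recognized.
            factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    /** Describes where a class was loaded from; a null code source usually means the JDK itself. */
    static String describeOrigin(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        return source == null ? "JDK (no code source)" : String.valueOf(source.getLocation());
    }

    public static void main(String[] args) {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        System.out.println("Implementation: " + factory.getClass().getName());
        System.out.println("Loaded from:    " + describeOrigin(factory.getClass()));
        System.out.println("accessExternalDTD supported: " + supportsAccessExternalDtd(factory));
    }
}
```

If this is run with the same classpath as the application and reports a third-party implementation that does not support the attribute, that jar would be a candidate to investigate. If it reports the JDK's built-in implementation, this assumption does not apply.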

Previously, we were using GroupDocs version 25.9.
With this version, the conversion process did not complete successfully - the execution appeared to hang when conversion started, and the flow did not proceed further.

Due to this, we downgraded to version 23.6, but are now facing the license parsing issue mentioned above.

Issue 2: Performance concern with large DOC files

Additionally, we observed a performance issue while converting large DOC files:

File size: 40–50 MB

Content: Plain Text

Conversion time: around 23 minutes

We would like to understand:

Whether this conversion time is expected

If there are recommended configurations, JVM options, or GroupDocs settings to improve performance for large documents

We request support on the following:

  1. Understanding the root cause of the license parsing error in a Java 8 / Linux environment, and the recommended fix.

  2. Guidance on performance optimization for large DOC/DOCX to PDF conversions.

Thank you for your support.

@suchithrapv

Hello,

Thank you for reaching out and providing detailed information about the issues you are experiencing.

Your request has been received and will be reviewed by the appropriate team. We will get back to you as soon as possible with further assistance.


If you have any additional details or files (e.g., a minimal reproducible example, license file, or stack‑trace) that you can share, please attach them to your post. This will help us investigate the problem more efficiently.


Kind regards,
Support Team


This response is based on the information you provided. If you have any further questions or need to add more context, feel free to reply to this thread.

Hello @suchithrapv ,

We’re sorry to hear that you’ve encountered these issues.
Regarding the licensing problem, you can resolve it by upgrading to a newer version, for example 23.12.
Could you please clarify why you are using version 23.6? Are you just trying to recheck the behavior, or have you been using this version previously?

As for the second issue related to the conversion delay, unfortunately there are no quick recommendations to resolve it. What Java heap space values are you using during the conversion?
The minimum recommended values are -Xms1g and -Xmx2g.

However, we would appreciate it if you could share a sample document for analysis so that we can thoroughly investigate the issue.
We look forward to your feedback.

Thank you for your response. We will proceed with version 23.12. Previously, we used version 23.6 since it was aligned with the GroupDocs version implemented in one of our similar products to achieve the same functionality.
Regarding the conversion delay, I will review the Java heap space configuration and get back to you. I also tried to attach the sample document I tested with, but the editor does not support attaching files with the .doc extension. The file is 41,745 KB and contains more than 10,000 pages.

Hello @suchithrapv ,

You can send your file to me by email at evgen.efimov@aspose.com, or try compressing it into an archive and uploading it again via the forum.
I’m looking forward to receiving your file so that I can pass it on to our development team for further investigation.

Hi @evgen.efimov

We tested with GroupDocs version 23.12, but encountered an issue where the conversion process gets stuck. Specifically, we attempted to convert a DOCX file (size: 14 KB) in a Linux environment using Java 8.

The logs are available up to the point where the conversion process begins, but no logs are generated after that stage.

Additionally, I am attaching another document (size: 40 MB) that we used for testing and that showed the conversion delay mentioned earlier:
40mb.zip (99.0 KB)

Hello @suchithrapv ,

Thank you for providing the sample file. We will immediately begin investigating your use case using this file.
As I understand it, you encountered a different issue, but with a different 14 KB file. Have you tried testing this file with a newer version of our library?
Today, a new version of GroupDocs.Conversion for Java, version 25.12, was released. Could you test the issue using this version or version 25.9, as it may not be reproducible with the latest versions?

If the issue still persists, please provide this file for further analysis.

@suchithrapv ,

Unfortunately, we were unable to open the file you provided, either in Microsoft Office Word or as an archive in order to verify its contents.
Could you please resend the file directly to my email at evgen.efimov@aspose.com?

Hi @evgen.efimov ,

We tried GroupDocs version 25.12, but the issue still exists. Our functionality gets stuck here:
PdfConvertOptions options = new PdfConvertOptions();
logger.info("convertToPdfIfRequired :: Converting {} to PDF at {}", originalFilePath, pdfFilePath);
converter.convert(pdfFilePath, options);

Hereby sharing the document with which I tried conversion:
Verify that the Role field is displayed as a multi-select dr (1).docx (13.1 KB)

Regarding the file for the conversion delay issue, I will look for a different way to share it with you. Apologies for the inconvenience caused.

@suchithrapv ,

Unfortunately, I was not able to reproduce the freezing (stuck) issue during DOCX to PDF conversion using the document you provided (“Verify that the Role field is displayed as a multi-select dr.docx”) on my Linux environment.
Could you please clarify which Linux distribution and version you are using so that I can recheck this behavior in Docker?
Additionally, please specify the exact Java version you are using.

Hi @evgen.efimov

Here are our Linux distribution details:
NAME="CentOS Stream"
VERSION="8"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="8"
PLATFORM_ID="platform:el8"
PRETTY_NAME="CentOS Stream 8"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:8"
HOME_URL="https://centos.org/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux 8"
REDHAT_SUPPORT_PRODUCT_VERSION="CentOS Stream"

The Java version we use is 8.

@suchithrapv ,

Thank you for your clarifications. Unfortunately, even with the new Docker image matching your environment, I was not able to reproduce this issue.
In this case, could you please provide a minimal example of your application that uses GroupDocs.Conversion, which I could run on my side to reproduce the problem?
You can create a simple and minimal example — the important thing is that it reproduces the issue on your side. I believe this will help us move forward with investigating this case.

Hi @evgen.efimov ,

We tried reproducing the issue with a basic conversion scenario where a document is fetched from a shared repository and converted from DOCX to PDF using GroupDocs.Conversion.

try (Converter converter = new Converter(originalFilePath)) {
    PdfConvertOptions options = new PdfConvertOptions();
    logger.info("convertToPdfIfRequired :: Converting {} to PDF at {}", originalFilePath, pdfFilePath);
    converter.convert(pdfFilePath, options);
    logger.info("convertToPdfIfRequired :: Converted {} to PDF: {}", originalFilePath, pdfFilePath);
    // Delete the original file after successful conversion
    File originalFile = new File(originalFilePath);
    if (originalFile.exists() && originalFile.delete()) {
        logger.info("convertToPdfIfRequired :: Deleted original file: {}", originalFilePath);
    } else {
        logger.error("Failed to delete original file: {}", originalFilePath);
    }
    return pdfFilePath;
} catch (Exception e) {
    logger.error("Failed to convert {} to PDF.", originalFilePath, e);
    throw new AttachmentProcessingException("Failed to convert to PDF.", e);
}

One potential difference we observed between environments is related to Java versions:

  • The server environment has OpenJDK 17
  • The application itself is compiled and configured for Java 8

We wanted to check if this Java version mismatch (runtime vs application target) could be a contributing factor to the issue?
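
To remove any doubt about which JVM actually executes the application, as opposed to the Java level it was compiled for, the runtime can be logged at startup. The following is a minimal sketch using only standard system properties; the class name RuntimeInfo is hypothetical.

```java
class RuntimeInfo {

    /** Builds a one-line summary of the JVM that is executing this code. */
    static String describeRuntime() {
        return "java.version=" + System.getProperty("java.version")
                + ", java.specification.version=" + System.getProperty("java.specification.version")
                + ", java.vendor=" + System.getProperty("java.vendor")
                + ", java.home=" + System.getProperty("java.home");
    }

    public static void main(String[] args) {
        System.out.println(describeRuntime());
    }
}
```

Logging this line from inside the deployed application, rather than running `java -version` in a shell, shows the runtime the application server really uses.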

Any guidance on this would be very helpful.

@suchithrapv ,

The thing is that I initially tested this scenario using a Docker image with OpenJDK 17, while the Maven project itself was configured to compile for Java 8, and I was not able to reproduce this issue using your file.

Could you please let us know whether you have tried running our GitHub examples? Do you observe the same behavior on your side when using them?

I am asking this because it might be related to a specific configuration in your application that leads to this behavior. That is why I requested a minimal example of your application that I could run locally.

As I understand, you are passing a file path rather than a stream. It is possible that there are some access-related restrictions for this document. Without a complete picture of how our library is integrated into your application, it is difficult to determine the root cause.
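
A quick way to check for access-related restrictions before the Converter is created is a pre-flight test with the standard java.nio.file API. The sketch below is illustrative only: the class name InputPreflight and the default paths are hypothetical, and passing these checks does not guarantee that the conversion itself will succeed.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class InputPreflight {

    /** Returns human-readable access problems for the input file and output directory; empty if none. */
    static List<String> findAccessProblems(Path input, Path outputDir) {
        List<String> problems = new ArrayList<>();
        if (!Files.isRegularFile(input)) {
            problems.add("Input is missing or not a regular file: " + input);
        } else if (!Files.isReadable(input)) {
            problems.add("Input is not readable by this process: " + input);
        }
        if (!Files.isDirectory(outputDir)) {
            problems.add("Output directory does not exist: " + outputDir);
        } else if (!Files.isWritable(outputDir)) {
            problems.add("Output directory is not writable: " + outputDir);
        }
        return problems;
    }

    public static void main(String[] args) {
        // Placeholder defaults; pass the real input file and output directory as arguments.
        Path input = Paths.get(args.length > 0 ? args[0] : "input.docx");
        Path outputDir = Paths.get(args.length > 1 ? args[1] : ".");
        List<String> problems = findAccessProblems(input, outputDir);
        if (problems.isEmpty()) {
            System.out.println("No access problems detected.");
        } else {
            problems.forEach(System.out::println);
        }
    }
}
```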

Additionally, could you please confirm whether any other GroupDocs libraries are used in your application?

Hi @evgen.efimov ,

Thank you for your reply. Your team was unable to reproduce the issue with the configuration we provided, and we faced a similar issue in a different product of ours, which we discussed in this forum thread:

We will therefore try reverting to version 23.6, since it worked in our other product. However, with this version we get the following error:

com.groupdocs.conversion.legacy.exceptions.GroupDocsException: class com.groupdocs.conversion.internal.c.a.s.exceptions.G: License parsing error ---> java.lang.IllegalArgumentException: Property 'http://javax.xml.XMLConstants/property/accessExternalDTD' is not recognized.
--- End of inner exception stack trace ---
com.groupdocs.conversion.internal.c.a.s.internal.oh.v.a(Unknown Source)
com.groupdocs.conversion.internal.c.a.s.internal.oh.Z.ex(Unknown Source)
com.groupdocs.conversion.internal.c.a.s.internal.oh.Z.ez(Unknown Source)
com.groupdocs.conversion.internal.c.a.s.License.setLicense(Unknown Source)
com.groupdocs.conversion.internal.c.g.f.l.aspose.a.Ex(Unknown Source)
com.groupdocs.conversion.internal.c.g.f.l.aspose.a.L(Unknown Source)
com.groupdocs.conversion.internal.c.g.f.common.d.zyp(Unknown Source)
com.groupdocs.conversion.Converter.init(Unknown Source)
com.groupdocs.conversion.Converter.<init>(Unknown Source)
com.groupdocs.conversion.Converter.<init>(Unknown Source)
org.mit.compliance.attachmentexport.util.AttachmentExportUtil.convertToPdfIfRequired(AttachmentExportUtil.java:78)
org.mit.compliance.pdfmerger.service.PDFMergerService.mergeAsSinglePDF(PDFMergerService.java:45)
org.mit.compliance.pdfmerger.service.PDFMergerService$$FastClassBySpringCGLIB$$22c9ba97.invoke()
org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)
org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:749)
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:294)
org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:98)
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:688)
org.mit.compliance.pdfmerger.service.PDFMergerService$$EnhancerBySpringCGLIB$$9484addd.mergeAsSinglePDF()
org.mit.compliance.attachmentexport.listener.ACProtocolAttachmentExportListener.prepareDataForProcessing(ACProtocolAttachmentExportListener.java:89)
org.mit.compliance.attachmentexport.listener.ACProtocolAttachmentExportListener.processProtocolAttachment(ACProtocolAttachmentExportListener.java:50)
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.lang.reflect.Method.invoke(Method.java:498)

Could you please help us resolve this issue in version 23.6? Is it environment-specific, and is there any way we can correct it?

Could you also help with the following questions?

  1. Are there any debugging tools we can use to investigate the stuck conversion in GroupDocs versions 25.12 and 25.9?

  2. Could the stuck conversion be caused by missing fonts?

  3. Could you specify the Linux configuration required to run GroupDocs successfully?

Hello @suchithrapv ,

Thank you for your detailed explanation and for your patience during the investigation of this case.
I will try to address your questions in order and clarify the situation.

1. Licensing issue in version 23.6
Regarding the licensing error in version 23.6, unfortunately this is a known limitation of the legacy GroupDocs licensing mechanism. This issue has been fixed in GroupDocs.Conversion 23.12 and later versions.
Regrettably, there is no safe workaround available for version 23.6.

2. Debugging conversion hangs
GroupDocs.Conversion does not provide a dedicated debugger. However, in cases where the application hangs, you may try using the standard JDK debugging tools, for example:

jstack <pid> > thread_dump.txt

This command outputs stack traces of all threads in the running Java process and is useful for diagnosing hangs, deadlocks, and performance issues.
Official usage and options are described in the Java SE Tools Reference.
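
When attaching jstack to the server process is inconvenient, a similar dump can be captured from inside the application. The sketch below is one possible approach and not part of GroupDocs: it runs a task on a worker thread with a timeout and, if the timeout expires, prints the stack traces of all live threads. The names HangWatchdog, runWithTimeout, and conversion-worker are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class HangWatchdog {

    /** Formats the name, state, and stack trace of every live thread in this JVM. */
    static String dumpAllThreads() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            out.append('"').append(thread.getName()).append("\" state=")
               .append(thread.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    /** Runs the task on a daemon worker thread; on timeout, prints a thread dump and rethrows. */
    static <T> T runWithTimeout(Callable<T> task, long timeout, TimeUnit unit) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor(runnable -> {
            Thread worker = new Thread(runnable, "conversion-worker");
            worker.setDaemon(true); // a stuck worker must not keep the JVM alive
            return worker;
        });
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeout, unit);
        } catch (TimeoutException e) {
            // Dump before cancelling so the worker's stack still shows where it is stuck.
            System.err.println(dumpAllThreads());
            future.cancel(true);
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

In the code shared earlier in this thread, the task passed to runWithTimeout would wrap the converter.convert(pdfFilePath, options) call. The frames listed under "conversion-worker" in the dump then show where the conversion is waiting.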

3. Microsoft fonts on Linux
We always recommend installing Microsoft fonts when using our conversion library on Linux environments.
Without these fonts, the conversion library may repeatedly attempt font substitution, which can significantly slow down processing.
If these fonts are not yet installed on your system, please follow this documentation:
https://www.opswat.com/docs/mdcore/knowledge-base/how-to-install-msttcore-fonts-on-linux-systems-

4. Recommended Linux configuration
While we do not provide strict Linux-specific configurations in our product documentation, we can recommend the following:

  • Heap size: ≥ 2 GB
    • Minimum: -Xms1g -Xmx2g
    • Recommended: -Xms2g -Xmx4g
  • System and Microsoft Office fonts are installed
  • Temporary directory is writable
  • No file system permission restrictions for input/output paths
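
Two of the points above, the heap size and the writable temporary directory, can be verified from inside the JVM. The following is a minimal sketch with a hypothetical class name, EnvironmentCheck. Note that Runtime.maxMemory() may report somewhat less than the -Xmx value, depending on the garbage collector in use.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class EnvironmentCheck {

    /** Maximum heap the JVM will attempt to use, in megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    /** True if the directory named by java.io.tmpdir exists and is writable by this process. */
    static boolean isTempDirWritable() {
        Path tmp = Paths.get(System.getProperty("java.io.tmpdir"));
        return Files.isDirectory(tmp) && Files.isWritable(tmp);
    }

    public static void main(String[] args) {
        System.out.println("Max heap (MB): " + maxHeapMegabytes());
        System.out.println("java.io.tmpdir: " + System.getProperty("java.io.tmpdir"));
        System.out.println("Temp dir writable: " + isTempDirWritable());
    }
}
```

Running this with the same JVM options as the application, for example -Xms1g -Xmx2g, confirms whether the heap settings are actually being applied.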

Hi @evgen.efimov ,

While checking the fonts in our Linux environments, we see the following listing:
[root@client-app fonts]# ls
abattis-cantarell dejavu google-droid urw-base35
[root@client-app fonts]# cd dejavu
[root@client-app dejavu]# ls
DejaVuSansMono-BoldOblique.ttf DejaVuSansMono-Bold.ttf DejaVuSansMono-Oblique.ttf DejaVuSansMono.ttf
[root@client-app dejavu]# pwd
/usr/share/fonts/dejavu
[root@client-app dejavu]# cd ..
[root@client-app fonts]# ls
abattis-cantarell dejavu google-droid urw-base35

Could you please check on this config?

@suchithrapv ,

Thank you for providing information about the font configuration in your Linux environment.

The fonts installed on your system (DejaVu) are standard Linux fonts and are generally sufficient for basic text rendering. However, they are not sufficient for converting DOC/DOCX documents created on Windows.

We recommend installing Microsoft fonts using the documentation link that I shared with you in my previous response.

Hi @evgen.efimov,

We attempted to install the required fonts by following the documentation referenced in the above replies.

In our local Ubuntu environment, the installation completed successfully. We were able to see the expected TrueType font directories created, such as:

  • /usr/share/fonts
  • /usr/share/fonts/truetype
  • /usr/share/fonts/truetype/dejavu
  • /usr/share/fonts/truetype/liberation

However, when we tried performing the same setup in our actual development environment running on CentOS Stream 8, we observed different behavior. Although the font installation commands executed successfully and fc-cache completed without errors, the expected truetype directory structure was not created, and fonts were instead present under distribution-specific folders.

Below are the logs from the CentOS environment:

[root@client-app fonts]# sudo rpm -i https://downloads.sourceforge.net/project/mscorefonts2/rpms/msttcore-fonts-installer-2.6-1.noarch.rpm --nodeps
	package msttcore-fonts-installer-2.6-1.noarch is already installed
[root@Polus-client-app fonts]# fc-cache -v -r
/usr/share/fonts: caching, new cache contents: 0 fonts, 5 dirs
/usr/share/fonts/abattis-cantarell: caching, new cache contents: 4 fonts, 0 dirs
/usr/share/fonts/dejavu: caching, new cache contents: 4 fonts, 0 dirs
/usr/share/fonts/google-droid: caching, new cache contents: 15 fonts, 0 dirs
/usr/share/fonts/msttcore: caching, new cache contents: 54 fonts, 0 dirs
/usr/share/fonts/urw-base35: caching, new cache contents: 69 fonts, 0 dirs
/usr/share/X11/fonts/Type1: caching, new cache contents: 13 fonts, 0 dirs
/usr/share/X11/fonts/TTF: skipping, no such directory
/usr/local/share/fonts: skipping, no such directory
/root/.local/share/fonts: skipping, no such directory
/root/.fonts: skipping, no such directory
/usr/share/fonts/abattis-cantarell: skipping, looped directory detected
/usr/share/fonts/dejavu: skipping, looped directory detected
/usr/share/fonts/google-droid: skipping, looped directory detected
/usr/share/fonts/msttcore: skipping, looped directory detected
/usr/share/fonts/urw-base35: skipping, looped directory detected
/usr/lib/fontconfig/cache: cleaning cache directory
/root/.cache/fontconfig: not cleaning non-existent cache directory
/root/.fontconfig: not cleaning non-existent cache directory
/usr/bin/fc-cache-64: succeeded

Could you please help us understand:

  1. Whether this difference in directory structure is expected behavior on CentOS/RHEL-based systems
  2. If there are any additional steps or best practices recommended for configuring fonts on CentOS servers specifically

@suchithrapv ,

Thank you for the detailed information and logs — they are very helpful.

  1. On RHEL-based systems, there is no requirement to place fonts in a /truetype subdirectory. Font discovery is fully handled by fontconfig, not by the directory structure itself. The following log entry confirms that the fonts are correctly installed and indexed:
    /usr/share/fonts/msttcore: caching, new cache contents: 54 fonts
  2. Based on your logs, I do not believe any additional system configuration is required. As a final check, you may verify the full list of installed and indexed fonts using the fc-list command. If Windows fonts such as Arial, Times New Roman, and Calibri are present, then everything has been set up correctly.
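
As a complement to fc-list, the font families visible to the JVM can be listed with the standard java.awt API. This is only a sketch: the class name FontCheck is hypothetical, and it is an assumption that the set of fonts reported by AWT matches what the conversion library uses internally.

```java
import java.awt.GraphicsEnvironment;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class FontCheck {

    /** Returns the required font families that are absent from the installed list (case-insensitive). */
    static List<String> findMissing(Collection<String> installed, String... required) {
        Set<String> installedLower = new HashSet<>();
        for (String name : installed) {
            installedLower.add(name.toLowerCase(Locale.ROOT));
        }
        List<String> missing = new ArrayList<>();
        for (String name : required) {
            if (!installedLower.contains(name.toLowerCase(Locale.ROOT))) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Servers usually have no display, so request headless mode before touching AWT.
        System.setProperty("java.awt.headless", "true");
        String[] families = GraphicsEnvironment.getLocalGraphicsEnvironment()
                .getAvailableFontFamilyNames();
        List<String> missing = findMissing(Arrays.asList(families),
                "Arial", "Times New Roman", "Calibri");
        System.out.println(missing.isEmpty()
                ? "All checked fonts are visible to the JVM."
                : "Not visible to the JVM: " + missing);
    }
}
```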