Thread synchronization - performance degradation

Hello,

I’m experiencing significant thread contention when reading metadata using GroupDocs.Metadata for Java in a multithreaded processing pipeline.

Environment

  • Java: 21
  • GroupDocs.Metadata for Java
  • Processing files concurrently using multiple worker threads

Problem description

When multiple threads create Metadata objects and read metadata from different files concurrently, several of them become BLOCKED waiting for the same monitor inside internal GroupDocs classes. Only one thread at a time gets past that point, which effectively serializes metadata processing and prevents efficient parallelization.

In my code, each worker creates its own Metadata instance from an input stream (is) like this:

try (Metadata metadata = new Metadata(is)) {
    // metadata processing
}

However, during execution I observe many threads blocked on the same internal monitor:

"category-classifier-worker-0" #152 daemon prio=5 os_prio=0 cpu=262771.68ms elapsed=2586.51s tid=0x00007fa21d026c70 nid=198 waiting for monitor entry
   java.lang.Thread.State: BLOCKED (on object monitor)
	at com.groupdocs.metadata.internal.c.a.i.X.a(Unknown Source)
	- waiting to lock <0x0000000708e58330> (a java.lang.Object)
	at com.groupdocs.metadata.internal.c.a.i.T.a(Unknown Source)
	at com.groupdocs.metadata.internal.c.a.i.T.i(Unknown Source)
	at com.groupdocs.metadata.internal.c.a.i.T.getFileFormat(Unknown Source)
	at com.groupdocs.metadata.core.nD.a(Unknown Source)
	at com.groupdocs.metadata.core.eF.b(Unknown Source)
	at com.groupdocs.metadata.core.eh.blm(Unknown Source)
	at com.groupdocs.metadata.core.ip.a(Unknown Source)
	at com.groupdocs.metadata.core.ip.<init>(Unknown Source)
	at com.groupdocs.metadata.Metadata.<init>(Unknown Source)
	at com.groupdocs.metadata.Metadata.<init>(Unknown Source)
	at com.foo.pipeline.classification.GroupDocsMetadataCategoryReader.accept(GroupDocsMetadataCategoryReader.java:92)

"category-classifier-worker-1" #153 daemon prio=5 os_prio=0 cpu=266452.47ms elapsed=2586.51s tid=0x00007fa21d028520 nid=199 waiting for monitor entry
   java.lang.Thread.State: BLOCKED (on object monitor)
	at com.groupdocs.metadata.internal.c.a.i.X.a(Unknown Source)
	- waiting to lock <0x0000000708e58330> (a java.lang.Object)
	at com.groupdocs.metadata.internal.c.a.i.T.a(Unknown Source)
	at com.groupdocs.metadata.internal.c.a.i.T.i(Unknown Source)
	at com.groupdocs.metadata.internal.c.a.i.T.getFileFormat(Unknown Source)
	at com.groupdocs.metadata.core.cW.a(Unknown Source)
	at com.groupdocs.metadata.core.eF.b(Unknown Source)
	at com.groupdocs.metadata.core.eh.blm(Unknown Source)
	at com.groupdocs.metadata.core.ip.a(Unknown Source)
	at com.groupdocs.metadata.core.ip.<init>(Unknown Source)
	at com.groupdocs.metadata.Metadata.<init>(Unknown Source)
	at com.groupdocs.metadata.Metadata.<init>(Unknown Source)
	at com.foo.pipeline.classification.GroupDocsMetadataCategoryReader.accept(GroupDocsMetadataCategoryReader.java:92)

"category-classifier-worker-2" #154 daemon prio=5 os_prio=0 cpu=263077.62ms elapsed=2586.51s tid=0x00007fa21d029ab0 nid=200 runnable
   java.lang.Thread.State: RUNNABLE
	at com.groupdocs.metadata.internal.c.a.i.internal.lg.h.j(Unknown Source)
	at com.groupdocs.metadata.internal.c.a.i.internal.lg.p.b(Unknown Source)
	at com.groupdocs.metadata.internal.c.a.i.internal.lg.p.a(Unknown Source)
	at com.groupdocs.metadata.internal.c.a.i.internal.lg.p.a(Unknown Source)
	at com.groupdocs.metadata.internal.c.a.i.internal.lg.p.d(Unknown Source)
	at com.groupdocs.metadata.internal.c.a.i.internal.kl.a.<init>(Unknown Source)
	at com.groupdocs.metadata.internal.c.a.i.internal.fW.c.a(Unknown Source)
	at com.groupdocs.metadata.internal.c.a.i.internal.jb.f.b(Unknown Source)
	at com.groupdocs.metadata.internal.c.a.i.X.a(Unknown Source)
	- locked <0x00000007afda9ca0> (a java.lang.Object)
	- locked <0x00000007afda9ca0> (a java.lang.Object)
	- locked <0x0000000708e58330> (a java.lang.Object)
	at com.groupdocs.metadata.internal.c.a.i.T.a(Unknown Source)
	at com.groupdocs.metadata.internal.c.a.i.T.i(Unknown Source)
	at com.groupdocs.metadata.internal.c.a.i.T.getFileFormat(Unknown Source)
	at com.groupdocs.metadata.core.bv.a(Unknown Source)
	at com.groupdocs.metadata.core.eF.b(Unknown Source)
	at com.groupdocs.metadata.core.eh.blm(Unknown Source)
	at com.groupdocs.metadata.core.ip.a(Unknown Source)
	at com.groupdocs.metadata.core.ip.<init>(Unknown Source)
	at com.groupdocs.metadata.Metadata.<init>(Unknown Source)
	at com.groupdocs.metadata.Metadata.<init>(Unknown Source)
	at com.foo.pipeline.classification.GroupDocsMetadataCategoryReader.accept(GroupDocsMetadataCategoryReader.java:92)

"metadata-extractor-worker-0" #156 daemon prio=5 os_prio=0 cpu=122496.34ms elapsed=2586.51s tid=0x00007fa21d02bdb0 nid=202 waiting for monitor entry
   java.lang.Thread.State: BLOCKED (on object monitor)
	at com.groupdocs.metadata.internal.c.a.i.X.a(Unknown Source)
	- waiting to lock <0x0000000708e58330> (a java.lang.Object)
	at com.groupdocs.metadata.internal.c.a.i.T.a(Unknown Source)
	at com.groupdocs.metadata.internal.c.a.i.T.i(Unknown Source)
	at com.groupdocs.metadata.internal.c.a.i.T.getFileFormat(Unknown Source)
	at com.groupdocs.metadata.core.dc.a(Unknown Source)
	at com.groupdocs.metadata.core.eF.b(Unknown Source)
	at com.groupdocs.metadata.core.eh.blm(Unknown Source)
	at com.groupdocs.metadata.core.ip.a(Unknown Source)
	at com.groupdocs.metadata.core.ip.<init>(Unknown Source)
	at com.groupdocs.metadata.Metadata.<init>(Unknown Source)
	at com.groupdocs.metadata.Metadata.<init>(Unknown Source)
	at com.foo.pipeline.metadata.GroupDocsMetadataExtractor.extract(GroupDocsMetadataExtractor.java:156)

Observations

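  • Many threads are blocked waiting for the same internal monitor (<0x0000000708e58330>).
  • The blocking happens inside the Metadata constructor, in the call chain T.getFileFormat() -> T.i() -> T.a() -> X.a() (obfuscated names from com.groupdocs.metadata.internal.c.a.i).
  • The thread that holds the monitor (category-classifier-worker-2) is RUNNABLE inside the same X.a() method, so it is doing work while holding the lock and the others wait for it.
  • This effectively serializes metadata reading across threads.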
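
To check this from inside the running pipeline instead of reading raw dumps, a small diagnostic like the one below can group BLOCKED threads by the monitor they are waiting for. This is only a sketch using the standard java.lang.management API; the class name BlockedThreadReport is my own and nothing in it is GroupDocs-specific. Note that getLockName() reports the lock's identity hash code, which is not the same value as the address printed in a jstack dump.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper (not part of any library): summarizes monitor contention.
final class BlockedThreadReport {

    /** Returns, for each contended monitor, the names of the threads BLOCKED on it. */
    static Map<String, List<String>> blockedByMonitor() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<String, List<String>> result = new TreeMap<>();
        // false/false: we only need the lock a thread waits for, not the locks it holds
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info == null || info.getThreadState() != Thread.State.BLOCKED) {
                continue;
            }
            String lock = info.getLockName(); // className@identityHashCode
            if (lock == null) {
                continue;
            }
            result.computeIfAbsent(lock, k -> new ArrayList<>()).add(info.getThreadName());
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints one line per contended monitor; prints nothing if no thread is blocked.
        blockedByMonitor().forEach((lock, names) ->
                System.out.println(lock + " <- " + names));
    }
}
```

Calling blockedByMonitor() periodically from a scheduled task in the same JVM as the workers should show whether the worker threads keep piling up on a single java.lang.Object monitor.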

Questions

  1. Is GroupDocs.Metadata thread-safe when multiple Metadata instances are created concurrently?
  2. Is there some global synchronization or shared cache inside the library that could cause this contention?
  3. Is this expected behavior (e.g., during file format detection)?
  4. Are there recommended practices to avoid this contention when processing files in parallel?
  5. Would using a different initialization pattern or configuration help?

My goal is to process many files concurrently, but currently the internal synchronization prevents scaling with multiple threads.
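
To put a number on the lack of scaling, the same batch can be timed with different pool sizes. The sketch below is generic and uses only java.util.concurrent; ScalingProbe, measureMillis and sleepQuietly are names I made up, and the demo task is a stand-in that sleeps under a shared lock to mimic the contention. In the real measurement the task would be the Metadata construction shown earlier.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical timing harness (not part of any library).
final class ScalingProbe {

    /** Runs the task `tasks` times on a fixed pool of `threads` workers; returns elapsed wall-clock ms. */
    static long measureMillis(int threads, int tasks, Runnable task) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                futures.add(pool.submit(task));
            }
            for (Future<?> f : futures) {
                f.get(); // wait for completion and surface task failures
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdownNow();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        Object globalLock = new Object();
        // Stand-in for work done under a library-wide monitor.
        Runnable contended = () -> {
            synchronized (globalLock) {
                sleepQuietly(20);
            }
        };
        // If the work is serialized by one lock, elapsed time should stay
        // roughly flat as the thread count grows.
        for (int threads : new int[] {1, 2, 4}) {
            System.out.println(threads + " thread(s): "
                    + measureMillis(threads, 8, contended) + " ms");
        }
    }
}
```

If the elapsed time for the real Metadata task barely changes between 1 and 4 threads, that would support the conclusion that the construction step is serialized by the shared monitor.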

Any clarification or recommendations would be greatly appreciated.

Thank you.