Document redaction is slow and the output file becomes very large in C# .NET

I would like to share a few issues I ran into while using it and get them clarified:

  • Redaction takes a very long time to finish for each file.
  • The output file is much larger than the original file.
  • When processing multiple files (more than five), the first file fails with a “System Out of Memory” exception.

@ramanantechnocit

Please share the following details so that we can investigate these issues:

  • API version that you are using (e.g. 21.1, 21.3)
  • Sample code
  • Problematic files
  • Are you evaluating the API in trial mode (without a license)?

Hi Atir,
Thanks for the support.
Please find my responses inline:

  • API version that you are using (e.g. 21.1, 21.3)
    We are using the “GroupDocs.Redaction” DLL with a trial license.

  • Sample code
    using (Redactor redactor = new Redactor(filename))
    {
        // Verbatim string (@"...") so that \d and \s are not treated as invalid C# escape sequences
        redactor.Apply(new RegexRedaction(@"\d{2}\s*\d{2}[^\d]*\d{6}", new ReplacementOptions(System.Drawing.Color.Blue)));
        redactor.Save();
    }

  • Problematic files
    File 1: Test Sample 01.pdf (2.7 MB)
    File 2: Test Sample 02.pdf (725.7 KB)

  • Are you evaluating the API in trial mode (without a license)?
    We are using the DLL with a 30-day trial license: GroupDocsLicense.PNG (21.7 KB)

@ramanantechnocit

Please note that, by default, the Save() method rasterizes the document: it renders each page into a raster image and replaces the page’s searchable content with that image. This takes additional time and makes the file grow. To disable rasterization, you can pass an instance of the SaveOptions class, e.g.

redactor.Save(new SaveOptions(false, SaveOptions.SaveSuffix));

It will run much faster, with minimal change in file size. For more details about saving options, please review this article in the public documentation: Saving documents
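
For the multi-file case reported above, here is a minimal sketch of a batch loop that saves without rasterization. It reuses only the calls already shown in this thread; the input folder name is hypothetical, and the idea that disposing each Redactor before opening the next file helps with the out-of-memory error is an assumption, not a confirmed fix.

```csharp
using System.IO;
using GroupDocs.Redaction;
using GroupDocs.Redaction.Options;
using GroupDocs.Redaction.Redactions;

class BatchRedaction
{
    static void Main(string[] args)
    {
        // Hypothetical input folder; pass your own path as the first argument.
        string inputFolder = args.Length > 0 ? args[0] : "input";

        // GetFiles returns the full list up front, so files written during
        // this run are not picked up again by the loop.
        foreach (string path in Directory.GetFiles(inputFolder, "*.pdf"))
        {
            // The using block disposes the Redactor even if Apply or Save throws,
            // so only one document is open at a time.
            using (Redactor redactor = new Redactor(path))
            {
                redactor.Apply(new RegexRedaction(@"\d{2}\s*\d{2}[^\d]*\d{6}",
                    new ReplacementOptions(System.Drawing.Color.Blue)));

                // First argument false = do not rasterize the output.
                redactor.Save(new SaveOptions(false, SaveOptions.SaveSuffix));
            }
        }
    }
}
```

If memory use is still high with this pattern, the sample files and API version requested earlier in the thread are what the support team would need to investigate further.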

Hi Alexander,
If we use the above code, the redaction is not applied to the PDF.

For example, the code below redacts the XMP Manifest:

redactor.Apply(new RegexRedaction(@"\d{2}\s*\d{2}[^\d]*\d{6}", new ReplacementOptions(System.Drawing.Color.Blue)));

If we use Save(), the XMP Manifest is redacted. But if we use the code you suggested, the XMP Manifest is not removed.

Please advise.

@augustinechristo

The rasterized file does not import any metadata from the original file, so the XMP is not actually “redacted”; it is simply missing. Also, please note that RegexRedaction and coloring options apply only to the document’s body, not to the metadata. To redact XMP headers, you will need one of the metadata redactions. For instance, you can try this instead:

redactor.Apply(new MetadataSearchRedaction(@"\d{2}\s*\d{2}[^\d]*\d{6}", "removed"));
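
Put together, the body and metadata redactions can be applied to the same document before a non-rasterized save. The following is only a sketch built from the calls quoted in this thread; the file name is hypothetical, and as the following replies show, the XMP Manifest entry was still not handled by this approach in the version under discussion.

```csharp
using GroupDocs.Redaction;
using GroupDocs.Redaction.Options;
using GroupDocs.Redaction.Redactions;

class BodyAndMetadataRedaction
{
    static void Main()
    {
        // Hypothetical file name, used only for illustration.
        const string inputPath = "sample.pdf";
        const string pattern = @"\d{2}\s*\d{2}[^\d]*\d{6}";

        using (Redactor redactor = new Redactor(inputPath))
        {
            // Document body: matches are redacted using the blue color option.
            redactor.Apply(new RegexRedaction(pattern,
                new ReplacementOptions(System.Drawing.Color.Blue)));

            // Metadata: values matching the pattern are redacted with the text "removed".
            redactor.Apply(new MetadataSearchRedaction(pattern, "removed"));

            // Rasterization is off, so the metadata is carried into the output
            // and the metadata redaction above is what cleans it.
            redactor.Save(new SaveOptions(false, SaveOptions.SaveSuffix));
        }
    }
}
```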

Thanks Alex,
Unfortunately, MetadataSearchRedaction is not redacting the XMP Manifest:

   <xmpMM:Manifest>
        <rdf:Seq>
           <rdf:li rdf:parseType="Resource">
              <stMfs:linkForm>EmbedByReference</stMfs:linkForm>
              <stMfs:reference rdf:parseType="Resource">
                 <stRef:filePath>/Users/name/Desktop/Jobs/Subfolder/filename.psd</stRef:filePath>
              </stMfs:reference>
           </rdf:li>
        </rdf:Seq>
   </xmpMM:Manifest>

We even tried to remove all metadata using the code below, but the Manifest is still not removed. Any idea?
redactor.Apply(new EraseMetadataRedaction(MetadataFilters.All));
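
One rough way to confirm whether the Manifest survived in a saved file is to search its raw bytes for the opening tag. The sketch below uses only the .NET base class library; the class and method names are made up for illustration. It assumes the XMP packet is stored uncompressed and in an ASCII-compatible encoding, which is common for PDFs but not guaranteed, so a "not found" result is not proof that the data is gone.

```csharp
using System;
using System.IO;
using System.Text;

static class XmpCheck
{
    // Returns true if the raw bytes contain the marker as ASCII text.
    public static bool ContainsMarker(byte[] data, string marker)
    {
        byte[] needle = Encoding.ASCII.GetBytes(marker);
        if (needle.Length == 0) return true;

        // Naive byte-by-byte search; good enough for a one-off check.
        for (int i = 0; i <= data.Length - needle.Length; i++)
        {
            int j = 0;
            while (j < needle.Length && data[i + j] == needle[j]) j++;
            if (j == needle.Length) return true;
        }
        return false;
    }

    static void Main(string[] args)
    {
        if (args.Length == 0)
        {
            Console.WriteLine("usage: XmpCheck <file.pdf>");
            return;
        }

        byte[] data = File.ReadAllBytes(args[0]);
        Console.WriteLine(ContainsMarker(data, "<xmpMM:Manifest")
            ? "XMP Manifest still present"
            : "XMP Manifest not found");
    }
}
```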

@augustinechristo

We could reproduce this issue on our end. It has been logged in our internal issue tracking system with ID REDACTIONNET-383. You will be notified as soon as there is an update.

Thanks Alex,
Looking forward to the updated version.

@augustinechristo

GroupDocs.Redaction for .NET v21.9, which includes a fix for this issue, has been published. You can find the new version at

Have a nice day!