Redacting a 5MB PDF file with 329 pages using 8 ExactPhraseRedaction redactions takes around 100 seconds. The redacted file is also about double the size of the original, at 10.5MB. By contrast, redacting a 5MB Word (.docx) file takes only 5 seconds. Can you suggest optimizations for PDF redaction that would bring the redaction time closer to 5 seconds and keep the redacted file from growing so much?
Can you share the performance metrics of redaction of different types of files?
System configuration:
RAM - 18GB
OS - macOS Sonoma
Chip - M3 pro
Code snippet being used:
final Redactor redactor = new Redactor(Constants.SAMPLE_PDF);
String[] word_list = new String[]{"378282246310005", "371449635398431", "6011111111111117", "35", "alex@servicenow.com", "546-637-8854", "Alex", "322271083"};
// Build one exact-phrase redaction per word, each drawn as a black box
Redaction[] redactList = Arrays.stream(word_list)
        .map(word -> new ExactPhraseRedaction(word, new ReplacementOptions(Color.BLACK)))
        .toArray(Redaction[]::new);
RedactorChangeLog result = redactor.apply(redactList);
if (result.getStatus() != RedactionStatus.Failed) {
    SaveOptions saveOptions = new SaveOptions();
    saveOptions.setAddSuffix(true);
    saveOptions.setRasterizeToPDF(false);
    redactor.save(saveOptions);
} else {
    // Logging error
}
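For reference, the timing and file-size figures quoted above could be collected with something like the following. This is only a minimal sketch using the standard library: the class name `RedactionBenchmark` and its helper methods are hypothetical and not part of GroupDocs.Redaction, and the lambda body is a placeholder where the `redactor.apply(...)` and `redactor.save(...)` calls from the snippet would go.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of GroupDocs.Redaction.
public class RedactionBenchmark {

    /** Runs the task once and returns the elapsed wall-clock time in milliseconds. */
    public static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    /** Ratio of redacted size to original size; values above 1.0 mean the file grew. */
    public static double sizeRatio(long originalBytes, long redactedBytes) {
        if (originalBytes <= 0) {
            throw new IllegalArgumentException("originalBytes must be positive");
        }
        return (double) redactedBytes / originalBytes;
    }

    /** Same ratio, computed from the sizes of two files on disk. */
    public static double sizeRatio(Path original, Path redacted) throws IOException {
        return sizeRatio(Files.size(original), Files.size(redacted));
    }

    public static void main(String[] args) {
        long elapsed = timeMillis(() -> {
            // Placeholder: call redactor.apply(...) and redactor.save(...) here.
        });
        System.out.println("Elapsed: " + elapsed + " ms");
        // Sizes taken from the post: roughly 5MB in, 10.5MB out.
        System.out.println("Size ratio: " + sizeRatio(5_000_000L, 10_500_000L));
    }
}
```

A single run like this includes JVM warm-up and library initialization, so repeating the measurement a few times and comparing the PDF and .docx cases under the same conditions would give more reliable numbers.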
@devansh.sharma1
Could you please share the following details so that we can investigate this issue further:
- Source/Problematic file
- GroupDocs.Redaction version that you are using
Please note that the redaction process depends on various factors for any document, depending on its content. For instance, a 5MB file with text-based content may take less time than a 5MB file containing paragraphs, tables, lists, or images. However, if you share the problematic file, we can identify the root cause of why this particular document is taking more time.
- I am unable to attach the PDF because the maximum attachment size allowed is 4MB. I tried compressing the file, but it is still above 4MB. Can you share an email address where I can send the file?
- I am using version 24.9 of GroupDocs.Redaction.
Moreover, the redacted file is almost double the size of the original one. Is there a reason for that, and could you suggest a way to reduce the size of the redacted file?
@devansh.sharma1
Please upload the source/problematic file to a cloud storage service and share the link here. We'll then investigate this issue.
This is the link to download: https://file.io/Tjs6ANzzZmDZ
@devansh.sharma1
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.
Issue ID(s): REDACTIONJAVA-242
You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.
Hi @atir.tahir
Any update on this issue? Did you get a chance to try it and measure the time?
@devansh.sharma1
The resulting file size is noticeably larger than the original due to the addition of black boxes over the redacted text. However, when limiting the redaction to only the first page, the process completes in about 20 seconds, and the resulting file is even slightly smaller than the original. Please use a filter to restrict the redaction to the first page, as demonstrated in the code snippet.
// Define replacement options with the text "[REDACTED]"
ReplacementOptions options = new ReplacementOptions("[REDACTED]");
// Set the color of the redaction box to black
options.setBoxColor(Color.BLACK);
// Define filters to limit redaction to only the first page
RedactionFilter[] filters = new RedactionFilter[] {
    new PageRangeFilter(PageSeekOrigin.BEGIN, 0, 1) // Process only the first page
};