Free Support Forum - groupdocs.com

Strange problem when trying to redact Xls file

Below is my code in VB.NET:

Using redactor As New Redactor(_CurrentFileName)
    Dim redactionList = New Redaction() {
        New ExactPhraseRedaction("John Doe", New ReplacementOptions("[Client]")),
        New RegexRedaction("Redaction", New ReplacementOptions("[Product]")),
        New AnnotationRedaction("(?im:john)", "[redacted]"),
        New RegexRedaction("\d{2}\s*\d{2}[^\d]\d{6}", New ReplacementOptions(System.Drawing.Color.Blue)),
        New RegexRedaction("\d{2}\s*\d{2}[^\d]\d{6}", New ReplacementOptions("")),
        New RegexRedaction("^\d+[,.]{1}\d+$", New ReplacementOptions(System.Drawing.Color.Blue)),
        New RegexRedaction("^\w+([-+.']\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*$", New ReplacementOptions(System.Drawing.Color.Blue)),
        New CellColumnRedaction(New CellFilter, New Regex("^\w+([-+.']\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*$"), New ReplacementOptions("[customer email]"))
    }
    Dim result As RedactorChangeLog = redactor.Apply(redactionList)
    ' false, if at least one redaction failed
    If result.Status <> RedactionStatus.Failed Then
        Dim opt As Options.SaveOptions = New Options.SaveOptions() With {.AddSuffix = True, .RasterizeToPDF = False, .RedactedFileSuffix = DateTime.Now.ToString("yyyyMMddHHmmss")}
        savedName = redactor.Save(opt)
    Else
        ' Dump all failed or skipped redactions
        For Each logEntry As RedactorLogEntry In result.RedactionLog
            If logEntry.Result.Status <> RedactionStatus.Applied Then
                Console.WriteLine("{0} status is {1}, details: {2}", logEntry.Redaction.GetType().Name, logEntry.Result.Status, logEntry.Result.ErrorMessage)
            End If
        Next logEntry
    End If
End Using

As a test I used your “sample.xlsx”.
On the first sheet, row 2, column A contains the word “redaction”, and the entire cell is replaced with [Product].
Rows 6, 19, 23 and 24 behave the same way.
Row 16 contains the word “redaction”, and only the word is replaced with [Product].
The other two sheets were redacted as expected.

I have saved the redacted files if you need them.

@elumicor

We’ve reproduced this issue and are investigating it under ID REDACTIONNET-308. As soon as there is an update, we’ll notify you.

@elumicor

This issue will be fixed in an upcoming release of the API (probably in version 20.9). In the meantime, as a workaround, you can use only case-insensitive regular expressions, for instance,

New RegexRedaction("(?i)Redaction", new ReplacementOptions("[Product]")),

instead of

New RegexRedaction("Redaction", new ReplacementOptions("[Product]")),

This workaround replaces all occurrences of the word with “[Product]”. If you need to wipe out the entire cell containing this word, as happened in rows 6, 19, 23 and 24, the expression needs start-of-line and end-of-line anchors:

New RegexRedaction("^(?i)([\s\w\d""\[\],\.:\(\)]*Redaction)+[\s\w\d""\[\],\.:\(\)]*$", New ReplacementOptions("[Product]")),
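
If you want to check how the two patterns differ before running them through the API, here is a minimal sketch that uses only System.Text.RegularExpressions. The module and function names are hypothetical, and it assumes the redaction regex is applied to the cell text like an ordinary .NET Regex, which this thread does not confirm.

```vbnet
Imports System
Imports System.Text.RegularExpressions

' Hypothetical helper module, not part of GroupDocs.Redaction.
Module RegexWorkaroundSketch

    ' Case-insensitive pattern from the workaround: matches only the word.
    Public Const WordPattern As String = "(?i)Redaction"

    ' Anchored pattern from the workaround, written as a VB.NET literal:
    ' backslashes are not escaped, and a quote inside the string is doubled.
    Public Const WholeCellPattern As String =
        "^(?i)([\s\w\d""\[\],\.:\(\)]*Redaction)+[\s\w\d""\[\],\.:\(\)]*$"

    ' Replaces every occurrence of the word and keeps the rest of the text.
    Public Function RedactWord(cellText As String) As String
        Return Regex.Replace(cellText, WordPattern, "[Product]")
    End Function

    ' Replaces the whole text when it contains the word and consists only of
    ' characters allowed by the character class; otherwise returns it unchanged.
    Public Function RedactWholeCell(cellText As String) As String
        Return Regex.Replace(cellText, WholeCellPattern, "[Product]")
    End Function

    Sub Main()
        Dim cell As String = "This is a redaction sample"
        Console.WriteLine(RedactWord(cell))      ' This is a [Product] sample
        Console.WriteLine(RedactWholeCell(cell)) ' [Product]
    End Sub

End Module
```

Note that the anchored pattern only matches cells made up of the characters listed in its character class. A cell such as "redaction-free" contains a hyphen, so the anchored pattern leaves it unchanged, and you would need to extend the class to cover it.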