Compare and find the exact difference of top and left of a shape or a table in .NET

Hi,

I have an automated task to compare copies of the same document coming from different sources and find the differences.
My management wants details about each difference, for example which color, style, margins, shapes, images, tables or other formatting changed.

Do you have something handy, or can you help me extract as much information as possible?
I am preparing a visual tool that shows the differences.

Thanks,
Varun

@varun.arora

Could you please tell us about your development environment? Are you planning to use the .NET or the Java variant of the API?

C#.net.

Thanks,
Varun

@varun.arora

Please have a look at the following code:

GroupDocs.Comparison.Comparer comparer = new GroupDocs.Comparison.Comparer(sourcePath);
comparer.Add(targetPath);
CompareOptions compareOptions = new CompareOptions();
StyleSettings changedStyleSettings = new StyleSettings();
changedStyleSettings.HighlightColor = Color.Red;
compareOptions.DetectStyleChanges = true;
compareOptions.GenerateSummaryPage = true;
compareOptions.ChangedItemStyle = changedStyleSettings;
comparer.Compare(@"D:\result.docx", compareOptions);

See also these source, target and output files.zip (26.5 KB). The API detects style changes as well as inserted and deleted items. Furthermore, you can customize how the changes are styled. Have a look at this documentation article. We also have an open-source console application for your convenience; download it from GitHub. Let us know if you need any further assistance.

Thank you.

I compare doc1 MSWord_Template Horizontal Lines.zip (29.6 KB)
with doc2 TextControl_Template Horizontal Lines.zip (13.7 KB)
and got this comparison: Comparison_GroupDocs_Template Horizontal Lines.zip (21.8 KB)

I noticed a few strange things that I would like clarified:

  1. A line is missing in doc2, and that is not captured in the comparison document.
  2. The comparison shows the last few lines as deleted, although they were not actually deleted.
  3. The summary page shows 0 for everything. I also downloaded the code you suggested; it prints an empty summary page.
  4. Can we find out which styles were changed?

I am using below code:

    public int CompareDocs(Stream doc1Stream, Stream doc2Stream)
    {
        string outputFileName = Path.Combine(_DirectoryPath, "GrouDocsResults.docx");

        using (Comparer comparer = new Comparer(doc1Stream))
        {
            comparer.Add(doc2Stream);
            CompareOptions compareOptions = new CompareOptions()
            {
                InsertedItemStyle = new StyleSettings()
                {
                    HighlightColor = System.Drawing.Color.Yellow,
                    FontColor = System.Drawing.Color.DarkGreen,
                    IsUnderline = true,
                    IsBold = true,
                    IsStrikethrough = true,
                    IsItalic = true
                },
                DeletedItemStyle = new StyleSettings()
                {
                    HighlightColor = System.Drawing.Color.LightGreen,
                    FontColor = System.Drawing.Color.DarkRed,
                    IsUnderline = true,
                    IsBold = true,
                    IsStrikethrough = true,
                    IsItalic = true
                },
                ChangedItemStyle = new StyleSettings()
                {
                    HighlightColor = System.Drawing.Color.LightGray,
                    FontColor = System.Drawing.Color.DarkBlue,
                    IsUnderline = true,
                    IsBold = true,
                    IsStrikethrough = true,
                    IsItalic = true
                },
                GenerateSummaryPage = true,
                ExtendedSummaryPage = true,
                CompareVariableProperty = true,
                //CompareDocumentProperty = true,
                CompareBookmarks = true,
                SensitivityOfComparison = 100,
                HeaderFootersComparison = true,
                CalculateCoordinates = true,
                ShowDeletedContent = true,
                DetalisationLevel = DetalisationLevel.High,
                DetectStyleChanges = true,
                MarkChangedContent = true,
                MarkNestedContent = true,
                ShowInsertedContent = true

            };
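            // Dispose the output stream so the result file is flushed and unlocked.
            using (FileStream output = File.Create(outputFileName))
            {
                comparer.Compare(output, compareOptions);
            }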
        }
        Console.WriteLine($"\nDocuments compared successfully.\nCheck output in {Directory.GetCurrentDirectory()}.");
        return 1;
    }

@varun.arora

Please have a look at this output.zip (23.6 KB).

It doesn’t have these issues.

If there is a change, say in the style of a paragraph, the API will report it.
Could you please have a look at the resultant file shared above and point out any issues in it?

You were evaluating the API in trial mode, which has some limitations. You can get a temporary license here in the purchase wizard.

Thank you. I am getting the same comparison now.

  1. I still don’t see the line difference on page 1.
    image.png (9.1 KB)

  2. How can I get details about style changes and format changes?

  3. How can I figure out what components were deleted or inserted?
    Count of deleted components: 8
    Count of inserted components: 3
    Number of changed styles: 565

Thanks,
Varun

@varun.arora

We are investigating this. Your investigation ticket ID is COMPARISONNET-2626.

Please have a look at this screencast.PNG (115.3 KB). You can get the details of style changes and of inserted or deleted items through ChangeInfo[]:

ChangeInfo[] changes = comparer.GetChanges();

Let us know if you have any further query.

Thanks for your reply. I was able to get the style changes.

ChangeInfo[] changes = comparer.GetChanges();
string comparison = string.Empty;
foreach (var change in changes)
{
    comparison += $"Change Type: { change.Type.ToString()} {Environment.NewLine} Change Text: {change.Text} {Environment.NewLine} Source Text {change.SourceText} {Environment.NewLine} Target Text {change.TargetText} {Environment.NewLine}";
    comparison += $" Change ComparisonAction: {change.ComparisonAction.ToString()} - Change Box: X: { change.Box.X.ToString() }, Y: { change.Box.Y.ToString() }, Width: { change.Box.Width.ToString() }, Height: { change.Box.Height.ToString() } {Environment.NewLine}";

    foreach (var styleChange in change.StyleChanges)
    {
        comparison += $"Property: { styleChange.PropertyName} - Old Value: { styleChange.OldValue } - New Value: {styleChange.NewValue} {Environment.NewLine}";
    }
    comparison += Environment.NewLine;
}

I have a couple of questions:

  1. Is there an update on the line deletion for which you opened an investigation ticket?
  2. The API shows some insertions and deletions. Can we learn more about what was inserted or deleted?
  3. In another comparison, an image was moved down a little, but that position change is not tracked. The result docx shows that something changed, but the API text does not show any location or X,Y coordinate change. Please confirm.

Thanks in Advance,
Varun


@varun.arora

We are still investigating this issue. The expected version for the fix is 21.4. We’ll notify you once the release is available.

You can also get this information from the ChangeInfo class. Have a look at this screenshot.PNG.png (74.1 KB) and this output.zip (38.3 KB). Let us know if you have any further questions on this.

Could you please share the source, target and expected output files? We’d appreciate it if you shared the sample code as well.

Hello,
ChangeInfo gives an empty string in the Text property:
Change Type: Inserted
Change Text:
Source Text
Target Text
Change ComparisonAction: None - Change Box: X: 47.9844, Y: 1007.672, Width: 719.7661, Height: 35.79636
See Comparison_GroupDocs_Template Horizontal Lines.zip (16.3 KB)

Here is the other example that you requested. In the MSWord_Template Page Header.zip (26.0 KB)
document, the image overlaps the header, and in the TextControl_Template Page Header.zip (13.0 KB)
document the image sits lower, inside the body. How can we track this?

Also, please let me know if you have any update on the missing line.

Thanks,
Varun

@varun.arora

We are investigating this issue with ticket ID COMPARISONNET-2628.

This issue is also under investigation and the investigation ticket ID is COMPARISONNET-2629.

Please note that all free support issues are handled on a first come, first served basis. This issue is still in the queue. However, you can opt for priority support in order to expedite it.

@varun.arora

We have an update on this issue. Currently, the position and layout options for pictures are not tracked. We will try to add this feature in an upcoming API release (probably 21.4). You’ll be notified of any further update.

@varun.arora

Please have a look at this screenshot.png (19.0 KB).
The “SourceText” and “TargetText” properties are designed to detect and present more advanced information about changes.
For example, finding a paragraph where a change was noticed and providing the state of a given paragraph from both documents.
Accordingly, if a paragraph is fully inserted, “SourceText” will be empty and the value of “TargetText” will match the value of “Text” exactly.
In another example, where one word of a paragraph was changed, “SourceText” and “TargetText” will hold the state of that paragraph in each of the two documents, as shown in the screenshot.
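
The rules described above can be sketched as a small helper. This is only an illustration: `ChangeTextInterpreter` and `Describe` are hypothetical names, not part of GroupDocs.Comparison, and the helper takes plain strings so that it does not depend on the library. The "whole paragraph deleted" branch is an assumption made by symmetry with the inserted case; it is not confirmed in this thread.

```csharp
using System;

// Hypothetical helper, not part of GroupDocs.Comparison.
public static class ChangeTextInterpreter
{
    // Classifies a change from the Text, SourceText and TargetText values
    // reported for it, passed in here as plain strings.
    public static string Describe(string text, string sourceText, string targetText)
    {
        bool noSource = string.IsNullOrEmpty(sourceText);
        bool noTarget = string.IsNullOrEmpty(targetText);

        // Both empty: no paragraph text was reported for this change.
        if (noSource && noTarget)
            return "no paragraph text reported";

        // Stated in the thread: a fully inserted paragraph has an empty
        // SourceText and a TargetText equal to Text.
        if (noSource)
            return targetText == text ? "whole paragraph inserted" : "text present only in target";

        // Assumption by symmetry with the inserted case.
        if (noTarget)
            return sourceText == text ? "whole paragraph deleted" : "text present only in source";

        // Both present: the two values are the paragraph's state in each document.
        return "paragraph changed in place";
    }
}
```

With a real result, the three arguments would come from `change.Text`, `change.SourceText` and `change.TargetText` of each ChangeInfo entry.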

Hello,

We have thousands and thousands of documents to compare that are the same but come from different sources.
We are trying to automate the comparison and learn what the differences are.

Would you mind setting up a meeting to present a demo and answer a couple of our scenarios?

We work in US eastern time zone.

Thanks,
Varun


@varun.arora

Please have a look at our free support policy. I am afraid all free support matters are handled only through this forum.

Hello Atir,

We got one StyleChanged entry where all three text fields were empty. What does that imply?

Change Type: StyleChanged
Change Text:
Source Text
Target Text
Change ComparisonAction: None - Change Box: X: 46.0517, Y: 111.2972, Width: 256.9165, Height: 32.98928
Property: Width - Old Value: 188.65 - New Value: 189
Property: Height - Old Value: 21.85 - New Value: 21.75

In the screenshot of the document comparison below, a table was shifted onto the next line. The comparison doc indicates that something changed but doesn’t say what. It is not even tracked in the changes returned by the API.
Table-reinserted-next-line.png (2.7 KB)

Regarding the demo: I am not asking for more support. I am asking for a demo so that we can decide whether the product meets our requirements for the massive comparison task we have ahead. Management can then decide whether to purchase the product as well as paid support.

Thanks,
Varun

@varun.arora

We’ll surely help you out here. Could you please also share those documents? We’ll compare them at our end and answer your questions in detail.

Hello Atir,

  1. I am trying to compare MSWord_Template Horizontal Lines.zip (29.6 KB)
    with TextControl_Template Horizontal Lines.zip (13.7 KB)
    I am getting lots of changes for deletion of HeaderFooter. 0001_GDResults_Template Horizontal Lines.zip (17.4 KB)

Can you please explain why those entries appear?

  2. For this document comparison MSWord_Default CC Form.zip (11.8 KB)
    with TextControl_Default CC Form.zip (5.4 KB)
    the position alignment has changed: DefaultCCForm.png (2.5 KB)
    I think this result set is not tracking it: 0001_GDResults_Default CC Form.zip (1.9 KB)

My code looks like:

    public int CompareDocs(Stream doc1Stream, Stream doc2Stream, string customerFolder, string templateName, string counter)
    {
        string extension = Utility.GetFileExtension(doc1Stream);
        string outputFileName = Path.Combine($"{ Constants._ComparisonDirectoryPath}{customerFolder}\\{Constants._GDResultsDirectoryNameDocx}\\", $"{counter}_GDResults_{templateName}.{extension}");

        using (Comparer comparer = new Comparer(doc1Stream))
        {
            comparer.Add(doc2Stream);
            CompareOptions compareOptions = new CompareOptions()
            {
                InsertedItemStyle = new StyleSettings()
                {
                    HighlightColor = System.Drawing.Color.Yellow,
                    FontColor = System.Drawing.Color.DarkGreen,
                    IsUnderline = true,
                    IsBold = true,
                    IsStrikethrough = true,
                    IsItalic = true
                },
                DeletedItemStyle = new StyleSettings()
                {
                    HighlightColor = System.Drawing.Color.LightGreen,
                    FontColor = System.Drawing.Color.DarkRed,
                    IsUnderline = true,
                    IsBold = true,
                    IsStrikethrough = true,
                    IsItalic = true
                },
                ChangedItemStyle = new StyleSettings()
                {
                    HighlightColor = System.Drawing.Color.LightGray,
                    FontColor = System.Drawing.Color.DarkBlue,
                    IsUnderline = true,
                    IsBold = true,
                    IsStrikethrough = true,
                    IsItalic = true
                },
                GenerateSummaryPage = true,
                ExtendedSummaryPage = true,
                CompareVariableProperty = true,
                //CompareDocumentProperty = true,
                CompareBookmarks = true,
                SensitivityOfComparison = 100,
                HeaderFootersComparison = true,
                CalculateCoordinates = true,
                ShowDeletedContent = true,
                DetalisationLevel = DetalisationLevel.High,
                DetectStyleChanges = true,
                MarkChangedContent = true,
                MarkNestedContent = true,
                ShowInsertedContent = true
            };
            string outputTextFileName = Path.Combine($"{ Constants._ComparisonDirectoryPath}{customerFolder}\\{Constants._GDResultsDirectoryNameTxt}\\", $"{counter}_GDResults_{templateName}.txt");
            try
            {
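                // Dispose the output stream so the result file is flushed and unlocked.
                using (FileStream output = File.Create(outputFileName))
                {
                    comparer.Compare(output, compareOptions);
                }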
                ChangeInfo[] changes = comparer.GetChanges();
                string comparison = string.Empty;
                foreach (var change in changes)
                {
                    
                    comparison += $"Change Type: { change.Type.ToString()} {Environment.NewLine} Change Text: {change.Text} {Environment.NewLine} Source Text {change.SourceText} {Environment.NewLine} Target Text {change.TargetText} {Environment.NewLine}";
                    comparison += $"Component Type: { change.ComponentType}, Change ComparisonAction: {change.ComparisonAction.ToString()},{Environment.NewLine}";
                    comparison += $"Change Box: X: { change.Box.X.ToString() }, Y: { change.Box.Y.ToString() }, Width: { change.Box.Width.ToString() }, Height: { change.Box.Height.ToString() } {Environment.NewLine}";
                    comparison += $"Page Info: Page number - { change.PageInfo.PageNumber }, Page Width: {change.PageInfo.Width}, Page Height: {change.PageInfo.Height}{Environment.NewLine}";


                    foreach (var styleChange in change.StyleChanges)
                    {
                        comparison += $"Property: { styleChange.PropertyName} - Old Value: { styleChange.OldValue } - New Value: {styleChange.NewValue} {Environment.NewLine}";
                    }
                    comparison += $"===================================================================={Environment.NewLine}{Environment.NewLine}{Environment.NewLine}{Environment.NewLine}";
                }
                File.WriteAllText(outputTextFileName, comparison);
            }
            catch(Exception ex)
            {
                log.Error($"Error in Converting Document: {Environment.NewLine}{ex.Message}{Environment.NewLine}{ex.ToString()}");
                File.WriteAllText(outputTextFileName, 
                    $"Error in Converting Document: {Environment.NewLine}{ex.Message}{Environment.NewLine}{ex.ToString()}");
            }
        }
        Console.WriteLine($"\nDocuments compared successfully.\nCheck output in {Directory.GetCurrentDirectory()}.");
        return 1;
    }

Thanks,
Varun

@varun.arora

We are investigating this. The investigation ID for this issue is COMPARISONNET-2651.

This scenario is also under investigation with ticket ID COMPARISONNET-2652.
We’ll notify you as soon as there is any progress.