Compare two Word documents with data in tables using .NET

Hello support,

we are comparing two Word documents with some data in tables using GroupDocs.Comparison.
During testing of our POC (.NET) we ran into some incorrect comparison results that we would like to highlight. Could you clarify whether these issues could be fixed if we purchase a GroupDocs.Comparison license?

Please see the following attached documents:

  • v1.docx - version #1
  • v2.docx - version #2
  • GroupDocs.ComparisonResult.docx - comparison result from GroupDocs.Comparison
  • LiteraTables.docx - comparison result from Litera, not perfect but readable

Please follow the link to find docs and results:

Best regards,
Olga Kyrychuk

@olgakyrychuk,

Thank you for your interest in GroupDocs.Comparison for .NET and for posting your concerns. To investigate this issue further at our end, we need the following details from you:

  • API Version that you integrated in your project/code
  • Sample project or code that you wrote for the comparison

@atirtahir3,

Thank you for your quick answer!
We are using GroupDocs.Comparison version 18.3.0.

This is the piece of code we wrote for the comparison:

    private static void RunGroupDocsComparison(Options opts, string sourceFilePath, string targetFilePath,
        string resultFilePath)
    {
        // Define and set comparison settings and properties.
        var objComparisonSettings = new ComparisonSettings
        {
            // Render deleted items in red with strikethrough.
            DeletedItemsStyle = new StyleSettings
            {
                Color = System.Drawing.Color.Red,
                StrikeThrough = true
            },

            GenerateSummaryPage = opts.Summary,
            DetailLevel = opts.Level
        };

        // Get an instance of GroupDocs.Comparison.Comparer and call its Compare method.
        var comparer = ComparisonHelper.GetComparer();
        var result = comparer.Compare(sourceFilePath, targetFilePath, objComparisonSettings);

        // Save the result document to a file.
        //result.SaveDocument(resultFilePath);
        using (var fs = File.Create(resultFilePath))
        {
            result.GetStream().CopyTo(fs);
        }
    }

@olgakyrychuk,

Thank you for sharing the code snippet. We tried to reproduce the same issue at our end, but the output file we generated is very different from yours. Please find our output file attached and let us know whether this is the output you were expecting.
output.zip (29.4 KB)

Hello, we simplified our solution and ran GroupDocs.Comparison for .NET a few more times.

The results we got are not stable:
Sometimes we get the result you sent in output.zip, and sometimes we get the result we sent you earlier.
Please note that the 2nd table in output.zip is not correct.

I am sending you our solution and a video of the code runs ([https://www.youtube.com/watch?v=ijGGu4uTbWs]) showing that the result is not stable: the same input docx files produce different results.

Please find solution here: [Rrd.ActiveDocument.Comparison.7z - Google Drive]

v1.docx and v2.docx are the same as before and are located here [GroupDocsCompare.7z - Google Drive]

Thank you,
Olga
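
For anyone who wants to check this kind of instability without watching the output by eye, here is a minimal sketch that hashes the output of several runs and reports whether they all match. The class and method names (DeterminismCheck, HashRuns, IsStable) are made up for illustration and are not part of GroupDocs.Comparison; only standard .NET types are used. One caveat: a .docx is a zip package, so its bytes may differ between runs for reasons unrelated to the compared content (for example embedded timestamps), which means a hash mismatch is a hint to look closer rather than proof of a wrong result.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;

// Hypothetical helper, not part of GroupDocs.Comparison.
public static class DeterminismCheck
{
    // Calls produceOutput `runs` times and returns the SHA-256 hex digest of each output.
    public static List<string> HashRuns(Func<byte[]> produceOutput, int runs)
    {
        var hashes = new List<string>();
        using (var sha = SHA256.Create())
        {
            for (int i = 0; i < runs; i++)
            {
                byte[] output = produceOutput();
                byte[] digest = sha.ComputeHash(output);
                hashes.Add(BitConverter.ToString(digest).Replace("-", ""));
            }
        }
        return hashes;
    }

    // True when every run produced byte-identical output.
    public static bool IsStable(IEnumerable<string> hashes)
    {
        return hashes.Distinct().Count() <= 1;
    }
}
```

With the method from the earlier post, produceOutput could run RunGroupDocsComparison on v1.docx and v2.docx and then return File.ReadAllBytes(resultFilePath).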

@olgakyrychuk,

Thank you for providing the details. We successfully reproduced this issue at our end, and it has been logged in our internal issue tracking system with ID COMPARISONNET-1529. We will now investigate the root cause and notify you as soon as we have any updates.

@atirtahir3, thank you for your answer and for creating an issue.

Could you please provide an approximate estimate of when this issue will be picked up and fixed?
We are currently deciding which tool to use for comparison, and this functionality is very important to us.

Thank you,
Olga

@olgakyrychuk,

Issue COMPARISONNET-1529 is under investigation. We will share any updates with you as soon as we have them.

@olgakyrychuk,

We would like to inform you that COMPARISONNET-1529 will be resolved in version 18.4 of the API. We plan to release GroupDocs.Comparison for .NET 18.4 in mid-April and will notify you once the release is available.

@atirtahir3
Thank you very much for such a quick answer and for the estimate!

Olga

@olgakyrychuk,

You are welcome.

Hello,

could you please share an update on when GroupDocs.Comparison for .NET 18.4 will be available?
Where can I find a list of the features added or fixed in this release?

Thank you a lot!

Olga

@olgakyrychuk,

The release will be published by the end of this week.

Please see the GroupDocs.Comparison for .NET 18.4 release notes for the features added or fixed.

Hello, thank you for your feedback!

Today I updated the library to the new release, version 18.4.

I tried to compare the v1.docx and v2.docx files that are attached to this bug.
You can find the result I got at this link: Screenshot by Lightshot

I have found a few issues in the new results:

  1. The second table looks the same as in version 18.3.
  2. What does the text "Evaluation Only. Created with Aspose.Words. Copyright 2003-2018 Aspose Pty Ltd." and "This document was truncated here because it was created in the Evaluation Mode." inside the compared doc mean? I didn't see it in 18.3.

Could you please try to compare v1.docx and v2.docx on your side?

Thank you very much!

Olga

@olgakyrychuk,

We also compared both problematic files at our end using the latest release (18.4).

We are further investigating the output behavior.

The text you quoted is the evaluation tag. It appears in output documents when the license is not applied properly or has expired. Please make sure that your license has not expired and that it has been loaded and applied properly in the application.
This is the output file we got at our end: 18.4 output.zip (30.8 KB)

Hello GroupDocs,

the result at your end using 18.4 looks similar to our result with the 18.4 library.
But the table looks strange.
Would it be possible to compare only the data in the tables and not the table cells themselves? (In fact the table is the same in both versions and only the data was changed.)
Please see this screenshot of one possible option [Screenshot by Lightshot]

@olgakyrychuk,

We are further investigating this scenario and will notify you about the outcome.

@olgakyrychuk,

We investigated the scenario you posted earlier, and here are the outcomes. In this screenshot, sameparts.png (69.4 KB),
the highlighted content appears in both the source and the target document, so the API treats it as an unchanged part. That is why the other columns in the result file are drawn in red and blue (deleted and inserted).
However, in this result, different content.png (2.0 KB), the content is completely (or almost completely) different in the two documents. That is why it appears in the result file as deleted/inserted.
It is not possible to compare only the data inside the tables; the API compares the content together with its properties (font family, font size, etc.).
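
For readers who only need to know which cell values changed, one option outside GroupDocs.Comparison is to read the table text straight from the two .docx packages and compare it cell by cell. The sketch below is an assumption-laden illustration, not a GroupDocs feature: TableTextDiff and its methods are hypothetical names, it uses only standard .NET types (on .NET Framework, ZipFile needs a reference to System.IO.Compression.FileSystem), it assumes both documents have the same table layout, and it ignores formatting entirely. Nested tables are also reported as separate tables, and rows wrapped in content controls are skipped.

```csharp
using System;
using System.Collections.Generic;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of GroupDocs.Comparison.
public static class TableTextDiff
{
    // WordprocessingML main namespace used by word/document.xml.
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Loads the main document part from a .docx package.
    public static XDocument LoadMainPart(string docxPath)
    {
        using (var zip = ZipFile.OpenRead(docxPath))
        using (var stream = zip.GetEntry("word/document.xml").Open())
        {
            return XDocument.Load(stream);
        }
    }

    // Returns tables -> rows -> cells, each cell as its concatenated plain text.
    public static List<List<List<string>>> ExtractTables(XDocument doc)
    {
        return doc.Descendants(W + "tbl")
            .Select(tbl => tbl.Elements(W + "tr")
                .Select(row => row.Elements(W + "tc")
                    .Select(cell => string.Concat(
                        cell.Descendants(W + "t").Select(t => t.Value)))
                    .ToList())
                .ToList())
            .ToList();
    }

    // Lists cells whose text differs; layout mismatches are reported, not aligned.
    public static List<string> Diff(
        List<List<List<string>>> before, List<List<List<string>>> after)
    {
        var changes = new List<string>();
        if (before.Count != after.Count)
            changes.Add($"table count: {before.Count} -> {after.Count}");
        for (int t = 0; t < Math.Min(before.Count, after.Count); t++)
        {
            if (before[t].Count != after[t].Count)
                changes.Add($"table {t + 1}: row count {before[t].Count} -> {after[t].Count}");
            for (int r = 0; r < Math.Min(before[t].Count, after[t].Count); r++)
            {
                var oldRow = before[t][r];
                var newRow = after[t][r];
                if (oldRow.Count != newRow.Count)
                    changes.Add($"table {t + 1}, row {r + 1}: cell count {oldRow.Count} -> {newRow.Count}");
                for (int c = 0; c < Math.Min(oldRow.Count, newRow.Count); c++)
                {
                    if (oldRow[c] != newRow[c])
                        changes.Add($"table {t + 1}, row {r + 1}, cell {c + 1}: \"{oldRow[c]}\" -> \"{newRow[c]}\"");
                }
            }
        }
        return changes;
    }
}
```

A typical call would be TableTextDiff.Diff(TableTextDiff.ExtractTables(TableTextDiff.LoadMainPart("v1.docx")), TableTextDiff.ExtractTables(TableTextDiff.LoadMainPart("v2.docx"))), which returns one line per changed cell.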