How we compare paragraphs and tables, and how we display the redline

Status: Draft
One principle guides how we compare documents and display the redline: the result must stay readable.

Imagine that you have two paragraphs to compare. A technically correct (but bad) comparison would match up all the common words, such as 'the', 'and', 'as', 'they', and 'of', and then show the important words as inserted and deleted around those tiny matches. Strictly speaking, that would show the changes between the two documents, but it would be nearly unreadable to a human: almost all the text would be inserted or deleted around tiny matches, and the changes would be very hard to understand.

Workshare Compare has code to handle this problem. It detects when a paragraph has very little matched text and a much larger amount of insertions and/or deletions. If the ratio of the two passes a certain threshold, it shows the entire original paragraph as deleted and the new paragraph as inserted, instead of showing the small matches and many individual changes.
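The idea can be illustrated with a minimal Python sketch. This is not Workshare Compare's actual code: the complexity formula (the fraction of words that are inserted or deleted), the value of COMPLEXITY_THRESHOLD, and the function names are all assumptions made for illustration, and the word matching uses Python's standard difflib module.

```python
from difflib import SequenceMatcher

# Hypothetical value; the real threshold is not stated in this article.
COMPLEXITY_THRESHOLD = 0.7


def change_complexity(old_words, new_words):
    """Fraction of words that are inserted or deleted rather than matched.

    This formula is an assumption; the article only says that a ratio of
    matched text to insertions/deletions is compared against a threshold.
    """
    matcher = SequenceMatcher(None, old_words, new_words, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    deleted = len(old_words) - matched
    inserted = len(new_words) - matched
    total = matched + deleted + inserted
    return (deleted + inserted) / total if total else 0.0


def redline_paragraph(old, new, threshold=COMPLEXITY_THRESHOLD):
    """Return a list of (operation, text) pairs describing the redline."""
    old_words, new_words = old.split(), new.split()

    # Too little matched text: show the whole paragraph as deleted + inserted.
    if change_complexity(old_words, new_words) > threshold:
        return [("delete", old), ("insert", new)]

    # Otherwise show the individual changes around the matched text.
    matcher = SequenceMatcher(None, old_words, new_words, autojunk=False)
    ops = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("equal", " ".join(old_words[i1:i2])))
            continue
        if i1 < i2:
            ops.append(("delete", " ".join(old_words[i1:i2])))
        if j1 < j2:
            ops.append(("insert", " ".join(new_words[j1:j2])))
    return ops
```

With these assumptions, changing one word in a six-word sentence stays well under the threshold and is shown as an individual change, while two paragraphs that share only words like 'the' and 'of' are shown as one deletion followed by one insertion.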

The thing to note is that there is a threshold for paragraph change complexity: when the complexity of the changes goes past the threshold, we switch to the complete insert/delete representation for the paragraph. We have set this threshold based on internal testing, but there will sometimes be cases where the complexity value is near the threshold and the 'other' decision might seem the better choice. For example, a paragraph with change complexity slightly below the threshold may still be hard to read, and a paragraph with change complexity slightly above the threshold is shown as insert/delete even though showing the individual changes might have been fine.

What does this have to do with tables? We have a similar change complexity threshold for table comparisons: if it is exceeded, the whole table is shown as deleted and inserted. Exactly the same discussion applies; it is just easier to introduce the concepts using paragraph text.
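As a rough illustration of the same decision applied to a table, the sketch below treats a table as a list of rows of cell text. How Workshare Compare actually measures table complexity is not described in this article, so the cell-based formula, the TABLE_COMPLEXITY_THRESHOLD value, and the function names are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical value; the real table threshold is not stated in this article.
TABLE_COMPLEXITY_THRESHOLD = 0.7


def table_change_complexity(old_table, new_table):
    """Fraction of cells that are inserted or deleted rather than matched.

    Assumption: cells are compared as whole strings in reading order.
    """
    old_cells = [cell for row in old_table for cell in row]
    new_cells = [cell for row in new_table for cell in row]
    matcher = SequenceMatcher(None, old_cells, new_cells, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    changed = (len(old_cells) - matched) + (len(new_cells) - matched)
    total = matched + changed
    return changed / total if total else 0.0


def table_display_mode(old_table, new_table,
                       threshold=TABLE_COMPLEXITY_THRESHOLD):
    """Choose between cell-level changes and whole-table insert/delete."""
    if table_change_complexity(old_table, new_table) > threshold:
        return "whole-table insert/delete"
    return "cell-level changes"
```

Under these assumptions, a table with one edited cell out of four keeps its cell-level changes, while a table whose cells have all been rewritten is shown as completely deleted and inserted.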