Saleh Yusefnejad edited this page Jul 19, 2021 · 2 revisions

This page covers the basics of comparing one piece of HTML markup against another.

Concepts and background

This library lets you find the differences between two sets of HTML (AngleSharp) DOM nodes. The nodes under test are called test nodes, and the nodes they are compared against are called control nodes.

Differences: There are three types of differences that can be reported:

  • NodeDiff/AttrDiff: Represents a difference between a control and test node or a control and test attribute.
  • MissingNodeDiff/MissingAttrDiff: Represents a difference where a control node or control attribute was expected to exist, but was not found in the test nodes tree.
  • UnexpectedNodeDiff/UnexpectedAttrDiff: Represents a difference where a test node or test attribute was unexpectedly found in the test nodes tree, but did not have a match in the control nodes tree.
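As a rough illustration of how calling code might tell these three kinds of differences apart, the sketch below pattern-matches on the diff classes named above. It assumes the types live in the AngleSharp.Diffing and AngleSharp.Diffing.Core namespaces and that each comparison source exposes a Path property. The Describe helper is hypothetical and exists only for this example.

```csharp
using System;
using System.Linq;
using AngleSharp.Diffing;
using AngleSharp.Diffing.Core; // assumed namespace of IDiff and the diff classes

var diffs = DiffBuilder
    .Compare("<p>Hello World</p>")
    .WithTest("<p>World, I say hello</p>")
    .Build()
    .ToList();

foreach (var diff in diffs)
{
    Console.WriteLine(Describe(diff));
}

// Hypothetical helper: maps each diff class to a short, readable message.
static string Describe(IDiff diff) => diff switch
{
    NodeDiff d => $"Node differs at {d.Test.Path}",
    AttrDiff d => $"Attribute differs at {d.Test.Path}",
    MissingNodeDiff d => $"Node missing from test, expected at {d.Control.Path}",
    MissingAttrDiff d => $"Attribute missing from test, expected at {d.Control.Path}",
    UnexpectedNodeDiff d => $"Unexpected node in test at {d.Test.Path}",
    UnexpectedAttrDiff d => $"Unexpected attribute in test at {d.Test.Path}",
    _ => $"{diff.Result} ({diff.Target})" // fallback for any other diff type
};
```

Matching on the concrete class gives access to the Control and Test sources. When only the kind of difference matters, the Result and Target properties on IDiff may be enough.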

Simple markup comparison

To find the differences between a control HTML markup fragment and a test HTML markup fragment, using the default options, the easiest way is to use the DiffBuilder class, like so:

var controlHtml = "<p>Hello World</p>";
var testHtml = "<p>World, I say hello</p>";
var diffs = DiffBuilder
    .Compare(controlHtml)
    .WithTest(testHtml)
    .Build();
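One way to consume the result is sketched below, assuming Build() returns a sequence of IDiff objects that each expose Result and Target properties. An empty sequence is treated as "the markup matches".

```csharp
using System;
using System.Linq;
using AngleSharp.Diffing;

var controlHtml = "<p>Hello World</p>";
var testHtml = "<p>World, I say hello</p>";

// Materialize the result so it can be counted and enumerated more than once.
var diffs = DiffBuilder
    .Compare(controlHtml)
    .WithTest(testHtml)
    .Build()
    .ToList();

if (diffs.Count == 0)
{
    Console.WriteLine("Test markup matches the control markup.");
}
else
{
    foreach (var diff in diffs)
    {
        // Result describes how the two sides differ; Target describes what differs.
        Console.WriteLine($"{diff.Result}: {diff.Target}");
    }
}
```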

Markup comparison with custom options

If the comparison needs custom options or strategies, pass them to the DiffBuilder through its WithOptions method.

This example adds the default options, as well as an additional custom SpanFilter, which will filter out any <span> elements from the comparison.

using AngleSharp.Diffing;
using AngleSharp.Diffing.Core; // ComparisonSource and FilterDecision

var controlHtml = "<p>Hello World</p>";
var testHtml = "<p>Hello World <span>John</span></p>";
var diffs = DiffBuilder
    .Compare(controlHtml)
    .WithTest(testHtml)
    .WithOptions(options => options.AddDefaultOptions().AddFilter(SpanFilter))
    .Build();

static FilterDecision SpanFilter(in ComparisonSource source, FilterDecision currentDecision)
{
    return source.Node.NodeName == "SPAN"
        ? FilterDecision.Exclude
        : currentDecision;
}

To learn more about custom diffing strategies, see the Creating Custom Diffing Strategies page.
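To see what a filter like SpanFilter changes, the sketch below runs the same comparison twice, once with only the default options and once with the filter added, and prints the number of reported differences for each run. CountDiffs is a hypothetical helper written for this example, and the namespaces are assumptions.

```csharp
using System;
using System.Linq;
using AngleSharp.Diffing;
using AngleSharp.Diffing.Core; // assumed namespace of ComparisonSource and FilterDecision

Console.WriteLine($"Default options only: {CountDiffs(useSpanFilter: false)} difference(s)");
Console.WriteLine($"With SpanFilter:      {CountDiffs(useSpanFilter: true)} difference(s)");

// Hypothetical helper: compares the same two fragments, optionally adding the span filter.
static int CountDiffs(bool useSpanFilter)
{
    return DiffBuilder
        .Compare("<p>Hello World</p>")
        .WithTest("<p>Hello World <span>John</span></p>")
        .WithOptions(options =>
        {
            options.AddDefaultOptions();
            if (useSpanFilter)
            {
                options.AddFilter(SpanFilter);
            }
        })
        .Build()
        .Count();
}

// Excludes <span> elements from the comparison and keeps any earlier decision otherwise.
static FilterDecision SpanFilter(in ComparisonSource source, FilterDecision currentDecision)
{
    return source.Node.NodeName == "SPAN"
        ? FilterDecision.Exclude
        : currentDecision;
}
```

Returning currentDecision for everything that is not a span lets the filter cooperate with the other registered filters instead of overriding them.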

Create a reusable diffing instance

The HtmlDifferenceEngine and DiffBuilder cannot be reused. However, if you need a class that contains a set of diffing strategies and Compare methods, the HtmlDiffer class has you covered.

To use it, you need to pass in an IDiffingStrategy. Currently, the library comes with one, the DiffingStrategyPipeline, which allows you to register any custom strategy with it, as well as the built-in ones.

For example:

using AngleSharp.Diffing;
using AngleSharp.Diffing.Strategies; // DiffingStrategyPipeline

// Create a strategy pipeline and add the default options/strategies to it
var strategy = new DiffingStrategyPipeline();
strategy.AddDefaultOptions();

// Create the differ and pass in the strategy
var differ = new HtmlDiffer(strategy);

// Use the differ to compare markup
var controlHtml = "<p>Hello World</p>";
var testHtml = "<p>Hello World <span>John</span></p>";
var diffs = differ.Compare(controlHtml, testHtml);
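Because the differ holds its strategy, a single instance can serve several comparisons. The sketch below runs one HtmlDiffer over a small list of control/test pairs. The pairs are made up for illustration, and the namespaces are assumptions.

```csharp
using System;
using System.Linq;
using AngleSharp.Diffing;
using AngleSharp.Diffing.Strategies; // assumed namespace of DiffingStrategyPipeline

// Configure the strategy once.
var strategy = new DiffingStrategyPipeline();
strategy.AddDefaultOptions();

// Create the differ once and reuse it for every pair below.
var differ = new HtmlDiffer(strategy);

// Illustrative control/test pairs.
var cases = new (string Control, string Test)[]
{
    ("<p>Hello World</p>", "<p>Hello World</p>"),
    ("<p>Hello World</p>", "<p>Hello World <span>John</span></p>"),
};

foreach (var (control, test) in cases)
{
    var count = differ.Compare(control, test).Count();
    Console.WriteLine($"{count} difference(s): '{control}' vs '{test}'");
}
```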