LINQ to XML and XInclude

Motivation

Split an XML file into several files at design time and assemble them into one XDocument at run time. The solution should use the XInclude syntax.

Suppose there is an .xml file that you use, for instance, for configuration. Its content is parsed in your code with LINQ to XML.

    var doc = XDocument.Load(path);
    doc.Root.Elements()....
    ...
For simplicity we will expect that path contains a correct absolute or relative path. But be ready to write some PathBuilder for the "relative" paths provided in nested files...
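
One possible shape for such a path builder is sketched below. It is only an illustration: the class name IncludePathBuilder and the method Resolve are hypothetical, and the sketch assumes that a relative href should be resolved against the directory of the file that contains the include.

```csharp
using System;
using System.IO;

public static class IncludePathBuilder
{
    // Resolves href relative to the directory of the including file.
    // Assumption: hrefs are plain file paths, not URLs.
    public static string Resolve(string includingFile, string href)
    {
        var fullPath = Path.GetFullPath(includingFile);
        // GetDirectoryName returns null only for a root path; fall back to the current directory.
        var baseDir = Path.GetDirectoryName(fullPath) ?? Directory.GetCurrentDirectory();
        // Path.Combine returns href unchanged when href is already absolute.
        return Path.GetFullPath(Path.Combine(baseDir, href));
    }
}
```

With this sketch, an href of Part_B_1.xml found in config/Part_B.xml resolves to config/Part_B_1.xml, regardless of where the application itself is running.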

After some time the XML file can reach a size that is hard to maintain and hard to navigate. You would like to use something like the XInclude syntax to split the file into several parts:

  • Container.xml (including Part_A.xml and Part_B.xml)
  •     Part_A.xml
  •     Part_B.xml
  •         Part_B_1.xml (included in Part_B.xml)
  •         Part_B_2.xml (included in Part_B.xml)

We would like to have these XML files merged and available as one XDocument, ready for LINQ to XML querying.

XInclude syntax

There is an element often used for this purpose, the XInclude include element:

<xi:include href="other.xml" xmlns:xi="http://www.w3.org/2001/XInclude" />

The usage of this statement in the Container.xml file would look like:

// Container.xml
<?xml version="1.0" encoding="utf-8" ?>
<root xmlns="http://ProjectBase/Config.xsd" >    
    ...
    <xi:include href="Part_A.xml" xmlns:xi="http://www.w3.org/2001/XInclude" />
    ...
    <xi:include href="Part_B.xml" xmlns:xi="http://www.w3.org/2001/XInclude" />
    ...
</root>

To make it more hierarchical, there is another level expressed by Part_B_1.xml and Part_B_2.xml, which are included in Part_B.xml:

// Part_B.xml
<?xml version="1.0" encoding="utf-8" ?>
<node xmlns="http://ProjectBase/Config.xsd" >    
    ...
    <xi:include href="Part_B_1.xml" xmlns:xi="http://www.w3.org/2001/XInclude" />
    ...
    <xi:include href="Part_B_2.xml" xmlns:xi="http://www.w3.org/2001/XInclude" />
    ...
</node>

LINQ to XML and native XInclude support

LINQ to XML has no native support for XInclude. That is reasonable. The essential attribute of the XInclude syntax is href, which gives the path to another .xml file. That file must be read and included, and this must happen while the XDocument is being assembled (before the first LINQ to XML extension method call). That is outside the scope of LINQ to XML; it must be handled during XDocument construction (not parsing).

LINQ to XML is a set of extension methods that operate on parts of an XDocument. But the XDocument must already be assembled.

We need to extend the XDocument creation. We have to replace the XInclude elements with the content of the target files before LINQ to XML is applied...
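
A small check makes this concrete. The sketch below parses a document containing an xi:include and counts the include elements that are still in the tree afterwards. The class name IncludeProbe and the inline XML are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class IncludeProbe
{
    // The XInclude namespace, as used in the snippets above.
    public static readonly XNamespace Xi = "http://www.w3.org/2001/XInclude";

    // Counts xi:include elements that are still present after parsing.
    public static int CountUnresolvedIncludes(string xml)
    {
        var doc = XDocument.Parse(xml);
        return doc.Descendants(Xi + "include").Count();
    }

    public static void Main()
    {
        const string xml =
            "<root xmlns:xi=\"http://www.w3.org/2001/XInclude\">" +
            "<xi:include href=\"Part_A.xml\" />" +
            "</root>";

        // The parser does not follow href; the include is just another element.
        Console.WriteLine(CountUnresolvedIncludes(xml)); // prints 1
    }
}
```

The include survives parsing as an ordinary XElement, which is exactly what lets us find it with Descendants() and replace it ourselves.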

Solution

We've started with the statement

    var doc = XDocument.Load(path); // the XDocument creation

Before we continue working with the doc variable (executing LINQ to XML methods), we will replace the XInclude elements with their targets, hierarchically (repeating the same for the children of the children). All of that will be done by

  1. one recursive method reading the included files and
  2. its call right after the doc container is loaded.

Recursive call

Let's start with the recursive method. The snippet contains comments, which should explain almost everything.

// required usings: System, System.IO, System.Linq, System.Xml.Linq
public static readonly XNamespace ConstNsInclude = "http://www.w3.org/2001/XInclude";
public const string ConstInclude = "include";
public const string ConstHref = "href";

protected virtual XElement ReplaceInclude(XElement include)
{
    if (include == null || include.Attribute(ConstHref) == null)
    {
        return null; // this is not an include in fact
    }
    var path = CreatePath(include.Attribute(ConstHref).Value); // CreatePath() see below
    if (!File.Exists(path))
    {
        return null; // the hyperlink reference is not targeting an existing file
    }
    XDocument doc; // the target could be some binary or damaged file
    try
    {
        doc = XDocument.Load(path);
    }
    catch
    {
        return null; // the content of the target is not XML
    }
    while (doc.Descendants(ConstNsInclude + ConstInclude).Any())
    {
        var child = doc.Descendants(ConstNsInclude + ConstInclude).First();
        // the same for child XIncludes; a null result simply removes the include
        child.ReplaceWith(ReplaceInclude(child));
    }
    return doc.Root; // return the changed, complete graph
}

protected virtual string CreatePath(string path)
{
    // implement the path creation, e.g.
    // href="part_B_1.xml" --> D:\data\part_B_1.xml

    // example implementation
    var target = path.Replace('/', '\\');
    if (target.StartsWith("\\", StringComparison.Ordinal))
    {
        target = target.Remove(0, 1);
    }
    return Path.Combine(AppDomain.CurrentDomain.BaseDirectory, target);
}

Good, this recursive call will replace the XInclude elements with the content of referenced files.

Keep in mind that at this point you should not call doc.Save(), because it would write the content of all these files into Container.xml.

Extending doc.Load()

We have a recursive XInclude reader, and we have to call it after Container.xml is loaded.

    ...
    var doc = XDocument.Load(CreatePath(path));
    // XInclude
    while (doc.Descendants(ConstNsInclude + ConstInclude).Any())
    {
        var child = doc.Descendants(ConstNsInclude + ConstInclude).First();
        child.ReplaceWith(ReplaceInclude(child));
    }

    // doc now includes all the files and can be queried with LINQ to XML
    ...
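
To see both pieces working together, here is a self-contained sketch that writes three small files to a temporary folder, assembles them and queries the result. It is a simplified variant, not the exact code above: the names XIncludeDemo and LoadWithIncludes are hypothetical, hrefs are assumed to be relative to the including file's directory, and circular includes are assumed not to occur.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Xml.Linq;

public static class XIncludeDemo
{
    static readonly XNamespace Xi = "http://www.w3.org/2001/XInclude";

    // Loads a file and replaces every xi:include with the root element of its target.
    // Assumptions: href is relative to the including file; there are no circular includes.
    public static XDocument LoadWithIncludes(string file)
    {
        var doc = XDocument.Load(file);
        var dir = Path.GetDirectoryName(Path.GetFullPath(file)) ?? ".";
        XElement include;
        while ((include = doc.Descendants(Xi + "include").FirstOrDefault()) != null)
        {
            var href = (string)include.Attribute("href");
            var target = href == null ? null : Path.Combine(dir, href);
            if (target != null && File.Exists(target))
            {
                // Recursion resolves the includes nested in the target first.
                include.ReplaceWith(LoadWithIncludes(target).Root);
            }
            else
            {
                // Missing target: drop the include so the loop always makes progress.
                include.Remove();
            }
        }
        return doc;
    }

    public static void Main()
    {
        var dir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(dir);
        File.WriteAllText(Path.Combine(dir, "Container.xml"),
            "<root xmlns:xi=\"http://www.w3.org/2001/XInclude\">" +
            "<xi:include href=\"Part_B.xml\" /></root>");
        File.WriteAllText(Path.Combine(dir, "Part_B.xml"),
            "<node name=\"B\" xmlns:xi=\"http://www.w3.org/2001/XInclude\">" +
            "<xi:include href=\"Part_B_1.xml\" /></node>");
        File.WriteAllText(Path.Combine(dir, "Part_B_1.xml"), "<item name=\"B1\" />");

        var doc = LoadWithIncludes(Path.Combine(dir, "Container.xml"));

        // Query the assembled tree as if it had been a single file.
        foreach (var name in doc.Descendants().Attributes("name").Select(a => a.Value))
        {
            Console.WriteLine(name); // prints B, then B1
        }
        Directory.Delete(dir, true);
    }
}
```

Unlike the article's version, this sketch does not guard against a target that is not valid XML; wrapping the inner load in a try/catch, as ReplaceInclude does, would restore that behaviour.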

Summary

It is a good decision to split an XML file into several files when it starts to overgrow. The recursive method and its call described above let us work with this wide content as one XDocument in LINQ to XML.
