LINQ

API Based XML Streaming & Functional OO Programming

Ralf Lämmel, the man behind LINQ to XSD, has a couple of new papers on his site that I had not seen until now. The first paper is on Functional OO Programming and the second is on XML Streaming. I've only skimmed each, but they're bound to be interesting to anyone interested in functional programming, streaming XML APIs, and LINQ.


API-based XML streaming with FLWOR power and functional updates
Functional OO Programming with Triangular Circles

You can also find a bunch of other interesting papers on Ralf's website at: http://homepages.cwi.nl/~ralf/

Converting a CSV file to XML using LINQ to XML and Functional Construction

In this post we aim to transform a text file into a hierarchical XML document.  As shown in Listing 12.11, the text file will contain the following book information: the ISBN, Title, Author(s), Publisher, Publication Date, and Price.

Listing 12.11    CSV of Books

0735621632,CLR via C#,Jeffrey Richter,Microsoft Press,02-22-2006,59.99
0321127420,Patterns Of Enterprise Application Architecture,Martin Fowler,Addison-Wesley Professional,11-05-2002,54.99
0321200683,Enterprise Integration Patterns,Gregor Hohpe,Addison-Wesley Professional,10-10-2003,54.99
0321125215,Domain-Driven Design,Eric Evans,Addison-Wesley Professional,08-22-2003,54.99
1932394613,Ajax In Action,Dave Krane;Eric Pascarello;Darren James,Manning Publications,10-01-2005,44.95

Our goal is to parse the data in the text file and produce a hierarchy of XML as shown below:

Listing 12.12    XML Output

<?xml version="1.0" encoding="utf-8" ?>
<books>
  <book>
    <title>CLR via C#</title>
    <authors>
      <author>
        <firstName>Jeffrey</firstName>
        <lastName>Richter</lastName>
      </author>
    </authors>
    <publisher>Microsoft Press</publisher>
    <publicationDate>02-22-2006</publicationDate>
    <price>59.99</price>
    <isbn>0735621632</isbn>
  </book>
  <book>
    <title>Patterns Of Enterprise Application Architecture</title>
    <authors>
      <author>
        <firstName>Martin</firstName>
        <lastName>Fowler</lastName>
      </author>
    </authors>
    <publisher>Addison-Wesley Professional</publisher>
    <publicationDate>11-05-2002</publicationDate>
    <price>54.99</price>
    <isbn>0321127420</isbn>
  </book>
  …
</books>

The XML is constructed in a bottom-up manner with functional construction. Query expressions that select the relevant data out of the individual lines of the file are intertwined with the construction statements to produce the desired XML.

In order to create our desired XML we’ll need to open the text file, split each line in the file into an array, and place each item in the array into the appropriate XML element.  Let’s start with opening the file and splitting it into parts.

from line in File.ReadAllLines("books.txt")
let items = line.Split(',')
// add functional construction statements for creating the XML

We leverage the static ReadAllLines method available on the File class to read each line within the text file.  Since ReadAllLines returns a string array we can safely use it in our from clause.  To split each line we make use of the Split method available on string, as well as the let clause that is available in C#.  The let clause allows us to perform the split operation once and refer to the result in subsequent expressions.  Once we have our line split apart we can wrap each item into the appropriate XML element.  

var booksXml = new XElement("books",
  from line in File.ReadAllLines("books.txt")
  let items = line.Split(',')
  select new XElement("book",
    new XElement("title", items[1]),
    new XElement("publisher", items[3]),
    new XElement("publicationDate", items[4]),
    new XElement("price", items[5]),
    new XElement("isbn", items[0])
  )
);

We conveniently left the authors out of the above query since they require a little extra work.  Unlike the other fields in our text file, there can be more than one author specified for a single book.  If we go back and review the sample text file, we see that the authors are delimited by a semicolon (“;”).  

    Dave Krane;Eric Pascarello;Darren James
 
As we did with the entire line, we can Split the string of authors into an array, with each author being an individual element in the array.  To be sure we get our fill of Split, we make use of it one final time to break the full author name into first and last name parts.  Finally, we place the statements for parsing out the authors into a query, and wrap the results of our many splits into the appropriate XML.


new XElement("authors",
  from authorFullName in items[2].Split(';')
  let authorNameParts = authorFullName.Split(' ')
  select new XElement("author",
    new XElement("firstName", authorNameParts[0]),
    new XElement("lastName", authorNameParts[1])
  )
)
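
The fragment above assumes every author name consists of exactly two space-separated parts; a single-word name would make authorNameParts[1] throw. A minimal sketch of a more forgiving split follows. The helper names (AuthorParsing, ToAuthorElement, ToAuthorsElement) and the rule "the last word is the last name, everything before it is the first name" are assumptions made for illustration and are not part of the original sample.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class AuthorParsing
{
    // Hypothetical helper: builds an <author> element from a full name.
    // Assumption: the last word is the last name; any preceding words form the first name.
    public static XElement ToAuthorElement(string fullName)
    {
        string[] parts = fullName.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        string lastName = parts.Length > 0 ? parts[parts.Length - 1] : "";
        string firstName = string.Join(" ", parts.Take(Math.Max(parts.Length - 1, 0)));

        return new XElement("author",
            new XElement("firstName", firstName),
            new XElement("lastName", lastName));
    }

    // Builds the <authors> element from the semicolon-delimited author field.
    public static XElement ToAuthorsElement(string authorsField)
    {
        return new XElement("authors",
            from name in authorsField.Split(';')
            select ToAuthorElement(name));
    }
}
```

With this in place, the authors part of the query could be written as AuthorParsing.ToAuthorsElement(items[2]).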


When we add it all together we get the final solution, which can be seen in Listing 12.13.  

Listing 12.13    Final Implementation
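using System;
using System.IO;
using System.Linq;      // named System.Query in the LINQ preview builds
using System.Xml.Linq;  // named System.Xml.XLinq in the LINQ preview builds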

namespace LinqToXmlSamples.FlatFileToXml {
  class Program {
    static void Main(string[] args) {
      XElement xml =
        new XElement("books",
        from line in File.ReadAllLines("books.txt")
        where !line.StartsWith("#")
        let items = line.Split(',')
        select new XElement("book",
          new XElement("title", items[1]),
          new XElement("authors",
            from authorFullName in items[2].Split(';')
            let authorNameParts = authorFullName.Split(' ')
            select new XElement("author",
              new XElement("firstName", authorNameParts[0]),
              new XElement("lastName", authorNameParts[1])
            )
          ),
          new XElement("publisher", items[3]),
          new XElement("publicationDate", items[4]),
          new XElement("price", items[5]),
          new XElement("isbn", items[0])
        )
      );
      Console.WriteLine(xml);
    }
  }
}

As we’ve seen over and over again, Linq to XML allows us to mix and match data from varying data sources in functional construction statements.  The result is a consistent programming API, which makes the way XML is created from other data sources (whether they be relational, object, or a text file) predictable.

Tags: xlinq, linq to xml, linq

LINQ Links - November 28, 2006

When can I start using LINQ in "production"?  Not soon enough...


LINQ Links - November 15th, 2006 Edition

As an admitted LINQ addict I'm going to start posting "LINQ Links", which will contain the most recent and most interesting articles, videos, blog posts, books, etc. that I've found about LINQ.

  • Rough Spots in the LINQ to XML Learning Curve - Mike Champion talks about some of the rough spots that he's seen from those learning LINQ to XML. 
  • LINQ for Visual C# 2005 - It appears all the major publishers are scrambling to get LINQ material out to developers.  Apress is the latest, with this 150 page eBook (Price: $12.49)
  • LINQ MSDN Forums - Ok, I admit it, I couldn't bear to have only 2 links in my inaugural "LINQ Links" post so I went scrounging for something else to include.  The LINQ Forums are a great place to visit if you have any questions or comments for the LINQ folks.

C# Automatic Properties

Bart has a report from Anders' talk at TechEd on C# 3.0 Future Directions.  I'm hoping that the various talks from TechEd will be posted somewhere for us all to enjoy.  One of the new features Bart mentions... that Anders mentions... that I mention now... is Automatic Properties.  Up until this morning I had not heard of it as a feature, but apparently it's coming in a future release.  Like some of the other C# 3.0 features, it sounds like it involves a bit of compiler magic!  Typing this:

public string Bar { get; set; }

Results in the compiler generating this:

private string foo;
public string Bar {
  get { return foo; }
  set { foo = value; }
}
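
To see how the two forms compare from the caller's side, here is a small sketch. The Person and PersonLonghand classes are made up for illustration, and the field called name only stands in for the compiler-generated backing field, whose real name cannot be referenced from C# code.

```csharp
using System;

// Automatic property: the compiler supplies the hidden backing field.
class Person
{
    public string Name { get; set; }
}

// Hand-written equivalent; the field name is illustrative only.
class PersonLonghand
{
    private string name;

    public string Name
    {
        get { return name; }
        set { name = value; }
    }
}

static class AutoPropertyDemo
{
    static void Main()
    {
        var a = new Person { Name = "Anders" };
        var b = new PersonLonghand { Name = "Anders" };

        // Callers cannot tell the two declarations apart.
        Console.WriteLine(a.Name == b.Name);
    }
}
```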

Cool, eh?


More LINQ on Channel 9 & LINQ PDF from O'Reilly

For those looking to feed their Linq hunger, you might want to check out:


Concepts behind the C# 3.0 language

Tomas P has a nice post on the concepts behind the C# 3.0 language.

Extra Linq Extension Methods

Troy Magennis has been posting a lot of interesting content on Linq.  He recently posted about the limitations that he ran into when working with the Standard Query Operators, and how he's looking to overcome those limitations by writing a set of "Extra Linq Extension Methods" that do what he wants.  It will be interesting to see how many alternate implementations and extensions for the Standard Query Operators are released as people find flaws or shortcomings in the default set.
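
To make the idea concrete, here is a sketch of what one such extra operator might look like. Batch is a hypothetical example chosen for illustration; it is not taken from Troy's library, and the class name ExtraOperators is likewise made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ExtraOperators
{
    // Hypothetical operator: splits a sequence into lists of at most `size` items.
    // Arguments are validated eagerly; the batching itself is deferred.
    public static IEnumerable<List<T>> Batch<T>(this IEnumerable<T> source, int size)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
        return BatchIterator(source, size);
    }

    private static IEnumerable<List<T>> BatchIterator<T>(IEnumerable<T> source, int size)
    {
        var current = new List<T>(size);
        foreach (T item in source)
        {
            current.Add(item);
            if (current.Count == size)
            {
                yield return current;
                current = new List<T>(size);
            }
        }

        // Emit the final, possibly shorter, batch.
        if (current.Count > 0) yield return current;
    }
}
```

Because it is an extension method on IEnumerable<T>, it chains like any Standard Query Operator, for example Enumerable.Range(1, 5).Batch(2).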


ActiveRecord Queries get Operator Overloading

Ayende has been doing some cool stuff with operator overloading in ActiveRecord queries.  As I mentioned in some of my previous posts, we've done a lot of the same things with our "Query API".  While it's not as nice as Linq, it's a decent substitute for the time being.


Introduction to Functional Programming

Eric White has put together a nice tutorial on Functional Programming using C# 3.0.  Eric walks through the steps that he took while trying to learn about functional programming.  On my way to and from work I read Eric's tutorial and I really enjoyed it.  Eric talks a lot about the language features in C# 3.0 that allow functional programming, as well as how Linq takes advantage of ideas such as lazy evaluation.  Anyway, it's well worth the read.
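
As a small illustration of the lazy evaluation mentioned above (a sketch that is not taken from Eric's tutorial): a query does no work when it is defined, only when it is enumerated. The LazyDemo class, its Log list, and the Square helper are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class LazyDemo
{
    // Records every value that Square is actually called with.
    public static readonly List<int> Log = new List<int>();

    static int Square(int x)
    {
        Log.Add(x);
        return x * x;
    }

    // Defining the query runs nothing; Select only remembers what to do.
    public static IEnumerable<int> BuildQuery(IEnumerable<int> source)
    {
        return source.Select(x => Square(x));
    }

    static void Main()
    {
        var query = BuildQuery(new[] { 1, 2, 3 });
        Console.WriteLine("Calls after defining the query: " + Log.Count);

        int first = query.First();  // pulls a single item through the pipeline
        Console.WriteLine("First = " + first + ", calls so far: " + Log.Count);
    }
}
```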


PLinq Me!

Sounds cool!  One of the most exciting things about Linq is the ability for alternate implementations to be created for processing Linq expression trees.

Microsoft is working on a parallel implementation of its Language Integrated Query technology that will help programs execute faster, said the creator of the LINQ technology.

Anders Hejlsberg, a Microsoft technical fellow and lead architect for the C# language, said Microsoft has an internal project known as PLinq, which is an effort to create a parallel implementation of LINQ.

http://www.eweek.com/article2/0,1895,2009167,00.asp
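
For a sense of what this can look like in code: in the version of PLINQ that later shipped with .NET, a query opts in to parallel execution by calling AsParallel(). The sketch below assumes that shipped API rather than anything known about the internal project described in the article; ParallelDemo, IsPrime, and CountPrimes are made-up names for a made-up workload.

```csharp
using System;
using System.Linq;

static class ParallelDemo
{
    // Deliberately naive primality test so there is some CPU work to spread out.
    public static bool IsPrime(int n)
    {
        if (n < 2) return false;
        for (int i = 2; (long)i * i <= n; i++)
        {
            if (n % i == 0) return false;
        }
        return true;
    }

    // Same shape as a sequential query; AsParallel() hands the work to PLINQ.
    public static int CountPrimes(int max)
    {
        return Enumerable.Range(1, max)
                         .AsParallel()
                         .Count(n => IsPrime(n));
    }

    static void Main()
    {
        Console.WriteLine(CountPrimes(100000));
    }
}
```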


Linq to XML needs an XNamespaceScope class

When working with XML data we inevitably have to concern ourselves with XML namespaces and XML namespace prefixes.  The Linq to XML API has been designed to make dealing with namespaces and namespace prefixes as direct and straightforward as possible.  Rather than having to deal with XmlNamespaceManagers and the like, we simply reference all of our elements and attributes using their fully expanded name (namespace + local name).

While the simplification provided by Linq to XML makes dealing with namespaces slightly more straightforward, it doesn't go as far as I think it needs to.  We still need to remember to include our namespaces in every query we perform.  When working with XML trees that only contain one default namespace I'd like something simpler.  Enter the XNamespaceScope class.  The XNamespaceScope class would be used similarly to how we make use of the TransactionScope class for managing transactions.  When we're about to work with an XML tree that only contains one namespace we're interested in, we can new up an XNamespaceScope, place it inside a using block that surrounds our query expressions, and have Linq to XML use the XNamespace that's passed to the XNamespaceScope in all queries within the block.  So rather than this code, where we have to repeatedly include our namespace (ns):

    XNamespace ns = "http://webservices.amazon.com/AWSECommerceService/2005-10-05";
    var booksToImport =
      from amazonItem in amazonXml.Descendants(ns + "Item")
      let attributes = amazonItem.Element(ns + "ItemAttributes")
      select new Book {
        Isbn=(string) attributes.Element(ns + "ISBN"),
        Title=(string) attributes.Element(ns + "Title"),
        PubDate=(DateTime) attributes.Element(ns + "PublicationDate"),
        Price=ParsePrice(attributes.Element(ns + "ListPrice")),
        BookAuthors=GetAuthors(attributes.Elements(ns + "Author"))
      };

We instead do:

    XNamespace ns = "http://webservices.amazon.com/AWSECommerceService/2005-10-05";
    using(new XNamespaceScope(ns)) {
      var booksToImport =
        from amazonItem in amazonXml.Descendants("Item")
        let attributes = amazonItem.Element("ItemAttributes")
        select new Book {
          Isbn=(string) attributes.Element("ISBN"),
          Title=(string) attributes.Element("Title"),
          PubDate=(DateTime) attributes.Element("PublicationDate"),
          Price=ParsePrice(attributes.Element("ListPrice")),
          BookAuthors=GetAuthors(attributes.Elements("Author"))
        };
    }
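
Linq to XML has no XNamespaceScope, and its own Descendants and Element methods would not consult one, so the closest approximation in user code needs helper methods that do. The sketch below is one hypothetical way to get there: an ambient, per-thread scope plus extension methods (ScopedDescendants, ScopedElement, ScopedElements) that prepend the current namespace. Every name in it is an assumption made for illustration, and the namespace URI in Main is a placeholder.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical class: not part of Linq to XML.
sealed class XNamespaceScope : IDisposable
{
    [ThreadStatic]
    private static XNamespaceScope current;   // innermost active scope on this thread

    private readonly XNamespaceScope previous;
    private readonly XNamespace ns;

    public XNamespaceScope(XNamespace ns)
    {
        this.ns = ns;
        previous = current;   // remember the outer scope so scopes can nest
        current = this;
    }

    // The ambient namespace, or XNamespace.None when no scope is active.
    public static XNamespace Current
    {
        get { return current != null ? current.ns : XNamespace.None; }
    }

    public void Dispose()
    {
        current = previous;
    }
}

// Hypothetical helpers that resolve local names against the ambient scope.
static class ScopedXmlExtensions
{
    public static IEnumerable<XElement> ScopedDescendants(this XContainer node, string localName)
    {
        return node.Descendants(XNamespaceScope.Current + localName);
    }

    public static XElement ScopedElement(this XContainer node, string localName)
    {
        return node.Element(XNamespaceScope.Current + localName);
    }

    public static IEnumerable<XElement> ScopedElements(this XContainer node, string localName)
    {
        return node.Elements(XNamespaceScope.Current + localName);
    }
}

static class ScopeDemo
{
    static void Main()
    {
        XNamespace ns = "urn:example:books";   // placeholder namespace
        var xml = new XElement(ns + "Items",
            new XElement(ns + "Item", new XElement(ns + "Title", "CLR via C#")));

        using (new XNamespaceScope(ns))
        {
            // ToList() forces evaluation while the scope is still active.
            var titles =
                (from item in xml.ScopedDescendants("Item")
                 select (string)item.ScopedElement("Title")).ToList();
            Console.WriteLine(string.Join(", ", titles));
        }
    }
}
```

One design caveat: because queries are evaluated lazily, a query built inside the using block but enumerated after it would resolve names against whatever scope is active at enumeration time, which is why the sketch materializes the results inside the block.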

 

Thoughts? 


VB9 Prepares for World Domination

With people like Erik Meijer and Brian Beckman leading the charge, Visual Basic is on a course for World Domination.  Check out the Channel 9 video with Brian to learn more about how they're aiming to make VB the language of choice among the developers of the world.  As I stated before, there are a lot of very nice features making their way into VB that could begin to grab my attention away from C#.  Then again, maybe not.

Generating an RSS Feed from the Event Log using Linq to XML

Jim Wooley put together a nice "cool code" sample for the Jacksonville Code Camp.  Obviously anything that uses Linq gets my vote.  He took third place for his Linq to XML code that generates an RSS feed from the Event Log using VB9's XML Literals.

Download


Called on the carpet for not solving ALL Scoble's problems

M. David Peterson calls me on the carpet for not really solving ALL of Scoble’s problems and for not using a fancy algorithm to boot.  I was waiting for someone to expose me and it appears it’s happened.  Anywho, he has a nice post on how he solved Scoble’s problem using XSLT.  He ends up with 29 lines of XML (data + transform) compared to my 48 lines of Linq code.  Of course I like my version much more since it uses Linq, but his post does make an important point: while Linq and Linq to XML do an amazing job of providing a consistent programming API for accessing all sorts of data, there may be times when XSLT (or some other technology) is more appropriate, or maybe not more but just as appropriate.  I for one really like what Linq provides.  The fact that I can write what is essentially transform code in my preferred programming language (C#) using my preferred data query API (Linq), and do so rather quickly, rocks.
