Monday, November 8, 2010

LINQ – In-Memory Collections

In this article we will cover only the querying of in-memory collections.

This article has been designed to give you a core understanding of LINQ that we will rely heavily on in subsequent parts of this series.

Before diving into the code, it is essential to define what LINQ actually is. LINQ is not C# 3.0, and C# 3.0 is not LINQ. LINQ relies heavily on the language enhancements introduced in C# 3.0; LINQ itself, however, is essentially a set of standard query operators that let you work with data in a more intuitive way, regardless of the data source.

The benefits of using LINQ are significant: queries are first-class citizens within the C# language, they are checked at compile time, and they can be debugged (stepped through). We can expect the next Visual Studio IDE to take full advantage of these benefits; certainly the March 2007 CTP of Visual Studio Orcas does!

In-Memory Collections

The best way to teach a new technology is simply to show you an example and then explain what the heck is going on! That will be my approach throughout this series; hopefully it is a wise decision.

For our first example we will compose a query to retrieve all the items in a generic List collection (Fig. 1).

Figure 1: Selecting all the items in a generic List collection

private static List<string> people = new List<string>() 
{ 
  "Granville", "John", "Rachel", "Betty", 
  "Chandler", "Ross", "Monica" 
};
 
public static void Example1() 
{
  IEnumerable<string> query = from p in people select p;
  foreach (string person in query) 
  {
    Console.WriteLine(person);
  }
}

The code example given in Fig. 1 is very basic, and its functionality could have been replicated more easily by simply enumerating the items in the List with a foreach loop.

In Fig. 1 we compose a query that returns each of the items in the people List collection. The from clause introduces a range variable p that stands for each element of people in turn, and the select clause then selects p (remember that p is of type string, as the people List is a collection of immutable string objects).

You may notice that query is of type IEnumerable<string>; this is because we know that query will yield a sequence of strings. When we foreach through the query, the GetEnumerator method of query is invoked.
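Because the enumerator is only requested when the foreach runs, composing the query does not read the list by itself. The following is a minimal sketch of that behaviour, not part of the original example: the class name DeferredExecutionSketch and the extra name "Phoebe" are made up for illustration, and the list is shortened.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical class name, used only for this illustration.
public static class DeferredExecutionSketch
{
    // Composes the Fig. 1 query, changes the list, and only then enumerates.
    public static List<string> Run()
    {
        List<string> people = new List<string> { "Granville", "John", "Rachel" };

        // Composing the query does not walk the list yet.
        IEnumerable<string> query = from p in people select p;

        // Added after the query was composed but before it is enumerated
        // ("Phoebe" is an illustrative value, not from the original list).
        people.Add("Phoebe");

        // GetEnumerator is invoked here, so the late addition is visible.
        List<string> seen = new List<string>();
        foreach (string person in query)
        {
            seen.Add(person);
        }
        return seen;
    }

    public static void Main()
    {
        foreach (string person in Run())
        {
            Console.WriteLine(person);
        }
    }
}
```

Run returns four names, including the one added after the query was composed, which suggests the list is only read once enumeration begins.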

At this time it is beneficial to look at exactly what the compiler generated code looks like (Fig. 2).

Figure 2: Compiler generated code for Fig. 1

public static void Example1()
{
  IEnumerable<string> query = people.Select<string, string>(delegate (string p) 
  {
    return p;
  });
  foreach (string person in query)
  {
    Console.WriteLine(person);
  }
}

Fig. 2 reveals that our query has actually been converted by the compiler to use an extension method (in this case just the Select extension method is used) taking a delegate as its argument.

You will find that queries and lambda expressions are simply a facade that we deal with in order to make our lives easier – under the covers the compiler is generating the appropriate code using delegates. Be aware of this internal compiler behavior!
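To make the facade concrete, here is a small sketch that writes the Fig. 1 query three ways: as a query expression, as a direct Select call with a lambda, and as a direct Select call with an anonymous method in the style of Fig. 2. The class and method names are assumptions chosen for this illustration; only Select and SequenceEqual come from System.Linq.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical class name, used only for this illustration.
public static class MethodSyntaxSketch
{
    private static readonly List<string> people = new List<string>
    {
        "Granville", "John", "Rachel", "Betty",
        "Chandler", "Ross", "Monica"
    };

    // Query expression form, as written in Fig. 1.
    public static IEnumerable<string> QuerySyntax()
    {
        return from p in people select p;
    }

    // The same query as a direct extension method call with a lambda.
    public static IEnumerable<string> LambdaSyntax()
    {
        return people.Select(p => p);
    }

    // The same query with an anonymous method, in the style of Fig. 2.
    public static IEnumerable<string> AnonymousMethodSyntax()
    {
        return people.Select<string, string>(delegate (string p) { return p; });
    }

    public static void Main()
    {
        // All three forms should yield the same sequence.
        Console.WriteLine(QuerySyntax().SequenceEqual(LambdaSyntax()));
        Console.WriteLine(LambdaSyntax().SequenceEqual(AnonymousMethodSyntax()));
    }
}
```

Which form to use is largely a matter of readability; the query expression tends to read better as queries grow, while the lambda form keeps short projections compact.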

Also be aware that a static field caching the anonymous method's delegate is generated at compile time as well (Fig. 3); we will discuss this particular feature in future articles.

Figure 3: Compiler generated field caching the anonymous method delegate

[CompilerGenerated]
private static Func<string, string> <>9__CachedAnonymousMethodDelegate1;

We will now take a look at a more complex query of the same collection, which retrieves all strings in the List whose length is greater than 5, sorted alphabetically (Fig. 4).

Figure 4: A more complex query

public static void Example2() 
{
  IEnumerable<string> query = from p in people
                              where p.Length > 5
                              orderby p
                              select p;
 
  foreach (string person in query) 
  {
    Console.WriteLine(person);
  }
}

The example in Fig. 4 relies on two other standard query operators, Where and OrderBy (written as the where and orderby clauses in the query expression), to achieve the desired results.

If we examine the code generated by the compiler for the Example2 method, we see the code shown in Fig. 5. Notice as well that we now have another two cached delegate fields (Fig. 6), each having the type of its corresponding delegate (the Where delegate and the OrderBy delegate).

Figure 5: Compiler generated code for Fig. 4

public static void Example2()
{
  IEnumerable<string> query = people.Where<string>(delegate (string p) 
  {
    return (p.Length > 5);
  }).OrderBy<string, string>(delegate (string p) 
  {
    return p;
  });
  foreach (string person in query)
  {
    Console.WriteLine(person);
  }
}

Figure 6: Cached delegate fields for the Where and OrderBy anonymous methods defined in Fig. 5

[CompilerGenerated]
private static Func<string, bool> <>9__CachedAnonymousMethodDelegate4;
[CompilerGenerated]
private static Func<string, string> <>9__CachedAnonymousMethodDelegate5;

The type of the Where delegate (Fig. 5) is Func<string, bool>: the delegate takes a string argument and returns a bool indicating whether the string is longer than 5 characters. Similarly, the OrderBy delegate (Fig. 5) is a Func<string, string>: it takes a string argument and returns a string, which serves as the sort key.
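The two delegate types can also be written out by hand, which may make their roles easier to see. The sketch below is an illustration under assumed names (DelegateSignatureSketch, longerThanFive, and byName are not from the compiler output): it declares one Func<string, bool> and one Func<string, string> and passes them to Where and OrderBy.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical class name, used only for this illustration.
public static class DelegateSignatureSketch
{
    private static readonly List<string> people = new List<string>
    {
        "Granville", "John", "Rachel", "Betty",
        "Chandler", "Ross", "Monica"
    };

    // Func<string, bool>: takes a string and says whether to keep it.
    private static readonly Func<string, bool> longerThanFive = p => p.Length > 5;

    // Func<string, string>: takes a string and returns its sort key
    // (here the string itself).
    private static readonly Func<string, string> byName = p => p;

    // Same filtering and ordering as Fig. 4, with the delegates named explicitly.
    public static List<string> Run()
    {
        return people.Where(longerThanFive).OrderBy(byName).ToList();
    }

    public static void Main()
    {
        // Prints Chandler, Granville, Monica, Rachel (one per line).
        foreach (string person in Run())
        {
            Console.WriteLine(person);
        }
    }
}
```

Holding the delegates in static fields loosely mirrors what the cached fields in Fig. 6 do: the delegate instance is created once and reused on each call.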
