Saturday, April 26, 2008

Who is faster??

Friday 24. of April I attended a conference which was organized by the dotnetpro magazine. The theme of the conference was Jump! and it was intended to give developers an overview of the new features of Visual Studio 2008, .Net framework 3.5 and C# 3.0. It was very interesting to see the new technologies and in which direction we are going. Something that impressed me was the new query language LINQ which can be used to query many kinds of data such as lists, xml files or databases. I was curious about the performance of this new technology and therefore I did some performance tests. I wanted to compare the traditional for loop, the nowadays mainly used foreach loop, the FindAll method of the collections class and LINQ. I executed every "method" two times because especially for LINQ this makes a big difference. The task that each "method" had to perform was to find all values which are bigger than 5 in a list of integers which contains all together 1,000,000 elements and add it to a result list. The result list contained about 400,000 elements because I filled it with a random number between 0 and 10.
Since under windows it is almost impossible to get accurate performance results because services and applications in the background interrupt my test application I run it altogether 10 times and then I took the least value for every "method". I was very impressed of the results, but first I will show you the code that I used:
This is the simple for loop that everybody knows how it works.
for (int i = 0; i <> 5)
result.Add(numbers[i]);
}

Here we have the well known foreach loop which is and should be used.
foreach (int num in numbers)
{
if (num > 5)
result.Add(num);
}

This the FindAll method of the collections class. It uses a predicate to decide which elments should be returned and I used an anonymous delegate to state my condition.
result = numbers.FindAll(new Predicate(delegate(int i) { return i > 5; }));

And finally here we have the LINQ query which looks like SQL and allows to make structured queries in the code. The var keyword in front states that the compiler decides of which type the variable "res" is. In this case it will be a collection of integers.
var res = from num in numbers
where num > 5
select num;

To talk about syntax; I prefer the LINQ syntax, because it is compact, structured and SQL like which many developers know quite well. Moreover when you have to make grouping and you do this with a foreach loop then you require much more lines of code as when you do it with LINQ. And as a speaker on the conference said:
More lines of code contain more errors.

Ok here there is the result of my measurements:



As you can see on the chart above, LINQ is by far the fastest "method" to perform such "queries". I was very impressed by the velocity. The difference between the first run and the second run lies in the implementation of LINQ. A speaker on the conference said that behind the scenes they build a binary tree and use it for searching. Therefore when this tree is build once all following queries are very very fast, also if you change the search condition.
Other things that you can see from the chart is that the good old foor loop is compared to the foreach and the FindAll "methods" the fastest. Despite this fact I will use the foreach loop in future because it is more structured and requires less thinking about the underlying array, list,... because the collection class does the work for you. I was disappointed of the performance of the FindAll method of the collection class. It was the slowest in the test. This could be because it uses predicates and anonymous delegates which may use more resources. The syntax of the FindAll is also compact and after some learning you know how to create anonymous delegates. The advantages of the methods of the collection class such as FindAll, Find, ConvertAll,... is that this features are available already under the .Net 2.0 framework and that you need only one line of code to do such "queries". The big disadvantage is as already said above that it is the slowest method to find a list of elements which satisfy a condition. For the other methods (Find, ConvertAll) I will maybe do some tests to see their performance.

If you are interested to run this performance test on your own machine you can download it here.

Tools used:
Microsoft Visual Studio 2008 Express Edition

1 comment:

Peter Gfader said...

Did you try to look at the MSIL Code? Maybe you can get some anwers to performance questions there...

peter