Saturday, April 26, 2008

Who is faster??

On Friday, the 24th of April, I attended a conference organized by the dotnetpro magazine. The theme of the conference was Jump!, and it was intended to give developers an overview of the new features of Visual Studio 2008, .NET Framework 3.5 and C# 3.0. It was very interesting to see the new technologies and the direction in which we are heading. Something that impressed me was the new query language LINQ, which can be used to query many kinds of data such as lists, XML files or databases.
I was curious about the performance of this new technology, so I ran some performance tests. I wanted to compare the traditional for loop, the foreach loop that is mostly used nowadays, the FindAll method of the List<T> class and LINQ. I executed every "method" twice because, especially for LINQ, this makes a big difference. The task that each "method" had to perform was to find all values greater than 5 in a list of 1,000,000 integers and add them to a result list. The result list contained about 400,000 elements because I had filled the source list with random numbers between 0 and 10.
Under Windows it is almost impossible to get accurate performance results, because services and applications in the background interrupt my test application. I therefore ran it 10 times altogether and took the lowest value for every "method". I was very impressed by the results, but first I will show you the code that I used:
This is the simple for loop that everybody knows:
for (int i = 0; i < numbers.Count; i++)
    if (numbers[i] > 5)
        result.Add(numbers[i]);

Here we have the well-known foreach loop, which is widely used and should be preferred:
foreach (int num in numbers)
    if (num > 5)
        result.Add(num);

This is the FindAll method of the List<T> class. It uses a predicate to decide which elements should be returned, and I used an anonymous delegate to state my condition.
result = numbers.FindAll(new Predicate<int>(delegate(int i) { return i > 5; }));

And finally here we have the LINQ query, which looks like SQL and allows you to write structured queries in code. The var keyword in front tells the compiler to infer the type of the variable "res". In this case it will be a sequence of integers (IEnumerable<int>).
var res = from num in numbers
where num > 5
select num;
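For anyone who wants to reproduce the comparison without downloading the project, the four variants above could be timed with a small harness like the following. This is only a sketch and not the original test program: the class and method names are made up for illustration, the use of Random.Next(0, 10) to fill the list is an assumption, and the LINQ variant calls ToList() so that the query is really executed and produces a list like the other three.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Hypothetical harness; names and the fill strategy are assumptions.
public static class FilterBenchmark
{
    // Fills a list with pseudo-random values 0..9 (upper bound is exclusive).
    public static List<int> MakeNumbers(int count, int seed)
    {
        Random random = new Random(seed);
        List<int> numbers = new List<int>(count);
        for (int i = 0; i < count; i++)
            numbers.Add(random.Next(0, 10));
        return numbers;
    }

    public static List<int> WithFor(List<int> numbers)
    {
        List<int> result = new List<int>();
        for (int i = 0; i < numbers.Count; i++)
            if (numbers[i] > 5)
                result.Add(numbers[i]);
        return result;
    }

    public static List<int> WithForeach(List<int> numbers)
    {
        List<int> result = new List<int>();
        foreach (int num in numbers)
            if (num > 5)
                result.Add(num);
        return result;
    }

    public static List<int> WithFindAll(List<int> numbers)
    {
        return numbers.FindAll(delegate(int i) { return i > 5; });
    }

    public static List<int> WithLinq(List<int> numbers)
    {
        // ToList() forces the query to run; without it only the query object is built.
        return (from num in numbers where num > 5 select num).ToList();
    }

    // Runs the filter several times and returns the lowest elapsed tick count.
    public static long BestOf(int runs, Func<List<int>, List<int>> filter, List<int> numbers)
    {
        long best = long.MaxValue;
        for (int run = 0; run < runs; run++)
        {
            Stopwatch watch = Stopwatch.StartNew();
            filter(numbers);
            watch.Stop();
            best = Math.Min(best, watch.ElapsedTicks);
        }
        return best;
    }

    public static void Main()
    {
        List<int> numbers = MakeNumbers(1000000, 42);
        Console.WriteLine("for:     " + BestOf(10, WithFor, numbers));
        Console.WriteLine("foreach: " + BestOf(10, WithForeach, numbers));
        Console.WriteLine("FindAll: " + BestOf(10, WithFindAll, numbers));
        Console.WriteLine("LINQ:    " + BestOf(10, WithLinq, numbers));
    }
}
```

The numbers it prints are Stopwatch ticks and will differ from machine to machine, so they are only meaningful relative to each other.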

Speaking of syntax: I prefer the LINQ syntax because it is compact, structured and SQL-like, which many developers know quite well. Moreover, when you have to do grouping with a foreach loop, you need many more lines of code than when you do it with LINQ. And as a speaker at the conference said:
More lines of code contain more errors.

OK, here is the result of my measurements:

As you can see in the chart above, LINQ is by far the fastest "method" to perform such "queries". I was very impressed by the speed. The difference between the first run and the second run lies in the implementation of LINQ. A speaker at the conference said that behind the scenes they build a binary tree and use it for searching. According to this explanation, once the tree has been built all following queries are very fast, even if you change the search condition.
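One thing that may be worth checking when reproducing these numbers is that LINQ to Objects evaluates queries lazily: the query statement only builds a query object, and the filtering happens when the result is enumerated (for example with foreach or ToList()). If a timed section only defines the query, it would not include the filtering work. The following sketch, with made-up names, counts how often the condition is evaluated before and after the enumeration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class; not part of the original test project.
public static class DeferredExecutionDemo
{
    public static int PredicateCalls;

    static bool GreaterThanFive(int num)
    {
        PredicateCalls++; // count every evaluation of the condition
        return num > 5;
    }

    // Returns { calls after defining the query, calls after enumerating it, result count }.
    public static int[] Run()
    {
        List<int> numbers = new List<int> { 1, 7, 3, 9, 6 };
        PredicateCalls = 0;

        IEnumerable<int> res = from num in numbers
                               where GreaterThanFive(num)
                               select num;
        int callsAfterDefinition = PredicateCalls;   // nothing has been filtered yet

        List<int> result = res.ToList();             // enumeration runs the filter
        int callsAfterEnumeration = PredicateCalls;

        return new int[] { callsAfterDefinition, callsAfterEnumeration, result.Count };
    }

    public static void Main()
    {
        int[] counts = Run();
        Console.WriteLine(counts[0]); // 0
        Console.WriteLine(counts[1]); // 5
        Console.WriteLine(counts[2]); // 3
    }
}
```

For a fair comparison with the loops and FindAll, the LINQ measurement should therefore include the enumeration of the result.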
Another thing you can see from the chart is that the good old for loop is faster than the foreach loop and the FindAll method. Despite this, I will use the foreach loop in the future because it is more structured and requires less thinking about the underlying array or list; the collection class does the work for you. I was disappointed by the performance of the FindAll method. It was the slowest in the test. This could be because it uses predicates and anonymous delegates, which may use more resources. The syntax of FindAll is also compact, and after some learning you know how to create anonymous delegates. The advantages of collection methods such as FindAll, Find and ConvertAll are that they are already available in .NET Framework 2.0 and that you need only one line of code for such "queries". The big disadvantage, as mentioned above, is that FindAll was the slowest way in my test to find the elements that satisfy a condition. I may run some tests on the other methods (Find, ConvertAll) to see how they perform.

If you are interested in running this performance test on your own machine, you can download it here.

Tools used:
Microsoft Visual Studio 2008 Express Edition

Wednesday, April 9, 2008

Dynamic number of parameters

Recently I discovered a very interesting feature of C#: the possibility to have a variable number of parameters for methods. This feature can be used in any method by marking the last parameter, which must be an array, with the "params" keyword; the array then holds the arguments. This technique allows you to create interesting methods and can make your code shorter and easier to understand.
First of all I'd like to show you an Add method which accepts a variable number of parameters.
public static int Add(params int[] numbersToAdd)
{
    int sum = 0;
    foreach (int number in numbersToAdd)
        sum += number;
    return sum;
}

It can be called in the following two ways:
int[] numbers = new int[4];
numbers[0] = 3;
numbers[1] = 8;
numbers[2] = 22;
numbers[3] = 2;
Console.WriteLine(Add(numbers));
/* 35 */

Console.WriteLine(Add(3, 8, 22, 2));
/* 35 */

Which version do you like more? I prefer the second one because I don't have to create an array of integers. Of course this method does not make much sense; it is only for demonstration purposes.
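A detail that is easy to overlook is what happens when no arguments are passed at all: the method then receives an empty array, not null. The following small sketch wraps the Add method in a hypothetical ParamsDemo class to show the three call forms side by side:

```csharp
using System;

// Hypothetical wrapper class for illustration.
public static class ParamsDemo
{
    public static int Add(params int[] numbersToAdd)
    {
        int sum = 0;
        foreach (int number in numbersToAdd)
            sum += number;
        return sum;
    }

    public static void Main()
    {
        Console.WriteLine(Add());                          // 0, numbersToAdd is an empty array
        Console.WriteLine(Add(3, 8, 22, 2));               // 35, the compiler builds the array
        Console.WriteLine(Add(new int[] { 3, 8, 22, 2 })); // 35, the array is passed as-is
    }
}
```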
Another interesting method that could be written is a "Concat" method. It basically appends every object passed as a parameter to a string and returns the result:
public static string Concat(params object[] objects)
{
    StringBuilder sb = new StringBuilder();
    foreach (object obj in objects)
        sb.Append(obj); /*Append calls ToString() on the object*/
    return sb.ToString();
}

The usage of this method would be the following:
Console.WriteLine("\n" + Concat("hello world ", 4, " ", true));
/* hello world 4 True */

Two other methods which could be interesting are the "Copy" and "CreateAndCopy" methods. The "CreateAndCopy" method takes an object as a parameter, creates a new instance of the same type and copies the properties whose names are passed as parameters. In addition it is generic, so no cast is necessary and it can be used with any object whose type has a public parameterless constructor. Note that the type does not necessarily have to be specified when calling this method, because C# infers it from the object passed.
public static T CreateAndCopy<T>(T source, params string[] propertiesToCopy)
{
    T target = (T)System.Activator.CreateInstance(typeof(T));
    foreach (string propertyName in propertiesToCopy)
    {
        PropertyInfo property = typeof(T).GetProperty(propertyName);
        property.SetValue(target, property.GetValue(source, null), null);
    }
    return target;
}

It can be used like this:
MyObject o1 = new MyObject();
o1.Text = "hallo";
o1.Number = 99;
o1.Boolean = true;

/*
 * MyObject:
 * Text -> hallo
 * Number -> 99
 * Boolean -> True
 */

MyObject newObj = CreateAndCopy(o1, "Text", "Boolean");
/*
 * MyObject:
 * Text -> hallo
 * Number -> 0
 * Boolean -> True
 */
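To try this out in isolation, something like the following self-contained sketch could be used. The MyObject class here is a guess at what the class looks like (three simple properties), the CopyDemo class name is made up, and two details differ from the method shown above: a new() constraint replaces Activator.CreateInstance, and an unknown property name raises an ArgumentException instead of a NullReferenceException.

```csharp
using System;
using System.Reflection;

// Assumed shape of MyObject; the real class is not shown in this post.
public class MyObject
{
    public string Text { get; set; }
    public int Number { get; set; }
    public bool Boolean { get; set; }
}

// Hypothetical wrapper class for illustration.
public static class CopyDemo
{
    public static T CreateAndCopy<T>(T source, params string[] propertiesToCopy) where T : new()
    {
        T target = new T();
        foreach (string propertyName in propertiesToCopy)
        {
            PropertyInfo property = typeof(T).GetProperty(propertyName);
            if (property == null) // GetProperty returns null for unknown names
                throw new ArgumentException("Unknown property: " + propertyName);
            property.SetValue(target, property.GetValue(source, null), null);
        }
        return target;
    }

    public static void Main()
    {
        MyObject o1 = new MyObject();
        o1.Text = "hallo";
        o1.Number = 99;
        o1.Boolean = true;

        // Only Text and Boolean are copied; Number keeps its default value.
        MyObject newObj = CreateAndCopy(o1, "Text", "Boolean");
        Console.WriteLine(newObj.Text);    // hallo
        Console.WriteLine(newObj.Number);  // 0
        Console.WriteLine(newObj.Boolean); // True
    }
}
```

Because the property names are plain strings, a typo is only detected at run time, which is the price for the flexibility of this approach.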

The "Copy" method does something similar, but it accepts a destination object to which the values of the specified properties are copied:
public static T Copy<T>(T source, T destination, params string[] propertiesToCopy)
{
    foreach (string propertyName in propertiesToCopy)
    {
        PropertyInfo property = typeof(T).GetProperty(propertyName);
        property.SetValue(destination, property.GetValue(source, null), null);
    }
    return destination;
}

The possibility to specify a variable number of parameters can sometimes make the code easier to read and understand, but in some circumstances it is better to create a class which holds all the required values and pass an instance of it to the method.

Monday, April 7, 2008

Generic progress dialog

Quite often I have to perform long-running operations in an application, and then I want to show a progress bar so that the user can see that the application is working and how long it will take. My requirements for the progress bar dialog are:
  • it should be generic and therefore not be tied to the work which has to be done
  • it should not block or freeze the main application
  • the user must have the possibility to abort the operation
It turned out to be quite challenging to meet all these requirements. Each approach I tried had its own problems: there were versions where the progress dialog was not updated or the user could not abort the operation. After some experimenting I found a quite usable solution. The main application uses a BackgroundWorker to do the work; this has the advantage that the computation is done in a different thread and that the BackgroundWorker provides cancellation and progress notification. My progress bar dialog is also started in its own thread to make sure that its graphics are not blocked and are updated when needed. With this "architecture" it is even possible for the main application to perform other operations while heavy work is being done and the progress is shown to the user.
The progress dialog provides all its functionality through static methods which access a "private singleton". This means that the developer never gets an instance of the progress dialog and that at most one instance of it exists. To notify the main application when the user cancels, I used an event with a delegate, so the main application can do whatever it wants when the user presses cancel.
When the user presses cancel, a message box is shown which asks whether they really want to abort. The nice thing about this implementation is that the computation continues and is not blocked while this message box is shown.

The following code shows the progress bar, sets the title and message, assigns a method to the cancel event and starts the computation if the background worker is not already working:
if (!this.backgroundWorker.IsBusy)
{
    this.Enabled = false; /*lock the main application*/
    ProgressDialog.SetTitle("this is the title");
    ProgressDialog.SetMessage("I am the message for the very long task. Please be patient and wait...");
    ProgressDialog.CancelEvent += new ProgressDialog.CancelEventHandler(pd_CancelEvent);
    this.backgroundWorker.RunWorkerAsync(); /*start the computation*/
}

It is important to set the WorkerSupportsCancellation property of the background worker to true; this is best done in the designer or in the constructor of the main form:
this.backgroundWorker.WorkerSupportsCancellation = true;

The computation is done in the DoWork event handler of the background worker. It is important to check for a pending cancellation inside the loop, because otherwise the background worker will not stop on cancel. To update the progress bar it is enough to call the SetValue method of the progress dialog. This is only a trivial operation for demonstration:
private void backgroundWorker_DoWork(object sender, DoWorkEventArgs e)
{
    for (int i = 0; i < max; i++)
    {
        /*this check is needed in order to abort the operation if the user has clicked on cancel*/
        if (this.backgroundWorker.CancellationPending)
        {
            e.Cancel = true;
            break; /*leave the loop so that the worker really stops*/
        }

        ProgressDialog.SetValue(i); /*update the value*/
    }
}
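The cancellation part of this pattern can also be tried without any dialog. The following console sketch is an illustration under a few assumptions and not part of the demo project: the class and method names are made up, the call to CancelAsync() inside the loop stands in for the user clicking cancel, and a ManualResetEvent is used only so that the caller can wait for the worker to finish.

```csharp
using System;
using System.ComponentModel;
using System.Threading;

// Hypothetical console sketch of the cancellation check; no progress dialog involved.
public static class WorkerSketch
{
    // Runs a loop on a BackgroundWorker and returns true if the work was cancelled.
    public static bool RunAndCancel(int max, int cancelAt)
    {
        bool cancelled = false;
        using (ManualResetEvent done = new ManualResetEvent(false))
        using (BackgroundWorker worker = new BackgroundWorker())
        {
            worker.WorkerSupportsCancellation = true;

            worker.DoWork += delegate(object sender, DoWorkEventArgs e)
            {
                for (int i = 0; i < max; i++)
                {
                    if (worker.CancellationPending)
                    {
                        e.Cancel = true; // reported as Cancelled in RunWorkerCompleted
                        return;
                    }
                    if (i == cancelAt)
                        worker.CancelAsync(); // stands in for the user's click on cancel
                }
            };

            worker.RunWorkerCompleted += delegate(object sender, RunWorkerCompletedEventArgs e)
            {
                cancelled = e.Cancelled;
                done.Set();
            };

            worker.RunWorkerAsync();
            done.WaitOne(); // block until the worker has finished or was cancelled
        }
        return cancelled;
    }

    public static void Main()
    {
        Console.WriteLine(RunAndCancel(1000, 5));  // True, cancelled at i == 5
        Console.WriteLine(RunAndCancel(1000, -1)); // False, the loop ran to the end
    }
}
```

CancelAsync() only sets the CancellationPending flag; the worker stops only because the loop checks the flag and returns, which is why the check must not be left out.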

One more thing I have to mention: in debug mode an exception always occurs which says that these operations are not thread safe. So far I have not had the time to look into how I can make this code thread safe, but as soon as I have a thread-safe version (if one exists) I will post it.
Update: I finally had the time to learn how to write thread-safe method calls. The code below shows how it is done. First you have to check whether the method was called from a thread other than the one that owns the dialog; this is what the "InvokeRequired" property does. If so, a delegate is created and the method is called through "Invoke". When the call arrives on the dialog's own thread, the title is set directly.
public static void SetTitle(string title)
{
    if (instance.InvokeRequired) /*if another thread called this method*/
    {
        SetTitleCallback s = new SetTitleCallback(SetTitle);
        instance.Invoke(s, title);
    }
    else
        instance.Text = title;
}

You can download the full demo project here.

Finally, a screenshot of what my progress bar dialog looks like:

If you have suggestions or ideas, know how to improve this code, or have a thread-safe version, you are welcome to post a comment.