Tuesday, September 29, 2009

Full text search for Entity Framework

Recently I played a bit with the "new" Entity Framework (basically the Object Relational mapper of Microsoft). It seems to be made quite good and it makes it very easy to work with objects in your application without having to map them back and forth from and to SQL. Also the possibility to launch LINQ queries over the data is very interesting because this avoids many errors and improves usability.
But I missed something: a fulltext search.
SQL Server has a very good fulltext search integrated and it would be nice to be able to perform queries through LINQ with it; but since Microsoft wanted to support also other database management systems they had to use only standard SQL for their mappings.

Faced with this problem I resolved it by writing my own pseudo fulltext search by using the possibility of creating dynamic LINQ queries.
First I will show You the code and then I explain a bit what I have done.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using System.Text;
using System.Reflection;
using System.Data.Objects;


public static class ObjectContextExtensions
{
/*
/// <summary>
/// Searches in all string properties for the specifed search key.
/// It is also able to search for several words. If the searchKey is for example 'John Travolta' then
/// all records which contain either 'John' or 'Travolta' in some string property
/// are returned.
/// </summary>
/// <typeparam name="T"></typeparam>
/// <param name="query"></param>
/// <param name="searchKey"></param>
/// <returns></returns>*/
public static IQueryable<T> FullTextSearch<T>(this IQueryable<T> queryable, string searchKey)
{
return FullTextSearch<T>(queryable, searchKey, false);
}

/*
/// <summary>
/// Searches in all string properties for the specifed search key.
/// It is also able to search for several words. If the searchKey is for example 'John Travolta' then
/// with exactMatch set to false all records which contain either 'John' or 'Travolta' in some string property
/// are returned.
/// </summary>
/// <typeparam name="T"></typeparam>
/// <param name="query"></param>
/// <param name="searchKey"></param>
/// <param name="exactMatch">Specifies if only the whole word or every single word should be searched.</param>
/// <returns></returns>*/
public static IQueryable<T> FullTextSearch<T>(this IQueryable<T> queryable, string searchKey, bool exactMatch)
{
ParameterExpression parameter = Expression.Parameter(typeof(T), "c");

MethodInfo containsMethod = typeof(string).GetMethod("Contains", new Type[] { typeof(string) });
MethodInfo toStringMethod = typeof(object).GetMethod("ToString", new Type[] { });

var publicProperties = typeof(T).GetProperties(BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly).Where(p => p.PropertyType == typeof(string));
Expression orExpressions = null;

string[] searchKeyParts;
if (exactMatch)
{
searchKeyParts = new[] { searchKey };
}
else
{
searchKeyParts = searchKey.Split(' ');
}

foreach (var property in publicProperties)
{
Expression nameProperty = Expression.Property(parameter, property);
foreach (var searchKeyPart in searchKeyParts)
{
Expression searchKeyExpression = Expression.Constant(searchKeyPart);
Expression callContainsMethod = Expression.Call(nameProperty, containsMethod, searchKeyExpression);
if (orExpressions == null)
{
orExpressions = callContainsMethod;
}
else
{
orExpressions = Expression.Or(orExpressions, callContainsMethod);
}
}
}

MethodCallExpression whereCallExpression = Expression.Call(
typeof(Queryable),
"Where",
new Type[] { queryable.ElementType },
queryable.Expression,
Expression.Lambda<Func<T, bool>>(orExpressions, new ParameterExpression[] { parameter }));

return queryable.Provider.CreateQuery<T>(whereCallExpression);
}
}


I created this method as an extension method for an IQueryable<T>.
This means if you have your ObjectContext (from the Entity Framework) then when accessing all objects of some type e.g. "context.Customers" then this returns an IQueryable<Customers>
This means that You can call the above method like:

context.Customers.FullTextSearch("serachkey");

or

context.Customers.FullTextSearch("searchkey", true);

which searches only for whole words also if they contain spaces. This gives You then again an IQueryable<Customers> which contains only the filtered objects.

Ok. Now something about the code.
This method "simply" creates a lambda expression which gets all the string properties of the type T, calls the "Contains" method with the search key as parameter on them and connects them with an OR. This means that it would be similar to do something like:

context.Customers.Where(c => c.FirstName.Contains("searchkey") || c.LastName.Contains("searchkey") || c.Street.Contains("searchkey") || ...);

But this can become a very long, nasty and error prone task if You have for example 30 string columns. Exactly because of this I wrote this code :-)

Hope this helps You...