Eager Loading with Entity Framework and NHibernate

At CB Richard Ellis, Rich Alger and I wrote and maintain an internal ordering system that uses NHibernate. It was our first experience with NHibernate, but it was a huge success. Much of our initial focus was to just get things working, not to optimize it, since NHibernate has a bit of a steep learning curve. Now that we have more experience with it, we’re revisiting some of our strategies to improve them. My approach is this:

Set up a SQL Trace on the dev database
Load up the dev deployment of the ASP.NET app
Load a page I want to optimize
Count the number of queries going to the database as a result of loading that page
Optimize queries till I’m making the fewest db trips possible

Some pages were sending hundreds of queries! We had a lot of the SELECT N+1 problem going on, and some of those were pretty easy to fix. In general, we are trying to improve our loading strategies, to eagerly load everything we know we’ll need the FIRST time around, and to avoid the use of lazy loading, since it requires additional trips to the database. Lazy loading is great in dynamic scenarios, but if you already know you’re going to need something, don’t wait. Get it the first time around.

Which brings me to my current point. Eager loading with NH versus EF. Ayende recently asked what EFv4 can do that NH can’t. In a subsequent post he replies to the many responses he received. I have an ongoing discussion with him in the comments regarding eager fetching.

Suppose you had these tables:
User <1---*> User_x_Role <*---1> Roles <1---*> Role_x_Permission <*---1> Permission
(The reason for the associative tables is that I hate many-to-many relationships. They are evil).
These tables set up a basic security model. Certain operations and resources require that you have a specific permission. Permissions are assigned to roles, and roles are assigned to users. A user’s permissions, then, are inherited from the roles he is assigned to. Permissions are not directly assignable to users. You must use a Role.

These tables would result in a model like this:

User	The User class
Role	The Role class
Permission	The Permission class
User.Roles	An IList<Role> that contains the Roles assigned to the User
Role.Permissions	An IList<Permission> that contains the permissions assigned to the Role
Role.Users	An IList<User> that contains all of the Users that are assigned this Role
Permission.Roles	An IList<Role> that contains all of the Roles are that are assigned this Permission

Checking if a User has a Permission may look like:

someUser.Roles.Permissions.Contains(somePermission);

So when fetching a user from the database, we may want to eagerly fetch his roles, and the permissions of those roles. With EF, I can do this:

using (MyEntities context = new MyEntities())
{
	int userID = 6;
	var qUser =
		from u in context.User.Include("Roles.Permissions")
		where u.ID == userID
		select u;
	User user = qUser.FirstOrDefault();

	// Roles is already populated
	int numRoles = user.Roles.Count;

	// As are the Permissions of each Role
	int numPermissions = user.Roles.First().Permissions.Count;
}

Note the use of ObjectQuery<T>.Include(string path). Note that it’s chainable. If you had some other related entities or collections of User that you wanted to eagerly fetch you could do this:

from u in context.User
	.Include("Roles.Permissions")
	.Include("Orders.OrderDetails")
	.Include("Employer")
	.Include("ContactInfo")
select u;

Basically, you have complete control over what data will be eagerly fetched, and you’ll get back the exact object graph you want, with all of the collections populated, and related entities loaded. If you think that’s impressive, fire up SQL Profiler and do a trace on your database. All of the data for all of those entities and collections is retrieved in one query. It may be full of subselects, joins, unions, etc, but it’s one query. One trip to the database, to issue one query, and you have all the data you want. Because of the ease of use of ObjectQuery<T>Include(string path), it’s easy to make methods on your entity classes that give you a LOT of flexibility when it comes to getting the data you want:

		public  static IList<Affiliate> GetPage(int pageIndex, int pageSize,
			string where, string orderBy, out int rawRecordCount, params string[] includes)
		{
			// Start with a basic ObjectQuery<T> that would just return ALL rows (no criteria or sorting)
			ObjectQuery<Affiliate> query = Jam2DataLayer.Instance.Context.Affiliate;

			// Apply the where-clause if they provided one.
			if (!String.IsNullOrEmpty(where))
				query = query.Where(where);

			// Apply the orderby-clause if they provided one
			if (!String.IsNullOrEmpty(orderBy))
				query = query.OrderBy(orderBy);
			else
				// If no sort-by clause was provided, sort by the ID desc (show newest records first)
				// (There has to be sorting applied for paging to work with Skip() and Take())
				query = query.OrderByDescending(ent => ent.ID);

			// Execute the query at this point, but only to retrieve a count of the records it returns.
			// We want to know the number of records our query WOULD return, if it weren't paged.
			rawRecordCount = query.Count();

			// Set up any eager loading that was specified
			foreach (string include in includes)
				query = query.Include(include);

			// Get results just for the page we're interested in.
			query = query.Skip(pageIndex * pageSize).Take(pageSize);

			// Execute the query, storing the results in a List<T>
			List<Affiliate> list = query.ToList();
			return list;
		}


		// Example usage: Get the first 10 active Affiliate objects, with their User entity eagerly loaded
		int totalRecords;
		IList<Affiliate> activeAffiliates = Affiliate.GetPage(0, 10, "it.Active == true", null, out totalRecords, "User");

Definitely a handy method to have on each of your entity classes (note: in an upcoming post, I’ll be showing my T4 templates that use the SFS Plugin for Visual Studio 2008 to generate just such code for each of your entity classes).

Now, can this be done in NHibernate? Honestly, I don’t know yet. I’ve been trying. My first attempt went something like this:

public virtual IList<T> GetPage<T>(int pageIndex, int pageSize, params string[] eagerFetches)
{
	ICriteria crit = CurrentSession.CreateCriteria<T>();
	crit.SetFirstResult(pageIndex * pageSize);
	if (pageSize > 0)
		crit.SetMaxResults(pageSize);

	// Now set up the eager fetching for whatever the caller specified
	foreach(string entity in eagerFetches)
		crit.SetFetchMode(entity, FetchMode.Eager);

	return crit.List<T>();
}

That seemed to work! GetPage<User>(0, 10, “Roles”) DID indeed fetch 10 users, with their Roles collections already populated. Then I tried “Roles.Permissions”. No dice! The data that comes back is correct, but it issues 10 separate queries. It’s still lazy loading. And, when you start to include multiple collections for the root entity, as in GetPage<User>(0, 10, “Roles”, “Orders”), you start getting Cartesian products, producing millions of rows sometimes. NH doesn’t sort it out. This question on StackOverflow has an answer that quotes from the NHibernate In Action book: From “NHibernate in Action”, page 225: NHibernate currently limits you to fetching just one collection eagerly.

Perhaps that is still the case. I mentioned this on Ayende’s blog, suggesting that perhaps this is an area where EF has the features in tow, and NH is still lacking. He told me that FetchMode.Eager is the same as FetchMode.Join, which would be the reason for the cartesian products. He said to look at FetchMode.Subselect, and Future queries. Well FetchMode.Subselect can only be set in the mapping file, it can’t be set arbitrarily on an ICriteria (the NHibernate.FetchMode enum does not have a field named Subselect), so that’s out. I want to leave my mappings as lazy, and specify eager only when I want it, on a per query basis. As for the Future<T> queries, I haven’t been able to get them to do what I want either. I might be able to get individual lists back, that all contain the related entities I’m interested in, by delaying the query to the database, storing the queries in an IMultiCriteria, but that isn’t going to give me a connected object graph, just disparate lists. I’ll keep poking around with different ways to hopefully accomplish the same thing in NH. For now, hopefully the discussion stays alive in the comments on his blog. I’d love to find a way to do this in NH, as it’s a powerful feature. I have a feeling that even if we conjure something up, it’s not going to be the same as what EF does, issuing only a single query to the database. Time will tell.

Latest Images

Trending Articles

Latest Images