Repository + Unit Of Work Pattern Demystified

The repository and unit of work patterns are intended to create an abstraction layer between the data access layer and the business logic layer of an application. Implementing these patterns can help insulate your application from changes in the data store and can facilitate automated unit testing or test-driven development (TDD).

Creating a repository class for each entity type could result in a lot of redundant code, and it could result in partial updates. For example, suppose you have to update two different entity types as part of the same transaction. If each uses a separate database context instance, one might succeed and the other might fail. One way to minimize redundant code is to use a generic repository, and one way to ensure that all repositories use the same database context (and thus coordinate all updates) is to use a unit of work class.

A repository is simply a class defined for an entity that exposes all the operations possible on that specific entity. For example, a repository for a Customer entity will have the basic CRUD operations plus any other operations related to it. The repository pattern can be implemented in the following ways:

  • One repository per entity (non-generic): This type of implementation uses one repository class for each entity. For example, if you have two entities, Order and Customer, each entity will have its own repository.
  • Generic repository: A generic repository is one that can be used for all entities; in other words, the same class can serve Order, Customer, or any other entity.

Unit of Work in the Repository Pattern

A unit of work is a single transaction that groups multiple insert/update/delete operations. Put simply, for a specific user action, all the inserts, updates, and deletes are done in one single transaction rather than in multiple database transactions.

The repository pattern without the unit of work pattern brings a lot of challenges and anti-patterns:

  • Each repository requires one interface and one concrete class per “provider” (“Entity Framework” or “in memory”). As the number of repositories increases, so does the number of interfaces and implementations. If you have “custom” specialized methods for each repository, they also have to be implemented for every provider.
  • One abstract type per repository equals one parameter in your controller constructor per repository that the controller needs to access. If a controller performs operations on multiple repositories, this can quickly become a mess with a lot of constructor parameters (even if the constructor is never called explicitly by your code but by the DI container through constructor injection, it still looks messy). Moreover, if you need to add access to a new repository in a controller, it means adding a new parameter to the constructor and adding a new binding in your DI container.
  • By injecting repositories individually in each controller, the true power of the unit of work pattern is completely bypassed.
    Indeed, with this basic design (using EF Code First as an illustration), a DbContext (the EF unit of work) is usually instantiated per repository, and “at best” a Commit method is present in each repository to call SaveChanges on the DbContext (and apply all modifications to the DB) once all operations have been done on the repository. At worst, the call to SaveChanges on the DbContext is made in each repository method that performs modifications. You are using a nice loosely coupled design that plays well with dependency injection and unit testing, but you are shooting yourself in the foot by not using the unit of work pattern correctly.
    The main problem with this incorrect use of the pattern is that, for a specific user request that triggers a call to an action method and potentially accesses and modifies multiple repositories, you are creating multiple units of work whereas a single unit of work should be used!
    The definition of the Unit of Work pattern is rather clear: “Maintains a list of objects affected by a business transaction and coordinates the writing out of changes and the resolution of concurrency problems.” (http://martinfowler.com/eaaCatalog/unitOfWork.html). The business transaction here is typically triggered by the end user, indirectly calling an action method. It starts when the end user triggers the operation and ends when the operation is completed, whatever the number of repositories accessed and the number of CRUD operations performed on them.
    This means that a single unit of work should be used in the context of the operation/transaction (the request), not many different ones.
    Typically, to use Entity Framework as a provider example, following this bad design would result in calling SaveChanges multiple times, meaning multiple round trips to the DB through multiple transactions, which is typically not the behavior wanted (and absolutely not the unit of work philosophy).
    Apart from the performance aspect, it also leads to a problem when an error or exception happens in the middle of an operation in an action method. If you have already made and committed changes in some repositories but the overall operation is not complete (other repositories should have been updated as well but have not been), the persisted data for that operation is left in an incoherent state (good luck rolling back each change). Whereas if you use a single unit of work for the operation and it fails before completing (before reaching the end of the action method), no data is updated at all as part of the operation and your data store stays clean (it also makes a single round trip to the DB, in a single transaction, for the changes made across all repositories).

Design and Approach

Define a generic repository interface containing very basic atomic operations:

public interface IRepository<T> where T : class
{
    IQueryable<T> AsQueryable();
 
    IEnumerable<T> GetAll();
    IEnumerable<T> Find(Expression<Func<T, bool>> predicate);
    T Single(Expression<Func<T, bool>> predicate);
    T SingleOrDefault(Expression<Func<T, bool>> predicate);
    T First(Expression<Func<T, bool>> predicate);
    T GetById(int id);
 
    void Add(T entity);
    void Delete(T entity);
    void Attach(T entity);
}

Define a unit of work interface containing all the generic repositories that are part of the unit of work, along with a single Commit() method used to persist all changes made in the repositories to the underlying data store:

public interface IUnitOfWork
{
    IRepository<Organization> OrganizationRepository { get; }
    IRepository<Employee> EmployeeRepository { get; }
    
    void Commit();
}

Employee and Organization are plain POCO classes, that is, typical Entity Framework entities.

Add a class implementing the generic repository interface, which just delegates all calls to the associated Entity Framework DbSet:

using System;
using System.Collections.Generic;
using System.Data.Entity;          // EF 6 namespace (assumed here)
using System.Linq;
using System.Linq.Expressions;

public class EntityFrameworkRepository<T> : IRepository<T>
    where T : class
{
    private readonly DbSet<T> _dbSet;

    public EntityFrameworkRepository(DbSet<T> dbSet)
    {
        _dbSet = dbSet;
    }

    #region IRepository<T> implementation

    public virtual IQueryable<T> AsQueryable()
    {
        return _dbSet.AsQueryable();
    }

    public IEnumerable<T> GetAll()
    {
        // Enumerating the DbSet loads every row of the table.
        return _dbSet;
    }

    public IEnumerable<T> Find(Expression<Func<T, bool>> predicate)
    {
        return _dbSet.Where(predicate);
    }

    public T Single(Expression<Func<T, bool>> predicate)
    {
        // Throws unless exactly one entity matches.
        return _dbSet.Single(predicate);
    }

    public T SingleOrDefault(Expression<Func<T, bool>> predicate)
    {
        // Returns null when nothing matches; throws if several match.
        return _dbSet.SingleOrDefault(predicate);
    }

    public T First(Expression<Func<T, bool>> predicate)
    {
        return _dbSet.First(predicate);
    }

    public T GetById(int id)
    {
        // Looks the entity up by primary key (assumes a single int key).
        return _dbSet.Find(id);
    }

    public void Add(T entity)
    {
        _dbSet.Add(entity);
    }

    public void Delete(T entity)
    {
        _dbSet.Remove(entity);
    }

    public void Attach(T entity)
    {
        _dbSet.Attach(entity);
    }

    #endregion
}

Add a class that implements IUnitOfWork and also inherits from DbContext (the EF Code First unit of work).
It contains DbSets, which can be seen as repositories from EF's point of view (in fact, in EF Code First you can mentally substitute “DbContext” for “Unit Of Work” and “DbSet” for “Repository”).
The constructor just instantiates all the repositories by passing them the corresponding DbSet in their constructor. This can be improved by instantiating repositories only when they are accessed. Indeed, if your unit of work contains 20 repositories and your controller is only going to use one, that is a lot of useless instantiation.
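
One possible way to defer repository creation until first access is `Lazy<T>`. The sketch below is only an illustration and does not depend on Entity Framework: `ListRepository<T>`, `LazyUnitOfWork`, the `InstancesCreated` counter and the trimmed `IRepository<T>` are stand-ins invented for this example so that it compiles on its own.

```csharp
using System;
using System.Collections.Generic;

// Stand-in entities and a trimmed IRepository<T>, so the sketch is self-contained.
public class Organization { public int Id { get; set; } }
public class Employee { public int Id { get; set; } }

public interface IRepository<T> where T : class
{
    void Add(T entity);
}

// Hypothetical list-backed repository; the counter exists only to show
// when instances are actually constructed.
public class ListRepository<T> : IRepository<T> where T : class
{
    public static int InstancesCreated;
    private readonly List<T> _items = new List<T>();

    public ListRepository() { InstancesCreated++; }

    public void Add(T entity) { _items.Add(entity); }
}

public class LazyUnitOfWork
{
    // Lazy<T> runs the factory on the first read of .Value and caches the result.
    private readonly Lazy<IRepository<Organization>> _organizationRepo =
        new Lazy<IRepository<Organization>>(() => new ListRepository<Organization>());
    private readonly Lazy<IRepository<Employee>> _employeeRepo =
        new Lazy<IRepository<Employee>>(() => new ListRepository<Employee>());

    public IRepository<Organization> OrganizationRepository
    {
        get { return _organizationRepo.Value; }
    }

    public IRepository<Employee> EmployeeRepository
    {
        get { return _employeeRepo.Value; }
    }
}
```

In an EF-backed unit of work the factories would presumably wrap the DbSets instead, for example `() => new EntityFrameworkRepository<Employee>(Employees)`. In that case the `Lazy<T>` fields would have to be assigned in the constructor, because a field initializer cannot reference instance members such as the DbSet properties.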

public class EntityFrameworkUnitOfWork : DbContext, IUnitOfWork
 {
     private readonly EntityFrameworkRepository<Organization> _organizationRepo;
     private readonly EntityFrameworkRepository<Employee> _employeeRepo;
 
     public DbSet<Organization> Organizations { get; set; }
     public DbSet<Employee> Employees { get; set; }
 
     public EntityFrameworkUnitOfWork()
     {
         _organizationRepo = new EntityFrameworkRepository<Organization>(Organizations);
         _employeeRepo = new EntityFrameworkRepository<Employee>(Employees);
     }
 
        #region IUnitOfWork Implementation
 
     public IRepository<Organization> OrganizationRepository
     {
         get { return _organizationRepo; }
     }
 
     public IRepository<Employee> EmployeeRepository
     {
         get { return _employeeRepo; }
     }
 
     public void Commit()
     {
         this.SaveChanges();
     }
 
        #endregion
 }

Now if you need to implement another provider, let’s say “InMemory” (useful for unit testing), all you have to do is create two other classes: InMemoryRepository and InMemoryUnitOfWork.
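
A minimal sketch of what such an in-memory provider might look like follows. It is not a definitive implementation: the interfaces are trimmed copies of the ones above, the `Id` and `Name` properties on the entities are assumed, and the `getId` delegate and `CommitCount` property are hypothetical helpers added for illustration. Note that this fake applies changes immediately and does not emulate rollback.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Assumed entity shapes; the real entities may differ.
public class Employee { public int Id { get; set; } public string Name { get; set; } }
public class Organization { public int Id { get; set; } public string Name { get; set; } }

// Trimmed copy of the IRepository<T> interface from this post.
public interface IRepository<T> where T : class
{
    IQueryable<T> AsQueryable();
    IEnumerable<T> Find(Expression<Func<T, bool>> predicate);
    T GetById(int id);
    void Add(T entity);
    void Delete(T entity);
}

public interface IUnitOfWork
{
    IRepository<Organization> OrganizationRepository { get; }
    IRepository<Employee> EmployeeRepository { get; }
    void Commit();
}

public class InMemoryRepository<T> : IRepository<T> where T : class
{
    private readonly List<T> _items = new List<T>();
    private readonly Func<T, int> _getId;   // hypothetical: how to read an entity's key

    public InMemoryRepository(Func<T, int> getId) { _getId = getId; }

    public IQueryable<T> AsQueryable() { return _items.AsQueryable(); }

    public IEnumerable<T> Find(Expression<Func<T, bool>> predicate)
    {
        // Evaluate the expression against the list through LINQ to Objects.
        return _items.AsQueryable().Where(predicate).ToList();
    }

    public T GetById(int id) { return _items.FirstOrDefault(e => _getId(e) == id); }
    public void Add(T entity) { _items.Add(entity); }
    public void Delete(T entity) { _items.Remove(entity); }
}

public class InMemoryUnitOfWork : IUnitOfWork
{
    private readonly InMemoryRepository<Organization> _organizationRepo =
        new InMemoryRepository<Organization>(o => o.Id);
    private readonly InMemoryRepository<Employee> _employeeRepo =
        new InMemoryRepository<Employee>(e => e.Id);

    // Lets a unit test assert that Commit() was called the expected number of times.
    public int CommitCount { get; private set; }

    public IRepository<Organization> OrganizationRepository { get { return _organizationRepo; } }
    public IRepository<Employee> EmployeeRepository { get { return _employeeRepo; } }

    // Nothing to persist: changes were already applied to the lists.
    public void Commit() { CommitCount++; }
}
```

A unit test can then hand an `InMemoryUnitOfWork` to the class under test and check both the repository contents and that `Commit()` was called exactly once for the operation.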

Let’s say your HomeController needs to access several different repositories.
With the old design, your HomeController constructor signature would have looked like this:

public HomeController(IEmployeeRepository empRepo, IOrganizationRepository orgRepo)
{
}

And of course, each new repository used by HomeController means a new parameter, a new private field, and a new binding in the DI container.

With the improved design, the signature now looks much cleaner:

public HomeController(IUnitOfWork unitOfWork)
{
}

If we add new repositories that we want to use in the controller, we don’t have to change anything in the controller class or in the DI container bindings. We can just use the new repository directly from the controller through the unit of work.
No more thinking about which repositories the controller should have access to and customizing the constructor accordingly. Now every controller constructor that needs to access repositories just needs a single unitOfWork parameter.

On the DI container side, all that is needed is to bind the abstract IUnitOfWork to the desired provider implementation (EF, InMem, other …). You’ll also want to make sure that the dependency is created in a “per request” scope (assuming your DI container allows this), meaning that a single unit of work is instantiated per request and not each time the dependency is required.

Let’s say you need to run a complex query on a specific repository (IEmployeeRepository) in multiple places in your client code. With the old design this is relatively straightforward: you would just define a method in IEmployeeRepository and implement it for all the concrete providers (what a pain!).
However, with the improved design we can’t add this method to the generic IRepository<T> interface, because that interface is shared by every entity type.

Extension methods can help in this scenario
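
As a rough sketch of the idea, an entity-specific query can be written once as an extension method on `IRepository<Employee>` and reused with any provider. Everything specific here is assumed for illustration: the `HiredOn` property, the `HiredSince` method, the `EmployeeRepositoryExtensions` class and the list-backed `ListRepository<T>` used to exercise it. The interface is trimmed to the one member the extension needs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Employee
{
    public int Id { get; set; }
    public string Name { get; set; }
    public DateTime HiredOn { get; set; }   // hypothetical property for the example
}

// Trimmed copy of IRepository<T>: only the member the extension relies on.
public interface IRepository<T> where T : class
{
    IQueryable<T> AsQueryable();
}

public static class EmployeeRepositoryExtensions
{
    // Available on any IRepository<Employee>, whatever the provider behind it.
    public static IEnumerable<Employee> HiredSince(
        this IRepository<Employee> repository, DateTime date)
    {
        return repository.AsQueryable()
                         .Where(e => e.HiredOn >= date)
                         .OrderBy(e => e.Name)
                         .ToList();
    }
}

// Minimal list-backed repository, used only to try the extension method out.
public class ListRepository<T> : IRepository<T> where T : class
{
    private readonly List<T> _items;

    public ListRepository(IEnumerable<T> items) { _items = items.ToList(); }

    public IQueryable<T> AsQueryable() { return _items.AsQueryable(); }
}
```

Because the method is built on `AsQueryable()`, it is written only once instead of once per provider, and client code calls it as if it were a member of the repository: `unitOfWork.EmployeeRepository.HiredSince(someDate)`.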

IEnumerable and IQueryable + Entity Framework

LINQ to SQL and LINQ to Objects queries are not the same.

LINQ to Objects queries operate on IEnumerable collections. The query iterates through the collection and executes a sequence of methods (for example, Contains, Where, etc.) against the items in the collection.

LINQ to SQL queries operate on IQueryable collections. The query is converted into an expression tree by the compiler and that expression tree is then translated into SQL and passed to the database.

IQueryable inherits from IEnumerable.

All LINQ to Objects queries return IEnumerable or a derivative of IEnumerable, and all IEnumerable expressions are executed in memory against the full dataset.

IQueryable uses a DbQueryProvider (IQueryProvider) to translate the expression (the chained extension methods) into a single database query (in this case, it generates T-SQL to run against the database). Once the query is invoked (say, by enumerating it), it is executed against the database and the results are returned to be consumed.
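
The difference can be seen without a database. The following sketch (the `QueryKinds` class and its `Describe` method are made up for this illustration) applies the same lambda through both interfaces: on `IEnumerable<T>` it is compiled to a delegate, while on `IQueryable<T>` it is captured as an expression tree that a provider could translate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class QueryKinds
{
    public static string Describe()
    {
        var numbers = new List<int> { 1, 2, 3, 4 };

        // Enumerable.Where receives a compiled delegate (Func<int, bool>)
        // and filters the items in memory when enumerated.
        IEnumerable<int> inMemory = numbers.Where(n => n > 2);

        // Queryable.Where receives an Expression<Func<int, bool>>; the call
        // is recorded in query.Expression instead of being executed here.
        IQueryable<int> query = numbers.AsQueryable().Where(n => n > 2);

        // The root of the expression tree is the call to Queryable.Where.
        var call = (MethodCallExpression)query.Expression;
        return call.Method.DeclaringType.Name + "." + call.Method.Name;
    }
}
```

Here the in-memory provider behind `AsQueryable()` simply compiles and runs the tree; with Entity Framework, the provider would instead translate that same tree into SQL when the query is enumerated.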

All of your queries for data when using Entity Framework are written against DbSet:

public class DbSet<TEntity> : DbQuery<TEntity>, IDbSet<TEntity>, IQueryable<TEntity>, IEnumerable<TEntity>, IQueryable, IEnumerable
    where TEntity : class
{

}