Simplifying Entity Framework: Data-Driven Design


As described on MSDN, ADO.NET Entity Framework is a set of technologies that enables developers to work with data in the form of domain-specific objects and properties, such as customers and customer addresses without having to concern themselves with the underlying database tables and columns where this data is stored. Developers can use Entity SQL and traditional ADO.NET reader/adapter/dataset patterns or the new LINQ to Entities to write their code. With LINQ to Entities, the Entity Framework generates strongly-typed entity classes you can use in application code and rely on the runtime to figure out how the data needs to be retrieved from the underlying data source.

ADO.NET Entity Framework

Design of Entity Framework has been inspired by the traditional methodology of relational database design which suggests creating a conceptual (class) model, transforming it into logical (relational) model and finally into physical database model. While these activities are typically performed during design and initial implementation of the information system, the Entity Framework implements the physical model, makes the conceptual model a first-class citizen and explicitly defines the mapping model.

  • Conceptual model defines entities and their relationships from the application point of view. This model is defined using Conceptual Schema Definition Language (CSDL) and stored in a separate .csdl file or as part of a combined .edmx file in your project.
  • Storage model defines entities and their relationships from the database point of view. This model is defined using Store Schema Definition Language (SSDL) and stored in a separate .ssdl file or as part of a combined .edmx file in your project.
  • Mapping model defines how conceptual model translates into storage model. It is defined using Mapping Specification Language (MSL) and stored in a separate .msl file or as part of a combined .edmx file in your project.

The Entity Framework uses these XML-based models to transform create, read, update, and delete operations against entities and relationships in the conceptual model to equivalent operations in the data source. The storage and mapping models can change as needed without requiring changes to the conceptual model, data classes, or application code. Because storage models are provider-specific, you can work with a consistent conceptual model across various data sources. The Entity Data Model even supports mapping entities in the conceptual model to stored procedures in the data source.

What is wrong with this picture?

Entity Framework was designed to overcome the object-relational impedance mismatch and allow you to hide a difficult-to-use legacy database schema behind a modern, object-oriented entity model and make developers more productive. This description almost sounds to good to be true. Unfortunately, just like other frameworks designed to solve complex problems, the Entity Framework tends to make simple things difficult.

Having three separate database layers (or models) in the Entity Data Model means having each entity represented three times, each from a different point of view. Every time you need to make a change in your database model, which also needs to be implemented in the application, you have to make it three times. Take adding a new column to a table as an example. No matter in which order you do it, you still have to add a new property to the conceptual model, a new column to the storage model and a new mapping from one to the other. This task is tedious and error prone even with the best modeling tools imaginable. Unfortunately, Entity Framework designer limits you to a single diagram, which gradually becomes more difficult to use as the number of entities increases, becoming virtually useless when it exceeds 50.

In simplistic terms, this all means that Entity Data Model is three times as complex as a traditional database model. To implement all but trivial logic, either one developer has to understand all three models or you have to have two or more people involved in implementing it. In the end, you have to spend three times more effort to develop and maintain the EDM, you are three times more likely to make mistakes, i.e. the cost of having the EDM in your solutions is three times higher than the cost of a solution that uses a single hypothetical data model.

What is the alternative?

Majority of information systems in existence today are built using relational databases and data-oriented code. A tremendous amount of tooling, knowledge and experience has been developed over the past two decades by the industry, making low-cost development resources readily available. Although solving the object-relational impedance mismatch sounds very good in theory, in practice, the overall cost of such a solution is typically higher, in part due to additional effort required to maintain three separate database models and in part due to additional knowledge and experience required from developers.

However, the existing as well as new information systems could benefit tremendously from having a modern, strongly-typed, LINQ-enabled, automatically generated data access layer (DAL). Such a DAL would allow to significantly reduce the amount of plumbing code developers have to write today with DbConnection, DbDataReader and DataSet classes without requiring significant changes to the application architecture. In other words, as long as the effort required to maintain such a DAL is less than the effort required to maintain the data access code manually, we can take advantage of the language and data access enhancements offered by the current version of the .NET framework while reducing the overall cost of the information systems.

Entity Framework DAL generator in T4 Toolbox

In order to implement an automatically generated DAL based on Entity Framework today, you would have to use EdmGen. Unfortunately, it does not give you access to some of the options supported by the framework. The new version of T4 Toolbox includes a code generator that encapsulates the functionality built into Entity Framework and makes it available in T4-based code generation scripts. This code generator is based on the idea of using database modeling tools built into Visual Studio, which are easier to use and unlike Entity Framework’s own designer, are proven to work on the number of tables typically found in real-world databases. In the rest of this article we will review the process of using this code generator to illustrate its pros and cons.

Getting Started

In order to follow this sample, you need to have Visual Studio 2008 or 2010, Standard Edition or higher, SQL Server Express 2005 or later, T4 Toolbox and T4 Editor installed on your computer.

  • Create a new C# or Visual Basic console application project.
  • Add a new item to your project and select the Code Generation->Entity Framework Database template in the Add New Item dialog.

Entity Framework Database template in the Add New Item Dialog

  • Enter the desired database name and click Add.

This template adds a new, empty SQL Express database to your project. Unfortunately, Visual Studio automatically displays the the Data Source Configuration Wizard whenever you add a .mdf file to a project. Normally, you would use it to create a data access layer with built-in visual designers. In this case, we have the entire data access layer generated automatically for us.

  • Click Cancel in the Data Source Configuration Wizard.

Data Source Configuration Wizard Database and generated files in Solution Explorer

Notice that a number of files have been added to your project, including a .mdf file (SQL database), a .ldf file (SQL log), a .tt file (code generator), a .cs/.vb file (entity classes and object context), a .Views.cs/.vb file (precompiled entity views), a .csdl file (conceptual model), a .ssdl file (storage model) and a .msl file (mapping model). If you are using Visual Basic, you will need to click "Show All Files” button in solution explorer in order to see the files nested under .mdf and .tt.

Database Modeling

Visual Studio includes a subset of database modeling functionality from SQL Server Management Studio, including database diagrams, table designers, key and relationship designers.

    Database in Server Explorer.
  • Double-click the .mdf file in Solution Explorer (shown above). This will cause the database to be opened in Server Explorer (shown on the left).
  • Right-click Database Diagrams in Server Explorer and select Add New Diagram from the context menu.
  • The first time you add a diagram to a database, Visual Studio will prompts you to create database objects required for diagramming. Click Yes when prompted.

Use the diagram to create a new database model.

Database Diagram ToolbarDatabase Diagram Designer

Use Properties Window to configure tables and columns.Database diagrams allow you to quickly model tables and their relationships. You can use the Properties Window of Visual Studio to configure individual tables and columns selected on the design surface. The toolbar and context menu provide access to commands that allow you to model keys, relationships, indexes, unique and check constraints, as well as customize the look of your diagram for display and print, add text annotations and more. Also, unlike with Entity Framework designer, you can create additional diagrams when the number of tables is too big to fit on a single diagram. This seemingly trivial feature is crucial for large database models.

Having the model database included in your project has several important benefits. It is placed under source control with the rest of the files in your project, giving you the ability to go back to previous versions when necessary. Although it may seem excessive to check in an entire database in source control, the size of a .mdf file is comparable to the size of a similar database model stored in a Visio .vsd file or an Enterprise Architect .eap file. You can also use the database file itself to store any seed data you need and with “Copy to Output Directory” option turned on, the database deployment becomes a simple xcopy on greenfield projects.

  • Make sure to save changes you make in any database diagram or designers you open before you proceed to the next step of generating code.
Code Generation
  • Double-click the .tt file in Solution Explorer (shown above) to open it in the T4 Editor.
C#
<#@ template language="C#" hostspecific="True" #>
<#@ output extension="cs" #>
<#@ include file="T4Toolbox.tt" #>
<#@ include file="T4Toolbox\EntityFramework.tt" #>
<#
    var generator = new EntityFrameworkGenerator();
    generator.DatabaseFile = "Database1.mdf";
    generator.Run();
#>
VB
<#@ template language="VB" hostspecific="True" #>
<#@ output extension="vb" #>
<#@ include file="T4Toolbox.tt" #>
<#@ include file="T4Toolbox\EntityFramework.tt" #>
<#
    Dim generator As New EntityFrameworkGenerator()
    generator.DatabaseFile = "Database1.mdf"
    generator.LanguageOption = LanguageOption.GenerateVBCode
    generator.Run()
#>

This is a T4 code generation script that creates a new instance of EntityFrameworkGenerator, configures its properties and uses it to generate all model and code files required for a ready-to-use data access layer based on Entity Framework. Note that this script is not directly associated with the database file in the Visual Studio project and has to be run manually.

  • Right-click the .tt file in Solution Explorer and select Run Custom Tool from its context menu to trigger the code generation.
  • Double-click the  generated .cs/.vb file in Solution Explorer to open it in the Visual Studio editor. If you are using Visual Basic, you will need to click “Show All Files” in the toolbar of Solution Explorer in order to see this file.
C#

Generated C# Code

VB

image

This generated file contains definitions of both entity classes and object context required to access the database using LINQ to Entities.

Also note the other generated code file, in our example Database1.Views.cs/.vb. This file contains pre-generated views, which significantly improve reduce the amount of time required to connect to a database using an Entity Framework object context for the first time. This can take tens of seconds even on a medium-size database with a hundred of tables. While this performance hit does not have a meaningful effect on applications in production due to Entity Framework caching the views on per-AppDomain basis, this delay is a significant problem for developers who need to recompile and restart the application frequently. The Entity Framework designer does not generate views automatically, causing much frustration for developers and, perhaps, being the main reason for Entity Frameworks reputation of having poor performance.

Coding

At this point, you have everything you need to start coding against your new database using LINQ to Entities.

  • Double-click Program.cs or Module.vb in Solution Explorer.
  • Enter the following code in the editor.
C#
using System;
using System.Linq; 

namespace EfDalSampleCS
{
  class Program
  {
    static void Main(string[] args)
    {
      var context = new Database1Entities();
      context.AddToProducts(new Product() { ProductName = "Internet Explorer", UnitPrice = 0 });
      context.AddToProducts(new Product() { ProductName = "Windows 7 Ultimate", UnitPrice = 319.99M });
      context.SaveChanges(); 

      var products = from p in context.Products select p;
      foreach (var product in products)
      {
          Console.WriteLine("{0} ({1})", product.ProductName, product.UnitPrice);
      }
    }
  }
}  

VB
Imports System
Imports System.Linq
Imports EfDalSampleVB.EfDalSampleVB 

Module Module1
    Sub Main()
        Dim context As New Database1Entities
        context.AddToProducts(New Product() With {.ProductName = "Internet Explorer", .UnitPrice = 0})
        context.AddToProducts(New Product() With {.ProductName = "Windows 7 Ultimate", .UnitPrice = 319.99})
        context.SaveChanges() 

        Dim products = From p In context.Products Select p
        For Each product In products
            Console.WriteLine("{0} ({1})", product.ProductName, product.UnitPrice)
        Next
    End Sub
End Module 

 

  • Press F5 to run the program. You may want to set a breakpoint and observe execution of the program in Visual Studio debugger.

A couple of things are worth explaining. How does this code know which database to connect to? The answer lies in the default constructor of Database1Entities class, which is generated to use a connection string with the same name. This connection string was automatically added to the configuration file when we added Database1.mdf to our project.

<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <connectionStrings>
    <add name="Database1Entities"
      connectionString=
      "
        metadata=res://*/Database1.csdl
                |res://*/Database1.ssdl
                |res://*/Database1.msl;
        provider=System.Data.SqlClient;
        provider connection string=
        ‘
          Data Source=.\SQLEXPRESS;
          AttachDbFilename=|DataDirectory|\Database1.mdf;
          Integrated Security=True;
          User Instance=True
        ‘
      "
      providerName="System.Data.EntityClient" />
  </connectionStrings>
</configuration>

imageMetadata section of the connection string specifies that .csdl, .ssdl and .msl files are embedded as resources in the executable. Provider section of the connection string instructs Entity Framework to use ADO.NET SQL Client to connect to the database. The provider connection string section contains the traditional connection string and indicates that SQL Server Express edition will be used to automatically attach the database file located in DataDirectory. For windows applications (such as the console app we have here), DataDirectory token refers to the directory where the  executable is located. Our database file is copied to the output directory because the “Copy to Output Directory” option was set to “Copy Always” for the .mdf file in our project.

Customizing the list of database objects exposed by DAL

You may have noticed that among the generated entity classes, we also have sysdiagrams, which was actually the table created by SQL Server when we added the first diagram to our database. Unless you are planning to access this table from your application, it would be best to remove it from the data access layer. You can can do that by adding a filter to the code generator.

  • Add the following code to the .tt file before generator.Run() method is called.
C#
  generator.Filters.Add(
    new EntityStoreSchemaFilterEntry(
      null,   // catalog
      null,   // schema
      "sys%", // name
      EntityStoreSchemaFilterObjectTypes.Table,
      EntityStoreSchemaFilterEffect.Exclude));
VB
  generator.Filters.Add( _
    New EntityStoreSchemaFilterEntry( _
      Nothing, _
      Nothing, _
      "sys%",  _
      EntityStoreSchemaFilterObjectTypes.Table, _
      EntityStoreSchemaFilterEffect.Exclude))

This code creates a filter entry object that indicates that all tables whose name starts with “sys” should be excluded from the generated DAL. EntityFrameworkGenerator class contains a property called Filters to which you can add one or more filter entries and make sure that only appropriate database objects are accessible through the DAL. For complete details, please refer to EntityStoreSchemaFilterEntry, EntityStoreSchemaFilterObjectTypes and EntityStoreSchemaFilterEffect on MSDN.

Unfortunately, not all of the possible combinations of object types and filter effects are supported by the underlying Entity Framework generator. For example, creating a filter entry to include functions in the DAL has no effect because the functions are only automatically generated in the storage model (.ssdl) and not in the conceptual model (.csdl). In other words, you cannot access stored procedures using the DAL produced by this code generator in its current version.

Customizing pluralization of entity class names and set names

Entity Framework 4.0 in Visual Studio 2010 provides an option to automatically convert table names to singular form in names of entity classes and to plural form in names of entity set properties. Without this automatic conversion, both entity name and set name would be either in plural form or singular form, depending on the naming convention you use in your database model. This poses a significant limitation in Entity Framework 3.5 in Visual Studio 2008, making the resulting DAL more confusing. A complete discussion of pluralization by the Entity Framework team is available on their blog.

The code generator in T4 Toolbox has the pluralization turned on by default in Visual Studio 2010. If you need to generate the DAL without pluralization, perhaps for compatibility reasons, you can turn this option off by adding the following line to the .tt file before generator.Run() method is called.

C#
  generator.Pluralize = false;
VB
  generator.Pluralize = False

This option is not available in Visual Studio 2008 because the underlying code generator in Entity Framework does not support it.

Customizing generation of foreign key properties

Entity Framework 4.0 in Visual Studio 2010 also provides an option to generate and use foreign key properties (such as Order.ProductId) in addition to navigation properties (such as Order.Product). Detailed discussion of foreign key properties is also available on the Entity Framework team blog.

The code generator in T4 Toolbox offers an option to generate foreign key properties, however you need to turn it on explicitly, by adding the following line to the .tt file before generator.Run() method is called.

C#
  generator.GenerateForeignKeyProperties = true;
VB
  generator.GenerateForeignKeyProperties = True

This option is not available in Visual Studio 2008 because the underlying code generator in Entity Framework does not support it.

Customizing location of the model database

When storing the model database under source control is not desirable on your project, you can change the code generator to use a standalone database as the model.

  • Change the .tt file to specify a connection string instead of the database file.
C#
<#@ template language="C#" hostspecific="True" #>
<#@ output extension="cs" #>
<#@ include file="T4Toolbox.tt" #>
<#@ include file="T4Toolbox\EntityFramework.tt" #>
<#
  EntityFrameworkGenerator generator = new EntityFrameworkGenerator();
  generator.ConnectionString = "Server=.;Database=Northwind;Integrated Security=True";
  generator.Run();
#>
VB
<#@ template language="VB" hostspecific="True" #>
<#@ output extension="vb" #>
<#@ include file="T4Toolbox.tt" #>
<#@ include file="T4Toolbox\EntityFramework.tt" #>
<#
  Dim generator As New EntityFrameworkGenerator()
  generator.ConnectionString = "Server=.;Database=Northwind;Integrated Security=True"
  generator.LanguageOption = LanguageOption.GenerateVBCode
  generator.Run()
#>

When you save the modified .tt file, the DAL will be regenerated based on the metadata in the database you specified in the connection string. By default, database name determines the name of the generated object context class. Changing the name of the database from Database1 to Northwind will change the name of the context class from Database1Entities to NorthwindEntities.

  • Rename the .tt file to match the name of the database (Northwind.tt in our example).

Although this is not required, renaming the code generation script will make the names of the generated .csdl, .ssdl and .msl files match the name of the database and make the solution easier to understand.

  • Update the configuration file to match the changes made in generated code.
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <connectionStrings>
    <add name="NorthwindEntities"
      connectionString=
      "
        metadata=res://*/Northwind.csdl
                |res://*/Northwind.ssdl
                |res://*/Northwind.msl;
        provider=System.Data.SqlClient;
        provider connection string=
        ‘
          Server=.;
          Database=Northwind;
          Integrated Security=True;
        ‘
      "
      providerName="System.Data.EntityClient" />
  </connectionStrings>
</configuration>

In particular, the name of the connection string has to match the new name of the object context class (NorthwindEntities in this example); file names in the metadata section of the connection string have to match the names of the generated model files (Northwind.csdl, Northwind.ssdl and Northwind.msl in this example); and finally, the provider connection string section of the connection string has to point to the actual database in your current environment.

Note that database connection string is present in both .tt and .config files. In this example, the same Northwind database on local SQL server is used during both code generation and execution. You may want to use different databases for these purposes, perhaps an empty local database for code generation and a shared database for development and testing. In that case, make sure to use appropriate connection strings in the .tt and .config files.

Other customization options

Additional details about customization options are available in the T4 Toolbox documentation installed as part of the Visual Studio 2008 help collection. If you have it installed, you can click here to open the list of properties available for you to use in the code generation script.

Conclusion

ADO.NET Entity Framework allows you to access database in object-oriented manner at the cost of additional effort required to maintain conceptual, storage and mapping model. While sounding good in theory, solving the object-relational impedance mismatch in practice typically increases the cost of information systems due to additional knowledge, experience and effort it requires with tools available today. By implementing an automatically-generated, modern, strongly-typed, LINQ-enabled data access layer based on Entity Framework, we can improve developer productivity and reduce the system development cost without introducing the overhead and complexity of fully featured object-relational mapping. Unlike the tools built into Entity Framework, the database modeling functionality in Visual Studio and the code generator described in this article allow you to use Entity Framework effectively with more than just a few dozen tables. Unfortunately, due to limitations in the current version of Entity Framework, the solution described in this article does not allow accessing stored procedures and functions through the automatically generated DAL, making it inferior compared to a similar DAL based on LINQ to SQL today.

Download

Sample source code


Write a Comment

Take a moment to comment and tell us what you think. Some basic HTML is allowed for formatting.

Reader Comments

[…] Simplifying Entity Framework: Data-Driven Design (Oleg Sych) – EF and T4 templates… nice. […]

Great stuff Oleg. So - is it fair to say, from a Data Driven Design point of view, we have the maintenance simplicity we enjoyed with LINQ to SQL while using a Microsoft supported framework?

I’m hoping that approaches like this, along with the improvements in EF in .Net 4.0, will facilitate reasonable use of EF in medium and large scale projects.

Jeff, I think two things are still missing in EF 4.0 that we have in LINQ to SQL - automated import of stored procedures and support for enums in entity classes and LINQ. In my mind, these two limitations make L2S superior to EF from the standpoint of implementing a cost effective data access layer in .NET 4.0

Hello Oleg, i tried to create the source code using your sample download and got struck due to an issue on the default constructor MyTest.vb

Public Sub New()
MyBase.New(”name = MyTestEntities”, “MyTestEntities”)
MyBase.ContextOptions.LazyLoadingEnabled = true
OnContextCreated()
End Sub

I am getting an error like this
“Format of the initialization string does not conform to specification starting at index 123.”

The .VB file cannot able to access the configuration file.Can you advice what went wrong ?

Regards
Dhinesh Kumar

Dear Oleg

I was able to resolve this issue. This is just a connection string issue which i was able to resolve now

regards
Dhinesh Kumar

[…] Oleg Sych's Blog : Entity Framework Tools in Visual Studio and the T4 Template Toolboxhttp://www.olegsych.com/2009/12/simplifying-entity-framework-data-driven-design/ […]

When I try to add an Entity Framework DB to my sln, it says “Connection to SQL Server files require SQL Server Express 2005″ and won’t let me proceed. But I have SQL Server 2008 Pro! Argh!

The project item template in T4 Toolbox assumes you have SQL Server 2005 or 2008 Express Edition installed. Standard, Enterprise or Developer editions of SQL Server don’t support the ability to attach databases in the connection string.

In the article you mentioned the “T4 Toolbox documentation”

Where can i get this so i can add it to my VS2008 documentation?

Thanks
Tony W

Tony,

I pulled documentation out of the T4 Toolbox build after discovering that Visual Studio 2010 uses a completely different help system. It was hard enough to build it for 2008 with Sandcastle, I simply didn’t have time to deal with two help systems.

Oleg

I would suggest not using pluralization right now. Doing so makes any future code generation quite painful. I am feeling this pain now.

If you want to pluralize something, do it by convention. For example use “CompanyList” instead of “Companies”. The English language makes pluralization/singularization a bear.

I wish Microsoft made this easier for those who need to be able to determine singular and plural forms of a given type.

Maybe they do, and I just haven’t found it.

Why do we need to use an MDF, this seems incredibly short sited. What about simply specifying a connection or a pre existing EDMX?

Very dissapointed I cannot use this template.

I’m sorry Tom. I’m sure CodePlex will gladly refund the money you paid for this template. :)

P.S. You can see an example of how t o specify model database via a connection string in the “Customizing location of the model database” section of the article.

Hi, I find T4 templating really poor.
I am trying to use a Class.cs inside the .tt file and it doesn’t find this Class

Also i tried to use a .dbml with a linq query inside the .tt file, and i was not able to do this :

DataClasses1DataContext da=new DataClasses1DataContext();

Can’t wait to see the DbContext/POCO (with data-annotations?) template.

This is really good stuff Oleg.
I am only sad I did not find this approach earlier. We are now saddled with a huge EDMX file that continually corrupts and requires manual hacking. Is there a version of this template that will output DbContext EF4.1 classes rather than ObjectContext?