Entity Framework

Introducing LINQ

Paul Vick has a nice introduction to LINQ. Sounds cool, but we’ll have to wait and see how cool it really becomes…

"LINQ" uses Attributes!!!

I’ve written a lot about the debate on whether O/R Mappers should use attributes or an xml mapping file.  I’ve more or less conceded that XML mappings are more flexible and the better solution.  But wait!

Today as I was listening in on the PDC Keynotes I saw Anders demo “LINQ”, which looks to use attributes pretty extensively! Perhaps I wasn’t so nuts after all when I posted “If Attributes are good enough for Indigo why aren’t they good enough for O/R Mappers?” I’ll be interested to see if “LINQ” also has XML mapping capabilities. I’d be surprised if it didn’t line up with the Indigo model of allowing things to be defined declaratively as well as programmatically.

Also worth mentioning is my Entity Framework category which has a bunch of different thoughts on O/R Mappers and things such as whether xml mappings or attributes are the way to go.

Anyone know what's shakin with NHibernate?

I’m still trying to decide which O/R Mapper I’m going to support in ActiveType. I’ve been playing with WilsonORMapper a good bit recently and have liked my experiences. I want to make sure I give other mappers a test run as well before making the final decision. One of the things that has me a little concerned about NHibernate is that there doesn’t appear to be any active development going on. I’d like the mapper I choose to support .NET 2.0 (mostly generics), which is why I was hoping some activity would be happening in the NHibernate source tree.

Does anyone know what’s up with NHibernate? Who’s actively working on the project? Do they plan on supporting .NET 2.0? If so, when?

O/R Mapping: Attributes vs. XML For Mapping

Jeff Perrin posted a response to my If Attributes are good enough for Indigo why aren't they good enough for O/R Mappers? entitled O/R Mapping: Attributes vs. XML for Mapping. In the post he disputes some of the arguments made for XML mapping. If you're interested in such things, check it out.

DataSets vs. O/R Mappers

Sam has a very interesting post comparing DataSets and O/R Mappers. I think he makes a good point that they aren’t really equivalent and shouldn’t be compared against one another. They serve different purposes and are appropriate in different application scenarios. If you're following Domain Driven Design then an O/R Mapper is probably more appropriate, and if you're a “Database Driven Design” kind of guy the DataSet may be more appropriate. The important point is just give up on the DataSet and use an O/R Mapper for everything. Ok, not really, but you had to suspect an OO bigot like myself would try and spin it that way, wouldn’t you?

For more information from me on Domain Driven Design check out my DDD category, and of course for more on O/R Mappers check out my poorly named Entity Framework category.

If Attributes are good enough for Indigo why aren't they good enough for O/R Mappers?

I’ve recently been reviewing the code for my Entity Framework (simple O/R Mapping, Entity validation, etc) to determine if I should consider swapping out the O/R Mapping for something else such as NHibernate, NPersist, Wilson OR Mapper, etc.  The reason I’ve been considering this is that swapping out the O/R Mapping functionality within my Entity Framework with one of the more widely used frameworks may make it easier for people to extend ActiveType (CMS) for their unique needs.

As part of my review I was investigating the differences in functionality between the other O/R Mappers out there and the lightweight O/R mapper I’ve developed in my base Entity Framework. There are a lot of obvious differences, but, as you might be able to tell from the title of this post, I’m going to focus on how the mapping is configured in the various O/R Mappers. Most O/R Mapping products out there today have the mapping of objects to database tables defined in an external configuration file. There are some (mine included) that define the mapping within the classes themselves via custom attributes.
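To make the attribute approach concrete, here is a minimal sketch of mapping defined within the class itself. The attribute names (Table, Column), the MappedCustomer class, and the t_Customer table are all made up for illustration; they are not the attributes from my framework or from any of the mappers mentioned above.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical mapping attributes; the names are illustrative only.
[AttributeUsage(AttributeTargets.Class)]
public sealed class TableAttribute : Attribute
{
    public string Name { get; }
    public TableAttribute(string name) { Name = name; }
}

[AttributeUsage(AttributeTargets.Property)]
public sealed class ColumnAttribute : Attribute
{
    public string Name { get; }
    public ColumnAttribute(string name) { Name = name; }
}

// The mapping lives right next to the properties it describes.
[Table("t_Customer")]
public class MappedCustomer
{
    [Column("CustomerName")]
    public string Name { get; set; }

    // No [Column] attribute, so this sketch treats the property as unmapped.
    public string Scratch { get; set; }
}

public static class AttributeMapping
{
    // Returns the table name declared on the class via [Table].
    public static string TableFor(Type type)
    {
        var table = type.GetCustomAttribute<TableAttribute>();
        if (table == null)
            throw new InvalidOperationException(type.Name + " has no [Table] attribute.");
        return table.Name;
    }

    // Returns property name -> column name for every property marked with [Column].
    public static Dictionary<string, string> ColumnsFor(Type type)
    {
        var columns = new Dictionary<string, string>();
        foreach (PropertyInfo property in type.GetProperties())
        {
            var column = property.GetCustomAttribute<ColumnAttribute>();
            if (column != null)
                columns[property.Name] = column.Name;
        }
        return columns;
    }
}
```

A mapper built this way would read the metadata once per type through reflection. The trade-off is the one discussed in these posts: renaming a column means editing and recompiling the class.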

I’ve recently been reading up on Indigo, and as I worked through the code samples I couldn’t help but notice their extensive use of attributes. Although O/R Mapping and Indigo are clearly not one and the same, the similarities are obvious, and I suspect there are others who agree:

One last point, [DataContract] is just not an "Indigo" feature -- it is a feature for the entire .Net Framework.

Thinking about it in only Web services terms doesn't do justice to the host of important scenarios that it addresses elsewhere (version-tolerate persistence for any CLR type in any store).

We are defining a common abstract data model for CLR types for N scenarios -- many of which are Web service related.

http://www.douglasp.com/PermaLink.aspx?guid=01095547-b7f1-4e1d-8c4e-31038296f164

If [DataContract] is not just an Indigo feature but a feature for the entire .NET Framework then why couldn’t it become the center point for an O/R Mapping engine? And if attributes are good enough for defining [ServiceContract], [OperationContract], and [DataContract], why couldn't they also be good enough for defining details to be used by a persistence engine?

Hibernate & NHibernate

I recently came across an interview with Gavin King, founder of the Hibernate open source object/relational mapping project.  Gavin talks about some of the challenges with o/r mapping and the reasons he decided to join JBoss.

In related news I received a comment on my .NET O/R Mappers post from szoke letting me know that the NHibernate (sf.net) project, which is a .NET port of Hibernate, is “alive.”

So reflection isn't as bad as I made it out to be

Over the last week or two I've exchanged a couple of emails with Paul Wilson regarding his WilsonORMapper. For those of you who haven't checked it out yet, I'd encourage you to do so. It provides an excellent example of what an O/R Mapper does, and a peek behind the scenes at how an O/R Mapper can perform its “magic.” When I first checked out Paul's O/R Mapper I questioned how performant it would be in a production environment due to the amount of reflection that was being used. This is what prompted me to do a little investigation, and resulted in my “The cost of reflection” post.

Paul's follow up posts regarding the work he did to optimize the performance of his O/R Mapper hit on a very important point. When measuring performance be sure to look at it in the context of the total application. There's a good chance that what you believe is the slow part of the application isn't actually that bad.

For a perfect example of this take a look at where I stood last week. I declared that reflection on private fields was slow. In fact, much slower than accessing public properties directly. From this I concluded that the amount of reflection in Paul's O/R Mapper would make his mapper perform less than optimally. Now take a look at Paul's most recent post on how he optimized his O/R Mapper. Although reflection on private fields was slow, it wasn't anywhere near the top of the list. My assumption that the reflection would slow things down was wrong. Even with reflection his mapper performed comparably to using DataSets. At least everyone now knows to ignore everything I say.

Should an O/R Mapper use attributes or a mapping file?

Paul Wilson asks the all-important question: should an O/R Mapper use attributes or a mapping file?

I've always leaned toward attributes for mapping objects to the database.  I liked having my mappings defined right there with my class.  Most of the time when doing development if I'm changing the name of a column I'm also changing the name of the associated property, so having the mappings right there in the class wasn't too much of a problem. 

Recently I've found myself being drawn to the external mapping file. I still don't like the fact that it's yet another thing for me to manage; however, it has obvious benefits. By taking things out of the class you reduce the coupling between the class and its mappings. The mappings are completely separated from the class and can be updated without recompiling. As I've mentioned in previous posts I also have a set of custom attributes that define validation rules for my classes. The arguments that Paul makes for O/R Mappings in external mapping files also apply to validation rules. Having everything in an external file allows things to be updated “on the fly,” and removes the need to recompile and redeploy.

So, although I currently use custom attributes to handle my mappings, I think I like the route Paul has taken better.  You live and learn.

Do you use marker interfaces?

Within a current class library I've developed a validation component that uses custom attributes. By decorating properties of my classes with the “validation attributes” I can have the library automagically create validators for my classes. Below is an example of a class that makes use of the auto validation component:

public class Customer : IAutoValidated {
   // Name must be set before the object can be saved.
   [Required]
   public string Name { get; set; }

   // Values longer than 50 characters fail validation.
   [MaxLength(50)]
   public string MyNotTooLongString { get; set; }
}

When the above customer object is saved the validation component will use the Required and MaxLength attributes to validate the object. If the Name property is not set or if MyNotTooLongString is longer than 50 characters a ValidationException will be thrown. The validation component uses reflection to do some of its work. Since reflection is generally not the speediest process I've decided to use a marker interface to tell my class library which components should be auto-validated. This will prevent the framework from trying to create validators for objects which don't need them. Within the code that saves an object I only call the validation if the object implements the IAutoValidated interface.

if(anEntity is IAutoValidated)
   Validate(anEntity);

I'm using the IAutoValidated interface as a marker.  If the object implements the interface I know that it wants to use the auto validation, if it doesn't implement the interface I know I shouldn't do the extra work that is required to “create” the validator. 
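For anyone who wants to see the whole idea in one place, here is a rough, self-contained sketch of how a marker interface and reflection-based validation could fit together. This is not the code from my class library: the attribute classes, the AutoValidator helper, and the SampleCustomer and UnmarkedNote classes are written from scratch for illustration, and the sketch reflects on every call instead of creating and caching validators.

```csharp
using System;
using System.Reflection;

// Marker interface: no members, it only signals "validate me".
public interface IAutoValidated { }

[AttributeUsage(AttributeTargets.Property)]
public sealed class RequiredAttribute : Attribute { }

[AttributeUsage(AttributeTargets.Property)]
public sealed class MaxLengthAttribute : Attribute
{
    public int Length { get; }
    public MaxLengthAttribute(int length) { Length = length; }
}

public class ValidationException : Exception
{
    public ValidationException(string message) : base(message) { }
}

public class SampleCustomer : IAutoValidated
{
    [Required]
    public string Name { get; set; }

    [MaxLength(50)]
    public string MyNotTooLongString { get; set; }
}

// Has a validation attribute but no marker, so it is never inspected.
public class UnmarkedNote
{
    [Required]
    public string Text { get; set; }
}

public static class AutoValidator
{
    // Only objects that opt in through the marker pay for reflection.
    public static void ValidateIfMarked(object entity)
    {
        if (entity is IAutoValidated)
            Validate(entity);
    }

    private static void Validate(object entity)
    {
        foreach (PropertyInfo property in entity.GetType().GetProperties())
        {
            object value = property.GetValue(entity, null);
            var text = value as string;

            // [Required]: null or empty string counts as "not set".
            if (property.IsDefined(typeof(RequiredAttribute), true)
                && (value == null || text == ""))
                throw new ValidationException(property.Name + " is required.");

            // [MaxLength]: only checked for string values that are present.
            var max = property.GetCustomAttribute<MaxLengthAttribute>();
            if (max != null && text != null && text.Length > max.Length)
                throw new ValidationException(
                    property.Name + " exceeds " + max.Length + " characters.");
        }
    }
}
```

The `is` check costs almost nothing, so unmarked objects skip the reflection loop entirely. The same opt-in could be expressed with a class-level attribute, but then the check itself would need reflection.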

Do you use marker interfaces?   Why?  Why not?

Why I don't like ObjectSpaces.

Over the past year I have looked at a large number of .NET O/R Mappers, hoping to find something that I really liked. This afternoon I read Jan Tielens' Getting Started with ObjectSpaces article on MSDN. I had high hopes for ObjectSpaces. While there are a lot of things within ObjectSpaces that I do like, my overall feeling isn't too good.

Here are the things which I do like:

  • An ObjectSpaces object doesn't look any different than any other object. I'm not forced into inheriting from a base “ObjectSpacesObject”, and I don't have to decorate the properties of my object with ObjectSpaces attributes for the mapping.
  • In theory I like the concept behind OPath. When you're working with objects you should be able to query them using something that makes sense for objects.
  • A mapping tool that is integrated within VS.NET. Although it isn't currently integrated, Jan says that it will be by the time the beta rolls around. I definitely want to stay within VS.NET so this is a plus.
  • The ability to create compiled ObjectExpressions for querying objects.  http://longhorn.msdn.microsoft.com/lhsdk/ndp/daconusingobjectexpression.aspx

And now the things I don't like:

  • It only supports SQL Server. I want an O/R Mapper that can support whatever I want. I want built-in SQL Server and XML support, and I want to be able to create whatever other support I need by creating my own custom ObjectProviders. I want to plug in new providers through some simple configuration.
  • It makes working with objects messy. I have to do all sorts of ObjectSpaces “stuff” that I don't want to. Let's look at how I would go about saving an object as an example:

    First I need to create a SqlConnection, and hook up my connection to the ObjectSpace.  Next I change the properties of the object.  Finally I tell my ObjectSpace that I want it to track changes to the object, and then persist.

    SqlConnection conn =  new SqlConnection("Data Source=localhost;Integrated Security=SSPI;");
    ObjectSpace os = new ObjectSpace(@"C:\ObjectSpacesDemo\ObjectSpacesDemo.msd.xml", conn);

    Data.Customer c = new ObjectSpacesDemo.Data.Customer();
    c.Name = "Steve Eichert";
    // set other props
    os.StartTracking(c, System.Data.ObjectSpaces.InitialState.Inserted);
    os.PersistChanges(c);

    Yuck!  I'd prefer this instead:

    Customer c = new Customer();
    c.Name = "Steve Eichert";
    // set other props
    c.Save();

    And then there's deleting data:

    SqlConnection conn =  new SqlConnection("Data Source=localhost;Integrated Security=SSPI;");
    ObjectSpace os = new ObjectSpace(@"C:\ObjectSpacesDemo\ObjectSpacesDemo.msd.xml", conn);

    Data.Customer c =
    (Data.Customer)os.GetObject(typeof(Data.Customer),
    "Name = '" + textBox2.Text + "'");

    os.MarkForDeletion(c);
    os.PersistChanges(c);

    instead of something like this:

    Customer c = new Customer().GetByName(textBox2.Text);
    c.Delete();

    Now I'm sure anyone who uses ObjectSpaces will write wrappers to reduce the amount of code necessary for working with ObjectSpaces objects, but should they have to? Wouldn't it be nice if it were designed so that we didn't have to do a bunch of setup? Wouldn't it be nice if it were transparent?

At this point I've only taken a high level look at ObjectSpaces.  There is a lot that I like, but, also some things that I don't like.  What I really want is an O/R Mapper that is transparent.  It sits there in the background doing all the work, without me really knowing or caring.  I don't like having to go through all these broker, context, and objectSpace objects to save, retrieve, and remove my objects. 

Should an O/R Mapper also be a code generator?

With all the hype surrounding O/R Mappers of late, Mark Bonafe decided it was time to look at LLBLGen Pro, which he “thought was among the best O/R tools available.”

When I originally looked at LLBLGen Pro I had some of the same feelings as Mark.  In my opinion an O/R Mapper should not be a code generator.  An O/R Mapper should perform all its magic through its core components, not by generating lots of code.  I don't view LLBLGen Pro as a “pure” O/R Mapper.  I tried the demo version of the product several times and each time I was left with the feeling that I was evaluating an n-tier code generator with a little bit of O/R Mapping thrown in. 

In what I call a “pure O/R Mapper”, the developer doesn't see any generated code. The mapper *may* generate some of the data access code at runtime, but it's not something that the developer using the mapper has to worry about. The code doesn't clutter up their project or introduce another thing for them to manage.

.NET O/R Mappers

Now that I've attempted to describe what an O/R Mapper is, described how an O/R Mapper is different than a code generator, and attempted to lay out the advantages dynamic SQL provides an O/R Mapper, I thought it might be useful to provide a list of some of the .NET O/R Mappers I've come across.

Feel free to leave your favorite .NET O/R Mapper in the comments of this post!

What advantages does dynamic sql provide an O/R Mapper that procs don't?

Ok, so now that I've given my O/R Mapper overview, as well as shared how an O/R Mapper is different than a code generator, let me try and attack the question that prompted me to write these posts.

What advantage does dynamic sql provide an O/R Mapper that stored procedures don't?

To answer this question let's first look at the type of SQL the O/R Mapper has to generate.

  • Save - INSERT and UPDATE statements to update all the properties of the object when the .Save() method is called on the object.  Possibly generate SQL to only update certain columns, for example set the Active flag to true.
  • Delete - DELETE statement to remove an object from the data store by its primary key, by a column, or group of columns.
  • Select - SELECT statement to retrieve an object by its primary key, or any combination of column values.

One can easily create dynamic SQL OR stored procedures to provide the necessary functionality. Clearly neither dynamic SQL nor stored procedures offers an advantage if we look at it purely from a functionality point of view. I haven't come across anything that I could do in dynamic SQL that I couldn't in stored procedures. However, there are certain things which dynamic SQL makes easier. Let's look at a simple example.

Imagine that you want to be able to query the system for all employees who live in Pennsylvania, have a salary greater than $45,000, and were hired after 1/1/2002. If your O/R Mapper is using dynamic SQL the mapper will create some SQL that looks like this:

SELECT * FROM t_Employee WHERE State = 'PA' AND Salary > 45000 AND HireDate > '1/1/2002'

Let's assume you want to query the system slightly differently. This time you want to find all employees who live in Virginia, have a salary less than $125,000, and were hired before 1/1/2003. Easy enough, this time the O/R Mapper creates the following dynamic SQL:

SELECT * FROM t_Employee WHERE State = 'VA' AND Salary < 125000 AND HireDate < '1/1/2003'
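As a rough sketch of how a mapper might assemble queries like the two above from arbitrary criteria, consider the following. The Criterion and SelectBuilder types are invented for this example and don't come from any particular mapper. The sketch also swaps the inline literals shown above for @p0-style parameters, which is how I'd want generated SQL to carry values, and it assumes table and column names come from trusted mapping metadata rather than from user input.

```csharp
using System;
using System.Collections.Generic;

// One filter: a column, a comparison operator, and a value.
public sealed class Criterion
{
    public string Column { get; }
    public string Operator { get; }
    public object Value { get; }

    public Criterion(string column, string op, object value)
    {
        Column = column;
        Operator = op;
        Value = value;
    }
}

public static class SelectBuilder
{
    // Operators are whitelisted because they are spliced into the SQL text.
    private static readonly HashSet<string> AllowedOperators =
        new HashSet<string> { "=", "<>", "<", ">", "<=", ">=" };

    // Builds the SELECT text with @p0, @p1, ... placeholders and hands the
    // values back separately so they can be bound as command parameters.
    public static string Build(string table, IList<Criterion> criteria,
                               out Dictionary<string, object> parameters)
    {
        parameters = new Dictionary<string, object>();
        var clauses = new List<string>();

        for (int i = 0; i < criteria.Count; i++)
        {
            Criterion c = criteria[i];
            if (!AllowedOperators.Contains(c.Operator))
                throw new ArgumentException("Unsupported operator: " + c.Operator);

            string name = "@p" + i;
            clauses.Add(c.Column + " " + c.Operator + " " + name);
            parameters[name] = c.Value;
        }

        string sql = "SELECT * FROM " + table;
        if (clauses.Count > 0)
            sql += " WHERE " + string.Join(" AND ", clauses);
        return sql;
    }
}
```

Passing criteria for State = "PA", Salary > 45000, and HireDate > 1/1/2002 against t_Employee yields `SELECT * FROM t_Employee WHERE State = @p0 AND Salary > @p1 AND HireDate > @p2`, with the three values returned in the parameter dictionary. Changing an operator or adding a fourth column is just another Criterion; nothing in the database has to change.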

Now let's take a look at how the stored procedure would be written. In the examples above we're querying the t_Employee table by three criteria. However, in reality we want to be able to query the t_Employee table by any of the columns within the table. How do you write a stored procedure to handle that? As I've noted in Optional Parameters in SQL Server Search Queries there is a pretty clean way to handle a bunch of optional parameters. The question becomes which columns I want to allow the user to query against. At the start you probably don't make every possible column an optional parameter, since it may introduce some overhead into the procedure. The other thing which now has to be considered is the fact that you may want to query the columns using different operators (<, >, <>, <=, >=) as in our example above. How do you handle that in your stored procedure? I know you can do it; it just takes a lot more work, and a lot more code.

Let me disregard the argument above for the time being. I'll assume that somebody out there will respond with a really snazzy way to handle a lot of optional parameters as well as the ability to query the table's columns using whatever operator you like (Please leave any solutions in the comments!!).

The real reason I'm starting not to like working with stored procedures is the overhead in maintenance they require. When I change a business object I have to perform several tasks to get my stored procedures and tables in sync. I have to update each Save and Select proc that references the table to properly handle the change. This may mean adding additional columns to the select, changing the WHERE clause to support an additional optional parameter, or changing the save to persist an additional property. Now consider the number of changes of this type that occur during the development of a project. It happens often, and it can add a lot of overhead. With dynamic SQL this isn't a problem. I update my business object and my table, and away I go.

In summary, the two main reasons I like dynamic SQL for an O/R Mapper are:

  • Using dynamic sql the O/R Mapper is better able to handle ad-hoc queries.  With dynamic sql we can allow the developer to query the system for objects by any combination of properties.  They can choose to use whatever operator they so desire and don't have to worry about hacking a stored procedure together to accomplish what they want.
  • The maintenance overhead required in an O/R Mapper that uses stored procedures is higher.  When an object is changed you have to go through all your stored procedures and perform all the necessary updates.  Considering the amount of object changes that are made during development of a typical application this can add a lot of overhead to the project.

With that said let me conclude with a little anti-dynamic sql thought.

A good O/R Mapper that supports stored procedures should make the maintenance problem almost non-existent.  If the O/R Mapper is responsible for creating the stored procedures it should be able to update them when a change is made to an object.  The O/R Mapper should also be able to make schema changes to the database.  Maybe the tool could even allow the developer to select two assemblies containing the business objects for the application, and do a diff on the assemblies to create a SQL Script to update the schema, as well as the stored procedures.  If someone creates that (me?) I think I may have less reason to move to dynamic sql.  We shall see....

 

What's the difference between an O/R Mapper and a code generator?

In my previous post I provided an O/R Mapper overview. One of the common responses I get from people after I describe an OR Mapper is...so it's a fancy code generator?

Although my initial reaction is always “It's not a code generator!?!?”, I catch myself each time I get the response and realize it “sorta-kinda” is a super duper fancy code generator. So what is the difference between an OR Mapper and a code generator?

A code generator generates code. You run the generator using a set of templates that you set up, and it spits out a whole bunch of code. When something changes you open up the code generator, re-generate all the code, and recompile. (I realize this is a huge generalization but for the most part I think it holds true.)

An OR Mapper is a framework of components.  The framework may use runtime code generation to aid in the mapping of objects to relational databases, but it doesn't just generate a DAL.  Let me give you a quick example.

In my OR Mapper (not sure if it can really be called an OR Mapper yet but oh well) I use code generation at runtime heavily.  The code that is generated, however, is never seen or compiled by the developer using the mapper.  Below is a step by step of how/when code generation is used in my mapper.

The first step in the process involves a traditional code generator as discussed above. The generator creates the object with its properties, along with the correct custom attributes necessary for the mappings. After the object is created it can be compiled into the assembly. At this point none of the code/SQL for saving, deleting, and retrieving my objects from the data store exists. When the application is run, and the user of the application goes to save information within one of my objects, the code and SQL are generated by the framework and compiled into a dynamic assembly. The dynamic assembly is then cached and used by the framework on subsequent save requests for the object. This offers a couple of advantages.

  • The developer never sees, and thus never worries about this code. 
  • The developer doesn't have to go to a separate tool and say to regenerate the code.
  • The generated assembly is always up to date.
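The generate-once-then-cache flow described above can be sketched in a few lines. To keep it short this sketch swaps in a simpler technique: it caches generated SQL text per type rather than emitting and compiling a dynamic assembly as my mapper does. The SaveStatementCache class, the Employee class, and the naming rule (table = "t_" + class name, column = property name) are all assumptions made up for the example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;

public class Employee
{
    public string Name { get; set; }
    public string State { get; set; }
}

public static class SaveStatementCache
{
    // Generated statements keyed by entity type: built on the first save,
    // reused on every save after that.
    private static readonly ConcurrentDictionary<Type, string> Cache =
        new ConcurrentDictionary<Type, string>();

    // Counts how often generation actually ran, to make the caching visible.
    public static int GenerationCount;

    public static string InsertFor(Type type)
    {
        // Under heavy contention GetOrAdd may run the factory more than once,
        // but only one result is ever stored and returned.
        return Cache.GetOrAdd(type, Generate);
    }

    // Assumed naming rule for this sketch only:
    // table = "t_" + class name, column = property name.
    private static string Generate(Type type)
    {
        Interlocked.Increment(ref GenerationCount);

        // Reflection does not guarantee property order, so sort for stable output.
        string[] columns = type.GetProperties()
            .Select(p => p.Name)
            .OrderBy(n => n, StringComparer.Ordinal)
            .ToArray();

        return "INSERT INTO t_" + type.Name
            + " (" + string.Join(", ", columns) + ")"
            + " VALUES (" + string.Join(", ", columns.Select(c => "@" + c)) + ")";
    }
}
```

Because the statement is derived from the compiled type at runtime, adding a property to Employee changes the generated SQL the next time the application starts, with no separate regeneration step.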

In summary, a code generator just generates code. An O/R mapper may use runtime code generation within its framework of objects to aid in mapping objects to databases, but it isn't just a code generator. You don't see the code that it generates (usually). It removes the burden of writing and managing DAL code from the developer. Rather than worrying about writing (or using a code generator to generate) a bunch of Save, Delete, and Retrieve routines for objects, the developer worries about the business rules and requirements for the application. The O/R Mapper handles the rest!