October 2004 - Posts
Ok so I lied, I'm not really mid-way through Enterprise Integration Patterns....more like a quarter of the way. So far I've really enjoyed the book. I've been trying to read a couple of patterns a night and have been falling asleep while doing so. I have to admit it's not a book that will keep you on the edge of your seat wondering what's coming next, but it is an excellent book for anyone wanting to get up to speed on messaging. Yeah, so this is a lame midway review, but I wanna go read some more before going to bed so that's all you're getting for now, more to come later...
I've been a proponent of Agile development methodologies within my organization for over a year but, frankly, haven't made much progress in getting our projects to go agile. Some of this is because of the type of work that we get at times: more marketing fluff than deep technical challenges. The one thing that I still have a hard time with when it comes to Agile is the whole idea of giving the customer the opportunity to cancel the project after any iteration if the project isn't proving to be of enough value. How does a small to medium size company plan for such a thing? If at the end of an iteration the customer says that's it, pack your bags, we're done, then what? Does the project team then get sent back to the “home office” to sit on the “bench” until another project starts up? What if there isn't work that's ready to begin when the project ends?
With “typical” projects using “typical” methodologies (read waterfall and fixed bid) all this can be planned for. The end of the project is set for a certain date (which is usually pushed because of “scope creep”) so that “management” can schedule resources for new projects and get a good idea of when people will be freeing up. If every team member were on an agile project that gave the customer an opportunity to cancel it every iteration (every 2-3 weeks), that seems like a lot of uncertainty that could cause a lot of trouble when it comes to scheduling resources and planning out projects that are getting ready to kick off. I'd love to hear from some ThoughtWorkers on this as I'm sure this is something they deal with quite a bit. Perhaps Martin and friends are just so good that they don't ever get kicked out before they planned. Perhaps they can see the end coming an iteration or two beforehand and can use that in planning. Frankly I just don't know, and it's one of the things that I'm not able to explain/defend when I talk agile with my co-workers and managers. :-(
ThoughtWorks just released Selenium as an open source "testing tool for browser-based testing of web applications." It looks similar to FIT in the way in which you lay out your tests. I'm definitely going to have to spend some time checking it out as web application testing is one of the areas where I think additional testing tools are needed.
Interesting report from John Lam on a presentation by Werner Vogels (of Amazon.com) given at Middleware 2004.
"They’re also pushing databases beyond what they are capable of. E-Bay does not maintain integrity constraints in the database – they’re maintained at the application layer. They don’t maintain indexes in the databases; instead lookups are done in Berkeley DB indices since lookup speeds are an order of magnitude faster there than on a relational database. Effectively, E-Bay is using their expensive database system as a transactional file system!"
Note: Amazon has been changed to E-Bay per the information from Werner himself in the comments of this post. Check out his post on How Databases Used at Big Customers for more info...
I've been using Bloglines for the last several weeks for my blog reading. Since I got my Mac it's been important that I'm able to read my feeds on multiple computers and have the read status sync'd. Until RssBandit becomes Mono compliant that means using an online aggregator. While Bloglines has served me well, I was never real crazy about the UI. I recently moved my subscriptions over to NewsGator Web Edition and am liking it so far. There are some real annoying things that I hope they fix, but overall I like the interface and feature set.
We just got started on a new project that is going to require communicating with Windows 98 workstations. Our initial design has a .NET Windows Service running on the Windows 98 machine(s) with communication between the workstations and the central server occurring via an ASP.NET web service which will utilize WSE. I came across this thread which states that this is possible as long as IE is on the client machine and the Microsoft.Web.Services.dll is installed in the GAC of the client machine.
Has anyone out in the blogosphere successfully accomplished such a feat? (Newsgroups turned up only a minimal amount of information...)
I think somewhat recently I blogged about how my ultimate goal is to become an architect. When I think of an architect I think of it as Martin Fowler describes in his "Who Needs an Architect" article. I believe an architect should be the centerpiece of development efforts. Bottom line is I think an architect should write code. In a lot of organizations today architects aren't people who write code, they're people who draw nice diagrams and talk in theoreticals. I enjoy writing code, I think I'm good at it, and I think there is a need to have an individual in an architectural role on the team that develops the deliverable (software).
At my current employer we don't really have an architect as I define it. We have "Tech Leads" who are responsible for identifying risks, as well as developing the high level design of the system. A tech lead can have anywhere from 1-3 projects on his plate at any one time. What this means is the tech lead does very little of the actual development. Identifying the risks, assigning tasks to the development team, and writing any kind of specifications fills up the day rather quickly. In turn the tech lead ends up writing very little code, and often leans on the "lead programmer" for the major coding on the project. The other role within our organization is a System Architect. The System Architect does a lot of design for the applications we build but rarely, if ever, writes much code.
Within your organization what roles do you have? Do architects write code? Does the role I'm looking for (my "architect") exist in the world or have we broken the world into "grunt" programmer, and "UML drawing" architect?
Jon Tirsen is feeling sick because of the overuse of Data Transfer Objects (DTO).
"You would think that people start thinking something just gotta be wrong when a significant portion of their system is just about shuffling data in and out of DTOs. Or when the domain model is left completely atrophied and looks more like a relational database schema expressed in an awkward Java syntax. Or as their carefully crafted architecture just falls apart in a mumbo jumbo of completely unmaintainable procedural code."
Aren't patterns great! They present solutions to problems that people can take and use in totally inappropriate ways. It just goes to show that simply reading about patterns isn't enough, one must fully understand the patterns in all their glory. When should they be used, what problem are they meant to solve, what problem are they NOT meant to solve?
Steve Maine took some of the ideas I had for my AddressProxy and created a more generic Associate class. The results are quite nice. What's funny is I forgot exactly what the GoF Proxy pattern was, so I made it up on the fly as I was writing up my post. Anyway I like Steve's solution, I just hope Whidbey Beta 2 hits the wire soon so I can start using it on some "semi-production" apps I'm working on.
I'm also very interested in doing some more investigation on the DynamicProxy that Deyan pointed out in the comments of my post. That could ease my concerns over having to implement all the properties on my proxy (assuming I implement the proxy in the traditional GoF sense.)
Now that our CustomerRepository can save customers as well as associated addresses in a decoupled manner we need to move onto the next problem, reconstituting our customer object with an address. As others have mentioned we may want to Lazy Load the address since there will be instances when it isn't needed.
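public class Customer {
    public int addressKey;
    private Address address; // cached after the first load

    public Address Address {
        get {
            // Lazy load: hit the repository only on first access.
            if (address == null) {
                address = new AddressRepository().Load(addressKey);
            }
            return address;
        }
    }
}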
Ok, so the above would work but is it really what we want? Using the above method we introduce a coupling between our Customer class and the AddressRepository. I'd like to remove the coupling to allow us to support different address repositories as we can with the .Save within our CustomerRepository (via RepositoryFactory). I'd also like to remove any knowledge of repositories from my domain objects. Although it isn't always possible I prefer for application code to have knowledge of repositories NOT my domain objects. So how can we remove the coupling to the AddressRepository AND remove the Repository outright?
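public class Customer {
    private int addressKey;
    private Address address;

    public int AddressKey {
        get { return addressKey; }
        set { addressKey = value; }
    }

    // Plain property: the repository decides whether to populate it.
    public Address Address {
        get { return address; }
        set { address = value; }
    }
}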
One option is to have two separate methods on our CustomerRepository for returning Customer objects with and without Addresses loaded.
public class CustomerRepository : DomainRepository {
    // Returns the customer without its address (Address stays null).
    public Customer Load(int customerKey) {
        Customer customer = RetrieveFromDataStore(customerKey);
        return customer;
    }

    // Returns the customer with its address already populated.
    public Customer LoadWithAddress(int customerKey) {
        Customer customer = Load(customerKey);
        DomainRepository addressRepository = this.RepositoryFactory.GetRepository(typeof(Address));
        // The cast assumes the base repository's Load returns a general type.
        customer.Address = (Address)addressRepository.Load(customer.AddressKey);
        return customer;
    }
}
The negative to the above is that we have two different methods for loading customers. If we use .Load() and then try to access the Address property we get a null, which could cause users of the class to think that the Customer doesn't have an address, which is not the case. On the positive side, using the above design forces the users of the class to think about how the customer is going to be used. Rather than lazy loading the address we have it pre-populated, which ensures we don't run into performance problems due to users not realizing that the property is lazy loaded and causing additional hits to the database. What other options do we have? Perhaps introducing a proxy object could help?
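public class Customer {
    private AddressProxy addressProxy;

    // Customer still exposes a plain Address; the proxy does the lazy load.
    public Address Address {
        get { return addressProxy.Address; }
    }

    // Set by the repository when the customer is loaded.
    protected internal AddressProxy AddressProxy {
        set { addressProxy = value; }
    }
}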
public class AddressProxy {
    private int addressKey;
    private DomainRepository addressRepository;
    private Address address;

    public AddressProxy(int addressKey, DomainRepository addressRepository) {
        this.addressKey = addressKey;
        this.addressRepository = addressRepository;
    }

    public Address Address {
        get {
            // Load on first access, then reuse the cached instance.
            if (address == null) {
                address = (Address)addressRepository.Load(addressKey);
            }
            return address;
        }
    }
}
And our CustomerRepository Load methods change to:
public class CustomerRepository : DomainRepository {
    public Customer Load(int customerKey) {
        // Read the customer row itself; calling Load here would recurse forever.
        Customer customer = RetrieveFromDataStore(customerKey);
        DomainRepository addressRepository = this.RepositoryFactory.GetRepository(typeof(Address));
        customer.AddressProxy = new AddressProxy(customer.AddressKey, addressRepository);
        return customer;
    }
}
The negative of this solution is that we're introducing the proxy class into the Customer, which is a little messier than I'd like. It'd be nice if our AddressProxy could inherit from our Address and provide a nice way of handling the loading of itself so that the Customer didn't have to use the AddressProxy. We could have the proxy override all the properties of the address but that would be pretty ugly for all but the simplest of classes. I'm still not real pleased with the solution that I came up with here but it's at least a start. Perhaps all of my wonderful readers can help improve the "design"?
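For what it's worth, here is a rough sketch of the "proxy inherits from Address" idea. It is only an illustration under some assumptions: the Address here has a single made-up City property declared virtual, DomainRepository is reduced to a hypothetical abstract Load(int), and CountingAddressRepository is an invented in-memory stand-in so the lazy load can be observed.

```csharp
using System;

// Hypothetical Address: members are virtual so a proxy can override them.
public class Address {
    private string city;
    public Address() { }
    public Address(string city) { this.city = city; }
    public virtual string City {
        get { return city; }
    }
}

// Stand-in for the post's DomainRepository; only Load is sketched.
public abstract class DomainRepository {
    public abstract object Load(int key);
}

// The proxy *is* an Address, so Customer never sees a proxy type.
public class AddressProxy : Address {
    private readonly int addressKey;
    private readonly DomainRepository addressRepository;
    private Address target; // the real Address, loaded on first use

    public AddressProxy(int addressKey, DomainRepository addressRepository) {
        this.addressKey = addressKey;
        this.addressRepository = addressRepository;
    }

    private Address Target {
        get {
            if (target == null) {
                target = (Address)addressRepository.Load(addressKey);
            }
            return target;
        }
    }

    // Every member has to be forwarded like this, which is the ugly part.
    public override string City {
        get { return Target.City; }
    }
}

// Invented in-memory repository that counts how often it is hit.
public class CountingAddressRepository : DomainRepository {
    public int Loads;
    public override object Load(int key) {
        Loads++;
        return new Address("Springfield");
    }
}
```

With this shape a repository could hand the Customer a `new AddressProxy(key, repository)` wherever an Address is expected, and nothing gets loaded until a property such as City is read. The cost is exactly the one mentioned above: each property needs its own forwarding override.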
There has been a lot of discussion recently about how to handle aggregates that contain other aggregates when building models using DDD. The discussion started a couple of weeks ago when Udi posted his proposed solution. Udi suggested that we should add a .Save() method to our domain object that does nothing other than fire an event that will allow any interested "parties" to catch the event and do the appropriate processing. I don't like the idea of adding methods to our domain objects that only fire events. Having a .Save() method on an object implies that calling that method will save the domain object to the data store. Additionally, if we're following DDD we try to avoid adding data-access-like behavior to any of our domain objects. We leave the data access "logic" to our repositories. Let's quickly review the scenario that Udi presented (which came from a comment on one of my earlier DDD posts). We have a Customer object which has an associated address.
public class Customer {
    // ... other props...
    public Address Address {
        get { return _address; }
        set { _address = value; }
    }
}
When we save the customer we also want to save the associated address. Now if we're following DDD we know that Repositories will play a large role in the saving of our domain objects. Each of our Aggregates should have a Repository that is responsible for handling all data related tasks for all the objects in the aggregate. If address is part of the Customer aggregate we have nothing to worry about since the CustomerRepository would then be responsible for saving the address itself. For argument's sake let's continue with the assumption that the Address class is the root of its own aggregate and has its very own Repository. How should the CustomerRepository handle the address when the customer is saved?
Since we're following DDD we should embrace the fact that we're going to be using repositories to save our domain objects.
public class CustomerRepository : DomainRepository {
    public bool Save(Customer customer) {
        SaveCustomerToDataStore(customer);
        AddressRepository addressRepository = new AddressRepository();
        addressRepository.Save(customer.Address);
        return true;
    }
}
As Steve Maine pointed out in one of his follow up posts, including the AddressRepository directly in the CustomerRepository creates a dependency between our repositories that we don't want. What if all of a sudden the address needs to be saved by a different repository? How do we introduce a mock address repository into the equation during testing? Rather than hard coding the repository, we should implement a RepositoryFactory class. The RepositoryFactory will have the responsibility of knowing what repository should be used for each type of domain object. The knowledge will either be provided during initialization, via a configuration file, or perhaps will be covered by a framework such as PicoContainer. By introducing a factory into our design we decouple the AddressRepository from the CustomerRepository.
public class CustomerRepository : DomainRepository {
    public bool Save(Customer customer) {
        SaveCustomerToDataStore(customer);
        DomainRepository addressRepository = RepositoryFactory.GetRepository(typeof(Address));
        addressRepository.Save(customer.Address);
        return true;
    }
}
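The factory itself isn't shown here, so as a rough sketch of what "knowing what repository should be used for each type" might look like, assume a simple lookup keyed by domain type that gets filled in during initialization. The Register method, the dictionary, and the cut-down DomainRepository and AddressRepository below are all made up for illustration; only the GetRepository(Type) call comes from the code above.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the post's DomainRepository base class.
public abstract class DomainRepository {
    public abstract bool Save(object domainObject);
}

public class Address { }

public class AddressRepository : DomainRepository {
    public override bool Save(object domainObject) {
        // A real implementation would write to the data store here.
        return domainObject is Address;
    }
}

// Hypothetical factory: maps a domain type to the repository that handles it.
public static class RepositoryFactory {
    private static readonly Dictionary<Type, DomainRepository> repositories =
        new Dictionary<Type, DomainRepository>();

    // Called during initialization (or driven by a configuration file).
    public static void Register(Type domainType, DomainRepository repository) {
        repositories[domainType] = repository;
    }

    public static DomainRepository GetRepository(Type domainType) {
        DomainRepository repository;
        if (!repositories.TryGetValue(domainType, out repository)) {
            throw new InvalidOperationException(
                "No repository registered for " + domainType.FullName);
        }
        return repository;
    }
}
```

At startup the application would call something like `RepositoryFactory.Register(typeof(Address), new AddressRepository())`, and from then on the CustomerRepository only ever asks for "whatever saves an Address". Throwing on an unregistered type is just one choice; returning null would work too, but it pushes the failure further away from its cause.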
Since we may not always want to use the default RepositoryFactory we should use dependency injection to allow users of the CustomerRepository to change the factory that is used.
public class CustomerRepository : DomainRepository {
    private IRepositoryFactory repositoryFactory;

    // The factory is injected so callers (and tests) can supply their own.
    public CustomerRepository(IRepositoryFactory factory) {
        this.repositoryFactory = factory;
    }

    public IRepositoryFactory RepositoryFactory {
        get { return repositoryFactory; }
    }

    public bool Save(Customer customer) {
        SaveCustomerToDataStore(customer);
        DomainRepository addressRepository = this.RepositoryFactory.GetRepository(typeof(Address));
        addressRepository.Save(customer.Address);
        return true;
    }
}
With our constructor in place we can very easily inject the proper repository factory into our CustomerRepository. This will allow us to swap out our repository during the testing of our components and will allow us to keep our repositories decoupled, both good things. Next on the plate is the loading of our customer and its associated address, which will come in another post....
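To make the testing benefit a bit more concrete, here is a sketch of swapping in a stub factory. The types are trimmed-down stand-ins rather than the real classes: DomainRepository is reduced to an abstract Save(object), the customer's own persistence is left out, and StubRepositoryFactory and RecordingRepository are invented test doubles.

```csharp
using System;

public class Address { }

public class Customer {
    private Address address = new Address();
    public Address Address {
        get { return address; }
    }
}

// Stand-in for the post's DomainRepository base class.
public abstract class DomainRepository {
    public abstract bool Save(object domainObject);
}

public interface IRepositoryFactory {
    DomainRepository GetRepository(Type domainType);
}

public class CustomerRepository : DomainRepository {
    private readonly IRepositoryFactory repositoryFactory;

    public CustomerRepository(IRepositoryFactory factory) {
        this.repositoryFactory = factory;
    }

    public override bool Save(object domainObject) {
        Customer customer = (Customer)domainObject;
        // Saving the customer row itself is omitted from this sketch.
        DomainRepository addressRepository =
            repositoryFactory.GetRepository(typeof(Address));
        return addressRepository.Save(customer.Address);
    }
}

// Test double: records what it was asked to save instead of hitting a database.
public class RecordingRepository : DomainRepository {
    public object LastSaved;
    public override bool Save(object domainObject) {
        LastSaved = domainObject;
        return true;
    }
}

// Test double: hands back the same recording repository for every type.
public class StubRepositoryFactory : IRepositoryFactory {
    public readonly RecordingRepository Repository = new RecordingRepository();
    public DomainRepository GetRepository(Type domainType) {
        return Repository;
    }
}
```

A test can then build `new CustomerRepository(new StubRepositoryFactory())`, save a customer, and check that the stub's LastSaved is the customer's Address, all without a database or a real AddressRepository in sight.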
I had my yearly review recently. Overall it went pretty well. As with many of my previous reviews, one of the items that I was given under "opportunities" was to expand my knowledge in areas outside the Microsoft side of the world. This would include technologies such as Java, Oracle, etc. Now it's important to note that we're primarily a Microsoft shop. Since we're a consulting company, we do use whatever technologies the client desires; however, if we have the choice we usually go the MS route. Part of me likes the idea of spreading my expertise into other areas such as Java and Oracle but the other part of me says "why"? Why would I spend time trying to develop an expertise in technologies that we rarely use?
Since my long term goal is to be an "Architect" I think spending time learning these other technologies wouldn't be wasted; however, if I'm not getting projects to apply this knowledge to, it really isn't doing me that much good. Knowledge without experience doesn't really give me all that much, does it?