Monday, August 31, 2009

Temporal Database Design, Soft Deletes, and Ayende's Razor

This is a formal follow-up to a post on Ayende's blog on "Avoiding Soft Delete's" in your database where I question the lack of temporal database solutions being applied to these types of problems.

After Oren claimed the following, I felt it necessary to expand on it in a blog post of my own, rather than continuing to clutter his comments, and hopefully finally bring some traffic to my own blog :-)

Ayende’s Razor

This is a response to a comment on another post:

Oren, in all seriousness, I thought that problems that were "(a) complex, (b) hard to understand (c) hard to optimize" were the kinds that folks like you and I get paid to solve...

Given two solutions the match the requirements of the problem, the simpler one is the better.

I could just as well call that statement "Jim's Razor", as I believe in it as much as you do Oren, so no arguments there.

But in the same vane, "wise" (i.e., experienced) software architects/developers strategically chose RDBMS's over flat text files for the same class of reasons that I believe we should be making temporal database concerns first-class citizens of the modern IT toolbox. The "additional features/functionality" gained by employing temporal databases, while never stated up front in requirements, would be priceless in the long run for business systems. Those features include, but are not limited to:
  • automatic audit logging, since nothing is ever UPDATE'd or DELETE'd, you've got a constant trail of changes
  • automatic support for infinite undo/roll-back support of data, as you simply load a prior version and then save as usual
  • automatic support for labeling of versions, much like in source/version control systems, at an individual record level, table level, "aggregate root level", or database level
  • automatic support for "back querying" a system, in search of what the situation looked like last month, last year, etc. (though raising this "aspect", as in AOP, to the ORM level would be crucial)
IMHO, switch/case statements are generally simpler than polymorphism (demonstrated by how switch/case statements are typically taught before polymorphism in academic settings), but we all know why polymorphism in the long run is the better strategy, and therefore why as soon as we see switch/case statements propagating their way into our code bases, we typically change to a polymorphic strategy.

Again, the goal in my eyes is to raise temporal database concepts to the level of first class citizens in the IT world, as opposed to the back water, academic debates that they are today. The major players like Microsoft with SQL Server have never bothered to implement the temporal extensions to ANSI-94 (yes, from 1994 - see this). Interestingly enough Oracle has implemented some of them via Oracle 9i and 10g, but per the work of Snodgrass and friends, still have room to perfect things.

This is also where the "minor players", i.e., the open source software community via projects such as NHibernate could step in and heavily promote something like this. In the same line of thinking of not wanting to constantly roll your own ORM, thus why you chose something such as NHibernate, needing to roll your own temporal solution for every project should be equally unnecessary.

In closing, for all of the "pain" of implementing something as complex as this topic would be, I'd love to see a platform such as Microsoft Dynamics (MS CRM) implement this all the way from the database through to the GUI, as it would clearly represent a paradigm shift in business information systems development. Of course perhaps I should just start a company to do just this...

7 comments:

Al said...

I'd love to see a full sample app that has examples of best practices in temporal database design.

I've found that using temporal designs adds a huge amount of complexity. And often the value added by temporal designs can be achieved outside of the OLTP system, e.g., in a data warehouse.

BigJimInDC said...

Al,

I would argue though that while you think that you can get similar functionality via data warehousing, it's simply not even close to as fully integrated of an experience as when using temporal designs.

Imagine using Subversion, SourceSafe, git, whatever, and every time you wanted to view something that wasn't the current version, you had to drop out of your current tool set, load a whole other tool set, and deal with history viewing, rollbacks, etc from a completely different tool set. The experience would suck hard and you'd be demanding a different version control repository.

It's simply my belief that similar concepts can be applied to OLTP data sets, ultimately enabling some crazy new usage scenarios, and easily solving some decades old recurring issues.

No argument on the added complexity though. I just stand by my original comment to Oren that it is the job of software developer to make complex things easy. NHibernate has made ORM tools easy (at least in my eyes). Someone will eventually make temporal stuff easy, and I'd love to be a part of that. I'm just surprised that it's taking so long to catch on.

Harry Chou said...

The value of adding the complexity of temporal design is that you get the temporal model as the first level model concept that make your model as clear as possible to the user. I am reading the 'Time and Time Again' series now, and I think the authors made it very clear why we need to build temporal concept in the app instead of using data warehouse. (http://www.information-management.com/issues/20070501/1082475-1.html)

Anonymous said...

Hello,

In oracle 11g:

1. CREATE FLASHBACK ARCHIVE fdal TABLESPACE USERS RETENTION 5 YEAR;

2. ALTER TABLE EMPLOYEES FLASHBACK ARCHIVE fdal;

3. SELECT *
FROM EMPLOYEE AS OF TIMESTAMP ('2009-01-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS')
WHERE ID = 100;

Regards,
jbe

Anonymous said...

Jim

I have the answer for the temporal challenge which completely removes the complication from the application (and therefore the developer). Does not require specialised SQL extension and provides full referential integrity. Fully handling both transaction time and valid time. So who do you take the solution to?

BigJimInDC said...

Well, if you're talking about something for NHibernate, then email Oren for help on getting it integrated into the NHibernate source tree, or whatever else he thinks makes sense.

OR... create a blog post of your own that documents the solution, and share the link to it here.

Martin Trevena said...

Hey guys
Love seeing someone talking about this subject.
I created an Temporal Database with soft deletes and soft associations using EntityFramework, SQL=server and some clever C# coding.

It was the most flexable application I had ever designed and could be extended with small changes to fit many industries.

I wrote this for a company was contracting for and now that I have moved on would like to rewrite this type of solution for commercial use.

I can see so many places this type of approach is just what is required but would like some feedback , is the real world ready for this type of solution or is it too radical for industry.

Would love to hear your thoughts.

Thanks
Martin