Bill Inmon’s DW2.0™ Presentation

This past Tuesday I attended a presentation by Bill Inmon concerning his new concept: DW2.0™. I already knew a little about Inmon’s methodologies at a high level and, ironically enough, have recently become familiar with his “Corporate Information Factory” (CIF) approach because of my recent Kimball training in Chicago. I’ve never really put a lot of faith in most of Inmon’s methodologies, frankly because they seem to make more sense in the world of theory than they do in application. Kimball works and it works well. I could devote an entire article to the differences between the Kimball and Inmon methodologies, but that’s out of scope; maybe I’ll tackle that another day.

The scope of this article, however, is the Inmon presentation. Or maybe I should call it a sales pitch. Regardless, the presentation focused not on CIF but rather on DW2.0™, Inmon’s new trademarked data warehouse approach. He spent surprisingly little time on the actual architecture of DW2.0™, instead spending much of the time telling us what was not a data warehouse. Turns out a bottle of water and a dog are not data warehouses, despite the fact that anyone can claim they are. I can’t say I didn’t learn anything.

Here are the key points I took from the DW2.0™ portion of his presentation:

  • Inmon invented data warehousing. Hardware and software vendors came along and capitalized on it without staying true to his vision.
  • Anyone can now claim to be a data warehouse professional without following Inmon’s methodologies.
  • Only the most recent data in a data warehouse is actually used. Most users access only 0.05% of the total volume of data in the warehouse. Aging data should be archived onto inexpensive media.
  • Hardware and software vendors want data warehouses to be as large as possible so they can sell you disk and disk management at exorbitant prices.
  • Only Inmon data warehouses are true data warehouses; all others are fakes.

Inmon seemed frustrated during the presentation, depicting hardware and software vendors as cash-sucking vampires, out to drain all the money from your organization. His claim, as I understood it, is that he created the concept of the data warehouse and then the hardware and software vendors came along, marketed the hell out of it, and then made a ton of money in the process. It seems as if he feels he lost control of the data warehouse and is now out to rectify that.

That’s where DW2.0™ comes in.

DW2.0™ is Inmon’s copyrighted and trademarked baby. Since he missed the boat on trademarking the term “data warehouse” all those years ago, he’s now attempting to do just that, only he has to use a different name. According to Inmon, the term “data warehouse” is in the public domain now and has lost any exclusive connection to his methodologies. Companies producing “Active” and “Federated” data warehouses have taken their own approach to data warehousing, as has Ralph Kimball with the “data mart” data warehouse. Inmon also claimed that a data mart data warehouse (aka Kimball) is not a data warehouse. It is, but I don’t want to lose focus here.

What I find ironic is that Inmon claimed that hardware and software vendors were out to milk as much money from your company as they could, that they were looking out for their best interest and not yours. He then went on to mention that he offers DW2.0™ certification (for a fee), that he has a specialized ETL tool coming out soon (for a fee), and that he has another book coming out as well (for a fee).

That’s about all I really got from the DW2.0™ section. The rest of his presentation centered on unstructured data, such as free-form e-mails and text documents. He cited Gartner as saying that 80% of a company’s data is unstructured, with structured (transactional) data making up only 20%. His DW2.0™ data warehouse contains both structured and unstructured data, and he claims that this unstructured data is immensely valuable.

I don’t disagree that some very large companies might find this useful, particularly those that already have a data warehouse up and running. Most companies, in my opinion, rely on their transactional data to run their business and that’s really all they care about. I find it hard to believe that for most companies it’s worth the time, effort, and money to warehouse this unstructured data, even after transforming it into a quantifiable format. Sheer volume of data doesn’t necessarily mean that the data is valuable; half or more of your unstructured data might be garbage. The ROI on this seems negligible, at best. Also, there’s no reason that a Kimball-designed warehouse couldn’t handle this data as well as or better than Inmon’s warehouse.

Overall I found Inmon’s demeanor a bit condescending and very frustrated. Maybe that was just the mood he was in; I don’t know him personally, so I can’t fully judge his character from one presentation. DW2.0™ seems to be a late-in-the-game effort to regain control of the fragmented data warehouse name. I didn’t see what it brought to the table, but that might have been because it was presented at such a high level. Again, this is what I got from the presentation; I haven’t had an opportunity to study it in more detail. I also didn’t see how it was better than Kimball’s methodologies, of which I am an unabashed fan. Maybe that clouds my judgment a bit, but I still tried to go in with an open mind. Even so, I didn’t hear or see anything that would make me want to throw away my Kimball books; if anything I feel like clutching them just a little tighter.
