无为

Through non-action anything can be done; through non-action one reaches the greatest depth!


My favored definition of a data warehouse is a slightly modified version of Ralph Kimball's definition on page 310 of The Data Warehouse Toolkit:

A data warehouse is a copy of transaction data specifically structured for querying and reporting.

Ralph states that a data warehouse is "a copy of transaction data specifically structured for query and analysis". I have two quibbles with Ralph's definition: 1) Sometimes non-transaction data are stored in a data warehouse, though probably 95-99% of the data usually are transaction data. 2) I say "querying and reporting" rather than "query and analysis" because the main outputs from data warehouse systems are either tabular listings (queries) with minimal formatting or highly formatted "formal" reports. Queries and reports generated from data stored in a data warehouse may or may not be used for analysis. For more information about why the transaction data are copied, you may want to see my essay The Case for Data Warehousing.

What I especially like about Ralph's definition is what he does not say.

The form of the stored data has nothing to do with whether something is a data warehouse.


A data warehouse can be normalized or denormalized. It can be a relational database, multidimensional database, flat file, hierarchical database, object database, etc. Data warehouse data often gets changed. And data warehouses often focus on a specific activity or entity.
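To make the definition concrete, here is a minimal sketch in Python using the standard `sqlite3` module. It is only an illustration under stated assumptions: the table names (`customers`, `orders`, `sales_report`), the columns, and the sample rows are all hypothetical and do not come from Kimball's book. It copies a few transaction rows into a table structured for querying and reporting; here the copy happens to be denormalized and relational, but by the argument above it could just as well take another form.

```python
import sqlite3


def build_warehouse_copy(conn):
    """Copy hypothetical transaction data into a table shaped for reporting."""
    cur = conn.cursor()
    # Operational (transaction) tables. Names and rows are made up for
    # illustration only.
    cur.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER,
            amount REAL
        );
        INSERT INTO customers VALUES (1, 'East'), (2, 'West');
        INSERT INTO orders VALUES (10, 1, 100.0), (11, 1, 50.0), (12, 2, 75.0);
    """)
    # The "warehouse" in this sketch: a copy of the transaction data,
    # flattened so that reports need no joins.
    cur.execute("""
        CREATE TABLE sales_report AS
        SELECT o.id AS order_id, c.region AS region, o.amount AS amount
        FROM orders o JOIN customers c ON c.id = o.customer_id
    """)
    conn.commit()


def sales_by_region(conn):
    """Run a simple tabular report against the copied data."""
    return conn.execute(
        "SELECT region, SUM(amount) FROM sales_report "
        "GROUP BY region ORDER BY region"
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_warehouse_copy(conn)
    for region, total in sales_by_region(conn):
        print(region, total)
```

The report reads only from the copy, so the operational tables could be restructured or taken offline without affecting it, which is one common reason for copying the data in the first place.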

Data warehousing is not necessarily for the needs of "decision makers" or used in the process of decision making.


Of course, if you want to define every user as a decision maker and all activities as decision making processes, then my assertion is false. But in my experience, the overwhelming majority of uses of data warehouses are for quite mundane, non-decision making purposes rather than as grist for making decisions with wide ranging effects (so-called "strategic" decisions). In fact, I would assert that most data warehouses are used for post-decision monitoring of the effects of decisions (or, as some people might say, for "operational" issues). By the way, this is not to say that using data warehousing in the decision making process is not a wonderful, potentially high return effort. But my caution is that though the trade press, vendors, and many industry experts trumpet the role of data warehousing vis-à-vis decision making, this is an area that in reality we do not clearly understand. (See the writing of Peter Keen for more on this perspective.)

 



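All articles bearing this mark are original works by this blog's author, Caoer (草儿). If you index, bookmark, or repost them, please credit the source and the original author. Thank you very much.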

posted on 2006-05-25 22:01 by 草儿 | Category: BI and DM
