This post explains about what Data warehousing is and why we should care about it and then speaks about importance of using in enterprises and its design. It is important to understand the what data warehousing is all about, As a database developer you are likely to be involved with Data warehousing in one way or the other.
Data Warehousing Defined:
Data warehousing is a methodological approach for organizing and managing enterprise data to provide a trustworthy,consistent, integrated data foundation for an enterprise’s data-driven applications.
Data warehousing is must for running an enterprise of any size to make intelligent decisions. It enables the competitive advantage.Data warehousing is essentially tells about data and its relationships and it is foundation for Business Intelligence(BI). It clearly draws the distinction between data and information.
Data consists of recorded “facts” – for example, Sales amounts initiated by a customer.
Information involves interpreting facts, identifying the relation between them and find the more abstract meaning. Each characteristic, such as customer, store, date could serve as predicate in queries.
Data warehousing emphasizes organizing, standardizing and formatting facts in such a way that we can derive information from them. BI is then concerned about acting on that information.
Data Warehouse Design
Building a data warehouse requires addressing the following technical and non-technical issues.
- Determine the organizational goals and objectives
- Identification of various audiences and their requirements
- Identify the ETL tool for your data warehousing
- Methods of end user accessing the information, including both reporting and analysis.
The primary goal of any data warehouse is to integrate data from disparate sources into a centralized store, where that data can be used across the enterprise for decision support.
There are two predominant architectures in data warehousing.
Top-Down approach of Inmon
Inmon – Father of DWH, have invented the term DWH in 1990. He characterized the DWH as follows
- Subject-Oriented : Major entities which are common across multiple OpApps.
- Integrated: Data sources are consistent with one another
- Nonvolatile: Data once loaded is usually never changed.
- Time-Variant: Time is the part of the key for everything.
The emphasis is on centralized, normalized data store.
Since the complexity of a normalized model does not directly query for ease of use and performance , this architecture provides various datamarts, which are derived databases whose structure is optimized for query which generally contains the aggregated data derived from data warehouse.
The Bottom-Up Approach of Kimball
A key part of this approach is “conformed dimensions”.
Dimensional model- held to represent the most elegant tradeoffs end-user integrity and ease of use.
The idea is to build the dwh incrementally , something most organizations find more easier than building all at once. The term conformed dimension concentrate on dimensions and grain of measures.
Hope you understand, what DWH is and its importance of design and approaches.