The word data warehouse was first coined by Bill Inmon in 1990. Data warehouse is a subject oriented, integrated and time variant and non-volatile collection of data. A data warehouse provides generalised and consolidated data in a multidimensional view. A data warehouse also provides online analytical processing tools. There are tools that help in interactive and effective analysis of data in a multidimensional space.
Generally data warehouse is a database which is separated from organization’s operational database. Frequent updating is not done in the data warehouse. It contains the consolidated historical data which helps the organisation to analyse its business. The data ware house systems helps to integrate the diversity of system applications.
Features of Data warehouse:
- Subject Oriented- It is subject oriented because it provides information about the subject rather than the organisation’s ongoing operations. The subjects can be product, customer, supplier, sales and revenue.
- Integrated- A data warehouse is constructed by integrating data from many different sources like relational databases, flat files etc.
- Time variant- The data gathered in a data ware house is identified with a particular time period.
- Non-volatile: Non-volatile means the previous data is not erased when new data is added to it. A data warehouse is kept separate from the operational database and frequent changes in operational database is not reflected in the data warehouse.
Example of a data ware house:
Consider an e-commerce website, that has a warehouse where all the data of the project,the demand for the products and procure it from the suppliers and store it in the warehouse suppose the supplier places the order the goods is immediately dispatched. Now consider if there is no data warehouse in the system or website. If any customer searches for the product details the e-commerce website will directly contact the supplier then automatically the strain is caused to the manufacturer to supply those products. On the other hand there would be delay in the delivery of the products to the customers. This would reduce in demand and also affect the business.
The primary reason to have a data warehouse for any company is to get an edge over its competitors.
The Data Warehouse Architecture
The Data warehouse is a three tier architecture:
- Bottom tier- the bottom tier of the data warehouse consists of Database server. It is the relational database system. Back end tools and some utilities are used to feed the data. The back end tools and utilities do the functions like extract, clean, load and refresh.
- Middle-tier- Here we have OLAP server which is implemented in a ways like:
By relational OLAP which is called as an extended relational database management system which maps the operations on multi dimentional data to standard relational operations.
By multi dimentional OLAP model which implements directly the multi dimentional data and operations.
- Top tier- This tier consists of front-end client layer which hold the query tools and reporting tools,analysis tools and data mining tools.
From the data warehouse architecture we have models like:
- Virtual Warehouse- view over an operational data warehouse is called virtual data warehouse.
- Data mart- it contains the subset of organisation wide data.
- Enterprise Warehouse- It collects all the information and the subjects spanning an entire organisation.