The main difference is that creating a Dimension table by combining multiple sets of Hubs, Links and Satellites requires some creativity around navigating through the model.įor example: which path to take through the model, preventing cycling back to tables that are already incorporate – that sort of thing. The difference between PIT tables and Dimension tables isn’t really big and the logic to create both is very similar. But if you need that bit of additional performance loading a Dimension they may come in handy. As a result, I don’t really use PIT structures that often. If in any way possible, I always try to generate a Dimensional model directly off the Data Vault and only use PIT tables if there is really a requirement for additional performance. multiple Hubs, Links and their Satellites). A Dimension is broader than this and can contain more historised sets (e.g. To keep some differentiation between the two let’s agree that a PIT table combines history from its direct surrounding tables (all Satellites for a Hub for example). In the end, PIT tables are meant as performance enhancers that need to be fit-for-purpose.Īs an interesting observation: the more attributes you add to a PIT table, the more the result starts to look like a Dimension (and vice-versa). However, it is perfectly reasonable to add attributes or even business logic if it makes sense for you. The default PIT table is a combination of the various Satellite key and date attributes against the lowest defined grain (the entity, Hub). In other words, you can present the PIT snapshot very easily if you have the combined full history of changes available. The reasoning for this is that a ‘state per interval’ snapshot can always be derived from the full history, and by (re)using the same techniques described in this post. While this approach fits various use-cases, in this post I will focus more on combining all history from the involved tables at the most detailed level. days as per midnight or something like that).\ The PIT explanations in the available books are geared towards storing and maintaining snapshots of ‘state’ for data sets at various intervals (e.g. all attributes) requires a historical / Type 2 perspective.Ī disclaimer is needed for the PIT tables as well: if you have read any of the Data Vault books you may notice that my interpretation of a PIT table is slightly different. For this post I will assume the full Dimension (i.e. How the history of changes is represented can be customised for individual attributes as well, and this is relatively easy once the base data sets are brought together. The resulting Dimension in this example contains the full history of changes for every attribute – completely ‘Type 2’ in Dimensional Modelling terms. This post outlines how merging time-variant data can be applied to Data Vault in order to create Point-In-Time (PIT) and Dimension tables. They can generally be referred to as gaps and islands of time (validity) periods. Merging two or more historised (time-variant) data sources, such as Satellites, reuses Data Warehousing concepts that have been around for many years and in many forms. A given Hub on average has 3 or 4 Satellites, and is it useful to see the full history of changes for that specific Hub across all Satellites at the very least. Merging time-variant dataīeyond creating Hubs, Links and Satellites and current-state (Type 1) views off a Data Vault model, one of the most common requirements is the ability to represent a complete history of changes for a specific business entity (Hub, Link or groups of those). This paper builds on the concepts outlined here and also touches on generating PIT, Type 1, 2 – basically any kind of delivery from a Data Vault model. It is called ‘a pattern for Data Mart delivery’ and is available here. If you are interested in this topic I highly recommend the broader white paper on managing time-variance that I wrote a bit later than this original post. Edit : I reviewed this paper on relevance today and made a few minor tweaks.
0 Comments
Leave a Reply. |
AuthorWrite something about yourself. No need to be fancy, just an overview. ArchivesCategories |