Building a Scalable Data Warehouse with Data Vault 2.0 (English) Paperback – October 5, 2015
Product description
From the publisher
The Data Vault was invented by Dan Linstedt at the U.S. Department of Defense, and the standard has been successfully applied to data warehousing projects at organizations of different sizes, from small to large-size corporations. Due to its simplified design, which is adapted from nature, the Data Vault 2.0 standard helps prevent typical data warehousing failures.
"Building a Scalable Data Warehouse" covers everything one needs to know to create a scalable data warehouse end to end, including a presentation of the Data Vault modeling technique, which provides the foundations to create a technical data warehouse layer. The book discusses how to build the data warehouse incrementally using the agile Data Vault 2.0 methodology. In addition, readers will learn how to create the input layer (the stage layer) and the presentation layer (data mart) of the Data Vault 2.0 architecture, including implementation best practices. Drawing upon years of practical experience and using numerous examples and an easy-to-understand framework, Dan Linstedt and Michael Olschimke discuss:
- How to load each layer using SQL Server Integration Services (SSIS), including automation of the Data Vault loading processes.
- Important data warehouse technologies and practices.
- Data Quality Services (DQS) and Master Data Services (MDS) in the context of the Data Vault architecture.
- Provides a complete introduction to data warehousing, applications, and the business context so readers can get up and running fast
- Explains theoretical concepts and provides hands-on instruction on how to build and implement a data warehouse
- Demystifies data vault modeling with beginning, intermediate, and advanced techniques
- Discusses the advantages of the data vault approach over other techniques, including the latest updates in Data Vault 2.0 and multiple improvements to Data Vault 1.0
Most helpful customer reviews on Amazon.com
I've read the Supercharge Your Data Warehouse book from Dan, I've read all of Hans Hultgren's books, I've read and watched DV fundamentals training material, and I never really felt like I fully understood everything end to end. I grasped the big picture and was excited about the possibilities, but working alone after all of that I felt like I still didn't really have the tools I needed to complete a data vault on my own. With this book, I finally do!
This book takes you from concepts to implementation, from beginning to end. There are actual screenshots of how to create databases, what indexes to put on your vault tables, how to create an SSIS package, T-SQL code, Master Data Services and Data Quality Services examples, there's MDX code, all of it. And it doesn't just give you the details to get to a data vault and then leave you on your own to figure out the info marts; it takes you all the way to putting data in your dimension and fact tables.
All in all, this is the most complete Data Vault book that's ever been created and it's a fantastic value for the money. It's like 684 pages. Once I started, it was literally 8 hours a day for 4-5 days of plowing through all the information and going back and re-reading and getting more details out of it before I was done. It's not a book you can just flip through in a couple of hours. I HIGHLY recommend this book for anyone interested in data vault or data warehousing in general.
Using the MD5 hash key is brilliant, since it creates a unique 32-character surrogate key based upon the business key(s) phrase, which documents how the hash key was created. I am going to leverage it in PowerPivot, which requires a single key for BISM relationships, and if the business keys change, so will the hash key.
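The hash key idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `hash_key`, the trimming and upper-casing of the business keys, and the `;` delimiter are hypothetical choices, not conventions confirmed by the review or taken from the book.

```python
import hashlib


def hash_key(*business_keys: str, delimiter: str = ";") -> str:
    """Derive a 32-character hex MD5 digest from one or more business keys.

    Assumptions for illustration only: keys are trimmed and upper-cased,
    then joined with a delimiter so that ("A", "B") and ("AB",) differ.
    """
    normalized = delimiter.join(key.strip().upper() for key in business_keys)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest().upper()


if __name__ == "__main__":
    # The same business key always yields the same surrogate key ...
    print(hash_key("C-1001"))
    # ... and a changed business key yields a different one.
    print(hash_key("C-1002"))
    # A composite business key is hashed as one delimited phrase.
    print(hash_key("C-1001", "2015-10-05"))
```

Because the key is computed from the business key alone, any tool that can compute MD5 can derive it without a lookup table, which is what makes a single-column relationship possible in the scenario the reviewer mentions.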
As per Number2: "This book takes you from concepts to implementation, from beginning to end. There are actual screen shots of how to create databases, what indexes to put on your vault tables, how to create an SSIS package, TSQL code, Master Data Services and Data Quality Services examples, there's MDX code, all of it. And it's not just details to get you to a data vault and then leaving you on your own to figure out the info marts, this takes you all the way to putting data in your dimension and fact tables."
I would probably move chapter 3 to the end of the book and treat it as a bonus, since the rest of the book is the meat and potatoes that made us purchase it. I also HIGHLY recommend this book for anyone interested in data vault or data warehousing in general.
The fact that this book includes detailed implementation guidance for Data Vault via the Microsoft BI stack should not discourage non-Microsoft industry people from reading it. Here’s why: As Data Vault has matured and evolved as a methodology, no book besides this one has covered the state of the art in a way that combines such narrative clarity and technical depth. Also, Microsoft is a fine BI platform.
Moreover, the DV 2.0 method details, including the transition to MD5 hash keys (vs. auto-numbers), are well documented right down to the level of code samples. The ample ETL code samples are also of enormous benefit. With them, an ETL engineer can pretty quickly appreciate not only how to load the Data Vault (pretty easy), but also how to efficiently pull data out of it (harder, admittedly) and load downstream layers with Business Rules and/or fact and dimension tables or views.
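To make the "pretty easy" loading side concrete, here is a hedged sketch of an insert-only hub load. It is not the book's SSIS or T-SQL code: it uses Python with an in-memory SQLite database purely for illustration, and the table `hub_customer`, its columns, and the helpers `create_hub` and `load_hub` are hypothetical names.

```python
import hashlib
import sqlite3
from datetime import datetime, timezone


def hash_key(business_key: str) -> str:
    # Assumed normalization: trim and upper-case before hashing.
    normalized = business_key.strip().upper()
    return hashlib.md5(normalized.encode("utf-8")).hexdigest().upper()


def create_hub(conn: sqlite3.Connection) -> None:
    # Hypothetical hub layout: hash key, business key, load date, record source.
    conn.execute(
        "CREATE TABLE hub_customer ("
        " customer_hk TEXT PRIMARY KEY,"
        " customer_bk TEXT NOT NULL,"
        " load_date TEXT NOT NULL,"
        " record_source TEXT NOT NULL)"
    )


def load_hub(conn: sqlite3.Connection, staged_keys, record_source: str) -> int:
    """Insert business keys not yet in the hub; return how many were new."""
    load_date = datetime.now(timezone.utc).isoformat()
    rows = [
        (hash_key(key), key.strip().upper(), load_date, record_source)
        for key in staged_keys
    ]
    before = conn.execute("SELECT COUNT(*) FROM hub_customer").fetchone()[0]
    # Existing hash keys are skipped, so hub rows are never updated or deleted.
    conn.executemany(
        "INSERT OR IGNORE INTO hub_customer"
        " (customer_hk, customer_bk, load_date, record_source)"
        " VALUES (?, ?, ?, ?)",
        rows,
    )
    conn.commit()
    after = conn.execute("SELECT COUNT(*) FROM hub_customer").fetchone()[0]
    return after - before


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_hub(conn)
    print(load_hub(conn, ["C-1001", "C-1002"], "crm"))
    print(load_hub(conn, ["c-1002 ", "C-1003"], "erp"))
```

Since the hash key is derived from the business key rather than from a sequence, the load needs no lookup of a generated number, which is one reason such loads are often described as easy to run in parallel.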
Returning for a moment to the Microsoft-specific implementation sections of this book: it's exciting to see this content, and it reminds me of when The Kimball Group published “The Microsoft Data Warehouse Toolkit: With SQL Server 2008 R2 and the Microsoft Business Intelligence Toolset”, in which they so aptly described the implementation of their compelling message for dimensional modeling in the Microsoft BI platform of that time. Soon afterwards, Microsoft teams were running fast with dimensional models, and the rest is history. I hope that this excellent book will help to create similar traction for Data Vault. When armies of Microsoft implementers are using a given method, it has indeed hit the mainstream.
Underneath Data Vault’s time-tested methods for handling real-world data ugliness with, oh yes, enforced referential integrity and without breaking, providing logical interoperability between increasingly disparate source data, and the need for fast, easily parallel loading, lies an elegant, wonderfully simple set of design patterns that revolutionize the speed and flexibility with which enterprises can build and support sustainable data integration in our new world of gigantic, oddly structured Big Data and NOSQL, and the seemingly unquenchable demand for analytic insight.
One minor reservation: although the Microsoft BI Stack is broad and strong, I don't personally regard their Data Quality Services (DQS) to be a particularly useful tool, notwithstanding the interesting implementation coverage it gets in this book. Still, it is a minor issue, and not directly related to Data Vault architecture anyway.
Data solution architects and BI developers who do not understand Data Vault are, in my view, missing out on a compelling architectural choice for agile RDBMS data integration. For readers ready to take it to the next level, especially insofar as you must tolerate the initial discomfort over the proliferation of tables (> 2x your source tables), this book will assuredly take you a long ways down the road, and you will be rewarded, even astonished, at the flexibility and sustainability that Data Vault affords you once it gets into your blood.
There are many examples using SSIS to implement the Data Vault. These did not apply to me as I am implementing it in Hadoop/Hive, so I cannot speak to their efficacy. I do wish there was more documentation out there on details of implementing the Data Vault using Hadoop, but even web resources are limited.
Overall I highly recommend this book to anyone who is interested in Data Vault.