Hadoop: The Definitive Guide [Kindle Edition]

Tom White
5.0 out of 5 stars (3 customer reviews)

List price: EUR 30,89
Publisher's price, print edition: EUR 41,57
Kindle price: EUR 19,99 including VAT, with free wireless delivery via Amazon Whispernet
You save: EUR 21,58 (52%)



Amazon price by format
Kindle edition: EUR 19,99
Paperback: EUR 43,13


Product Description

From the Publisher

Ready to unlock the power of your data? With this comprehensive guide, you’ll learn how to build and maintain reliable, scalable, distributed systems with Apache Hadoop. This book is ideal for programmers looking to analyze datasets of any size, and for administrators who want to set up and run Hadoop clusters.

You’ll find illuminating case studies that demonstrate how Hadoop is used to solve specific problems. This third edition covers recent changes to Hadoop, including material on the new MapReduce API, as well as MapReduce 2 and its more flexible execution model (YARN).

  • Store large datasets with the Hadoop Distributed File System (HDFS)
  • Run distributed computations with MapReduce
  • Use Hadoop’s data and I/O building blocks for compression, data integrity, serialization (including Avro), and persistence
  • Discover common pitfalls and advanced features for writing real-world MapReduce programs
  • Design, build, and administer a dedicated Hadoop cluster—or run Hadoop in the cloud
  • Load data from relational databases into HDFS, using Sqoop
  • Perform large-scale data processing with the Pig query language
  • Analyze datasets with Hive, Hadoop’s data warehousing system
  • Take advantage of HBase for structured and semi-structured data, and ZooKeeper for building distributed systems


Customer Reviews

Most helpful customer reviews
1 of 1 people found the following review helpful
Format: Paperback | Verified Purchase
This is the best book for learning MapReduce programming. If you can only get a single book about Hadoop, get this one without hesitation. Be careful to get the 3rd edition, as it covers the latest API.
5.0 out of 5 stars Must have book 7 August 2014
By nOnO
Format: Paperback | Verified Purchase
The ultimate book for those who are interested in the Hadoop framework. Lots of clear and valuable explanations on how to set up and use a Hadoop cluster. I highly recommend this book!
5.0 out of 5 stars The best ever written book of computer science 5 October 2014
So easy to read!
Everything reads linearly and is well explained!
Achieving such clarity and pedagogy is the work of a great person.
Most helpful customer reviews on (beta): 4.1 out of 5 stars, 67 reviews
57 of 62 people found the following review helpful
3.0 out of 5 stars Fell short of my expectations. Source of much frustration. 25 August 2012
By Mark - Published on
Format: Paperback | Verified Purchase
I had read all the positive reviews and really had high hopes for the book, waited for the 3rd edition thinking it would be current, but I've mainly felt frustration in reading it once past the first few chapters.

References to the Bible in other reviews are apt. The book is a mishmash of chapters with a wide variety of styles and intents. The writing that gives the overview is great. But other chapters are a reference manual dump with little motivation. Still others try to be guided tutorials but lack important details (or have been outdated by changes). I wish it had been written with a clearer editorial point of view, or better organized into sections with similar purposes.

Keeping up with such a fast-moving project in a paperback book is no doubt a difficult task. I didn't feel the book did a good job of dealing with the changes that came with the shift to 1.x.

Most frustrating were the mentions of the "book's website" as a source of up-to-date information. Which website? Wouldn't it make sense to use a URL instead of the phrase "book's website"?

Minor complaint: I don't like code listings without filenames.

Expect to spend a lot of time looking on the web for stuff that should have been included in the book or at least documented with concrete URLs.

There are certainly examples of truly fine technical writing in the book. I just wish that level could have been maintained throughout the book.
62 of 68 people found the following review helpful
5.0 out of 5 stars My Experience Getting Certified In Hadoop 13 April 2014
By Big Data Paramedic - Published on
Format: Paperback | Verified Purchase
This book is the single best source to begin your career in Big Data development. However, it should not be your first entry point, or it will frustrate you. This review aims to help juniors and newbies who want to enter the big data world.

The Cloudera CCD-410 certification ranges from tough to very tough. Period.

TRAINING: Training is not mandatory. I took a relatively inexpensive course ($300) from edureka.in, an online training website in India. They give a good 10,000-foot overview and are very good for the price, but it is nowhere near enough to get certified. Check out their first session, available for free on YouTube. They do have steps to install your own VM, a simple project, Hive, Pig, etc. If time and money permit, I strongly suggest going to the official Cloudera training. It costs about $3000 and includes a free test voucher, so effectively about $2700. It saves you months of preparation time and gives you a distinct advantage over your peers, which should pay for itself.

Install a VM and try a few commands, including Pig and Hive commands. Also try Amazon Elastic MapReduce, which cuts down on manual typing and lets you focus on the coding itself.

LEARNING FROM THIS BOOK: After a training, start with this book. The first eight chapters are critical (approximately 300 out of 550 pages). If you are smart, sharp, and young, expect to read these eight chapters about three times; more is just fine. Add some time to read the rest of the chapters once or twice before the test, along with all the external links. If you are a busy professional, give yourself a six-month window to take the test. Knowing Java is a definite plus. Buy the Cloudera mock examination ($125) after getting comfortable and familiar with MapReduce. It is a nice resource: it explains every answer and links to where you can get more information. Just as an FYI, the real test was far more complex and difficult.

You will need to go through the example code, understand what each line does, why it is there, what happens if you comment out a line of the code. As an example,
return job.waitForCompletion(false) ? 0 : -1;

> What does waitForCompletion mean?
> Is a reduce job mandatory or optional?
> How many files will running a map job produce?
> Will the code compile, or will it fail at run time because of the data types?
> What will happen if you run the same job twice?
> What happens to the map data after the job?
> How does Hadoop handle huge files that cross block boundaries?
> What happens if you do not explicitly set a mapper or reducer?
> Will a combiner help in a given scenario?
> Which daemon decides the number of map tasks to run?
> How does Hadoop handle the blocks when a node crashes?
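
To make the combiner question above concrete, here is a minimal sketch in plain Java with no Hadoop dependency. The class name `CombinerSketch`, its method names, and the sample values are hypothetical and only for illustration. It assumes each inner list stands for the values one map task emits for a single key, and it shows why pre-aggregating per split is safe for a max but not for a naive mean.

```java
import java.util.List;

// Hypothetical illustration only: each inner list stands for the values
// that one map task emits for the same key.
class CombinerSketch {

    // Max of per-split maxima: safe, because max is associative and commutative.
    static int maxOfPartialMaxima(List<List<Integer>> splits) {
        int overall = Integer.MIN_VALUE;
        for (List<Integer> split : splits) {
            int partial = Integer.MIN_VALUE;      // what a combiner would emit for this split
            for (int v : split) {
                partial = Math.max(partial, v);
            }
            overall = Math.max(overall, partial); // what the reducer would compute
        }
        return overall;
    }

    // Mean of per-split means: differs from the true mean when split sizes differ.
    static double meanOfPartialMeans(List<List<Integer>> splits) {
        double sumOfMeans = 0;
        for (List<Integer> split : splits) {
            long sum = 0;
            for (int v : split) {
                sum += v;
            }
            sumOfMeans += (double) sum / split.size();
        }
        return sumOfMeans / splits.size();
    }

    // Combining (sum, count) pairs and dividing once at the end gives the true mean.
    static double meanFromSumAndCount(List<List<Integer>> splits) {
        long sum = 0;
        long count = 0;
        for (List<Integer> split : splits) {
            for (int v : split) {
                sum += v;
            }
            count += split.size();
        }
        return (double) sum / count;
    }

    public static void main(String[] args) {
        List<List<Integer>> splits = List.of(List.of(0, 20, 10), List.of(25, 15));
        System.out.println(maxOfPartialMaxima(splits));   // 25
        System.out.println(meanOfPartialMeans(splits));   // 15.0
        System.out.println(meanFromSumAndCount(splits));  // 14.0
    }
}
```

With these sample values the naive mean of means (15.0) differs from the true mean (14.0), while the max is unaffected by pre-aggregation, which is the kind of reasoning a scenario question about combiners tends to probe.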

This is an extension of the previous scenarios. You are given a small table and a simple SQL query (for example: select stationid, max(temp) from tableX). The answer choices are four sets of MapReduce code, and you have to choose the right one. Expect to read and understand MapReduce code that emulates how you do a distinct, a sum, an average, a max, a min, etc. According to the Cloudera website, these are the percentages of questions.

CHAPTER 3 : 17 Percent
CHAPTER 4 : 6 Percent
CHAPTER 5 : 7 Percent
CHAPTER 6 : 18 Percent
CHAPTER 7 : 6 Percent
CHAPTER 8 : 7 Percent
Pig/Hive/Sqoop/ZooKeeper : 8 percent combined (no HBase)
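
The SQL-to-MapReduce scenario described earlier in this review (a max of temp per stationid) can be sketched in plain Java without any Hadoop dependency. This is only an illustration of the map, shuffle, and reduce steps, not the Hadoop API or an exam answer. The class `MaxTempSketch`, its methods, and the "stationid,temp" input format are assumptions, as is the idea that the query groups by stationid.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical plain-Java emulation of map -> shuffle -> reduce.
// Assumes each input line has the form "stationid,temp".
class MaxTempSketch {

    // Map step: turn one input line into a (stationid, temp) pair.
    static Map.Entry<String, Integer> map(String line) {
        String[] fields = line.split(",");
        return Map.entry(fields[0].trim(), Integer.parseInt(fields[1].trim()));
    }

    // Shuffle step: group every emitted value under its key.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce step: collapse all temperatures of one station into their maximum.
    static int reduce(List<Integer> temps) {
        int max = Integer.MIN_VALUE;
        for (int t : temps) {
            max = Math.max(max, t);
        }
        return max;
    }

    // Runs the three steps in order and returns stationid -> max temp.
    static Map<String, Integer> maxTempByStation(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.add(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : shuffle(pairs).entrySet()) {
            result.put(group.getKey(), reduce(group.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> rows = List.of("s1,12", "s2,-3", "s1,30", "s2,7");
        System.out.println(maxTempByStation(rows)); // {s1=30, s2=7}
    }
}
```

Swapping the body of `reduce` for a sum, a count, or a min gives the other aggregates the review mentions. A distinct would instead emit each value as the key and ignore the grouped values.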

Chapter 2 has no percentage listed but is very important. Expect several questions from that chapter, since it gives a good overview. The remainder comes from the links that Cloudera suggests you read and get familiar with. Sqoop import syntax, creating a Hive table via Sqoop, and creating and populating a Hive table via Sqoop are must-knows.

I have heard the tiring argument that certification is purely academic. Tell that to your doctor or your dentist. Sound fundamentals are the foundation of real-world experience. Big Data is no different. Understanding the basics will give you confidence; experience will follow while you keep your client happy.

My interest in Big Data was sparked by the Harvard Business Review article claiming that "Data Scientist" was the hottest job of the 21st century. Follow that by googling "Rayid Ghani", claimed to be the data scientist behind Obama's second-term victory.
hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century/ar/1

> Coursera provides a free course, "Introduction To Data Science". I signed up for their first batch but could not finish because of office commitments.
> Search YouTube for "Stanford University Hadoop" by Amr Awadallah

I was impressed with these books; you might also like them.
> Big Data: A Revolution That Will Transform How We Live, Work and Think
> Big Data at Work: Dispelling the Myths, Uncovering the Opportunities
> Data Science for Business: What you need to know about data mining and data-analytic thinking

Some day Big Data will become a commodity skillset, but not now. I did a search on Glassdoor to compare the demand for Hadoop with some other hot skills. Hadoop is head and shoulders above the rest.
Hadoop - 30,011 postings in Apr 2014
Oracle DBA - 9,227 postings (a perpetually hot skillset)
Salesforce - 9,968 postings

Please post any questions in the comment section and I will certainly try to answer them.
158 of 183 people found the following review helpful
1.0 out of 5 stars Useless as a Tutorial 19 September 2012
By Frustrated Hadoop Learner - Published on
Format: Paperback | Verified Purchase
I bought this book as a very experienced programmer with no prior experience of Hadoop, which I need to come up to speed on for a new project. I am extremely disappointed in the book and feel I wasted my money. If there's one thing you want from a book on a new technology, it's the ability to get a basic "Hello World" equivalent program running, from which you can then start iterating. This book completely falls down on this most basic requirement - when you get to the very first example program in the book, it tells you that you first need to compile a bunch of example code from the book's website. That shouldn't be required, but ok, whatever. Then when you go to the book's website, you are told that you first need to install a bunch of extra stuff covered later in the book before you can compile the libraries apparently needed to get anything at all to run. This really makes no sense at all - there's no way I should have to read all the later chapters to figure out what these things are in order to get my very first example program running. I tossed it into the trash and went off in search of a resource written by someone who understands how to structure a tutorial properly.
51 of 59 people found the following review helpful
1.0 out of 5 stars poorly organized and hard to get examples working 13 October 2012
By Y. Yuan - Published on
Format: Paperback | Verified Purchase
I purchased this book a few months ago based on many earlier 5-star reviews. I had high hopes that it would be as good as those reviewers claimed. However, the book is actually unbelievably poorly organized - essentially written in a spaghetti fashion. Yes - it contains a lot of information about Hadoop, but with three basic issues: 1) examples are trivial and hard to get working due to insufficient, unclear, or missing procedures; 2) many subjects (e.g. streaming) are spread over several chapters, and readers have to stitch them together after reading all the relevant chapters; and 3) many statements are either inaccurate or lack supporting data. Ironically, one has to apply MapReduce to the book itself in order to sort its subjects into a more logical order. I look forward to a 4th edition with a significant quality improvement.
17 of 19 people found the following review helpful
3.0 out of 5 stars Good general guide, poor if looking for detail, poor if looking for Hadoop 2.0 information 6 January 2013
By Al - Published on
Format: Paperback | Verified Purchase
If you're looking to learn what Hadoop is, understand all of the buzzwords and terms you've heard about (e.g. HDFS, MapReduce), and get an overview of the software in the Hadoop ecosystem (Pig, Hive, etc.), this is a good book that will give you a good overview and pointers in the right direction.

However, the book isn't going to give you a lot of detail on programming MapReduce and things like that.

In other words, it's a good breadth book, not a good depth book. So YMWV depending on what you're looking for.

I bought the previous edition of this book and gave it 4 stars. I bought this newer edition looking for information about Hadoop 2.0, YARN, and all of the new stuff coming out. It provided a little information about this, but overall it was lacking in these details, and too much of it duplicated the prior edition. So I notched it down 1 star because of that.