EUR 46.07
  • All prices include VAT.
Only 6 left in stock (more on the way).
Ships from and sold by Amazon.
Gift wrap available.

Practical Data Science with R (English) Paperback - April 10, 2014

Amazon price: EUR 46.07 | New from: EUR 29.26 | Used from: EUR 33.57

Frequently bought together

Practical Data Science with R + An Introduction to Statistical Learning: With Applications in R
Price for both: EUR 120.47

Product description

From the publisher

Simply put, data science is the discipline of extracting meaning from data. While it can involve deep knowledge of statistics, mathematics, machine learning, and computer science, for most non-academics, data science looks like applying analysis techniques to answer key business questions.


Practical Data Science with R lives up to its name. It explains basic principles without the theoretical mumbo-jumbo and jumps right to the real use cases faced while collecting, curating, and analyzing the data crucial to the success of businesses. Readers will apply the R programming language and statistical analysis techniques to carefully explained examples based in marketing, business intelligence, and decision support, while learning how to create instrumentation, design experiments such as A/B tests, and accurately present data to audiences of all levels.




Demonstrations of need-to-know statistical ideas

Covers all aspects of the project lifecycle

Data science for the motivated business professional



Written for the business analyst, technical consultant, or technical director; no formal statistics or mathematics background is required. Readers should be comfortable with quantitative thinking plus light scripting or programming. Some familiarity with R is a plus.



R is a programming language which is used for developing statistical software programs. Data Science is the process of collecting data and developing analysis techniques and software over that data to answer key business questions.


About the authors

Nina Zumel and John Mount are co-founders of Win-Vector, a data science consulting firm in San Francisco. Nina holds a Ph.D. in robotics from Carnegie Mellon and was a content developer for EMC's Data Science and Big Data Analytics Training Course. John has a Ph.D. in computer science from Carnegie Mellon and over 15 years of applied experience in biotech research, online advertising, price optimization and finance. Both contribute to the Win-Vector Blog, which covers topics in statistics, probability, computer science, mathematics and optimization.

Product details

  • Paperback: 416 pages
  • Publisher: Manning Publications; 1st edition (April 10, 2014)
  • Language: English
  • ISBN-10: 1617291560
  • ISBN-13: 978-1617291562
  • Product dimensions: 23.1 x 18.5 x 2 cm
  • Average customer review: 4.0 out of 5 stars (1 customer review)
  • Amazon Best Sellers Rank: 63,343 in English and Foreign-Language Books

Customer reviews

4.0 out of 5 stars

Most helpful customer reviews

Format: Paperback, Verified purchase
Clear, ambitious, and relatively complete. Of course, one could go further on the theory (but that is covered elsewhere) and on the code, to make it more efficient (but there too, other books do that).
This little book is perfect as an introduction, with some code to get started on your own.

Most helpful customer reviews (beta), 22 reviews
61 of 69 people found this review helpful
Lost in the middle (April 20, 2014)
By Dimitri Shvorob
Format: Paperback
A problem with the other reviews is that they consider the book in isolation, as if no alternatives were available. "Practical data science" is not the only machine-learning-lite book on the market: Manning itself had published Harrington's Python-based "Machine learning in action", Packt offers "Machine learning with R" by Lantz, O'Reilly boasts "Doing data science" by Schutt and O'Neil, and, finally, Springer has "Introduction to statistical learning" by James, Witten, Hastie and Tibshirani. I have seen and reviewed all except Harrington's; for the purposes of this review, I'll ultra-briefly describe each contender ("Machine learning with R" - thin, average-quality, superficial, but effective at what it sets out to achieve; "Doing data science" - a mash-up of a textbook and a magazine article about kewl data scientists; below-average quality, but a lot of pop appeal; "Introduction to statistical learning" - high-quality, accessible and visually appealing textbook with R illustrations) and get to "Practical data science" - which, to me, comes across as a better-organized, earnest version of "Doing data science". The book's forte is its effort to go beyond a catalogue of R-illustrated machine-learning methods - and you have to have seen similar books to know how standard this repertoire is - and discuss practical skills useful to a budding "data scientist", from version control to presenting. I appreciate this effort, but feel that this content was not sufficiently substantial or polished to develop into a "unique selling proposition" of the kind that each of its competitors has - hence the title of my review.
23 of 24 people found this review helpful
Effective starting point for your data science project (April 24, 2014)
By Christopher G. Loverich
Format: Paperback
tl;dr: A well rounded, occasionally high-level introductory text that will leave you feeling prepared to participate in the Data Science conversation at work, from earliest planning to presentation and maintenance.


Was excited to see this book coming to publication. I'm a fan of practical, non-academic approaches to subjects and prefer working from concrete examples to abstract principles (rather than the other way around). I think this is both the most difficult and most needed type of resource that can be put into print. This book handles the task OK; it falls a bit short on practical, concrete use cases as it alternates between working with hands-on datasets and shotgun coverage of principles and techniques at a higher level. I'd have much preferred sticking with single datasets for longer (say, a couple of chapters per dataset), but didn't feel cheated out of hands-on work.

- Easy access to the datasets via Github; good documentation on where to find others
- Key Takeaways provided at end of chapter are good summaries of overall information provided.
- A good focus on not just data analysis, but the process as a whole; very Agile like, practical, and non-dogmatic.
- Battle-tested advice: You can tell some of the advice comes from hard-fought battles. Example: why not use the sample() function instead of manually creating a sample column? Because with a sample column, you can repeatably select the same data (e.g. all rows with a sample value < 2) for repeatable output and for regression testing (avoiding introducing bugs).
- Builds your analyst vocabulary, increasing your all-important google-fu skills. Not knowing what to Google is, imho, the single hardest problem when learning a new set of problems / APIs.
- Good use of Appendices for introducing R syntax / installation, rather than stuffing it into one of the early chapters.
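
The point about a stored sample column can be sketched as follows. This is a minimal illustration in Python rather than the book's R code; the function name `add_sample_group`, the seed, the number of groups, and the toy rows are all assumptions made for the example. The idea is that the random group number is saved with each row, so any later selection on it returns exactly the same rows.

```python
import random


def add_sample_group(rows, seed=2014, n_groups=10):
    """Return copies of rows with a stored 'sample_group' value in [0, n_groups)."""
    rng = random.Random(seed)  # fixed seed, so the column can be regenerated identically
    return [dict(row, sample_group=rng.randrange(n_groups)) for row in rows]


# Toy data standing in for a real table (hypothetical).
rows = [{"id": i} for i in range(1000)]
tagged = add_sample_group(rows)

# Selecting on the stored column yields the same split on every run,
# unlike drawing a fresh random sample each time.
test_rows = [r for r in tagged if r["sample_group"] < 2]
train_rows = [r for r in tagged if r["sample_group"] >= 2]

print(len(test_rows), len(train_rows))
```

Because the split depends only on the stored column, rerunning an analysis or a regression test sees the same held-out rows, which is the repeatability the reviewer describes.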

- Doesn't stick with datasets long enough. I went to the trouble of setting up a true database to use the first dataset (chapter 2), only to move on to a different dataset in the very next chapter (the book did eventually return to the dataset).
- Feels a bit back and forth at times on whether it wants to be a truly pragmatic, focused work or a principles-driven, broadly scoped book (thinking of chapters 5-7 here). Not necessarily a knock, depending on what you're looking for.

I've read a few books on getting started in data analysis, R, statistics, etc. This book is solid enough that were I to choose among them, I'd recommend it first. I think if the book focused on using datasets for longer stretches, allowing you to learn the data well and apply multiple types of analyses on top of it (especially earlier on), it would be a bit more engaging.

Lastly, it has good coverage of R principles but (per its scope) doesn't get into the nitty-gritty. I'd recommend "The Art of R Programming" for that, which would be a good companion to this book (i.e. it covers R but not data analysis). I've heard R in Action is good as well, though I haven't read it. Caveat emptor.

Disclaimer: I received an e-copy of the book from Manning for review.
31 of 36 people found this review helpful
Good intro to data science (May 12, 2014)
By Scott C. Locklin
Format: Paperback
I've had to hire recent graduates with degrees in machine learning, operations research and even "data science." One of the problems with such people: they don't know anything practical. They probably know the basics of regression and some classification routines, as learned in their coursework. They've probably worked on one or many data science like problems, using machine learning techniques or regression or what not. Many of them have never done a SQL query, or done the dirty business of data cleaning which takes up most of the data scientist's time. They'll always have gaps in their education; maybe they wrote a dissertation on an application of trees or deep learning, and have never used any of the other myriad tools available to the data scientist. None of them have ever done data science for money, and so none of them know about practical things like git or what the process looks like in an industrial setting. It is for these people that this book appears to be written. In an ideal world, all larval data scientists would be taught a course based on this book, or at least go through it themselves. It is also useful to experienced practitioners, as it covers many things, and can be a good practical reference to keep around. The book is ordered as a data science project would be ordered, from start to finish; so, as you proceed down an engagement, reviewing the chapters in order will be helpful.

Ch1 describes the job of the data scientist, the workflow, and the characters you run into on a project.
Ch2 outlines some of the tools used to get at the data, including the authors' tool, "SQL Screwdriver." I'd have liked some genuflections at the Unix tools used to clean data before it is put anywhere important (sed, awk, tr, sort and cut), but I'm not sure if there is a graceful way of doing this. Or perhaps I'm the only weirdo who uses these in the ETL process.
Ch3 exploring data; using the various plot utilities in ggplot2 (the graphics library everyone should be using); bar charts, histograms, summary statistics and scatter plots.
Ch4 managing data: what they call "cleaning data," I call reshaping data (and I use reshape, sometimes anyway; I would have mentioned this, though I got on well without it for years).
Ch5 gets into specifying the problem; is it a classification problem? scoring? recommendation engine? How do I quantify success? This chapter is very helpful in doing this. Of course, problems evolve over time, and customers change their minds, but there are very helpful mappings here which will point you in the right direction. There are a few new techniques which should probably be included in future editions of this chapter, depending on how they pan out: I'm impressed with using dropout techniques to prevent overfitting, for example (this is bleeding-edge stuff, generally in the context of deep learning).
Ch6 Memorization techniques covers Naive Bayes, KNN and decision trees. It would have been nice to have more information on the various kinds of variable selection techniques (particularly important for NB and KNN), but mentioning this will allow the practitioner to go find their own information.
Ch7 Logistic and Linear regression: most would have done these first, but these are actually more complex than memorization techniques, and there are more things to know to keep the practitioner out of trouble. In my opinion, this chapter really shines: everyone who is going to do this for a living has had some exposure to regression models: this chapter makes it practical.
Ch8 Unsupervised methods; covers clustering: hierarchical clustering (one of the most useful tricks you will use in data science), kmeans (it has to be done, though I never found it to be useful) and association rules.
Ch9 Advanced methods: GAMs, SVM, bagging and random forests (the importance measure trick: if you don't know it, pay attention: this is a very good trick). These are the "industrial strength" tools used in industry. I personally would have stuck GAMs in their own chapter, and mentioned boosting here, but everyone is a little different in their tastes.
Ch10 Documentation and deployment: they use Knitr; I just use vanilla Sweave (I've tried brew, but never took to it). They introduce git here: something I would have done in chapter 1 or 2, but it is a fairly natural place to mention it. They use the Rook tool to deploy HTTP services; I've never used it, though I have used Shiny, which I can recommend. They mention PMML briefly (I've never used it).

The appendix on R is helpful, though it doesn't include the most valuable advice of all for using R in production: you need to maintain a distribution of R and all used packages, as well as a dependency toolchain if the code will be deployed on multiple servers.
7 of 8 people found this review helpful
I really think Practical Data Science with R is a brilliant book to tap into data science domain from an engineer's perspective. (July 18, 2014)
By Wing Chen
Format: Paperback
I personally wrote a blog post about this book from an engineer's perspective. Here comes the url:

I attach the content here too, so you don't have to go all the way over:

I am not a trained data scientist. I am a software engineer who happens to be interested in data science. Thus I am going to write up this review from that perspective.

A lot of people may tell you along the way that if you are a software engineer, you are halfway to being a data scientist.

However, if you dig into the definition of data science from Wikipedia, data science is the study of the generalizable extraction of knowledge from data. The domain is still heavily focused on the 'study of data' side, the statistician side, as in the following quote: "Data scientist is a statistician who lives in San Francisco."

After all, what engineers are good at is not the statistics part; it's the data infrastructure and hacking part. As far as I know, all my engineer friends who tap into the data domain are more focused on Hadoop (and Pig/Hive), Spark (and Shark), or online learning infrastructure building, not on model building and evaluation, and not so much on making sense of the data. Practically, that's still what statisticians are better at. It's natural for them.

But that's not an excuse for us, the engineers. In order to do our jobs better and because we, the engineers, are born to be hackers in all domains, we would like to learn how to make sense of data too. This book, Practical Data Science with R, is by all means a very good starting point.

In order to take full advantage of this book, there are two things you should at least know: basic knowledge of R and statistics. In the appendixes, the authors do introduce fundamental R and statistics. Nevertheless, I still think having a statistics textbook and an R language book like R in Action by your side to refer to is a better idea. You will need them.

As the data science journey begins, the book takes you through making sense of data and different distributions, data massaging, plotting, algorithm choosing, and model evaluations. Although model evaluation is discussed in chapter 5, I personally think it makes more sense to jump to model building (with different algorithms, chapters 6 to 9) first, practice it, and come back to learn how to evaluate your models.

R has everything you need to play with data built-in. Once you are comfortable with it, it's not too hard to explore Python scikit, Java Mahout, or Scala MLlib for building a more scalable production environment.

Once you reach the point where you need to scale up, you are back in the engineer's sweet spot again. You know how to handle it.

Overall, I really think Practical Data Science with R is a brilliant book to tap into data science domain from an engineer's perspective. I highly recommend it.
4 of 4 people found this review helpful
Practical data science - Emphasis on Practical (May 4, 2014)
By David M. Steier
Format: Paperback
This is a great book that fills a gap in the many books available today purporting to be about data science or business analytics, which are either so high-level, the reader finishes with no idea of how to get started, or so focused on algorithms or particular languages that the challenges of how to deliver data science in real organizations are never discussed. Thankfully, this book is a welcome bridge.

As you'd expect from the authors, both experienced practicing data scientists with PhDs from Carnegie Mellon, Part 2 of the book, presenting individual modeling techniques, is comprehensive and useful. But Parts 1 and 3, which complement the algorithmic detail, are also terrific: typical roles in a data science project and practical guidance on data exploration and visualization in Part 1, and documentation, delivery, and presentation in Part 3. That content is rarely available, illustrated with examples and runnable code, in a single book as it is here.

I used early versions of some of the chapters in a graduate class I taught on Managing Analytics Projects at CMU last fall, and was very happy with the results; I would not hesitate to recommend this to other practitioners or faculty looking for a data science textbook for their classes.