Doing Data Science (English) Paperback – October 18, 2013

3 customer reviews

Product description

Book by Cathy O'Neil and Rachel Schutt

Product details

  • Paperback: 406 pages
  • Publisher: O'Reilly; Edition: 1 (October 18, 2013)
  • Language: English
  • ISBN-10: 1449358659
  • ISBN-13: 978-1449358655
  • Product dimensions: 15.2 x 2 x 22.9 cm
  • Average customer review: 4.7 out of 5 stars (3 customer reviews)
  • Amazon Best Sellers Rank: 23,176 in English and Foreign Books

Customer reviews

Most helpful customer reviews

Format: Paperback
“Data Science” has become one of the most trendy research fields in recent years, as well as a catchall rubric for various job descriptions and work functions. The cynics and skeptics, and there are many of those, contend that “Data Science” is nothing more than repackaged Statistics, with a bit of coding and hacking thrown in. Its proponents, however, point out that most practicing data scientists use a variety of skills and techniques in their daily work, and come from a vast spectrum of career paths and backgrounds. I tend to side with the latter group, but I too am an outsider to this field and am still trying to get a better understanding of what it really entails.

“Doing Data Science: Straight Talk from the Frontline” is a compendium of chapters that deal with data science as it is practiced in the real world. Each chapter is written by a different author, all of whom have significant practical experience and are acknowledged authorities on data science. Most of the contributors work in industry, but data science is still so fresh and new that there is a lot of crossing over between academia and the corporate world.

A few of the chapters include exercises, but these tend to be too advanced and assume too much background material for an introductory book. The exercises still give you a good idea of what kinds of problems data scientists tend to grapple with. However, this book is definitely not a textbook and cannot be effectively used as such. The book doesn’t provide any background on R, statistics, data scrubbing, machine learning, and various other techniques used by data scientists. It is highly unlikely that any single textbook would be able to do justice to all of that material anyway, but a book of that sort could still have a lot of potential use.
Format: Paperback Verified Purchase
This book is a good introduction to data science, covering the main algorithms used in the field. It is fairly well written and illustrated with code snippets; it reads easily and is accessible to many people. I recommend it.
Format: Paperback
A hard-core statistician may get bored at times, but otherwise this English-language book offers a fairly efficient way to discover and train in a rapidly expanding field. I have recommended it to several of my students, all of whom are satisfied with it.

Most helpful customer reviews on Amazon.com (beta)

Amazon.com: 46 reviews
117 of 117 people found this review helpful
More breadth than depth December 28, 2013
By Carsten Jørgensen - Published on Amazon.com
Format: Paperback
Book review - Doing Data Science by O'Neil and Schutt, O'Reilly Media.

More breadth than depth

What is data science? The book Doing Data Science not only explains what data science is but also provides a broad overview of methods and techniques that one must master in order to call oneself a data scientist. The book is based on a course about data science given at Columbia University. However, it should not be considered a textbook about data science but rather a broad introduction to a number of topics in data science.

In the spring of 2013 I followed two Coursera courses. One about the statistical programming language R and one on Data Analysis. I had for some time been looking for a book that could be used as a follow-up reading on topics in data science. This was the reason I picked up "Doing Data Science".

The book begins with a chapter on what data science is all about, followed by four chapters on topics like statistical inference, exploratory data analysis, various machine learning algorithms, linear and logistic regression, and Naive Bayes. I have a background in both mathematics and statistics and I was able to understand these chapters, but the material is covered in such broad terms that I find it hard to believe that a newcomer to these topics will understand or gain much knowledge from reading them. Basic math is presented for the models, but without some kind of detailed explanation one cannot develop any deeper intuition for the approach explained.

The best parts of the book are definitely chapters 6 to 8 and 10. Here we find interesting discussions of data science applied to financial modeling, extracting information from data, and social networks. I really enjoyed the examination of time-stamped data, the Kaggle Model, feature selection, and case-attribute data versus social network data. The math behind these topics was, however, once again explained quite superficially. Centrality measures are central to social network analysis, but it is very hard to develop intuition for these measures without a more detailed explanation of the underlying math. These chapters contain lots of useful resources for finding additional information about the discussed topics.

Data visualization is an integral part of data science for communicating results. Beginners in the field of data science need concrete and easy-to-follow instructions on how to get started with visualization. Unfortunately the book focuses more on the use of data visualization in modern art projects. The content is simply too abstract for beginners to learn about the usage of visualization in data science.

When I was browsing the book before actually buying it, I was kind of thrilled to see that it covered topics like causality and epidemiology, topics that I did not find covered in any other book about data science. However, the chapter about epidemiology is not about using data science in epidemiology but 'just' about using data science to evaluate the methods used in epidemiology. Likewise, there seems to be no link between data science and causality. I later discovered that the authors used an entire blog post ([...]) to explain why causality was part of the university course underlying the book. This material, or parts of it, should have made it into the book. I am still not convinced that causality is a topic in data science.

There are several examples in which the book assumes the reader has knowledge of US government structure and organizations. Examples include page 292, which discusses US health care databases, and page 298, where the FDA is mentioned without further introduction or explanation of what the FDA is.

A book that contains programming examples should always make the code available for download. Typing in the code yourself is simply a waste of time. It is possible to download some of the datasets used in the book through GitHub, but the code does not seem to be available. I also own the electronic version of the book and I tried to copy-paste some of the examples from the e-book, but there are several examples of code that had not been proofread or tested prior to publication. The sample code misses references to required R libraries or refers to folder structures on some local Columbia University computer. The companion datasets that can be downloaded from GitHub consist of a number of Excel files. The R sample code uses the gdata package to load these Excel files into R for further analysis. It took me quite some time to figure out why this process didn't work on a Windows computer. The gdata package requires Perl to be installed on the computer, and this is not default software on Windows. In my opinion one should always publish data in a simple format, e.g. csv files, and definitely not proprietary formats like xls for Excel files.

Data Science is both science and a lot of practical experience. I guess the title of the book Doing Data Science tries to capture that. You need to do data science in order to learn it. The covered topics are interesting but the material is more breadth than depth. Luckily there are lots of useful links and resources to additional materials. Personally I would prefer more details about the actual data science topics, e.g. extracting meaning from data and social network analysis, and less focus on math. The book already requires some knowledge of math, statistics and programming, so why not presume that the reader has the background knowledge and dive straight into the data science discussions?

I really like the idea of having a lot of different people present various topics in data science, and the book is well written and contains lots of useful resources for further studies of data science. I will recommend the book to people new to the subject, but be aware of the fact that source code is not available and that is a major drawback.

Disclosure: I review for the O'Reilly Reader Review Program and I want to be transparent about my reviews, so you should know that I received a free copy of this ebook in exchange for my review.
51 of 53 people found this review helpful
Doing Data Science Worth a Look November 19, 2013
By Dan D. Gutierrez - Published on Amazon.com
Format: Paperback
I found this book to be a very odd bird indeed. It is one book you can read from back cover to front cover and not be at a disadvantage. This is because the book is really just a collection of presentations made by various people to a class taught by the primary author Rachel Schutt at Columbia University in the Fall of 2012 – Introduction to Data Science. It wasn’t entirely clear what content Schutt was directly responsible for since only some of the chapters indicate who the contributors were (one of the chapters was contributed by a group of her students!). The co-author, Cathy O’Neil, I’ve encountered before as an outspoken blogger going by the name “mathbabe” but it wasn’t specifically stated how she became part of the book project, other than to say she was one of the students in Schutt’s class. Chapter 6 was partly written by O’Neil.

Both Schutt and O’Neil hold Ph.D.s in fields appropriate to data science, but the book was not “written” by the two; rather, they seem to have performed some kind of editing function with the materials submitted by each contributor and added commentaries of their own. As a result, the book is a hodgepodge of anecdotes, factoids, R code snippets, plots, and mathematics, all from the in-class presentations. I enjoy seeing math in data science books, but the equations in this book were sort of just floating there, requiring the reader to explore further at another time.

Although I have issues with the book as it is not any sort of text for the field, I did enjoy reading it with a number of “Ah, I didn’t know that!” moments. Schutt’s credentials in data science are considerable, having worked at Google for a few years around the same time that “data science” was growing up in Silicon Valley. As a result the book has many memorable anecdotes about the early days of the data science industry, and observations about what makes big data tick. I enjoyed the story about the Google software engineer who accidentally deleted 10 petabytes of data, and I think my favorite quote from the book is from the student’s chapter 15:

Kaggle competitions could be described as the dick-measuring contests of data science.

With contributors’ chapters on statistical inference, machine learning algorithms, logistic regression, financial modeling, recommendation engines, data visualization, Hadoop, MapReduce, and more, I’d say the book is worth a read, not necessarily as a source for learning data science but more as a high-level guide and short historical account of this young industry. You get to learn about the people, companies, and technologies that have collectively built the data science arena, and you’ll be better for it, especially if you are working to become a data scientist yourself.
74 of 88 people found this review helpful
A spoonful of sugar... October 29, 2013
By Dimitri Shvorob - Published on Amazon.com
Format: Kindle Edition
... helps the medicine go down, as Mary Poppins used to say. An IT-focused publisher, O'Reilly has twice before used the "book as collection of chapters by different contributors" formula in its foray into the attractive "data" niche, with such titles as "Beautiful data" and "Bad data". "Doing data science" - by the way, I prefer Hastie and Tibshirani's "statistical learning" to the fuzzy and grandiose "data science" - follows the same approach, but, with its subject matter being closer to the academe, the company enlisted two young PhDs to steer the collaborative effort. Rachel Schutt took the lead as author and editor, and, assisted by Cathy O'Neil, produced an engaging, informal - you don't often see "science" in the title and "huge-ass" in the text - yet sufficiently technical to be hands-on, sequence-of-vignettes-styled book. Imagine a mash-up of a magazine article and a textbook. Neither part may be best-in-class, but their combination makes for a "unique selling proposition".

Well, maybe not a textbook. Most textbooks are carefully written and carefully checked. In contrast, when I see "Doing data science" introduce the ROC curve in three places, one of which translates the "O" as "operator", I can guess that this is a copy-paste of papers by three contributors. When Dr. O'Neil casually redefines an English word ("causal") to avoid rewriting a couple of sentences, or pronounces, on page 159, that "priors reduce degrees of freedom" - this is painfully meaningless, and neither term is defined, only name-checked - I suspect that she knows better, but just did not feel like spending more time on her half-chapter. Neither author speaks of their own projects - if this is the "frontline", then it's other soldiers' "trenches" that we are visiting. The occasional code listings are borrowed as well, thrown in without editing or comments. In this last regard, "Doing data science" lags far behind the book that seems to have informed its choice of topics, Peter Harrington's "Machine learning in action". (That's one suggestion - and if you want a good, accessible textbook, "Introduction to statistical learning" by James et al. is another).

None of it is going to matter to the book's target audience. "Doing data science" is aimed at beginners - and is bound to be interesting and useful to thousands of keen undergrads and adult learners.
11 of 13 people found this review helpful
Not so much about Doing Data Science as about what it covers March 25, 2014
By Marc Zucker - Published on Amazon.com
Format: Kindle Edition
"Doing Data Science: Straight Talk from the Frontline" by Cathy O’Neil and Rachel Schutt; O'Reilly Media

With so many books being published on Data, from Big Data to Machine Learning to Data Analysis, we must ask what yet another book is going to offer us. And the answer is not that much. O’Neil and Schutt have seemingly written this to give us a view of how the ins and outs of the actual world of Data Science are practiced. But we get little more than an overview of what Data Analysis is in general.

Many topics a reader might find interesting, whether for background knowledge or otherwise, are given little emphasis, and even then, without much depth. As an example, a discussion on the Exponential Distribution merely states that “because we are familiar with the fact that ‘waiting time’ is a common enough real-world phenomenon that a distribution called the exponential distribution has been invented to describe it.” Any mathematician realizes that inventing probability distributions is a little more involved than is implied.

The main part of the book starts with a look at algorithms. The authors use R as their primary language. There is little in terms of explanation of the underlying processes and examples are pretty direct. Naïve Bayes, for example, is explained in a page and a half, so we can clearly say that this is an overview of many of the ideas going into Data Science. In fact it is a very wide overview. Financial Modelling, Spam Filtering, Epidemiology, the list goes on. This is definitely a plus; we see the wide applications that Data Science has.

The book ends with a discussion on Competitions, but we come back to the question of what we have gained more than we might get by a simple perusal of the web? Maybe I was hoping for more out of the book than I got; expectations can be a killer. But in the end we must make a choice. If we would like to know about Data Science – how it’s done – then we would probably like to look elsewhere. If we would like to have some book that we can show someone what areas exist within this field and what topics are touched upon in it (perhaps for an advisor), then this book should do fine.

(FTC disclosure (16 CFR Part 255): The reviewer has accepted a reviewer's copy of this book which is his to keep. He intends to provide an honest, independent, and fair evaluation of the book in all circumstances.)

18 of 23 people found this review helpful
poorly written for a technical book November 27, 2013
By rivertech - Published on Amazon.com
Format: Kindle Edition Verified Purchase
Lots of good information - if you can wade through significant comprehension interference caused by meandering trains of thought and poor, non-linear exposition. It has the prose of a novel, not a well written technical book. While trying to set an easy going tone, it is much too wordy - causing a reader to exert unnecessary effort to extract information.