April 16, 2014
“Data Science” has become one of the trendiest research fields in recent years, as well as a catchall rubric for various job descriptions and work functions. The cynics and skeptics, and there are many of those, contend that “Data Science” is nothing more than repackaged Statistics, with a bit of coding and hacking thrown in. Its proponents, however, point out that most practicing data scientists use a variety of skills and techniques in their daily work, and come from a vast spectrum of career paths and backgrounds. I tend to side with the latter group, but I too am an outsider to this field and am still trying to get a better understanding of what it really entails.
“Doing Data Science: Straight Talk from the Frontline” is a compendium of chapters that deal with data science as it is practiced in the real world. The chapters are written by different authors, all of whom have significant practical experience and are acknowledged authorities on data science. Most of the contributors work in industry, but data science is still so fresh and new that there is a lot of crossing over between academia and the corporate world.
A few of the chapters include exercises, but these tend to be too advanced and assume too much background material for an introductory book. The exercises still give you a good idea of what kinds of problems data scientists tend to grapple with. However, this book is definitely not a textbook and cannot be effectively used as such. The book doesn’t provide any background on R, statistics, data scrubbing, machine learning, or the various other techniques used by data scientists. It is highly unlikely that any single textbook would be able to do justice to all of that material anyway, but a book of that sort could still be very useful.
There are two groups of people who would benefit from this book. The first consists of people who have absolutely no background in data science or any of its related fields, but would like to get a flavor of what data science is all about and are interested in exploring it for career purposes. The second consists of people with a significant technical background in one of the fields related to data science (programming, statistics, machine learning, etc.) who are interested in broadening their skills and would like to see how their particular strengths would fit within the broader data science field.
September 28, 2014
A hardcore statistician may get bored at times, but otherwise this English-language book is a fairly efficient way to discover and learn a rapidly expanding field. I have recommended it to several of my students, all of whom are satisfied with it.