198 of 204 people found this review helpful
- Published on Amazon.com
Disclaimer: I served as a paid technical editor for Data Smart. I am not affiliated with the publisher, but I did receive a small fee for double-checking the book's mathematical content before it went to press. I also went to elementary school with the author. So as you read the rest of the review, keep in mind that my judgment could be clouded by my lifelong allegiance to Lookout Mountain Elementary School, as well as the Scarface-esque pile of one dollar bills currently sitting on my kitchen table.
Anyway, books about "Data" seem to fit into one of the following categories:
* Extremely technical graduate-level mathematics books with lots of Greek letters and summation signs
* Pie-in-the-sky business bestsellers about how "Data" is going to revolutionize the world as we know it. (I call these "Moneyball" books)
* Technical books about the hottest new "Big Data" technology such as R and Hadoop
Data Smart is none of these. Unlike "Moneyball" books, Data Smart contains enough practical information to actually start performing analyses. Unlike most textbooks, it doesn't get bogged down in mathematical notation. And unlike books about R or the distributed data blah-blah du jour, all the examples use good old Microsoft Excel. It's geared toward competent analysts who are comfortable with Excel and aren't afraid of thinking about problems in a mathematical way. Its goal isn't to "revolutionize" your business with million-dollar software, but rather to make incremental improvements to processes with accessible analytic techniques.
I don't work at a big company, so I can't attest to the number of dollars your company will save by applying the book's methods. But I can attest that the author makes difficult mathematical concepts accessible with his quirky sense of humor and gift for metaphor. For example, I previously had not been exposed to the nitty-gritty of clustering techniques. After a couple of hours with the clustering chapters, which include illuminating diagrams and spreadsheet formulas, I felt like I had a good handle on the concepts, and would feel comfortable implementing the ideas in Excel -- or any other language, for that matter.
What I like most about the book is that it doesn't try to wave a magic data wand to cure all of your company's ills. Instead it focuses on a few areas where data and analytic techniques can deliver a concrete benefit, and gives you just enough to get started. In particular:
* Optimization techniques (Ch. 4) can systematically reduce the cost of manufacturing inputs
* Clustering techniques (Ch. 2 and 5) can deliver insights into customer behavior
* Predictive techniques (Ch. 3, 6, and 7) can increase margins with better predictions of uncertain outcomes
* Forecasting techniques (Ch. 8) can reduce waste with better demand planning
It may take some creativity to figure out how to apply the methods to your own business processes, but all of the techniques are "tried and true" in the sense of being widely deployed at large companies with big analytics budgets and teams of Ph.D.'s on staff. This book's contribution is to make these techniques available to anyone with a little background in applied mathematics and a copy of Excel. For that reason, despite the absence of glitter and/or Jack Welch on the book's cover, I think Data Smart is an important business book.
I had a few criticisms of the book as I was reading drafts, but almost all of them were addressed before the final revision. For the sake of completeness, I'll tell you what they were. Some of the chapters ran on a bit long, but these have been split up into manageable pieces. The Optimization chapter is a bit of a doozie, and used to be at the very beginning, but the reader can now "warm up" with some easier chapters on clustering and simple Bayesian techniques. The Regression chapter originally didn't discuss Receiver Operating Characteristic curves, which are important for evaluating predictive models visually, but now ROC curves are abundant.
Only one real criticism from me remains: I would have liked to see more on quantile regression, which is only mentioned in passing. It's a great technique for dealing with outlier-heavy data. The book by Koenker has good but highly mathematical coverage, and I would have loved to see this subject given the Foreman treatment. But, you can't have everything, and I suppose John needs to leave some material for Data Smart 2: The Spreadsheet of Doom.
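For readers wondering why quantile methods cope well with outliers, here is a minimal Python sketch (not from the book, which works in Excel). It fits only a single constant rather than a full regression, and the data and the function name `pinball_loss` are made up for illustration. It shows that the guess minimizing the quantile ("pinball") loss at tau = 0.5 is the median, which barely moves when one extreme value is present, while the mean gets dragged along.

```python
def pinball_loss(values, guess, tau):
    """Average quantile ("pinball") loss of one constant guess (illustrative helper)."""
    total = 0.0
    for y in values:
        diff = y - guess
        # Under-predictions are weighted by tau, over-predictions by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1) * diff
    return total / len(values)

# Made-up data with a single extreme outlier.
data = [1, 2, 3, 4, 100]

mean = sum(data) / len(data)
# Search the observed values for the guess with the lowest loss at tau = 0.5.
best = min(data, key=lambda g: pinball_loss(data, g, tau=0.5))

print("mean:", mean)
print("tau=0.5 minimizer:", best)
```

Quantile regression proper replaces the constant guess with a line (or other model) and minimizes the same loss, so this is only the simplest special case.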
In sum, Data Smart is a well-written and engaging guide to getting new insights from data using familiar tools. The techniques aren't really cutting-edge -- in fact, most have been around for decades -- but to my knowledge this is the first time they've been presented in a way that an Excel-slinging business analyst can apply without needing her own team of operations researchers and data scientists. If you're not sure whether the book's sophistication is on par with your own skills, you can download a complete sample chapter (as well as example spreadsheets) from the author's website.
One last thing: unlike many books with a technical bent, the prose is engaging and extremely clear. I think this can be traced to John's childhood. When John misbehaved, his father (who is a professor of English) would punish John by forcing him to read a novel by Charles Dickens. Minor infractions resulted in A Christmas Carol being meted out, and when he was really bad he had to read Great Expectations. This is a true story which you should ask John about if you see him at a book-signing event.
35 of 39 people found this review helpful
M. L Lamendola
- Published on Amazon.com
Having been involved in both electrical power monitoring (very data intensive) and business intelligence software (provides business reports from database sources) for well over a decade now, I agree with the author's premise that there's a difference between data and information. I wrote an article on this subject for the Crystal Reports market, and it's featured on the Crystalkeen Website. Too much of what pretends to be "analysis" or "information" or "business reports" is simply reformatted data and not very useful.
Another premise of this author is that the data analysis function serves the business, not the other way around. This point is often lost upon those who are supposed to provide the analysis. Rather than answer business questions, they just provide analysis. Their thinking, such as it is, revolves around the idea that they best do their jobs when they can do the neatest tricks with the analysis system.
These are just two examples of several "wrong thinking" ideas that Foreman addresses in this book. Because these "wrong thinking" ideas are pervasive and cause the misallocation of millions of dollars of resources in the typical large company, this book is worth several thousand times its cover price for the typical large company. Scale down the cost as you scale down the enterprise, and the multiplier is obviously less dramatic but still quite potent. Assuming, of course, the reader grasps what Foreman is saying and acts upon those new insights.
This assumption has some teeth to it, because Foreman is a very clever writer. In addition to using humor to keep the reader engaged, he apparently labored long and hard over his word choices to get clear meaning across to the reader. This is something I greatly appreciate in a work of nonfiction. Typically, the subject matter expert lacks such a command of English and something gets a bit muffed in the translation from text to the mind of the reader.
Now, that's my commentary on the high-level stuff. Which does not comprise the bulk of this book. I addressed it first because, to me, this alone makes this book a "must read" for anyone involved in data analysis, business intelligence, or related fields. Too many in these fields cannot see the forest for the trees, and their penchant for getting mired down in insignificant details shows in the results of their work. They wonder why users waste many hours trying to do their own analysis in Excel, instead of looking at whether they are providing a useful service to the business and its decision-making needs.
Let's move on to the technical stuff covered in this book. At one time in my career, I was a spreadsheet junkie. I built very complex models in Excel. So I was delighted to walk through Foreman's examples and tutorials on using Excel to do various kinds of analysis. These examples and tutorials comprise the bulk of this book, but they are not the point of the book.
Let me explain by analogy. I'm not sure if this reaches the typical reader, but try to follow (and accept my apologies if it's a dud). In electrical engineering today, software does the number crunching for you. But in engineering school (and often in the friendly debates engineers have), the emphasis is on doing the calculations manually. When you read the electrical engineering trade publications, you find not an admonition to run the example through your software, but manual calculations being walked through.
The reason, in all instances, is the participants must be able to understand the concepts. You can do this only by crunching the numbers yourself and following along in the mental processes of arriving at the answer. So the author of an article might provide quite a trail of calculation to prove a point. It's the point that matters, not the calculation per se. But you don't get the point unless you can see how it's arrived at.
For example, in this book Foreman discusses K-analysis. How can you really understand this without working through some examples and watching the effects on the data? Answer: You can't.
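To give a flavor of what "working through an example" looks like, here is a minimal sketch, assuming "K-analysis" refers to k-means clustering. The book does this in Excel; the Python below, the toy data, and the function name `kmeans` are illustrative assumptions, not Foreman's own material. It alternates the two steps of the standard algorithm: assign each point to its nearest centroid, then move each centroid to the average of its points.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Plain k-means (Lloyd's algorithm) with caller-supplied starting centroids."""
    centroids = [tuple(c) for c in centroids]
    assignments = []
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                new_centroids.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(old)  # leave an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # nothing moved, so the clustering has settled
        centroids = new_centroids
    return centroids, assignments

# Made-up toy data: two visibly separate groups of points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, assignments = kmeans(points, centroids=[points[0], points[3]])

print("cluster of each point:", assignments)
print("final centroids:", centroids)
```

Changing the starting centroids or adding a stray point and rerunning is exactly the kind of "watching the effects on the data" that makes the concept stick.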
To me, being walked through this litany of hard-to-grasp data analysis concepts is the only way a person can really understand those concepts. I think a mere surface knowledge is insufficient (a little knowledge is dangerous....). Even outside the realm of data analysis, people toss about terms they clearly do not understand but think they do. But based on my many years interacting with Crystal Reports administrators and trainers, I think the problem is especially pernicious in this particular field of data analysis. If you really want to know what you're talking about, you need to do the learning work.
The first nine chapters walk the reader through data analysis concepts. Chapter 10 is an introduction to an analysis program called R. Foreman begins by summing up the previous nine chapters as an exercise in learning analytics and then making it clear that Excel isn't the right tool for actually doing analytics.
I don't believe Foreman is trying to "sell" R per se. It's what he's familiar with. There are other tools for data analysis, including the big players in the Business Intelligence (BI) market, such as Crystal Reports and Cognos. Basically, if you want an effective, accurate, efficient way to answer business decision-making questions from the data your business gathers, you need to step up to a tool designed for that job. And, of course, you need an adequate database behind it.
Foreman has excellent advice in his 11th chapter (which is not numbered), "Conclusion." It's only six pages long, but what he says in here is profound. If you, as the reader, grasp nothing else but what's in this conclusion, the book has served you well.