R for Everyone: Advanced Analytics and Graphics (English) Paperback – December 19, 2013
Product description
From the publisher
Statistical Computation for Programmers, Scientists, Quants, Excel Users, and Other Professionals
Using the open source R language, you can build powerful statistical models to answer many of your most challenging questions. R has traditionally been difficult for non-statisticians to learn, and most R books assume far too much knowledge to be of help. R for Everyone is the solution.
Drawing on his unsurpassed experience teaching new users, professional data scientist Jared P. Lander has written the perfect tutorial for anyone new to statistical programming and modeling. Organized to make learning easy and intuitive, this guide focuses on the 20 percent of R functionality you’ll need to accomplish 80 percent of modern data tasks.
Lander’s self-contained chapters start with the absolute basics, offering extensive hands-on practice and sample code. You’ll download and install R; navigate and use the R environment; master basic program control, data import, and manipulation; and walk through several essential tests. Then, building on this foundation, you’ll construct several complete models, both linear and nonlinear, and use some data mining techniques.
By the time you’re done, you won’t just know how to write R programs, you’ll be ready to tackle the statistical problems you care about most.
• Exploring R, RStudio, and R packages
• Using R for math: variable types, vectors, calling functions, and more
• Exploiting data structures, including data.frames, matrices, and lists
• Creating attractive, intuitive statistical graphics
• Writing user-defined functions
• Controlling program flow with if, ifelse, and complex checks
• Improving program efficiency with group manipulations
• Combining and reshaping multiple datasets
• Manipulating strings using R’s facilities and regular expressions
• Creating normal, binomial, and Poisson probability distributions
• Programming basic statistics: mean, standard deviation, and t-tests
• Building linear, generalized linear, and nonlinear models
• Assessing the quality of models and variable selection
• Preventing overfitting, using the Elastic Net and Bayesian methods
• Analyzing univariate and multivariate time series data
• Grouping data via K-means and hierarchical clustering
• Preparing reports, slideshows, and web pages with knitr
• Building reusable R packages with devtools and Rcpp
• Getting involved with the R global community
Customer reviews
Most helpful customer reviews on Amazon.com (beta)
Where "R for Everyone" differs from "R in Action" - and, coming to the positives, where it wins out - is in intermediate-R territory. One important example is coverage of "ggplot2". Whereas "R in Action" discusses the "old school" R graphics, "R for Everyone" goes with "ggplot2", becoming the second popular book (after Winston Chang's "R Graphics Cookbook") to discuss the package - and although its explanation of "ggplot2" syntax is sketchy, the samples found throughout the book do build into a useful "ggplot2" gallery that actually brought me over the fence. The "plyr" package, an important data-manipulation aid, is another example, and another "R in Action" no-show. So is "data.table". So is "knitr", used to produce reports. So is "Rcpp", used to interface R and C++. So is R package-building. (You will notice that the topics become more advanced. These are introductions rather than substantial explorations, but awareness is a valuable thing.) In the book's second half, when discussion moves from R to statistics-with-R, the author still manages to find original material; statistical explanations may be brief - this is not a textbook - but examples, and pointers to useful R utilities, are much appreciated.
I own just one R book - literally, "The R Book", by Crawley - but "R for Everyone" will be joining it; this has got to be a compliment. Kudos to Jared Lander for writing an original, substantial, useful book.
UPD. It's June 2015, and the second edition of Robert Kabacoff's "R in Action" is finally out - but the changes are incremental, and my endorsement of "R for Everyone" stands.
For what it's worth, I am an R user and I like to pick up books on R to see how other people do things. The fact that I was exposed to packages I have never used was a plus and definitely makes the book worthwhile.
This book is basically two distinct books. The first 13 chapters are the basics of R. They are quite good, and if you are new to R you will find them extremely useful.
Virtually all the remainder of the book is about using R for various statistical techniques. This is where I had my problem. If you get this book with the assumption that you will learn statistics at the same time, then you will be disappointed. The problem is that while the book does tell you HOW to do the test, that's about it. There isn't much in terms of explaining what it is you did or how to interpret the results. I suppose if you look at it as a book to show you how to use the various R commands to run a t-test or an ANOVA, then that's OK, but I don't see the value if you do something, get a number and do not understand what it's for. But if you are already statistically savvy, then this might not be an issue.
One thing I did not like, though, is the use of ggplot. Now, I fully appreciate that ggplot will in fact generate far better graphics than the core plot routines in R. No question. But ggplot in itself is a book, and in many cases I just cut-and-pasted the code into R to see what happens. There wasn't really much explanation as to why you were doing what you were doing. Given that this is more of an intro book (the initial chapters on R give me this impression), I would have considered using the core plot routines instead. More work and less attractive, I know, but if your audience is people who are new to R, then why not stay with the core routines?
The book has a nice layout in color, but this is misleading. Long R output is always displayed without the formatting a textbook needs because, as the author himself states, he wrote the book directly in the code and printed it out as it is. And it feels like it. Most of the text looks just like comments in program code. The treatment of functions is very poor (they are also very rarely used in the book), and the explanation of the different R data types lacks depth and is misguided. Silly examples, such as printing the author's name, are used to show the basics. The later chapters get even worse, literally damaging all the more interesting parts, where the book leaves the very basics and moves on to data handling and then to advanced data analytics in R.
The part of the book that deals with data analytics is honestly a bit of a tragedy. The text is rushed, with unclear or sometimes no explanations of what is actually being done, and with little text while code output and charts take up most of the space. Ironically, the book that is "for everyone" makes it hard for "everyone" to understand anything that uses statistics, which is about 60% of the book!
It is hard even for those trained in statistics or related "hard" sciences.
For example, right at the beginning of chapter 22 the author uses a value for the predicted number of clusters in the data under analysis. This value is taken out of the blue, and only later is it shown how it can be found using two methods. The first method doesn’t yield any useful value (and you wonder why it is shown). The second method does yield a good value, but the text does not explain how this method's results should be interpreted to determine it. There are only two rushed sentences that speak of a standard deviation being used, with no indication of where this standard deviation comes from, as the author says nothing about the clusGap algorithm that he used. I did some research and found out that the author's LiveLesson video course, which follows the book almost page by page, does mention, albeit quickly, how to interpret the second method’s result above. But not in his book… Unfortunately this video course suffers from the same problems as the book, as it is mostly a live reading of the book with the author typing the code.
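For readers puzzled by the same point, here is a minimal sketch of how a gap-statistic result is commonly read. It is not the book's code: the toy data, the variable names and the choice of the "firstSEmax" rule are assumptions made for illustration. `clusGap` and `maxSE` come from the `cluster` package, and the `SE.sim` column is the simulation standard error that such a "standard deviation" rule typically relies on.

```r
library(cluster)

set.seed(42)
# Hypothetical toy data: three well-separated 2-D blobs (not from the book)
x <- rbind(
  matrix(rnorm(100, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(100, mean = 3, sd = 0.3), ncol = 2),
  matrix(rnorm(100, mean = 6, sd = 0.3), ncol = 2)
)

# Gap statistic for k = 1..6, using k-means as the clustering function;
# B is the number of reference data sets simulated for comparison
gap <- clusGap(x, FUNcluster = kmeans, K.max = 6, B = 20, nstart = 10)

# One row per k, with columns logW, E.logW, gap and SE.sim
tab <- as.data.frame(gap$Tab)

# "firstSEmax": the smallest k whose gap is not smaller than the first
# local maximum of the gap minus one standard error (SE.sim)
k_hat <- maxSE(tab$gap, tab$SE.sim, method = "firstSEmax")
print(tab)
print(k_hat)
```

Other rules exist (`maxSE` also accepts, for example, "globalSEmax"), so the chosen k can differ depending on which one is applied.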
In fact, almost anything related to data analytics is very poorly explained, if at all. Another example, out of many, is section 20.3 on Generalized Additive Models. After preprocessing the raw data used for the analysis, a few charts are displayed (without much explanation of the code used to produce them) and then the data analysis code output is shown without any explanation. Two features of the data, CreditAmount and Age, are displayed in charts where they are smoothed, but there is no explanation of what for. And the analysis stops right there without any further explanation. What could be said in a few sentences is left out.
Most of the data analytics examples also show very poor performance, leaving the reader to wonder why data analysis is used at all if it performs so badly or, if it can perform well, why the author didn’t select better examples.
There are also many pedagogical errors, minor and major ones. I will just mention a few taken from chapter 12 as an example:
1) Many variables are created with the function assign in a loop, but only two of them are actually used. What for? On top of that, the same loop is coded again later with just a different variable name.
2) The function merge is used with the same column names, although the author states that “the ability to specify different column names (..) is the most useful feature of merge” before doing the same with the function join from the library plyr. Then you wonder what difference is being shown.
3) It gets worse. In a rather convoluted way to show how to merge different data frames, the author introduces two new features of R, eval and parse, just in passing and without any specific examples or further explanation. In this same convoluted example the author also uses the R function Reduce in the most complicated way, with the dots, without first showing simple examples or what it is for. Only later in the text does he go on to explain what Reduce does, but he fails to mention that it can only be applied to binary functions. The text states that “Reduce can be a difficult function to grasp”. If it is, it deserves a better treatment, not a side note explained in an example that is about something else (how to merge data frames). It should also have a full explanation of how it can be used.
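To make points 2) and 3) concrete, here is a small sketch in base R of what a simpler introduction could look like. The data frames and column names are made up for illustration and are not taken from the book; `by.x` and `by.y` are the standard `merge` arguments for joining on differently named keys, and `Reduce` folds a two-argument function over a list.

```r
# Hypothetical toy data frames; none of these names come from the book
orders    <- data.frame(cust_id = c(1, 2, 3), amount = c(10, 20, 30))
customers <- data.frame(id = c(1, 2, 3), name = c("Ann", "Bob", "Cy"))
regions   <- data.frame(id = c(1, 2, 3), region = c("N", "S", "E"))
scores    <- data.frame(id = c(1, 2, 3), score = c(7, 8, 9))

# merge() can join on differently named key columns via by.x / by.y
joined <- merge(orders, customers, by.x = "cust_id", by.y = "id")

# Reduce() applies a two-argument function cumulatively, left to right:
# Reduce(f, list(a, b, c)) computes f(f(a, b), c)
total <- Reduce(`+`, 1:4)  # ((1 + 2) + 3) + 4

# The same fold merges a whole list of data frames that share a key column
all_three <- Reduce(function(x, y) merge(x, y, by = "id"),
                    list(customers, regions, scores))

print(joined)
print(total)
print(all_three)
```

Starting from the `+` case shows why the function handed to Reduce has to take two arguments: each step combines the running result with the next list element.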
R is a beautiful language that can be well explained. It is not hard to show its power in data analysis with short but clear explanations. It’s regrettable that this book misses its stated goals so badly when it could have met them brilliantly, as its author seems capable of doing a much better job. So I can't recommend this book. There is actually a shortage of good R books on the market, but "R in Action" (second edition is coming), "The Art of R Programming" and "The R Book" are much superior choices.