65 of 73 people found this review helpful
Dr. Lee D. Carlson
- Published on Amazon.com
In just a decade, Bayesian networks have gone from being a mere academic curiosity to a highly useful field with myriad applications. Indeed, the applications of Bayesian networks are wide-ranging and include disparate fields such as network engineering, bioinformatics, medical diagnostics, and intelligent troubleshooting. This book gives a fine overview of the subject, and after reading it one will have an in-depth understanding of both the underlying foundations and the algorithms involved in using Bayesian networks. The reader will have to look elsewhere for applications of Bayesian networks, since they are only discussed briefly in the book. Due to space constraints, only the first four chapters will be reviewed here.
The author defines a Bayesian network as a graphical structure for representing the probabilistic relationship among a large number of variables and for performing probabilistic inference with these variables. Before the advent of Bayesian networks, probabilistic inference depended on the use of Bayes' theorem, which entailed that the problems examined be relatively simple, due to the exponential space and time complexity that can arise in the application of this theorem.
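To make the complexity point concrete, here is a minimal sketch (not taken from the book) that counts free parameters for binary variables. The 20-variable network and its parent counts are hypothetical, chosen only for illustration.

```python
def full_joint_parameters(n: int) -> int:
    """Free parameters in an unrestricted joint distribution over n binary variables."""
    return 2 ** n - 1


def network_parameters(parent_counts: list[int]) -> int:
    """Free parameters when each binary variable stores one probability
    per configuration of its parents (2**k configurations for k parents)."""
    return sum(2 ** k for k in parent_counts)


# Hypothetical network: 20 binary variables, none with more than 3 parents.
parent_counts = [0, 1, 2] + [3] * 17

print(full_joint_parameters(20))          # 1048575
print(network_parameters(parent_counts))  # 143
```

The gap widens exponentially as variables are added, which is why working directly with the joint distribution is only feasible for small problems.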
After a short review of probability theory in chapter 1, a discussion of the "philosophical" foundations of probability, and a discussion of the difficulties inherent in representing large instances and in performing inference over a large number of variables, the author introduces Bayesian networks as directed acyclic graphs (DAGs) satisfying the Markov condition. A brief discussion of NasoNet, which is a large-scale Bayesian network used in the diagnosis and prognosis of nasopharyngeal cancer, is given. The author then shows in detail how to create Bayesian networks using causal edges, introducing in the process the notion of manipulating variables and the notion of causation between two variables. An interesting example of manipulation is given in the context of pharmaceuticals, and an example of bad manipulation is given.
Chapter 2 addresses the nature of dependencies in DAGs via the concept of 'faithfulness' and entailed conditional independencies. Very important in this chapter is the notion of 'd-separation', which identifies all and only those conditional independencies entailed by the Markov condition for a DAG G. An explicit algorithm is given for finding d-separations. D-separation is used to define a notion of Markov equivalence between DAGs containing the same set of nodes. Also discussed is the minimality condition, under which a DAG no longer satisfies the Markov condition with respect to a probability distribution if an edge is removed from it. The author shows that every probability distribution satisfies the minimality condition with some DAG. The notion of a 'Markov blanket' is introduced, which measures the extent to which the instantiation of a set of nodes close to a particular node can shield the node from the effect of all other nodes. A Markov boundary of a random variable is then defined as a Markov blanket such that none of its proper subsets is a Markov blanket of the random variable. The utility of these concepts lies in the fact that the parents of a variable X, the children of X, and the parents of the children of X together form the unique Markov boundary of X, if the DAG satisfies the faithfulness condition.
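The two ideas can be sketched in a few lines of Python. This is not the algorithm given in the book: the d-separation test below uses the standard moralized ancestral graph criterion instead, and the six-node DAG and all function names are hypothetical.

```python
def ancestors(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen, stack = set(nodes), list(nodes)
    while stack:
        for p in parents[stack.pop()]:
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def d_separated(parents, x, y, z):
    """True if x and y are d-separated given the set z.
    Criterion: x and y are disconnected, once z is removed, in the moral
    graph of the subgraph induced by the ancestors of {x, y} and z."""
    keep = ancestors(parents, {x, y} | set(z))
    nbrs = {v: set() for v in keep}
    for v in keep:
        ps = list(parents[v])  # parents of a kept node are always kept
        for i, p in enumerate(ps):
            nbrs[v].add(p)
            nbrs[p].add(v)
            for q in ps[i + 1:]:  # "marry" parents that share a child
                nbrs[p].add(q)
                nbrs[q].add(p)
    seen, stack = {x}, [x]
    while stack:
        for w in nbrs[stack.pop()]:
            if w in z or w in seen:
                continue
            if w == y:
                return False
            seen.add(w)
            stack.append(w)
    return True


def markov_blanket(parents, x):
    """Parents of x, children of x, and the other parents of those children."""
    children = {v for v, ps in parents.items() if x in ps}
    co_parents = set().union(*(parents[c] for c in children))
    return (parents[x] | children | co_parents) - {x}


# Hypothetical DAG: A -> C <- B, C -> D <- E, D -> F (node -> set of parents)
parents = {"A": set(), "B": set(), "C": {"A", "B"},
           "D": {"C", "E"}, "E": set(), "F": {"D"}}

print(d_separated(parents, "A", "B", set()))   # True
print(d_separated(parents, "A", "B", {"C"}))   # False: conditioning on a common child
print(sorted(markov_blanket(parents, "C")))    # ['A', 'B', 'D', 'E']
```

In this example, conditioning on the blanket of C d-separates C from the remaining node F, which is the shielding property described above.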
Inference in Bayesian networks is the topic of chapter 3, with Pearl's message-passing algorithm starting off the discussion for the case of discrete random variables. This algorithm, which applies to Bayesian networks whose DAGs are trees, is based on a theorem whose statement takes well over a page and whose proof covers five pages. The author gives detailed examples, though, and these are very helpful in understanding the algorithm. The Pearl algorithm is then generalized to singly and multiply connected networks. After a discussion of the computational complexity of the algorithm, the author gives an overview of the 'noisy OR-gate model', which is a model whose complexity is manageable, since each variable in the model has only two values. The author then moves on to inference using an approach called 'symbolic probabilistic inference', which approximates finding the optimal way to compute marginal distributions of interest from the joint probability distribution. This algorithm involves a number of multiplications in order to compute the marginal probability. To minimize the computational effort, it would be advantageous to minimize the number of these multiplications, and so the author discusses the 'optimal factoring problem', which, once solved for a given factoring instance, gives a factorization that requires a minimal number of multiplications. What follows is a very interesting discussion of the relationship of human reasoning to Bayesian networks. This is done via the introduction of the 'causal network model', and the author then, quite unexpectedly, surveys research in which human subjects were tested to assess the accuracy of the model. These studies include ones that involve inference based on 'discounting', which measures to what degree an individual becomes less confident in a cause when told that a different cause of the effect was present. Another study discussed involves larger networks in the context of traffic congestion. This is followed by a discussion of a study of causal reasoning in the context of debugging programs.
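As a rough illustration of the noisy OR-gate and of discounting, the sketch below uses a two-cause model with no leak term. The variable names and all probabilities are invented for this example and do not come from the book. In a noisy OR-gate each present cause independently fails to produce the effect with probability 1 - p, so the model needs one number per cause rather than a full table over all parent combinations.

```python
def noisy_or(strengths, present):
    """P(effect | the causes in `present` are on), noisy OR-gate without leak."""
    all_fail = 1.0
    for name in present:
        all_fail *= 1.0 - strengths[name]
    return 1.0 - all_fail


# Hypothetical numbers, for illustration only.
priors = {"flu": 0.1, "cold": 0.2}      # P(cause is present)
strengths = {"flu": 0.8, "cold": 0.5}   # P(cause alone produces fever)


def posterior_flu(cold_known=None):
    """P(flu | fever), optionally also conditioning on cold = 0 or 1.
    Computed by brute-force enumeration over the two causes."""
    num = den = 0.0
    for flu in (0, 1):
        for cold in (0, 1):
            if cold_known is not None and cold != cold_known:
                continue
            present = [n for n, on in (("flu", flu), ("cold", cold)) if on]
            weight = ((priors["flu"] if flu else 1 - priors["flu"])
                      * (priors["cold"] if cold else 1 - priors["cold"])
                      * noisy_or(strengths, present))
            den += weight
            if flu:
                num += weight
    return num / den


print(round(posterior_flu(), 3))              # belief in flu given fever alone
print(round(posterior_flu(cold_known=1), 3))  # lower once a cold is known
```

The drop from the first posterior to the second is the discounting effect: learning that one cause is present makes the other cause less probable.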
Inference algorithms are studied for the case of continuous variables in chapter 4. After a review of the normal probability distribution, the author discusses an inference algorithm for the case of Gaussian Bayesian networks. An algorithm for doing inference with continuous variables for singly connected Bayesian networks is given that allows the determination of the expected value and variance of each node conditioned on specified values of nodes in some subset. This is followed by several detailed and helpful examples of inference in continuous variables. As expected, issues with computational complexity arise, and so the author discusses approximate inference via the method of stochastic simulation, which involves a classical sampling method called 'logic sampling'. This is then followed by a discussion of likelihood weighting, which cures some of the problems involved with logic sampling. Abductive inference, so important in contemporary applications, is then discussed in detail.
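The difference between the two sampling schemes can be seen on a toy network. The sketch below is not from the book: it assumes a hypothetical two-node network, Rain -> WetGrass, with invented probabilities, and estimates P(rain | wet grass). Logic sampling draws every variable and throws away samples that contradict the evidence; likelihood weighting fixes the evidence and weights each sample by how probable that evidence is.

```python
import random

# Hypothetical network Rain -> WetGrass; the numbers are illustrative only.
P_RAIN = 0.2
P_WET_GIVEN_RAIN = {True: 0.9, False: 0.1}


def logic_sampling(n, rng):
    """Estimate P(rain | wet) by sampling both nodes and rejecting
    samples where the grass is not wet. Returns (estimate, samples kept)."""
    kept = hits = 0
    for _ in range(n):
        rain = rng.random() < P_RAIN
        wet = rng.random() < P_WET_GIVEN_RAIN[rain]
        if not wet:
            continue  # inconsistent with the evidence, so discarded
        kept += 1
        hits += rain
    return hits / kept, kept


def likelihood_weighting(n, rng):
    """Estimate P(rain | wet) by sampling only the non-evidence node and
    weighting each sample by P(wet | sampled rain)."""
    total = hit = 0.0
    for _ in range(n):
        rain = rng.random() < P_RAIN
        weight = P_WET_GIVEN_RAIN[rain]
        total += weight
        if rain:
            hit += weight
    return hit / total


exact = (0.2 * 0.9) / (0.2 * 0.9 + 0.8 * 0.1)
estimate, kept = logic_sampling(20000, random.Random(0))
print(exact, estimate, kept)
print(likelihood_weighting(20000, random.Random(1)))
```

With these numbers logic sampling keeps only about a quarter of its samples, and the waste grows as the evidence becomes less likely, which is one of the problems that likelihood weighting avoids.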