**“Building Probabilistic Graphical Models with Python” is the third book I have reviewed for PACKT Publishing. It was just released this June. Compared to the classic PGM book – “Probabilistic Graphical Models: Principles and Techniques” by Dr. Koller – this book is ‘tiny’. While the former provides the complete theoretical background of PGM, our ‘tiny’ book goes straight into real-world PGM libraries and applications. This post tries to help readers figure out what is in this book from a purely technical perspective. Cheers~**

**0. The book**

**1. The content**

Chap 1 covers the basics of probability – random variables, Bayes’ rule and distributions. Once readers grasp the basic ideas, PyMC is introduced to help understand and reinforce these concepts.
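To give a flavor of the kind of computation the chapter builds up to, here is a minimal Bayes’ rule example in plain Python (the numbers are illustrative, not from the book):

```python
# Bayes' rule: P(H|E) = P(E|H) * P(H) / P(E)
# Illustrative numbers: a rare disease and an imperfect test.
p_disease = 0.01            # prior P(H)
p_pos_given_disease = 0.99  # likelihood P(E|H), test sensitivity
p_pos_given_healthy = 0.05  # false positive rate P(E|not H)

# Total probability of a positive test, P(E)
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Posterior: probability of disease given a positive test
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(round(p_disease_given_pos, 4))  # -> 0.1667
```

Even with a 99%-sensitive test, the low prior keeps the posterior at about one in six – exactly the kind of intuition the chapter drills.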

Chap 2 dives into Bayesian Networks (BN) – independence, conditional independence and D-separation. libpgm and scikit-learn are introduced here to create BN models, run queries and play with the Naive Bayes classifier.
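The Naive Bayes classifier the chapter builds with scikit-learn boils down to “pick the class maximizing the prior times the product of per-feature likelihoods”. A pure-Python sketch of that decision rule, with my own toy data and function names (not the book’s code), might look like this:

```python
from collections import Counter, defaultdict

def train_naive_bayes(samples):
    """samples: list of (features_tuple, label). Returns class counts
    and per-(feature, label) value counts."""
    labels = Counter(label for _, label in samples)
    cond = defaultdict(Counter)  # (feature_index, label) -> value counts
    for features, label in samples:
        for i, value in enumerate(features):
            cond[(i, label)][value] += 1
    return labels, cond

def predict(labels, cond, features):
    """Return the label maximizing P(label) * prod_i P(feature_i | label)."""
    total = sum(labels.values())
    best_label, best_score = None, -1.0
    for label, count in labels.items():
        score = count / total  # prior P(label)
        for i, value in enumerate(features):
            counts = cond[(i, label)]
            # add-one (Laplace) smoothing so unseen values don't zero the product
            score *= (counts[value] + 1) / (sum(counts.values()) + 2)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy weather data: (outlook, windy) -> play?
data = [(("sunny", "no"), "yes"), (("sunny", "yes"), "no"),
        (("rainy", "yes"), "no"), (("overcast", "no"), "yes")]
labels, cond = train_naive_bayes(data)
print(predict(labels, cond, ("sunny", "no")))  # -> yes
```

scikit-learn’s estimators wrap the same idea behind a `fit`/`predict` interface, with the per-feature likelihood model varying by classifier.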

Chap 3 concentrates on the basics of Markov Networks (MN) – factorization, separation and Conditional Random Fields (CRF).
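The factorization idea fits in a few lines: an MN’s joint distribution is a normalized product of factors, with the normalizer Z (the partition function) summing that product over all assignments. A tiny two-node sketch (the factor values are made up):

```python
from itertools import product

# Two binary variables A, B with one pairwise factor phi(A, B).
# Arbitrary values chosen so that agreeing states are preferred.
phi = {(0, 0): 3.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 3.0}

# Partition function Z: sum the factor product over all assignments
Z = sum(phi[a, b] for a, b in product([0, 1], repeat=2))

# Joint probability P(A=a, B=b) = phi(a, b) / Z
joint = {(a, b): phi[a, b] / Z for a, b in product([0, 1], repeat=2)}
print(joint[0, 0])  # -> 0.375, three times as likely as disagreement
```

With more factors the product runs over all of them and Z becomes expensive – which is precisely why the inference chapters later in the book matter.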

Chap 4 hits an active research field of PGM, structure learning from data – constraint-based and score-based learning. This chapter provides a lot of code snippets to show how the data and the learning approach can impact the final structure learned, using IPython, libpgm and numpy.
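Score-based learning in miniature: score each candidate structure on the data and keep the best one. The sketch below compares the empty structure against the single edge A → B using a plain maximum-likelihood score; a real scorer would also penalize edges (e.g. BIC), which I omit here. The data and helpers are mine, not the book’s:

```python
from collections import Counter
from math import log

# Toy binary data where B mostly copies A, so the edge A -> B should win.
data = [(0, 0), (0, 0), (0, 1), (1, 1), (1, 1), (1, 0), (0, 0), (1, 1)]
n = len(data)

def loglik_independent(data):
    """Score of the empty structure: A and B modeled independently."""
    ca = Counter(a for a, _ in data)
    cb = Counter(b for _, b in data)
    return sum(log(ca[a] / n) + log(cb[b] / n) for a, b in data)

def loglik_edge(data):
    """Score of A -> B: P(A) times the conditional P(B | A)."""
    ca = Counter(a for a, _ in data)
    cab = Counter(data)
    return sum(log(ca[a] / n) + log(cab[a, b] / ca[a]) for a, b in data)

print(loglik_edge(data) > loglik_independent(data))  # -> True
```

On independent data the two scores would tie, so a complexity penalty would drop the edge – a small demonstration of how the data drives the learned structure.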

Chap 5 demonstrates methods of parameter learning: Maximum Likelihood Estimation (MLE) and Bayesian estimation. Each method has a code example showing its key points, using PyMC, scipy and matplotlib.
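The contrast between the two estimators fits in a few lines. Here is a coin-flip sketch (the counts and prior are my own, not the book’s):

```python
# Estimating a coin's bias from heads/tails counts, two ways.
heads, tails = 7, 3
n = heads + tails

# Maximum Likelihood Estimate: just the observed frequency
mle = heads / n

# Bayesian estimate with a Beta(a, b) prior; the posterior is
# Beta(a + heads, b + tails), whose mean is computed below.
a, b = 2, 2  # a mild prior pulling the estimate toward 0.5
posterior_mean = (a + heads) / (a + b + n)

print(mle, round(posterior_mean, 3))  # -> 0.7 0.643
```

With little data the prior visibly shrinks the estimate toward 0.5; as n grows the two estimates converge – the key point the chapter’s PyMC examples illustrate graphically.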

Chap 6 shows ways to do exact inference in graphical models – Variable Elimination (VE) and Junction Tree (JT), each of which comes with a complete code example using libpgm plus a step-by-step analysis.
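VE’s “push the sums inside the products” trick can be seen on a three-node chain. This is a hand-rolled sketch with made-up CPT values, not libpgm’s API:

```python
# Variable Elimination on a tiny chain A -> B -> C: compute P(C)
# by summing A out first, then B. CPT numbers are illustrative.
p_a = {0: 0.6, 1: 0.4}
p_b_given_a = {(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.8}
p_c_given_b = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.4, (1, 1): 0.6}

# Eliminate A: tau1(b) = sum_a P(a) * P(b | a)
tau1 = {b: sum(p_a[a] * p_b_given_a[a, b] for a in (0, 1)) for b in (0, 1)}

# Eliminate B: P(c) = sum_b tau1(b) * P(c | b)
p_c = {c: sum(tau1[b] * p_c_given_b[b, c] for b in (0, 1)) for c in (0, 1)}
print(p_c)  # -> {0: 0.65, 1: 0.35}
```

Instead of summing the full joint over 2³ assignments, each elimination step only touches factors mentioning the eliminated variable – the cost saving that makes VE practical.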

Chap 7 discusses another active research field of PGM, approximate inference – message passing and sampling. (Loopy) Belief Propagation (BP) is demonstrated on an image-processing example using OpenGM. Markov Chain Monte Carlo (MCMC), Gibbs Sampling and Metropolis-Hastings are also discussed with real code examples.
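The Metropolis-Hastings core is only a dozen lines. Here is a generic random-walk sketch of my own targeting a standard normal (this is not OpenGM’s or PyMC’s API):

```python
import math
import random

def metropolis_hastings(log_p, x0, steps, step_size, rng):
    """Random-walk Metropolis-Hastings: propose x' ~ N(x, step_size),
    accept with probability min(1, p(x') / p(x))."""
    x, samples = x0, []
    lp = log_p(x)
    for _ in range(steps):
        proposal = x + rng.gauss(0.0, step_size)
        lp_new = log_p(proposal)
        # accept/reject in log space to avoid under/overflow
        if math.log(rng.random()) < lp_new - lp:
            x, lp = proposal, lp_new
        samples.append(x)
    return samples

# Target: standard normal up to a constant, so log p(x) = -x^2 / 2.
# Note log_p only needs to be known up to normalization -- the point of MCMC.
rng = random.Random(0)
samples = metropolis_hastings(lambda x: -x * x / 2, 0.0, 20000, 1.0, rng)
mean = sum(samples) / len(samples)
print(mean)  # should land near 0, the target's mean
```

Gibbs sampling is the special case where each proposal draws one variable from its exact conditional and is always accepted; the chapter walks through both with runnable code.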

**2. The conclusion**

Though PGM has been around for years, the area is not widely known unless you are a graduate student or researcher devoted to the statistical branch of the AI field. If you are, then you must remember the days when you had to implement your own MN solver using VE (BTW, I still have mine on my GitHub). The only problem is that the ‘standard’ implementations used in the real world remain unknown to you. If you are not, then you must be interested in the PGM libraries, tools and applications used in the wild. In either case, this ‘tiny’ book works.

**P.S.** If you are looking for PGM in Java or other languages, this book will not help. As the title says, it uses Python, which is the most friendly and human-oriented programming language in the world (in my opinion :)