Making the Most of MCMC

Sometimes in grad school you need to write about topics that you yourself have little to no clue about. Part of this learning process is figuring out how to teach yourself some of these very difficult concepts. This post comes from one I co-wrote with my cohort chum, Justin.
By: Justin and Meridith

Markov Chains, and particularly Markov Chain Monte Carlo, are difficult concepts to explain. In fact, Dr. Hanks has stated that they are “Easier done than said.” At its most basic, a Markov Chain is a system that transitions from one state to another. It is a memoryless random process: the next state depends only on the current state and not on the sequence of events that preceded it. I have scoured the web and believe the following to be the simplest visual introduction to Markov Chains. (Spoiler Alert: It arose from someone, Andrey Markov, being a sassmaster.)
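
If you'd rather poke at the "memoryless" idea than read about it, here's a minimal sketch in Python. The two weather states and their transition probabilities are completely made up for illustration; the only point is that `step` looks at the current state and nothing else.

```python
import random

# Hypothetical two-state weather chain; states and probabilities are invented.
TRANSITIONS = {
    "sunny": {"sunny": 0.9, "rainy": 0.1},
    "rainy": {"sunny": 0.5, "rainy": 0.5},
}

def step(state, rng):
    """Draw the next state using only the current state (memorylessness)."""
    next_states = list(TRANSITIONS[state])
    weights = [TRANSITIONS[state][s] for s in next_states]
    return rng.choices(next_states, weights=weights)[0]

def simulate(start, n_steps, seed=0):
    """Return the list of visited states, beginning with `start`."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(n_steps):
        path.append(step(path[-1], rng))
    return path

path = simulate("rainy", 50_000)
share_sunny = path.count("sunny") / len(path)
print(f"long-run share of sunny days: {share_sunny:.3f}")
```

For these made-up numbers the long-run share of sunny days should settle near 5/6, no matter which state the chain starts in.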

From here, it’s easy to start to gain an appreciation for the wide breadth of applications available for Markov Chains. However, if we want to transition, as it were, from Markov Chains to Markov Chain Monte Carlo simulations, we must first explore Monte Carlo methods. These methods are a class of computational algorithms that use repeated random sampling (simulations) to approximate an unknown quantity, such as a probability distribution or an integral. The modern version of the Monte Carlo method was invented in the late 1940s by Stanislaw Ulam (coolest name ever), while he was working on nuclear weapons projects at the Los Alamos National Laboratory (think Manhattan Project). It was named by Nicholas Metropolis, after the Monte Carlo Casino, where Ulam’s uncle often gambled. Because reasons, apparently.

Peter Muller’s article gives a brief introduction to Markov Chain Monte Carlo simulation, a method that enables the simulation of Bayesian posterior distributions and thus facilitates the use of Bayesian inference. According to Muller, the goal of MCMC is to set up a Markov Chain, with some initial state, whose ergodic (long-run) distribution is the desired posterior. Loosely, ergodic here means that the chain can eventually get from any state to any other state, so that its long-run behavior does not depend on where it started. Starting at the initial state, transitions (from one state to another) are simulated and the simulated states recorded. The ergodic sample average of the simulated states will then converge to the value of the desired posterior integral. Muller notes that two conditions must be met in order to use the resulting integral estimates:

  1. As the number of simulated transitions, M, approaches infinity, the chain must converge to the desired posterior distribution.
  2. Some diagnostic must be found to determine when practical convergence occurs, i.e. when a sufficient number of simulations have been performed.
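
To make the simulate-and-record recipe a bit more concrete, here's a minimal sketch of a random-walk Metropolis sampler in Python. This is one common MCMC algorithm and not necessarily the one Muller's article works through. The target is a standard normal standing in for a posterior, and the step size, starting value, and burn-in length are all assumptions picked for illustration.

```python
import math
import random

def log_target(x):
    """Unnormalized log density of a standard normal (a stand-in posterior)."""
    return -0.5 * x * x

def metropolis(n_steps, start=5.0, step_size=2.0, seed=1):
    """Random-walk Metropolis: propose a nearby state, then accept or stay put."""
    rng = random.Random(seed)
    x = start
    chain = []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step_size)
        log_ratio = log_target(proposal) - log_target(x)
        # Accept with probability min(1, target(proposal) / target(x)).
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = proposal
        chain.append(x)  # record the state whether or not we moved
    return chain

chain = metropolis(50_000)
kept = chain[1_000:]  # drop early draws that still remember the start value

# Ergodic averages: plain sample averages over the recorded states.
mean_est = sum(kept) / len(kept)
second_moment_est = sum(x * x for x in kept) / len(kept)
print(f"estimated mean: {mean_est:.3f}")
print(f"estimated second moment: {second_moment_est:.3f}")
```

For this stand-in target the true mean is 0 and the true second moment is 1, so the two averages should land close to those values as the chain gets longer.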

The first prerequisite, theoretical convergence, can be reduced to meeting the following three criteria: irreducibility, aperiodicity, and invariance. The second and more ambiguous of the two conditions—a criterion for practical convergence—has several proposed solutions in the literature. For example, Gelman and Rubin (1992) developed an “ANOVA type statistic [for considering] several independent parallel runs of the MCMC simulation,” and Geweke (1992) has suggested a comparison of an early-iterations ergodic average to an ergodic average of later iterations. However, Muller also suggests the simpler method of visual diagnosis via plotting the states for each iteration against the iteration number to judge convergence. 
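
Here's a rough sketch of what two of those diagnostics can look like in Python. It is self-contained, so it carries its own toy Metropolis sampler aimed at a standard normal. The Gelman-Rubin function follows the usual between-chain versus within-chain variance comparison. The early-versus-late check is only a simplified nod to Geweke's idea (it compares raw averages and skips his standardized test statistic). The starting values, chain length, and burn-in are assumptions.

```python
import math
import random
import statistics

def metropolis(n_steps, start, step_size=2.0, seed=0):
    """Random-walk Metropolis targeting a standard normal (a stand-in posterior)."""
    rng = random.Random(seed)
    x, chain = start, []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step_size)
        log_ratio = 0.5 * (x * x - proposal * proposal)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = proposal
        chain.append(x)
    return chain

def gelman_rubin(chains):
    """Potential scale reduction factor for several equal-length chains."""
    n = len(chains[0])
    within = statistics.fmean(statistics.variance(c) for c in chains)
    between = n * statistics.variance([statistics.fmean(c) for c in chains])
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)

# Several independent runs from deliberately spread-out starting points.
starts = [-20.0, -10.0, 10.0, 20.0]
chains = [metropolis(5_000, s, seed=i) for i, s in enumerate(starts)]

r_hat_early = gelman_rubin([c[:10] for c in chains])   # chains still far apart
r_hat_late = gelman_rubin([c[500:] for c in chains])   # after dropping burn-in

# Simplified early-vs-late comparison on a single chain.
kept = chains[0][500:]
early_mean = statistics.fmean(kept[: len(kept) // 10])
late_mean = statistics.fmean(kept[len(kept) // 2 :])

print(f"R-hat, first 10 draws: {r_hat_early:.2f}")
print(f"R-hat, after burn-in:  {r_hat_late:.3f}")
print(f"early mean minus late mean: {early_mean - late_mean:.3f}")
```

A value of R-hat close to 1 suggests the parallel runs agree with each other, while a value well above 1 (like the one from the first few draws) says they haven't mixed yet. The visual check Muller suggests is just plotting each `chain` against its iteration number.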

Markov Chains and MCMC have many useful applications, ranging across a wide spectrum of fields. One such interesting application is in the game of baseball. When viewing a half-inning in a baseball game, there are 28 possible states based on the number of outs (0, 1, 2, or 3) and the runners on base (no runners, or runners on some combination of first, second, and third base). See http://www.pankin.com/markov/theory.htm for a more detailed description of the transition matrix. This gives us a 28×28 transition matrix filled with the probabilities of moving from each state to every other state. From here, we could calculate the expected value of runs scored from each state and analyze how this expectation changes from state to state. We could also extend this to analyzing the probability of scoring a single run by defining a slightly different transition matrix (again, see the link provided above for more detail). Due to their usefulness, Markov Chain methods have become a common tool for baseball analysts and sabermetricians (Editor’s Note: Totally had to Google sabermetrician. I’m feeling I got shortchanged a little in the job naming category).
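
To get a feel for the expected-runs calculation, here's a heavily simplified toy version in Python. It is not the model from the linked page: every batter either singles or makes an out, the probability of a single is a made-up number, every runner advances exactly one base on a single, and all three-out situations are lumped into one "inning over" state worth zero further runs. That leaves 24 live states instead of 28.

```python
from itertools import product

P_SINGLE = 0.3          # hypothetical chance that a batter singles
P_OUT = 1.0 - P_SINGLE  # otherwise the batter makes an out

BASES = list(product((0, 1), repeat=3))  # (first, second, third) occupied?
STATES = [(outs, bases) for outs in range(3) for bases in BASES]  # 24 live states

def expected_runs(n_sweeps=200):
    """Expected runs from each state until the third out, by repeated updating."""
    value = {state: 0.0 for state in STATES}
    for _ in range(n_sweeps):
        updated = {}
        for outs, (first, second, third) in STATES:
            bases = (first, second, third)
            # Out: same runners, one more out; the third out ends the half-inning.
            after_out = value[(outs + 1, bases)] if outs < 2 else 0.0
            # Single: runners move up one base and a runner on third scores.
            after_single = third + value[(outs, (1, first, second))]
            updated[(outs, bases)] = P_OUT * after_out + P_SINGLE * after_single
        value = updated
    return value

runs = expected_runs()
print(f"bases empty, 0 outs:  {runs[(0, (0, 0, 0))]:.3f} expected runs")
print(f"bases loaded, 2 outs: {runs[(2, (1, 1, 1))]:.3f} expected runs")
```

In this toy, bases loaded with two outs works out to exactly `P_SINGLE / P_OUT` expected runs, which makes a handy sanity check. A real analysis would estimate the full set of transition probabilities from play-by-play data.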

Another useful application is in the field of ecology. A useful paper, An Application of Markov Chain Monte Carlo to Community Ecology, serves as a wonderful walkthrough of MCMC with the easily conceptualized example of community assemblages (presence/absence) of birds among islands. As with our stated requirements for MCMC, the next state of the bird species’ distribution among a set of islands depends only on their current distribution. The article does a great job of connecting the dots from the ecological concept (birds disperse among islands, possibly due to competition) to an ecological question (given the starting state and some measure of competition among species, what is the probability that a random matrix will exhibit the same level of competition) to a mathematical challenge (applying MCMC), and then loops the results back around to answer the ecological question (the distribution of finch species among the Galapagos Islands shows evidence of competition!). If you’re feeling extra badass and want to make the jump from learning about MCMC to coding some examples in R, be sure to check out the following blog posts!
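
If you want a taste before heading to R, here's a small Python sketch of the kind of move such a chain can use. I'm assuming that the random matrices keep the same row totals (islands per species) and column totals (species per island) as the observed one, and that the chain moves by "checkerboard swaps" of 2x2 blocks. The tiny presence/absence matrix is invented, not real finch data.

```python
import random

def try_swap(matrix, rng):
    """Attempt one checkerboard swap in place; return True if the matrix changed."""
    r1, r2 = rng.sample(range(len(matrix)), 2)
    c1, c2 = rng.sample(range(len(matrix[0])), 2)
    a, b = matrix[r1][c1], matrix[r1][c2]
    c, d = matrix[r2][c1], matrix[r2][c2]
    # A swappable block looks like [[1, 0], [0, 1]] or [[0, 1], [1, 0]].
    if a == d and b == c and a != b:
        matrix[r1][c1], matrix[r1][c2] = b, a
        matrix[r2][c1], matrix[r2][c2] = d, c
        return True
    return False  # not swappable: the chain stays in its current state

def margins(matrix):
    """Row totals and column totals of a 0/1 matrix."""
    return [sum(row) for row in matrix], [sum(col) for col in zip(*matrix)]

# Hypothetical data: rows are species, columns are islands, 1 means present.
observed = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 1],
]

rng = random.Random(42)
current = [row[:] for row in observed]  # copy so the observed matrix is untouched
n_changed = sum(try_swap(current, rng) for _ in range(1_000))

print("margins preserved:", margins(current) == margins(observed))
print("swaps that changed the matrix:", n_changed)
```

Each step depends only on the current matrix, which is exactly the Markov property. To answer the ecological question you would compute a competition measure on each visited matrix and see where the observed value falls in that distribution.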

Tricks of the Trade: LaTeX

Ok, guys. I’ve been studying as a baby statistician (scienctician? statscientist? ecologitician?) for a little while now and I’m here to share some of their secrets. Before I started here at Penn State I had a couple ideas about what other grad students in my department would be like. First, everyone would be computer masters of any and all statistical programs: R, SAS, others that I hadn’t even heard of yet. Second, they’d all be completely on top of everything in all of our classes because they all would’ve completed undergraduate and master’s programs also in statistics. And third, it’d be really hard to relate to other students because of my background in biology and my love for the outdoors (because clearly they’d all prefer sitting inside in front of their computers, right?). Thankfully, I was way off base and not only am I not left in the educational dust, but my cohort is full of awesome students with a wide variety of strengths and abilities. And I must collect them all. Yeah, my new goal is to be like some sort of awesome Anna-Paquin-as-Rogue statistician and glean all of the amazing abilities and knowledge while I can. Except I think I’ll stick to taking the time to learn and practice things…instead of the whole touchy hurty thing she does. One of my absolute favorite new acquisitions is the ability to write code in LaTeX.


Another one of my pre-stats misconceptions was that whenever you saw an equation in a journal article it was created with Word’s super difficult equation editor.  Hopefully I’m not the only one who thought this, because now I feel really silly (Editor’s Note: I assumed mathematical witchcraft, so joke’s on me really.). LaTeX is a document preparation system for high-quality typesetting often used for technical or scientific documents. Long story short, you could be creating completely badass documents with lots of equations and badassery like these: [Homework with R Code, Homework with crazy stat stuff!]. I received my intro to LaTeX during one of the Cohort Workshops I have been arranging on Fridays for my department. Another grad student gave us a very brief introduction and showed us some of the basics. A few downloads, a bunch of googling, and several hours of practice later (not to mention an uninstall and redownload…) I was really starting to get the hang of it! Anyone who’s learning to program knows that you experience some of the most frustrating moments during that initial learning curve. WHY WON’T YOU JUST COMPILE AND SHOW ME A PDF OF MY NAME AND ‘HELLO WORLD’? I DID EXACTLY WHAT YOU TOLD ME…*deletes comma* Oh…well then. BEHOLD MY BRILLIANCE! FOR I HATH CREATED A MASTERPIECE!

I would like to encourage everyone to give it a go! I can answer basic questions, but I’ve found that the vast majority of my own beginner’s questions have been answered through a few key resources, including the Great Googily Moogily. Behold your starting point!

What to Download
  1. TeX – LaTeX is actually built on top of TeX, sort of like GitHub is built on top of Git (which I also am just beginning to understand!). So you’ll actually need to download TeX in order to run LaTeX. Unfortunately, there are slightly different versions for Windows and Mac users, but both deal with the same underlying program (if you run anything else, my apologies for being completely unaware of how to guide you).
  2. An Editor – The TeX download comes with everything you absolutely need, but I like using an editor for extra pretty colors and the option to code in other programming languages. I like working in Aquamacs, which is a Mac version of Emacs. (Update: I now use Sublime Text because Aquamacs kept giving me unhelpful error messages and I wasn’t having none of that.)

[Image: Full disclosure: this took me a WHILE!]

What to Try First
  1. Hello, World! – Your first task is to just compile and create a PDF file with the most basic of greetings. I used this website at Art of Problem Solving. Even so, I spent way too long before I got my first code to compile and produce a PDF. It’s a glorious achievement!
  2. Do a Homework in LaTeX – This is not applicable to everyone or to every class. But if you have a math or stats course where the homework isn’t too intensive, consider completing it in LaTeX! One of my professors even wrote lots of handy coding tips on one of my homeworks that helped me a lot the next time around. I love being able to feel accomplished at writing up a nice, clean-looking final version even if the homework is crazy difficult. Helps me keep those imposter thoughts at bay!


Next Level Stuff
  1. Update your CV – This was one of my recommendations for our Motivation blog post last week. I used this one from Bradley P. Carlin and you can check out my final form!
  2. Write and submit your next manuscript using LaTeX! – Now, I’m nowhere near this stage of my program, but I’d wager that there are quite a few templates or style formatting guidelines available for submitting a paper using LaTeX! Go, go, go!
  3. Combine with RStudio to work with knitr and Sweave to produce LaTeX documents with R code and results spliced in!


Basically, part of my grand PhD scheme is to master a lot of the computing and presentation side of statistics so that I will be a valuable asset and worthy of ALL the jobs. Having at least a few options after graduating will be worth the toiling away to find that stray comma or misspelled command. Now that you’ve heard about my favorite new tool so far, please share yours! Or even if your favorite is also LaTeX, tell me all the little tricks you’ve picked up! I want all the tricks!