diff --git a/06_StatisticalInference/01_01_Introduction/index.Rmd b/06_StatisticalInference/01_01_Introduction/index.Rmd
index 74e8b2a1e..3b5e98989 100644
--- a/06_StatisticalInference/01_01_Introduction/index.Rmd
+++ b/06_StatisticalInference/01_01_Introduction/index.Rmd
@@ -1,158 +1,158 @@
----
-title       : Introduction to statistical inference
-subtitle    : Statistical inference
-author      : Brian Caffo, Jeff Leek, Roger Peng
-job         : Johns Hopkins Bloomberg School of Public Health
-logo        : bloomberg_shield.png
-framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
-highlighter : highlight.js  # {highlight.js, prettify, highlight}
-hitheme     : tomorrow      # 
-url:
-  lib: ../../librariesNew
-  assets: ../../assets
-widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
-mode        : selfcontained # {standalone, draft}
----
-## Statistical inference defined
-
-Statistical inference is the process of drawing formal conclusions from
-data. 
-
-In our class, we wil define formal statistical inference as settings where one wants to infer facts about a population using noisy
-statistical data where uncertainty must be accounted for.
-
----
-
-## Motivating example: who's going to win the election?
-
-In every major election, pollsters would like to know, ahead of the
-actual election, who's going to win. Here, the target of
-estimation (the estimand) is clear, the percentage of people in 
-a particular group (city, state, county, country or other electoral
-grouping) who will vote for each candidate.
-
-We can not poll everyone. Even if we could, some polled 
-may change their vote by the time the election occurs.
-How do we collect a reasonable subset of data and quantify the
-uncertainty in the process to produce a good guess at who will win?
-
----
-
-## Motivating example: is hormone replacement therapy effective? 
-
-A large clinical trial (the Women’s Health Initiative) published results in 2002 that contradicted prior evidence on the efficacy of hormone replacement therapy for post menopausal women and suggested a negative impact of HRT for several key health outcomes. **Based on a statistically based protocol, the study was stopped early due an excess number of negative events.**
-
-Here's there's two inferential problems. 
-
-1. Is HRT effective?
-2. How long should we continue the trial in the presence of contrary
-evidence?
-
-See WHI writing group paper JAMA 2002, Vol 288:321 - 333. for the paper and Steinkellner et al. Menopause 2012, Vol 19:616 621 for adiscussion of the long term impacts
-
----
-
-## Motivating example: ECMO
-
-In 1985 a group at a major neonatal intensive care center published the results of a trial comparing a standard treatment and a promising new extracorporeal membrane oxygenation treatment (ECMO) for newborn infants with severe respiratory failure. **Ethical considerations lead to a statistical randomization scheme whereby one infant received the control therapy, thereby opening the study to sample-size based criticisms.**
-
-For a review and statistical discussion, see Royall Statistical Science 1991, Vol 6, No. 1, 52-88
-
----
-
-## Summary
-
-- These examples illustrate many of the difficulties of trying
-to use data to create general conclusions about a population.
-- Paramount among our concerns are:
-  - Is the sample representative of the population that we'd like to draw inferences about?
-  - Are there known and observed, known and unobserved or unknown and unobserved variables that contaminate our conclusions?
-  - Is there systematic bias created by missing data or the design or conduct of the study?
-  - What randomness exists in the data and how do we use or adjust for it? Here randomness can either be explicit via randomization
-or random sampling, or implicit as the aggregation of many complex uknown processes.
-  - Are we trying to estimate an underlying mechanistic model of phenomena under study?
-- Statistical inference requires navigating the set of assumptions and
-tools and subsequently thinking about how to draw conclusions from data.
-
---- 
-## Example goals of inference
-
-1. Estimate and quantify the uncertainty of an estimate of 
-a population quantity (the proportion of people who will
-  vote for a candidate).
-2. Determine whether a population quantity 
-  is a benchmark value ("is the treatment effective?").
-3. Infer a mechanistic relationship when quantities are measured with
-  noise ("What is the slope for Hooke's law?")
-4. Determine the impact of a policy? ("If we reduce polution levels,
-  will asthma rates decline?")
-
-
----
-## Example tools of the trade 
-
-1. Randomization: concerned with balancing unobserved variables that may confound inferences of interest
-2. Random sampling: concerned with obtaining data that is representative 
-of the population of interest
-3. Sampling models: concerned with creating a model for the sampling
-process, the most common is so called "iid".
-4. Hypothesis testing: concerned with decision making in the presence of uncertainty
-5. Confidence intervals: concerned with quantifying uncertainty in 
-estimation
-6. Probability models: a formal connection between the data and a population of interest. Often probability models are assumed or are
-approximated.
-7. Study design: the process of designing an experiment to minimize biases and variability.
-8. Nonparametric bootstrapping: the process of using the data to,
-  with minimal probability model assumptions, create inferences.
-9. Permutation, randomization and exchangeability testing: the process 
-of using data permutations to perform inferences.
-
----
-## Different thinking about probability leads to different styles of inference
-
-We won't spend too much time talking about this, but there are several different
-styles of inference. Two broad categories that get discussed a lot are:
-
-1. Frequency probability: is the long run proportion of
- times an event occurs in independent, identically distributed 
- repetitions.
-2. Frequency inference: uses frequency interpretations of probabilities
-to control error rates. Answers questions like "What should I decide
-given my data controlling the long run proportion of mistakes I make at
-a tolerable level."
-3. Bayesian probability: is the probability calculus of beliefs, given that beliefs follow certain rules.
-4. Bayesian inference: the use of Bayesian probability representation
-of beliefs to perform inference. Answers questions like "Given my subjective beliefs and the objective information from the data, what
-should I believe now?"
-
-Data scientists tend to fall within shades of gray of these and various other schools of inference. 
-
----
-## In this class
-
-* In this class, we will primarily focus on basic sampling models, 
-basic probability models and frequency style analyses
-to create standard inferences. 
-* Being data scientists,  we will also consider some inferential strategies that  rely heavily on the observed data, such as permutation testing
-and bootstrapping.
-* As probability modeling will be our starting point, we first build
-up basic probability.
-
----
-## Where to learn more on the topics not covered
-
-1. Explicit use of random sampling in inferences: look in references
-on "finite population statistics". Used heavily in polling and
-sample surveys.
-2. Explicit use of randomization in inferences: look in references
-on "causal inference" especially in clinical trials.
-3. Bayesian probability and Bayesian statistics: look for basic itroductory books (there are many).
-4. Missing data: well covered in biostatistics and econometric
-references; look for references to "multiple imputation", a popular tool for
-addressing missing data.
-5. Study design: consider looking in the subject matter area that
-  you are interested in; some examples with rich histories in design:
-  1. The epidemiological literature is very focused on using study design to investigate public health.
-  2. The classical development of study design in agriculture broadly covers design and design principles.
-  3. The industrial quality control literature covers design thoroughly.
- 
+---
+title       : Introduction to statistical inference
+subtitle    : Statistical inference
+author      : Brian Caffo, Jeff Leek, Roger Peng
+job         : Johns Hopkins Bloomberg School of Public Health
+logo        : bloomberg_shield.png
+framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
+highlighter : highlight.js  # {highlight.js, prettify, highlight}
+hitheme     : tomorrow      # 
+url:
+  lib: ../../librariesNew
+  assets: ../../assets
+widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
+mode        : selfcontained # {standalone, draft}
+---
+## Statistical inference defined
+
+Statistical inference is the process of drawing formal conclusions from
+data. 
+
+In our class, we will define formal statistical inference as settings where one wants to infer facts about a population using noisy
+statistical data where uncertainty must be accounted for.
+
+---
+
+## Motivating example: who's going to win the election?
+
+In every major election, pollsters would like to know, ahead of the
+actual election, who's going to win. Here, the target of
+estimation (the estimand) is clear: the percentage of people in 
+a particular group (city, state, county, country or other electoral
+grouping) who will vote for each candidate.
+
+We cannot poll everyone. Even if we could, some of those polled 
+may change their vote by the time the election occurs.
+How do we collect a reasonable subset of data and quantify the
+uncertainty in the process to produce a good guess at who will win?
+
+---
+
+## Motivating example: is hormone replacement therapy effective? 
+
+A large clinical trial (the Women’s Health Initiative) published results in 2002 that contradicted prior evidence on the efficacy of hormone replacement therapy (HRT) for postmenopausal women and suggested a negative impact of HRT for several key health outcomes. **Based on a statistically based protocol, the study was stopped early due to an excess number of negative events.**
+
+Here there are two inferential problems. 
+
+1. Is HRT effective?
+2. How long should we continue the trial in the presence of contrary
+evidence?
+
+See WHI writing group, JAMA 2002, Vol 288:321-333 for the paper and Steinkellner et al., Menopause 2012, Vol 19:616-621 for a discussion of the long-term impacts.
+
+---
+
+## Motivating example: ECMO
+
+In 1985 a group at a major neonatal intensive care center published the results of a trial comparing a standard treatment and a promising new extracorporeal membrane oxygenation treatment (ECMO) for newborn infants with severe respiratory failure. **Ethical considerations led to a statistical randomization scheme whereby one infant received the control therapy, thereby opening the study to sample-size based criticisms.**
+
+For a review and statistical discussion, see Royall Statistical Science 1991, Vol 6, No. 1, 52-88
+
+---
+
+## Summary
+
+- These examples illustrate many of the difficulties of trying
+to use data to create general conclusions about a population.
+- Paramount among our concerns are:
+  - Is the sample representative of the population that we'd like to draw inferences about?
+  - Are there known and observed, known and unobserved or unknown and unobserved variables that contaminate our conclusions?
+  - Is there systematic bias created by missing data or the design or conduct of the study?
+  - What randomness exists in the data and how do we use or adjust for it? Here randomness can either be explicit via randomization
+or random sampling, or implicit as the aggregation of many complex unknown processes.
+  - Are we trying to estimate an underlying mechanistic model of phenomena under study?
+- Statistical inference requires navigating the set of assumptions and
+tools and subsequently thinking about how to draw conclusions from data.
+
+--- 
+## Example goals of inference
+
+1. Estimate and quantify the uncertainty of an estimate of 
+a population quantity (the proportion of people who will
+  vote for a candidate).
+2. Determine whether a population quantity 
+  is a benchmark value ("is the treatment effective?").
+3. Infer a mechanistic relationship when quantities are measured with
+  noise ("What is the slope for Hooke's law?")
+4. Determine the impact of a policy ("If we reduce pollution levels,
+  will asthma rates decline?")
+
+
+---
+## Example tools of the trade 
+
+1. Randomization: concerned with balancing unobserved variables that may confound inferences of interest
+2. Random sampling: concerned with obtaining data that is representative 
+of the population of interest
+3. Sampling models: concerned with creating a model for the sampling
+process; the most common is the so-called "iid" model.
+4. Hypothesis testing: concerned with decision making in the presence of uncertainty
+5. Confidence intervals: concerned with quantifying uncertainty in 
+estimation
+6. Probability models: a formal connection between the data and a population of interest. Often probability models are assumed or are
+approximated.
+7. Study design: the process of designing an experiment to minimize biases and variability.
+8. Nonparametric bootstrapping: the process of using the data to,
+  with minimal probability model assumptions, create inferences.
+9. Permutation, randomization and exchangeability testing: the process 
+of using data permutations to perform inferences.
+
+---
+## Different thinking about probability leads to different styles of inference
+
+We won't spend too much time talking about this, but there are several different
+styles of inference. Two broad categories that get discussed a lot are:
+
+1. Frequency probability: is the long run proportion of
+ times an event occurs in independent, identically distributed 
+ repetitions.
+2. Frequency inference: uses frequency interpretations of probabilities
+to control error rates. Answers questions like "What should I decide
+given my data controlling the long run proportion of mistakes I make at
+a tolerable level?"
+3. Bayesian probability: is the probability calculus of beliefs, given that beliefs follow certain rules.
+4. Bayesian inference: the use of Bayesian probability representation
+of beliefs to perform inference. Answers questions like "Given my subjective beliefs and the objective information from the data, what
+should I believe now?"
+
+Data scientists tend to fall within shades of gray of these and various other schools of inference. 
+
+---
+## In this class
+
+* In this class, we will primarily focus on basic sampling models, 
+basic probability models and frequency style analyses
+to create standard inferences. 
+* Being data scientists,  we will also consider some inferential strategies that  rely heavily on the observed data, such as permutation testing
+and bootstrapping.
+* As probability modeling will be our starting point, we first build
+up basic probability.
+
+---
+## Where to learn more on the topics not covered
+
+1. Explicit use of random sampling in inferences: look in references
+on "finite population statistics". Used heavily in polling and
+sample surveys.
+2. Explicit use of randomization in inferences: look in references
+on "causal inference" especially in clinical trials.
+3. Bayesian probability and Bayesian statistics: look for basic introductory books (there are many).
+4. Missing data: well covered in biostatistics and econometric
+references; look for references to "multiple imputation", a popular tool for
+addressing missing data.
+5. Study design: consider looking in the subject matter area that
+  you are interested in; some examples with rich histories in design:
+  1. The epidemiological literature is very focused on using study design to investigate public health.
+  2. The classical development of study design in agriculture broadly covers design and design principles.
+  3. The industrial quality control literature covers design thoroughly.
+
diff --git a/06_StatisticalInference/01_01_Introduction/index.html b/06_StatisticalInference/01_01_Introduction/index.html
index 391528189..f5b9aad35 100644
--- a/06_StatisticalInference/01_01_Introduction/index.html
+++ b/06_StatisticalInference/01_01_Introduction/index.html
@@ -1,358 +1,358 @@
-<!DOCTYPE html>
-<html>
-<head>
-  <title>Introduction to statistical inference</title>
-  <meta charset="utf-8">
-  <meta name="description" content="Introduction to statistical inference">
-  <meta name="author" content="Brian Caffo, Jeff Leek, Roger Peng">
-  <meta name="generator" content="slidify" />
-  <meta name="apple-mobile-web-app-capable" content="yes">
-  <meta http-equiv="X-UA-Compatible" content="chrome=1">
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/default.css" media="all" >
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/phone.css" 
-    media="only screen and (max-device-width: 480px)" >
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/slidify.css" >
-  <link rel="stylesheet" href="../../librariesNew/highlighters/highlight.js/css/tomorrow.css" />
-  <base target="_blank"> <!-- This amazingness opens all links in a new tab. -->  
-  
-  <!-- Grab CDN jQuery, fall back to local if offline -->
-  <script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.7.min.js"></script>
-  <script>window.jQuery || document.write('<script src="../../librariesNew/widgets/quiz/js/jquery.js"><\/script>')</script> 
-  <script data-main="../../librariesNew/frameworks/io2012/js/slides" 
-    src="../../librariesNew/frameworks/io2012/js/require-1.0.8.min.js">
-  </script>
-  
-  
-
-</head>
-<body style="opacity: 0">
-  <slides class="layout-widescreen">
-    
-    <!-- LOGO SLIDE -->
-        <slide class="title-slide segue nobackground">
-  <aside class="gdbar">
-    <img src="../../assets/img/bloomberg_shield.png">
-  </aside>
-  <hgroup class="auto-fadein">
-    <h1>Introduction to statistical inference</h1>
-    <h2>Statistical inference</h2>
-    <p>Brian Caffo, Jeff Leek, Roger Peng<br/>Johns Hopkins Bloomberg School of Public Health</p>
-  </hgroup>
-  <article></article>  
-</slide>
-    
-
-    <!-- SLIDES -->
-    <slide class="" id="slide-1" style="background:;">
-  <hgroup>
-    <h2>Statistical inference defined</h2>
-  </hgroup>
-  <article data-timings="">
-    <p>Statistical inference is the process of drawing formal conclusions from
-data. </p>
-
-<p>In our class, we wil define formal statistical inference as settings where one wants to infer facts about a population using noisy
-statistical data where uncertainty must be accounted for.</p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-2" style="background:;">
-  <hgroup>
-    <h2>Motivating example: who&#39;s going to win the election?</h2>
-  </hgroup>
-  <article data-timings="">
-    <p>In every major election, pollsters would like to know, ahead of the
-actual election, who&#39;s going to win. Here, the target of
-estimation (the estimand) is clear, the percentage of people in 
-a particular group (city, state, county, country or other electoral
-grouping) who will vote for each candidate.</p>
-
-<p>We can not poll everyone. Even if we could, some polled 
-may change their vote by the time the election occurs.
-How do we collect a reasonable subset of data and quantify the
-uncertainty in the process to produce a good guess at who will win?</p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-3" style="background:;">
-  <hgroup>
-    <h2>Motivating example: is hormone replacement therapy effective?</h2>
-  </hgroup>
-  <article data-timings="">
-    <p>A large clinical trial (the Women’s Health Initiative) published results in 2002 that contradicted prior evidence on the efficacy of hormone replacement therapy for post menopausal women and suggested a negative impact of HRT for several key health outcomes. <strong>Based on a statistically based protocol, the study was stopped early due an excess number of negative events.</strong></p>
-
-<p>Here&#39;s there&#39;s two inferential problems. </p>
-
-<ol>
-<li>Is HRT effective?</li>
-<li>How long should we continue the trial in the presence of contrary
-evidence?</li>
-</ol>
-
-<p>See WHI writing group paper JAMA 2002, Vol 288:321 - 333. for the paper and Steinkellner et al. Menopause 2012, Vol 19:616 621 for adiscussion of the long term impacts</p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-4" style="background:;">
-  <hgroup>
-    <h2>Motivating example: ECMO</h2>
-  </hgroup>
-  <article data-timings="">
-    <p>In 1985 a group at a major neonatal intensive care center published the results of a trial comparing a standard treatment and a promising new extracorporeal membrane oxygenation treatment (ECMO) for newborn infants with severe respiratory failure. <strong>Ethical considerations lead to a statistical randomization scheme whereby one infant received the control therapy, thereby opening the study to sample-size based criticisms.</strong></p>
-
-<p>For a review and statistical discussion, see Royall Statistical Science 1991, Vol 6, No. 1, 52-88</p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-5" style="background:;">
-  <hgroup>
-    <h2>Summary</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>These examples illustrate many of the difficulties of trying
-to use data to create general conclusions about a population.</li>
-<li>Paramount among our concerns are:
-
-<ul>
-<li>Is the sample representative of the population that we&#39;d like to draw inferences about?</li>
-<li>Are there known and observed, known and unobserved or unknown and unobserved variables that contaminate our conclusions?</li>
-<li>Is there systematic bias created by missing data or the design or conduct of the study?</li>
-<li>What randomness exists in the data and how do we use or adjust for it? Here randomness can either be explicit via randomization
-or random sampling, or implicit as the aggregation of many complex uknown processes.</li>
-<li>Are we trying to estimate an underlying mechanistic model of phenomena under study?</li>
-</ul></li>
-<li>Statistical inference requires navigating the set of assumptions and
-tools and subsequently thinking about how to draw conclusions from data.</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-6" style="background:;">
-  <hgroup>
-    <h2>Example goals of inference</h2>
-  </hgroup>
-  <article data-timings="">
-    <ol>
-<li>Estimate and quantify the uncertainty of an estimate of 
-a population quantity (the proportion of people who will
-vote for a candidate).</li>
-<li>Determine whether a population quantity 
-is a benchmark value (&quot;is the treatment effective?&quot;).</li>
-<li>Infer a mechanistic relationship when quantities are measured with
-noise (&quot;What is the slope for Hooke&#39;s law?&quot;)</li>
-<li>Determine the impact of a policy? (&quot;If we reduce polution levels,
-will asthma rates decline?&quot;)</li>
-</ol>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-7" style="background:;">
-  <hgroup>
-    <h2>Example tools of the trade</h2>
-  </hgroup>
-  <article data-timings="">
-    <ol>
-<li>Randomization: concerned with balancing unobserved variables that may confound inferences of interest</li>
-<li>Random sampling: concerned with obtaining data that is representative 
-of the population of interest</li>
-<li>Sampling models: concerned with creating a model for the sampling
-process, the most common is so called &quot;iid&quot;.</li>
-<li>Hypothesis testing: concerned with decision making in the presence of uncertainty</li>
-<li>Confidence intervals: concerned with quantifying uncertainty in 
-estimation</li>
-<li>Probability models: a formal connection between the data and a population of interest. Often probability models are assumed or are
-approximated.</li>
-<li>Study design: the process of designing an experiment to minimize biases and variability.</li>
-<li>Nonparametric bootstrapping: the process of using the data to,
-with minimal probability model assumptions, create inferences.</li>
-<li>Permutation, randomization and exchangeability testing: the process 
-of using data permutations to perform inferences.</li>
-</ol>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-8" style="background:;">
-  <hgroup>
-    <h2>Different thinking about probability leads to different styles of inference</h2>
-  </hgroup>
-  <article data-timings="">
-    <p>We won&#39;t spend too much time talking about this, but there are several different
-styles of inference. Two broad categories that get discussed a lot are:</p>
-
-<ol>
-<li>Frequency probability: is the long run proportion of
-times an event occurs in independent, identically distributed 
-repetitions.</li>
-<li>Frequency inference: uses frequency interpretations of probabilities
-to control error rates. Answers questions like &quot;What should I decide
-given my data controlling the long run proportion of mistakes I make at
-a tolerable level.&quot;</li>
-<li>Bayesian probability: is the probability calculus of beliefs, given that beliefs follow certain rules.</li>
-<li>Bayesian inference: the use of Bayesian probability representation
-of beliefs to perform inference. Answers questions like &quot;Given my subjective beliefs and the objective information from the data, what
-should I believe now?&quot;</li>
-</ol>
-
-<p>Data scientists tend to fall within shades of gray of these and various other schools of inference. </p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-9" style="background:;">
-  <hgroup>
-    <h2>In this class</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>In this class, we will primarily focus on basic sampling models, 
-basic probability models and frequency style analyses
-to create standard inferences. </li>
-<li>Being data scientists,  we will also consider some inferential strategies that  rely heavily on the observed data, such as permutation testing
-and bootstrapping.</li>
-<li>As probability modeling will be our starting point, we first build
-up basic probability.</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-10" style="background:;">
-  <hgroup>
-    <h2>Where to learn more on the topics not covered</h2>
-  </hgroup>
-  <article data-timings="">
-    <ol>
-<li>Explicit use of random sampling in inferences: look in references
-on &quot;finite population statistics&quot;. Used heavily in polling and
-sample surveys.</li>
-<li>Explicit use of randomization in inferences: look in references
-on &quot;causal inference&quot; especially in clinical trials.</li>
-<li>Bayesian probability and Bayesian statistics: look for basic itroductory books (there are many).</li>
-<li>Missing data: well covered in biostatistics and econometric
-references; look for references to &quot;multiple imputation&quot;, a popular tool for
-addressing missing data.</li>
-<li>Study design: consider looking in the subject matter area that
-you are interested in; some examples with rich histories in design:
-
-<ol>
-<li>The epidemiological literature is very focused on using study design to investigate public health.</li>
-<li>The classical development of study design in agriculture broadly covers design and design principles.</li>
-<li>The industrial quality control literature covers design thoroughly.</li>
-</ol></li>
-</ol>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-    <slide class="backdrop"></slide>
-  </slides>
-  <div class="pagination pagination-small" id='io2012-ptoc' style="display:none;">
-    <ul>
-      <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=1 title='Statistical inference defined'>
-         1
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=2 title='Motivating example: who&#39;s going to win the election?'>
-         2
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=3 title='Motivating example: is hormone replacement therapy effective?'>
-         3
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=4 title='Motivating example: ECMO'>
-         4
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=5 title='Summary'>
-         5
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=6 title='Example goals of inference'>
-         6
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=7 title='Example tools of the trade'>
-         7
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=8 title='Different thinking about probability leads to different styles of inference'>
-         8
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=9 title='In this class'>
-         9
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=10 title='Where to learn more on the topics not covered'>
-         10
-      </a>
-    </li>
-  </ul>
-  </div>  <!--[if IE]>
-    <script 
-      src="http://ajax.googleapis.com/ajax/libs/chrome-frame/1/CFInstall.min.js">  
-    </script>
-    <script>CFInstall.check({mode: 'overlay'});</script>
-  <![endif]-->
-</body>
-  <!-- Load Javascripts for Widgets -->
-  
-  <!-- MathJax: Fall back to local if CDN offline but local image fonts are not supported (saves >100MB) -->
-  <script type="text/x-mathjax-config">
-    MathJax.Hub.Config({
-      tex2jax: {
-        inlineMath: [['$','$'], ['\\(','\\)']],
-        processEscapes: true
-      }
-    });
-  </script>
-  <script type="text/javascript" src="http://cdn.mathjax.org/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
-  <!-- <script src="https://c328740.ssl.cf1.rackcdn.com/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
-  </script> -->
-  <script>window.MathJax || document.write('<script type="text/x-mathjax-config">MathJax.Hub.Config({"HTML-CSS":{imageFont:null}});<\/script><script src="../../librariesNew/widgets/mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"><\/script>')
-</script>
-<!-- LOAD HIGHLIGHTER JS FILES -->
-  <script src="../../librariesNew/highlighters/highlight.js/highlight.pack.js"></script>
-  <script>hljs.initHighlightingOnLoad();</script>
-  <!-- DONE LOADING HIGHLIGHTER JS FILES -->
-   
+<!DOCTYPE html>
+<html>
+<head>
+  <title>Introduction to statistical inference</title>
+  <meta charset="utf-8">
+  <meta name="description" content="Introduction to statistical inference">
+  <meta name="author" content="Brian Caffo, Jeff Leek, Roger Peng">
+  <meta name="generator" content="slidify" />
+  <meta name="apple-mobile-web-app-capable" content="yes">
+  <meta http-equiv="X-UA-Compatible" content="chrome=1">
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/default.css" media="all" >
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/phone.css" 
+    media="only screen and (max-device-width: 480px)" >
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/slidify.css" >
+  <link rel="stylesheet" href="../../librariesNew/highlighters/highlight.js/css/tomorrow.css" />
+  <base target="_blank"> <!-- This amazingness opens all links in a new tab. -->  
+  
+  <!-- Grab CDN jQuery, fall back to local if offline -->
+  <script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.7.min.js"></script>
+  <script>window.jQuery || document.write('<script src="../../librariesNew/widgets/quiz/js/jquery.js"><\/script>')</script> 
+  <script data-main="../../librariesNew/frameworks/io2012/js/slides" 
+    src="../../librariesNew/frameworks/io2012/js/require-1.0.8.min.js">
+  </script>
+  
+  
+
+</head>
+<body style="opacity: 0">
+  <slides class="layout-widescreen">
+    
+    <!-- LOGO SLIDE -->
+        <slide class="title-slide segue nobackground">
+  <aside class="gdbar">
+    <img src="../../assets/img/bloomberg_shield.png">
+  </aside>
+  <hgroup class="auto-fadein">
+    <h1>Introduction to statistical inference</h1>
+    <h2>Statistical inference</h2>
+    <p>Brian Caffo, Jeff Leek, Roger Peng<br/>Johns Hopkins Bloomberg School of Public Health</p>
+  </hgroup>
+  <article></article>  
+</slide>
+    
+
+    <!-- SLIDES -->
+    <slide class="" id="slide-1" style="background:;">
+  <hgroup>
+    <h2>Statistical inference defined</h2>
+  </hgroup>
+  <article data-timings="">
+    <p>Statistical inference is the process of drawing formal conclusions from
+data. </p>
+
+<p>In our class, we will define formal statistical inference as settings where one wants to infer facts about a population using noisy
+statistical data where uncertainty must be accounted for.</p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-2" style="background:;">
+  <hgroup>
+    <h2>Motivating example: who&#39;s going to win the election?</h2>
+  </hgroup>
+  <article data-timings="">
+    <p>In every major election, pollsters would like to know, ahead of the
+actual election, who&#39;s going to win. Here, the target of
+estimation (the estimand) is clear: the percentage of people in 
+a particular group (city, state, county, country or other electoral
+grouping) who will vote for each candidate.</p>
+
+<p>We cannot poll everyone. Even if we could, some of those polled 
+may change their vote by the time the election occurs.
+How do we collect a reasonable subset of data and quantify the
+uncertainty in the process to produce a good guess at who will win?</p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-3" style="background:;">
+  <hgroup>
+    <h2>Motivating example: is hormone replacement therapy effective?</h2>
+  </hgroup>
+  <article data-timings="">
+    <p>A large clinical trial (the Women’s Health Initiative) published results in 2002 that contradicted prior evidence on the efficacy of hormone replacement therapy for postmenopausal women and suggested a negative impact of HRT for several key health outcomes. <strong>Based on a statistically based protocol, the study was stopped early due to an excess number of negative events.</strong></p>
+
+<p>Here there are two inferential problems. </p>
+
+<ol>
+<li>Is HRT effective?</li>
+<li>How long should we continue the trial in the presence of contrary
+evidence?</li>
+</ol>
+
+<p>See the WHI writing group paper, JAMA 2002, Vol 288:321-333, for the paper and Steinkellner et al., Menopause 2012, Vol 19:616-621, for a discussion of the long-term impacts.</p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-4" style="background:;">
+  <hgroup>
+    <h2>Motivating example: ECMO</h2>
+  </hgroup>
+  <article data-timings="">
+    <p>In 1985 a group at a major neonatal intensive care center published the results of a trial comparing a standard treatment and a promising new extracorporeal membrane oxygenation treatment (ECMO) for newborn infants with severe respiratory failure. <strong>Ethical considerations led to a statistical randomization scheme whereby only one infant received the control therapy, thereby opening the study to sample-size based criticisms.</strong></p>
+
+<p>For a review and statistical discussion, see Royall Statistical Science 1991, Vol 6, No. 1, 52-88</p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-5" style="background:;">
+  <hgroup>
+    <h2>Summary</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>These examples illustrate many of the difficulties of trying
+to use data to create general conclusions about a population.</li>
+<li>Paramount among our concerns are:
+
+<ul>
+<li>Is the sample representative of the population that we&#39;d like to draw inferences about?</li>
+<li>Are there known and observed, known and unobserved or unknown and unobserved variables that contaminate our conclusions?</li>
+<li>Is there systematic bias created by missing data or the design or conduct of the study?</li>
+<li>What randomness exists in the data and how do we use or adjust for it? Here randomness can either be explicit via randomization
+or random sampling, or implicit as the aggregation of many complex unknown processes.</li>
+<li>Are we trying to estimate an underlying mechanistic model of phenomena under study?</li>
+</ul></li>
+<li>Statistical inference requires navigating the set of assumptions and
+tools and subsequently thinking about how to draw conclusions from data.</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-6" style="background:;">
+  <hgroup>
+    <h2>Example goals of inference</h2>
+  </hgroup>
+  <article data-timings="">
+    <ol>
+<li>Estimate and quantify the uncertainty of an estimate of 
+a population quantity (the proportion of people who will
+vote for a candidate).</li>
+<li>Determine whether a population quantity 
+is a benchmark value (&quot;is the treatment effective?&quot;).</li>
+<li>Infer a mechanistic relationship when quantities are measured with
+noise (&quot;What is the slope for Hooke&#39;s law?&quot;)</li>
+<li>Determine the impact of a policy (&quot;If we reduce pollution levels,
+will asthma rates decline?&quot;)</li>
+</ol>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-7" style="background:;">
+  <hgroup>
+    <h2>Example tools of the trade</h2>
+  </hgroup>
+  <article data-timings="">
+    <ol>
+<li>Randomization: concerned with balancing unobserved variables that may confound inferences of interest</li>
+<li>Random sampling: concerned with obtaining data that is representative 
+of the population of interest</li>
+<li>Sampling models: concerned with creating a model for the sampling
+process; the most common is the so-called &quot;iid&quot; model.</li>
+<li>Hypothesis testing: concerned with decision making in the presence of uncertainty</li>
+<li>Confidence intervals: concerned with quantifying uncertainty in 
+estimation</li>
+<li>Probability models: a formal connection between the data and a population of interest. Often probability models are assumed or are
+approximated.</li>
+<li>Study design: the process of designing an experiment to minimize biases and variability.</li>
+<li>Nonparametric bootstrapping: the process of using the data to,
+with minimal probability model assumptions, create inferences.</li>
+<li>Permutation, randomization and exchangeability testing: the process 
+of using data permutations to perform inferences.</li>
+</ol>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-8" style="background:;">
+  <hgroup>
+    <h2>Different thinking about probability leads to different styles of inference</h2>
+  </hgroup>
+  <article data-timings="">
+    <p>We won&#39;t spend too much time talking about this, but there are several different
+styles of inference. Two broad categories that get discussed a lot are:</p>
+
+<ol>
+<li>Frequency probability: is the long run proportion of
+times an event occurs in independent, identically distributed 
+repetitions.</li>
+<li>Frequency inference: uses frequency interpretations of probabilities
+to control error rates. Answers questions like &quot;What should I decide,
+given my data, while controlling the long run proportion of mistakes I make at
+a tolerable level?&quot;</li>
+<li>Bayesian probability: is the probability calculus of beliefs, given that beliefs follow certain rules.</li>
+<li>Bayesian inference: the use of Bayesian probability representation
+of beliefs to perform inference. Answers questions like &quot;Given my subjective beliefs and the objective information from the data, what
+should I believe now?&quot;</li>
+</ol>
+
+<p>Data scientists tend to fall within shades of gray of these and various other schools of inference. </p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-9" style="background:;">
+  <hgroup>
+    <h2>In this class</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>In this class, we will primarily focus on basic sampling models, 
+basic probability models and frequency style analyses
+to create standard inferences. </li>
+<li>Being data scientists,  we will also consider some inferential strategies that  rely heavily on the observed data, such as permutation testing
+and bootstrapping.</li>
+<li>As probability modeling will be our starting point, we first build
+up basic probability.</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-10" style="background:;">
+  <hgroup>
+    <h2>Where to learn more on the topics not covered</h2>
+  </hgroup>
+  <article data-timings="">
+    <ol>
+<li>Explicit use of random sampling in inferences: look in references
+on &quot;finite population statistics&quot;. Used heavily in polling and
+sample surveys.</li>
+<li>Explicit use of randomization in inferences: look in references
+on &quot;causal inference&quot; especially in clinical trials.</li>
+<li>Bayesian probability and Bayesian statistics: look for basic introductory books (there are many).</li>
+<li>Missing data: well covered in biostatistics and econometric
+references; look for references to &quot;multiple imputation&quot;, a popular tool for
+addressing missing data.</li>
+<li>Study design: consider looking in the subject matter area that
+you are interested in; some examples with rich histories in design:
+
+<ol>
+<li>The epidemiological literature is very focused on using study design to investigate public health.</li>
+<li>The classical development of study design in agriculture broadly covers design and design principles.</li>
+<li>The industrial quality control literature covers design thoroughly.</li>
+</ol></li>
+</ol>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+    <slide class="backdrop"></slide>
+  </slides>
+  <div class="pagination pagination-small" id='io2012-ptoc' style="display:none;">
+    <ul>
+      <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=1 title='Statistical inference defined'>
+         1
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=2 title='Motivating example: who&#39;s going to win the election?'>
+         2
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=3 title='Motivating example: is hormone replacement therapy effective?'>
+         3
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=4 title='Motivating example: ECMO'>
+         4
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=5 title='Summary'>
+         5
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=6 title='Example goals of inference'>
+         6
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=7 title='Example tools of the trade'>
+         7
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=8 title='Different thinking about probability leads to different styles of inference'>
+         8
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=9 title='In this class'>
+         9
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=10 title='Where to learn more on the topics not covered'>
+         10
+      </a>
+    </li>
+  </ul>
+  </div>  <!--[if IE]>
+    <script 
+      src="http://ajax.googleapis.com/ajax/libs/chrome-frame/1/CFInstall.min.js">  
+    </script>
+    <script>CFInstall.check({mode: 'overlay'});</script>
+  <![endif]-->
+</body>
+  <!-- Load Javascripts for Widgets -->
+  
+  <!-- MathJax: Fall back to local if CDN offline but local image fonts are not supported (saves >100MB) -->
+  <script type="text/x-mathjax-config">
+    MathJax.Hub.Config({
+      tex2jax: {
+        inlineMath: [['$','$'], ['\\(','\\)']],
+        processEscapes: true
+      }
+    });
+  </script>
+  <script type="text/javascript" src="http://cdn.mathjax.org/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
+  <!-- <script src="https://c328740.ssl.cf1.rackcdn.com/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
+  </script> -->
+  <script>window.MathJax || document.write('<script type="text/x-mathjax-config">MathJax.Hub.Config({"HTML-CSS":{imageFont:null}});<\/script><script src="../../librariesNew/widgets/mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"><\/script>')
+</script>
+<!-- LOAD HIGHLIGHTER JS FILES -->
+  <script src="../../librariesNew/highlighters/highlight.js/highlight.pack.js"></script>
+  <script>hljs.initHighlightingOnLoad();</script>
+  <!-- DONE LOADING HIGHLIGHTER JS FILES -->
+   
   </html>
\ No newline at end of file
diff --git a/06_StatisticalInference/01_01_Introduction/index.md b/06_StatisticalInference/01_01_Introduction/index.md
index 74e8b2a1e..3b5e98989 100644
--- a/06_StatisticalInference/01_01_Introduction/index.md
+++ b/06_StatisticalInference/01_01_Introduction/index.md
@@ -1,158 +1,158 @@
----
-title       : Introduction to statistical inference
-subtitle    : Statistical inference
-author      : Brian Caffo, Jeff Leek, Roger Peng
-job         : Johns Hopkins Bloomberg School of Public Health
-logo        : bloomberg_shield.png
-framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
-highlighter : highlight.js  # {highlight.js, prettify, highlight}
-hitheme     : tomorrow      # 
-url:
-  lib: ../../librariesNew
-  assets: ../../assets
-widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
-mode        : selfcontained # {standalone, draft}
----
-## Statistical inference defined
-
-Statistical inference is the process of drawing formal conclusions from
-data. 
-
-In our class, we wil define formal statistical inference as settings where one wants to infer facts about a population using noisy
-statistical data where uncertainty must be accounted for.
-
----
-
-## Motivating example: who's going to win the election?
-
-In every major election, pollsters would like to know, ahead of the
-actual election, who's going to win. Here, the target of
-estimation (the estimand) is clear, the percentage of people in 
-a particular group (city, state, county, country or other electoral
-grouping) who will vote for each candidate.
-
-We can not poll everyone. Even if we could, some polled 
-may change their vote by the time the election occurs.
-How do we collect a reasonable subset of data and quantify the
-uncertainty in the process to produce a good guess at who will win?
-
----
-
-## Motivating example: is hormone replacement therapy effective? 
-
-A large clinical trial (the Women’s Health Initiative) published results in 2002 that contradicted prior evidence on the efficacy of hormone replacement therapy for post menopausal women and suggested a negative impact of HRT for several key health outcomes. **Based on a statistically based protocol, the study was stopped early due an excess number of negative events.**
-
-Here's there's two inferential problems. 
-
-1. Is HRT effective?
-2. How long should we continue the trial in the presence of contrary
-evidence?
-
-See WHI writing group paper JAMA 2002, Vol 288:321 - 333. for the paper and Steinkellner et al. Menopause 2012, Vol 19:616 621 for adiscussion of the long term impacts
-
----
-
-## Motivating example: ECMO
-
-In 1985 a group at a major neonatal intensive care center published the results of a trial comparing a standard treatment and a promising new extracorporeal membrane oxygenation treatment (ECMO) for newborn infants with severe respiratory failure. **Ethical considerations lead to a statistical randomization scheme whereby one infant received the control therapy, thereby opening the study to sample-size based criticisms.**
-
-For a review and statistical discussion, see Royall Statistical Science 1991, Vol 6, No. 1, 52-88
-
----
-
-## Summary
-
-- These examples illustrate many of the difficulties of trying
-to use data to create general conclusions about a population.
-- Paramount among our concerns are:
-  - Is the sample representative of the population that we'd like to draw inferences about?
-  - Are there known and observed, known and unobserved or unknown and unobserved variables that contaminate our conclusions?
-  - Is there systematic bias created by missing data or the design or conduct of the study?
-  - What randomness exists in the data and how do we use or adjust for it? Here randomness can either be explicit via randomization
-or random sampling, or implicit as the aggregation of many complex uknown processes.
-  - Are we trying to estimate an underlying mechanistic model of phenomena under study?
-- Statistical inference requires navigating the set of assumptions and
-tools and subsequently thinking about how to draw conclusions from data.
-
---- 
-## Example goals of inference
-
-1. Estimate and quantify the uncertainty of an estimate of 
-a population quantity (the proportion of people who will
-  vote for a candidate).
-2. Determine whether a population quantity 
-  is a benchmark value ("is the treatment effective?").
-3. Infer a mechanistic relationship when quantities are measured with
-  noise ("What is the slope for Hooke's law?")
-4. Determine the impact of a policy? ("If we reduce polution levels,
-  will asthma rates decline?")
-
-
----
-## Example tools of the trade 
-
-1. Randomization: concerned with balancing unobserved variables that may confound inferences of interest
-2. Random sampling: concerned with obtaining data that is representative 
-of the population of interest
-3. Sampling models: concerned with creating a model for the sampling
-process, the most common is so called "iid".
-4. Hypothesis testing: concerned with decision making in the presence of uncertainty
-5. Confidence intervals: concerned with quantifying uncertainty in 
-estimation
-6. Probability models: a formal connection between the data and a population of interest. Often probability models are assumed or are
-approximated.
-7. Study design: the process of designing an experiment to minimize biases and variability.
-8. Nonparametric bootstrapping: the process of using the data to,
-  with minimal probability model assumptions, create inferences.
-9. Permutation, randomization and exchangeability testing: the process 
-of using data permutations to perform inferences.
-
----
-## Different thinking about probability leads to different styles of inference
-
-We won't spend too much time talking about this, but there are several different
-styles of inference. Two broad categories that get discussed a lot are:
-
-1. Frequency probability: is the long run proportion of
- times an event occurs in independent, identically distributed 
- repetitions.
-2. Frequency inference: uses frequency interpretations of probabilities
-to control error rates. Answers questions like "What should I decide
-given my data controlling the long run proportion of mistakes I make at
-a tolerable level."
-3. Bayesian probability: is the probability calculus of beliefs, given that beliefs follow certain rules.
-4. Bayesian inference: the use of Bayesian probability representation
-of beliefs to perform inference. Answers questions like "Given my subjective beliefs and the objective information from the data, what
-should I believe now?"
-
-Data scientists tend to fall within shades of gray of these and various other schools of inference. 
-
----
-## In this class
-
-* In this class, we will primarily focus on basic sampling models, 
-basic probability models and frequency style analyses
-to create standard inferences. 
-* Being data scientists,  we will also consider some inferential strategies that  rely heavily on the observed data, such as permutation testing
-and bootstrapping.
-* As probability modeling will be our starting point, we first build
-up basic probability.
-
----
-## Where to learn more on the topics not covered
-
-1. Explicit use of random sampling in inferences: look in references
-on "finite population statistics". Used heavily in polling and
-sample surveys.
-2. Explicit use of randomization in inferences: look in references
-on "causal inference" especially in clinical trials.
-3. Bayesian probability and Bayesian statistics: look for basic itroductory books (there are many).
-4. Missing data: well covered in biostatistics and econometric
-references; look for references to "multiple imputation", a popular tool for
-addressing missing data.
-5. Study design: consider looking in the subject matter area that
-  you are interested in; some examples with rich histories in design:
-  1. The epidemiological literature is very focused on using study design to investigate public health.
-  2. The classical development of study design in agriculture broadly covers design and design principles.
-  3. The industrial quality control literature covers design thoroughly.
- 
+---
+title       : Introduction to statistical inference
+subtitle    : Statistical inference
+author      : Brian Caffo, Jeff Leek, Roger Peng
+job         : Johns Hopkins Bloomberg School of Public Health
+logo        : bloomberg_shield.png
+framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
+highlighter : highlight.js  # {highlight.js, prettify, highlight}
+hitheme     : tomorrow      # 
+url:
+  lib: ../../librariesNew
+  assets: ../../assets
+widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
+mode        : selfcontained # {standalone, draft}
+---
+## Statistical inference defined
+
+Statistical inference is the process of drawing formal conclusions from
+data. 
+
+In our class, we will define formal statistical inference as settings where one wants to infer facts about a population using noisy
+statistical data where uncertainty must be accounted for.
+
+---
+
+## Motivating example: who's going to win the election?
+
+In every major election, pollsters would like to know, ahead of the
+actual election, who's going to win. Here, the target of
+estimation (the estimand) is clear: the percentage of people in 
+a particular group (city, state, county, country or other electoral
+grouping) who will vote for each candidate.
+
+We cannot poll everyone. Even if we could, some of those polled 
+may change their vote by the time the election occurs.
+How do we collect a reasonable subset of data and quantify the
+uncertainty in the process to produce a good guess at who will win?
+
+---
+
+## Motivating example: is hormone replacement therapy effective? 
+
+A large clinical trial (the Women’s Health Initiative) published results in 2002 that contradicted prior evidence on the efficacy of hormone replacement therapy for postmenopausal women and suggested a negative impact of HRT for several key health outcomes. **Based on a statistically based protocol, the study was stopped early due to an excess number of negative events.**
+
+Here there are two inferential problems. 
+
+1. Is HRT effective?
+2. How long should we continue the trial in the presence of contrary
+evidence?
+
+See the WHI writing group paper, JAMA 2002, Vol 288:321-333, for the paper and Steinkellner et al., Menopause 2012, Vol 19:616-621, for a discussion of the long-term impacts.
+
+---
+
+## Motivating example: ECMO
+
+In 1985 a group at a major neonatal intensive care center published the results of a trial comparing a standard treatment and a promising new extracorporeal membrane oxygenation treatment (ECMO) for newborn infants with severe respiratory failure. **Ethical considerations led to a statistical randomization scheme whereby only one infant received the control therapy, thereby opening the study to sample-size based criticisms.**
+
+For a review and statistical discussion, see Royall Statistical Science 1991, Vol 6, No. 1, 52-88
+
+---
+
+## Summary
+
+- These examples illustrate many of the difficulties of trying
+to use data to create general conclusions about a population.
+- Paramount among our concerns are:
+  - Is the sample representative of the population that we'd like to draw inferences about?
+  - Are there known and observed, known and unobserved or unknown and unobserved variables that contaminate our conclusions?
+  - Is there systematic bias created by missing data or the design or conduct of the study?
+  - What randomness exists in the data and how do we use or adjust for it? Here randomness can either be explicit via randomization
+or random sampling, or implicit as the aggregation of many complex unknown processes.
+  - Are we trying to estimate an underlying mechanistic model of phenomena under study?
+- Statistical inference requires navigating the set of assumptions and
+tools and subsequently thinking about how to draw conclusions from data.
+
+--- 
+## Example goals of inference
+
+1. Estimate and quantify the uncertainty of an estimate of 
+a population quantity (the proportion of people who will
+  vote for a candidate).
+2. Determine whether a population quantity 
+  is a benchmark value ("is the treatment effective?").
+3. Infer a mechanistic relationship when quantities are measured with
+  noise ("What is the slope for Hooke's law?")
+4. Determine the impact of a policy ("If we reduce pollution levels,
+  will asthma rates decline?")
+
+
+---
+## Example tools of the trade 
+
+1. Randomization: concerned with balancing unobserved variables that may confound inferences of interest
+2. Random sampling: concerned with obtaining data that is representative 
+of the population of interest
+3. Sampling models: concerned with creating a model for the sampling
+process; the most common is the so-called "iid" model.
+4. Hypothesis testing: concerned with decision making in the presence of uncertainty
+5. Confidence intervals: concerned with quantifying uncertainty in 
+estimation
+6. Probability models: a formal connection between the data and a population of interest. Often probability models are assumed or are
+approximated.
+7. Study design: the process of designing an experiment to minimize biases and variability.
+8. Nonparametric bootstrapping: the process of using the data to,
+  with minimal probability model assumptions, create inferences.
+9. Permutation, randomization and exchangeability testing: the process 
+of using data permutations to perform inferences.
+
+---
+## Different thinking about probability leads to different styles of inference
+
+We won't spend too much time talking about this, but there are several different
+styles of inference. Two broad categories that get discussed a lot are:
+
+1. Frequency probability: is the long run proportion of
+ times an event occurs in independent, identically distributed 
+ repetitions.
+2. Frequency inference: uses frequency interpretations of probabilities
+to control error rates. Answers questions like "What should I decide,
+given my data, while controlling the long run proportion of mistakes I make at
+a tolerable level?"
+3. Bayesian probability: is the probability calculus of beliefs, given that beliefs follow certain rules.
+4. Bayesian inference: the use of Bayesian probability representation
+of beliefs to perform inference. Answers questions like "Given my subjective beliefs and the objective information from the data, what
+should I believe now?"
+
+Data scientists tend to fall within shades of gray of these and various other schools of inference. 
+
+---
+## In this class
+
+* In this class, we will primarily focus on basic sampling models, 
+basic probability models and frequency style analyses
+to create standard inferences. 
+* Being data scientists,  we will also consider some inferential strategies that  rely heavily on the observed data, such as permutation testing
+and bootstrapping.
+* As probability modeling will be our starting point, we first build
+up basic probability.
+
+---
+## Where to learn more on the topics not covered
+
+1. Explicit use of random sampling in inferences: look in references
+on "finite population statistics". Used heavily in polling and
+sample surveys.
+2. Explicit use of randomization in inferences: look in references
+on "causal inference" especially in clinical trials.
+3. Bayesian probability and Bayesian statistics: look for basic introductory books (there are many).
+4. Missing data: well covered in biostatistics and econometric
+references; look for references to "multiple imputation", a popular tool for
+addressing missing data.
+5. Study design: consider looking in the subject matter area that
+  you are interested in; some examples with rich histories in design:
+  1. The epidemiological literature is very focused on using study design to investigate public health.
+  2. The classical development of study design in agriculture broadly covers design and design principles.
+  3. The industrial quality control literature covers design thoroughly.
+
diff --git a/06_StatisticalInference/01_01_Introduction/index.pdf b/06_StatisticalInference/01_01_Introduction/index.pdf
index 70d9be1bc..b50714770 100644
Binary files a/06_StatisticalInference/01_01_Introduction/index.pdf and b/06_StatisticalInference/01_01_Introduction/index.pdf differ
diff --git a/06_StatisticalInference/01_02_Probability/assets/fig/unnamed-chunk-1.png b/06_StatisticalInference/01_02_Probability/assets/fig/unnamed-chunk-1.png
new file mode 100644
index 000000000..21d71259c
Binary files /dev/null and b/06_StatisticalInference/01_02_Probability/assets/fig/unnamed-chunk-1.png differ
diff --git a/06_StatisticalInference/01_02_Probability/assets/fig/unnamed-chunk-2.png b/06_StatisticalInference/01_02_Probability/assets/fig/unnamed-chunk-2.png
new file mode 100644
index 000000000..833444e0b
Binary files /dev/null and b/06_StatisticalInference/01_02_Probability/assets/fig/unnamed-chunk-2.png differ
diff --git a/06_StatisticalInference/01_02_Probability/index.Rmd b/06_StatisticalInference/01_02_Probability/index.Rmd
index c925cc40e..3691fc14c 100644
--- a/06_StatisticalInference/01_02_Probability/index.Rmd
+++ b/06_StatisticalInference/01_02_Probability/index.Rmd
@@ -1,277 +1,276 @@
----
-title       : Probability
-subtitle    : Statistical Inference
-author      : Brian Caffo, Jeff Leek, Roger Peng
-job         : Johns Hopkins Bloomberg School of Public Health
-logo        : bloomberg_shield.png
-framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
-highlighter : highlight.js  # {highlight.js, prettify, highlight}
-hitheme     : tomorrow      # 
-url:
-  lib: ../../librariesNew
-  assets: ../../assets
-widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
-mode        : selfcontained # {standalone, draft}
----
-
-## Notation
-
-- The **sample space**, $\Omega$, is the collection of possible outcomes of an experiment
-  - Example: die roll $\Omega = \{1,2,3,4,5,6\}$
-- An **event**, say $E$, is a subset of $\Omega$ 
-  - Example: die roll is even $E = \{2,4,6\}$
-- An **elementary** or **simple** event is a particular result
-  of an experiment
-  - Example: die roll is a four, $\omega = 4$
-- $\emptyset$ is called the **null event** or the **empty set**
-
----
-
-## Interpretation of set operations
-
-Normal set operations have particular interpretations in this setting
-
-1. $\omega \in E$ implies that $E$ occurs when $\omega$ occurs
-2. $\omega \not\in E$ implies that $E$ does not occur when $\omega$ occurs
-3. $E \subset F$ implies that the occurrence of $E$ implies the occurrence of $F$
-4. $E \cap F$  implies the event that both $E$ and $F$ occur
-5. $E \cup F$ implies the event that at least one of $E$ or $F$ occur
-6. $E \cap F=\emptyset$ means that $E$ and $F$ are **mutually exclusive**, or cannot both occur
-7. $E^c$ or $\bar E$ is the event that $E$ does not occur
-
----
-
-## Probability
-
-A **probability measure**, $P$, is a function from the collection of possible events so that the following hold
-
-1. For an event $E\subset \Omega$, $0 \leq P(E) \leq 1$
-2. $P(\Omega) = 1$
-3. If $E_1$ and $E_2$ are mutually exclusive events
-  $P(E_1 \cup E_2) = P(E_1) + P(E_2)$.
-
-Part 3 of the definition implies **finite additivity**
-
-$$
-P(\cup_{i=1}^n A_i) = \sum_{i=1}^n P(A_i)
-$$
-where the $\{A_i\}$ are mutually exclusive. (Note a more general version of
-additivity is used in advanced classes.)
-
-
----
-
-
-## Example consequences
-
-- $P(\emptyset) = 0$
-- $P(E) = 1 - P(E^c)$
-- $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
-- if $A \subset B$ then $P(A) \leq P(B)$
-- $P\left(A \cup B\right) = 1 - P(A^c \cap B^c)$
-- $P(A \cap B^c) = P(A) - P(A \cap B)$
-- $P(\cup_{i=1}^n E_i) \leq \sum_{i=1}^n P(E_i)$
-- $P(\cup_{i=1}^n E_i) \geq \max_i P(E_i)$
-
----
-
-## Example
-
-The National Sleep Foundation ([www.sleepfoundation.org](http://www.sleepfoundation.org/)) reports that around 3% of the American population has sleep apnea. They also report that around 10% of the North American and European population has restless leg syndrome. Does this imply that 13% of people will have at least one sleep problems of these sorts?
-
----
-
-## Example continued
-
-Answer: No, the events are not mutually exclusive. To elaborate let:
-
-$$
-\begin{eqnarray*}
-    A_1 & = & \{\mbox{Person has sleep apnea}\} \\
-    A_2 & = & \{\mbox{Person has RLS}\} 
-  \end{eqnarray*}
-$$
-
-Then 
-
-$$
-\begin{eqnarray*}
-    P(A_1 \cup A_2 ) & = & P(A_1) + P(A_2) - P(A_1 \cap A_2) \\
-   & = & 0.13 - \mbox{Probability of having both}
-  \end{eqnarray*}
-$$
-Likely, some fraction of the population has both.
-
----
-
-## Random variables
-
-- A **random variable** is a numerical outcome of an experiment.
-- The random variables that we study will come in two varieties,
-  **discrete** or **continuous**.
-- Discrete random variable are random variables that take on only a
-countable number of possibilities.
-  * $P(X = k)$
-- Continuous random variable can take any value on the real line or some subset of the real line.
-  * $P(X \in A)$
-
----
-
-## Examples of variables that can be thought of as random variables
-
-- The $(0-1)$ outcome of the flip of a coin
-- The outcome from the roll of a die
-- The BMI of a subject four years after a baseline measurement
-- The hypertension status of a subject randomly drawn from a population
-
----
-
-## PMF
-
-A probability mass function evaluated at a value corresponds to the
-probability that a random variable takes that value. To be a valid
-pmf a function, $p$, must satisfy
-
-  1. $p(x) \geq 0$ for all $x$
-  2. $\sum_{x} p(x) = 1$
-
-The sum is taken over all of the possible values for $x$.
-
----
-
-## Example
-
-Let $X$ be the result of a coin flip where $X=0$ represents
-tails and $X = 1$ represents heads.
-$$
-p(x) = (1/2)^{x} (1/2)^{1-x} ~~\mbox{ for }~~x = 0,1
-$$
-Suppose that we do not know whether or not the coin is fair; Let
-$\theta$ be the probability of a head expressed as a proportion
-(between 0 and 1).
-$$
-p(x) = \theta^{x} (1 - \theta)^{1-x} ~~\mbox{ for }~~x = 0,1
-$$
-
----
-
-## PDF
-
-A probability density function (pdf), is a function associated with
-a continuous random variable 
-
-  *Areas under pdfs correspond to probabilities for that random variable*
-
-To be a valid pdf, a function $f$ must satisfy
-
-1. $f(x) \geq 0$ for all $x$
-
-2. The area under $f(x)$ is one.
-
----
-## Example
-
-Suppose that the proportion of help calls that get addressed in
-a random day by a help line is given by
-$$
-f(x) = \left\{\begin{array}{ll}
-    2 x & \mbox{ for } 1 > x > 0 \\
-    0                 & \mbox{ otherwise} 
-\end{array} \right. 
-$$
-
-Is this a mathematically valid density?
-
----
-
-```{r, fig.height = 5, fig.width = 5, echo = TRUE, fig.align='center'}
-x <- c(-0.5, 0, 1, 1, 1.5); y <- c( 0, 0, 2, 0, 0)
-plot(x, y, lwd = 3, frame = FALSE, type = "l")
-```
-
----
-
-## Example continued
-
-What is the probability that 75% or fewer of calls get addressed?
-
-```{r, fig.height = 5, fig.width = 5, echo = FALSE, fig.align='center'}
-plot(x, y, lwd = 3, frame = FALSE, type = "l")
-polygon(c(0, .75, .75, 0), c(0, 0, 1.5, 0), lwd = 3, col = "lightblue")
-```
-
----
-```{r}
-1.5 * .75 / 2
-pbeta(.75, 2, 1)
-```
----
-
-## CDF and survival function
-
-- The **cumulative distribution function** (CDF) of a random variable $X$ is defined as the function 
-$$
-F(x) = P(X \leq x)
-$$
-- This definition applies regardless of whether $X$ is discrete or continuous.
-- The **survival function** of a random variable $X$ is defined as
-$$
-S(x) = P(X > x)
-$$
-- Notice that $S(x) = 1 - F(x)$
-- For continuous random variables, the PDF is the derivative of the CDF
-
----
-
-## Example
-
-What are the survival function and CDF from the density considered before?
-
-For $1 \geq x \geq 0$
-$$
-F(x) = P(X \leq x) = \frac{1}{2} Base \times Height = \frac{1}{2} (x) \times (2 x) = x^2
-$$
-
-$$
-S(x) = 1 - x^2
-$$
-
-```{r}
-pbeta(c(0.4, 0.5, 0.6), 2, 1)
-```
-
----
-
-## Quantiles
-
-- The  $\alpha^{th}$ **quantile** of a distribution with distribution function $F$ is the point $x_\alpha$ so that
-$$
-F(x_\alpha) = \alpha
-$$
-- A **percentile** is simply a quantile with $\alpha$ expressed as a percent
-- The **median** is the $50^{th}$ percentile
-
----
-## Example
-- We want to solve $0.5 = F(x) = x^2$
-- Resulting in the solution 
-```{r, echo = TRUE} 
-sqrt(0.5)
-``` 
-- Therefore, about `r sqrt(0.5)` of calls being answered on a random day is the median.
-- R can approximate quantiles for you for common distributions
-
-```{r}
-qbeta(0.5, 2, 1)
-```
-
----
-
-## Summary
-
-- You might be wondering at this point "I've heard of a median before, it didn't require integration. Where's the data?"
-- We're referring to are **population quantities**. Therefore, the median being
-  discussed is the **population median**.
-- A probability model connects the data to the population using assumptions.
-- Therefore the median we're discussing is the **estimand**, the sample median will be the **estimator**
-
+---
+title       : Probability
+subtitle    : Statistical Inference
+author      : Brian Caffo, Jeff Leek, Roger Peng
+job         : Johns Hopkins Bloomberg School of Public Health
+logo        : bloomberg_shield.png
+framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
+highlighter : highlight.js  # {highlight.js, prettify, highlight}
+hitheme     : tomorrow      # 
+url:
+  lib: ../../librariesNew
+  assets: ../../assets
+widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
+mode        : selfcontained # {standalone, draft}
+---
+
+## Notation
+
+- The **sample space**, $\Omega$, is the collection of possible outcomes of an experiment
+  - Example: die roll $\Omega = \{1,2,3,4,5,6\}$
+- An **event**, say $E$, is a subset of $\Omega$ 
+  - Example: die roll is even $E = \{2,4,6\}$
+- An **elementary** or **simple** event is a particular result
+  of an experiment
+  - Example: die roll is a four, $\omega = 4$
+- $\emptyset$ is called the **null event** or the **empty set**
+
+---
+
+## Interpretation of set operations
+
+Normal set operations have particular interpretations in this setting
+
+1. $\omega \in E$ implies that $E$ occurs when $\omega$ occurs
+2. $\omega \not\in E$ implies that $E$ does not occur when $\omega$ occurs
+3. $E \subset F$ implies that the occurrence of $E$ implies the occurrence of $F$
+4. $E \cap F$ is the event that both $E$ and $F$ occur
+5. $E \cup F$ is the event that at least one of $E$ or $F$ occurs
+6. $E \cap F=\emptyset$ means that $E$ and $F$ are **mutually exclusive**, or cannot both occur
+7. $E^c$ or $\bar E$ is the event that $E$ does not occur
+
+---
+
+## Probability
+
+A **probability measure**, $P$, is a function on the collection of possible events such that the following hold
+
+1. For an event $E\subset \Omega$, $0 \leq P(E) \leq 1$
+2. $P(\Omega) = 1$
+3. If $E_1$ and $E_2$ are mutually exclusive events, then
+  $P(E_1 \cup E_2) = P(E_1) + P(E_2)$.
+
+Part 3 of the definition implies **finite additivity**
+
+$$
+P(\cup_{i=1}^n A_i) = \sum_{i=1}^n P(A_i)
+$$
+where the $\{A_i\}$ are mutually exclusive. (Note a more general version of
+additivity is used in advanced classes.)
+
+
+---
+
+
+## Example consequences
+
+- $P(\emptyset) = 0$
+- $P(E) = 1 - P(E^c)$
+- $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
+- if $A \subset B$ then $P(A) \leq P(B)$
+- $P\left(A \cup B\right) = 1 - P(A^c \cap B^c)$
+- $P(A \cap B^c) = P(A) - P(A \cap B)$
+- $P(\cup_{i=1}^n E_i) \leq \sum_{i=1}^n P(E_i)$
+- $P(\cup_{i=1}^n E_i) \geq \max_i P(E_i)$
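+
+As a sketch of how these follow from the definition, consider the second consequence. The events $E$ and $E^c$ are mutually exclusive and $E \cup E^c = \Omega$, so parts 2 and 3 of the definition give
+
+$$
+1 = P(\Omega) = P(E \cup E^c) = P(E) + P(E^c)
+$$
+so that $P(E) = 1 - P(E^c)$. Taking $E = \Omega$ then gives $P(\emptyset) = 0$.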
+
+---
+
+## Example
+
+The National Sleep Foundation ([www.sleepfoundation.org](http://www.sleepfoundation.org/)) reports that around 3% of the American population has sleep apnea. They also report that around 10% of the North American and European population has restless leg syndrome. Does this imply that 13% of people will have at least one of these sleep problems?
+
+---
+
+## Example continued
+
+Answer: No, the events are not mutually exclusive. To elaborate let:
+
+$$
+\begin{eqnarray*}
+    A_1 & = & \{\mbox{Person has sleep apnea}\} \\
+    A_2 & = & \{\mbox{Person has RLS}\} 
+  \end{eqnarray*}
+$$
+
+Then 
+
+$$
+\begin{eqnarray*}
+    P(A_1 \cup A_2 ) & = & P(A_1) + P(A_2) - P(A_1 \cap A_2) \\
+   & = & 0.13 - \mbox{Probability of having both}
+  \end{eqnarray*}
+$$
+Likely, some fraction of the population has both.
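+
+A rough bound is possible under the assumption (not made by the source figures) that both percentages describe the same population. Since $P(A_1 \cap A_2) \leq \min\{P(A_1), P(A_2)\} = 0.03$,
+
+$$
+0.10 \leq P(A_1 \cup A_2) \leq 0.13
+$$
+with the upper bound attained only if the probability of having both is zero.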
+
+---
+
+## Random variables
+
+- A **random variable** is a numerical outcome of an experiment.
+- The random variables that we study will come in two varieties,
+  **discrete** or **continuous**.
+- Discrete random variables are random variables that take on only a
+countable number of possibilities.
+  * $P(X = k)$
+- Continuous random variables can take any value on the real line or some subset of the real line.
+  * $P(X \in A)$
+
+---
+
+## Examples of variables that can be thought of as random variables
+
+- The $(0-1)$ outcome of the flip of a coin
+- The outcome from the roll of a die
+- The BMI of a subject four years after a baseline measurement
+- The hypertension status of a subject randomly drawn from a population
+
+---
+
+## PMF
+
+A probability mass function evaluated at a value corresponds to the
+probability that a random variable takes that value. To be a valid
+pmf, a function $p$ must satisfy
+
+  1. $p(x) \geq 0$ for all $x$
+  2. $\sum_{x} p(x) = 1$
+
+The sum is taken over all of the possible values for $x$.
+
+---
+
+## Example
+
+Let $X$ be the result of a coin flip where $X=0$ represents
+tails and $X = 1$ represents heads.
+$$
+p(x) = (1/2)^{x} (1/2)^{1-x} ~~\mbox{ for }~~x = 0,1
+$$
+Suppose that we do not know whether or not the coin is fair; let
+$\theta$ be the probability of a head expressed as a proportion
+(between 0 and 1).
+$$
+p(x) = \theta^{x} (1 - \theta)^{1-x} ~~\mbox{ for }~~x = 0,1
+$$
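+
+As a quick check that this is a valid pmf for any $\theta$ between 0 and 1, both values are non-negative and they sum to one:
+
+$$
+p(0) + p(1) = \theta^{0} (1 - \theta)^{1} + \theta^{1} (1 - \theta)^{0} = (1 - \theta) + \theta = 1
+$$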
+
+---
+
+## PDF
+
+A probability density function (pdf) is a function associated with
+a continuous random variable 
+
+  *Areas under pdfs correspond to probabilities for that random variable*
+
+To be a valid pdf, a function $f$ must satisfy
+
+1. $f(x) \geq 0$ for all $x$
+
+2. The area under $f(x)$ is one.
+
+---
+## Example
+
+Suppose that the proportion of help calls that get addressed on
+a random day by a help line has density
+$$
+f(x) = \left\{\begin{array}{ll}
+    2 x & \mbox{ for } 1 > x > 0 \\
+    0                 & \mbox{ otherwise} 
+\end{array} \right. 
+$$
+
+Is this a mathematically valid density?
+
+---
+
+```{r, fig.height = 5, fig.width = 5, echo = TRUE, fig.align='center'}
+x <- c(-0.5, 0, 1, 1, 1.5); y <- c( 0, 0, 2, 0, 0)
+plot(x, y, lwd = 3, frame = FALSE, type = "l")
+```
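+
+One way to answer the question is to read the area off the plot: the density is non-negative everywhere, and the region under it is a right triangle with base 1 and height 2, so
+
+$$
+\int_0^1 2 x \, dx = \frac{1}{2} \times 1 \times 2 = 1
+$$
+Both conditions for a valid pdf hold.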
+
+---
+
+## Example continued
+
+What is the probability that 75% or fewer of calls get addressed?
+
+```{r, fig.height = 5, fig.width = 5, echo = FALSE, fig.align='center'}
+plot(x, y, lwd = 3, frame = FALSE, type = "l")
+polygon(c(0, .75, .75, 0), c(0, 0, 1.5, 0), lwd = 3, col = "lightblue")
+```
+
+---
+```{r}
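+1.5 * .75 / 2      # area of the shaded triangle: height (1.5) * base (0.75) / 2
+pbeta(.75, 2, 1)   # same probability from the Beta(2, 1) CDF, whose density is 2x on (0, 1)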
+```
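+
+The same number can be obtained by integrating the density directly, as a sketch of the calculation behind the two lines above:
+
+$$
+P(X \leq 0.75) = \int_0^{0.75} 2 x \, dx = 0.75^2 = 0.5625
+$$
+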
+---
+
+## CDF and survival function
+
+- The **cumulative distribution function** (CDF) of a random variable $X$ is defined as the function 
+$$
+F(x) = P(X \leq x)
+$$
+- This definition applies regardless of whether $X$ is discrete or continuous.
+- The **survival function** of a random variable $X$ is defined as
+$$
+S(x) = P(X > x)
+$$
+- Notice that $S(x) = 1 - F(x)$
+- For continuous random variables, the PDF is the derivative of the CDF
+
+---
+
+## Example
+
+What are the survival function and CDF from the density considered before?
+
+For $1 \geq x \geq 0$
+$$
+F(x) = P(X \leq x) = \frac{1}{2} \mbox{Base} \times \mbox{Height} = \frac{1}{2} (x) \times (2 x) = x^2
+$$
+
+$$
+S(x) = 1 - x^2
+$$
+
+```{r}
+pbeta(c(0.4, 0.5, 0.6), 2, 1)
+```
+
+---
+
+## Quantiles
+
+- The $\alpha^{th}$ **quantile** of a distribution with distribution function $F$ is the point $x_\alpha$ such that
+$$
+F(x_\alpha) = \alpha
+$$
+- A **percentile** is simply a quantile with $\alpha$ expressed as a percent
+- The **median** is the $50^{th}$ percentile
+
+---
+## Example
+- We want to solve $0.5 = F(x) = x^2$
+- Resulting in the solution 
+```{r, echo = TRUE} 
+sqrt(0.5)
+``` 
+- Therefore, the median proportion of calls answered on a random day is about `r sqrt(0.5)`.
+- R can approximate quantiles for you for common distributions
+
+```{r}
+qbeta(0.5, 2, 1)
+```
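+
+More generally, for this density the quantile equation can be solved in closed form. For $0 \leq \alpha \leq 1$,
+
+$$
+F(x_\alpha) = x_\alpha^2 = \alpha \quad \Rightarrow \quad x_\alpha = \sqrt{\alpha}
+$$
+so, for instance, the $25^{th}$ percentile is $\sqrt{0.25} = 0.5$.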
+
+---
+
+## Summary
+
+- You might be wondering at this point "I've heard of a median before, it didn't require integration. Where's the data?"
+- What we're referring to are **population quantities**. Therefore, the median being
+  discussed is the **population median**.
+- A probability model connects the data to the population using assumptions.
+- Therefore, the median we're discussing is the **estimand**; the sample median will be the **estimator**.
diff --git a/06_StatisticalInference/01_02_Probability/index.html b/06_StatisticalInference/01_02_Probability/index.html
index 8e224deef..f9de0d1e0 100644
--- a/06_StatisticalInference/01_02_Probability/index.html
+++ b/06_StatisticalInference/01_02_Probability/index.html
@@ -1,617 +1,617 @@
-<!DOCTYPE html>
-<html>
-<head>
-  <title>Probability</title>
-  <meta charset="utf-8">
-  <meta name="description" content="Probability">
-  <meta name="author" content="Brian Caffo, Jeff Leek, Roger Peng">
-  <meta name="generator" content="slidify" />
-  <meta name="apple-mobile-web-app-capable" content="yes">
-  <meta http-equiv="X-UA-Compatible" content="chrome=1">
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/default.css" media="all" >
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/phone.css" 
-    media="only screen and (max-device-width: 480px)" >
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/slidify.css" >
-  <link rel="stylesheet" href="../../librariesNew/highlighters/highlight.js/css/tomorrow.css" />
-  <base target="_blank"> <!-- This amazingness opens all links in a new tab. -->  
-  
-  <!-- Grab CDN jQuery, fall back to local if offline -->
-  <script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.7.min.js"></script>
-  <script>window.jQuery || document.write('<script src="../../librariesNew/widgets/quiz/js/jquery.js"><\/script>')</script> 
-  <script data-main="../../librariesNew/frameworks/io2012/js/slides" 
-    src="../../librariesNew/frameworks/io2012/js/require-1.0.8.min.js">
-  </script>
-  
-  
-
-</head>
-<body style="opacity: 0">
-  <slides class="layout-widescreen">
-    
-    <!-- LOGO SLIDE -->
-        <slide class="title-slide segue nobackground">
-  <aside class="gdbar">
-    <img src="../../assets/img/bloomberg_shield.png">
-  </aside>
-  <hgroup class="auto-fadein">
-    <h1>Probability</h1>
-    <h2>Statistical Inference</h2>
-    <p>Brian Caffo, Jeff Leek, Roger Peng<br/>Johns Hopkins Bloomberg School of Public Health</p>
-  </hgroup>
-  <article></article>  
-</slide>
-    
-
-    <!-- SLIDES -->
-    <slide class="" id="slide-1" style="background:;">
-  <hgroup>
-    <h2>Notation</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>The <strong>sample space</strong>, \(\Omega\), is the collection of possible outcomes of an experiment
-
-<ul>
-<li>Example: die roll \(\Omega = \{1,2,3,4,5,6\}\)</li>
-</ul></li>
-<li>An <strong>event</strong>, say \(E\), is a subset of \(\Omega\) 
-
-<ul>
-<li>Example: die roll is even \(E = \{2,4,6\}\)</li>
-</ul></li>
-<li>An <strong>elementary</strong> or <strong>simple</strong> event is a particular result
-of an experiment
-
-<ul>
-<li>Example: die roll is a four, \(\omega = 4\)</li>
-</ul></li>
-<li>\(\emptyset\) is called the <strong>null event</strong> or the <strong>empty set</strong></li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-2" style="background:;">
-  <hgroup>
-    <h2>Interpretation of set operations</h2>
-  </hgroup>
-  <article data-timings="">
-    <p>Normal set operations have particular interpretations in this setting</p>
-
-<ol>
-<li>\(\omega \in E\) implies that \(E\) occurs when \(\omega\) occurs</li>
-<li>\(\omega \not\in E\) implies that \(E\) does not occur when \(\omega\) occurs</li>
-<li>\(E \subset F\) implies that the occurrence of \(E\) implies the occurrence of \(F\)</li>
-<li>\(E \cap F\)  implies the event that both \(E\) and \(F\) occur</li>
-<li>\(E \cup F\) implies the event that at least one of \(E\) or \(F\) occur</li>
-<li>\(E \cap F=\emptyset\) means that \(E\) and \(F\) are <strong>mutually exclusive</strong>, or cannot both occur</li>
-<li>\(E^c\) or \(\bar E\) is the event that \(E\) does not occur</li>
-</ol>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-3" style="background:;">
-  <hgroup>
-    <h2>Probability</h2>
-  </hgroup>
-  <article data-timings="">
-    <p>A <strong>probability measure</strong>, \(P\), is a function from the collection of possible events so that the following hold</p>
-
-<ol>
-<li>For an event \(E\subset \Omega\), \(0 \leq P(E) \leq 1\)</li>
-<li>\(P(\Omega) = 1\)</li>
-<li>If \(E_1\) and \(E_2\) are mutually exclusive events
-\(P(E_1 \cup E_2) = P(E_1) + P(E_2)\).</li>
-</ol>
-
-<p>Part 3 of the definition implies <strong>finite additivity</strong></p>
-
-<p>\[
-P(\cup_{i=1}^n A_i) = \sum_{i=1}^n P(A_i)
-\]
-where the \(\{A_i\}\) are mutually exclusive. (Note a more general version of
-additivity is used in advanced classes.)</p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-4" style="background:;">
-  <hgroup>
-    <h2>Example consequences</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>\(P(\emptyset) = 0\)</li>
-<li>\(P(E) = 1 - P(E^c)\)</li>
-<li>\(P(A \cup B) = P(A) + P(B) - P(A \cap B)\)</li>
-<li>if \(A \subset B\) then \(P(A) \leq P(B)\)</li>
-<li>\(P\left(A \cup B\right) = 1 - P(A^c \cap B^c)\)</li>
-<li>\(P(A \cap B^c) = P(A) - P(A \cap B)\)</li>
-<li>\(P(\cup_{i=1}^n E_i) \leq \sum_{i=1}^n P(E_i)\)</li>
-<li>\(P(\cup_{i=1}^n E_i) \geq \max_i P(E_i)\)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-5" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <p>The National Sleep Foundation (<a href="http://www.sleepfoundation.org/">www.sleepfoundation.org</a>) reports that around 3% of the American population has sleep apnea. They also report that around 10% of the North American and European population has restless leg syndrome. Does this imply that 13% of people will have at least one sleep problems of these sorts?</p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-6" style="background:;">
-  <hgroup>
-    <h2>Example continued</h2>
-  </hgroup>
-  <article data-timings="">
-    <p>Answer: No, the events are not mutually exclusive. To elaborate let:</p>
-
-<p>\[
-\begin{eqnarray*}
-    A_1 & = & \{\mbox{Person has sleep apnea}\} \\
-    A_2 & = & \{\mbox{Person has RLS}\} 
-  \end{eqnarray*}
-\]</p>
-
-<p>Then </p>
-
-<p>\[
-\begin{eqnarray*}
-    P(A_1 \cup A_2 ) & = & P(A_1) + P(A_2) - P(A_1 \cap A_2) \\
-   & = & 0.13 - \mbox{Probability of having both}
-  \end{eqnarray*}
-\]
-Likely, some fraction of the population has both.</p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-7" style="background:;">
-  <hgroup>
-    <h2>Random variables</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>A <strong>random variable</strong> is a numerical outcome of an experiment.</li>
-<li>The random variables that we study will come in two varieties,
-<strong>discrete</strong> or <strong>continuous</strong>.</li>
-<li>Discrete random variable are random variables that take on only a
-countable number of possibilities.
-
-<ul>
-<li>\(P(X = k)\)</li>
-</ul></li>
-<li>Continuous random variable can take any value on the real line or some subset of the real line.
-
-<ul>
-<li>\(P(X \in A)\)</li>
-</ul></li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-8" style="background:;">
-  <hgroup>
-    <h2>Examples of variables that can be thought of as random variables</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>The \((0-1)\) outcome of the flip of a coin</li>
-<li>The outcome from the roll of a die</li>
-<li>The BMI of a subject four years after a baseline measurement</li>
-<li>The hypertension status of a subject randomly drawn from a population</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-9" style="background:;">
-  <hgroup>
-    <h2>PMF</h2>
-  </hgroup>
-  <article data-timings="">
-    <p>A probability mass function evaluated at a value corresponds to the
-probability that a random variable takes that value. To be a valid
-pmf a function, \(p\), must satisfy</p>
-
-<ol>
-<li>\(p(x) \geq 0\) for all \(x\)</li>
-<li>\(\sum_{x} p(x) = 1\)</li>
-</ol>
-
-<p>The sum is taken over all of the possible values for \(x\).</p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-10" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <p>Let \(X\) be the result of a coin flip where \(X=0\) represents
-tails and \(X = 1\) represents heads.
-\[
-p(x) = (1/2)^{x} (1/2)^{1-x} ~~\mbox{ for }~~x = 0,1
-\]
-Suppose that we do not know whether or not the coin is fair; Let
-\(\theta\) be the probability of a head expressed as a proportion
-(between 0 and 1).
-\[
-p(x) = \theta^{x} (1 - \theta)^{1-x} ~~\mbox{ for }~~x = 0,1
-\]</p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-11" style="background:;">
-  <hgroup>
-    <h2>PDF</h2>
-  </hgroup>
-  <article data-timings="">
-    <p>A probability density function (pdf), is a function associated with
-a continuous random variable </p>
-
-<p><em>Areas under pdfs correspond to probabilities for that random variable</em></p>
-
-<p>To be a valid pdf, a function \(f\) must satisfy</p>
-
-<ol>
-<li><p>\(f(x) \geq 0\) for all \(x\)</p></li>
-<li><p>The area under \(f(x)\) is one.</p></li>
-</ol>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-12" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <p>Suppose that the proportion of help calls that get addressed in
-a random day by a help line is given by
-\[
-f(x) = \left\{\begin{array}{ll}
-    2 x & \mbox{ for } 1 > x > 0 \\
-    0                 & \mbox{ otherwise} 
-\end{array} \right. 
-\]</p>
-
-<p>Is this a mathematically valid density?</p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-13" style="background:;">
-  <article data-timings="">
-    <pre><code class="r">x &lt;- c(-0.5, 0, 1, 1, 1.5)
-y &lt;- c(0, 0, 2, 0, 0)
-plot(x, y, lwd = 3, frame = FALSE, type = &quot;l&quot;)
-</code></pre>
-
-<p><img src="assets/fig/unnamed-chunk-1.png" alt="plot of chunk unnamed-chunk-1"> </p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-14" style="background:;">
-  <hgroup>
-    <h2>Example continued</h2>
-  </hgroup>
-  <article data-timings="">
-    <p>What is the probability that 75% or fewer of calls get addressed?</p>
-
-<p><img src="assets/fig/unnamed-chunk-2.png" alt="plot of chunk unnamed-chunk-2"> </p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-15" style="background:;">
-  <article data-timings="">
-    <pre><code class="r">1.5 * 0.75/2
-</code></pre>
-
-<pre><code>## [1] 0.5625
-</code></pre>
-
-<pre><code class="r">pbeta(0.75, 2, 1)
-</code></pre>
-
-<pre><code>## [1] 0.5625
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-16" style="background:;">
-  <hgroup>
-    <h2>CDF and survival function</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>The <strong>cumulative distribution function</strong> (CDF) of a random variable \(X\) is defined as the function 
-\[
-F(x) = P(X \leq x)
-\]</li>
-<li>This definition applies regardless of whether \(X\) is discrete or continuous.</li>
-<li>The <strong>survival function</strong> of a random variable \(X\) is defined as
-\[
-S(x) = P(X > x)
-\]</li>
-<li>Notice that \(S(x) = 1 - F(x)\)</li>
-<li>For continuous random variables, the PDF is the derivative of the CDF</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-17" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <p>What are the survival function and CDF from the density considered before?</p>
-
-<p>For \(1 \geq x \geq 0\)
-\[
-F(x) = P(X \leq x) = \frac{1}{2} Base \times Height = \frac{1}{2} (x) \times (2 x) = x^2
-\]</p>
-
-<p>\[
-S(x) = 1 - x^2
-\]</p>
-
-<pre><code class="r">pbeta(c(0.4, 0.5, 0.6), 2, 1)
-</code></pre>
-
-<pre><code>## [1] 0.16 0.25 0.36
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-18" style="background:;">
-  <hgroup>
-    <h2>Quantiles</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>The  \(\alpha^{th}\) <strong>quantile</strong> of a distribution with distribution function \(F\) is the point \(x_\alpha\) so that
-\[
-F(x_\alpha) = \alpha
-\]</li>
-<li>A <strong>percentile</strong> is simply a quantile with \(\alpha\) expressed as a percent</li>
-<li>The <strong>median</strong> is the \(50^{th}\) percentile</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-19" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>We want to solve \(0.5 = F(x) = x^2\)</li>
-<li>Resulting in the solution </li>
-</ul>
-
-<pre><code class="r">sqrt(0.5)
-</code></pre>
-
-<pre><code>## [1] 0.7071
-</code></pre>
-
-<ul>
-<li>Therefore, about 0.7071 of calls being answered on a random day is the median.</li>
-<li>R can approximate quantiles for you for common distributions</li>
-</ul>
-
-<pre><code class="r">qbeta(0.5, 2, 1)
-</code></pre>
-
-<pre><code>## [1] 0.7071
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-20" style="background:;">
-  <hgroup>
-    <h2>Summary</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>You might be wondering at this point &quot;I&#39;ve heard of a median before, it didn&#39;t require integration. Where&#39;s the data?&quot;</li>
-<li>We&#39;re referring to are <strong>population quantities</strong>. Therefore, the median being
-discussed is the <strong>population median</strong>.</li>
-<li>A probability model connects the data to the population using assumptions.</li>
-<li>Therefore the median we&#39;re discussing is the <strong>estimand</strong>, the sample median will be the <strong>estimator</strong></li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-    <slide class="backdrop"></slide>
-  </slides>
-  <div class="pagination pagination-small" id='io2012-ptoc' style="display:none;">
-    <ul>
-      <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=1 title='Notation'>
-         1
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=2 title='Interpretation of set operations'>
-         2
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=3 title='Probability'>
-         3
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=4 title='Example consequences'>
-         4
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=5 title='Example'>
-         5
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=6 title='Example continued'>
-         6
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=7 title='Random variables'>
-         7
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=8 title='Examples of variables that can be thought of as random variables'>
-         8
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=9 title='PMF'>
-         9
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=10 title='Example'>
-         10
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=11 title='PDF'>
-         11
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=12 title='Example'>
-         12
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=13 title=''>
-         13
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=14 title='Example continued'>
-         14
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=15 title=''>
-         15
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=16 title='CDF and survival function'>
-         16
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=17 title='Example'>
-         17
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=18 title='Quantiles'>
-         18
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=19 title='Example'>
-         19
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=20 title='Summary'>
-         20
-      </a>
-    </li>
-  </ul>
-  </div>  <!--[if IE]>
-    <script 
-      src="http://ajax.googleapis.com/ajax/libs/chrome-frame/1/CFInstall.min.js">  
-    </script>
-    <script>CFInstall.check({mode: 'overlay'});</script>
-  <![endif]-->
-</body>
-  <!-- Load Javascripts for Widgets -->
-  
-  <!-- MathJax: Fall back to local if CDN offline but local image fonts are not supported (saves >100MB) -->
-  <script type="text/x-mathjax-config">
-    MathJax.Hub.Config({
-      tex2jax: {
-        inlineMath: [['$','$'], ['\\(','\\)']],
-        processEscapes: true
-      }
-    });
-  </script>
-  <script type="text/javascript" src="http://cdn.mathjax.org/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
-  <!-- <script src="https://c328740.ssl.cf1.rackcdn.com/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
-  </script> -->
-  <script>window.MathJax || document.write('<script type="text/x-mathjax-config">MathJax.Hub.Config({"HTML-CSS":{imageFont:null}});<\/script><script src="../../librariesNew/widgets/mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"><\/script>')
-</script>
-<!-- LOAD HIGHLIGHTER JS FILES -->
-  <script src="../../librariesNew/highlighters/highlight.js/highlight.pack.js"></script>
-  <script>hljs.initHighlightingOnLoad();</script>
-  <!-- DONE LOADING HIGHLIGHTER JS FILES -->
-   
+<!DOCTYPE html>
+<html>
+<head>
+  <title>Probability</title>
+  <meta charset="utf-8">
+  <meta name="description" content="Probability">
+  <meta name="author" content="Brian Caffo, Jeff Leek, Roger Peng">
+  <meta name="generator" content="slidify" />
+  <meta name="apple-mobile-web-app-capable" content="yes">
+  <meta http-equiv="X-UA-Compatible" content="chrome=1">
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/default.css" media="all" >
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/phone.css" 
+    media="only screen and (max-device-width: 480px)" >
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/slidify.css" >
+  <link rel="stylesheet" href="../../librariesNew/highlighters/highlight.js/css/tomorrow.css" />
+  <base target="_blank"> <!-- This amazingness opens all links in a new tab. -->  
+  
+  <!-- Grab CDN jQuery, fall back to local if offline -->
+  <script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.7.min.js"></script>
+  <script>window.jQuery || document.write('<script src="../../librariesNew/widgets/quiz/js/jquery.js"><\/script>')</script> 
+  <script data-main="../../librariesNew/frameworks/io2012/js/slides" 
+    src="../../librariesNew/frameworks/io2012/js/require-1.0.8.min.js">
+  </script>
+  
+  
+
+</head>
+<body style="opacity: 0">
+  <slides class="layout-widescreen">
+    
+    <!-- LOGO SLIDE -->
+        <slide class="title-slide segue nobackground">
+  <aside class="gdbar">
+    <img src="../../assets/img/bloomberg_shield.png">
+  </aside>
+  <hgroup class="auto-fadein">
+    <h1>Probability</h1>
+    <h2>Statistical Inference</h2>
+    <p>Brian Caffo, Jeff Leek, Roger Peng<br/>Johns Hopkins Bloomberg School of Public Health</p>
+  </hgroup>
+  <article></article>  
+</slide>
+    
+
+    <!-- SLIDES -->
+    <slide class="" id="slide-1" style="background:;">
+  <hgroup>
+    <h2>Notation</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>The <strong>sample space</strong>, \(\Omega\), is the collection of possible outcomes of an experiment
+
+<ul>
+<li>Example: die roll \(\Omega = \{1,2,3,4,5,6\}\)</li>
+</ul></li>
+<li>An <strong>event</strong>, say \(E\), is a subset of \(\Omega\) 
+
+<ul>
+<li>Example: die roll is even \(E = \{2,4,6\}\)</li>
+</ul></li>
+<li>An <strong>elementary</strong> or <strong>simple</strong> event is a particular result
+of an experiment
+
+<ul>
+<li>Example: die roll is a four, \(\omega = 4\)</li>
+</ul></li>
+<li>\(\emptyset\) is called the <strong>null event</strong> or the <strong>empty set</strong></li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-2" style="background:;">
+  <hgroup>
+    <h2>Interpretation of set operations</h2>
+  </hgroup>
+  <article data-timings="">
+    <p>Normal set operations have particular interpretations in this setting</p>
+
+<ol>
+<li>\(\omega \in E\) implies that \(E\) occurs when \(\omega\) occurs</li>
+<li>\(\omega \not\in E\) implies that \(E\) does not occur when \(\omega\) occurs</li>
+<li>\(E \subset F\) implies that the occurrence of \(E\) implies the occurrence of \(F\)</li>
+<li>\(E \cap F\) is the event that both \(E\) and \(F\) occur</li>
+<li>\(E \cup F\) is the event that at least one of \(E\) or \(F\) occurs</li>
+<li>\(E \cap F=\emptyset\) means that \(E\) and \(F\) are <strong>mutually exclusive</strong>, or cannot both occur</li>
+<li>\(E^c\) or \(\bar E\) is the event that \(E\) does not occur</li>
+</ol>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-3" style="background:;">
+  <hgroup>
+    <h2>Probability</h2>
+  </hgroup>
+  <article data-timings="">
+    <p>A <strong>probability measure</strong>, \(P\), is a function from the collection of possible events such that the following hold</p>
+
+<ol>
+<li>For an event \(E\subset \Omega\), \(0 \leq P(E) \leq 1\)</li>
+<li>\(P(\Omega) = 1\)</li>
+<li>If \(E_1\) and \(E_2\) are mutually exclusive events
+\(P(E_1 \cup E_2) = P(E_1) + P(E_2)\).</li>
+</ol>
+
+<p>Part 3 of the definition implies <strong>finite additivity</strong></p>
+
+<p>\[
+P(\cup_{i=1}^n A_i) = \sum_{i=1}^n P(A_i)
+\]
+where the \(\{A_i\}\) are mutually exclusive. (Note a more general version of
+additivity is used in advanced classes.)</p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-4" style="background:;">
+  <hgroup>
+    <h2>Example consequences</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>\(P(\emptyset) = 0\)</li>
+<li>\(P(E) = 1 - P(E^c)\)</li>
+<li>\(P(A \cup B) = P(A) + P(B) - P(A \cap B)\)</li>
+<li>if \(A \subset B\) then \(P(A) \leq P(B)\)</li>
+<li>\(P\left(A \cup B\right) = 1 - P(A^c \cap B^c)\)</li>
+<li>\(P(A \cap B^c) = P(A) - P(A \cap B)\)</li>
+<li>\(P(\cup_{i=1}^n E_i) \leq \sum_{i=1}^n P(E_i)\)</li>
+<li>\(P(\cup_{i=1}^n E_i) \geq \max_i P(E_i)\)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-5" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
+    <p>The National Sleep Foundation (<a href="http://www.sleepfoundation.org/">www.sleepfoundation.org</a>) reports that around 3% of the American population has sleep apnea. They also report that around 10% of the North American and European population has restless leg syndrome. Does this imply that 13% of people will have at least one sleep problem of these sorts?</p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-6" style="background:;">
+  <hgroup>
+    <h2>Example continued</h2>
+  </hgroup>
+  <article data-timings="">
+    <p>Answer: No, the events are not mutually exclusive. To elaborate let:</p>
+
+<p>\[
+\begin{eqnarray*}
+    A_1 & = & \{\mbox{Person has sleep apnea}\} \\
+    A_2 & = & \{\mbox{Person has RLS}\} 
+  \end{eqnarray*}
+\]</p>
+
+<p>Then </p>
+
+<p>\[
+\begin{eqnarray*}
+    P(A_1 \cup A_2 ) & = & P(A_1) + P(A_2) - P(A_1 \cap A_2) \\
+   & = & 0.13 - \mbox{Probability of having both}
+  \end{eqnarray*}
+\]
+Likely, some fraction of the population has both.</p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-7" style="background:;">
+  <hgroup>
+    <h2>Random variables</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>A <strong>random variable</strong> is a numerical outcome of an experiment.</li>
+<li>The random variables that we study will come in two varieties,
+<strong>discrete</strong> or <strong>continuous</strong>.</li>
+<li>Discrete random variables are random variables that take on only a
+countable number of possibilities.
+
+<ul>
+<li>\(P(X = k)\)</li>
+</ul></li>
+<li>Continuous random variables can take any value on the real line or some subset of the real line.
+
+<ul>
+<li>\(P(X \in A)\)</li>
+</ul></li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-8" style="background:;">
+  <hgroup>
+    <h2>Examples of variables that can be thought of as random variables</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>The \((0-1)\) outcome of the flip of a coin</li>
+<li>The outcome from the roll of a die</li>
+<li>The BMI of a subject four years after a baseline measurement</li>
+<li>The hypertension status of a subject randomly drawn from a population</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-9" style="background:;">
+  <hgroup>
+    <h2>PMF</h2>
+  </hgroup>
+  <article data-timings="">
+    <p>A probability mass function evaluated at a value corresponds to the
+probability that a random variable takes that value. To be a valid
+pmf, a function \(p\) must satisfy</p>
+
+<ol>
+<li>\(p(x) \geq 0\) for all \(x\)</li>
+<li>\(\sum_{x} p(x) = 1\)</li>
+</ol>
+
+<p>The sum is taken over all of the possible values for \(x\).</p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-10" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
+    <p>Let \(X\) be the result of a coin flip where \(X=0\) represents
+tails and \(X = 1\) represents heads.
+\[
+p(x) = (1/2)^{x} (1/2)^{1-x} ~~\mbox{ for }~~x = 0,1
+\]
+Suppose that we do not know whether or not the coin is fair. Let
+\(\theta\) be the probability of a head expressed as a proportion
+(between 0 and 1).
+\[
+p(x) = \theta^{x} (1 - \theta)^{1-x} ~~\mbox{ for }~~x = 0,1
+\]</p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-11" style="background:;">
+  <hgroup>
+    <h2>PDF</h2>
+  </hgroup>
+  <article data-timings="">
+    <p>A probability density function (pdf) is a function associated with
+a continuous random variable </p>
+
+<p><em>Areas under pdfs correspond to probabilities for that random variable</em></p>
+
+<p>To be a valid pdf, a function \(f\) must satisfy</p>
+
+<ol>
+<li><p>\(f(x) \geq 0\) for all \(x\)</p></li>
+<li><p>The area under \(f(x)\) is one.</p></li>
+</ol>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-12" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
+    <p>Suppose that the proportion of help calls that get addressed on
+a random day by a help line has density
+\[
+f(x) = \left\{\begin{array}{ll}
+    2 x & \mbox{ for } 1 > x > 0 \\
+    0                 & \mbox{ otherwise} 
+\end{array} \right. 
+\]</p>
+
+<p>Is this a mathematically valid density?</p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-13" style="background:;">
+  <article data-timings="">
+    <pre><code class="r">x &lt;- c(-0.5, 0, 1, 1, 1.5)
+y &lt;- c(0, 0, 2, 0, 0)
+plot(x, y, lwd = 3, frame = FALSE, type = &quot;l&quot;)
+</code></pre>
+
+<p><img src="assets/fig/unnamed-chunk-1.png" title="plot of chunk unnamed-chunk-1" alt="plot of chunk unnamed-chunk-1" style="display: block; margin: auto;" /></p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-14" style="background:;">
+  <hgroup>
+    <h2>Example continued</h2>
+  </hgroup>
+  <article data-timings="">
+    <p>What is the probability that 75% or fewer of calls get addressed?</p>
+
+<p><img src="assets/fig/unnamed-chunk-2.png" title="plot of chunk unnamed-chunk-2" alt="plot of chunk unnamed-chunk-2" style="display: block; margin: auto;" /></p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-15" style="background:;">
+  <article data-timings="">
+    <pre><code class="r">1.5 * 0.75/2
+</code></pre>
+
+<pre><code>## [1] 0.5625
+</code></pre>
+
+<pre><code class="r">pbeta(0.75, 2, 1)
+</code></pre>
+
+<pre><code>## [1] 0.5625
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-16" style="background:;">
+  <hgroup>
+    <h2>CDF and survival function</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>The <strong>cumulative distribution function</strong> (CDF) of a random variable \(X\) is defined as the function 
+\[
+F(x) = P(X \leq x)
+\]</li>
+<li>This definition applies regardless of whether \(X\) is discrete or continuous.</li>
+<li>The <strong>survival function</strong> of a random variable \(X\) is defined as
+\[
+S(x) = P(X > x)
+\]</li>
+<li>Notice that \(S(x) = 1 - F(x)\)</li>
+<li>For continuous random variables, the PDF is the derivative of the CDF</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-17" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
+    <p>What are the survival function and CDF from the density considered before?</p>
+
+<p>For \(1 \geq x \geq 0\)
+\[
+F(x) = P(X \leq x) = \frac{1}{2} Base \times Height = \frac{1}{2} (x) \times (2 x) = x^2
+\]</p>
+
+<p>\[
+S(x) = 1 - x^2
+\]</p>
+
+<pre><code class="r">pbeta(c(0.4, 0.5, 0.6), 2, 1)
+</code></pre>
+
+<pre><code>## [1] 0.16 0.25 0.36
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-18" style="background:;">
+  <hgroup>
+    <h2>Quantiles</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>The  \(\alpha^{th}\) <strong>quantile</strong> of a distribution with distribution function \(F\) is the point \(x_\alpha\) so that
+\[
+F(x_\alpha) = \alpha
+\]</li>
+<li>A <strong>percentile</strong> is simply a quantile with \(\alpha\) expressed as a percent</li>
+<li>The <strong>median</strong> is the \(50^{th}\) percentile</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-19" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>We want to solve \(0.5 = F(x) = x^2\)</li>
+<li>Resulting in the solution </li>
+</ul>
+
+<pre><code class="r">sqrt(0.5)
+</code></pre>
+
+<pre><code>## [1] 0.7071
+</code></pre>
+
+<ul>
+<li>Therefore, the median proportion of calls answered on a random day is about 0.7071.</li>
+<li>R can approximate quantiles for you for common distributions</li>
+</ul>
+
+<pre><code class="r">qbeta(0.5, 2, 1)
+</code></pre>
+
+<pre><code>## [1] 0.7071
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-20" style="background:;">
+  <hgroup>
+    <h2>Summary</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>You might be wondering at this point &quot;I&#39;ve heard of a median before, it didn&#39;t require integration. Where&#39;s the data?&quot;</li>
+<li>What we&#39;re referring to are <strong>population quantities</strong>. Therefore, the median being
+discussed is the <strong>population median</strong>.</li>
+<li>A probability model connects the data to the population using assumptions.</li>
+<li>Therefore, the median we&#39;re discussing is the <strong>estimand</strong>; the sample median will be the <strong>estimator</strong></li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+    <slide class="backdrop"></slide>
+  </slides>
+  <div class="pagination pagination-small" id='io2012-ptoc' style="display:none;">
+    <ul>
+      <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=1 title='Notation'>
+         1
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=2 title='Interpretation of set operations'>
+         2
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=3 title='Probability'>
+         3
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=4 title='Example consequences'>
+         4
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=5 title='Example'>
+         5
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=6 title='Example continued'>
+         6
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=7 title='Random variables'>
+         7
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=8 title='Examples of variables that can be thought of as random variables'>
+         8
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=9 title='PMF'>
+         9
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=10 title='Example'>
+         10
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=11 title='PDF'>
+         11
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=12 title='Example'>
+         12
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=13 title=''>
+         13
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=14 title='Example continued'>
+         14
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=15 title=''>
+         15
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=16 title='CDF and survival function'>
+         16
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=17 title='Example'>
+         17
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=18 title='Quantiles'>
+         18
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=19 title='Example'>
+         19
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=20 title='Summary'>
+         20
+      </a>
+    </li>
+  </ul>
+  </div>  <!--[if IE]>
+    <script 
+      src="http://ajax.googleapis.com/ajax/libs/chrome-frame/1/CFInstall.min.js">  
+    </script>
+    <script>CFInstall.check({mode: 'overlay'});</script>
+  <![endif]-->
+</body>
+  <!-- Load Javascripts for Widgets -->
+  
+  <!-- MathJax: Fall back to local if CDN offline but local image fonts are not supported (saves >100MB) -->
+  <script type="text/x-mathjax-config">
+    MathJax.Hub.Config({
+      tex2jax: {
+        inlineMath: [['$','$'], ['\\(','\\)']],
+        processEscapes: true
+      }
+    });
+  </script>
+  <script type="text/javascript" src="http://cdn.mathjax.org/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
+  <!-- <script src="https://c328740.ssl.cf1.rackcdn.com/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
+  </script> -->
+  <script>window.MathJax || document.write('<script type="text/x-mathjax-config">MathJax.Hub.Config({"HTML-CSS":{imageFont:null}});<\/script><script src="../../librariesNew/widgets/mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"><\/script>')
+</script>
+<!-- LOAD HIGHLIGHTER JS FILES -->
+  <script src="../../librariesNew/highlighters/highlight.js/highlight.pack.js"></script>
+  <script>hljs.initHighlightingOnLoad();</script>
+  <!-- DONE LOADING HIGHLIGHTER JS FILES -->
+   
   </html>
\ No newline at end of file
diff --git a/06_StatisticalInference/01_02_Probability/index.md b/06_StatisticalInference/01_02_Probability/index.md
index 61a470797..6b6231410 100644
--- a/06_StatisticalInference/01_02_Probability/index.md
+++ b/06_StatisticalInference/01_02_Probability/index.md
@@ -1,311 +1,310 @@
----
-title       : Probability
-subtitle    : Statistical Inference
-author      : Brian Caffo, Jeff Leek, Roger Peng
-job         : Johns Hopkins Bloomberg School of Public Health
-logo        : bloomberg_shield.png
-framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
-highlighter : highlight.js  # {highlight.js, prettify, highlight}
-hitheme     : tomorrow      # 
-url:
-  lib: ../../librariesNew
-  assets: ../../assets
-widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
-mode        : selfcontained # {standalone, draft}
----
-
-## Notation
-
-- The **sample space**, $\Omega$, is the collection of possible outcomes of an experiment
-  - Example: die roll $\Omega = \{1,2,3,4,5,6\}$
-- An **event**, say $E$, is a subset of $\Omega$ 
-  - Example: die roll is even $E = \{2,4,6\}$
-- An **elementary** or **simple** event is a particular result
-  of an experiment
-  - Example: die roll is a four, $\omega = 4$
-- $\emptyset$ is called the **null event** or the **empty set**
-
----
-
-## Interpretation of set operations
-
-Normal set operations have particular interpretations in this setting
-
-1. $\omega \in E$ implies that $E$ occurs when $\omega$ occurs
-2. $\omega \not\in E$ implies that $E$ does not occur when $\omega$ occurs
-3. $E \subset F$ implies that the occurrence of $E$ implies the occurrence of $F$
-4. $E \cap F$  implies the event that both $E$ and $F$ occur
-5. $E \cup F$ implies the event that at least one of $E$ or $F$ occur
-6. $E \cap F=\emptyset$ means that $E$ and $F$ are **mutually exclusive**, or cannot both occur
-7. $E^c$ or $\bar E$ is the event that $E$ does not occur
-
----
-
-## Probability
-
-A **probability measure**, $P$, is a function from the collection of possible events so that the following hold
-
-1. For an event $E\subset \Omega$, $0 \leq P(E) \leq 1$
-2. $P(\Omega) = 1$
-3. If $E_1$ and $E_2$ are mutually exclusive events
-  $P(E_1 \cup E_2) = P(E_1) + P(E_2)$.
-
-Part 3 of the definition implies **finite additivity**
-
-$$
-P(\cup_{i=1}^n A_i) = \sum_{i=1}^n P(A_i)
-$$
-where the $\{A_i\}$ are mutually exclusive. (Note a more general version of
-additivity is used in advanced classes.)
-
-
----
-
-
-## Example consequences
-
-- $P(\emptyset) = 0$
-- $P(E) = 1 - P(E^c)$
-- $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
-- if $A \subset B$ then $P(A) \leq P(B)$
-- $P\left(A \cup B\right) = 1 - P(A^c \cap B^c)$
-- $P(A \cap B^c) = P(A) - P(A \cap B)$
-- $P(\cup_{i=1}^n E_i) \leq \sum_{i=1}^n P(E_i)$
-- $P(\cup_{i=1}^n E_i) \geq \max_i P(E_i)$
-
----
-
-## Example
-
-The National Sleep Foundation ([www.sleepfoundation.org](http://www.sleepfoundation.org/)) reports that around 3% of the American population has sleep apnea. They also report that around 10% of the North American and European population has restless leg syndrome. Does this imply that 13% of people will have at least one sleep problems of these sorts?
-
----
-
-## Example continued
-
-Answer: No, the events are not mutually exclusive. To elaborate let:
-
-$$
-\begin{eqnarray*}
-    A_1 & = & \{\mbox{Person has sleep apnea}\} \\
-    A_2 & = & \{\mbox{Person has RLS}\} 
-  \end{eqnarray*}
-$$
-
-Then 
-
-$$
-\begin{eqnarray*}
-    P(A_1 \cup A_2 ) & = & P(A_1) + P(A_2) - P(A_1 \cap A_2) \\
-   & = & 0.13 - \mbox{Probability of having both}
-  \end{eqnarray*}
-$$
-Likely, some fraction of the population has both.
-
----
-
-## Random variables
-
-- A **random variable** is a numerical outcome of an experiment.
-- The random variables that we study will come in two varieties,
-  **discrete** or **continuous**.
-- Discrete random variable are random variables that take on only a
-countable number of possibilities.
-  * $P(X = k)$
-- Continuous random variable can take any value on the real line or some subset of the real line.
-  * $P(X \in A)$
-
----
-
-## Examples of variables that can be thought of as random variables
-
-- The $(0-1)$ outcome of the flip of a coin
-- The outcome from the roll of a die
-- The BMI of a subject four years after a baseline measurement
-- The hypertension status of a subject randomly drawn from a population
-
----
-
-## PMF
-
-A probability mass function evaluated at a value corresponds to the
-probability that a random variable takes that value. To be a valid
-pmf a function, $p$, must satisfy
-
-  1. $p(x) \geq 0$ for all $x$
-  2. $\sum_{x} p(x) = 1$
-
-The sum is taken over all of the possible values for $x$.
-
----
-
-## Example
-
-Let $X$ be the result of a coin flip where $X=0$ represents
-tails and $X = 1$ represents heads.
-$$
-p(x) = (1/2)^{x} (1/2)^{1-x} ~~\mbox{ for }~~x = 0,1
-$$
-Suppose that we do not know whether or not the coin is fair; Let
-$\theta$ be the probability of a head expressed as a proportion
-(between 0 and 1).
-$$
-p(x) = \theta^{x} (1 - \theta)^{1-x} ~~\mbox{ for }~~x = 0,1
-$$
-
----
-
-## PDF
-
-A probability density function (pdf), is a function associated with
-a continuous random variable 
-
-  *Areas under pdfs correspond to probabilities for that random variable*
-
-To be a valid pdf, a function $f$ must satisfy
-
-1. $f(x) \geq 0$ for all $x$
-
-2. The area under $f(x)$ is one.
-
----
-## Example
-
-Suppose that the proportion of help calls that get addressed in
-a random day by a help line is given by
-$$
-f(x) = \left\{\begin{array}{ll}
-    2 x & \mbox{ for } 1 > x > 0 \\
-    0                 & \mbox{ otherwise} 
-\end{array} \right. 
-$$
-
-Is this a mathematically valid density?
-
----
-
-
-```r
-x <- c(-0.5, 0, 1, 1, 1.5)
-y <- c(0, 0, 2, 0, 0)
-plot(x, y, lwd = 3, frame = FALSE, type = "l")
-```
-
-![plot of chunk unnamed-chunk-1](assets/fig/unnamed-chunk-1.png) 
-
-
----
-
-## Example continued
-
-What is the probability that 75% or fewer of calls get addressed?
-
-![plot of chunk unnamed-chunk-2](assets/fig/unnamed-chunk-2.png) 
-
-
----
-
-```r
-1.5 * 0.75/2
-```
-
-```
-## [1] 0.5625
-```
-
-```r
-pbeta(0.75, 2, 1)
-```
-
-```
-## [1] 0.5625
-```
-
----
-
-## CDF and survival function
-
-- The **cumulative distribution function** (CDF) of a random variable $X$ is defined as the function 
-$$
-F(x) = P(X \leq x)
-$$
-- This definition applies regardless of whether $X$ is discrete or continuous.
-- The **survival function** of a random variable $X$ is defined as
-$$
-S(x) = P(X > x)
-$$
-- Notice that $S(x) = 1 - F(x)$
-- For continuous random variables, the PDF is the derivative of the CDF
-
----
-
-## Example
-
-What are the survival function and CDF from the density considered before?
-
-For $1 \geq x \geq 0$
-$$
-F(x) = P(X \leq x) = \frac{1}{2} Base \times Height = \frac{1}{2} (x) \times (2 x) = x^2
-$$
-
-$$
-S(x) = 1 - x^2
-$$
-
-
-```r
-pbeta(c(0.4, 0.5, 0.6), 2, 1)
-```
-
-```
-## [1] 0.16 0.25 0.36
-```
-
-
----
-
-## Quantiles
-
-- The  $\alpha^{th}$ **quantile** of a distribution with distribution function $F$ is the point $x_\alpha$ so that
-$$
-F(x_\alpha) = \alpha
-$$
-- A **percentile** is simply a quantile with $\alpha$ expressed as a percent
-- The **median** is the $50^{th}$ percentile
-
----
-## Example
-- We want to solve $0.5 = F(x) = x^2$
-- Resulting in the solution 
-
-```r
-sqrt(0.5)
-```
-
-```
-## [1] 0.7071
-```
-
-- Therefore, about 0.7071 of calls being answered on a random day is the median.
-- R can approximate quantiles for you for common distributions
-
-
-```r
-qbeta(0.5, 2, 1)
-```
-
-```
-## [1] 0.7071
-```
-
-
----
-
-## Summary
-
-- You might be wondering at this point "I've heard of a median before, it didn't require integration. Where's the data?"
-- We're referring to are **population quantities**. Therefore, the median being
-  discussed is the **population median**.
-- A probability model connects the data to the population using assumptions.
-- Therefore the median we're discussing is the **estimand**, the sample median will be the **estimator**
-
+---
+title       : Probability
+subtitle    : Statistical Inference
+author      : Brian Caffo, Jeff Leek, Roger Peng
+job         : Johns Hopkins Bloomberg School of Public Health
+logo        : bloomberg_shield.png
+framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
+highlighter : highlight.js  # {highlight.js, prettify, highlight}
+hitheme     : tomorrow      # 
+url:
+  lib: ../../librariesNew
+  assets: ../../assets
+widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
+mode        : selfcontained # {standalone, draft}
+---
+
+## Notation
+
+- The **sample space**, $\Omega$, is the collection of possible outcomes of an experiment
+  - Example: die roll $\Omega = \{1,2,3,4,5,6\}$
+- An **event**, say $E$, is a subset of $\Omega$ 
+  - Example: die roll is even $E = \{2,4,6\}$
+- An **elementary** or **simple** event is a particular result
+  of an experiment
+  - Example: die roll is a four, $\omega = 4$
+- $\emptyset$ is called the **null event** or the **empty set**
+
+---
+
+## Interpretation of set operations
+
+Normal set operations have particular interpretations in this setting
+
+1. $\omega \in E$ implies that $E$ occurs when $\omega$ occurs
+2. $\omega \not\in E$ implies that $E$ does not occur when $\omega$ occurs
+3. $E \subset F$ implies that the occurrence of $E$ implies the occurrence of $F$
+4. $E \cap F$ is the event that both $E$ and $F$ occur
+5. $E \cup F$ is the event that at least one of $E$ or $F$ occurs
+6. $E \cap F=\emptyset$ means that $E$ and $F$ are **mutually exclusive**, or cannot both occur
+7. $E^c$ or $\bar E$ is the event that $E$ does not occur
+
+---
+
+## Probability
+
+A **probability measure**, $P$, is a function from the collection of possible events such that the following hold
+
+1. For an event $E\subset \Omega$, $0 \leq P(E) \leq 1$
+2. $P(\Omega) = 1$
+3. If $E_1$ and $E_2$ are mutually exclusive events
+  $P(E_1 \cup E_2) = P(E_1) + P(E_2)$.
+
+Part 3 of the definition implies **finite additivity**
+
+$$
+P(\cup_{i=1}^n A_i) = \sum_{i=1}^n P(A_i)
+$$
+where the $\{A_i\}$ are mutually exclusive. (Note a more general version of
+additivity is used in advanced classes.)
+
+
+---
+
+
+## Example consequences
+
+- $P(\emptyset) = 0$
+- $P(E) = 1 - P(E^c)$
+- $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
+- if $A \subset B$ then $P(A) \leq P(B)$
+- $P\left(A \cup B\right) = 1 - P(A^c \cap B^c)$
+- $P(A \cap B^c) = P(A) - P(A \cap B)$
+- $P(\cup_{i=1}^n E_i) \leq \sum_{i=1}^n P(E_i)$
+- $P(\cup_{i=1}^n E_i) \geq \max_i P(E_i)$
+
+---
+
+## Example
+
+The National Sleep Foundation ([www.sleepfoundation.org](http://www.sleepfoundation.org/)) reports that around 3% of the American population has sleep apnea. They also report that around 10% of the North American and European population has restless leg syndrome. Does this imply that 13% of people will have at least one sleep problem of these sorts?
+
+---
+
+## Example continued
+
+Answer: No, the events are not mutually exclusive. To elaborate let:
+
+$$
+\begin{eqnarray*}
+    A_1 & = & \{\mbox{Person has sleep apnea}\} \\
+    A_2 & = & \{\mbox{Person has RLS}\} 
+  \end{eqnarray*}
+$$
+
+Then 
+
+$$
+\begin{eqnarray*}
+    P(A_1 \cup A_2 ) & = & P(A_1) + P(A_2) - P(A_1 \cap A_2) \\
+   & = & 0.13 - \mbox{Probability of having both}
+  \end{eqnarray*}
+$$
+Likely, some fraction of the population has both.
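+
+Although the overlap is not reported, the consequences listed earlier bound the answer: the union is at least as likely as the more common condition and at most the sum of the two. A small sketch of that arithmetic follows; the variable names and the 1% overlap are assumptions for illustration, not reported figures.
+
+```r
+p_apnea <- 0.03                # P(A_1), as quoted on the slide
+p_rls <- 0.10                  # P(A_2), as quoted on the slide
+lower <- max(p_apnea, p_rls)   # the union is at least as likely as either event
+upper <- p_apnea + p_rls       # reached only if the events are mutually exclusive
+c(lower = lower, upper = upper)
+p_both <- 0.01                 # hypothetical overlap, purely illustrative
+p_union <- p_apnea + p_rls - p_both
+p_union
+```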
+
+---
+
+## Random variables
+
+- A **random variable** is a numerical outcome of an experiment.
+- The random variables that we study will come in two varieties,
+  **discrete** or **continuous**.
+- Discrete random variables are random variables that take on only a
+countable number of possibilities.
+  * $P(X = k)$
+- Continuous random variables can take any value on the real line or some subset of the real line.
+  * $P(X \in A)$
+
+---
+
+## Examples of variables that can be thought of as random variables
+
+- The $(0-1)$ outcome of the flip of a coin
+- The outcome from the roll of a die
+- The BMI of a subject four years after a baseline measurement
+- The hypertension status of a subject randomly drawn from a population
+
+---
+
+## PMF
+
+A probability mass function evaluated at a value corresponds to the
+probability that a random variable takes that value. To be a valid
+pmf a function, $p$, must satisfy
+
+  1. $p(x) \geq 0$ for all $x$
+  2. $\sum_{x} p(x) = 1$
+
+The sum is taken over all of the possible values for $x$.
+
+---
+
+## Example
+
+Let $X$ be the result of a coin flip where $X=0$ represents
+tails and $X = 1$ represents heads.
+$$
+p(x) = (1/2)^{x} (1/2)^{1-x} ~~\mbox{ for }~~x = 0,1
+$$
+Suppose that we do not know whether or not the coin is fair; let
+$\theta$ be the probability of a head expressed as a proportion
+(between 0 and 1).
+$$
+p(x) = \theta^{x} (1 - \theta)^{1-x} ~~\mbox{ for }~~x = 0,1
+$$
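+
+A minimal sketch of this PMF in R, checking the two validity conditions from the previous slide. The function name `bern_pmf` and the value `theta = 0.3` are arbitrary choices for illustration.
+
+```r
+# PMF of a coin flip with P(head) = theta; x should be 0 or 1
+bern_pmf <- function(x, theta) theta^x * (1 - theta)^(1 - x)
+theta <- 0.3                     # arbitrary value, for illustration only
+p <- bern_pmf(c(0, 1), theta)    # probabilities of tails and heads
+all(p >= 0)                      # condition 1: every value is non-negative
+sum(p)                           # condition 2: the values sum to one
+all.equal(p, dbinom(c(0, 1), size = 1, prob = theta))  # agrees with R's binomial PMF with one trial
+```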
+
+---
+
+## PDF
+
+A probability density function (pdf) is a function associated with
+a continuous random variable.
+
+  *Areas under pdfs correspond to probabilities for that random variable*
+
+To be a valid pdf, a function $f$ must satisfy
+
+1. $f(x) \geq 0$ for all $x$
+
+2. The area under $f(x)$ is one.
+
+---
+## Example
+
+Suppose that the proportion of help calls that get addressed in
+a random day by a help line is given by
+$$
+f(x) = \left\{\begin{array}{ll}
+    2 x & \mbox{ for } 1 > x > 0 \\
+    0                 & \mbox{ otherwise} 
+\end{array} \right. 
+$$
+
+Is this a mathematically valid density?
+
+---
+
+
+```r
+x <- c(-0.5, 0, 1, 1, 1.5)
+y <- c(0, 0, 2, 0, 0)
+plot(x, y, lwd = 3, frame = FALSE, type = "l")
+```
+
+<img src="assets/fig/unnamed-chunk-1.png" title="plot of chunk unnamed-chunk-1" alt="plot of chunk unnamed-chunk-1" style="display: block; margin: auto;" />
+
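+The plot is a triangle with base 1 and height 2, so its area is $\frac{1}{2} \times 1 \times 2 = 1$. As a hedged numerical check (the function name `f` and the grid of test points are choices made for this sketch), base R's `integrate` gives the same answer:
+
+```r
+f <- function(x) ifelse(x > 0 & x < 1, 2 * x, 0)   # the density from the slide
+nonneg <- all(f(seq(-1, 2, by = 0.01)) >= 0)       # condition 1, checked on a grid of points
+area <- integrate(f, lower = 0, upper = 1)$value   # condition 2, numerical area over (0, 1)
+nonneg
+all.equal(area, 1)
+```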
+
+---
+
+## Example continued
+
+What is the probability that 75% or fewer of calls get addressed?
+
+<img src="assets/fig/unnamed-chunk-2.png" title="plot of chunk unnamed-chunk-2" alt="plot of chunk unnamed-chunk-2" style="display: block; margin: auto;" />
+
+
+---
+
+```r
+1.5 * 0.75/2
+```
+
+```
+## [1] 0.5625
+```
+
+```r
+pbeta(0.75, 2, 1)
+```
+
+```
+## [1] 0.5625
+```
+
+---
+
+## CDF and survival function
+
+- The **cumulative distribution function** (CDF) of a random variable $X$ is defined as the function 
+$$
+F(x) = P(X \leq x)
+$$
+- This definition applies regardless of whether $X$ is discrete or continuous.
+- The **survival function** of a random variable $X$ is defined as
+$$
+S(x) = P(X > x)
+$$
+- Notice that $S(x) = 1 - F(x)$
+- For continuous random variables, the PDF is the derivative of the CDF
+
+---
+
+## Example
+
+What are the survival function and CDF from the density considered before?
+
+For $1 \geq x \geq 0$
+$$
+F(x) = P(X \leq x) = \frac{1}{2} Base \times Height = \frac{1}{2} (x) \times (2 x) = x^2
+$$
+
+$$
+S(x) = 1 - x^2
+$$
+
+
+```r
+pbeta(c(0.4, 0.5, 0.6), 2, 1)
+```
+
+```
+## [1] 0.16 0.25 0.36
+```
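+
+The survival function can be checked the same way. This sketch assumes, as the `pbeta` calls above do, that the density $2x$ on $(0, 1)$ is the Beta(2, 1) density; `lower.tail = FALSE` asks `pbeta` for $P(X > x)$.
+
+```r
+x <- c(0.4, 0.5, 0.6)
+S_formula <- 1 - x^2                             # S(x) as derived on the slide
+S_pbeta <- pbeta(x, 2, 1, lower.tail = FALSE)    # upper-tail probability P(X > x)
+all.equal(S_formula, S_pbeta)
+```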
+
+
+---
+
+## Quantiles
+
+- The  $\alpha^{th}$ **quantile** of a distribution with distribution function $F$ is the point $x_\alpha$ so that
+$$
+F(x_\alpha) = \alpha
+$$
+- A **percentile** is simply a quantile with $\alpha$ expressed as a percent
+- The **median** is the $50^{th}$ percentile
+
+---
+## Example
+- We want to solve $0.5 = F(x) = x^2$
+- Resulting in the solution 
+
+```r
+sqrt(0.5)
+```
+
+```
+## [1] 0.7071
+```
+
+- Therefore, the median proportion of calls answered on a random day is about 0.7071.
+- R can approximate quantiles for you for common distributions
+
+
+```r
+qbeta(0.5, 2, 1)
+```
+
+```
+## [1] 0.7071
+```
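+
+The same reasoning works for any $\alpha$: solving $x_\alpha^2 = \alpha$ gives $x_\alpha = \sqrt{\alpha}$. A short sketch comparing the closed form with `qbeta` (the choice of quartiles is arbitrary):
+
+```r
+alpha <- c(0.25, 0.5, 0.75)
+q_closed <- sqrt(alpha)           # closed form from solving x^2 = alpha
+q_beta <- qbeta(alpha, 2, 1)      # quantiles from R's Beta(2, 1) quantile function
+all.equal(q_closed, q_beta)
+```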
+
+
+---
+
+## Summary
+
+- You might be wondering at this point "I've heard of a median before, it didn't require integration. Where's the data?"
+- What we're referring to are **population quantities**. Therefore, the median being
+  discussed is the **population median**.
+- A probability model connects the data to the population using assumptions.
+- Therefore the median we're discussing is the **estimand**; the sample median will be the **estimator**
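+
+To make the distinction concrete, here is a small simulation sketch. It assumes the help-line density from earlier (Beta(2, 1)) is the true model; the seed and sample size are arbitrary.
+
+```r
+set.seed(1)                              # arbitrary seed so the sketch is reproducible
+x <- rbeta(10000, 2, 1)                  # simulated data from the density f(x) = 2x
+sample_median <- median(x)               # estimator: computed from the data
+population_median <- qbeta(0.5, 2, 1)    # estimand: a property of the model
+c(sample = sample_median, population = population_median)
+```
+
+The two numbers should be close but not identical; the gap is the sampling error that statistical inference tries to quantify.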
diff --git a/06_StatisticalInference/01_02_Probability/index.pdf b/06_StatisticalInference/01_02_Probability/index.pdf
index b431ce394..fddebae9e 100644
Binary files a/06_StatisticalInference/01_02_Probability/index.pdf and b/06_StatisticalInference/01_02_Probability/index.pdf differ
diff --git a/06_StatisticalInference/01_03_Expectations/assets/fig/lsm.png b/06_StatisticalInference/01_03_Expectations/assets/fig/lsm.png
new file mode 100644
index 000000000..d37f19c84
Binary files /dev/null and b/06_StatisticalInference/01_03_Expectations/assets/fig/lsm.png differ
diff --git a/06_StatisticalInference/01_03_Expectations/assets/fig/unnamed-chunk-1.png b/06_StatisticalInference/01_03_Expectations/assets/fig/unnamed-chunk-1.png
new file mode 100644
index 000000000..d55969896
Binary files /dev/null and b/06_StatisticalInference/01_03_Expectations/assets/fig/unnamed-chunk-1.png differ
diff --git a/06_StatisticalInference/01_03_Expectations/assets/fig/unnamed-chunk-2.png b/06_StatisticalInference/01_03_Expectations/assets/fig/unnamed-chunk-2.png
new file mode 100644
index 000000000..94882ce30
Binary files /dev/null and b/06_StatisticalInference/01_03_Expectations/assets/fig/unnamed-chunk-2.png differ
diff --git a/06_StatisticalInference/01_03_Expectations/assets/fig/unnamed-chunk-3.png b/06_StatisticalInference/01_03_Expectations/assets/fig/unnamed-chunk-3.png
new file mode 100644
index 000000000..15a643336
Binary files /dev/null and b/06_StatisticalInference/01_03_Expectations/assets/fig/unnamed-chunk-3.png differ
diff --git a/06_StatisticalInference/01_03_Expectations/index.html b/06_StatisticalInference/01_03_Expectations/index.html
index 04a508ac3..72dd4b7b5 100644
--- a/06_StatisticalInference/01_03_Expectations/index.html
+++ b/06_StatisticalInference/01_03_Expectations/index.html
@@ -1,549 +1,552 @@
-<!DOCTYPE html>
-<html>
-<head>
-  <title>Expected values</title>
-  <meta charset="utf-8">
-  <meta name="description" content="Expected values">
-  <meta name="author" content="Brian Caffo, Jeff Leek, Roger Peng">
-  <meta name="generator" content="slidify" />
-  <meta name="apple-mobile-web-app-capable" content="yes">
-  <meta http-equiv="X-UA-Compatible" content="chrome=1">
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/default.css" media="all" >
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/phone.css" 
-    media="only screen and (max-device-width: 480px)" >
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/slidify.css" >
-  <link rel="stylesheet" href="../../librariesNew/highlighters/highlight.js/css/tomorrow.css" />
-  <base target="_blank"> <!-- This amazingness opens all links in a new tab. -->  
-  
-  <!-- Grab CDN jQuery, fall back to local if offline -->
-  <script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.7.min.js"></script>
-  <script>window.jQuery || document.write('<script src="../../librariesNew/widgets/quiz/js/jquery.js"><\/script>')</script> 
-  <script data-main="../../librariesNew/frameworks/io2012/js/slides" 
-    src="../../librariesNew/frameworks/io2012/js/require-1.0.8.min.js">
-  </script>
-  
-  
-
-</head>
-<body style="opacity: 0">
-  <slides class="layout-widescreen">
-    
-    <!-- LOGO SLIDE -->
-        <slide class="title-slide segue nobackground">
-  <aside class="gdbar">
-    <img src="../../assets/img/bloomberg_shield.png">
-  </aside>
-  <hgroup class="auto-fadein">
-    <h1>Expected values</h1>
-    <h2>Statistical Inference</h2>
-    <p>Brian Caffo, Jeff Leek, Roger Peng<br/>Johns Hopkins Bloomberg School of Public Health</p>
-  </hgroup>
-  <article></article>  
-</slide>
-    
-
-    <!-- SLIDES -->
-    <slide class="" id="slide-1" style="background:;">
-  <hgroup>
-    <h2>Expected values</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>The <strong>expected value</strong> or <strong>mean</strong> of a random variable is the center of its distribution</li>
-<li>For discrete random variable \(X\) with PMF \(p(x)\), it is defined as follows
-\[
-E[X] = \sum_x xp(x).
-\]
-where the sum is taken over the possible values of \(x\)</li>
-<li>\(E[X]\) represents the center of mass of a collection of locations and weights, \(\{x, p(x)\}\)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-2" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <h3>Find the center of mass of the bars</h3>
-
-<p><img src="assets/fig/unnamed-chunk-1.png" alt="plot of chunk unnamed-chunk-1"> </p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-3" style="background:;">
-  <hgroup>
-    <h2>Using manipulate</h2>
-  </hgroup>
-  <article data-timings="">
-    <pre><code>library(manipulate)
-myHist &lt;- function(mu){
-  hist(galton$child,col=&quot;blue&quot;,breaks=100)
-  lines(c(mu, mu), c(0, 150),col=&quot;red&quot;,lwd=5)
-  mse &lt;- mean((galton$child - mu)^2)
-  text(63, 150, paste(&quot;mu = &quot;, mu))
-  text(63, 140, paste(&quot;Imbalance = &quot;, round(mse, 2)))
-}
-manipulate(myHist(mu), mu = slider(62, 74, step = 0.5))
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-4" style="background:;">
-  <hgroup>
-    <h2>The center of mass is the empirical mean</h2>
-  </hgroup>
-  <article data-timings="">
-    <pre><code class="r">hist(galton$child, col = &quot;blue&quot;, breaks = 100)
-meanChild &lt;- mean(galton$child)
-lines(rep(meanChild, 100), seq(0, 150, length = 100), col = &quot;red&quot;, lwd = 5)
-</code></pre>
-
-<p><img src="assets/fig/lsm.png" alt="plot of chunk lsm"> </p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-5" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Suppose a coin is flipped and \(X\) is declared \(0\) or \(1\) corresponding to a head or a tail, respectively</li>
-<li>What is the expected value of \(X\)? 
-\[
-E[X] = .5 \times 0 + .5 \times 1 = .5
-\]</li>
-<li>Note, if thought about geometrically, this answer is obvious; if two equal weights are spaced at 0 and 1, the center of mass will be \(.5\)</li>
-</ul>
-
-<p><img src="assets/fig/unnamed-chunk-2.png" alt="plot of chunk unnamed-chunk-2"> </p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-6" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Suppose that a die is rolled and \(X\) is the number face up</li>
-<li>What is the expected value of \(X\)?
-\[
-E[X] = 1 \times \frac{1}{6} + 2 \times \frac{1}{6} +
-3 \times \frac{1}{6} + 4 \times \frac{1}{6} +
-5 \times \frac{1}{6} + 6 \times \frac{1}{6} = 3.5
-\]</li>
-<li>Again, the geometric argument makes this answer obvious without calculation.</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-7" style="background:;">
-  <hgroup>
-    <h2>Continuous random variables</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>For a continuous random variable, \(X\), with density, \(f\), the expected
-value is defined as follows
-\[
-E[X] = \mbox{the area under the function}~~~ t f(t)
-\]</li>
-<li>This definition borrows from the definition of center of mass for a continuous body</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-8" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Consider a density where \(f(x) = 1\) for \(x\) between zero and one</li>
-<li>(Is this a valid density?)</li>
-<li>Suppose that \(X\) follows this density; what is its expected value?<br>
-<img src="assets/fig/unnamed-chunk-3.png" alt="plot of chunk unnamed-chunk-3"> </li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-9" style="background:;">
-  <hgroup>
-    <h2>Rules about expected values</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>The expected value is a linear operator </li>
-<li>If \(a\) and \(b\) are not random and \(X\) and \(Y\) are two random variables then
-
-<ul>
-<li>\(E[aX + b] = a E[X] + b\)</li>
-<li>\(E[X + Y] = E[X] + E[Y]\)</li>
-</ul></li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-10" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>You flip a coin, \(X\) and simulate a uniform random number \(Y\), what is the expected value of their sum? 
-\[
-E[X + Y] = E[X] + E[Y] = .5 + .5 = 1
-\] </li>
-<li>Another example, you roll a die twice. What is the expected value of the average? </li>
-<li>Let \(X_1\) and \(X_2\) be the results of the two rolls
-\[
-E[(X_1 + X_2) / 2] = \frac{1}{2}(E[X_1] + E[X_2])
-= \frac{1}{2}(3.5 + 3.5) = 3.5
-\]</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-11" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <ol>
-<li>Let \(X_i\) for \(i=1,\ldots,n\) be a collection of random variables, each from a distribution with mean \(\mu\)</li>
-<li>Calculate the expected value of the sample average of the \(X_i\)
-\[
-\begin{eqnarray*}
-E\left[ \frac{1}{n}\sum_{i=1}^n X_i\right]
-& = & \frac{1}{n} E\left[\sum_{i=1}^n X_i\right] \\
-& = & \frac{1}{n} \sum_{i=1}^n E\left[X_i\right] \\
-& = & \frac{1}{n} \sum_{i=1}^n \mu =  \mu.
-\end{eqnarray*}
-\]</li>
-</ol>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-12" style="background:;">
-  <hgroup>
-    <h2>Remark</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Therefore, the expected value of the <strong>sample mean</strong> is the population mean that it&#39;s trying to estimate</li>
-<li>When the expected value of an estimator is what its trying to estimate, we say that the estimator is <strong>unbiased</strong></li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-13" style="background:;">
-  <hgroup>
-    <h2>The variance</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>The variance of a random variable is a measure of <em>spread</em></li>
-<li>If \(X\) is a random variable with mean \(\mu\), the variance of \(X\) is defined as</li>
-</ul>
-
-<p>\[
-Var(X) = E[(X - \mu)^2]
-\]</p>
-
-<p>the expected (squared) distance from the mean</p>
-
-<ul>
-<li>Densities with a higher variance are more spread out than densities with a lower variance</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-14" style="background:;">
-  <article data-timings="">
-    <ul>
-<li>Convenient computational form
-\[
-Var(X) = E[X^2] - E[X]^2
-\]</li>
-<li>If \(a\) is constant then \(Var(aX) = a^2 Var(X)\)</li>
-<li>The square root of the variance is called the <strong>standard deviation</strong></li>
-<li>The standard deviation has the same units as \(X\)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-15" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li><p>What&#39;s the sample variance from the result of a toss of a die? </p>
-
-<ul>
-<li>\(E[X] = 3.5\) </li>
-<li>\(E[X^2] = 1 ^ 2 \times \frac{1}{6} + 2 ^ 2 \times \frac{1}{6} + 3 ^ 2 \times \frac{1}{6} + 4 ^ 2 \times \frac{1}{6} + 5 ^ 2 \times \frac{1}{6} + 6 ^ 2 \times \frac{1}{6} = 15.17\) </li>
-</ul></li>
-<li><p>\(Var(X) = E[X^2] - E[X]^2 \approx 2.92\)</p></li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-16" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li><p>What&#39;s the sample variance from the result of the toss of a coin with probability of heads (1) of \(p\)? </p>
-
-<ul>
-<li>\(E[X] = 0 \times (1 - p) + 1 \times p = p\)</li>
-<li>\(E[X^2] = E[X] = p\) </li>
-</ul></li>
-<li><p>\(Var(X) = E[X^2] - E[X]^2 = p - p^2 = p(1 - p)\)</p></li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-17" style="background:;">
-  <hgroup>
-    <h2>Interpreting variances</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Chebyshev&#39;s inequality is useful for interpreting variances</li>
-<li>This inequality states that
-\[
-P(|X - \mu| \geq k\sigma) \leq \frac{1}{k^2}
-\]</li>
-<li>For example, the probability that a random variable lies beyond \(k\) standard deviations from its mean is less than \(1/k^2\)
-\[
-\begin{eqnarray*}
-2\sigma & \rightarrow & 25\% \\
-3\sigma & \rightarrow & 11\% \\
-4\sigma & \rightarrow &  6\% 
-\end{eqnarray*}
-\]</li>
-<li>Note this is only a bound; the actual probability might be quite a bit smaller</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-18" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>IQs are often said to be distributed with a mean of \(100\) and a sd of \(15\)</li>
-<li>What is the probability of a randomly drawn person having an IQ higher than \(160\) or below \(40\)?</li>
-<li>Thus we want to know the probability of a person being more than \(4\) standard deviations from the mean</li>
-<li>Thus Chebyshev&#39;s inequality suggests that this will be no larger than 6\%</li>
-<li>IQs distributions are often cited as being bell shaped, in which case this bound is very conservative</li>
-<li>The probability of a random draw from a bell curve being \(4\) standard deviations from the mean is on the order of \(10^{-5}\) (one thousandth of one percent)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-19" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>A former buzz phrase in industrial quality control is Motorola&#39;s &quot;Six Sigma&quot; whereby businesses are suggested to control extreme events or rare defective parts</li>
-<li>Chebyshev&#39;s inequality states that the probability of a &quot;Six Sigma&quot; event is less than \(1/6^2 \approx 3\%\)</li>
-<li>If a bell curve is assumed, the probability of a &quot;six sigma&quot; event is on the order of \(10^{-9}\) (one ten millionth of a percent)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-    <slide class="backdrop"></slide>
-  </slides>
-  <div class="pagination pagination-small" id='io2012-ptoc' style="display:none;">
-    <ul>
-      <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=1 title='Expected values'>
-         1
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=2 title='Example'>
-         2
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=3 title='Using manipulate'>
-         3
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=4 title='The center of mass is the empirical mean'>
-         4
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=5 title='Example'>
-         5
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=6 title='Example'>
-         6
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=7 title='Continuous random variables'>
-         7
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=8 title='Example'>
-         8
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=9 title='Rules about expected values'>
-         9
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=10 title='Example'>
-         10
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=11 title='Example'>
-         11
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=12 title='Remark'>
-         12
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=13 title='The variance'>
-         13
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=14 title=''>
-         14
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=15 title='Example'>
-         15
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=16 title='Example'>
-         16
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=17 title='Interpreting variances'>
-         17
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=18 title='Example'>
-         18
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=19 title='Example'>
-         19
-      </a>
-    </li>
-  </ul>
-  </div>  <!--[if IE]>
-    <script 
-      src="http://ajax.googleapis.com/ajax/libs/chrome-frame/1/CFInstall.min.js">  
-    </script>
-    <script>CFInstall.check({mode: 'overlay'});</script>
-  <![endif]-->
-</body>
-  <!-- Load Javascripts for Widgets -->
-  
-  <!-- MathJax: Fall back to local if CDN offline but local image fonts are not supported (saves >100MB) -->
-  <script type="text/x-mathjax-config">
-    MathJax.Hub.Config({
-      tex2jax: {
-        inlineMath: [['$','$'], ['\\(','\\)']],
-        processEscapes: true
-      }
-    });
-  </script>
-  <script type="text/javascript" src="http://cdn.mathjax.org/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
-  <!-- <script src="https://c328740.ssl.cf1.rackcdn.com/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
-  </script> -->
-  <script>window.MathJax || document.write('<script type="text/x-mathjax-config">MathJax.Hub.Config({"HTML-CSS":{imageFont:null}});<\/script><script src="../../librariesNew/widgets/mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"><\/script>')
-</script>
-<!-- LOAD HIGHLIGHTER JS FILES -->
-  <script src="../../librariesNew/highlighters/highlight.js/highlight.pack.js"></script>
-  <script>hljs.initHighlightingOnLoad();</script>
-  <!-- DONE LOADING HIGHLIGHTER JS FILES -->
-   
+<!DOCTYPE html>
+<html>
+<head>
+  <title>Expected values</title>
+  <meta charset="utf-8">
+  <meta name="description" content="Expected values">
+  <meta name="author" content="Brian Caffo, Jeff Leek, Roger Peng">
+  <meta name="generator" content="slidify" />
+  <meta name="apple-mobile-web-app-capable" content="yes">
+  <meta http-equiv="X-UA-Compatible" content="chrome=1">
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/default.css" media="all" >
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/phone.css" 
+    media="only screen and (max-device-width: 480px)" >
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/slidify.css" >
+  <link rel="stylesheet" href="../../librariesNew/highlighters/highlight.js/css/tomorrow.css" />
+  <base target="_blank"> <!-- This amazingness opens all links in a new tab. -->  
+  
+  <!-- Grab CDN jQuery, fall back to local if offline -->
+  <script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.7.min.js"></script>
+  <script>window.jQuery || document.write('<script src="../../librariesNew/widgets/quiz/js/jquery.js"><\/script>')</script> 
+  <script data-main="../../librariesNew/frameworks/io2012/js/slides" 
+    src="../../librariesNew/frameworks/io2012/js/require-1.0.8.min.js">
+  </script>
+  
+  
+
+</head>
+<body style="opacity: 0">
+  <slides class="layout-widescreen">
+    
+    <!-- LOGO SLIDE -->
+        <slide class="title-slide segue nobackground">
+  <aside class="gdbar">
+    <img src="../../assets/img/bloomberg_shield.png">
+  </aside>
+  <hgroup class="auto-fadein">
+    <h1>Expected values</h1>
+    <h2>Statistical Inference</h2>
+    <p>Brian Caffo, Jeff Leek, Roger Peng<br/>Johns Hopkins Bloomberg School of Public Health</p>
+  </hgroup>
+  <article></article>  
+</slide>
+    
+
+    <!-- SLIDES -->
+    <slide class="" id="slide-1" style="background:;">
+  <hgroup>
+    <h2>Expected values</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>The <strong>expected value</strong> or <strong>mean</strong> of a random variable is the center of its distribution</li>
+<li>For a discrete random variable \(X\) with PMF \(p(x)\), it is defined as follows
+\[
+E[X] = \sum_x xp(x).
+\]
+where the sum is taken over the possible values of \(x\)</li>
+<li>\(E[X]\) represents the center of mass of a collection of locations and weights, \(\{x, p(x)\}\)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-2" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
+    <h3>Find the center of mass of the bars</h3>
+
+<pre><code>## Loading required package: MASS
+</code></pre>
+
+<p><img src="assets/fig/unnamed-chunk-1.png" title="plot of chunk unnamed-chunk-1" alt="plot of chunk unnamed-chunk-1" style="display: block; margin: auto;" /></p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-3" style="background:;">
+  <hgroup>
+    <h2>Using manipulate</h2>
+  </hgroup>
+  <article data-timings="">
+    <pre><code>library(manipulate)
+myHist &lt;- function(mu){
+  hist(galton$child,col=&quot;blue&quot;,breaks=100)
+  lines(c(mu, mu), c(0, 150),col=&quot;red&quot;,lwd=5)
+  mse &lt;- mean((galton$child - mu)^2)
+  text(63, 150, paste(&quot;mu = &quot;, mu))
+  text(63, 140, paste(&quot;Imbalance = &quot;, round(mse, 2)))
+}
+manipulate(myHist(mu), mu = slider(62, 74, step = 0.5))
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-4" style="background:;">
+  <hgroup>
+    <h2>The center of mass is the empirical mean</h2>
+  </hgroup>
+  <article data-timings="">
+    <pre><code class="r">hist(galton$child, col = &quot;blue&quot;, breaks = 100)
+meanChild &lt;- mean(galton$child)
+lines(rep(meanChild, 100), seq(0, 150, length = 100), col = &quot;red&quot;, lwd = 5)
+</code></pre>
+
+<p><img src="assets/fig/lsm.png" title="plot of chunk lsm" alt="plot of chunk lsm" style="display: block; margin: auto;" /></p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-5" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Suppose a coin is flipped and \(X\) is declared \(0\) or \(1\) corresponding to a head or a tail, respectively</li>
+<li>What is the expected value of \(X\)? 
+\[
+E[X] = .5 \times 0 + .5 \times 1 = .5
+\]</li>
+<li>Note, if thought about geometrically, this answer is obvious; if two equal weights are spaced at 0 and 1, the center of mass will be \(.5\)</li>
+</ul>
+
+<p><img src="assets/fig/unnamed-chunk-2.png" title="plot of chunk unnamed-chunk-2" alt="plot of chunk unnamed-chunk-2" style="display: block; margin: auto;" /></p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-6" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Suppose that a die is rolled and \(X\) is the number face up</li>
+<li>What is the expected value of \(X\)?
+\[
+E[X] = 1 \times \frac{1}{6} + 2 \times \frac{1}{6} +
+3 \times \frac{1}{6} + 4 \times \frac{1}{6} +
+5 \times \frac{1}{6} + 6 \times \frac{1}{6} = 3.5
+\]</li>
+<li>Again, the geometric argument makes this answer obvious without calculation.</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-7" style="background:;">
+  <hgroup>
+    <h2>Continuous random variables</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>For a continuous random variable, \(X\), with density, \(f\), the expected
+value is defined as follows
+\[
+E[X] = \mbox{the area under the function}~~~ t f(t)
+\]</li>
+<li>This definition borrows from the definition of center of mass for a continuous body</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-8" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Consider a density where \(f(x) = 1\) for \(x\) between zero and one</li>
+<li>(Is this a valid density?)</li>
+<li>Suppose that \(X\) follows this density; what is its expected value?<br>
+<img src="assets/fig/unnamed-chunk-3.png" alt="plot of chunk unnamed-chunk-3"> </li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-9" style="background:;">
+  <hgroup>
+    <h2>Rules about expected values</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>The expected value is a linear operator </li>
+<li>If \(a\) and \(b\) are not random and \(X\) and \(Y\) are two random variables then
+
+<ul>
+<li>\(E[aX + b] = a E[X] + b\)</li>
+<li>\(E[X + Y] = E[X] + E[Y]\)</li>
+</ul></li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-10" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>You flip a coin, \(X\) and simulate a uniform random number \(Y\), what is the expected value of their sum? 
+\[
+E[X + Y] = E[X] + E[Y] = .5 + .5 = 1
+\] </li>
+<li>Another example, you roll a die twice. What is the expected value of the average? </li>
+<li>Let \(X_1\) and \(X_2\) be the results of the two rolls
+\[
+E[(X_1 + X_2) / 2] = \frac{1}{2}(E[X_1] + E[X_2])
+= \frac{1}{2}(3.5 + 3.5) = 3.5
+\]</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-11" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
+    <ol>
+<li>Let \(X_i\) for \(i=1,\ldots,n\) be a collection of random variables, each from a distribution with mean \(\mu\)</li>
+<li>Calculate the expected value of the sample average of the \(X_i\)
+\[
+\begin{eqnarray*}
+E\left[ \frac{1}{n}\sum_{i=1}^n X_i\right]
+& = & \frac{1}{n} E\left[\sum_{i=1}^n X_i\right] \\
+& = & \frac{1}{n} \sum_{i=1}^n E\left[X_i\right] \\
+& = & \frac{1}{n} \sum_{i=1}^n \mu =  \mu.
+\end{eqnarray*}
+\]</li>
+</ol>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-12" style="background:;">
+  <hgroup>
+    <h2>Remark</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Therefore, the expected value of the <strong>sample mean</strong> is the population mean that it&#39;s trying to estimate</li>
+<li>When the expected value of an estimator is what it&#39;s trying to estimate, we say that the estimator is <strong>unbiased</strong></li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-13" style="background:;">
+  <hgroup>
+    <h2>The variance</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>The variance of a random variable is a measure of <em>spread</em></li>
+<li>If \(X\) is a random variable with mean \(\mu\), the variance of \(X\) is defined as</li>
+</ul>
+
+<p>\[
+Var(X) = E[(X - \mu)^2]
+\]</p>
+
+<p>the expected (squared) distance from the mean</p>
+
+<ul>
+<li>Densities with a higher variance are more spread out than densities with a lower variance</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-14" style="background:;">
+  <article data-timings="">
+    <ul>
+<li>Convenient computational form
+\[
+Var(X) = E[X^2] - E[X]^2
+\]</li>
+<li>If \(a\) is constant then \(Var(aX) = a^2 Var(X)\)</li>
+<li>The square root of the variance is called the <strong>standard deviation</strong></li>
+<li>The standard deviation has the same units as \(X\)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-15" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li><p>What&#39;s the variance of the result of a toss of a die? </p>
+
+<ul>
+<li>\(E[X] = 3.5\) </li>
+<li>\(E[X^2] = 1 ^ 2 \times \frac{1}{6} + 2 ^ 2 \times \frac{1}{6} + 3 ^ 2 \times \frac{1}{6} + 4 ^ 2 \times \frac{1}{6} + 5 ^ 2 \times \frac{1}{6} + 6 ^ 2 \times \frac{1}{6} \approx 15.17\) </li>
+</ul></li>
+<li><p>\(Var(X) = E[X^2] - E[X]^2 \approx 2.92\)</p></li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-16" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li><p>What&#39;s the variance of the result of the toss of a coin with probability of heads (1) of \(p\)? </p>
+
+<ul>
+<li>\(E[X] = 0 \times (1 - p) + 1 \times p = p\)</li>
+<li>\(E[X^2] = E[X] = p\) </li>
+</ul></li>
+<li><p>\(Var(X) = E[X^2] - E[X]^2 = p - p^2 = p(1 - p)\)</p></li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-17" style="background:;">
+  <hgroup>
+    <h2>Interpreting variances</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Chebyshev&#39;s inequality is useful for interpreting variances</li>
+<li>This inequality states that
+\[
+P(|X - \mu| \geq k\sigma) \leq \frac{1}{k^2}
+\]</li>
+<li>That is, the probability that a random variable lies \(k\) or more standard deviations from its mean is at most \(1/k^2\)
+\[
+\begin{eqnarray*}
+2\sigma & \rightarrow & 25\% \\
+3\sigma & \rightarrow & 11\% \\
+4\sigma & \rightarrow &  6\% 
+\end{eqnarray*}
+\]</li>
+<li>Note this is only a bound; the actual probability might be quite a bit smaller</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-18" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>IQs are often said to be distributed with a mean of \(100\) and a sd of \(15\)</li>
+<li>What is the probability of a randomly drawn person having an IQ higher than \(160\) or below \(40\)?</li>
+<li>That is, we want the probability of a person being more than \(4\) standard deviations from the mean</li>
+<li>Chebyshev&#39;s inequality guarantees that this is no larger than \(1/16\), about 6%</li>
+<li>IQ distributions are often cited as being bell shaped, in which case this bound is very conservative</li>
+<li>The probability of a random draw from a bell curve being more than \(4\) standard deviations from the mean is on the order of \(10^{-5}\) (one thousandth of one percent)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-19" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>A former buzz phrase in industrial quality control is Motorola&#39;s &quot;Six Sigma&quot;, under which businesses are encouraged to control extreme events or rare defective parts</li>
+<li>Chebyshev&#39;s inequality states that the probability of a &quot;Six Sigma&quot; event is at most \(1/6^2 \approx 3\%\)</li>
+<li>If a bell curve is assumed, the probability of a &quot;six sigma&quot; event is on the order of \(10^{-9}\) (one ten millionth of a percent)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+    <slide class="backdrop"></slide>
+  </slides>
+  <div class="pagination pagination-small" id='io2012-ptoc' style="display:none;">
+    <ul>
+      <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=1 title='Expected values'>
+         1
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=2 title='Example'>
+         2
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=3 title='Using manipulate'>
+         3
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=4 title='The center of mass is the empirical mean'>
+         4
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=5 title='Example'>
+         5
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=6 title='Example'>
+         6
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=7 title='Continuous random variables'>
+         7
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=8 title='Example'>
+         8
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=9 title='Rules about expected values'>
+         9
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=10 title='Example'>
+         10
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=11 title='Example'>
+         11
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=12 title='Remark'>
+         12
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=13 title='The variance'>
+         13
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=14 title=''>
+         14
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=15 title='Example'>
+         15
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=16 title='Example'>
+         16
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=17 title='Interpreting variances'>
+         17
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=18 title='Example'>
+         18
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=19 title='Example'>
+         19
+      </a>
+    </li>
+  </ul>
+  </div>  <!--[if IE]>
+    <script 
+      src="http://ajax.googleapis.com/ajax/libs/chrome-frame/1/CFInstall.min.js">  
+    </script>
+    <script>CFInstall.check({mode: 'overlay'});</script>
+  <![endif]-->
+</body>
+  <!-- Load Javascripts for Widgets -->
+  
+  <!-- MathJax: Fall back to local if CDN offline but local image fonts are not supported (saves >100MB) -->
+  <script type="text/x-mathjax-config">
+    MathJax.Hub.Config({
+      tex2jax: {
+        inlineMath: [['$','$'], ['\\(','\\)']],
+        processEscapes: true
+      }
+    });
+  </script>
+  <script type="text/javascript" src="http://cdn.mathjax.org/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
+  <!-- <script src="https://c328740.ssl.cf1.rackcdn.com/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
+  </script> -->
+  <script>window.MathJax || document.write('<script type="text/x-mathjax-config">MathJax.Hub.Config({"HTML-CSS":{imageFont:null}});<\/script><script src="../../librariesNew/widgets/mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"><\/script>')
+</script>
+<!-- LOAD HIGHLIGHTER JS FILES -->
+  <script src="../../librariesNew/highlighters/highlight.js/highlight.pack.js"></script>
+  <script>hljs.initHighlightingOnLoad();</script>
+  <!-- DONE LOADING HIGHLIGHTER JS FILES -->
+   
   </html>
\ No newline at end of file
diff --git a/06_StatisticalInference/01_03_Expectations/index.md b/06_StatisticalInference/01_03_Expectations/index.md
index 8ac21861b..e86b30289 100644
--- a/06_StatisticalInference/01_03_Expectations/index.md
+++ b/06_StatisticalInference/01_03_Expectations/index.md
@@ -1,234 +1,239 @@
----
-title       : Expected values
-subtitle    : Statistical Inference
-author      : Brian Caffo, Jeff Leek, Roger Peng
-job         : Johns Hopkins Bloomberg School of Public Health
-logo        : bloomberg_shield.png
-framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
-highlighter : highlight.js  # {highlight.js, prettify, highlight}
-hitheme     : tomorrow      # 
-url:
-  lib: ../../librariesNew
-  assets: ../../assets
-widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
-mode        : selfcontained # {standalone, draft}
----
-## Expected values
-
-- The **expected value** or **mean** of a random variable is the center of its distribution
-- For discrete random variable $X$ with PMF $p(x)$, it is defined as follows
-    $$
-    E[X] = \sum_x xp(x).
-    $$
-    where the sum is taken over the possible values of $x$
-- $E[X]$ represents the center of mass of a collection of locations and weights, $\{x, p(x)\}$
-
----
-
-## Example
-### Find the center of mass of the bars
-![plot of chunk unnamed-chunk-1](assets/fig/unnamed-chunk-1.png) 
-
-
----
-## Using manipulate
-```
-library(manipulate)
-myHist <- function(mu){
-  hist(galton$child,col="blue",breaks=100)
-  lines(c(mu, mu), c(0, 150),col="red",lwd=5)
-  mse <- mean((galton$child - mu)^2)
-  text(63, 150, paste("mu = ", mu))
-  text(63, 140, paste("Imbalance = ", round(mse, 2)))
-}
-manipulate(myHist(mu), mu = slider(62, 74, step = 0.5))
-```
-
----
-## The center of mass is the empirical mean
-
-```r
-hist(galton$child, col = "blue", breaks = 100)
-meanChild <- mean(galton$child)
-lines(rep(meanChild, 100), seq(0, 150, length = 100), col = "red", lwd = 5)
-```
-
-![plot of chunk lsm](assets/fig/lsm.png) 
-
-
----
-## Example
-
-- Suppose a coin is flipped and $X$ is declared $0$ or $1$ corresponding to a head or a tail, respectively
-- What is the expected value of $X$? 
-    $$
-    E[X] = .5 \times 0 + .5 \times 1 = .5
-    $$
-- Note, if thought about geometrically, this answer is obvious; if two equal weights are spaced at 0 and 1, the center of mass will be $.5$
-
-![plot of chunk unnamed-chunk-2](assets/fig/unnamed-chunk-2.png) 
-
----
-
-## Example
-
-- Suppose that a die is rolled and $X$ is the number face up
-- What is the expected value of $X$?
-    $$
-    E[X] = 1 \times \frac{1}{6} + 2 \times \frac{1}{6} +
-    3 \times \frac{1}{6} + 4 \times \frac{1}{6} +
-    5 \times \frac{1}{6} + 6 \times \frac{1}{6} = 3.5
-    $$
-- Again, the geometric argument makes this answer obvious without calculation.
-
----
-
-## Continuous random variables
-
-- For a continuous random variable, $X$, with density, $f$, the expected
-    value is defined as follows
-    $$
-    E[X] = \mbox{the area under the function}~~~ t f(t)
-    $$
-- This definition borrows from the definition of center of mass for a continuous body
-
----
-
-## Example
-
-- Consider a density where $f(x) = 1$ for $x$ between zero and one
-- (Is this a valid density?)
-- Suppose that $X$ follows this density; what is its expected value?  
-![plot of chunk unnamed-chunk-3](assets/fig/unnamed-chunk-3.png) 
-
-
----
-
-## Rules about expected values
-
-- The expected value is a linear operator 
-- If $a$ and $b$ are not random and $X$ and $Y$ are two random variables then
-  - $E[aX + b] = a E[X] + b$
-  - $E[X + Y] = E[X] + E[Y]$
-
----
-
-## Example
-
-- You flip a coin, $X$ and simulate a uniform random number $Y$, what is the expected value of their sum? 
-    $$
-    E[X + Y] = E[X] + E[Y] = .5 + .5 = 1
-    $$ 
-- Another example, you roll a die twice. What is the expected value of the average? 
-- Let $X_1$ and $X_2$ be the results of the two rolls
-    $$
-    E[(X_1 + X_2) / 2] = \frac{1}{2}(E[X_1] + E[X_2])
-    = \frac{1}{2}(3.5 + 3.5) = 3.5
-    $$
-
----
-
-## Example
-
-1. Let $X_i$ for $i=1,\ldots,n$ be a collection of random variables, each from a distribution with mean $\mu$
-2. Calculate the expected value of the sample average of the $X_i$
-$$
-  \begin{eqnarray*}
-    E\left[ \frac{1}{n}\sum_{i=1}^n X_i\right]
-    & = & \frac{1}{n} E\left[\sum_{i=1}^n X_i\right] \\
-    & = & \frac{1}{n} \sum_{i=1}^n E\left[X_i\right] \\
-    & = & \frac{1}{n} \sum_{i=1}^n \mu =  \mu.
-  \end{eqnarray*}
-$$
-
----
-
-## Remark
-
-- Therefore, the expected value of the **sample mean** is the population mean that it's trying to estimate
-- When the expected value of an estimator is what its trying to estimate, we say that the estimator is **unbiased**
-
----
-
-## The variance
-
-- The variance of a random variable is a measure of *spread*
-- If $X$ is a random variable with mean $\mu$, the variance of $X$ is defined as
-
-$$
-Var(X) = E[(X - \mu)^2]
-$$
-    
-the expected (squared) distance from the mean
-- Densities with a higher variance are more spread out than densities with a lower variance
-
----
-
-- Convenient computational form
-$$
-Var(X) = E[X^2] - E[X]^2
-$$
-- If $a$ is constant then $Var(aX) = a^2 Var(X)$
-- The square root of the variance is called the **standard deviation**
-- The standard deviation has the same units as $X$
-
----
-
-## Example
-
-- What's the sample variance from the result of a toss of a die? 
-
-  - $E[X] = 3.5$ 
-  - $E[X^2] = 1 ^ 2 \times \frac{1}{6} + 2 ^ 2 \times \frac{1}{6} + 3 ^ 2 \times \frac{1}{6} + 4 ^ 2 \times \frac{1}{6} + 5 ^ 2 \times \frac{1}{6} + 6 ^ 2 \times \frac{1}{6} = 15.17$ 
-
-- $Var(X) = E[X^2] - E[X]^2 \approx 2.92$
-
----
-
-## Example
-
-- What's the sample variance from the result of the toss of a coin with probability of heads (1) of $p$? 
-
-  - $E[X] = 0 \times (1 - p) + 1 \times p = p$
-  - $E[X^2] = E[X] = p$ 
-
-- $Var(X) = E[X^2] - E[X]^2 = p - p^2 = p(1 - p)$
-
----
-
-## Interpreting variances
-
-- Chebyshev's inequality is useful for interpreting variances
-- This inequality states that
-$$
-P(|X - \mu| \geq k\sigma) \leq \frac{1}{k^2}
-$$
-- For example, the probability that a random variable lies beyond $k$ standard deviations from its mean is less than $1/k^2$
-$$
-\begin{eqnarray*}
-    2\sigma & \rightarrow & 25\% \\
-    3\sigma & \rightarrow & 11\% \\
-    4\sigma & \rightarrow &  6\% 
-\end{eqnarray*}
-$$
-- Note this is only a bound; the actual probability might be quite a bit smaller
-
----
-
-## Example
-
-- IQs are often said to be distributed with a mean of $100$ and a sd of $15$
-- What is the probability of a randomly drawn person having an IQ higher than $160$ or below $40$?
-- Thus we want to know the probability of a person being more than $4$ standard deviations from the mean
-- Thus Chebyshev's inequality suggests that this will be no larger than 6\%
-- IQs distributions are often cited as being bell shaped, in which case this bound is very conservative
-- The probability of a random draw from a bell curve being $4$ standard deviations from the mean is on the order of $10^{-5}$ (one thousandth of one percent)
-
----
-
-## Example
-
-- A former buzz phrase in industrial quality control is Motorola's "Six Sigma" whereby businesses are suggested to control extreme events or rare defective parts
-- Chebyshev's inequality states that the probability of a "Six Sigma" event is less than $1/6^2 \approx 3\%$
-- If a bell curve is assumed, the probability of a "six sigma" event is on the order of $10^{-9}$ (one ten millionth of a percent)
-
+---
+title       : Expected values
+subtitle    : Statistical Inference
+author      : Brian Caffo, Jeff Leek, Roger Peng
+job         : Johns Hopkins Bloomberg School of Public Health
+logo        : bloomberg_shield.png
+framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
+highlighter : highlight.js  # {highlight.js, prettify, highlight}
+hitheme     : tomorrow      # 
+url:
+  lib: ../../librariesNew
+  assets: ../../assets
+widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
+mode        : selfcontained # {standalone, draft}
+---
+## Expected values
+
+- The **expected value** or **mean** of a random variable is the center of its distribution
+- For a discrete random variable $X$ with PMF $p(x)$, it is defined as follows
+    $$
+    E[X] = \sum_x xp(x).
+    $$
+    where the sum is taken over the possible values of $x$
+- $E[X]$ represents the center of mass of a collection of locations and weights, $\{x, p(x)\}$
+
+---
+
+## Example
+### Find the center of mass of the bars
+
+```
+## Loading required package: MASS
+```
+
+<img src="assets/fig/unnamed-chunk-1.png" title="plot of chunk unnamed-chunk-1" alt="plot of chunk unnamed-chunk-1" style="display: block; margin: auto;" />
+
+
+---
+## Using manipulate
+```
+library(manipulate)
+myHist <- function(mu){
+  hist(galton$child,col="blue",breaks=100)
+  lines(c(mu, mu), c(0, 150),col="red",lwd=5)
+  mse <- mean((galton$child - mu)^2)
+  text(63, 150, paste("mu = ", mu))
+  text(63, 140, paste("Imbalance = ", round(mse, 2)))
+}
+manipulate(myHist(mu), mu = slider(62, 74, step = 0.5))
+```
+
+---
+## The center of mass is the empirical mean
+
+```r
+hist(galton$child, col = "blue", breaks = 100)
+meanChild <- mean(galton$child)
+lines(rep(meanChild, 100), seq(0, 150, length = 100), col = "red", lwd = 5)
+```
+
+<img src="assets/fig/lsm.png" title="plot of chunk lsm" alt="plot of chunk lsm" style="display: block; margin: auto;" />
+
+
+---
+## Example
+
+- Suppose a coin is flipped and $X$ is declared $0$ or $1$ corresponding to a head or a tail, respectively
+- What is the expected value of $X$? 
+    $$
+    E[X] = .5 \times 0 + .5 \times 1 = .5
+    $$
+- Note, if thought about geometrically, this answer is obvious; if two equal weights are spaced at 0 and 1, the center of mass will be $.5$
+
+<img src="assets/fig/unnamed-chunk-2.png" title="plot of chunk unnamed-chunk-2" alt="plot of chunk unnamed-chunk-2" style="display: block; margin: auto;" />
+
+---
+
+## Example
+
+- Suppose that a die is rolled and $X$ is the number face up
+- What is the expected value of $X$?
+    $$
+    E[X] = 1 \times \frac{1}{6} + 2 \times \frac{1}{6} +
+    3 \times \frac{1}{6} + 4 \times \frac{1}{6} +
+    5 \times \frac{1}{6} + 6 \times \frac{1}{6} = 3.5
+    $$
+- Again, the geometric argument makes this answer obvious without calculation.
+
+---
+
+## Continuous random variables
+
+- For a continuous random variable, $X$, with density, $f$, the expected
+    value is defined as follows
+    $$
+    E[X] = \mbox{the area under the function}~~~ t f(t)
+    $$
+- This definition borrows from the definition of center of mass for a continuous body
+
+---
+
+## Example
+
+- Consider a density where $f(x) = 1$ for $x$ between zero and one
+- (Is this a valid density?)
+- Suppose that $X$ follows this density; what is its expected value?  
+![plot of chunk unnamed-chunk-3](assets/fig/unnamed-chunk-3.png) 
+
+
+---
+
+## Rules about expected values
+
+- The expected value is a linear operator 
+- If $a$ and $b$ are not random and $X$ and $Y$ are two random variables then
+  - $E[aX + b] = a E[X] + b$
+  - $E[X + Y] = E[X] + E[Y]$
+
+---
+
+## Example
+
+- You flip a coin, $X$, and simulate a uniform random number, $Y$; what is the expected value of their sum? 
+    $$
+    E[X + Y] = E[X] + E[Y] = .5 + .5 = 1
+    $$ 
+- Another example: you roll a die twice. What is the expected value of the average? 
+- Let $X_1$ and $X_2$ be the results of the two rolls
+    $$
+    E[(X_1 + X_2) / 2] = \frac{1}{2}(E[X_1] + E[X_2])
+    = \frac{1}{2}(3.5 + 3.5) = 3.5
+    $$
+
+---
+
+## Example
+
+1. Let $X_i$ for $i=1,\ldots,n$ be a collection of random variables, each from a distribution with mean $\mu$
+2. Calculate the expected value of the sample average of the $X_i$
+$$
+  \begin{eqnarray*}
+    E\left[ \frac{1}{n}\sum_{i=1}^n X_i\right]
+    & = & \frac{1}{n} E\left[\sum_{i=1}^n X_i\right] \\
+    & = & \frac{1}{n} \sum_{i=1}^n E\left[X_i\right] \\
+    & = & \frac{1}{n} \sum_{i=1}^n \mu =  \mu.
+  \end{eqnarray*}
+$$
+
+---
+
+## Remark
+
+- Therefore, the expected value of the **sample mean** is the population mean that it's trying to estimate
+- When the expected value of an estimator is what it's trying to estimate, we say that the estimator is **unbiased**
+
+---
+
+## The variance
+
+- The variance of a random variable is a measure of *spread*
+- If $X$ is a random variable with mean $\mu$, the variance of $X$ is defined as
+
+$$
+Var(X) = E[(X - \mu)^2]
+$$
+    
+the expected (squared) distance from the mean
+- Densities with a higher variance are more spread out than densities with a lower variance
+
+---
+
+- Convenient computational form
+$$
+Var(X) = E[X^2] - E[X]^2
+$$
+- If $a$ is constant then $Var(aX) = a^2 Var(X)$
+- The square root of the variance is called the **standard deviation**
+- The standard deviation has the same units as $X$
+
+---
+
+## Example
+
+- What's the variance of the result of a toss of a die? 
+
+  - $E[X] = 3.5$ 
+  - $E[X^2] = 1 ^ 2 \times \frac{1}{6} + 2 ^ 2 \times \frac{1}{6} + 3 ^ 2 \times \frac{1}{6} + 4 ^ 2 \times \frac{1}{6} + 5 ^ 2 \times \frac{1}{6} + 6 ^ 2 \times \frac{1}{6} \approx 15.17$ 
+
+- $Var(X) = E[X^2] - E[X]^2 \approx 2.92$
+
+---
+
+## Example
+
+- What's the variance of the result of the toss of a coin with probability of heads (1) of $p$? 
+
+  - $E[X] = 0 \times (1 - p) + 1 \times p = p$
+  - $E[X^2] = E[X] = p$ 
+
+- $Var(X) = E[X^2] - E[X]^2 = p - p^2 = p(1 - p)$
+
+---
+
+## Interpreting variances
+
+- Chebyshev's inequality is useful for interpreting variances
+- This inequality states that
+$$
+P(|X - \mu| \geq k\sigma) \leq \frac{1}{k^2}
+$$
+- That is, the probability that a random variable lies $k$ or more standard deviations from its mean is at most $1/k^2$
+$$
+\begin{eqnarray*}
+    2\sigma & \rightarrow & 25\% \\
+    3\sigma & \rightarrow & 11\% \\
+    4\sigma & \rightarrow &  6\% 
+\end{eqnarray*}
+$$
+- Note this is only a bound; the actual probability might be quite a bit smaller
+
+---
+
+## Example
+
+- IQs are often said to be distributed with a mean of $100$ and a sd of $15$
+- What is the probability of a randomly drawn person having an IQ higher than $160$ or below $40$?
+- That is, we want the probability of a person being more than $4$ standard deviations from the mean
+- Chebyshev's inequality guarantees that this is no larger than $1/16$, about 6%
+- IQ distributions are often cited as being bell shaped, in which case this bound is very conservative
+- The probability of a random draw from a bell curve being more than $4$ standard deviations from the mean is on the order of $10^{-5}$ (one thousandth of one percent)
+
+---
+
+## Example
+
+- A former buzz phrase in industrial quality control is Motorola's "Six Sigma", under which businesses are encouraged to control extreme events or rare defective parts
+- Chebyshev's inequality states that the probability of a "Six Sigma" event is at most $1/6^2 \approx 3\%$
+- If a bell curve is assumed, the probability of a "six sigma" event is on the order of $10^{-9}$ (one ten millionth of a percent)
+
diff --git a/06_StatisticalInference/01_03_Expectations/index.pdf b/06_StatisticalInference/01_03_Expectations/index.pdf
index c9c43b5a3..aa71b7bd4 100644
Binary files a/06_StatisticalInference/01_03_Expectations/index.pdf and b/06_StatisticalInference/01_03_Expectations/index.pdf differ
diff --git a/06_StatisticalInference/01_04_Independence/assets/fig/unnamed-chunk-2.png b/06_StatisticalInference/01_04_Independence/assets/fig/unnamed-chunk-2.png
new file mode 100644
index 000000000..eded2a301
Binary files /dev/null and b/06_StatisticalInference/01_04_Independence/assets/fig/unnamed-chunk-2.png differ
diff --git a/06_StatisticalInference/01_04_Independence/index.html b/06_StatisticalInference/01_04_Independence/index.html
index 57c94e8b1..3ce8e98d5 100644
--- a/06_StatisticalInference/01_04_Independence/index.html
+++ b/06_StatisticalInference/01_04_Independence/index.html
@@ -1,496 +1,496 @@
-<!DOCTYPE html>
-<html>
-<head>
-  <title>Independence</title>
-  <meta charset="utf-8">
-  <meta name="description" content="Independence">
-  <meta name="author" content="Brian Caffo, Jeff Leek, Roger Peng">
-  <meta name="generator" content="slidify" />
-  <meta name="apple-mobile-web-app-capable" content="yes">
-  <meta http-equiv="X-UA-Compatible" content="chrome=1">
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/default.css" media="all" >
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/phone.css" 
-    media="only screen and (max-device-width: 480px)" >
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/slidify.css" >
-  <link rel="stylesheet" href="../../librariesNew/highlighters/highlight.js/css/tomorrow.css" />
-  <base target="_blank"> <!-- This amazingness opens all links in a new tab. -->  
-  
-  <!-- Grab CDN jQuery, fall back to local if offline -->
-  <script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.7.min.js"></script>
-  <script>window.jQuery || document.write('<script src="../../librariesNew/widgets/quiz/js/jquery.js"><\/script>')</script> 
-  <script data-main="../../librariesNew/frameworks/io2012/js/slides" 
-    src="../../librariesNew/frameworks/io2012/js/require-1.0.8.min.js">
-  </script>
-  
-  
-
-</head>
-<body style="opacity: 0">
-  <slides class="layout-widescreen">
-    
-    <!-- LOGO SLIDE -->
-        <slide class="title-slide segue nobackground">
-  <aside class="gdbar">
-    <img src="../../assets/img/bloomberg_shield.png">
-  </aside>
-  <hgroup class="auto-fadein">
-    <h1>Independence</h1>
-    <h2>Statistical Inference</h2>
-    <p>Brian Caffo, Jeff Leek, Roger Peng<br/>Johns Hopkins Bloomberg School of Public Health</p>
-  </hgroup>
-  <article></article>  
-</slide>
-    
-
-    <!-- SLIDES -->
-    <slide class="" id="slide-1" style="background:;">
-  <hgroup>
-    <h2>Independent events</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Two events \(A\) and \(B\) are <strong>independent</strong> if \[P(A \cap B) = P(A)P(B)\]</li>
-<li>Two random variables, \(X\) and \(Y\) are independent if for any two sets \(A\) and \(B\) \[P([X \in A] \cap [Y \in B]) = P(X\in A)P(Y\in B)\]</li>
-<li><p>If \(A\) is independent of \(B\) then </p>
-
-<ul>
-<li>\(A^c\) is independent of \(B\) </li>
-<li>\(A\) is independent of \(B^c\)</li>
-<li>\(A^c\) is independent of \(B^c\)</li>
-</ul></li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-2" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>What is the probability of getting two consecutive heads?</li>
-<li>\(A = \{\mbox{Head on flip 1}\}\) ~ \(P(A) = .5\)</li>
-<li>\(B = \{\mbox{Head on flip 2}\}\) ~ \(P(B) = .5\)</li>
-<li>\(A \cap B = \{\mbox{Head on flips 1 and 2}\}\)</li>
-<li>\(P(A \cap B) = P(A)P(B) = .5 \times .5 = .25\) </li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-3" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Volume 309 of Science reports on a physician who was on trial for expert testimony in a criminal trial</li>
-<li>Based on an estimated prevalence of sudden infant death syndrome of \(1\) out of \(8,543\), Dr Meadow testified that that the probability of a mother having two children with SIDS was \(\left(\frac{1}{8,543}\right)^2\)</li>
-<li>The mother on trial was convicted of murder</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-4" style="background:;">
-  <hgroup>
-    <h2>Example: continued</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>For the purposes of this class, the principal mistake was to <em>assume</em> that the probabilities of having SIDs within a family are independent</li>
-<li>That is, \(P(A_1 \cap A_2)\) is not necessarily equal to \(P(A_1)P(A_2)\)</li>
-<li>Biological processes that have a believed genetic or familiar environmental component, of course, tend to be dependent within families</li>
-<li>(There are many other statistical points of discussion for this case.)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-5" style="background:;">
-  <hgroup>
-    <h2>Useful fact</h2>
-  </hgroup>
-  <article data-timings="">
-    <p>We will use the following fact extensively in this class:</p>
-
-<p><em>If a collection of random variables \(X_1, X_2, \ldots, X_n\) are independent, then their joint distribution is the product of their individual densities or mass functions</em></p>
-
-<p><em>That is, if \(f_i\) is the density for random variable \(X_i\) we have that</em>
-\[
-f(x_1,\ldots, x_n) = \prod_{i=1}^n f_i(x_i)
-\]</p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-6" style="background:;">
-  <hgroup>
-    <h2>IID random variables</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Random variables are said to be iid if they are independent and identically distributed</li>
-<li>iid random variables are the default model for random samples</li>
-<li>Many of the important theories of statistics are founded on assuming that variables are iid</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-7" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Suppose that we flip a biased coin with success probability \(p\) \(n\) times, what is the join density of the collection of outcomes?</li>
-<li>These random variables are iid with densities \(p^{x_i} (1 - p)^{1-x_i}\) </li>
-<li>Therefore
-\[
-f(x_1,\ldots,x_n) = \prod_{i=1}^n p^{x_i} (1 - p)^{1-x_i} = p^{\sum x_i} (1 - p)^{n - \sum x_i}
-\]</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-8" style="background:;">
-  <hgroup>
-    <h2>Correlation</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>The <strong>covariance</strong> between two random variables \(X\) and \(Y\) is defined as 
-\[
-Cov(X, Y) = E[(X - \mu_x)(Y - \mu_y)] = E[X Y] - E[X]E[Y]
-\]</li>
-<li>The following are useful facts about covariance
-
-<ol>
-<li>\(Cov(X, Y) = Cov(Y, X)\)</li>
-<li>\(Cov(X, Y)\) can be negative or positive</li>
-<li>\(|Cov(X, Y)| \leq \sqrt{Var(X) Var(y)}\)</li>
-</ol></li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-9" style="background:;">
-  <hgroup>
-    <h2>Correlation</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>The <strong>correlation</strong> between \(X\) and \(Y\) is 
-\[
-Cor(X, Y) = Cov(X, Y) / \sqrt{Var(X) Var(y)}
-\]</li>
-</ul>
-
-<ol>
-<li>\(-1 \leq Cor(X, Y) \leq 1\)</li>
-<li>\(Cor(X, Y) = \pm 1\) if and only if \(X = a + bY\) for some constants \(a\) and \(b\)</li>
-<li>\(Cor(X, Y)\) is unitless</li>
-<li>\(X\) and \(Y\) are <strong>uncorrelated</strong> if \(Cor(X, Y) = 0\) </li>
-<li> \(X\) and \(Y\) are more positively correlated, the closer \(Cor(X,Y)\) is to \(1\)</li>
-<li> \(X\) and \(Y\) are more negatively correlated, the closer \(Cor(X,Y)\) is to \(-1\)</li>
-</ol>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-10" style="background:;">
-  <hgroup>
-    <h2>Some useful results</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li><p>Let \(\{X_i\}_{i=1}^n\) be a collection of random variables</p>
-
-<ul>
-<li>When the \(\{X_i\}\) are uncorrelated \[Var\left(\sum_{i=1}^n a_i X_i + b\right) = \sum_{i=1}^n a_i^2 Var(X_i)\]<br></li>
-</ul></li>
-<li><p>A commonly used subcase from these properties is that <em>if a collection of random variables \(\{X_i\}\) are uncorrelated</em>, then the variance of the sum is the sum of the variances
-\[
-Var\left(\sum_{i=1}^n X_i \right) = \sum_{i=1}^n Var(X_i)
-\]</p></li>
-<li><p>Therefore, it is sums of variances that tend to be useful, not sums of standard deviations; that is, the standard deviation of the sum of bunch of independent random variables is the square root of the sum of the variances, not the sum of the standard deviations</p></li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-11" style="background:;">
-  <hgroup>
-    <h2>The sample mean</h2>
-  </hgroup>
-  <article data-timings="">
-    <p>Suppose \(X_i\) are iid with variance \(\sigma^2\)</p>
-
-<p>\[
-\begin{eqnarray*}
-    Var(\bar X) & = & Var \left( \frac{1}{n}\sum_{i=1}^n X_i \right)\\ \\
-    & = & \frac{1}{n^2} Var\left(\sum_{i=1}^n X_i \right)\\ \\
-    & = & \frac{1}{n^2} \sum_{i=1}^n Var(X_i) \\ \\
-    & = & \frac{1}{n^2} \times n\sigma^2 \\ \\
-    & = & \frac{\sigma^2}{n}
-  \end{eqnarray*}
-\]</p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-12" style="background:;">
-  <hgroup>
-    <h2>Some comments</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>When \(X_i\) are independent with a common variance \(Var(\bar X) = \frac{\sigma^2}{n}\)</li>
-<li>\(\sigma/\sqrt{n}\) is called <em>the standard error</em> of the sample mean</li>
-<li>The standard error of the sample mean is the standard deviation of the distribution of the sample mean</li>
-<li>\(\sigma\) is the standard deviation of the distribution of a single observation</li>
-<li>Easy way to remember, the sample mean has to be less variable than a single observation, therefore its standard deviation is divided by a \(\sqrt{n}\)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-13" style="background:;">
-  <hgroup>
-    <h2>The sample variance</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>The <strong>sample variance</strong> is defined as 
-\[
-S^2 =   \frac{\sum_{i=1}^n (X_i - \bar X)^2}{n-1} 
-\]</li>
-<li>The sample variance is an estimator of \(\sigma^2\)</li>
-<li>The numerator has a version that&#39;s quicker for calculation
-\[
-\sum_{i=1}^n (X_i - \bar X)^2 = \sum_{i=1}^n X_i^2 - n \bar X^2
-\]</li>
-<li>The sample variance is (nearly) the mean of the squared deviations from the mean</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-14" style="background:;">
-  <hgroup>
-    <h2>The sample variance is unbiased</h2>
-  </hgroup>
-  <article data-timings="">
-    <p>\[
-  \begin{eqnarray*}
-    E\left[\sum_{i=1}^n (X_i - \bar X)^2\right] & = & \sum_{i=1}^n E\left[X_i^2\right] - n E\left[\bar X^2\right] \\ \\
-    & = & \sum_{i=1}^n \left\{Var(X_i) + \mu^2\right\} - n \left\{Var(\bar X) + \mu^2\right\} \\ \\
-    & = & \sum_{i=1}^n \left\{\sigma^2 + \mu^2\right\} - n \left\{\sigma^2 / n + \mu^2\right\} \\ \\
-    & = & n \sigma^2 + n \mu ^ 2 - \sigma^2 - n \mu^2 \\ \\
-    & = & (n - 1) \sigma^2
-  \end{eqnarray*}
-\]</p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-15" style="background:;">
-  <hgroup>
-    <h2>Hoping to avoid some confusion</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Suppose \(X_i\) are iid with mean \(\mu\) and variance \(\sigma^2\)</li>
-<li>\(S^2\) estimates \(\sigma^2\)</li>
-<li>The calculation of \(S^2\) involves dividing by \(n-1\)</li>
-<li>\(S / \sqrt{n}\) estimates \(\sigma / \sqrt{n}\) the standard error of the mean</li>
-<li>\(S / \sqrt{n}\) is called the sample standard error (of the mean)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-16" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <pre><code class="r">data(father.son)
-x &lt;- father.son$sheight
-n &lt;- length(x)
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-17" style="background:;">
-  <article data-timings="">
-    <p><img src="assets/fig/unnamed-chunk-2.png" alt="plot of chunk unnamed-chunk-2"> </p>
-
-<pre><code class="r">round(c(sum((x - mean(x))^2)/(n - 1), var(x), var(x)/n, sd(x), sd(x)/sqrt(n)), 
-    2)
-</code></pre>
-
-<pre><code>## [1] 7.92 7.92 0.01 2.81 0.09
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-    <slide class="backdrop"></slide>
-  </slides>
-  <div class="pagination pagination-small" id='io2012-ptoc' style="display:none;">
-    <ul>
-      <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=1 title='Independent events'>
-         1
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=2 title='Example'>
-         2
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=3 title='Example'>
-         3
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=4 title='Example: continued'>
-         4
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=5 title='Useful fact'>
-         5
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=6 title='IID random variables'>
-         6
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=7 title='Example'>
-         7
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=8 title='Correlation'>
-         8
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=9 title='Correlation'>
-         9
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=10 title='Some useful results'>
-         10
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=11 title='The sample mean'>
-         11
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=12 title='Some comments'>
-         12
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=13 title='The sample variance'>
-         13
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=14 title='The sample variance is unbiased'>
-         14
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=15 title='Hoping to avoid some confusion'>
-         15
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=16 title='Example'>
-         16
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=17 title=''>
-         17
-      </a>
-    </li>
-  </ul>
-  </div>  <!--[if IE]>
-    <script 
-      src="http://ajax.googleapis.com/ajax/libs/chrome-frame/1/CFInstall.min.js">  
-    </script>
-    <script>CFInstall.check({mode: 'overlay'});</script>
-  <![endif]-->
-</body>
-  <!-- Load Javascripts for Widgets -->
-  
-  <!-- MathJax: Fall back to local if CDN offline but local image fonts are not supported (saves >100MB) -->
-  <script type="text/x-mathjax-config">
-    MathJax.Hub.Config({
-      tex2jax: {
-        inlineMath: [['$','$'], ['\\(','\\)']],
-        processEscapes: true
-      }
-    });
-  </script>
-  <script type="text/javascript" src="http://cdn.mathjax.org/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
-  <!-- <script src="https://c328740.ssl.cf1.rackcdn.com/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
-  </script> -->
-  <script>window.MathJax || document.write('<script type="text/x-mathjax-config">MathJax.Hub.Config({"HTML-CSS":{imageFont:null}});<\/script><script src="../../librariesNew/widgets/mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"><\/script>')
-</script>
-<!-- LOAD HIGHLIGHTER JS FILES -->
-  <script src="../../librariesNew/highlighters/highlight.js/highlight.pack.js"></script>
-  <script>hljs.initHighlightingOnLoad();</script>
-  <!-- DONE LOADING HIGHLIGHTER JS FILES -->
-   
+<!DOCTYPE html>
+<html>
+<head>
+  <title>Independence</title>
+  <meta charset="utf-8">
+  <meta name="description" content="Independence">
+  <meta name="author" content="Brian Caffo, Jeff Leek, Roger Peng">
+  <meta name="generator" content="slidify" />
+  <meta name="apple-mobile-web-app-capable" content="yes">
+  <meta http-equiv="X-UA-Compatible" content="chrome=1">
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/default.css" media="all" >
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/phone.css" 
+    media="only screen and (max-device-width: 480px)" >
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/slidify.css" >
+  <link rel="stylesheet" href="../../librariesNew/highlighters/highlight.js/css/tomorrow.css" />
+  <base target="_blank"> <!-- This amazingness opens all links in a new tab. -->  
+  
+  <!-- Grab CDN jQuery, fall back to local if offline -->
+  <script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.7.min.js"></script>
+  <script>window.jQuery || document.write('<script src="../../librariesNew/widgets/quiz/js/jquery.js"><\/script>')</script> 
+  <script data-main="../../librariesNew/frameworks/io2012/js/slides" 
+    src="../../librariesNew/frameworks/io2012/js/require-1.0.8.min.js">
+  </script>
+  
+  
+
+</head>
+<body style="opacity: 0">
+  <slides class="layout-widescreen">
+    
+    <!-- LOGO SLIDE -->
+        <slide class="title-slide segue nobackground">
+  <aside class="gdbar">
+    <img src="../../assets/img/bloomberg_shield.png">
+  </aside>
+  <hgroup class="auto-fadein">
+    <h1>Independence</h1>
+    <h2>Statistical Inference</h2>
+    <p>Brian Caffo, Jeff Leek, Roger Peng<br/>Johns Hopkins Bloomberg School of Public Health</p>
+  </hgroup>
+  <article></article>  
+</slide>
+    
+
+    <!-- SLIDES -->
+    <slide class="" id="slide-1" style="background:;">
+  <hgroup>
+    <h2>Independent events</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Two events \(A\) and \(B\) are <strong>independent</strong> if \[P(A \cap B) = P(A)P(B)\]</li>
+<li>Two random variables, \(X\) and \(Y\) are independent if for any two sets \(A\) and \(B\) \[P([X \in A] \cap [Y \in B]) = P(X\in A)P(Y\in B)\]</li>
+<li><p>If \(A\) is independent of \(B\) then </p>
+
+<ul>
+<li>\(A^c\) is independent of \(B\) </li>
+<li>\(A\) is independent of \(B^c\)</li>
+<li>\(A^c\) is independent of \(B^c\)</li>
+</ul></li>
+</ul>
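+
+<p>As a brief sketch of why the first of these holds (the other two follow by the same argument), split \(B\) into the disjoint pieces \(A \cap B\) and \(A^c \cap B\), so that \(P(B) = P(A \cap B) + P(A^c \cap B)\). Then, assuming \(A\) and \(B\) are independent:</p>
+
+<p>\[
+\begin{eqnarray*}
+    P(A^c \cap B) & = & P(B) - P(A \cap B) \\ \\
+    & = & P(B) - P(A)P(B) \\ \\
+    & = & \{1 - P(A)\} P(B) \\ \\
+    & = & P(A^c)P(B)
+  \end{eqnarray*}
+\]</p>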
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-2" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>What is the probability of getting two consecutive heads?</li>
+<li>\(A = \{\mbox{Head on flip 1}\}\) ~ \(P(A) = .5\)</li>
+<li>\(B = \{\mbox{Head on flip 2}\}\) ~ \(P(B) = .5\)</li>
+<li>\(A \cap B = \{\mbox{Head on flips 1 and 2}\}\)</li>
+<li>\(P(A \cap B) = P(A)P(B) = .5 \times .5 = .25\) </li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-3" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Volume 309 of Science reports on a physician who was on trial for expert testimony in a criminal trial</li>
+<li>Based on an estimated prevalence of sudden infant death syndrome (SIDS) of \(1\) out of \(8,543\), Dr Meadow testified that the probability of a mother having two children with SIDS was \(\left(\frac{1}{8,543}\right)^2\)</li>
+<li>The mother on trial was convicted of murder</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-4" style="background:;">
+  <hgroup>
+    <h2>Example: continued</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>For the purposes of this class, the principal mistake was to <em>assume</em> that cases of SIDS within a family are independent</li>
+<li>That is, \(P(A_1 \cap A_2)\) is not necessarily equal to \(P(A_1)P(A_2)\)</li>
+<li>Biological processes that are believed to have a genetic or familial environmental component, of course, tend to be dependent within families</li>
+<li>(There are many other statistical points of discussion for this case.)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-5" style="background:;">
+  <hgroup>
+    <h2>Useful fact</h2>
+  </hgroup>
+  <article data-timings="">
+    <p>We will use the following fact extensively in this class:</p>
+
+<p><em>If a collection of random variables \(X_1, X_2, \ldots, X_n\) are independent, then their joint density (or mass function) is the product of their individual densities (or mass functions)</em></p>
+
+<p><em>That is, if \(f_i\) is the density for random variable \(X_i\) we have that</em>
+\[
+f(x_1,\ldots, x_n) = \prod_{i=1}^n f_i(x_i)
+\]</p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-6" style="background:;">
+  <hgroup>
+    <h2>IID random variables</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Random variables are said to be iid if they are independent and identically distributed</li>
+<li>iid random variables are the default model for random samples</li>
+<li>Many of the important theories of statistics are founded on assuming that variables are iid</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-7" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Suppose that we flip a biased coin with success probability \(p\) a total of \(n\) times; what is the joint density of the collection of outcomes?</li>
+<li>These random variables are iid with densities (mass functions) \(p^{x_i} (1 - p)^{1-x_i}\), where each \(x_i\) is \(0\) or \(1\)</li>
+<li>Therefore
+\[
+f(x_1,\ldots,x_n) = \prod_{i=1}^n p^{x_i} (1 - p)^{1-x_i} = p^{\sum x_i} (1 - p)^{n - \sum x_i}
+\]</li>
+</ul>
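+
+<p>As an illustration with assumed values (these numbers are not part of the original example), take \(n = 3\), \(p = 0.7\) and outcomes \((x_1, x_2, x_3) = (1, 0, 1)\), so that \(\sum x_i = 2\):</p>
+
+<p>\[
+f(1, 0, 1) = 0.7^{2} \, (1 - 0.7)^{3 - 2} = 0.49 \times 0.3 = 0.147
+\]</p>
+
+<p>Any other ordering with two heads and one tail, such as \((1, 1, 0)\), has the same joint density, because the formula depends on the outcomes only through \(\sum x_i\).</p>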
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-8" style="background:;">
+  <hgroup>
+    <h2>Correlation</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>The <strong>covariance</strong> between two random variables \(X\) and \(Y\) is defined as 
+\[
+Cov(X, Y) = E[(X - \mu_x)(Y - \mu_y)] = E[X Y] - E[X]E[Y]
+\]</li>
+<li>The following are useful facts about covariance
+
+<ol>
+<li>\(Cov(X, Y) = Cov(Y, X)\)</li>
+<li>\(Cov(X, Y)\) can be negative or positive</li>
+<li>\(|Cov(X, Y)| \leq \sqrt{Var(X) Var(Y)}\)</li>
+</ol></li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-9" style="background:;">
+  <hgroup>
+    <h2>Correlation</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>The <strong>correlation</strong> between \(X\) and \(Y\) is 
+\[
+Cor(X, Y) = Cov(X, Y) / \sqrt{Var(X) Var(Y)}
+\]</li>
+</ul>
+
+<ol>
+<li>\(-1 \leq Cor(X, Y) \leq 1\)</li>
+<li>\(Cor(X, Y) = \pm 1\) if and only if \(X = a + bY\) for some constants \(a\) and \(b\) with \(b \neq 0\)</li>
+<li>\(Cor(X, Y)\) is unitless</li>
+<li>\(X\) and \(Y\) are <strong>uncorrelated</strong> if \(Cor(X, Y) = 0\) </li>
+<li> \(X\) and \(Y\) are more positively correlated, the closer \(Cor(X,Y)\) is to \(1\)</li>
+<li> \(X\) and \(Y\) are more negatively correlated, the closer \(Cor(X,Y)\) is to \(-1\)</li>
+</ol>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-10" style="background:;">
+  <hgroup>
+    <h2>Some useful results</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li><p>Let \(\{X_i\}_{i=1}^n\) be a collection of random variables</p>
+
+<ul>
+<li>When the \(\{X_i\}\) are uncorrelated \[Var\left(\sum_{i=1}^n a_i X_i + b\right) = \sum_{i=1}^n a_i^2 Var(X_i)\]<br></li>
+</ul></li>
+<li><p>A commonly used subcase from these properties is that <em>if a collection of random variables \(\{X_i\}\) are uncorrelated</em>, then the variance of the sum is the sum of the variances
+\[
+Var\left(\sum_{i=1}^n X_i \right) = \sum_{i=1}^n Var(X_i)
+\]</p></li>
+<li><p>Therefore, it is sums of variances that tend to be useful, not sums of standard deviations; that is, the standard deviation of the sum of a bunch of independent random variables is the square root of the sum of the variances, not the sum of the standard deviations</p></li>
+</ul>
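+
+<p>A small worked example, with numbers chosen purely for illustration: suppose \(X_1\) and \(X_2\) are uncorrelated with standard deviations \(3\) and \(4\). The standard deviation of their sum is then</p>
+
+<p>\[
+\sqrt{Var(X_1 + X_2)} = \sqrt{Var(X_1) + Var(X_2)} = \sqrt{3^2 + 4^2} = \sqrt{25} = 5
+\]</p>
+
+<p>which is smaller than the sum of the standard deviations, \(3 + 4 = 7\).</p>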
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-11" style="background:;">
+  <hgroup>
+    <h2>The sample mean</h2>
+  </hgroup>
+  <article data-timings="">
+    <p>Suppose \(X_i\) are iid with variance \(\sigma^2\)</p>
+
+<p>\[
+\begin{eqnarray*}
+    Var(\bar X) & = & Var \left( \frac{1}{n}\sum_{i=1}^n X_i \right)\\ \\
+    & = & \frac{1}{n^2} Var\left(\sum_{i=1}^n X_i \right)\\ \\
+    & = & \frac{1}{n^2} \sum_{i=1}^n Var(X_i) \\ \\
+    & = & \frac{1}{n^2} \times n\sigma^2 \\ \\
+    & = & \frac{\sigma^2}{n}
+  \end{eqnarray*}
+\]</p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-12" style="background:;">
+  <hgroup>
+    <h2>Some comments</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>When the \(X_i\) are independent with a common variance \(\sigma^2\), \(Var(\bar X) = \frac{\sigma^2}{n}\)</li>
+<li>\(\sigma/\sqrt{n}\) is called <em>the standard error</em> of the sample mean</li>
+<li>The standard error of the sample mean is the standard deviation of the distribution of the sample mean</li>
+<li>\(\sigma\) is the standard deviation of the distribution of a single observation</li>
+<li>An easy way to remember this: the sample mean has to be less variable than a single observation, so its standard deviation is \(\sigma\) divided by \(\sqrt{n}\)</li>
+</ul>
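+
+<p>For a sense of scale, here is a worked case with assumed values (not taken from any data set in these slides): if a single observation has standard deviation \(\sigma = 10\) and the sample size is \(n = 25\), the standard error of the sample mean is</p>
+
+<p>\[
+\frac{\sigma}{\sqrt{n}} = \frac{10}{\sqrt{25}} = \frac{10}{5} = 2
+\]</p>
+
+<p>Because of the square root, quadrupling the sample size to \(n = 100\) only halves the standard error, to \(10 / \sqrt{100} = 1\).</p>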
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-13" style="background:;">
+  <hgroup>
+    <h2>The sample variance</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>The <strong>sample variance</strong> is defined as 
+\[
+S^2 =   \frac{\sum_{i=1}^n (X_i - \bar X)^2}{n-1} 
+\]</li>
+<li>The sample variance is an estimator of \(\sigma^2\)</li>
+<li>The numerator has a version that&#39;s quicker for calculation
+\[
+\sum_{i=1}^n (X_i - \bar X)^2 = \sum_{i=1}^n X_i^2 - n \bar X^2
+\]</li>
+<li>The sample variance is (nearly) the mean of the squared deviations from the mean</li>
+</ul>
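+
+<p>The shortcut for the numerator can be checked by expanding the square and using \(\sum_{i=1}^n X_i = n \bar X\); a brief sketch of the algebra:</p>
+
+<p>\[
+\begin{eqnarray*}
+    \sum_{i=1}^n (X_i - \bar X)^2 & = & \sum_{i=1}^n X_i^2 - 2 \bar X \sum_{i=1}^n X_i + n \bar X^2 \\ \\
+    & = & \sum_{i=1}^n X_i^2 - 2 n \bar X^2 + n \bar X^2 \\ \\
+    & = & \sum_{i=1}^n X_i^2 - n \bar X^2
+  \end{eqnarray*}
+\]</p>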
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-14" style="background:;">
+  <hgroup>
+    <h2>The sample variance is unbiased</h2>
+  </hgroup>
+  <article data-timings="">
+    <p>\[
+  \begin{eqnarray*}
+    E\left[\sum_{i=1}^n (X_i - \bar X)^2\right] & = & \sum_{i=1}^n E\left[X_i^2\right] - n E\left[\bar X^2\right] \\ \\
+    & = & \sum_{i=1}^n \left\{Var(X_i) + \mu^2\right\} - n \left\{Var(\bar X) + \mu^2\right\} \\ \\
+    & = & \sum_{i=1}^n \left\{\sigma^2 + \mu^2\right\} - n \left\{\sigma^2 / n + \mu^2\right\} \\ \\
+    & = & n \sigma^2 + n \mu ^ 2 - \sigma^2 - n \mu^2 \\ \\
+    & = & (n - 1) \sigma^2
+  \end{eqnarray*}
+\]</p>
+
+<p>Dividing both sides by \(n - 1\) gives \(E[S^2] = \sigma^2\); that is, \(S^2\) is an unbiased estimator of \(\sigma^2\).</p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-15" style="background:;">
+  <hgroup>
+    <h2>Hoping to avoid some confusion</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Suppose \(X_i\) are iid with mean \(\mu\) and variance \(\sigma^2\)</li>
+<li>\(S^2\) estimates \(\sigma^2\)</li>
+<li>The calculation of \(S^2\) involves dividing by \(n-1\)</li>
+<li>\(S / \sqrt{n}\) estimates \(\sigma / \sqrt{n}\), the standard error of the mean</li>
+<li>\(S / \sqrt{n}\) is called the sample standard error (of the mean)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-16" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
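+    <pre><code class="r">library(UsingR)          # package that provides the father.son data set
+data(father.son)
+x &lt;- father.son$sheight  # sons' heights
+n &lt;- length(x)           # number of observations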
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-17" style="background:;">
+  <article data-timings="">
+    <p><img src="assets/fig/unnamed-chunk-2.png" alt="plot of chunk unnamed-chunk-2"> </p>
+
+<pre><code class="r">round(c(sum((x - mean(x))^2)/(n - 1), var(x), var(x)/n, sd(x), sd(x)/sqrt(n)), 
+    2)
+</code></pre>
+
+<pre><code>## [1] 7.92 7.92 0.01 2.81 0.09
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+    <slide class="backdrop"></slide>
+  </slides>
+  <div class="pagination pagination-small" id='io2012-ptoc' style="display:none;">
+    <ul>
+      <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=1 title='Independent events'>
+         1
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=2 title='Example'>
+         2
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=3 title='Example'>
+         3
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=4 title='Example: continued'>
+         4
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=5 title='Useful fact'>
+         5
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=6 title='IID random variables'>
+         6
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=7 title='Example'>
+         7
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=8 title='Correlation'>
+         8
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=9 title='Correlation'>
+         9
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=10 title='Some useful results'>
+         10
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=11 title='The sample mean'>
+         11
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=12 title='Some comments'>
+         12
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=13 title='The sample variance'>
+         13
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=14 title='The sample variance is unbiased'>
+         14
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=15 title='Hoping to avoid some confusion'>
+         15
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=16 title='Example'>
+         16
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=17 title=''>
+         17
+      </a>
+    </li>
+  </ul>
+  </div>  <!--[if IE]>
+    <script 
+      src="http://ajax.googleapis.com/ajax/libs/chrome-frame/1/CFInstall.min.js">  
+    </script>
+    <script>CFInstall.check({mode: 'overlay'});</script>
+  <![endif]-->
+</body>
+  <!-- Load Javascripts for Widgets -->
+  
+  <!-- MathJax: Fall back to local if CDN offline but local image fonts are not supported (saves >100MB) -->
+  <script type="text/x-mathjax-config">
+    MathJax.Hub.Config({
+      tex2jax: {
+        inlineMath: [['$','$'], ['\\(','\\)']],
+        processEscapes: true
+      }
+    });
+  </script>
+  <script type="text/javascript" src="http://cdn.mathjax.org/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
+  <!-- <script src="https://c328740.ssl.cf1.rackcdn.com/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
+  </script> -->
+  <script>window.MathJax || document.write('<script type="text/x-mathjax-config">MathJax.Hub.Config({"HTML-CSS":{imageFont:null}});<\/script><script src="../../librariesNew/widgets/mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"><\/script>')
+</script>
+<!-- LOAD HIGHLIGHTER JS FILES -->
+  <script src="../../librariesNew/highlighters/highlight.js/highlight.pack.js"></script>
+  <script>hljs.initHighlightingOnLoad();</script>
+  <!-- DONE LOADING HIGHLIGHTER JS FILES -->
+   
   </html>
\ No newline at end of file
diff --git a/06_StatisticalInference/01_04_Independence/index.md b/06_StatisticalInference/01_04_Independence/index.md
index 12b7e2264..6ba1c02e5 100644
--- a/06_StatisticalInference/01_04_Independence/index.md
+++ b/06_StatisticalInference/01_04_Independence/index.md
@@ -1,216 +1,216 @@
----
-title       : Independence
-subtitle    : Statistical Inference
-author      : Brian Caffo, Jeff Leek, Roger Peng
-job         : Johns Hopkins Bloomberg School of Public Health
-logo        : bloomberg_shield.png
-framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
-highlighter : highlight.js  # {highlight.js, prettify, highlight}
-hitheme     : tomorrow      # 
-url:
-  lib: ../../librariesNew
-  assets: ../../assets
-widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
-mode        : selfcontained # {standalone, draft}
----
-
-## Independent events
-
-- Two events $A$ and $B$ are **independent** if $$P(A \cap B) = P(A)P(B)$$
-- Two random variables, $X$ and $Y$ are independent if for any two sets $A$ and $B$ $$P([X \in A] \cap [Y \in B]) = P(X\in A)P(Y\in B)$$
-- If $A$ is independent of $B$ then 
-
-  - $A^c$ is independent of $B$ 
-  - $A$ is independent of $B^c$
-  - $A^c$ is independent of $B^c$
-
-
----
-
-## Example
-
-- What is the probability of getting two consecutive heads?
-- $A = \{\mbox{Head on flip 1}\}$ ~ $P(A) = .5$
-- $B = \{\mbox{Head on flip 2}\}$ ~ $P(B) = .5$
-- $A \cap B = \{\mbox{Head on flips 1 and 2}\}$
-- $P(A \cap B) = P(A)P(B) = .5 \times .5 = .25$ 
-
----
-
-## Example
-
-- Volume 309 of Science reports on a physician who was on trial for expert testimony in a criminal trial
-- Based on an estimated prevalence of sudden infant death syndrome of $1$ out of $8,543$, Dr Meadow testified that that the probability of a mother having two children with SIDS was $\left(\frac{1}{8,543}\right)^2$
-- The mother on trial was convicted of murder
-
----
-
-## Example: continued
-
-- For the purposes of this class, the principal mistake was to *assume* that the probabilities of having SIDs within a family are independent
-- That is, $P(A_1 \cap A_2)$ is not necessarily equal to $P(A_1)P(A_2)$
-- Biological processes that have a believed genetic or familiar environmental component, of course, tend to be dependent within families
-- (There are many other statistical points of discussion for this case.)
-
----
-
-## Useful fact
-
-We will use the following fact extensively in this class:
-
-*If a collection of random variables $X_1, X_2, \ldots, X_n$ are independent, then their joint distribution is the product of their individual densities or mass functions*
-
-*That is, if $f_i$ is the density for random variable $X_i$ we have that*
-$$
-f(x_1,\ldots, x_n) = \prod_{i=1}^n f_i(x_i)
-$$
-
----
-
-## IID random variables
-
-- Random variables are said to be iid if they are independent and identically distributed
-- iid random variables are the default model for random samples
-- Many of the important theories of statistics are founded on assuming that variables are iid
-
-
----
-
-## Example
-
-- Suppose that we flip a biased coin with success probability $p$ $n$ times, what is the join density of the collection of outcomes?
-- These random variables are iid with densities $p^{x_i} (1 - p)^{1-x_i}$ 
-- Therefore
-  $$
-  f(x_1,\ldots,x_n) = \prod_{i=1}^n p^{x_i} (1 - p)^{1-x_i} = p^{\sum x_i} (1 - p)^{n - \sum x_i}
-  $$
-
----
-
-## Correlation
-
-- The **covariance** between two random variables $X$ and $Y$ is defined as 
-$$
-Cov(X, Y) = E[(X - \mu_x)(Y - \mu_y)] = E[X Y] - E[X]E[Y]
-$$
-- The following are useful facts about covariance
-  1. $Cov(X, Y) = Cov(Y, X)$
-  2. $Cov(X, Y)$ can be negative or positive
-  3. $|Cov(X, Y)| \leq \sqrt{Var(X) Var(y)}$
-
----
-
-## Correlation
-
-- The **correlation** between $X$ and $Y$ is 
-$$
-Cor(X, Y) = Cov(X, Y) / \sqrt{Var(X) Var(y)}
-$$
-
-  1. $-1 \leq Cor(X, Y) \leq 1$
-  2. $Cor(X, Y) = \pm 1$ if and only if $X = a + bY$ for some constants $a$ and $b$
-  3. $Cor(X, Y)$ is unitless
-  4. $X$ and $Y$ are **uncorrelated** if $Cor(X, Y) = 0$ 
-  5.  $X$ and $Y$ are more positively correlated, the closer $Cor(X,Y)$ is to $1$
-  6.  $X$ and $Y$ are more negatively correlated, the closer $Cor(X,Y)$ is to $-1$
-
----
-
-## Some useful results
-
-- Let $\{X_i\}_{i=1}^n$ be a collection of random variables
-  - When the $\{X_i\}$ are uncorrelated $$Var\left(\sum_{i=1}^n a_i X_i + b\right) = \sum_{i=1}^n a_i^2 Var(X_i)$$  
-
-- A commonly used subcase from these properties is that *if a collection of random variables $\{X_i\}$ are uncorrelated*, then the variance of the sum is the sum of the variances
-$$
-Var\left(\sum_{i=1}^n X_i \right) = \sum_{i=1}^n Var(X_i)
-$$
-- Therefore, it is sums of variances that tend to be useful, not sums of standard deviations; that is, the standard deviation of the sum of bunch of independent random variables is the square root of the sum of the variances, not the sum of the standard deviations
-
----
-
-## The sample mean
-
-Suppose $X_i$ are iid with variance $\sigma^2$
-
-$$
-\begin{eqnarray*}
-    Var(\bar X) & = & Var \left( \frac{1}{n}\sum_{i=1}^n X_i \right)\\ \\
-    & = & \frac{1}{n^2} Var\left(\sum_{i=1}^n X_i \right)\\ \\
-    & = & \frac{1}{n^2} \sum_{i=1}^n Var(X_i) \\ \\
-    & = & \frac{1}{n^2} \times n\sigma^2 \\ \\
-    & = & \frac{\sigma^2}{n}
-  \end{eqnarray*}
-$$
-
----
-
-## Some comments
-
-- When $X_i$ are independent with a common variance $Var(\bar X) = \frac{\sigma^2}{n}$
-- $\sigma/\sqrt{n}$ is called *the standard error* of the sample mean
-- The standard error of the sample mean is the standard deviation of the distribution of the sample mean
-- $\sigma$ is the standard deviation of the distribution of a single observation
-- Easy way to remember, the sample mean has to be less variable than a single observation, therefore its standard deviation is divided by a $\sqrt{n}$
-
----
-
-## The sample variance
-- The **sample variance** is defined as 
-$$
-S^2 =   \frac{\sum_{i=1}^n (X_i - \bar X)^2}{n-1} 
-$$
-- The sample variance is an estimator of $\sigma^2$
-- The numerator has a version that's quicker for calculation
-$$
-\sum_{i=1}^n (X_i - \bar X)^2 = \sum_{i=1}^n X_i^2 - n \bar X^2
-$$
-- The sample variance is (nearly) the mean of the squared deviations from the mean
-
----
-
-## The sample variance is unbiased
-
-$$
-  \begin{eqnarray*}
-    E\left[\sum_{i=1}^n (X_i - \bar X)^2\right] & = & \sum_{i=1}^n E\left[X_i^2\right] - n E\left[\bar X^2\right] \\ \\
-    & = & \sum_{i=1}^n \left\{Var(X_i) + \mu^2\right\} - n \left\{Var(\bar X) + \mu^2\right\} \\ \\
-    & = & \sum_{i=1}^n \left\{\sigma^2 + \mu^2\right\} - n \left\{\sigma^2 / n + \mu^2\right\} \\ \\
-    & = & n \sigma^2 + n \mu ^ 2 - \sigma^2 - n \mu^2 \\ \\
-    & = & (n - 1) \sigma^2
-  \end{eqnarray*}
-$$
-
----
-
-## Hoping to avoid some confusion
-
-- Suppose $X_i$ are iid with mean $\mu$ and variance $\sigma^2$
-- $S^2$ estimates $\sigma^2$
-- The calculation of $S^2$ involves dividing by $n-1$
-- $S / \sqrt{n}$ estimates $\sigma / \sqrt{n}$ the standard error of the mean
-- $S / \sqrt{n}$ is called the sample standard error (of the mean)
-
----
-## Example
-
-```r
-data(father.son)
-x <- father.son$sheight
-n <- length(x)
-```
-
-
----
-![plot of chunk unnamed-chunk-2](assets/fig/unnamed-chunk-2.png) 
-
-
-```r
-round(c(sum((x - mean(x))^2)/(n - 1), var(x), var(x)/n, sd(x), sd(x)/sqrt(n)), 
-    2)
-```
-
-```
-## [1] 7.92 7.92 0.01 2.81 0.09
-```
-
+---
+title       : Independence
+subtitle    : Statistical Inference
+author      : Brian Caffo, Jeff Leek, Roger Peng
+job         : Johns Hopkins Bloomberg School of Public Health
+logo        : bloomberg_shield.png
+framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
+highlighter : highlight.js  # {highlight.js, prettify, highlight}
+hitheme     : tomorrow      # 
+url:
+  lib: ../../librariesNew
+  assets: ../../assets
+widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
+mode        : selfcontained # {standalone, draft}
+---
+
+## Independent events
+
+- Two events $A$ and $B$ are **independent** if $$P(A \cap B) = P(A)P(B)$$
+- Two random variables, $X$ and $Y$, are independent if for any two sets $A$ and $B$ $$P([X \in A] \cap [Y \in B]) = P(X\in A)P(Y\in B)$$
+- If $A$ is independent of $B$ then 
+
+  - $A^c$ is independent of $B$ 
+  - $A$ is independent of $B^c$
+  - $A^c$ is independent of $B^c$
+
+
+---
+
+## Example
+
+- What is the probability of getting two consecutive heads?
+- $A = \{\mbox{Head on flip 1}\}$ ~ $P(A) = .5$
+- $B = \{\mbox{Head on flip 2}\}$ ~ $P(B) = .5$
+- $A \cap B = \{\mbox{Head on flips 1 and 2}\}$
+- $P(A \cap B) = P(A)P(B) = .5 \times .5 = .25$ 
+
+---
+
+## Example
+
+- Volume 309 of Science reports on a physician who was on trial for expert testimony in a criminal trial
+- Based on an estimated prevalence of sudden infant death syndrome of $1$ out of $8,543$, Dr Meadow testified that the probability of a mother having two children with SIDS was $\left(\frac{1}{8,543}\right)^2$
+- The mother on trial was convicted of murder
+
+---
+
+## Example: continued
+
+- For the purposes of this class, the principal mistake was to *assume* that the probabilities of having SIDS within a family are independent
+- That is, $P(A_1 \cap A_2)$ is not necessarily equal to $P(A_1)P(A_2)$
+- Biological processes that are believed to have a genetic or familial environmental component, of course, tend to be dependent within families
+- (There are many other statistical points of discussion for this case.)
+
+---
+
+## Useful fact
+
+We will use the following fact extensively in this class:
+
+*If a collection of random variables $X_1, X_2, \ldots, X_n$ are independent, then their joint distribution is the product of their individual densities or mass functions*
+
+*That is, if $f_i$ is the density for random variable $X_i$ we have that*
+$$
+f(x_1,\ldots, x_n) = \prod_{i=1}^n f_i(x_i)
+$$
+
+---
+
+## IID random variables
+
+- Random variables are said to be iid if they are independent and identically distributed
+- iid random variables are the default model for random samples
+- Many of the important theories of statistics are founded on assuming that variables are iid
+
+
+---
+
+## Example
+
+- Suppose that we flip a biased coin with success probability $p$ a total of $n$ times. What is the joint density of the collection of outcomes?
+- These random variables are iid with densities $p^{x_i} (1 - p)^{1-x_i}$ 
+- Therefore
+  $$
+  f(x_1,\ldots,x_n) = \prod_{i=1}^n p^{x_i} (1 - p)^{1-x_i} = p^{\sum x_i} (1 - p)^{n - \sum x_i}
+  $$
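+
+As a small illustration (the particular outcome and the value $p = 0.5$ are chosen here for concreteness and are not from the original slides), take $n = 3$ flips with observed outcomes $(x_1, x_2, x_3) = (1, 0, 1)$, so that $\sum x_i = 2$:
+
+$$
+f(1, 0, 1) = p^{2} (1 - p)^{3 - 2} = p^2 (1 - p), \qquad \mbox{which equals } 0.5^2 \times 0.5 = 0.125 \mbox{ when } p = 0.5
+$$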
+
+---
+
+## Correlation
+
+- The **covariance** between two random variables $X$ and $Y$ is defined as 
+$$
+Cov(X, Y) = E[(X - \mu_x)(Y - \mu_y)] = E[X Y] - E[X]E[Y]
+$$
+- The following are useful facts about covariance
+  1. $Cov(X, Y) = Cov(Y, X)$
+  2. $Cov(X, Y)$ can be negative or positive
+  3. $|Cov(X, Y)| \leq \sqrt{Var(X) Var(Y)}$
+
+---
+
+## Correlation
+
+- The **correlation** between $X$ and $Y$ is 
+$$
+Cor(X, Y) = Cov(X, Y) / \sqrt{Var(X) Var(Y)}
+$$
+
+  1. $-1 \leq Cor(X, Y) \leq 1$
+  2. $Cor(X, Y) = \pm 1$ if and only if $X = a + bY$ for some constants $a$ and $b$
+  3. $Cor(X, Y)$ is unitless
+  4. $X$ and $Y$ are **uncorrelated** if $Cor(X, Y) = 0$ 
+  5.  $X$ and $Y$ are more positively correlated, the closer $Cor(X,Y)$ is to $1$
+  6.  $X$ and $Y$ are more negatively correlated, the closer $Cor(X,Y)$ is to $-1$
+
+---
+
+## Some useful results
+
+- Let $\{X_i\}_{i=1}^n$ be a collection of random variables
+  - When the $\{X_i\}$ are uncorrelated $$Var\left(\sum_{i=1}^n a_i X_i + b\right) = \sum_{i=1}^n a_i^2 Var(X_i)$$  
+
+- A commonly used subcase from these properties is that *if a collection of random variables $\{X_i\}$ are uncorrelated*, then the variance of the sum is the sum of the variances
+$$
+Var\left(\sum_{i=1}^n X_i \right) = \sum_{i=1}^n Var(X_i)
+$$
+- Therefore, it is sums of variances that tend to be useful, not sums of standard deviations; that is, the standard deviation of the sum of a bunch of independent random variables is the square root of the sum of the variances, not the sum of the standard deviations
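+
+A quick numerical illustration of this point (the variances 9 and 16 are hypothetical values chosen for easy arithmetic): if $X_1$ and $X_2$ are uncorrelated with $Var(X_1) = 9$ and $Var(X_2) = 16$, then
+
+$$
+\sqrt{Var(X_1 + X_2)} = \sqrt{9 + 16} = 5, \qquad \mbox{whereas} \qquad \sqrt{Var(X_1)} + \sqrt{Var(X_2)} = 3 + 4 = 7
+$$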
+
+---
+
+## The sample mean
+
+Suppose $X_i$ are iid with variance $\sigma^2$
+
+$$
+\begin{eqnarray*}
+    Var(\bar X) & = & Var \left( \frac{1}{n}\sum_{i=1}^n X_i \right)\\ \\
+    & = & \frac{1}{n^2} Var\left(\sum_{i=1}^n X_i \right)\\ \\
+    & = & \frac{1}{n^2} \sum_{i=1}^n Var(X_i) \\ \\
+    & = & \frac{1}{n^2} \times n\sigma^2 \\ \\
+    & = & \frac{\sigma^2}{n}
+  \end{eqnarray*}
+$$
+
+---
+
+## Some comments
+
+- When $X_i$ are independent with a common variance $\sigma^2$, $Var(\bar X) = \frac{\sigma^2}{n}$
+- $\sigma/\sqrt{n}$ is called *the standard error* of the sample mean
+- The standard error of the sample mean is the standard deviation of the distribution of the sample mean
+- $\sigma$ is the standard deviation of the distribution of a single observation
+- An easy way to remember: the sample mean has to be less variable than a single observation, therefore its standard deviation is divided by $\sqrt{n}$
+
+---
+
+## The sample variance
+- The **sample variance** is defined as 
+$$
+S^2 =   \frac{\sum_{i=1}^n (X_i - \bar X)^2}{n-1} 
+$$
+- The sample variance is an estimator of $\sigma^2$
+- The numerator has a version that's quicker for calculation
+$$
+\sum_{i=1}^n (X_i - \bar X)^2 = \sum_{i=1}^n X_i^2 - n \bar X^2
+$$
+- The sample variance is (nearly) the mean of the squared deviations from the mean
+
+---
+
+## The sample variance is unbiased
+
+$$
+  \begin{eqnarray*}
+    E\left[\sum_{i=1}^n (X_i - \bar X)^2\right] & = & \sum_{i=1}^n E\left[X_i^2\right] - n E\left[\bar X^2\right] \\ \\
+    & = & \sum_{i=1}^n \left\{Var(X_i) + \mu^2\right\} - n \left\{Var(\bar X) + \mu^2\right\} \\ \\
+    & = & \sum_{i=1}^n \left\{\sigma^2 + \mu^2\right\} - n \left\{\sigma^2 / n + \mu^2\right\} \\ \\
+    & = & n \sigma^2 + n \mu ^ 2 - \sigma^2 - n \mu^2 \\ \\
+    & = & (n - 1) \sigma^2
+  \end{eqnarray*}
+$$
+
+---
+
+## Hoping to avoid some confusion
+
+- Suppose $X_i$ are iid with mean $\mu$ and variance $\sigma^2$
+- $S^2$ estimates $\sigma^2$
+- The calculation of $S^2$ involves dividing by $n-1$
+- $S / \sqrt{n}$ estimates $\sigma / \sqrt{n}$, the standard error of the mean
+- $S / \sqrt{n}$ is called the sample standard error (of the mean)
+
+---
+## Example
+
+```r
+data(father.son)
+x <- father.son$sheight
+n <- length(x)
+```
+
+
+---
+![plot of chunk unnamed-chunk-2](assets/fig/unnamed-chunk-2.png) 
+
+
+```r
+round(c(sum((x - mean(x))^2)/(n - 1), var(x), var(x)/n, sd(x), sd(x)/sqrt(n)), 
+    2)
+```
+
+```
+## [1] 7.92 7.92 0.01 2.81 0.09
+```
+
diff --git a/06_StatisticalInference/01_04_Independence/index.pdf b/06_StatisticalInference/01_04_Independence/index.pdf
index ba92e4d8e..fd2201506 100644
Binary files a/06_StatisticalInference/01_04_Independence/index.pdf and b/06_StatisticalInference/01_04_Independence/index.pdf differ
diff --git a/06_StatisticalInference/01_05_ConditionalProbability/index.html b/06_StatisticalInference/01_05_ConditionalProbability/index.html
index dba5dc912..0267feacb 100644
--- a/06_StatisticalInference/01_05_ConditionalProbability/index.html
+++ b/06_StatisticalInference/01_05_ConditionalProbability/index.html
@@ -1,411 +1,411 @@
-<!DOCTYPE html>
-<html>
-<head>
-  <title>Conditional Probability</title>
-  <meta charset="utf-8">
-  <meta name="description" content="Conditional Probability">
-  <meta name="author" content="Brian Caffo, Jeff Leek, Roger Peng">
-  <meta name="generator" content="slidify" />
-  <meta name="apple-mobile-web-app-capable" content="yes">
-  <meta http-equiv="X-UA-Compatible" content="chrome=1">
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/default.css" media="all" >
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/phone.css" 
-    media="only screen and (max-device-width: 480px)" >
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/slidify.css" >
-  <link rel="stylesheet" href="../../librariesNew/highlighters/highlight.js/css/tomorrow.css" />
-  <base target="_blank"> <!-- This amazingness opens all links in a new tab. -->  
-  
-  <!-- Grab CDN jQuery, fall back to local if offline -->
-  <script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.7.min.js"></script>
-  <script>window.jQuery || document.write('<script src="../../librariesNew/widgets/quiz/js/jquery.js"><\/script>')</script> 
-  <script data-main="../../librariesNew/frameworks/io2012/js/slides" 
-    src="../../librariesNew/frameworks/io2012/js/require-1.0.8.min.js">
-  </script>
-  
-  
-
-</head>
-<body style="opacity: 0">
-  <slides class="layout-widescreen">
-    
-    <!-- LOGO SLIDE -->
-        <slide class="title-slide segue nobackground">
-  <aside class="gdbar">
-    <img src="../../assets/img/bloomberg_shield.png">
-  </aside>
-  <hgroup class="auto-fadein">
-    <h1>Conditional Probability</h1>
-    <h2>Statistical Inference</h2>
-    <p>Brian Caffo, Jeff Leek, Roger Peng<br/>Johns Hopkins Bloomberg School of Public Health</p>
-  </hgroup>
-  <article></article>  
-</slide>
-    
-
-    <!-- SLIDES -->
-    <slide class="" id="slide-1" style="background:;">
-  <hgroup>
-    <h2>Conditional probability, motivation</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>The probability of getting a one when rolling a (standard) die
-is usually assumed to be one sixth</li>
-<li>Suppose you were given the extra information that the die roll
-was an odd number (hence 1, 3 or 5)</li>
-<li><em>conditional on this new information</em>, the probability of a
-one is now one third</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-2" style="background:;">
-  <hgroup>
-    <h2>Conditional probability, definition</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Let \(B\) be an event so that \(P(B) > 0\)</li>
-<li>Then the conditional probability of an event \(A\) given that \(B\) has occurred is
-\[
-P(A ~|~ B) = \frac{P(A \cap B)}{P(B)}
-\]</li>
-<li>Notice that if \(A\) and \(B\) are independent, then
-\[
-P(A ~|~ B) = \frac{P(A) P(B)}{P(B)} = P(A)
-\]</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-3" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Consider our die roll example</li>
-<li>\(B = \{1, 3, 5\}\)</li>
-<li>\(A = \{1\}\)
-\[
-\begin{eqnarray*}
-P(\mbox{one given that roll is odd})  & = & P(A ~|~ B) \\ \\
-& = & \frac{P(A \cap B)}{P(B)} \\ \\
-& = & \frac{P(A)}{P(B)} \\ \\ 
-& = & \frac{1/6}{3/6} = \frac{1}{3}
-\end{eqnarray*}
-\]</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-4" style="background:;">
-  <hgroup>
-    <h2>Bayes&#39; rule</h2>
-  </hgroup>
-  <article data-timings="">
-    <p>\[
-P(B ~|~ A) = \frac{P(A ~|~ B) P(B)}{P(A ~|~ B) P(B) + P(A ~|~ B^c)P(B^c)}.
-\]</p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-5" style="background:;">
-  <hgroup>
-    <h2>Diagnostic tests</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Let \(+\) and \(-\) be the events that the result of a diagnostic test is positive or negative respectively</li>
-<li>Let \(D\) and \(D^c\) be the event that the subject of the test has or does not have the disease respectively </li>
-<li>The <strong>sensitivity</strong> is the probability that the test is positive given that the subject actually has the disease, \(P(+ ~|~ D)\)</li>
-<li>The <strong>specificity</strong> is the probability that the test is negative given that the subject does not have the disease, \(P(- ~|~ D^c)\)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-6" style="background:;">
-  <hgroup>
-    <h2>More definitions</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>The <strong>positive predictive value</strong> is the probability that the subject has the  disease given that the test is positive, \(P(D ~|~ +)\)</li>
-<li>The <strong>negative predictive value</strong> is the probability that the subject does not have the disease given that the test is negative, \(P(D^c ~|~ -)\)</li>
-<li>The <strong>prevalence of the disease</strong> is the marginal probability of disease, \(P(D)\)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-7" style="background:;">
-  <hgroup>
-    <h2>More definitions</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>The <strong>diagnostic likelihood ratio of a positive test</strong>, labeled \(DLR_+\), is \(P(+ ~|~ D) / P(+ ~|~ D^c)\), which is the \[sensitivity / (1 - specificity)\]</li>
-<li>The <strong>diagnostic likelihood ratio of a negative test</strong>, labeled \(DLR_-\), is \(P(- ~|~ D) / P(- ~|~ D^c)\), which is the \[(1 - sensitivity) / specificity\]</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-8" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>A study comparing the efficacy of HIV tests, reports on an experiment which concluded that HIV antibody tests have a sensitivity of 99.7% and a specificity of 98.5%</li>
-<li>Suppose that a subject, from a population with a .1% prevalence of HIV, receives a positive test result. What is the probability that this subject has HIV?</li>
-<li>Mathematically, we want \(P(D ~|~ +)\) given the sensitivity, \(P(+ ~|~ D) = .997\), the specificity, \(P(- ~|~ D^c) =.985\), and the prevalence \(P(D) = .001\)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-9" style="background:;">
-  <hgroup>
-    <h2>Using Bayes&#39; formula</h2>
-  </hgroup>
-  <article data-timings="">
-    <p>\[
-\begin{eqnarray*}
-  P(D ~|~ +) & = &\frac{P(+~|~D)P(D)}{P(+~|~D)P(D) + P(+~|~D^c)P(D^c)}\\ \\
- & = & \frac{P(+~|~D)P(D)}{P(+~|~D)P(D) + \{1-P(-~|~D^c)\}\{1 - P(D)\}} \\ \\
- & = & \frac{.997\times .001}{.997 \times .001 + .015 \times .999}\\ \\
- & = & .062
-\end{eqnarray*}
-\]</p>
-
-<ul>
-<li>In this population a positive test result only suggests a 6% probability that the subject has the disease </li>
-<li>(The positive predictive value is 6% for this test)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-10" style="background:;">
-  <hgroup>
-    <h2>More on this example</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>The low positive predictive value is due to low prevalence of disease and the somewhat modest specificity</li>
-<li>Suppose it was known that the subject was an intravenous drug user and routinely had intercourse with an HIV infected partner</li>
-<li>Notice that the evidence implied by a positive test result does not change because of the prevalence of disease in the subject&#39;s population, only our interpretation of that evidence changes</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-11" style="background:;">
-  <hgroup>
-    <h2>Likelihood ratios</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Using Bayes rule, we have
-\[
-P(D ~|~ +) = \frac{P(+~|~D)P(D)}{P(+~|~D)P(D) + P(+~|~D^c)P(D^c)} 
-\]
-and
-\[
-P(D^c ~|~ +) = \frac{P(+~|~D^c)P(D^c)}{P(+~|~D)P(D) + P(+~|~D^c)P(D^c)}.
-\]</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-12" style="background:;">
-  <hgroup>
-    <h2>Likelihood ratios</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Therefore
-\[
-\frac{P(D ~|~ +)}{P(D^c ~|~ +)} = \frac{P(+~|~D)}{P(+~|~D^c)}\times \frac{P(D)}{P(D^c)}
-\]
-ie
-\[
-\mbox{post-test odds of }D = DLR_+\times\mbox{pre-test odds of }D
-\]</li>
-<li>Similarly, \(DLR_-\) relates the decrease in the odds of the
-disease after a negative test result to the odds of disease prior to
-the test.</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-13" style="background:;">
-  <hgroup>
-    <h2>HIV example revisited</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Suppose a subject has a positive HIV test</li>
-<li>\(DLR_+ = .997 / (1 - .985) \approx 66\)</li>
-<li>The result of the positive test is that the odds of disease is now 66 times the pretest odds</li>
-<li>Or, equivalently, the hypothesis of disease is 66 times more supported by the data than the hypothesis of no disease</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-14" style="background:;">
-  <hgroup>
-    <h2>HIV example revisited</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Suppose that a subject has a negative test result </li>
-<li>\(DLR_- = (1 - .997) / .985  \approx .003\)</li>
-<li>Therefore, the post-test odds of disease is now \(.3\%\) of the pretest odds given the negative test.</li>
-<li>Or, the hypothesis of disease is supported \(.003\) times that of the hypothesis of absence of disease given the negative test result</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-    <slide class="backdrop"></slide>
-  </slides>
-  <div class="pagination pagination-small" id='io2012-ptoc' style="display:none;">
-    <ul>
-      <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=1 title='Conditional probability, motivation'>
-         1
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=2 title='Conditional probability, definition'>
-         2
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=3 title='Example'>
-         3
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=4 title='Bayes&#39; rule'>
-         4
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=5 title='Diagnostic tests'>
-         5
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=6 title='More definitions'>
-         6
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=7 title='More definitions'>
-         7
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=8 title='Example'>
-         8
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=9 title='Using Bayes&#39; formula'>
-         9
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=10 title='More on this example'>
-         10
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=11 title='Likelihood ratios'>
-         11
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=12 title='Likelihood ratios'>
-         12
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=13 title='HIV example revisited'>
-         13
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=14 title='HIV example revisited'>
-         14
-      </a>
-    </li>
-  </ul>
-  </div>  <!--[if IE]>
-    <script 
-      src="http://ajax.googleapis.com/ajax/libs/chrome-frame/1/CFInstall.min.js">  
-    </script>
-    <script>CFInstall.check({mode: 'overlay'});</script>
-  <![endif]-->
-</body>
-  <!-- Load Javascripts for Widgets -->
-  
-  <!-- MathJax: Fall back to local if CDN offline but local image fonts are not supported (saves >100MB) -->
-  <script type="text/x-mathjax-config">
-    MathJax.Hub.Config({
-      tex2jax: {
-        inlineMath: [['$','$'], ['\\(','\\)']],
-        processEscapes: true
-      }
-    });
-  </script>
-  <script type="text/javascript" src="http://cdn.mathjax.org/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
-  <!-- <script src="https://c328740.ssl.cf1.rackcdn.com/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
-  </script> -->
-  <script>window.MathJax || document.write('<script type="text/x-mathjax-config">MathJax.Hub.Config({"HTML-CSS":{imageFont:null}});<\/script><script src="../../librariesNew/widgets/mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"><\/script>')
-</script>
-<!-- LOAD HIGHLIGHTER JS FILES -->
-  <script src="../../librariesNew/highlighters/highlight.js/highlight.pack.js"></script>
-  <script>hljs.initHighlightingOnLoad();</script>
-  <!-- DONE LOADING HIGHLIGHTER JS FILES -->
-   
+<!DOCTYPE html>
+<html>
+<head>
+  <title>Conditional Probability</title>
+  <meta charset="utf-8">
+  <meta name="description" content="Conditional Probability">
+  <meta name="author" content="Brian Caffo, Jeff Leek, Roger Peng">
+  <meta name="generator" content="slidify" />
+  <meta name="apple-mobile-web-app-capable" content="yes">
+  <meta http-equiv="X-UA-Compatible" content="chrome=1">
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/default.css" media="all" >
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/phone.css" 
+    media="only screen and (max-device-width: 480px)" >
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/slidify.css" >
+  <link rel="stylesheet" href="../../librariesNew/highlighters/highlight.js/css/tomorrow.css" />
+  <base target="_blank"> <!-- This amazingness opens all links in a new tab. -->  
+  
+  <!-- Grab CDN jQuery, fall back to local if offline -->
+  <script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.7.min.js"></script>
+  <script>window.jQuery || document.write('<script src="../../librariesNew/widgets/quiz/js/jquery.js"><\/script>')</script> 
+  <script data-main="../../librariesNew/frameworks/io2012/js/slides" 
+    src="../../librariesNew/frameworks/io2012/js/require-1.0.8.min.js">
+  </script>
+  
+  
+
+</head>
+<body style="opacity: 0">
+  <slides class="layout-widescreen">
+    
+    <!-- LOGO SLIDE -->
+        <slide class="title-slide segue nobackground">
+  <aside class="gdbar">
+    <img src="../../assets/img/bloomberg_shield.png">
+  </aside>
+  <hgroup class="auto-fadein">
+    <h1>Conditional Probability</h1>
+    <h2>Statistical Inference</h2>
+    <p>Brian Caffo, Jeff Leek, Roger Peng<br/>Johns Hopkins Bloomberg School of Public Health</p>
+  </hgroup>
+  <article></article>  
+</slide>
+    
+
+    <!-- SLIDES -->
+    <slide class="" id="slide-1" style="background:;">
+  <hgroup>
+    <h2>Conditional probability, motivation</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>The probability of getting a one when rolling a (standard) die
+is usually assumed to be one sixth</li>
+<li>Suppose you were given the extra information that the die roll
+was an odd number (hence 1, 3 or 5)</li>
+<li><em>conditional on this new information</em>, the probability of a
+one is now one third</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-2" style="background:;">
+  <hgroup>
+    <h2>Conditional probability, definition</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Let \(B\) be an event so that \(P(B) > 0\)</li>
+<li>Then the conditional probability of an event \(A\) given that \(B\) has occurred is
+\[
+P(A ~|~ B) = \frac{P(A \cap B)}{P(B)}
+\]</li>
+<li>Notice that if \(A\) and \(B\) are independent, then
+\[
+P(A ~|~ B) = \frac{P(A) P(B)}{P(B)} = P(A)
+\]</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-3" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Consider our die roll example</li>
+<li>\(B = \{1, 3, 5\}\)</li>
+<li>\(A = \{1\}\)
+\[
+\begin{eqnarray*}
+P(\mbox{one given that roll is odd})  & = & P(A ~|~ B) \\ \\
+& = & \frac{P(A \cap B)}{P(B)} \\ \\
+& = & \frac{P(A)}{P(B)} \\ \\ 
+& = & \frac{1/6}{3/6} = \frac{1}{3}
+\end{eqnarray*}
+\]</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-4" style="background:;">
+  <hgroup>
+    <h2>Bayes&#39; rule</h2>
+  </hgroup>
+  <article data-timings="">
+    <p>\[
+P(B ~|~ A) = \frac{P(A ~|~ B) P(B)}{P(A ~|~ B) P(B) + P(A ~|~ B^c)P(B^c)}.
+\]</p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-5" style="background:;">
+  <hgroup>
+    <h2>Diagnostic tests</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Let \(+\) and \(-\) be the events that the result of a diagnostic test is positive or negative respectively</li>
+<li>Let \(D\) and \(D^c\) be the events that the subject of the test has or does not have the disease respectively </li>
+<li>The <strong>sensitivity</strong> is the probability that the test is positive given that the subject actually has the disease, \(P(+ ~|~ D)\)</li>
+<li>The <strong>specificity</strong> is the probability that the test is negative given that the subject does not have the disease, \(P(- ~|~ D^c)\)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-6" style="background:;">
+  <hgroup>
+    <h2>More definitions</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>The <strong>positive predictive value</strong> is the probability that the subject has the  disease given that the test is positive, \(P(D ~|~ +)\)</li>
+<li>The <strong>negative predictive value</strong> is the probability that the subject does not have the disease given that the test is negative, \(P(D^c ~|~ -)\)</li>
+<li>The <strong>prevalence of the disease</strong> is the marginal probability of disease, \(P(D)\)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-7" style="background:;">
+  <hgroup>
+    <h2>More definitions</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>The <strong>diagnostic likelihood ratio of a positive test</strong>, labeled \(DLR_+\), is \(P(+ ~|~ D) / P(+ ~|~ D^c)\), which is the \[sensitivity / (1 - specificity)\]</li>
+<li>The <strong>diagnostic likelihood ratio of a negative test</strong>, labeled \(DLR_-\), is \(P(- ~|~ D) / P(- ~|~ D^c)\), which is the \[(1 - sensitivity) / specificity\]</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-8" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>A study comparing the efficacy of HIV tests reports on an experiment which concluded that HIV antibody tests have a sensitivity of 99.7% and a specificity of 98.5%</li>
+<li>Suppose that a subject, from a population with a .1% prevalence of HIV, receives a positive test result. What is the probability that this subject has HIV?</li>
+<li>Mathematically, we want \(P(D ~|~ +)\) given the sensitivity, \(P(+ ~|~ D) = .997\), the specificity, \(P(- ~|~ D^c) =.985\), and the prevalence \(P(D) = .001\)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-9" style="background:;">
+  <hgroup>
+    <h2>Using Bayes&#39; formula</h2>
+  </hgroup>
+  <article data-timings="">
+    <p>\[
+\begin{eqnarray*}
+  P(D ~|~ +) & = &\frac{P(+~|~D)P(D)}{P(+~|~D)P(D) + P(+~|~D^c)P(D^c)}\\ \\
+ & = & \frac{P(+~|~D)P(D)}{P(+~|~D)P(D) + \{1-P(-~|~D^c)\}\{1 - P(D)\}} \\ \\
+ & = & \frac{.997\times .001}{.997 \times .001 + .015 \times .999}\\ \\
+ & = & .062
+\end{eqnarray*}
+\]</p>
+
+<ul>
+<li>In this population a positive test result only suggests a 6% probability that the subject has the disease </li>
+<li>(The positive predictive value is 6% for this test)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-10" style="background:;">
+  <hgroup>
+    <h2>More on this example</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>The low positive predictive value is due to low prevalence of disease and the somewhat modest specificity</li>
+<li>Suppose it was known that the subject was an intravenous drug user and routinely had intercourse with an HIV infected partner</li>
+<li>Notice that the evidence implied by a positive test result does not change because of the prevalence of disease in the subject&#39;s population, only our interpretation of that evidence changes</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-11" style="background:;">
+  <hgroup>
+    <h2>Likelihood ratios</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Using Bayes&#39; rule, we have
+\[
+P(D ~|~ +) = \frac{P(+~|~D)P(D)}{P(+~|~D)P(D) + P(+~|~D^c)P(D^c)} 
+\]
+and
+\[
+P(D^c ~|~ +) = \frac{P(+~|~D^c)P(D^c)}{P(+~|~D)P(D) + P(+~|~D^c)P(D^c)}.
+\]</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-12" style="background:;">
+  <hgroup>
+    <h2>Likelihood ratios</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Therefore
+\[
+\frac{P(D ~|~ +)}{P(D^c ~|~ +)} = \frac{P(+~|~D)}{P(+~|~D^c)}\times \frac{P(D)}{P(D^c)}
+\]
+i.e.
+\[
+\mbox{post-test odds of }D = DLR_+\times\mbox{pre-test odds of }D
+\]</li>
+<li>Similarly, \(DLR_-\) relates the decrease in the odds of the
+disease after a negative test result to the odds of disease prior to
+the test.</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-13" style="background:;">
+  <hgroup>
+    <h2>HIV example revisited</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Suppose a subject has a positive HIV test</li>
+<li>\(DLR_+ = .997 / (1 - .985) \approx 66\)</li>
+<li>The result of the positive test is that the odds of disease are now 66 times the pretest odds</li>
+<li>Or, equivalently, the hypothesis of disease is 66 times more supported by the data than the hypothesis of no disease</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-14" style="background:;">
+  <hgroup>
+    <h2>HIV example revisited</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Suppose that a subject has a negative test result </li>
+<li>\(DLR_- = (1 - .997) / .985  \approx .003\)</li>
+<li>Therefore, given the negative test, the post-test odds of disease are now \(.3\%\) of the pretest odds.</li>
+<li>Or, the hypothesis of disease is supported only \(.003\) times as strongly as the hypothesis of no disease, given the negative test result</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+    <slide class="backdrop"></slide>
+  </slides>
+  <div class="pagination pagination-small" id='io2012-ptoc' style="display:none;">
+    <ul>
+      <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=1 title='Conditional probability, motivation'>
+         1
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=2 title='Conditional probability, definition'>
+         2
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=3 title='Example'>
+         3
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=4 title='Bayes&#39; rule'>
+         4
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=5 title='Diagnostic tests'>
+         5
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=6 title='More definitions'>
+         6
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=7 title='More definitions'>
+         7
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=8 title='Example'>
+         8
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=9 title='Using Bayes&#39; formula'>
+         9
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=10 title='More on this example'>
+         10
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=11 title='Likelihood ratios'>
+         11
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=12 title='Likelihood ratios'>
+         12
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=13 title='HIV example revisited'>
+         13
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=14 title='HIV example revisited'>
+         14
+      </a>
+    </li>
+  </ul>
+  </div>  <!--[if IE]>
+    <script 
+      src="http://ajax.googleapis.com/ajax/libs/chrome-frame/1/CFInstall.min.js">  
+    </script>
+    <script>CFInstall.check({mode: 'overlay'});</script>
+  <![endif]-->
+</body>
+  <!-- Load Javascripts for Widgets -->
+  
+  <!-- MathJax: Fall back to local if CDN offline but local image fonts are not supported (saves >100MB) -->
+  <script type="text/x-mathjax-config">
+    MathJax.Hub.Config({
+      tex2jax: {
+        inlineMath: [['$','$'], ['\\(','\\)']],
+        processEscapes: true
+      }
+    });
+  </script>
+  <script type="text/javascript" src="http://cdn.mathjax.org/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
+  <!-- <script src="https://c328740.ssl.cf1.rackcdn.com/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
+  </script> -->
+  <script>window.MathJax || document.write('<script type="text/x-mathjax-config">MathJax.Hub.Config({"HTML-CSS":{imageFont:null}});<\/script><script src="../../librariesNew/widgets/mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"><\/script>')
+</script>
+<!-- LOAD HIGHLIGHTER JS FILES -->
+  <script src="../../librariesNew/highlighters/highlight.js/highlight.pack.js"></script>
+  <script>hljs.initHighlightingOnLoad();</script>
+  <!-- DONE LOADING HIGHLIGHTER JS FILES -->
+   
   </html>
\ No newline at end of file
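Note on the "Using Bayes' formula" slide in the file above: the positive predictive value of about 6% can be checked numerically. The following is a minimal R sketch, assuming only the sensitivity, specificity, and prevalence quoted in the slides; the variable names are illustrative and do not come from the course materials.

```r
# Values quoted in the HIV example slides
sensitivity <- 0.997   # P(+ | D)
specificity <- 0.985   # P(- | D^c)
prevalence  <- 0.001   # P(D)

# Bayes' rule:
# P(D | +) = P(+|D) P(D) / [ P(+|D) P(D) + P(+|D^c) P(D^c) ]
true_pos  <- sensitivity * prevalence              # P(+ and D)
false_pos <- (1 - specificity) * (1 - prevalence)  # P(+ and D^c)
ppv <- true_pos / (true_pos + false_pos)

round(ppv, 3)
```

The denominator is the marginal probability of a positive test, so the same two intermediate quantities could also be reused to compute the negative predictive value if needed.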
diff --git a/06_StatisticalInference/01_05_ConditionalProbability/index.md b/06_StatisticalInference/01_05_ConditionalProbability/index.md
index db2151e51..93dad87ce 100644
--- a/06_StatisticalInference/01_05_ConditionalProbability/index.md
+++ b/06_StatisticalInference/01_05_ConditionalProbability/index.md
@@ -1,169 +1,169 @@
----
-title       : Conditional Probability
-subtitle    : Statistical Inference
-author      : Brian Caffo, Jeff Leek, Roger Peng
-job         : Johns Hopkins Bloomberg School of Public Health
-logo        : bloomberg_shield.png
-framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
-highlighter : highlight.js  # {highlight.js, prettify, highlight}
-hitheme     : tomorrow      # 
-url:
-  lib: ../../librariesNew
-  assets: ../../assets
-widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
-mode        : selfcontained # {standalone, draft}
----
-
-## Conditional probability, motivation
-
-- The probability of getting a one when rolling a (standard) die
-  is usually assumed to be one sixth
-- Suppose you were given the extra information that the die roll
-  was an odd number (hence 1, 3 or 5)
-- *conditional on this new information*, the probability of a
-  one is now one third
-
----
-
-## Conditional probability, definition
-
-- Let $B$ be an event so that $P(B) > 0$
-- Then the conditional probability of an event $A$ given that $B$ has occurred is
-  $$
-  P(A ~|~ B) = \frac{P(A \cap B)}{P(B)}
-  $$
-- Notice that if $A$ and $B$ are independent, then
-  $$
-  P(A ~|~ B) = \frac{P(A) P(B)}{P(B)} = P(A)
-  $$
-
----
-
-## Example
-
-- Consider our die roll example
-- $B = \{1, 3, 5\}$
-- $A = \{1\}$
-$$
-  \begin{eqnarray*}
-P(\mbox{one given that roll is odd})  & = & P(A ~|~ B) \\ \\
-  & = & \frac{P(A \cap B)}{P(B)} \\ \\
-  & = & \frac{P(A)}{P(B)} \\ \\ 
-  & = & \frac{1/6}{3/6} = \frac{1}{3}
-  \end{eqnarray*}
-$$
-
-
-
----
-
-## Bayes' rule
-
-$$
-P(B ~|~ A) = \frac{P(A ~|~ B) P(B)}{P(A ~|~ B) P(B) + P(A ~|~ B^c)P(B^c)}.
-$$
-  
-
----
-
-## Diagnostic tests
-
-- Let $+$ and $-$ be the events that the result of a diagnostic test is positive or negative respectively
-- Let $D$ and $D^c$ be the event that the subject of the test has or does not have the disease respectively 
-- The **sensitivity** is the probability that the test is positive given that the subject actually has the disease, $P(+ ~|~ D)$
-- The **specificity** is the probability that the test is negative given that the subject does not have the disease, $P(- ~|~ D^c)$
-
----
-
-## More definitions
-
-- The **positive predictive value** is the probability that the subject has the  disease given that the test is positive, $P(D ~|~ +)$
-- The **negative predictive value** is the probability that the subject does not have the disease given that the test is negative, $P(D^c ~|~ -)$
-- The **prevalence of the disease** is the marginal probability of disease, $P(D)$
-
----
-
-## More definitions
-
-- The **diagnostic likelihood ratio of a positive test**, labeled $DLR_+$, is $P(+ ~|~ D) / P(+ ~|~ D^c)$, which is the $$sensitivity / (1 - specificity)$$
-- The **diagnostic likelihood ratio of a negative test**, labeled $DLR_-$, is $P(- ~|~ D) / P(- ~|~ D^c)$, which is the $$(1 - sensitivity) / specificity$$
-
----
-
-## Example
-
-- A study comparing the efficacy of HIV tests, reports on an experiment which concluded that HIV antibody tests have a sensitivity of 99.7% and a specificity of 98.5%
-- Suppose that a subject, from a population with a .1% prevalence of HIV, receives a positive test result. What is the probability that this subject has HIV?
-- Mathematically, we want $P(D ~|~ +)$ given the sensitivity, $P(+ ~|~ D) = .997$, the specificity, $P(- ~|~ D^c) =.985$, and the prevalence $P(D) = .001$
-
----
-
-## Using Bayes' formula
-
-$$
-\begin{eqnarray*}
-  P(D ~|~ +) & = &\frac{P(+~|~D)P(D)}{P(+~|~D)P(D) + P(+~|~D^c)P(D^c)}\\ \\
- & = & \frac{P(+~|~D)P(D)}{P(+~|~D)P(D) + \{1-P(-~|~D^c)\}\{1 - P(D)\}} \\ \\
- & = & \frac{.997\times .001}{.997 \times .001 + .015 \times .999}\\ \\
- & = & .062
-\end{eqnarray*}
-$$
-
-- In this population a positive test result only suggests a 6% probability that the subject has the disease 
-- (The positive predictive value is 6% for this test)
-
----
-
-## More on this example
-
-- The low positive predictive value is due to low prevalence of disease and the somewhat modest specificity
-- Suppose it was known that the subject was an intravenous drug user and routinely had intercourse with an HIV infected partner
-- Notice that the evidence implied by a positive test result does not change because of the prevalence of disease in the subject's population, only our interpretation of that evidence changes
-
----
-
-## Likelihood ratios
-
-- Using Bayes rule, we have
-  $$
-  P(D ~|~ +) = \frac{P(+~|~D)P(D)}{P(+~|~D)P(D) + P(+~|~D^c)P(D^c)} 
-  $$
-  and
-  $$
-  P(D^c ~|~ +) = \frac{P(+~|~D^c)P(D^c)}{P(+~|~D)P(D) + P(+~|~D^c)P(D^c)}.
-  $$
-
----
-
-## Likelihood ratios
-
-- Therefore
-$$
-\frac{P(D ~|~ +)}{P(D^c ~|~ +)} = \frac{P(+~|~D)}{P(+~|~D^c)}\times \frac{P(D)}{P(D^c)}
-$$
-ie
-$$
-\mbox{post-test odds of }D = DLR_+\times\mbox{pre-test odds of }D
-$$
-- Similarly, $DLR_-$ relates the decrease in the odds of the
-  disease after a negative test result to the odds of disease prior to
-  the test.
-
----
-
-## HIV example revisited
-
-- Suppose a subject has a positive HIV test
-- $DLR_+ = .997 / (1 - .985) \approx 66$
-- The result of the positive test is that the odds of disease is now 66 times the pretest odds
-- Or, equivalently, the hypothesis of disease is 66 times more supported by the data than the hypothesis of no disease
-
----
-
-## HIV example revisited
-
-- Suppose that a subject has a negative test result 
-- $DLR_- = (1 - .997) / .985  \approx .003$
-- Therefore, the post-test odds of disease is now $.3\%$ of the pretest odds given the negative test.
-- Or, the hypothesis of disease is supported $.003$ times that of the hypothesis of absence of disease given the negative test result
-
+---
+title       : Conditional Probability
+subtitle    : Statistical Inference
+author      : Brian Caffo, Jeff Leek, Roger Peng
+job         : Johns Hopkins Bloomberg School of Public Health
+logo        : bloomberg_shield.png
+framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
+highlighter : highlight.js  # {highlight.js, prettify, highlight}
+hitheme     : tomorrow      # 
+url:
+  lib: ../../librariesNew
+  assets: ../../assets
+widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
+mode        : selfcontained # {standalone, draft}
+---
+
+## Conditional probability, motivation
+
+- The probability of getting a one when rolling a (standard) die
+  is usually assumed to be one sixth
+- Suppose you were given the extra information that the die roll
+  was an odd number (hence 1, 3 or 5)
+- *conditional on this new information*, the probability of a
+  one is now one third
+
+---
+
+## Conditional probability, definition
+
+- Let $B$ be an event so that $P(B) > 0$
+- Then the conditional probability of an event $A$ given that $B$ has occurred is
+  $$
+  P(A ~|~ B) = \frac{P(A \cap B)}{P(B)}
+  $$
+- Notice that if $A$ and $B$ are independent, then
+  $$
+  P(A ~|~ B) = \frac{P(A) P(B)}{P(B)} = P(A)
+  $$
+
+---
+
+## Example
+
+- Consider our die roll example
+- $B = \{1, 3, 5\}$
+- $A = \{1\}$
+$$
+  \begin{eqnarray*}
+P(\mbox{one given that roll is odd})  & = & P(A ~|~ B) \\ \\
+  & = & \frac{P(A \cap B)}{P(B)} \\ \\
+  & = & \frac{P(A)}{P(B)} \\ \\ 
+  & = & \frac{1/6}{3/6} = \frac{1}{3}
+  \end{eqnarray*}
+$$
+
+
+
+---
+
+## Bayes' rule
+
+$$
+P(B ~|~ A) = \frac{P(A ~|~ B) P(B)}{P(A ~|~ B) P(B) + P(A ~|~ B^c)P(B^c)}.
+$$
+  
+
+---
+
+## Diagnostic tests
+
+- Let $+$ and $-$ be the events that the result of a diagnostic test is positive or negative respectively
+- Let $D$ and $D^c$ be the events that the subject of the test has or does not have the disease respectively 
+- The **sensitivity** is the probability that the test is positive given that the subject actually has the disease, $P(+ ~|~ D)$
+- The **specificity** is the probability that the test is negative given that the subject does not have the disease, $P(- ~|~ D^c)$
+
+---
+
+## More definitions
+
+- The **positive predictive value** is the probability that the subject has the  disease given that the test is positive, $P(D ~|~ +)$
+- The **negative predictive value** is the probability that the subject does not have the disease given that the test is negative, $P(D^c ~|~ -)$
+- The **prevalence of the disease** is the marginal probability of disease, $P(D)$
+
+---
+
+## More definitions
+
+- The **diagnostic likelihood ratio of a positive test**, labeled $DLR_+$, is $P(+ ~|~ D) / P(+ ~|~ D^c)$, which is the $$sensitivity / (1 - specificity)$$
+- The **diagnostic likelihood ratio of a negative test**, labeled $DLR_-$, is $P(- ~|~ D) / P(- ~|~ D^c)$, which is the $$(1 - sensitivity) / specificity$$
+
+---
+
+## Example
+
+- A study comparing the efficacy of HIV tests reports on an experiment which concluded that HIV antibody tests have a sensitivity of 99.7% and a specificity of 98.5%
+- Suppose that a subject, from a population with a .1% prevalence of HIV, receives a positive test result. What is the probability that this subject has HIV?
+- Mathematically, we want $P(D ~|~ +)$ given the sensitivity, $P(+ ~|~ D) = .997$, the specificity, $P(- ~|~ D^c) =.985$, and the prevalence $P(D) = .001$
+
+---
+
+## Using Bayes' formula
+
+$$
+\begin{eqnarray*}
+  P(D ~|~ +) & = &\frac{P(+~|~D)P(D)}{P(+~|~D)P(D) + P(+~|~D^c)P(D^c)}\\ \\
+ & = & \frac{P(+~|~D)P(D)}{P(+~|~D)P(D) + \{1-P(-~|~D^c)\}\{1 - P(D)\}} \\ \\
+ & = & \frac{.997\times .001}{.997 \times .001 + .015 \times .999}\\ \\
+ & = & .062
+\end{eqnarray*}
+$$
+
+- In this population a positive test result only suggests a 6% probability that the subject has the disease 
+- (The positive predictive value is 6% for this test)
+
+---
+
+## More on this example
+
+- The low positive predictive value is due to low prevalence of disease and the somewhat modest specificity
+- Suppose it was known that the subject was an intravenous drug user and routinely had intercourse with an HIV-infected partner
+- Notice that the evidence implied by a positive test result does not change because of the prevalence of disease in the subject's population, only our interpretation of that evidence changes
+
+---
+
+## Likelihood ratios
+
+- Using Bayes' rule, we have
+  $$
+  P(D ~|~ +) = \frac{P(+~|~D)P(D)}{P(+~|~D)P(D) + P(+~|~D^c)P(D^c)} 
+  $$
+  and
+  $$
+  P(D^c ~|~ +) = \frac{P(+~|~D^c)P(D^c)}{P(+~|~D)P(D) + P(+~|~D^c)P(D^c)}.
+  $$
+
+---
+
+## Likelihood ratios
+
+- Therefore
+$$
+\frac{P(D ~|~ +)}{P(D^c ~|~ +)} = \frac{P(+~|~D)}{P(+~|~D^c)}\times \frac{P(D)}{P(D^c)}
+$$
+i.e.
+$$
+\mbox{post-test odds of }D = DLR_+\times\mbox{pre-test odds of }D
+$$
+- Similarly, $DLR_-$ relates the decrease in the odds of the
+  disease after a negative test result to the odds of disease prior to
+  the test.
+
+---
+
+## HIV example revisited
+
+- Suppose a subject has a positive HIV test
+- $DLR_+ = .997 / (1 - .985) \approx 66$
+- The result of the positive test is that the odds of disease are now 66 times the pretest odds
+- Or, equivalently, the hypothesis of disease is 66 times more supported by the data than the hypothesis of no disease
+
+---
+
+## HIV example revisited
+
+- Suppose that a subject has a negative test result 
+- $DLR_- = (1 - .997) / .985  \approx .003$
+- Therefore, given the negative test, the post-test odds of disease are now $.3\%$ of the pretest odds.
+- Or, the hypothesis of disease is supported only $.003$ times as strongly as the hypothesis of no disease, given the negative test result
+
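Note on the "Likelihood ratios" and "HIV example revisited" slides in the file above: the odds form of Bayes' rule can be verified with a few lines of R. This is a sketch under the slides' stated numbers (sensitivity .997, specificity .985, prevalence .001); the helper `odds_to_prob` and all variable names are hypothetical and introduced here only for illustration.

```r
sensitivity <- 0.997   # P(+ | D)
specificity <- 0.985   # P(- | D^c)
prevalence  <- 0.001   # P(D)

# Diagnostic likelihood ratios as defined in the slides
dlr_pos <- sensitivity / (1 - specificity)   # roughly 66
dlr_neg <- (1 - sensitivity) / specificity   # roughly 0.003

# post-test odds of D = DLR x pre-test odds of D
pre_odds      <- prevalence / (1 - prevalence)
post_odds_pos <- dlr_pos * pre_odds
post_odds_neg <- dlr_neg * pre_odds

# Hypothetical helper: convert odds back to a probability
odds_to_prob <- function(odds) odds / (1 + odds)

# Probability of disease after a positive test; this should agree
# with the positive predictive value from Bayes' formula (about .062)
odds_to_prob(post_odds_pos)
```

Working on the odds scale keeps the evidence from the test (the likelihood ratio) separate from the prevalence, which is the point the slides make: the same test result multiplies whatever pre-test odds apply to the subject's population.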
diff --git a/06_StatisticalInference/01_05_ConditionalProbability/index.pdf b/06_StatisticalInference/01_05_ConditionalProbability/index.pdf
index 7cbac08c4..a5a7edead 100644
Binary files a/06_StatisticalInference/01_05_ConditionalProbability/index.pdf and b/06_StatisticalInference/01_05_ConditionalProbability/index.pdf differ
diff --git a/06_StatisticalInference/02_01_CommonDistributions/index.Rmd b/06_StatisticalInference/02_01_CommonDistributions/index.Rmd
index 7293d2964..586fd0eae 100644
--- a/06_StatisticalInference/02_01_CommonDistributions/index.Rmd
+++ b/06_StatisticalInference/02_01_CommonDistributions/index.Rmd
@@ -1,346 +1,331 @@
----
-title       : Some Common Distributions
-subtitle    : Statistical Inference
-author      : Brian Caffo, Jeff Leek, Roger Peng
-job         : Johns Hopkins Bloomberg School of Public Health
-logo        : bloomberg_shield.png
-framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
-highlighter : highlight.js  # {highlight.js, prettify, highlight}
-hitheme     : tomorrow      # 
-url:
-  lib: ../../librariesNew
-  assets: ../../assets
-widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
-mode        : selfcontained # {standalone, draft}
----
-```{r setup, cache = F, echo = F, message = F, warning = F, tidy = F, results='hide'}
-# make this an external chunk that can be included in any file
-options(width = 100)
-opts_chunk$set(message = F, error = F, warning = F, comment = NA, fig.align = 'center', dpi = 100, tidy = F, cache.path = '.cache/', fig.path = 'fig/')
-
-options(xtable.type = 'html')
-knit_hooks$set(inline = function(x) {
-  if(is.numeric(x)) {
-    round(x, getOption('digits'))
-  } else {
-    paste(as.character(x), collapse = ', ')
-  }
-})
-knit_hooks$set(plot = knitr:::hook_plot_html)
-runif(1)
-```
-
-## The Bernoulli distribution
-
-- The **Bernoulli distribution** arises as the result of a binary outcome
-- Bernoulli random variables take (only) the values 1 and 0 with probabilities of (say) $p$ and $1-p$ respectively
-- The PMF for a Bernoulli random variable $X$ is $$P(X = x) =  p^x (1 - p)^{1 - x}$$
-- The mean of a Bernoulli random variable is $p$ and the variance is $p(1 - p)$
-- If we let $X$ be a Bernoulli random variable, it is typical to call $X=1$ as a "success" and $X=0$ as a "failure"
-
----
-
-## iid Bernoulli trials
-
-- If several iid Bernoulli observations, say $x_1,\ldots, x_n$, are observed the
-likelihood is 
-$$
-  \prod_{i=1}^n p^{x_i} (1 - p)^{1 - x_i} = p^{\sum x_i} (1 - p)^{n - \sum x_i}
-$$
-- Notice that the likelihood depends only on the sum of the $x_i$
-- Because $n$ is fixed and assumed known, this implies that the sample proportion $\sum_i x_i / n$ contains all of the relevant information about $p$
-- We can maximize the Bernoulli likelihood over $p$ to obtain that $\hat p = \sum_i x_i / n$ is the maximum likelihood estimator for $p$
-
----
-## Plotting all possible likelihoods for a small n
-```
-n <- 5
-pvals <- seq(0, 1, length = 1000)
-plot(c(0, 1), c(0, 1.2), type = "n", frame = FALSE, xlab = "p", ylab = "likelihood")
-text((0 : n) /n, 1.1, as.character(0 : n))
-sapply(0 : n, function(x) {
-  phat <- x / n
-  if (x == 0) lines(pvals,  ( (1 - pvals) / (1 - phat) )^(n-x), lwd = 3)
-  else if (x == n) lines(pvals, (pvals / phat) ^ x, lwd = 3)
-  else lines(pvals, (pvals / phat ) ^ x * ( (1 - pvals) / (1 - phat) ) ^ (n-x), lwd = 3) 
-  }
-)
-title(paste("Likelihoods for n = ", n))
-```
-
----
-```{r, fig.height=6, fig.width=6, echo = FALSE, results='hide'}
-n <- 5
-pvals <- seq(0, 1, length = 1000)
-plot(c(0, 1), c(0, 1.2), type = "n", frame = FALSE, xlab = "p", ylab = "likelihood")
-text((0 : n) /n, 1.1, as.character(0 : n))
-sapply(0 : n, function(x) {
-  phat <- x / n
-  if (x == 0) lines(pvals,  ( (1 - pvals) / (1 - phat) )^(n-x), lwd = 3)
-  else if (x == n) lines(pvals, (pvals / phat) ^ x, lwd = 3)
-  else lines(pvals, (pvals / phat ) ^ x * ( (1 - pvals) / (1 - phat) ) ^ (n-x), lwd = 3) 
-  }
-)
-title(paste("Likelihoods for n = ", n))
-```
-
----
-
-## Binomial trials
-
-- The *binomial random variables* are obtained as the sum of iid Bernoulli trials
-- In specific, let $X_1,\ldots,X_n$ be iid Bernoulli$(p)$; then $X = \sum_{i=1}^n X_i$ is a binomial random variable
-- The binomial mass function is
-$$
-P(X = x) = 
-\left(
-\begin{array}{c}
-  n \\ x
-\end{array}
-\right)
-p^x(1 - p)^{n-x}
-$$
-for $x=0,\ldots,n$
-
----
-
-## Choose
-
-- Recall that the notation 
-  $$\left(
-    \begin{array}{c}
-      n \\ x
-    \end{array}
-  \right) = \frac{n!}{x!(n-x)!}
-  $$ (read "$n$ choose $x$") counts the number of ways of selecting $x$ items out of $n$
-  without replacement disregarding the order of the items
-
-$$\left(
-    \begin{array}{c}
-      n \\ 0
-    \end{array}
-  \right) =
-\left(
-    \begin{array}{c}
-      n \\ n
-    \end{array}
-  \right) =  1
-  $$ 
-
----
-
-## Example justification of the binomial likelihood
-
-- Consider the probability of getting $6$ heads out of $10$ coin flips from a coin with success probability $p$ 
-- The probability of getting $6$ heads and $4$ tails in any specific order is
-  $$
-  p^6(1-p)^4
-  $$
-- There are 
-$$\left(
-\begin{array}{c}
-  10 \\ 6
-\end{array}
-\right)
-$$
-possible orders of $6$ heads and $4$ tails
-
----
-
-## Example
-
-- Suppose a friend has $8$ children (oh my!), $7$ of which are girls and none are twins
-- If each gender has an independent $50$% probability for each birth, what's the probability of getting $7$ or more girls out of $8$ births?
-$$\left(
-\begin{array}{c}
-  8 \\ 7
-\end{array}
-\right) .5^{7}(1-.5)^{1}
-+
-\left(
-\begin{array}{c}
-  8 \\ 8
-\end{array}
-\right) .5^{8}(1-.5)^{0} \approx 0.04
-$$
-```{r}
-choose(8, 7) * .5 ^ 8 + choose(8, 8) * .5 ^ 8 
-pbinom(6, size = 8, prob = .5, lower.tail = FALSE)
-```
-
----
-```{r, fig.height=5, fig.width=5}
-plot(pvals, dbinom(7, 8, pvals) / dbinom(7, 8, 7/8) , 
-     lwd = 3, frame = FALSE, type = "l", xlab = "p", ylab = "likelihood")
-```
-
----
-
-## The normal distribution
-
-- A random variable is said to follow a **normal** or **Gaussian** distribution with mean $\mu$ and variance $\sigma^2$ if the associated density is
-  $$
-  (2\pi \sigma^2)^{-1/2}e^{-(x - \mu)^2/2\sigma^2}
-  $$
-  If $X$ a RV with this density then $E[X] = \mu$ and $Var(X) = \sigma^2$
-- We write $X\sim \mbox{N}(\mu, \sigma^2)$
-- When $\mu = 0$ and $\sigma = 1$ the resulting distribution is called **the standard normal distribution**
-- The standard normal density function is labeled $\phi$
-- Standard normal RVs are often labeled $Z$
-
----
-```{r, fig.height=4.5, fig.width=4.5}
-zvals <- seq(-3, 3, length = 1000)
-plot(zvals, dnorm(zvals), 
-     type = "l", lwd = 3, frame = FALSE, xlab = "z", ylab = "Density")
-sapply(-3 : 3, function(k) abline(v = k))
-```
-
----
-
-## Facts about the normal density
-
-- If $X \sim \mbox{N}(\mu,\sigma^2)$ the $Z = \frac{X -\mu}{\sigma}$ is standard normal
-- If $Z$ is standard normal $$X = \mu + \sigma Z \sim \mbox{N}(\mu, \sigma^2)$$
-- The non-standard normal density is $$\phi\{(x - \mu) / \sigma\}/\sigma$$
-
----
-
-## More facts about the normal density
-
-1. Approximately $68\%$, $95\%$ and $99\%$  of the normal density lies within $1$, $2$ and $3$ standard deviations from the mean, respectively
-2. $-1.28$, $-1.645$, $-1.96$ and $-2.33$ are the $10^{th}$, $5^{th}$, $2.5^{th}$ and $1^{st}$ percentiles of the standard normal distribution respectively
-3. By symmetry, $1.28$, $1.645$, $1.96$ and $2.33$ are the $90^{th}$, $95^{th}$, $97.5^{th}$ and $99^{th}$ percentiles of the standard normal distribution respectively
-
----
-
-## Question
-
-- What is the $95^{th}$ percentile of a $N(\mu, \sigma^2)$ distribution? 
-  - Quick answer in R `qnorm(.95, mean = mu, sd = sd)`
-- We want the point $x_0$ so that $P(X \leq x_0) = .95$
-$$
-  \begin{eqnarray*}
-    P(X \leq x_0) & = & P\left(\frac{X - \mu}{\sigma} \leq \frac{x_0 - \mu}{\sigma}\right) \\ \\
-                  & = & P\left(Z \leq \frac{x_0 - \mu}{\sigma}\right) =  .95
-  \end{eqnarray*}
-$$
-- Therefore
-  $$\frac{x_0 - \mu}{\sigma} = 1.645$$
-  or $x_0 = \mu + \sigma 1.645$
-- In general $x_0 = \mu + \sigma z_0$ where $z_0$ is the appropriate standard normal quantile
-
----
-
-## Question
-
-- What is the probability that a $\mbox{N}(\mu,\sigma^2)$ RV is 2 standard deviations above the mean?
-- We want to know
-$$
-  \begin{eqnarray*}
-  P(X > \mu + 2\sigma) & = & 
-P\left(\frac{X -\mu}{\sigma} > \frac{\mu + 2\sigma - \mu}{\sigma}\right)    \\ \\
-& = & P(Z \geq 2 ) \\ \\ 
-& \approx & 2.5\%
-  \end{eqnarray*}
-$$
-
----
-
-## Other properties
-
-- The normal distribution is symmetric and peaked about its mean (therefore the mean, median and mode are all equal)
-- A constant times a normally distributed random variable is also normally distributed (what is the mean and variance?)
-- Sums of normally distributed random variables are again normally distributed even if the variables are dependent (what is the mean and variance?)
-- Sample means of normally distributed random variables are again normally distributed (with what mean and variance?)
-- The square of a *standard normal* random variable follows what is called **chi-squared** distribution 
-- The exponent of a normally distributed random variables follows what is called the **log-normal** distribution 
-- As we will see later, many random variables, properly normalized, *limit* to a normal distribution
-
----
-
-## Final thoughts on normal likelihoods
-- The MLE for $\mu$ is $\bar X$.
-- The MLE for $\sigma^2$ is
-  $$
-  \frac{\sum_{i=1}^n (X_i - \bar X)^2}{n}
-  $$
-  (Which is the biased version of the sample variance.)
-- The MLE of $\sigma$ is simply the square root of this
-  estimate
-
----
-## The Poisson distribution
-* Used to model counts
-* The Poisson mass function is
-$$
-P(X = x; \lambda) = \frac{\lambda^x e^{-\lambda}}{x!}
-$$
-for $x=0,1,\ldots$
-* The mean of this distribution is $\lambda$
-* The variance of this distribution is $\lambda$
-* Notice that $x$ ranges from $0$ to $\infty$
-
----
-## Some uses for the Poisson distribution
-* Modeling event/time data
-* Modeling radioactive decay
-* Modeling survival data
-* Modeling unbounded count data 
-* Modeling contingency tables
-* Approximating binomials when $n$ is large and $p$ is small
-
----
-## Poisson derivation
-* $\lambda$ is the mean number of events per unit time
-* Let $h$ be very small 
-* Suppose we assume that 
-  * Prob. of an event in an interval of length $h$ is $\lambda h$
-    while the prob. of more than one event is negligible
-  * Whether or not an event occurs in one small interval
-    does not impact whether or not an event occurs in another
-    small interval
-then, the number of events per unit time is Poisson with mean $\lambda$ 
-
----
-## Rates and Poisson random variables
-* Poisson random variables are used to model rates
-* $X \sim Poisson(\lambda t)$ where 
-  * $\lambda = E[X / t]$ is the expected count per unit of time
-  * $t$ is the total monitoring time
-
----
-## Poisson approximation to the binomial
-* When $n$ is large and $p$ is small the Poisson distribution
-  is an accurate approximation to the binomial distribution
-* Notation
-  * $\lambda = n p$
-  * $X \sim \mbox{Binomial}(n, p)$, $\lambda = n p$ and
-  * $n$ gets large 
-  * $p$ gets small
-  * $\lambda$ stays constant
-
----
-## Example
-The number of people that show up at a bus stop is Poisson with
-a mean of $2.5$ per hour.
-
-If watching the bus stop for 4 hours, what is the probability that $3$
-or fewer people show up for the whole time?
-
-```{r}
-ppois(3, lambda = 2.5 * 4)
-```
-
----
-## Example, Poisson approximation to the binomial
-
-We flip a coin with success probablity $0.01$ five hundred times. 
-
-What's the probability of 2 or fewer successes?
-
-```{r}
-pbinom(2, size = 500, prob = .01)
-ppois(2, lambda=500 * .01)
-```
-
+---
+title       : Some Common Distributions
+subtitle    : Statistical Inference
+author      : Brian Caffo, Jeff Leek, Roger Peng
+job         : Johns Hopkins Bloomberg School of Public Health
+logo        : bloomberg_shield.png
+framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
+highlighter : highlight.js  # {highlight.js, prettify, highlight}
+hitheme     : tomorrow      # 
+url:
+  lib: ../../librariesNew
+  assets: ../../assets
+widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
+mode        : selfcontained # {standalone, draft}
+---
+
+
+## The Bernoulli distribution
+
+- The **Bernoulli distribution** arises as the result of a binary outcome
+- Bernoulli random variables take (only) the values 1 and 0 with probabilities of (say) $p$ and $1-p$ respectively
+- The PMF for a Bernoulli random variable $X$ is $$P(X = x) =  p^x (1 - p)^{1 - x}$$
+- The mean of a Bernoulli random variable is $p$ and the variance is $p(1 - p)$
+- If we let $X$ be a Bernoulli random variable, it is typical to call $X=1$ a "success" and $X=0$ a "failure"
+
+---
+
+## iid Bernoulli trials
+
+- If several iid Bernoulli observations, say $x_1,\ldots, x_n$, are observed, the
+likelihood is 
+$$
+  \prod_{i=1}^n p^{x_i} (1 - p)^{1 - x_i} = p^{\sum x_i} (1 - p)^{n - \sum x_i}
+$$
+- Notice that the likelihood depends only on the sum of the $x_i$
+- Because $n$ is fixed and assumed known, this implies that the sample proportion $\sum_i x_i / n$ contains all of the relevant information about $p$
+- We can maximize the Bernoulli likelihood over $p$ to obtain that $\hat p = \sum_i x_i / n$ is the maximum likelihood estimator for $p$
+
+---
+## Plotting all possible likelihoods for a small n
+```
+n <- 5
+pvals <- seq(0, 1, length = 1000)
+plot(c(0, 1), c(0, 1.2), type = "n", frame = FALSE, xlab = "p", ylab = "likelihood")
+text((0 : n) /n, 1.1, as.character(0 : n))
+sapply(0 : n, function(x) {
+  phat <- x / n
+  if (x == 0) lines(pvals,  ( (1 - pvals) / (1 - phat) )^(n-x), lwd = 3)
+  else if (x == n) lines(pvals, (pvals / phat) ^ x, lwd = 3)
+  else lines(pvals, (pvals / phat ) ^ x * ( (1 - pvals) / (1 - phat) ) ^ (n-x), lwd = 3) 
+  }
+)
+title(paste("Likelihoods for n = ", n))
+```
+
+---
+```{r, fig.height=6, fig.width=6, echo = FALSE, results='hide'}
+n <- 5
+pvals <- seq(0, 1, length = 1000)
+plot(c(0, 1), c(0, 1.2), type = "n", frame = FALSE, xlab = "p", ylab = "likelihood")
+text((0 : n) /n, 1.1, as.character(0 : n))
+sapply(0 : n, function(x) {
+  phat <- x / n
+  if (x == 0) lines(pvals,  ( (1 - pvals) / (1 - phat) )^(n-x), lwd = 3)
+  else if (x == n) lines(pvals, (pvals / phat) ^ x, lwd = 3)
+  else lines(pvals, (pvals / phat ) ^ x * ( (1 - pvals) / (1 - phat) ) ^ (n-x), lwd = 3) 
+  }
+)
+title(paste("Likelihoods for n = ", n))
+```
+
+---
+
+## Binomial trials
+
+- The *binomial random variables* are obtained as the sum of iid Bernoulli trials
+- Specifically, let $X_1,\ldots,X_n$ be iid Bernoulli$(p)$; then $X = \sum_{i=1}^n X_i$ is a binomial random variable
+- The binomial mass function is
+$$
+P(X = x) = 
+\left(
+\begin{array}{c}
+  n \\ x
+\end{array}
+\right)
+p^x(1 - p)^{n-x}
+$$
+for $x=0,\ldots,n$
+
+---
+
+## Choose
+
+- Recall that the notation 
+  $$\left(
+    \begin{array}{c}
+      n \\ x
+    \end{array}
+  \right) = \frac{n!}{x!(n-x)!}
+  $$ (read "$n$ choose $x$") counts the number of ways of selecting $x$ items out of $n$
+  without replacement disregarding the order of the items
+
+$$\left(
+    \begin{array}{c}
+      n \\ 0
+    \end{array}
+  \right) =
+\left(
+    \begin{array}{c}
+      n \\ n
+    \end{array}
+  \right) =  1
+  $$ 
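+
+A small sketch of the formula above, using base R's `choose` and `factorial`. The values of `n` and `x` are arbitrary and chosen only for illustration.
+
+```{r}
+n <- 10; x <- 6
+choose(n, x)                                        # built-in binomial coefficient
+factorial(n) / (factorial(x) * factorial(n - x))    # same value from the definition
+c(choose(n, 0), choose(n, n))                       # both boundary cases equal 1
+```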
+
+---
+
+## Example justification of the binomial likelihood
+
+- Consider the probability of getting $6$ heads out of $10$ coin flips from a coin with success probability $p$ 
+- The probability of getting $6$ heads and $4$ tails in any specific order is
+  $$
+  p^6(1-p)^4
+  $$
+- There are 
+$$\left(
+\begin{array}{c}
+  10 \\ 6
+\end{array}
+\right)
+$$
+possible orders of $6$ heads and $4$ tails
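+
+The argument above can be checked numerically: multiplying the number of orderings by the probability of any one ordering should reproduce R's `dbinom`. The value of `p` below is an arbitrary assumption for illustration.
+
+```{r}
+p <- 0.3                              # hypothetical success probability
+by_hand <- choose(10, 6) * p^6 * (1 - p)^4   # orderings times probability of each
+built_in <- dbinom(6, size = 10, prob = p)   # binomial mass function
+c(by_hand = by_hand, built_in = built_in)
+```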
+
+---
+
+## Example
+
+- Suppose a friend has $8$ children (oh my!), $7$ of whom are girls and none of whom are twins
+- If each gender has an independent $50$% probability for each birth, what's the probability of getting $7$ or more girls out of $8$ births?
+$$\left(
+\begin{array}{c}
+  8 \\ 7
+\end{array}
+\right) .5^{7}(1-.5)^{1}
++
+\left(
+\begin{array}{c}
+  8 \\ 8
+\end{array}
+\right) .5^{8}(1-.5)^{0} \approx 0.04
+$$
+```{r}
+choose(8, 7) * .5 ^ 8 + choose(8, 8) * .5 ^ 8 
+pbinom(6, size = 8, prob = .5, lower.tail = FALSE)
+```
+
+---
+```{r, fig.height=5, fig.width=5}
+plot(pvals, dbinom(7, 8, pvals) / dbinom(7, 8, 7/8) , 
+     lwd = 3, frame = FALSE, type = "l", xlab = "p", ylab = "likelihood")
+```
+
+---
+
+## The normal distribution
+
+- A random variable is said to follow a **normal** or **Gaussian** distribution with mean $\mu$ and variance $\sigma^2$ if the associated density is
+  $$
+  (2\pi \sigma^2)^{-1/2}e^{-(x - \mu)^2/(2\sigma^2)}
+  $$
+  If $X$ is a RV with this density, then $E[X] = \mu$ and $Var(X) = \sigma^2$
+- We write $X\sim \mbox{N}(\mu, \sigma^2)$
+- When $\mu = 0$ and $\sigma = 1$ the resulting distribution is called **the standard normal distribution**
+- The standard normal density function is labeled $\phi$
+- Standard normal RVs are often labeled $Z$
+
+---
+```{r, fig.height=5, fig.width=5, fig.align='center', results='hide'}
+zvals <- seq(-3, 3, length = 1000)
+plot(zvals, dnorm(zvals), 
+     type = "l", lwd = 3, frame = FALSE, xlab = "z", ylab = "Density")
+sapply(-3 : 3, function(k) abline(v = k))
+```
+
+---
+
+## Facts about the normal density
+
+- If $X \sim \mbox{N}(\mu,\sigma^2)$ then $Z = \frac{X -\mu}{\sigma}$ is standard normal
+- If $Z$ is standard normal $$X = \mu + \sigma Z \sim \mbox{N}(\mu, \sigma^2)$$
+- The non-standard normal density is $$\phi\{(x - \mu) / \sigma\}/\sigma$$
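+
+A minimal sketch of the last fact: the standard normal density `dnorm`, rescaled as above, should agree with the non-standard density that `dnorm` computes directly. The values of `mu`, `sigma` and `x` are arbitrary assumptions.
+
+```{r}
+mu <- 10; sigma <- 2; x <- 11.5               # hypothetical values
+rescaled <- dnorm((x - mu) / sigma) / sigma   # phi{(x - mu) / sigma} / sigma
+direct <- dnorm(x, mean = mu, sd = sigma)     # built-in non-standard normal density
+c(rescaled = rescaled, direct = direct)
+```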
+
+---
+
+## More facts about the normal density
+
+1. Approximately $68\%$, $95\%$ and $99.7\%$  of the normal density lies within $1$, $2$ and $3$ standard deviations from the mean, respectively
+2. $-1.28$, $-1.645$, $-1.96$ and $-2.33$ are the $10^{th}$, $5^{th}$, $2.5^{th}$ and $1^{st}$ percentiles of the standard normal distribution respectively
+3. By symmetry, $1.28$, $1.645$, $1.96$ and $2.33$ are the $90^{th}$, $95^{th}$, $97.5^{th}$ and $99^{th}$ percentiles of the standard normal distribution respectively
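+
+These rules of thumb can be reproduced with `qnorm` and `pnorm`; the sketch below is only a check of the numbers quoted above, not an additional result.
+
+```{r}
+lower <- qnorm(c(.10, .05, .025, .01))    # lower-tail standard normal quantiles
+upper <- qnorm(c(.90, .95, .975, .99))    # upper-tail quantiles, mirror images of the above
+within <- pnorm(1:3) - pnorm(-(1:3))      # mass within 1, 2 and 3 standard deviations
+round(rbind(lower, upper), 3)
+round(within, 3)
+```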
+
+---
+
+## Question
+
+- What is the $95^{th}$ percentile of a $N(\mu, \sigma^2)$ distribution? 
+  - Quick answer in R `qnorm(.95, mean = mu, sd = sigma)`
+- We want the point $x_0$ so that $P(X \leq x_0) = .95$
+$$
+  \begin{eqnarray*}
+    P(X \leq x_0) & = & P\left(\frac{X - \mu}{\sigma} \leq \frac{x_0 - \mu}{\sigma}\right) \\ \\
+                  & = & P\left(Z \leq \frac{x_0 - \mu}{\sigma}\right) =  .95
+  \end{eqnarray*}
+$$
+- Therefore
+  $$\frac{x_0 - \mu}{\sigma} = 1.645$$
+  or $x_0 = \mu + 1.645 \sigma$
+- In general $x_0 = \mu + \sigma z_0$ where $z_0$ is the appropriate standard normal quantile
+
+---
+
+## Question
+
+- What is the probability that a $\mbox{N}(\mu,\sigma^2)$ RV is more than 2 standard deviations above the mean?
+- We want to know
+$$
+  \begin{eqnarray*}
+  P(X > \mu + 2\sigma) & = & 
+P\left(\frac{X -\mu}{\sigma} > \frac{\mu + 2\sigma - \mu}{\sigma}\right)    \\ \\
+& = & P(Z \geq 2 ) \\ \\ 
+& \approx & 2.5\%
+  \end{eqnarray*}
+$$
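+
+The same tail probability can be computed with `pnorm`. The sketch below also shows that the answer does not depend on the particular mean and standard deviation; the values of `mu` and `sigma` are arbitrary assumptions.
+
+```{r}
+standard <- pnorm(2, lower.tail = FALSE)      # P(Z > 2)
+mu <- 5; sigma <- 3                           # hypothetical values
+general <- pnorm(mu + 2 * sigma, mean = mu, sd = sigma, lower.tail = FALSE)
+c(standard = standard, general = general)
+```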
+
+---
+
+## Other properties
+
+- The normal distribution is symmetric and peaked about its mean (therefore the mean, median and mode are all equal)
+- A constant times a normally distributed random variable is also normally distributed (what is the mean and variance?)
+- Sums of jointly normally distributed random variables are again normally distributed, even if the variables are dependent (what is the mean and variance?)
+- Sample means of normally distributed random variables are again normally distributed (with what mean and variance?)
+- The square of a *standard normal* random variable follows what is called a **chi-squared** distribution (with one degree of freedom)
+- The exponential of a normally distributed random variable follows what is called the **log-normal** distribution 
+- As we will see later, many random variables, properly normalized, *limit* to a normal distribution
+
+---
+
+## Final thoughts on normal likelihoods
+- The MLE for $\mu$ is $\bar X$.
+- The MLE for $\sigma^2$ is
+  $$
+  \frac{\sum_{i=1}^n (X_i - \bar X)^2}{n}
+  $$
+  (Which is the biased version of the sample variance.)
+- The MLE of $\sigma$ is simply the square root of this
+  estimate
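+
+A short sketch of these estimates on simulated data. The sample size, mean and standard deviation used in `rnorm` are assumptions made only for illustration. Note that R's `var` divides by $n - 1$, so it must be rescaled to obtain the MLE.
+
+```{r}
+set.seed(1)
+x <- rnorm(20, mean = 3, sd = 2)              # simulated data
+n <- length(x)
+mu_hat <- mean(x)                             # MLE of mu
+sigma2_hat <- sum((x - mu_hat)^2) / n         # MLE of sigma^2 (divides by n)
+c(mu_hat = mu_hat, sigma2_hat = sigma2_hat,
+  rescaled_var = var(x) * (n - 1) / n,        # same value from the unbiased estimate
+  sigma_hat = sqrt(sigma2_hat))               # MLE of sigma
+```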
+
+---
+## The Poisson distribution
+* Used to model counts
+* The Poisson mass function is
+$$
+P(X = x; \lambda) = \frac{\lambda^x e^{-\lambda}}{x!}
+$$
+for $x=0,1,\ldots$
+* The mean of this distribution is $\lambda$
+* The variance of this distribution is $\lambda$
+* Notice that $x$ ranges from $0$ to $\infty$
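+
+As a rough numerical check that the mean and variance both equal $\lambda$, the sketch below sums over the mass function `dpois`. The rate `lambda` is an arbitrary assumption, and the infinite sum is truncated at 100, beyond which the mass is negligible for this rate.
+
+```{r}
+lambda <- 2.5                         # hypothetical rate
+x <- 0:100                            # truncated support
+pmf <- dpois(x, lambda)
+m <- sum(x * pmf)                     # mean
+v <- sum((x - m)^2 * pmf)             # variance
+c(mean = m, variance = v)
+```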
+
+---
+## Some uses for the Poisson distribution
+* Modeling event/time data
+* Modeling radioactive decay
+* Modeling survival data
+* Modeling unbounded count data 
+* Modeling contingency tables
+* Approximating binomials when $n$ is large and $p$ is small
+
+---
+## Poisson derivation
+* $\lambda$ is the mean number of events per unit time
+* Let $h$ be very small 
+* Suppose we assume that 
+  * Prob. of an event in an interval of length $h$ is $\lambda h$
+    while the prob. of more than one event is negligible
+  * Whether or not an event occurs in one small interval
+    does not impact whether or not an event occurs in another
+    small interval
+* Then the number of events per unit time is Poisson with mean $\lambda$ 
+
+---
+## Rates and Poisson random variables
+* Poisson random variables are used to model rates
+* $X \sim Poisson(\lambda t)$ where 
+  * $\lambda = E[X / t]$ is the expected count per unit of time
+  * $t$ is the total monitoring time
+
+---
+## Poisson approximation to the binomial
+* When $n$ is large and $p$ is small the Poisson distribution
+  is an accurate approximation to the binomial distribution
+* Notation
+  * $X \sim \mbox{Binomial}(n, p)$ and $\lambda = n p$
+  * $n$ gets large 
+  * $p$ gets small
+  * $\lambda$ stays constant
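+
+A minimal sketch of this limit: holding $\lambda = np$ fixed while $n$ grows, the largest gap between the binomial and Poisson mass functions should shrink. The value of `lambda`, the grid of `n` values and the range `0:20` are arbitrary choices for illustration.
+
+```{r}
+lambda <- 5
+ns <- c(10, 100, 1000, 10000)
+# largest absolute difference between the two mass functions over x = 0, ..., 20
+errs <- sapply(ns, function(n)
+  max(abs(dbinom(0:20, size = n, prob = lambda / n) - dpois(0:20, lambda))))
+data.frame(n = ns, max_abs_diff = errs)
+```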
+
+---
+## Example
+The number of people that show up at a bus stop is Poisson with
+a mean of $2.5$ per hour.
+
+If watching the bus stop for 4 hours, what is the probability that $3$
+or fewer people show up for the whole time?
+
+```{r}
+ppois(3, lambda = 2.5 * 4)
+```
+
+---
+## Example, Poisson approximation to the binomial
+
+We flip a coin with success probability $0.01$ five hundred times. 
+
+What's the probability of 2 or fewer successes?
+
+```{r}
+pbinom(2, size = 500, prob = .01)
+ppois(2, lambda=500 * .01)
+```
+
diff --git a/06_StatisticalInference/02_01_CommonDistributions/index.html b/06_StatisticalInference/02_01_CommonDistributions/index.html
index be8769a08..8b616ccc0 100644
--- a/06_StatisticalInference/02_01_CommonDistributions/index.html
+++ b/06_StatisticalInference/02_01_CommonDistributions/index.html
@@ -1,750 +1,727 @@
-<!DOCTYPE html>
-<html>
-<head>
-  <title>Some Common Distributions</title>
-  <meta charset="utf-8">
-  <meta name="description" content="Some Common Distributions">
-  <meta name="author" content="Brian Caffo, Jeff Leek, Roger Peng">
-  <meta name="generator" content="slidify" />
-  <meta name="apple-mobile-web-app-capable" content="yes">
-  <meta http-equiv="X-UA-Compatible" content="chrome=1">
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/default.css" media="all" >
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/phone.css" 
-    media="only screen and (max-device-width: 480px)" >
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/slidify.css" >
-  <link rel="stylesheet" href="../../librariesNew/highlighters/highlight.js/css/tomorrow.css" />
-  <base target="_blank"> <!-- This amazingness opens all links in a new tab. -->  
-  
-  <!-- Grab CDN jQuery, fall back to local if offline -->
-  <script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.7.min.js"></script>
-  <script>window.jQuery || document.write('<script src="../../librariesNew/widgets/quiz/js/jquery.js"><\/script>')</script> 
-  <script data-main="../../librariesNew/frameworks/io2012/js/slides" 
-    src="../../librariesNew/frameworks/io2012/js/require-1.0.8.min.js">
-  </script>
-  
-  
-
-</head>
-<body style="opacity: 0">
-  <slides class="layout-widescreen">
-    
-    <!-- LOGO SLIDE -->
-        <slide class="title-slide segue nobackground">
-  <aside class="gdbar">
-    <img src="../../assets/img/bloomberg_shield.png">
-  </aside>
-  <hgroup class="auto-fadein">
-    <h1>Some Common Distributions</h1>
-    <h2>Statistical Inference</h2>
-    <p>Brian Caffo, Jeff Leek, Roger Peng<br/>Johns Hopkins Bloomberg School of Public Health</p>
-  </hgroup>
-  <article></article>  
-</slide>
-    
-
-    <!-- SLIDES -->
-    <slide class="" id="slide-1" style="background:;">
-  <hgroup>
-    <h2>The Bernoulli distribution</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>The <strong>Bernoulli distribution</strong> arises as the result of a binary outcome</li>
-<li>Bernoulli random variables take (only) the values 1 and 0 with probabilities of (say) \(p\) and \(1-p\) respectively</li>
-<li>The PMF for a Bernoulli random variable \(X\) is \[P(X = x) =  p^x (1 - p)^{1 - x}\]</li>
-<li>The mean of a Bernoulli random variable is \(p\) and the variance is \(p(1 - p)\)</li>
-<li>If we let \(X\) be a Bernoulli random variable, it is typical to call \(X=1\) as a &quot;success&quot; and \(X=0\) as a &quot;failure&quot;</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-2" style="background:;">
-  <hgroup>
-    <h2>iid Bernoulli trials</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>If several iid Bernoulli observations, say \(x_1,\ldots, x_n\), are observed the
-likelihood is 
-\[
-\prod_{i=1}^n p^{x_i} (1 - p)^{1 - x_i} = p^{\sum x_i} (1 - p)^{n - \sum x_i}
-\]</li>
-<li>Notice that the likelihood depends only on the sum of the \(x_i\)</li>
-<li>Because \(n\) is fixed and assumed known, this implies that the sample proportion \(\sum_i x_i / n\) contains all of the relevant information about \(p\)</li>
-<li>We can maximize the Bernoulli likelihood over \(p\) to obtain that \(\hat p = \sum_i x_i / n\) is the maximum likelihood estimator for \(p\)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-3" style="background:;">
-  <hgroup>
-    <h2>Plotting all possible likelihoods for a small n</h2>
-  </hgroup>
-  <article data-timings="">
-    <pre><code>n &lt;- 5
-pvals &lt;- seq(0, 1, length = 1000)
-plot(c(0, 1), c(0, 1.2), type = &quot;n&quot;, frame = FALSE, xlab = &quot;p&quot;, ylab = &quot;likelihood&quot;)
-text((0 : n) /n, 1.1, as.character(0 : n))
-sapply(0 : n, function(x) {
-  phat &lt;- x / n
-  if (x == 0) lines(pvals,  ( (1 - pvals) / (1 - phat) )^(n-x), lwd = 3)
-  else if (x == n) lines(pvals, (pvals / phat) ^ x, lwd = 3)
-  else lines(pvals, (pvals / phat ) ^ x * ( (1 - pvals) / (1 - phat) ) ^ (n-x), lwd = 3) 
-  }
-)
-title(paste(&quot;Likelihoods for n = &quot;, n))
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-4" style="background:;">
-  <article data-timings="">
-    <div class="rimage center"><img src="fig/unnamed-chunk-1.png" title="plot of chunk unnamed-chunk-1" alt="plot of chunk unnamed-chunk-1" class="plot" /></div>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-5" style="background:;">
-  <hgroup>
-    <h2>Binomial trials</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>The <em>binomial random variables</em> are obtained as the sum of iid Bernoulli trials</li>
-<li>In specific, let \(X_1,\ldots,X_n\) be iid Bernoulli\((p)\); then \(X = \sum_{i=1}^n X_i\) is a binomial random variable</li>
-<li>The binomial mass function is
-\[
-P(X = x) = 
-\left(
-\begin{array}{c}
-n \\ x
-\end{array}
-\right)
-p^x(1 - p)^{n-x}
-\]
-for \(x=0,\ldots,n\)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-6" style="background:;">
-  <hgroup>
-    <h2>Choose</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Recall that the notation 
-\[\left(
-\begin{array}{c}
-  n \\ x
-\end{array}
-\right) = \frac{n!}{x!(n-x)!}
-\] (read &quot;\(n\) choose \(x\)&quot;) counts the number of ways of selecting \(x\) items out of \(n\)
-without replacement disregarding the order of the items</li>
-</ul>
-
-<p>\[\left(
-    \begin{array}{c}
-      n \\ 0
-    \end{array}
-  \right) =
-\left(
-    \begin{array}{c}
-      n \\ n
-    \end{array}
-  \right) =  1
-  \] </p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-7" style="background:;">
-  <hgroup>
-    <h2>Example justification of the binomial likelihood</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Consider the probability of getting \(6\) heads out of \(10\) coin flips from a coin with success probability \(p\) </li>
-<li>The probability of getting \(6\) heads and \(4\) tails in any specific order is
-\[
-p^6(1-p)^4
-\]</li>
-<li>There are 
-\[\left(
-\begin{array}{c}
-10 \\ 6
-\end{array}
-\right)
-\]
-possible orders of \(6\) heads and \(4\) tails</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-8" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Suppose a friend has \(8\) children (oh my!), \(7\) of which are girls and none are twins</li>
-<li>If each gender has an independent \(50\)% probability for each birth, what&#39;s the probability of getting \(7\) or more girls out of \(8\) births?
-\[\left(
-\begin{array}{c}
-8 \\ 7
-\end{array}
-\right) .5^{7}(1-.5)^{1}
-+
-\left(
-\begin{array}{c}
-8 \\ 8
-\end{array}
-\right) .5^{8}(1-.5)^{0} \approx 0.04
-\]</li>
-</ul>
-
-<pre><code class="r">choose(8, 7) * .5 ^ 8 + choose(8, 8) * .5 ^ 8 
-</code></pre>
-
-<pre><code>[1] 0.03516
-</code></pre>
-
-<pre><code class="r">pbinom(6, size = 8, prob = .5, lower.tail = FALSE)
-</code></pre>
-
-<pre><code>[1] 0.03516
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-9" style="background:;">
-  <article data-timings="">
-    <pre><code class="r">plot(pvals, dbinom(7, 8, pvals) / dbinom(7, 8, 7/8) , 
-     lwd = 3, frame = FALSE, type = &quot;l&quot;, xlab = &quot;p&quot;, ylab = &quot;likelihood&quot;)
-</code></pre>
-
-<div class="rimage center"><img src="fig/unnamed-chunk-3.png" title="plot of chunk unnamed-chunk-3" alt="plot of chunk unnamed-chunk-3" class="plot" /></div>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-10" style="background:;">
-  <hgroup>
-    <h2>The normal distribution</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>A random variable is said to follow a <strong>normal</strong> or <strong>Gaussian</strong> distribution with mean \(\mu\) and variance \(\sigma^2\) if the associated density is
-\[
-(2\pi \sigma^2)^{-1/2}e^{-(x - \mu)^2/2\sigma^2}
-\]
-If \(X\) a RV with this density then \(E[X] = \mu\) and \(Var(X) = \sigma^2\)</li>
-<li>We write \(X\sim \mbox{N}(\mu, \sigma^2)\)</li>
-<li>When \(\mu = 0\) and \(\sigma = 1\) the resulting distribution is called <strong>the standard normal distribution</strong></li>
-<li>The standard normal density function is labeled \(\phi\)</li>
-<li>Standard normal RVs are often labeled \(Z\)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-11" style="background:;">
-  <article data-timings="">
-    <pre><code class="r">zvals &lt;- seq(-3, 3, length = 1000)
-plot(zvals, dnorm(zvals), 
-     type = &quot;l&quot;, lwd = 3, frame = FALSE, xlab = &quot;z&quot;, ylab = &quot;Density&quot;)
-sapply(-3 : 3, function(k) abline(v = k))
-</code></pre>
-
-<div class="rimage center"><img src="fig/unnamed-chunk-4.png" title="plot of chunk unnamed-chunk-4" alt="plot of chunk unnamed-chunk-4" class="plot" /></div>
-
-<pre><code>[[1]]
-NULL
-
-[[2]]
-NULL
-
-[[3]]
-NULL
-
-[[4]]
-NULL
-
-[[5]]
-NULL
-
-[[6]]
-NULL
-
-[[7]]
-NULL
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-12" style="background:;">
-  <hgroup>
-    <h2>Facts about the normal density</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>If \(X \sim \mbox{N}(\mu,\sigma^2)\) the \(Z = \frac{X -\mu}{\sigma}\) is standard normal</li>
-<li>If \(Z\) is standard normal \[X = \mu + \sigma Z \sim \mbox{N}(\mu, \sigma^2)\]</li>
-<li>The non-standard normal density is \[\phi\{(x - \mu) / \sigma\}/\sigma\]</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-13" style="background:;">
-  <hgroup>
-    <h2>More facts about the normal density</h2>
-  </hgroup>
-  <article data-timings="">
-    <ol>
-<li>Approximately \(68\%\), \(95\%\) and \(99\%\)  of the normal density lies within \(1\), \(2\) and \(3\) standard deviations from the mean, respectively</li>
-<li>\(-1.28\), \(-1.645\), \(-1.96\) and \(-2.33\) are the \(10^{th}\), \(5^{th}\), \(2.5^{th}\) and \(1^{st}\) percentiles of the standard normal distribution respectively</li>
-<li>By symmetry, \(1.28\), \(1.645\), \(1.96\) and \(2.33\) are the \(90^{th}\), \(95^{th}\), \(97.5^{th}\) and \(99^{th}\) percentiles of the standard normal distribution respectively</li>
-</ol>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-14" style="background:;">
-  <hgroup>
-    <h2>Question</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>What is the \(95^{th}\) percentile of a \(N(\mu, \sigma^2)\) distribution? 
-
-<ul>
-<li>Quick answer in R <code>qnorm(.95, mean = mu, sd = sd)</code></li>
-</ul></li>
-<li>We want the point \(x_0\) so that \(P(X \leq x_0) = .95\)
-\[
-\begin{eqnarray*}
-P(X \leq x_0) & = & P\left(\frac{X - \mu}{\sigma} \leq \frac{x_0 - \mu}{\sigma}\right) \\ \\
-              & = & P\left(Z \leq \frac{x_0 - \mu}{\sigma}\right) =  .95
-\end{eqnarray*}
-\]</li>
-<li>Therefore
-\[\frac{x_0 - \mu}{\sigma} = 1.645\]
-or \(x_0 = \mu + \sigma 1.645\)</li>
-<li>In general \(x_0 = \mu + \sigma z_0\) where \(z_0\) is the appropriate standard normal quantile</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-15" style="background:;">
-  <hgroup>
-    <h2>Question</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>What is the probability that a \(\mbox{N}(\mu,\sigma^2)\) RV is 2 standard deviations above the mean?</li>
-<li>We want to know
-\[
-\begin{eqnarray*}
-P(X > \mu + 2\sigma) & = & 
-P\left(\frac{X -\mu}{\sigma} > \frac{\mu + 2\sigma - \mu}{\sigma}\right)    \\ \\
-& = & P(Z \geq 2 ) \\ \\ 
-& \approx & 2.5\%
-\end{eqnarray*}
-\]</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-16" style="background:;">
-  <hgroup>
-    <h2>Other properties</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>The normal distribution is symmetric and peaked about its mean (therefore the mean, median and mode are all equal)</li>
-<li>A constant times a normally distributed random variable is also normally distributed (what is the mean and variance?)</li>
-<li>Sums of normally distributed random variables are again normally distributed even if the variables are dependent (what is the mean and variance?)</li>
-<li>Sample means of normally distributed random variables are again normally distributed (with what mean and variance?)</li>
-<li>The square of a <em>standard normal</em> random variable follows what is called <strong>chi-squared</strong> distribution </li>
-<li>The exponent of a normally distributed random variables follows what is called the <strong>log-normal</strong> distribution </li>
-<li>As we will see later, many random variables, properly normalized, <em>limit</em> to a normal distribution</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-17" style="background:;">
-  <hgroup>
-    <h2>Final thoughts on normal likelihoods</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>The MLE for \(\mu\) is \(\bar X\).</li>
-<li>The MLE for \(\sigma^2\) is
-\[
-\frac{\sum_{i=1}^n (X_i - \bar X)^2}{n}
-\]
-(Which is the biased version of the sample variance.)</li>
-<li>The MLE of \(\sigma\) is simply the square root of this
-estimate</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-18" style="background:;">
-  <hgroup>
-    <h2>The Poisson distribution</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Used to model counts</li>
-<li>The Poisson mass function is
-\[
-P(X = x; \lambda) = \frac{\lambda^x e^{-\lambda}}{x!}
-\]
-for \(x=0,1,\ldots\)</li>
-<li>The mean of this distribution is \(\lambda\)</li>
-<li>The variance of this distribution is \(\lambda\)</li>
-<li>Notice that \(x\) ranges from \(0\) to \(\infty\)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-19" style="background:;">
-  <hgroup>
-    <h2>Some uses for the Poisson distribution</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Modeling event/time data</li>
-<li>Modeling radioactive decay</li>
-<li>Modeling survival data</li>
-<li>Modeling unbounded count data </li>
-<li>Modeling contingency tables</li>
-<li>Approximating binomials when \(n\) is large and \(p\) is small</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-20" style="background:;">
-  <hgroup>
-    <h2>Poisson derivation</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>\(\lambda\) is the mean number of events per unit time</li>
-<li>Let \(h\) be very small </li>
-<li>Suppose we assume that 
-
-<ul>
-<li>Prob. of an event in an interval of length \(h\) is \(\lambda h\)
-while the prob. of more than one event is negligible</li>
-<li>Whether or not an event occurs in one small interval
-does not impact whether or not an event occurs in another
-small interval
-then, the number of events per unit time is Poisson with mean \(\lambda\) </li>
-</ul></li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-21" style="background:;">
-  <hgroup>
-    <h2>Rates and Poisson random variables</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Poisson random variables are used to model rates</li>
-<li>\(X \sim Poisson(\lambda t)\) where 
-
-<ul>
-<li>\(\lambda = E[X / t]\) is the expected count per unit of time</li>
-<li>\(t\) is the total monitoring time</li>
-</ul></li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-22" style="background:;">
-  <hgroup>
-    <h2>Poisson approximation to the binomial</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>When \(n\) is large and \(p\) is small the Poisson distribution
-is an accurate approximation to the binomial distribution</li>
-<li>Notation
-
-<ul>
-<li>\(\lambda = n p\)</li>
-<li>\(X \sim \mbox{Binomial}(n, p)\), \(\lambda = n p\) and</li>
-<li>\(n\) gets large </li>
-<li>\(p\) gets small</li>
-<li>\(\lambda\) stays constant</li>
-</ul></li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-23" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <p>The number of people that show up at a bus stop is Poisson with
-a mean of \(2.5\) per hour.</p>
-
-<p>If watching the bus stop for 4 hours, what is the probability that \(3\)
-or fewer people show up for the whole time?</p>
-
-<pre><code class="r">ppois(3, lambda = 2.5 * 4)
-</code></pre>
-
-<pre><code>[1] 0.01034
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-24" style="background:;">
-  <hgroup>
-    <h2>Example, Poisson approximation to the binomial</h2>
-  </hgroup>
-  <article data-timings="">
-    <p>We flip a coin with success probablity \(0.01\) five hundred times. </p>
-
-<p>What&#39;s the probability of 2 or fewer successes?</p>
-
-<pre><code class="r">pbinom(2, size = 500, prob = .01)
-</code></pre>
-
-<pre><code>[1] 0.1234
-</code></pre>
-
-<pre><code class="r">ppois(2, lambda=500 * .01)
-</code></pre>
-
-<pre><code>[1] 0.1247
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-    <slide class="backdrop"></slide>
-  </slides>
-  <div class="pagination pagination-small" id='io2012-ptoc' style="display:none;">
-    <ul>
-      <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=1 title='The Bernoulli distribution'>
-         1
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=2 title='iid Bernoulli trials'>
-         2
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=3 title='Plotting all possible likelihoods for a small n'>
-         3
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=4 title=''>
-         4
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=5 title='Binomial trials'>
-         5
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=6 title='Choose'>
-         6
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=7 title='Example justification of the binomial likelihood'>
-         7
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=8 title='Example'>
-         8
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=9 title=''>
-         9
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=10 title='The normal distribution'>
-         10
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=11 title=''>
-         11
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=12 title='Facts about the normal density'>
-         12
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=13 title='More facts about the normal density'>
-         13
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=14 title='Question'>
-         14
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=15 title='Question'>
-         15
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=16 title='Other properties'>
-         16
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=17 title='Final thoughts on normal likelihoods'>
-         17
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=18 title='The Poisson distribution'>
-         18
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=19 title='Some uses for the Poisson distribution'>
-         19
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=20 title='Poisson derivation'>
-         20
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=21 title='Rates and Poisson random variables'>
-         21
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=22 title='Poisson approximation to the binomial'>
-         22
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=23 title='Example'>
-         23
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=24 title='Example, Poisson approximation to the binomial'>
-         24
-      </a>
-    </li>
-  </ul>
-  </div>  <!--[if IE]>
-    <script 
-      src="http://ajax.googleapis.com/ajax/libs/chrome-frame/1/CFInstall.min.js">  
-    </script>
-    <script>CFInstall.check({mode: 'overlay'});</script>
-  <![endif]-->
-</body>
-  <!-- Load Javascripts for Widgets -->
-  
-  <!-- MathJax: Fall back to local if CDN offline but local image fonts are not supported (saves >100MB) -->
-  <script type="text/x-mathjax-config">
-    MathJax.Hub.Config({
-      tex2jax: {
-        inlineMath: [['$','$'], ['\\(','\\)']],
-        processEscapes: true
-      }
-    });
-  </script>
-  <script type="text/javascript" src="http://cdn.mathjax.org/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
-  <!-- <script src="https://c328740.ssl.cf1.rackcdn.com/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
-  </script> -->
-  <script>window.MathJax || document.write('<script type="text/x-mathjax-config">MathJax.Hub.Config({"HTML-CSS":{imageFont:null}});<\/script><script src="../../librariesNew/widgets/mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"><\/script>')
-</script>
-<!-- LOAD HIGHLIGHTER JS FILES -->
-  <script src="../../librariesNew/highlighters/highlight.js/highlight.pack.js"></script>
-  <script>hljs.initHighlightingOnLoad();</script>
-  <!-- DONE LOADING HIGHLIGHTER JS FILES -->
-   
+<!DOCTYPE html>
+<html>
+<head>
+  <title>Some Common Distributions</title>
+  <meta charset="utf-8">
+  <meta name="description" content="Some Common Distributions">
+  <meta name="author" content="Brian Caffo, Jeff Leek, Roger Peng">
+  <meta name="generator" content="slidify" />
+  <meta name="apple-mobile-web-app-capable" content="yes">
+  <meta http-equiv="X-UA-Compatible" content="chrome=1">
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/default.css" media="all" >
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/phone.css" 
+    media="only screen and (max-device-width: 480px)" >
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/slidify.css" >
+  <link rel="stylesheet" href="../../librariesNew/highlighters/highlight.js/css/tomorrow.css" />
+  <base target="_blank"> <!-- This amazingness opens all links in a new tab. -->  
+  
+  <!-- Grab CDN jQuery, fall back to local if offline -->
+  <script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.7.min.js"></script>
+  <script>window.jQuery || document.write('<script src="../../librariesNew/widgets/quiz/js/jquery.js"><\/script>')</script> 
+  <script data-main="../../librariesNew/frameworks/io2012/js/slides" 
+    src="../../librariesNew/frameworks/io2012/js/require-1.0.8.min.js">
+  </script>
+  
+  
+
+</head>
+<body style="opacity: 0">
+  <slides class="layout-widescreen">
+    
+    <!-- LOGO SLIDE -->
+        <slide class="title-slide segue nobackground">
+  <aside class="gdbar">
+    <img src="../../assets/img/bloomberg_shield.png">
+  </aside>
+  <hgroup class="auto-fadein">
+    <h1>Some Common Distributions</h1>
+    <h2>Statistical Inference</h2>
+    <p>Brian Caffo, Jeff Leek, Roger Peng<br/>Johns Hopkins Bloomberg School of Public Health</p>
+  </hgroup>
+  <article></article>  
+</slide>
+    
+
+    <!-- SLIDES -->
+    <slide class="" id="slide-1" style="background:;">
+  <hgroup>
+    <h2>The Bernoulli distribution</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>The <strong>Bernoulli distribution</strong> arises as the result of a binary outcome</li>
+<li>Bernoulli random variables take (only) the values 1 and 0 with probabilities of (say) \(p\) and \(1-p\) respectively</li>
+<li>The PMF for a Bernoulli random variable \(X\) is \[P(X = x) =  p^x (1 - p)^{1 - x}\]</li>
+<li>The mean of a Bernoulli random variable is \(p\) and the variance is \(p(1 - p)\)</li>
+<li>If we let \(X\) be a Bernoulli random variable, it is typical to call \(X=1\) a &quot;success&quot; and \(X=0\) a &quot;failure&quot;</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-2" style="background:;">
+  <hgroup>
+    <h2>iid Bernoulli trials</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>If several iid Bernoulli observations, say \(x_1,\ldots, x_n\), are observed, the
+likelihood is 
+\[
+\prod_{i=1}^n p^{x_i} (1 - p)^{1 - x_i} = p^{\sum x_i} (1 - p)^{n - \sum x_i}
+\]</li>
+<li>Notice that the likelihood depends only on the sum of the \(x_i\)</li>
+<li>Because \(n\) is fixed and assumed known, this implies that the sample proportion \(\sum_i x_i / n\) contains all of the relevant information about \(p\)</li>
+<li>We can maximize the Bernoulli likelihood over \(p\) to obtain that \(\hat p = \sum_i x_i / n\) is the maximum likelihood estimator for \(p\)</li>
+</ul>
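+
+<p>A brief sketch of that maximization, as a standard calculus argument and assuming the observed sum is strictly between \(0\) and \(n\) so that the maximum is interior: write \(\ell(p)\) (a label introduced here) for the log of the likelihood above and differentiate,
+\[
+\ell(p) = \left(\sum x_i\right) \log p + \left(n - \sum x_i\right) \log (1 - p), \qquad
+\ell'(p) = \frac{\sum x_i}{p} - \frac{n - \sum x_i}{1 - p}
+\]
+Setting \(\ell'(p) = 0\) gives \((1 - p)\sum x_i = p\,(n - \sum x_i)\), that is \(\sum x_i = n p\), which recovers \(\hat p = \sum_i x_i / n\)</p>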
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-3" style="background:;">
+  <hgroup>
+    <h2>Plotting all possible likelihoods for a small n</h2>
+  </hgroup>
+  <article data-timings="">
+    <pre><code>n &lt;- 5
+pvals &lt;- seq(0, 1, length = 1000)
+plot(c(0, 1), c(0, 1.2), type = &quot;n&quot;, frame = FALSE, xlab = &quot;p&quot;, ylab = &quot;likelihood&quot;)
+text((0 : n) /n, 1.1, as.character(0 : n))
+sapply(0 : n, function(x) {
+  phat &lt;- x / n
+  if (x == 0) lines(pvals,  ( (1 - pvals) / (1 - phat) )^(n-x), lwd = 3)
+  else if (x == n) lines(pvals, (pvals / phat) ^ x, lwd = 3)
+  else lines(pvals, (pvals / phat ) ^ x * ( (1 - pvals) / (1 - phat) ) ^ (n-x), lwd = 3) 
+  }
+)
+title(paste(&quot;Likelihoods for n = &quot;, n))
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-4" style="background:;">
+  <article data-timings="">
+    <p><img src="assets/fig/unnamed-chunk-1.png" alt="plot of chunk unnamed-chunk-1"> </p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-5" style="background:;">
+  <hgroup>
+    <h2>Binomial trials</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>The <em>binomial random variables</em> are obtained as the sum of iid Bernoulli trials</li>
+<li>Specifically, let \(X_1,\ldots,X_n\) be iid Bernoulli\((p)\); then \(X = \sum_{i=1}^n X_i\) is a binomial random variable</li>
+<li>The binomial mass function is
+\[
+P(X = x) = 
+\left(
+\begin{array}{c}
+n \\ x
+\end{array}
+\right)
+p^x(1 - p)^{n-x}
+\]
+for \(x=0,\ldots,n\)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-6" style="background:;">
+  <hgroup>
+    <h2>Choose</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Recall that the notation 
+\[\left(
+\begin{array}{c}
+  n \\ x
+\end{array}
+\right) = \frac{n!}{x!(n-x)!}
+\] (read &quot;\(n\) choose \(x\)&quot;) counts the number of ways of selecting \(x\) items out of \(n\)
+without replacement disregarding the order of the items</li>
+</ul>
+
+<p>\[\left(
+    \begin{array}{c}
+      n \\ 0
+    \end{array}
+  \right) =
+\left(
+    \begin{array}{c}
+      n \\ n
+    \end{array}
+  \right) =  1
+  \] </p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-7" style="background:;">
+  <hgroup>
+    <h2>Example justification of the binomial likelihood</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Consider the probability of getting \(6\) heads out of \(10\) coin flips from a coin with success probability \(p\) </li>
+<li>The probability of getting \(6\) heads and \(4\) tails in any specific order is
+\[
+p^6(1-p)^4
+\]</li>
+<li>There are 
+\[\left(
+\begin{array}{c}
+10 \\ 6
+\end{array}
+\right)
+\]
+possible orders of \(6\) heads and \(4\) tails</li>
+</ul>
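+
+<p>Multiplying the two pieces gives the binomial probability. A minimal sketch of that check in R, assuming a fair coin (<code>p</code> and the other variable names are illustrative, not fixed by the slide):</p>
+
+<pre><code class="r">p &lt;- 0.5                                   # assumed success probability
+n_orders &lt;- choose(10, 6)                  # number of orderings of 6 heads and 4 tails
+by_hand &lt;- n_orders * p^6 * (1 - p)^4      # orderings times the probability of each one
+all.equal(by_hand, dbinom(6, size = 10, prob = p))  # compare with the built-in mass function
+</code></pre>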
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-8" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Suppose a friend has \(8\) children (oh my!), \(7\) of whom are girls and none of whom are twins</li>
+<li>If each gender has an independent \(50\)% probability for each birth, what&#39;s the probability of getting \(7\) or more girls out of \(8\) births?
+\[\left(
+\begin{array}{c}
+8 \\ 7
+\end{array}
+\right) .5^{7}(1-.5)^{1}
++
+\left(
+\begin{array}{c}
+8 \\ 8
+\end{array}
+\right) .5^{8}(1-.5)^{0} \approx 0.04
+\]</li>
+</ul>
+
+<pre><code class="r">choose(8, 7) * 0.5^8 + choose(8, 8) * 0.5^8
+</code></pre>
+
+<pre><code>## [1] 0.03516
+</code></pre>
+
+<pre><code class="r">pbinom(6, size = 8, prob = 0.5, lower.tail = FALSE)
+</code></pre>
+
+<pre><code>## [1] 0.03516
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-9" style="background:;">
+  <article data-timings="">
+    <pre><code class="r">plot(pvals, dbinom(7, 8, pvals)/dbinom(7, 8, 7/8), lwd = 3, frame = FALSE, type = &quot;l&quot;, 
+    xlab = &quot;p&quot;, ylab = &quot;likelihood&quot;)
+</code></pre>
+
+<p><img src="assets/fig/unnamed-chunk-3.png" alt="plot of chunk unnamed-chunk-3"> </p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-10" style="background:;">
+  <hgroup>
+    <h2>The normal distribution</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>A random variable is said to follow a <strong>normal</strong> or <strong>Gaussian</strong> distribution with mean \(\mu\) and variance \(\sigma^2\) if the associated density is
+\[
+(2\pi \sigma^2)^{-1/2}e^{-(x - \mu)^2/2\sigma^2}
+\]
+If \(X\) is a RV with this density, then \(E[X] = \mu\) and \(Var(X) = \sigma^2\)</li>
+<li>We write \(X\sim \mbox{N}(\mu, \sigma^2)\)</li>
+<li>When \(\mu = 0\) and \(\sigma = 1\) the resulting distribution is called <strong>the standard normal distribution</strong></li>
+<li>The standard normal density function is labeled \(\phi\)</li>
+<li>Standard normal RVs are often labeled \(Z\)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-11" style="background:;">
+  <article data-timings="">
+    <pre><code class="r">zvals &lt;- seq(-3, 3, length = 1000)
+plot(zvals, dnorm(zvals), type = &quot;l&quot;, lwd = 3, frame = FALSE, xlab = &quot;z&quot;, ylab = &quot;Density&quot;)
+sapply(-3:3, function(k) abline(v = k))
+</code></pre>
+
+<p><img src="assets/fig/unnamed-chunk-4.png" title="plot of chunk unnamed-chunk-4" alt="plot of chunk unnamed-chunk-4" style="display: block; margin: auto;" /></p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-12" style="background:;">
+  <hgroup>
+    <h2>Facts about the normal density</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>If \(X \sim \mbox{N}(\mu,\sigma^2)\) then \(Z = \frac{X -\mu}{\sigma}\) is standard normal</li>
+<li>If \(Z\) is standard normal \[X = \mu + \sigma Z \sim \mbox{N}(\mu, \sigma^2)\]</li>
+<li>The non-standard normal density is \[\phi\{(x - \mu) / \sigma\}/\sigma\]</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-13" style="background:;">
+  <hgroup>
+    <h2>More facts about the normal density</h2>
+  </hgroup>
+  <article data-timings="">
+    <ol>
+<li>Approximately \(68\%\), \(95\%\) and \(99.7\%\)  of the normal density lies within \(1\), \(2\) and \(3\) standard deviations from the mean, respectively</li>
+<li>\(-1.28\), \(-1.645\), \(-1.96\) and \(-2.33\) are the \(10^{th}\), \(5^{th}\), \(2.5^{th}\) and \(1^{st}\) percentiles of the standard normal distribution respectively</li>
+<li>By symmetry, \(1.28\), \(1.645\), \(1.96\) and \(2.33\) are the \(90^{th}\), \(95^{th}\), \(97.5^{th}\) and \(99^{th}\) percentiles of the standard normal distribution respectively</li>
+</ol>
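+
+<p>These rules of thumb can be checked directly. A small sketch using only base R&#39;s <code>pnorm</code> and <code>qnorm</code> (the variable names are illustrative):</p>
+
+<pre><code class="r">k &lt;- 1:3
+within_k_sd &lt;- pnorm(k) - pnorm(-k)             # mass within 1, 2 and 3 SDs of the mean
+upper_q &lt;- qnorm(c(0.90, 0.95, 0.975, 0.99))    # 90th, 95th, 97.5th and 99th percentiles
+round(within_k_sd, 3)
+round(upper_q, 3)
+</code></pre>
+
+<p>By symmetry, the lower percentiles are obtained by negating <code>upper_q</code>.</p>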
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-14" style="background:;">
+  <hgroup>
+    <h2>Question</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>What is the \(95^{th}\) percentile of a \(N(\mu, \sigma^2)\) distribution? 
+
+<ul>
+<li>Quick answer in R <code>qnorm(.95, mean = mu, sd = sigma)</code></li>
+</ul></li>
+<li>We want the point \(x_0\) so that \(P(X \leq x_0) = .95\)
+\[
+\begin{eqnarray*}
+P(X \leq x_0) & = & P\left(\frac{X - \mu}{\sigma} \leq \frac{x_0 - \mu}{\sigma}\right) \\ \\
+              & = & P\left(Z \leq \frac{x_0 - \mu}{\sigma}\right) =  .95
+\end{eqnarray*}
+\]</li>
+<li>Therefore
+\[\frac{x_0 - \mu}{\sigma} = 1.645\]
+or \(x_0 = \mu + 1.645 \sigma\)</li>
+<li>In general \(x_0 = \mu + \sigma z_0\) where \(z_0\) is the appropriate standard normal quantile</li>
+</ul>
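+
+<p>As an illustrative check that the two routes agree, here is a short sketch with made-up values for the mean and standard deviation (<code>mu</code> and <code>sigma</code> below are assumptions, not values from the slides):</p>
+
+<pre><code class="r">mu &lt;- 100     # hypothetical mean
+sigma &lt;- 15   # hypothetical standard deviation
+x0_direct &lt;- qnorm(0.95, mean = mu, sd = sigma)  # quantile on the original scale
+x0_manual &lt;- mu + sigma * qnorm(0.95)            # mu + sigma * z0, with z0 the standard normal quantile
+all.equal(x0_direct, x0_manual)
+</code></pre>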
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-15" style="background:;">
+  <hgroup>
+    <h2>Question</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>What is the probability that a \(\mbox{N}(\mu,\sigma^2)\) RV is more than 2 standard deviations above the mean?</li>
+<li>We want to know
+\[
+\begin{eqnarray*}
+P(X > \mu + 2\sigma) & = & 
+P\left(\frac{X -\mu}{\sigma} > \frac{\mu + 2\sigma - \mu}{\sigma}\right)    \\ \\
+& = & P(Z > 2 ) \\ \\ 
+& \approx & 2.5\%
+\end{eqnarray*}
+\]</li>
+</ul>
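+
+<p>The \(2.5\%\) figure is a rounded rule of thumb (it is the tail area beyond \(1.96\)). A one-line check of the tail area beyond \(2\) in R, with an illustrative variable name:</p>
+
+<pre><code class="r">tail_area &lt;- pnorm(2, lower.tail = FALSE)   # P(Z &gt; 2) for a standard normal
+tail_area
+</code></pre>
+
+<p>This should come out a little below \(2.5\%\), at roughly \(2.3\%\).</p>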
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-16" style="background:;">
+  <hgroup>
+    <h2>Other properties</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>The normal distribution is symmetric and peaked about its mean (therefore the mean, median and mode are all equal)</li>
+<li>A constant times a normally distributed random variable is also normally distributed (what is the mean and variance?)</li>
+<li>Sums of jointly normally distributed random variables are again normally distributed, even if the variables are dependent (what is the mean and variance?)</li>
+<li>Sample means of normally distributed random variables are again normally distributed (with what mean and variance?)</li>
+<li>The square of a <em>standard normal</em> random variable follows what is called a <strong>chi-squared</strong> distribution (with one degree of freedom)</li>
+<li>The exponential of a normally distributed random variable follows what is called the <strong>log-normal</strong> distribution </li>
+<li>As we will see later, many random variables, properly normalized, <em>limit</em> to a normal distribution</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-17" style="background:;">
+  <hgroup>
+    <h2>Final thoughts on normal likelihoods</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>The MLE for \(\mu\) is \(\bar X\).</li>
+<li>The MLE for \(\sigma^2\) is
+\[
+\frac{\sum_{i=1}^n (X_i - \bar X)^2}{n}
+\]
+(which is the biased version of the sample variance)</li>
+<li>The MLE of \(\sigma\) is simply the square root of this
+estimate</li>
+</ul>
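+
+<p>A minimal sketch of these estimates on simulated data. The seed, sample size and true parameters below are arbitrary choices made for illustration only:</p>
+
+<pre><code class="r">set.seed(1)                               # arbitrary seed, for reproducibility only
+x &lt;- rnorm(50, mean = 10, sd = 2)         # hypothetical sample
+n &lt;- length(x)
+mu_hat &lt;- mean(x)                         # MLE of mu
+sigma2_hat &lt;- sum((x - mu_hat)^2) / n     # MLE of sigma^2 (divides by n)
+sigma_hat &lt;- sqrt(sigma2_hat)             # MLE of sigma
+# var() divides by n - 1, so the two differ by the factor (n - 1) / n
+all.equal(sigma2_hat, var(x) * (n - 1) / n)
+</code></pre>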
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-18" style="background:;">
+  <hgroup>
+    <h2>The Poisson distribution</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Used to model counts</li>
+<li>The Poisson mass function is
+\[
+P(X = x; \lambda) = \frac{\lambda^x e^{-\lambda}}{x!}
+\]
+for \(x=0,1,\ldots\)</li>
+<li>The mean of this distribution is \(\lambda\)</li>
+<li>The variance of this distribution is \(\lambda\)</li>
+<li>Notice that \(x\) ranges from \(0\) to \(\infty\)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-19" style="background:;">
+  <hgroup>
+    <h2>Some uses for the Poisson distribution</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Modeling event/time data</li>
+<li>Modeling radioactive decay</li>
+<li>Modeling survival data</li>
+<li>Modeling unbounded count data </li>
+<li>Modeling contingency tables</li>
+<li>Approximating binomials when \(n\) is large and \(p\) is small</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-20" style="background:;">
+  <hgroup>
+    <h2>Poisson derivation</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>\(\lambda\) is the mean number of events per unit time</li>
+<li>Let \(h\) be very small </li>
+<li>Suppose we assume that 
+
+<ul>
+<li>Prob. of an event in an interval of length \(h\) is \(\lambda h\)
+while the prob. of more than one event is negligible</li>
+<li>Whether or not an event occurs in one small interval
+does not impact whether or not an event occurs in another
+small interval</li>
+</ul>
+
+<p>Then the number of events per unit time is Poisson with mean \(\lambda\)</p></li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-21" style="background:;">
+  <hgroup>
+    <h2>Rates and Poisson random variables</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Poisson random variables are used to model rates</li>
+<li>\(X \sim Poisson(\lambda t)\) where 
+
+<ul>
+<li>\(\lambda = E[X / t]\) is the expected count per unit of time</li>
+<li>\(t\) is the total monitoring time</li>
+</ul></li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-22" style="background:;">
+  <hgroup>
+    <h2>Poisson approximation to the binomial</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>When \(n\) is large and \(p\) is small the Poisson distribution
+is an accurate approximation to the binomial distribution</li>
+<li>Notation
+
+<ul>
+<li>\(X \sim \mbox{Binomial}(n, p)\)</li>
+<li>\(\lambda = n p\)</li>
+<li>\(n\) gets large </li>
+<li>\(p\) gets small</li>
+<li>\(\lambda\) stays constant</li>
+</ul></li>
+</ul>
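+
+<p>A sketch of why this works, under the setup above with \(p = \lambda / n\) and \(x\) held fixed (a standard limit argument, included here as an illustration):
+\[
+\left(
+\begin{array}{c}
+n \\ x
+\end{array}
+\right)
+\left(\frac{\lambda}{n}\right)^x \left(1 - \frac{\lambda}{n}\right)^{n - x}
+= \frac{\lambda^x}{x!} \cdot \frac{n (n - 1) \cdots (n - x + 1)}{n^x} \cdot \left(1 - \frac{\lambda}{n}\right)^{n} \left(1 - \frac{\lambda}{n}\right)^{-x}
+\]
+As \(n \to \infty\) the middle factor and the last factor tend to \(1\) while \(\left(1 - \lambda / n\right)^n \to e^{-\lambda}\), leaving the Poisson mass function \(\lambda^x e^{-\lambda} / x!\)</p>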
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-23" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
+    <p>The number of people that show up at a bus stop is Poisson with
+a mean of \(2.5\) per hour.</p>
+
+<p>If we watch the bus stop for 4 hours, what is the probability that \(3\)
+or fewer people show up over the whole time?</p>
+
+<pre><code class="r">ppois(3, lambda = 2.5 * 4)
+</code></pre>
+
+<pre><code>## [1] 0.01034
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-24" style="background:;">
+  <hgroup>
+    <h2>Example, Poisson approximation to the binomial</h2>
+  </hgroup>
+  <article data-timings="">
+    <p>We flip a coin with success probability \(0.01\) five hundred times. </p>
+
+<p>What&#39;s the probability of 2 or fewer successes?</p>
+
+<pre><code class="r">pbinom(2, size = 500, prob = 0.01)
+</code></pre>
+
+<pre><code>## [1] 0.1234
+</code></pre>
+
+<pre><code class="r">ppois(2, lambda = 500 * 0.01)
+</code></pre>
+
+<pre><code>## [1] 0.1247
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+    <slide class="backdrop"></slide>
+  </slides>
+  <div class="pagination pagination-small" id='io2012-ptoc' style="display:none;">
+    <ul>
+      <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=1 title='The Bernoulli distribution'>
+         1
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=2 title='iid Bernoulli trials'>
+         2
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=3 title='Plotting all possible likelihoods for a small n'>
+         3
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=4 title=''>
+         4
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=5 title='Binomial trials'>
+         5
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=6 title='Choose'>
+         6
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=7 title='Example justification of the binomial likelihood'>
+         7
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=8 title='Example'>
+         8
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=9 title=''>
+         9
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=10 title='The normal distribution'>
+         10
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=11 title=''>
+         11
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=12 title='Facts about the normal density'>
+         12
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=13 title='More facts about the normal density'>
+         13
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=14 title='Question'>
+         14
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=15 title='Question'>
+         15
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=16 title='Other properties'>
+         16
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=17 title='Final thoughts on normal likelihoods'>
+         17
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=18 title='The Poisson distribution'>
+         18
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=19 title='Some uses for the Poisson distribution'>
+         19
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=20 title='Poisson derivation'>
+         20
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=21 title='Rates and Poisson random variables'>
+         21
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=22 title='Poisson approximation to the binomial'>
+         22
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=23 title='Example'>
+         23
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=24 title='Example, Poisson approximation to the binomial'>
+         24
+      </a>
+    </li>
+  </ul>
+  </div>  <!--[if IE]>
+    <script 
+      src="http://ajax.googleapis.com/ajax/libs/chrome-frame/1/CFInstall.min.js">  
+    </script>
+    <script>CFInstall.check({mode: 'overlay'});</script>
+  <![endif]-->
+</body>
+  <!-- Load Javascripts for Widgets -->
+  
+  <!-- MathJax: Fall back to local if CDN offline but local image fonts are not supported (saves >100MB) -->
+  <script type="text/x-mathjax-config">
+    MathJax.Hub.Config({
+      tex2jax: {
+        inlineMath: [['$','$'], ['\\(','\\)']],
+        processEscapes: true
+      }
+    });
+  </script>
+  <script type="text/javascript" src="http://cdn.mathjax.org/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
+  <!-- <script src="https://c328740.ssl.cf1.rackcdn.com/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
+  </script> -->
+  <script>window.MathJax || document.write('<script type="text/x-mathjax-config">MathJax.Hub.Config({"HTML-CSS":{imageFont:null}});<\/script><script src="../../librariesNew/widgets/mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"><\/script>')
+</script>
+<!-- LOAD HIGHLIGHTER JS FILES -->
+  <script src="../../librariesNew/highlighters/highlight.js/highlight.pack.js"></script>
+  <script>hljs.initHighlightingOnLoad();</script>
+  <!-- DONE LOADING HIGHLIGHTER JS FILES -->
+   
   </html>
\ No newline at end of file
diff --git a/06_StatisticalInference/02_01_CommonDistributions/index.md b/06_StatisticalInference/02_01_CommonDistributions/index.md
index 186885c92..a7c796780 100644
--- a/06_StatisticalInference/02_01_CommonDistributions/index.md
+++ b/06_StatisticalInference/02_01_CommonDistributions/index.md
@@ -1,383 +1,358 @@
----
-title       : Some Common Distributions
-subtitle    : Statistical Inference
-author      : Brian Caffo, Jeff Leek, Roger Peng
-job         : Johns Hopkins Bloomberg School of Public Health
-logo        : bloomberg_shield.png
-framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
-highlighter : highlight.js  # {highlight.js, prettify, highlight}
-hitheme     : tomorrow      # 
-url:
-  lib: ../../librariesNew
-  assets: ../../assets
-widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
-mode        : selfcontained # {standalone, draft}
----
-
-
-
-## The Bernoulli distribution
-
-- The **Bernoulli distribution** arises as the result of a binary outcome
-- Bernoulli random variables take (only) the values 1 and 0 with probabilities of (say) $p$ and $1-p$ respectively
-- The PMF for a Bernoulli random variable $X$ is $$P(X = x) =  p^x (1 - p)^{1 - x}$$
-- The mean of a Bernoulli random variable is $p$ and the variance is $p(1 - p)$
-- If we let $X$ be a Bernoulli random variable, it is typical to call $X=1$ as a "success" and $X=0$ as a "failure"
-
----
-
-## iid Bernoulli trials
-
-- If several iid Bernoulli observations, say $x_1,\ldots, x_n$, are observed the
-likelihood is 
-$$
-  \prod_{i=1}^n p^{x_i} (1 - p)^{1 - x_i} = p^{\sum x_i} (1 - p)^{n - \sum x_i}
-$$
-- Notice that the likelihood depends only on the sum of the $x_i$
-- Because $n$ is fixed and assumed known, this implies that the sample proportion $\sum_i x_i / n$ contains all of the relevant information about $p$
-- We can maximize the Bernoulli likelihood over $p$ to obtain that $\hat p = \sum_i x_i / n$ is the maximum likelihood estimator for $p$
-
----
-## Plotting all possible likelihoods for a small n
-```
-n <- 5
-pvals <- seq(0, 1, length = 1000)
-plot(c(0, 1), c(0, 1.2), type = "n", frame = FALSE, xlab = "p", ylab = "likelihood")
-text((0 : n) /n, 1.1, as.character(0 : n))
-sapply(0 : n, function(x) {
-  phat <- x / n
-  if (x == 0) lines(pvals,  ( (1 - pvals) / (1 - phat) )^(n-x), lwd = 3)
-  else if (x == n) lines(pvals, (pvals / phat) ^ x, lwd = 3)
-  else lines(pvals, (pvals / phat ) ^ x * ( (1 - pvals) / (1 - phat) ) ^ (n-x), lwd = 3) 
-  }
-)
-title(paste("Likelihoods for n = ", n))
-```
-
----
-<div class="rimage center"><img src="fig/unnamed-chunk-1.png" title="plot of chunk unnamed-chunk-1" alt="plot of chunk unnamed-chunk-1" class="plot" /></div>
-
-
----
-
-## Binomial trials
-
-- The *binomial random variables* are obtained as the sum of iid Bernoulli trials
-- In specific, let $X_1,\ldots,X_n$ be iid Bernoulli$(p)$; then $X = \sum_{i=1}^n X_i$ is a binomial random variable
-- The binomial mass function is
-$$
-P(X = x) = 
-\left(
-\begin{array}{c}
-  n \\ x
-\end{array}
-\right)
-p^x(1 - p)^{n-x}
-$$
-for $x=0,\ldots,n$
-
----
-
-## Choose
-
-- Recall that the notation 
-  $$\left(
-    \begin{array}{c}
-      n \\ x
-    \end{array}
-  \right) = \frac{n!}{x!(n-x)!}
-  $$ (read "$n$ choose $x$") counts the number of ways of selecting $x$ items out of $n$
-  without replacement disregarding the order of the items
-
-$$\left(
-    \begin{array}{c}
-      n \\ 0
-    \end{array}
-  \right) =
-\left(
-    \begin{array}{c}
-      n \\ n
-    \end{array}
-  \right) =  1
-  $$ 
-
----
-
-## Example justification of the binomial likelihood
-
-- Consider the probability of getting $6$ heads out of $10$ coin flips from a coin with success probability $p$ 
-- The probability of getting $6$ heads and $4$ tails in any specific order is
-  $$
-  p^6(1-p)^4
-  $$
-- There are 
-$$\left(
-\begin{array}{c}
-  10 \\ 6
-\end{array}
-\right)
-$$
-possible orders of $6$ heads and $4$ tails
-
----
-
-## Example
-
-- Suppose a friend has $8$ children (oh my!), $7$ of which are girls and none are twins
-- If each gender has an independent $50$% probability for each birth, what's the probability of getting $7$ or more girls out of $8$ births?
-$$\left(
-\begin{array}{c}
-  8 \\ 7
-\end{array}
-\right) .5^{7}(1-.5)^{1}
-+
-\left(
-\begin{array}{c}
-  8 \\ 8
-\end{array}
-\right) .5^{8}(1-.5)^{0} \approx 0.04
-$$
-
-```r
-choose(8, 7) * .5 ^ 8 + choose(8, 8) * .5 ^ 8 
-```
-
-```
-[1] 0.03516
-```
-
-```r
-pbinom(6, size = 8, prob = .5, lower.tail = FALSE)
-```
-
-```
-[1] 0.03516
-```
-
-
----
-
-```r
-plot(pvals, dbinom(7, 8, pvals) / dbinom(7, 8, 7/8) , 
-     lwd = 3, frame = FALSE, type = "l", xlab = "p", ylab = "likelihood")
-```
-
-<div class="rimage center"><img src="fig/unnamed-chunk-3.png" title="plot of chunk unnamed-chunk-3" alt="plot of chunk unnamed-chunk-3" class="plot" /></div>
-
-
----
-
-## The normal distribution
-
-- A random variable is said to follow a **normal** or **Gaussian** distribution with mean $\mu$ and variance $\sigma^2$ if the associated density is
-  $$
-  (2\pi \sigma^2)^{-1/2}e^{-(x - \mu)^2/2\sigma^2}
-  $$
-  If $X$ a RV with this density then $E[X] = \mu$ and $Var(X) = \sigma^2$
-- We write $X\sim \mbox{N}(\mu, \sigma^2)$
-- When $\mu = 0$ and $\sigma = 1$ the resulting distribution is called **the standard normal distribution**
-- The standard normal density function is labeled $\phi$
-- Standard normal RVs are often labeled $Z$
-
----
-
-```r
-zvals <- seq(-3, 3, length = 1000)
-plot(zvals, dnorm(zvals), 
-     type = "l", lwd = 3, frame = FALSE, xlab = "z", ylab = "Density")
-sapply(-3 : 3, function(k) abline(v = k))
-```
-
-<div class="rimage center"><img src="fig/unnamed-chunk-4.png" title="plot of chunk unnamed-chunk-4" alt="plot of chunk unnamed-chunk-4" class="plot" /></div>
-
-```
-[[1]]
-NULL
-
-[[2]]
-NULL
-
-[[3]]
-NULL
-
-[[4]]
-NULL
-
-[[5]]
-NULL
-
-[[6]]
-NULL
-
-[[7]]
-NULL
-```
-
-
----
-
-## Facts about the normal density
-
-- If $X \sim \mbox{N}(\mu,\sigma^2)$ the $Z = \frac{X -\mu}{\sigma}$ is standard normal
-- If $Z$ is standard normal $$X = \mu + \sigma Z \sim \mbox{N}(\mu, \sigma^2)$$
-- The non-standard normal density is $$\phi\{(x - \mu) / \sigma\}/\sigma$$
-
----
-
-## More facts about the normal density
-
-1. Approximately $68\%$, $95\%$ and $99\%$  of the normal density lies within $1$, $2$ and $3$ standard deviations from the mean, respectively
-2. $-1.28$, $-1.645$, $-1.96$ and $-2.33$ are the $10^{th}$, $5^{th}$, $2.5^{th}$ and $1^{st}$ percentiles of the standard normal distribution respectively
-3. By symmetry, $1.28$, $1.645$, $1.96$ and $2.33$ are the $90^{th}$, $95^{th}$, $97.5^{th}$ and $99^{th}$ percentiles of the standard normal distribution respectively
-
----
-
-## Question
-
-- What is the $95^{th}$ percentile of a $N(\mu, \sigma^2)$ distribution? 
-  - Quick answer in R `qnorm(.95, mean = mu, sd = sd)`
-- We want the point $x_0$ so that $P(X \leq x_0) = .95$
-$$
-  \begin{eqnarray*}
-    P(X \leq x_0) & = & P\left(\frac{X - \mu}{\sigma} \leq \frac{x_0 - \mu}{\sigma}\right) \\ \\
-                  & = & P\left(Z \leq \frac{x_0 - \mu}{\sigma}\right) =  .95
-  \end{eqnarray*}
-$$
-- Therefore
-  $$\frac{x_0 - \mu}{\sigma} = 1.645$$
-  or $x_0 = \mu + \sigma 1.645$
-- In general $x_0 = \mu + \sigma z_0$ where $z_0$ is the appropriate standard normal quantile
-
----
-
-## Question
-
-- What is the probability that a $\mbox{N}(\mu,\sigma^2)$ RV is 2 standard deviations above the mean?
-- We want to know
-$$
-  \begin{eqnarray*}
-  P(X > \mu + 2\sigma) & = & 
-P\left(\frac{X -\mu}{\sigma} > \frac{\mu + 2\sigma - \mu}{\sigma}\right)    \\ \\
-& = & P(Z \geq 2 ) \\ \\ 
-& \approx & 2.5\%
-  \end{eqnarray*}
-$$
-
----
-
-## Other properties
-
-- The normal distribution is symmetric and peaked about its mean (therefore the mean, median and mode are all equal)
-- A constant times a normally distributed random variable is also normally distributed (what is the mean and variance?)
-- Sums of normally distributed random variables are again normally distributed even if the variables are dependent (what is the mean and variance?)
-- Sample means of normally distributed random variables are again normally distributed (with what mean and variance?)
-- The square of a *standard normal* random variable follows what is called **chi-squared** distribution 
-- The exponent of a normally distributed random variables follows what is called the **log-normal** distribution 
-- As we will see later, many random variables, properly normalized, *limit* to a normal distribution
-
----
-
-## Final thoughts on normal likelihoods
-- The MLE for $\mu$ is $\bar X$.
-- The MLE for $\sigma^2$ is
-  $$
-  \frac{\sum_{i=1}^n (X_i - \bar X)^2}{n}
-  $$
-  (Which is the biased version of the sample variance.)
-- The MLE of $\sigma$ is simply the square root of this
-  estimate
-
----
-## The Poisson distribution
-* Used to model counts
-* The Poisson mass function is
-$$
-P(X = x; \lambda) = \frac{\lambda^x e^{-\lambda}}{x!}
-$$
-for $x=0,1,\ldots$
-* The mean of this distribution is $\lambda$
-* The variance of this distribution is $\lambda$
-* Notice that $x$ ranges from $0$ to $\infty$
-
----
-## Some uses for the Poisson distribution
-* Modeling event/time data
-* Modeling radioactive decay
-* Modeling survival data
-* Modeling unbounded count data 
-* Modeling contingency tables
-* Approximating binomials when $n$ is large and $p$ is small
-
----
-## Poisson derivation
-* $\lambda$ is the mean number of events per unit time
-* Let $h$ be very small 
-* Suppose we assume that 
-  * Prob. of an event in an interval of length $h$ is $\lambda h$
-    while the prob. of more than one event is negligible
-  * Whether or not an event occurs in one small interval
-    does not impact whether or not an event occurs in another
-    small interval
-then, the number of events per unit time is Poisson with mean $\lambda$ 
-
----
-## Rates and Poisson random variables
-* Poisson random variables are used to model rates
-* $X \sim Poisson(\lambda t)$ where 
-  * $\lambda = E[X / t]$ is the expected count per unit of time
-  * $t$ is the total monitoring time
-
----
-## Poisson approximation to the binomial
-* When $n$ is large and $p$ is small the Poisson distribution
-  is an accurate approximation to the binomial distribution
-* Notation
-  * $\lambda = n p$
-  * $X \sim \mbox{Binomial}(n, p)$, $\lambda = n p$ and
-  * $n$ gets large 
-  * $p$ gets small
-  * $\lambda$ stays constant
-
----
-## Example
-The number of people that show up at a bus stop is Poisson with
-a mean of $2.5$ per hour.
-
-If watching the bus stop for 4 hours, what is the probability that $3$
-or fewer people show up for the whole time?
-
-
-```r
-ppois(3, lambda = 2.5 * 4)
-```
-
-```
-[1] 0.01034
-```
-
-
----
-## Example, Poisson approximation to the binomial
-
-We flip a coin with success probablity $0.01$ five hundred times. 
-
-What's the probability of 2 or fewer successes?
-
-
-```r
-pbinom(2, size = 500, prob = .01)
-```
-
-```
-[1] 0.1234
-```
-
-```r
-ppois(2, lambda=500 * .01)
-```
-
-```
-[1] 0.1247
-```
-
-
+---
+title       : Some Common Distributions
+subtitle    : Statistical Inference
+author      : Brian Caffo, Jeff Leek, Roger Peng
+job         : Johns Hopkins Bloomberg School of Public Health
+logo        : bloomberg_shield.png
+framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
+highlighter : highlight.js  # {highlight.js, prettify, highlight}
+hitheme     : tomorrow      # 
+url:
+  lib: ../../librariesNew
+  assets: ../../assets
+widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
+mode        : selfcontained # {standalone, draft}
+---
+
+
+## The Bernoulli distribution
+
+- The **Bernoulli distribution** arises as the result of a binary outcome
+- Bernoulli random variables take (only) the values 1 and 0 with probabilities of (say) $p$ and $1-p$ respectively
+- The PMF for a Bernoulli random variable $X$ is $$P(X = x) =  p^x (1 - p)^{1 - x}$$
+- The mean of a Bernoulli random variable is $p$ and the variance is $p(1 - p)$
+- If we let $X$ be a Bernoulli random variable, it is typical to call $X=1$ a "success" and $X=0$ a "failure"
+
+---
+
+## iid Bernoulli trials
+
+- If several iid Bernoulli observations, say $x_1,\ldots, x_n$, are observed the
+likelihood is 
+$$
+  \prod_{i=1}^n p^{x_i} (1 - p)^{1 - x_i} = p^{\sum x_i} (1 - p)^{n - \sum x_i}
+$$
+- Notice that the likelihood depends only on the sum of the $x_i$
+- Because $n$ is fixed and assumed known, this implies that the sample proportion $\sum_i x_i / n$ contains all of the relevant information about $p$
+- We can maximize the Bernoulli likelihood over $p$ to obtain that $\hat p = \sum_i x_i / n$ is the maximum likelihood estimator for $p$
+
+---
+## Plotting all possible likelihoods for a small n
+```r
+n <- 5
+pvals <- seq(0, 1, length = 1000)
+plot(c(0, 1), c(0, 1.2), type = "n", frame = FALSE, xlab = "p", ylab = "likelihood")
+text((0 : n) /n, 1.1, as.character(0 : n))
+sapply(0 : n, function(x) {
+  phat <- x / n
+  if (x == 0) lines(pvals,  ( (1 - pvals) / (1 - phat) )^(n-x), lwd = 3)
+  else if (x == n) lines(pvals, (pvals / phat) ^ x, lwd = 3)
+  else lines(pvals, (pvals / phat ) ^ x * ( (1 - pvals) / (1 - phat) ) ^ (n-x), lwd = 3) 
+  }
+)
+title(paste("Likelihoods for n = ", n))
+```
+
+---
+![plot of chunk unnamed-chunk-1](assets/fig/unnamed-chunk-1.png) 
+
+
+---
+
+## Binomial trials
+
+- The *binomial random variables* are obtained as the sum of iid Bernoulli trials
+- Specifically, let $X_1,\ldots,X_n$ be iid Bernoulli$(p)$; then $X = \sum_{i=1}^n X_i$ is a binomial random variable
+- The binomial mass function is
+$$
+P(X = x) = 
+\left(
+\begin{array}{c}
+  n \\ x
+\end{array}
+\right)
+p^x(1 - p)^{n-x}
+$$
+for $x=0,\ldots,n$
+
+---
+
+## Choose
+
+- Recall that the notation 
+  $$\left(
+    \begin{array}{c}
+      n \\ x
+    \end{array}
+  \right) = \frac{n!}{x!(n-x)!}
+  $$ (read "$n$ choose $x$") counts the number of ways of selecting $x$ items out of $n$
+  without replacement disregarding the order of the items
+
+$$\left(
+    \begin{array}{c}
+      n \\ 0
+    \end{array}
+  \right) =
+\left(
+    \begin{array}{c}
+      n \\ n
+    \end{array}
+  \right) =  1
+  $$ 
+
+---
+
+## Example justification of the binomial likelihood
+
+- Consider the probability of getting $6$ heads out of $10$ coin flips from a coin with success probability $p$ 
+- The probability of getting $6$ heads and $4$ tails in any specific order is
+  $$
+  p^6(1-p)^4
+  $$
+- There are 
+$$\left(
+\begin{array}{c}
+  10 \\ 6
+\end{array}
+\right)
+$$
+possible orders of $6$ heads and $4$ tails
+
+---
+
+## Example
+
+- Suppose a friend has $8$ children (oh my!), $7$ of whom are girls and none of whom are twins
+- If each gender has an independent $50$% probability for each birth, what's the probability of getting $7$ or more girls out of $8$ births?
+$$\left(
+\begin{array}{c}
+  8 \\ 7
+\end{array}
+\right) .5^{7}(1-.5)^{1}
++
+\left(
+\begin{array}{c}
+  8 \\ 8
+\end{array}
+\right) .5^{8}(1-.5)^{0} \approx 0.035
+$$
+
+```r
+choose(8, 7) * 0.5^8 + choose(8, 8) * 0.5^8
+```
+
+```
+## [1] 0.03516
+```
+
+```r
+pbinom(6, size = 8, prob = 0.5, lower.tail = FALSE)
+```
+
+```
+## [1] 0.03516
+```
+
+
+---
+
+```r
+plot(pvals, dbinom(7, 8, pvals)/dbinom(7, 8, 7/8), lwd = 3, frame = FALSE, type = "l", 
+    xlab = "p", ylab = "likelihood")
+```
+
+![plot of chunk unnamed-chunk-3](assets/fig/unnamed-chunk-3.png) 
+
+
+---
+
+## The normal distribution
+
+- A random variable is said to follow a **normal** or **Gaussian** distribution with mean $\mu$ and variance $\sigma^2$ if the associated density is
+  $$
+  (2\pi \sigma^2)^{-1/2}e^{-(x - \mu)^2/(2\sigma^2)}
+  $$
+  If $X$ is a RV with this density then $E[X] = \mu$ and $Var(X) = \sigma^2$
+- We write $X\sim \mbox{N}(\mu, \sigma^2)$
+- When $\mu = 0$ and $\sigma = 1$ the resulting distribution is called **the standard normal distribution**
+- The standard normal density function is labeled $\phi$
+- Standard normal RVs are often labeled $Z$
+
+---
+
+```r
+zvals <- seq(-3, 3, length = 1000)
+plot(zvals, dnorm(zvals), type = "l", lwd = 3, frame = FALSE, xlab = "z", ylab = "Density")
+sapply(-3:3, function(k) abline(v = k))
+```
+
+<img src="assets/fig/unnamed-chunk-4.png" title="plot of chunk unnamed-chunk-4" alt="plot of chunk unnamed-chunk-4" style="display: block; margin: auto;" />
+
+
+---
+
+## Facts about the normal density
+
+- If $X \sim \mbox{N}(\mu,\sigma^2)$ then $Z = \frac{X -\mu}{\sigma}$ is standard normal
+- If $Z$ is standard normal $$X = \mu + \sigma Z \sim \mbox{N}(\mu, \sigma^2)$$
+- The non-standard normal density is $$\phi\{(x - \mu) / \sigma\}/\sigma$$
+
+---
+
+## More facts about the normal density
+
+1. Approximately $68\%$, $95\%$ and $99\%$  of the normal density lies within $1$, $2$ and $3$ standard deviations from the mean, respectively
+2. $-1.28$, $-1.645$, $-1.96$ and $-2.33$ are the $10^{th}$, $5^{th}$, $2.5^{th}$ and $1^{st}$ percentiles of the standard normal distribution respectively
+3. By symmetry, $1.28$, $1.645$, $1.96$ and $2.33$ are the $90^{th}$, $95^{th}$, $97.5^{th}$ and $99^{th}$ percentiles of the standard normal distribution respectively
+
+---
+
+## Question
+
+- What is the $95^{th}$ percentile of a $N(\mu, \sigma^2)$ distribution? 
+  - Quick answer in R: `qnorm(.95, mean = mu, sd = sigma)`
+- We want the point $x_0$ so that $P(X \leq x_0) = .95$
+$$
+  \begin{eqnarray*}
+    P(X \leq x_0) & = & P\left(\frac{X - \mu}{\sigma} \leq \frac{x_0 - \mu}{\sigma}\right) \\ \\
+                  & = & P\left(Z \leq \frac{x_0 - \mu}{\sigma}\right) =  .95
+  \end{eqnarray*}
+$$
+- Therefore
+  $$\frac{x_0 - \mu}{\sigma} = 1.645$$
+  or $x_0 = \mu + 1.645\sigma$
+- In general $x_0 = \mu + \sigma z_0$ where $z_0$ is the appropriate standard normal quantile
+
+---
+
+## Question
+
+- What is the probability that a $\mbox{N}(\mu,\sigma^2)$ RV is more than 2 standard deviations above the mean?
+- We want to know
+$$
+  \begin{eqnarray*}
+  P(X > \mu + 2\sigma) & = & 
+P\left(\frac{X -\mu}{\sigma} > \frac{\mu + 2\sigma - \mu}{\sigma}\right)    \\ \\
+& = & P(Z > 2 ) \\ \\ 
+& \approx & 2.5\%
+  \end{eqnarray*}
+$$
+
+---
+
+## Other properties
+
+- The normal distribution is symmetric and peaked about its mean (therefore the mean, median and mode are all equal)
+- A constant times a normally distributed random variable is also normally distributed (what is the mean and variance?)
+- Sums of jointly normally distributed random variables are again normally distributed, even if the variables are dependent (what is the mean and variance?)
+- Sample means of normally distributed random variables are again normally distributed (with what mean and variance?)
+- The square of a *standard normal* random variable follows what is called a **chi-squared** distribution (with one degree of freedom)
+- The exponential of a normally distributed random variable follows what is called the **log-normal** distribution 
+- As we will see later, many random variables, properly normalized, *limit* to a normal distribution
+
+---
+
+## Final thoughts on normal likelihoods
+- The MLE for $\mu$ is $\bar X$.
+- The MLE for $\sigma^2$ is
+  $$
+  \frac{\sum_{i=1}^n (X_i - \bar X)^2}{n}
+  $$
+  (Which is the biased version of the sample variance.)
+- The MLE of $\sigma$ is simply the square root of this
+  estimate
+
+---
+## The Poisson distribution
+* Used to model counts
+* The Poisson mass function is
+$$
+P(X = x; \lambda) = \frac{\lambda^x e^{-\lambda}}{x!}
+$$
+for $x=0,1,\ldots$
+* The mean of this distribution is $\lambda$
+* The variance of this distribution is $\lambda$
+* Notice that $x$ ranges from $0$ to $\infty$
+
+---
+## Some uses for the Poisson distribution
+* Modeling event/time data
+* Modeling radioactive decay
+* Modeling survival data
+* Modeling unbounded count data 
+* Modeling contingency tables
+* Approximating binomials when $n$ is large and $p$ is small
+
+---
+## Poisson derivation
+* $\lambda$ is the mean number of events per unit time
+* Let $h$ be very small 
+* Suppose we assume that 
+  * Prob. of an event in an interval of length $h$ is $\lambda h$
+    while the prob. of more than one event is negligible
+  * Whether or not an event occurs in one small interval
+    does not impact whether or not an event occurs in another
+    small interval
+then, the number of events per unit time is Poisson with mean $\lambda$ 
+
+---
+## Rates and Poisson random variables
+* Poisson random variables are used to model rates
+* $X \sim Poisson(\lambda t)$ where 
+  * $\lambda = E[X / t]$ is the expected count per unit of time
+  * $t$ is the total monitoring time
+
+---
+## Poisson approximation to the binomial
+* When $n$ is large and $p$ is small the Poisson distribution
+  is an accurate approximation to the binomial distribution
+* Notation
+  * $X \sim \mbox{Binomial}(n, p)$ and $\lambda = n p$
+* The approximation applies as
+  * $n$ gets large 
+  * $p$ gets small
+  * $\lambda$ stays constant
+
+---
+## Example
+The number of people that show up at a bus stop is Poisson with
+a mean of $2.5$ per hour.
+
+If we watch the bus stop for 4 hours, what is the probability that $3$
+or fewer people show up for the whole time?
+
+
+```r
+ppois(3, lambda = 2.5 * 4)
+```
+
+```
+## [1] 0.01034
+```
+
+
+---
+## Example, Poisson approximation to the binomial
+
+We flip a coin with success probability $0.01$ five hundred times. 
+
+What's the probability of 2 or fewer successes?
+
+
+```r
+pbinom(2, size = 500, prob = 0.01)
+```
+
+```
+## [1] 0.1234
+```
+
+```r
+ppois(2, lambda = 500 * 0.01)
+```
+
+```
+## [1] 0.1247
+```
+
+
diff --git a/06_StatisticalInference/02_01_CommonDistributions/index.pdf b/06_StatisticalInference/02_01_CommonDistributions/index.pdf
index 633936fe6..899e8bd29 100644
Binary files a/06_StatisticalInference/02_01_CommonDistributions/index.pdf and b/06_StatisticalInference/02_01_CommonDistributions/index.pdf differ
diff --git a/06_StatisticalInference/02_02_Asymptopia/assets/fig/unnamed-chunk-1.png b/06_StatisticalInference/02_02_Asymptopia/assets/fig/unnamed-chunk-1.png
new file mode 100644
index 000000000..94ea19152
Binary files /dev/null and b/06_StatisticalInference/02_02_Asymptopia/assets/fig/unnamed-chunk-1.png differ
diff --git a/06_StatisticalInference/02_02_Asymptopia/assets/fig/unnamed-chunk-2.png b/06_StatisticalInference/02_02_Asymptopia/assets/fig/unnamed-chunk-2.png
new file mode 100644
index 000000000..2a217f6ad
Binary files /dev/null and b/06_StatisticalInference/02_02_Asymptopia/assets/fig/unnamed-chunk-2.png differ
diff --git a/06_StatisticalInference/02_02_Asymptopia/assets/fig/unnamed-chunk-3.png b/06_StatisticalInference/02_02_Asymptopia/assets/fig/unnamed-chunk-3.png
new file mode 100644
index 000000000..484506bca
Binary files /dev/null and b/06_StatisticalInference/02_02_Asymptopia/assets/fig/unnamed-chunk-3.png differ
diff --git a/06_StatisticalInference/02_02_Asymptopia/index.Rmd b/06_StatisticalInference/02_02_Asymptopia/index.Rmd
index a199bd6bf..646a93fe6 100644
--- a/06_StatisticalInference/02_02_Asymptopia/index.Rmd
+++ b/06_StatisticalInference/02_02_Asymptopia/index.Rmd
@@ -1,271 +1,255 @@
----
-title       : A trip to Asymptopia
-subtitle    : Statistical Inference
-author      : Brian Caffo, Jeff Leek, Roger Peng
-job         : Johns Hopkins Bloomberg School of Public Health
-logo        : bloomberg_shield.png
-framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
-highlighter : highlight.js  # {highlight.js, prettify, highlight}
-hitheme     : tomorrow      # 
-url:
-  lib: ../../librariesNew
-  assets: ../../assets
-widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
-mode        : selfcontained # {standalone, draft}
----
-```{r setup, cache = F, echo = F, message = F, warning = F, tidy = F, results='hide'}
-# make this an external chunk that can be included in any file
-options(width = 100)
-opts_chunk$set(message = F, error = F, warning = F, comment = NA, fig.align = 'center', dpi = 100, tidy = F, cache.path = '.cache/', fig.path = 'fig/')
-
-options(xtable.type = 'html')
-knit_hooks$set(inline = function(x) {
-  if(is.numeric(x)) {
-    round(x, getOption('digits'))
-  } else {
-    paste(as.character(x), collapse = ', ')
-  }
-})
-knit_hooks$set(plot = knitr:::hook_plot_html)
-runif(1)
-```
-## Asymptotics
-* Asymptotics is the term for the behavior of statistics as the sample size (or some other relevant quantity) limits to infinity (or some other relevant number)
-* (Asymptopia is my name for the land of asymptotics, where everything works out well and there's no messes. The land of infinite data is nice that way.)
-* Asymptotics are incredibly useful for simple statistical inference and approximations 
-* (Not covered in this class) Asymptotics often lead to nice understanding of procedures
-* Asymptotics generally give no assurances about finite sample performance
-  * The kinds of asymptotics that do are orders of magnitude more difficult to work with
-* Asymptotics form the basis for frequency interpretation of probabilities 
-  (the long run proportion of times an event occurs)
-* To understand asymptotics, we need a very basic understanding of limits.
-
-
----
-## Numerical limits
-
-- Imagine a sequence
-
-  - $a_1 = .9$,
-  - $a_2 = .99$,
-  - $a_3 = .999$, ...
-
-- Clearly this sequence converges to $1$
-- Definition of a limit: For any fixed distance we can find a point in the sequence so that the sequence is closer to the limit than that distance from that point on
-
----
-
-## Limits of random variables
-
-- The problem is harder for random variables
-- Consider $\bar X_n$ the sample average of the first $n$ of a collection of $iid$ observations
-
-  - Example $\bar X_n$ could be the average of the result of $n$ coin flips (i.e. the sample proportion of heads)
-
-- We say that $\bar X_n$ converges in probability to a limit if for any fixed distance the  probability of $\bar X_n$ being closer (further away) than that distance from the limit converges to one (zero)
-
----
-
-## The Law of Large Numbers
-
-- Establishing that a random sequence converges to a limit is hard
-- Fortunately, we have a theorem that does all the work for us, called
-    the **Law of Large Numbers**
-- The law of large numbers states that if $X_1,\ldots X_n$ are iid from a population with mean $\mu$ and variance $\sigma^2$ then $\bar X_n$ converges in probability to $\mu$
-- (There are many variations on the LLN; we are using a particularly lazy version, my favorite kind of version)
-
----
-## Law of large numbers in action
-```{r, fig.height=4, fig.width=4}
-n <- 10000; means <- cumsum(rnorm(n)) / (1  : n)
-plot(1 : n, means, type = "l", lwd = 2, 
-     frame = FALSE, ylab = "cumulative means", xlab = "sample size")
-abline(h = 0)
-```
----
-## Discussion
-- An estimator is **consistent** if it converges to what you want to estimate
-  - Consistency is neither necessary nor sufficient for one estimator to be better than another
-  - Typically, good estimators are consistent; it's not too much to ask that if we go to the trouble of collecting an infinite amount of data that we get the right answer
-- The LLN basically states that the sample mean is consistent
-- The sample variance and the sample standard deviation are consistent as well
-- Recall also that the sample mean and the sample variance are unbiased as well
-- (The sample standard deviation is biased, by the way)
-
----
-
-## The Central Limit Theorem
-
-- The **Central Limit Theorem** (CLT) is one of the most important theorems in statistics
-- For our purposes, the CLT states that the distribution of averages of iid variables, properly normalized, becomes that of a standard normal as the sample size increases
-- The CLT applies in an endless variety of settings
-- Let $X_1,\ldots,X_n$ be a collection of iid random variables with mean $\mu$ and variance $\sigma^2$
-- Let $\bar X_n$ be their sample average
-- Then $\frac{\bar X_n - \mu}{\sigma / \sqrt{n}}$ has a distribution like that of a standard normal for large $n$.
-- Remember the form
-$$\frac{\bar X_n - \mu}{\sigma / \sqrt{n}} = 
-    \frac{\mbox{Estimate} - \mbox{Mean of estimate}}{\mbox{Std. Err. of estimate}}.
-$$
-- Usually, replacing the standard error by its estimated value doesn't change the CLT
-
----
-
-## Example
-
-- Simulate a standard normal random variable by rolling $n$ (six sided)
-- Let $X_i$ be the outcome for die $i$
-- Then note that $\mu = E[X_i] = 3.5$
-- $Var(X_i) = 2.92$ 
-- SE $\sqrt{2.92 / n} = 1.71 / \sqrt{n}$
-- Standardized mean
-$$
-    \frac{\bar X_n - 3.5}{1.71/\sqrt{n}}
-$$ 
-
----
-## Simulation of mean of $n$ dice
-```{r, echo = FALSE, fig.width=9, fig.height = 3}
-par(mfrow = c(1, 3))
-for (n in c(1, 2, 6)){
-  temp <- matrix(sample(1 : 6, n * 10000, replace = TRUE), ncol = n)
-  temp <- apply(temp, 1, mean)
-  temp <- (temp - 3.5) / (1.71 / sqrt(n)) 
-  dty <- density(temp)
-  plot(dty$x, dty$y, xlab = "", ylab = "density", type = "n", xlim = c(-3, 3), ylim = c(0, .5))
-  title(paste("sample mean of", n, "obs"))
-  lines(seq(-3, 3, length = 100), dnorm(seq(-3, 3, length = 100)), col = grey(.8), lwd = 3)
-  lines(dty$x, dty$y, lwd = 2)
-}
-```
----
-
-## Coin CLT
-
- - Let $X_i$ be the $0$ or $1$ result of the $i^{th}$ flip of a possibly unfair coin
-- The sample proportion, say $\hat p$, is the average of the coin flips
-- $E[X_i] = p$ and $Var(X_i) = p(1-p)$
-- Standard error of the mean is $\sqrt{p(1-p)/n}$
-- Then
-$$
-    \frac{\hat p - p}{\sqrt{p(1-p)/n}}
-$$
-will be approximately normally distributed
-
----
-
-```{r, echo = FALSE, fig.width=7.5, fig.height = 5}
-par(mfrow = c(2, 3))
-for (n in c(1, 10, 20)){
-  temp <- matrix(sample(0 : 1, n * 10000, replace = TRUE), ncol = n)
-  temp <- apply(temp, 1, mean)
-  temp <- (temp - .5) * 2 * sqrt(n)
-  dty <- density(temp)
-  plot(dty$x, dty$y, xlab = "", ylab = "density", type = "n", xlim = c(-3, 3), ylim = c(0, .5))
-  title(paste("sample mean of", n, "obs"))
-  lines(seq(-3, 3, length = 100), dnorm(seq(-3, 3, length = 100)), col = grey(.8), lwd = 3)
-  lines(dty$x, dty$y, lwd = 2)
-}
-for (n in c(1, 10, 20)){
-  temp <- matrix(sample(0 : 1, n * 10000, replace = TRUE, prob = c(.9, .1)), ncol = n)
-  temp <- apply(temp, 1, mean)
-  temp <- (temp - .1) / sqrt(.1 * .9 / n)
-  dty <- density(temp)
-  plot(dty$x, dty$y, xlab = "", ylab = "density", type = "n", xlim = c(-3, 3), ylim = c(0, .5))
-  title(paste("sample mean of", n, "obs"))
-  lines(seq(-3, 3, length = 100), dnorm(seq(-3, 3, length = 100)), col = grey(.8), lwd = 3)
-  lines(dty$x, dty$y, lwd = 2)
-}
-```
-
----
-
-## CLT in practice
-
-- In practice the CLT is mostly useful as an approximation
-$$
-    P\left( \frac{\bar X_n - \mu}{\sigma / \sqrt{n}} \leq z \right) \approx \Phi(z).  
-$$
-- Recall $1.96$ is a good approximation to the $.975^{th}$ quantile of the standard normal
-- Consider
-$$
-    \begin{eqnarray*}
-      .95 & \approx & P\left( -1.96 \leq \frac{\bar X_n - \mu}{\sigma / \sqrt{n}} \leq 1.96 \right)\\ \\
-      & =       & P\left(\bar X_n +1.96 \sigma/\sqrt{n} \geq \mu \geq \bar X_n - 1.96\sigma/\sqrt{n} \right),\\
-    \end{eqnarray*}
-$$
-
----
-
-## Confidence intervals
-
-- Therefore, according to the CLT, the probability that the random interval $$\bar X_n \pm z_{1-\alpha/2}\sigma / \sqrt{n}$$ contains $\mu$ is approximately 100$(1-\alpha)$%, where $z_{1-\alpha/2}$ is the $1-\alpha/2$ quantile of the standard normal distribution
-- This is called a $100(1 - \alpha)$% **confidence interval** for $\mu$
-- We can replace the unknown $\sigma$ with $s$
-
----
-## Give a confidence interval for the average height of sons
-in Galton's data
-```{r}
-library(UsingR);data(father.son); x <- father.son$sheight
-(mean(x) + c(-1, 1) * qnorm(.975) * sd(x) / sqrt(length(x))) / 12
-```
-
----
-
-## Sample proportions
-
-- In the event that each $X_i$ is $0$ or $1$ with common success probability $p$ then $\sigma^2 = p(1 - p)$
-- The interval takes the form
-$$
-    \hat p \pm z_{1 - \alpha/2}  \sqrt{\frac{p(1 - p)}{n}}
-$$
-- Replacing $p$ by $\hat p$ in the standard error results in what is called a Wald confidence interval for $p$
-- Also note that $p(1-p) \leq 1/4$ for $0 \leq p \leq 1$
-- Let $\alpha = .05$ so that $z_{1 -\alpha/2} = 1.96 \approx 2$ then
-$$
-    2  \sqrt{\frac{p(1 - p)}{n}} \leq 2 \sqrt{\frac{1}{4n}} = \frac{1}{\sqrt{n}} 
-$$
-- Therefore $\hat p \pm \frac{1}{\sqrt{n}}$ is a quick CI estimate for $p$
-
----
-## Example
-* Your campaign advisor told you that in a random sample of 100 likely voters,
-  56 intent to vote for you. 
-  * Can you relax? Do you have this race in the bag?
-  * Without access to a computer or calculator, how precise is this estimate?
-* `1/sqrt(100)=.1` so a back of the envelope calculation gives an approximate 95% interval of `(0.46, 0.66)`
-  * Not enough for you to relax, better go do more campaigning!
-* Rough guidelines, 100 for 1 decimal place, 10,000 for 2, 1,000,000 for 3.
-```{r}
-round(1 / sqrt(10 ^ (1 : 6)), 3)
-```
----
-## Poisson interval
-* A nuclear pump failed 5 times out of 94.32 days, give a 95% confidence interval for the failure rate per day?
-* $X \sim Poisson(\lambda t)$.
-* Estimate $\hat \lambda = X/t$
-* $Var(\hat \lambda) = \lambda / t$ 
-$$
-\frac{\hat \lambda - \lambda}{\sqrt{\hat \lambda / t}} 
-= 
-\frac{X - t \lambda}{\sqrt{X}} 
-\rightarrow N(0,1)
-$$
-* This isn't the best interval.
-  * There are better asymptotic intervals.
-  * You can get an exact CI in this case.
-
----
-### R code
-```{r}
-x <- 5; t <- 94.32; lambda <- x / t
-round(lambda + c(-1, 1) * qnorm(.975) * sqrt(lambda / t), 3)
-poisson.test(x, T = 94.32)$conf
-```
-
----
-## In the regression class
-```{r}
-exp(confint(glm(x ~ 1 + offset(log(t)), family = poisson(link = log))))
-```
-
+---
+title       : A trip to Asymptopia
+subtitle    : Statistical Inference
+author      : Brian Caffo, Jeff Leek, Roger Peng
+job         : Johns Hopkins Bloomberg School of Public Health
+logo        : bloomberg_shield.png
+framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
+highlighter : highlight.js  # {highlight.js, prettify, highlight}
+hitheme     : tomorrow      # 
+url:
+  lib: ../../librariesNew
+  assets: ../../assets
+widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
+mode        : selfcontained # {standalone, draft}
+---
+## Asymptotics
+* Asymptotics is the term for the behavior of statistics as the sample size (or some other relevant quantity) limits to infinity (or some other relevant number)
+* (Asymptopia is my name for the land of asymptotics, where everything works out well and there are no messes. The land of infinite data is nice that way.)
+* Asymptotics are incredibly useful for simple statistical inference and approximations 
+* (Not covered in this class) Asymptotics often lead to a nice understanding of procedures
+* Asymptotics generally give no assurances about finite sample performance
+  * The kinds of asymptotics that do are orders of magnitude more difficult to work with
+* Asymptotics form the basis for the frequency interpretation of probabilities 
+  (the long run proportion of times an event occurs)
+* To understand asymptotics, we need a very basic understanding of limits.
+
+
+---
+## Numerical limits
+
+- Imagine a sequence
+
+  - $a_1 = .9$,
+  - $a_2 = .99$,
+  - $a_3 = .999$, ...
+
+- Clearly this sequence converges to $1$
+- Definition of a limit: For any fixed distance we can find a point in the sequence so that the sequence is closer to the limit than that distance from that point on
+
+---
+
+## Limits of random variables
+
+- The problem is harder for random variables
+- Consider $\bar X_n$ the sample average of the first $n$ of a collection of $iid$ observations
+
+  - Example: $\bar X_n$ could be the average of the result of $n$ coin flips (i.e. the sample proportion of heads)
+
+- We say that $\bar X_n$ converges in probability to a limit if for any fixed distance the  probability of $\bar X_n$ being closer (further away) than that distance from the limit converges to one (zero)
+
+---
+
+## The Law of Large Numbers
+
+- Establishing that a random sequence converges to a limit is hard
+- Fortunately, we have a theorem that does all the work for us, called
+    the **Law of Large Numbers**
+- The law of large numbers states that if $X_1,\ldots, X_n$ are iid from a population with mean $\mu$ and variance $\sigma^2$ then $\bar X_n$ converges in probability to $\mu$
+- (There are many variations on the LLN; we are using a particularly lazy version, my favorite kind of version)
+
+---
+## Law of large numbers in action
+```{r, fig.height=4, fig.width=4}
+n <- 10000; means <- cumsum(rnorm(n)) / (1  : n)
+plot(1 : n, means, type = "l", lwd = 2, 
+     frame = FALSE, ylab = "cumulative means", xlab = "sample size")
+abline(h = 0)
+```
+---
+## Discussion
+- An estimator is **consistent** if it converges to what you want to estimate
+  - Consistency is neither necessary nor sufficient for one estimator to be better than another
+  - Typically, good estimators are consistent; it's not too much to ask that if we go to the trouble of collecting an infinite amount of data that we get the right answer
+- The LLN basically states that the sample mean is consistent
+- The sample variance and the sample standard deviation are consistent as well
+- Recall also that the sample mean and the sample variance are unbiased as well
+- (The sample standard deviation is biased, by the way)
+
+---
+
+## The Central Limit Theorem
+
+- The **Central Limit Theorem** (CLT) is one of the most important theorems in statistics
+- For our purposes, the CLT states that the distribution of averages of iid variables, properly normalized, becomes that of a standard normal as the sample size increases
+- The CLT applies in an endless variety of settings
+- Let $X_1,\ldots,X_n$ be a collection of iid random variables with mean $\mu$ and variance $\sigma^2$
+- Let $\bar X_n$ be their sample average
+- Then $\frac{\bar X_n - \mu}{\sigma / \sqrt{n}}$ has a distribution like that of a standard normal for large $n$.
+- Remember the form
+$$\frac{\bar X_n - \mu}{\sigma / \sqrt{n}} = 
+    \frac{\mbox{Estimate} - \mbox{Mean of estimate}}{\mbox{Std. Err. of estimate}}.
+$$
+- Usually, replacing the standard error by its estimated value doesn't change the CLT
+
+---
+
+## Example
+
+- Simulate a standard normal random variable by rolling $n$ (six-sided) dice
+- Let $X_i$ be the outcome for die $i$
+- Then note that $\mu = E[X_i] = 3.5$
+- $Var(X_i) = 2.92$ 
+- SE $\sqrt{2.92 / n} = 1.71 / \sqrt{n}$
+- Standardized mean
+$$
+    \frac{\bar X_n - 3.5}{1.71/\sqrt{n}}
+$$ 
+
+---
+## Simulation of mean of $n$ dice
+```{r, echo = FALSE, fig.width=9, fig.height = 3}
+par(mfrow = c(1, 3))
+for (n in c(1, 2, 6)){
+  temp <- matrix(sample(1 : 6, n * 10000, replace = TRUE), ncol = n)
+  temp <- apply(temp, 1, mean)
+  temp <- (temp - 3.5) / (1.71 / sqrt(n)) 
+  dty <- density(temp)
+  plot(dty$x, dty$y, xlab = "", ylab = "density", type = "n", xlim = c(-3, 3), ylim = c(0, .5))
+  title(paste("sample mean of", n, "obs"))
+  lines(seq(-3, 3, length = 100), dnorm(seq(-3, 3, length = 100)), col = grey(.8), lwd = 3)
+  lines(dty$x, dty$y, lwd = 2)
+}
+```
+---
+
+## Coin CLT
+
+- Let $X_i$ be the $0$ or $1$ result of the $i^{th}$ flip of a possibly unfair coin
+- The sample proportion, say $\hat p$, is the average of the coin flips
+- $E[X_i] = p$ and $Var(X_i) = p(1-p)$
+- Standard error of the mean is $\sqrt{p(1-p)/n}$
+- Then
+$$
+    \frac{\hat p - p}{\sqrt{p(1-p)/n}}
+$$
+will be approximately normally distributed
+
+---
+
+```{r, echo = FALSE, fig.width=7.5, fig.height = 5}
+par(mfrow = c(2, 3))
+for (n in c(1, 10, 20)){
+  temp <- matrix(sample(0 : 1, n * 10000, replace = TRUE), ncol = n)
+  temp <- apply(temp, 1, mean)
+  temp <- (temp - .5) * 2 * sqrt(n)
+  dty <- density(temp)
+  plot(dty$x, dty$y, xlab = "", ylab = "density", type = "n", xlim = c(-3, 3), ylim = c(0, .5))
+  title(paste("sample mean of", n, "obs"))
+  lines(seq(-3, 3, length = 100), dnorm(seq(-3, 3, length = 100)), col = grey(.8), lwd = 3)
+  lines(dty$x, dty$y, lwd = 2)
+}
+for (n in c(1, 10, 20)){
+  temp <- matrix(sample(0 : 1, n * 10000, replace = TRUE, prob = c(.9, .1)), ncol = n)
+  temp <- apply(temp, 1, mean)
+  temp <- (temp - .1) / sqrt(.1 * .9 / n)
+  dty <- density(temp)
+  plot(dty$x, dty$y, xlab = "", ylab = "density", type = "n", xlim = c(-3, 3), ylim = c(0, .5))
+  title(paste("sample mean of", n, "obs"))
+  lines(seq(-3, 3, length = 100), dnorm(seq(-3, 3, length = 100)), col = grey(.8), lwd = 3)
+  lines(dty$x, dty$y, lwd = 2)
+}
+```
+
+---
+
+## CLT in practice
+
+- In practice the CLT is mostly useful as an approximation
+$$
+    P\left( \frac{\bar X_n - \mu}{\sigma / \sqrt{n}} \leq z \right) \approx \Phi(z).  
+$$
+- Recall $1.96$ is a good approximation to the $.975$ quantile (the $97.5^{th}$ percentile) of the standard normal
+- Consider
+$$
+    \begin{eqnarray*}
+      .95 & \approx & P\left( -1.96 \leq \frac{\bar X_n - \mu}{\sigma / \sqrt{n}} \leq 1.96 \right)\\ \\
+      & =       & P\left(\bar X_n +1.96 \sigma/\sqrt{n} \geq \mu \geq \bar X_n - 1.96\sigma/\sqrt{n} \right).
+    \end{eqnarray*}
+$$
+
+---
+
+## Confidence intervals
+
+- Therefore, according to the CLT, the probability that the random interval $$\bar X_n \pm z_{1-\alpha/2}\sigma / \sqrt{n}$$ contains $\mu$ is approximately 100$(1-\alpha)$%, where $z_{1-\alpha/2}$ is the $1-\alpha/2$ quantile of the standard normal distribution
+- This is called a $100(1 - \alpha)$% **confidence interval** for $\mu$
+- We can replace the unknown $\sigma$ with $s$
+
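+---
+
+## Checking the coverage by simulation
+
+The coverage claim can be checked empirically. The chunk below is a minimal sketch rather than part of the original lecture: it assumes fair die rolls (so $\mu = 3.5$), and the seed, the sample size `n = 30` and the number of simulations `nosim` are arbitrary illustrative choices. It builds $\bar X_n \pm z_{.975} s / \sqrt{n}$ for every simulated sample and reports the proportion of intervals that contain $\mu$, which should land near, though not exactly at, 95%.
+
+```{r}
+set.seed(1)                        # arbitrary seed, only for reproducibility
+n <- 30; nosim <- 10000            # illustrative sample size and number of simulations
+mu <- 3.5                          # true mean of a fair six-sided die
+rolls <- matrix(sample(1 : 6, n * nosim, replace = TRUE), nosim)
+xbar <- apply(rolls, 1, mean)      # one sample mean per simulated sample
+s <- apply(rolls, 1, sd)           # one sample standard deviation per simulated sample
+ll <- xbar - qnorm(.975) * s / sqrt(n)
+ul <- xbar + qnorm(.975) * s / sqrt(n)
+coverage <- mean(ll <= mu & mu <= ul)   # proportion of intervals containing mu
+coverage
+```
+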
+---
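+## Give a confidence interval for the average height of sons
+in Pearson's father-son data (`father.son` in the UsingR package; heights are in inches, so dividing by 12 reports the interval in feet)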
+```{r}
+library(UsingR);data(father.son); x <- father.son$sheight
+(mean(x) + c(-1, 1) * qnorm(.975) * sd(x) / sqrt(length(x))) / 12
+```
+
+---
+
+## Sample proportions
+
+- In the event that each $X_i$ is $0$ or $1$ with common success probability $p$ then $\sigma^2 = p(1 - p)$
+- The interval takes the form
+$$
+    \hat p \pm z_{1 - \alpha/2}  \sqrt{\frac{p(1 - p)}{n}}
+$$
+- Replacing $p$ by $\hat p$ in the standard error results in what is called a Wald confidence interval for $p$
+- Also note that $p(1-p) \leq 1/4$ for $0 \leq p \leq 1$
+- Let $\alpha = .05$ so that $z_{1 -\alpha/2} = 1.96 \approx 2$ then
+$$
+    2  \sqrt{\frac{p(1 - p)}{n}} \leq 2 \sqrt{\frac{1}{4n}} = \frac{1}{\sqrt{n}} 
+$$
+- Therefore $\hat p \pm \frac{1}{\sqrt{n}}$ is a quick CI estimate for $p$
+
+---
+## Example
+* Your campaign advisor told you that in a random sample of 100 likely voters,
+  56 intend to vote for you. 
+  * Can you relax? Do you have this race in the bag?
+  * Without access to a computer or calculator, how precise is this estimate?
+* `1/sqrt(100)=.1` so a back of the envelope calculation gives an approximate 95% interval of `(0.46, 0.66)`
+  * Not enough for you to relax, better go do more campaigning!
+* Rough guideline: a sample of about 100 gives 1 decimal place of accuracy, 10,000 gives 2, and 1,000,000 gives 3.
+```{r}
+round(1 / sqrt(10 ^ (1 : 6)), 3)
+```
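+
+---
+## Example, continued
+
+With a computer at hand, the back of the envelope interval can be compared with the Wald interval from the previous slide. This is a minimal sketch using the 56 out of 100 figures from the example; the variable names `phat`, `wald` and `quick` are illustrative, and `binom.test` (from the stats package) is included only as an exact cross-check.
+
+```{r}
+phat <- 56 / 100; n <- 100
+# Wald interval: plug phat into the standard error
+wald <- phat + c(-1, 1) * qnorm(.975) * sqrt(phat * (1 - phat) / n)
+# Quick interval: phat plus or minus 1 / sqrt(n)
+quick <- phat + c(-1, 1) / sqrt(n)
+round(wald, 3)
+round(quick, 3)
+binom.test(56, 100)$conf.int       # exact binomial interval for comparison
+```
+
+The Wald interval is slightly narrower than the quick one because $\hat p (1 - \hat p)$ is just under its upper bound of $1/4$. Its lower limit is still below one half, so the conclusion of the example stands.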
+---
+## Poisson interval
+* A nuclear pump failed 5 times in 94.32 days of operation. Give a 95% confidence interval for the failure rate per day.
+* $X \sim Poisson(\lambda t)$.
+* Estimate $\hat \lambda = X/t$
+* $Var(\hat \lambda) = \lambda / t$ 
+$$
+\frac{\hat \lambda - \lambda}{\sqrt{\hat \lambda / t}} 
+= 
+\frac{X - t \lambda}{\sqrt{X}} 
+\rightarrow N(0,1)
+$$
+* This isn't the best interval.
+  * There are better asymptotic intervals.
+  * You can get an exact CI in this case.
+
+---
+### R code
+```{r}
+x <- 5; t <- 94.32; lambda <- x / t
+round(lambda + c(-1, 1) * qnorm(.975) * sqrt(lambda / t), 3)
+poisson.test(x, T = t)$conf.int
+```
+
+---
+## In the regression class
+```{r}
+exp(confint(glm(x ~ 1 + offset(log(t)), family = poisson(link = log))))
+```
+
diff --git a/06_StatisticalInference/02_02_Asymptopia/index.html b/06_StatisticalInference/02_02_Asymptopia/index.html
index 0cf77bf3a..025ef3536 100644
--- a/06_StatisticalInference/02_02_Asymptopia/index.html
+++ b/06_StatisticalInference/02_02_Asymptopia/index.html
@@ -1,580 +1,588 @@
-<!DOCTYPE html>
-<html>
-<head>
-  <title>A trip to Asymptopia</title>
-  <meta charset="utf-8">
-  <meta name="description" content="A trip to Asymptopia">
-  <meta name="author" content="Brian Caffo, Jeff Leek, Roger Peng">
-  <meta name="generator" content="slidify" />
-  <meta name="apple-mobile-web-app-capable" content="yes">
-  <meta http-equiv="X-UA-Compatible" content="chrome=1">
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/default.css" media="all" >
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/phone.css" 
-    media="only screen and (max-device-width: 480px)" >
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/slidify.css" >
-  <link rel="stylesheet" href="../../librariesNew/highlighters/highlight.js/css/tomorrow.css" />
-  <base target="_blank"> <!-- This amazingness opens all links in a new tab. -->  
-  
-  <!-- Grab CDN jQuery, fall back to local if offline -->
-  <script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.7.min.js"></script>
-  <script>window.jQuery || document.write('<script src="../../librariesNew/widgets/quiz/js/jquery.js"><\/script>')</script> 
-  <script data-main="../../librariesNew/frameworks/io2012/js/slides" 
-    src="../../librariesNew/frameworks/io2012/js/require-1.0.8.min.js">
-  </script>
-  
-  
-
-</head>
-<body style="opacity: 0">
-  <slides class="layout-widescreen">
-    
-    <!-- LOGO SLIDE -->
-        <slide class="title-slide segue nobackground">
-  <aside class="gdbar">
-    <img src="../../assets/img/bloomberg_shield.png">
-  </aside>
-  <hgroup class="auto-fadein">
-    <h1>A trip to Asymptopia</h1>
-    <h2>Statistical Inference</h2>
-    <p>Brian Caffo, Jeff Leek, Roger Peng<br/>Johns Hopkins Bloomberg School of Public Health</p>
-  </hgroup>
-  <article></article>  
-</slide>
-    
-
-    <!-- SLIDES -->
-    <slide class="" id="slide-1" style="background:;">
-  <hgroup>
-    <h2>Asymptotics</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Asymptotics is the term for the behavior of statistics as the sample size (or some other relevant quantity) limits to infinity (or some other relevant number)</li>
-<li>(Asymptopia is my name for the land of asymptotics, where everything works out well and there&#39;s no messes. The land of infinite data is nice that way.)</li>
-<li>Asymptotics are incredibly useful for simple statistical inference and approximations </li>
-<li>(Not covered in this class) Asymptotics often lead to nice understanding of procedures</li>
-<li>Asymptotics generally give no assurances about finite sample performance
-
-<ul>
-<li>The kinds of asymptotics that do are orders of magnitude more difficult to work with</li>
-</ul></li>
-<li>Asymptotics form the basis for frequency interpretation of probabilities 
-(the long run proportion of times an event occurs)</li>
-<li>To understand asymptotics, we need a very basic understanding of limits.</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-2" style="background:;">
-  <hgroup>
-    <h2>Numerical limits</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li><p>Imagine a sequence</p>
-
-<ul>
-<li>\(a_1 = .9\),</li>
-<li>\(a_2 = .99\),</li>
-<li>\(a_3 = .999\), ...</li>
-</ul></li>
-<li><p>Clearly this sequence converges to \(1\)</p></li>
-<li><p>Definition of a limit: For any fixed distance we can find a point in the sequence so that the sequence is closer to the limit than that distance from that point on</p></li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-3" style="background:;">
-  <hgroup>
-    <h2>Limits of random variables</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>The problem is harder for random variables</li>
-<li><p>Consider \(\bar X_n\) the sample average of the first \(n\) of a collection of \(iid\) observations</p>
-
-<ul>
-<li>Example \(\bar X_n\) could be the average of the result of \(n\) coin flips (i.e. the sample proportion of heads)</li>
-</ul></li>
-<li><p>We say that \(\bar X_n\) converges in probability to a limit if for any fixed distance the  probability of \(\bar X_n\) being closer (further away) than that distance from the limit converges to one (zero)</p></li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-4" style="background:;">
-  <hgroup>
-    <h2>The Law of Large Numbers</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Establishing that a random sequence converges to a limit is hard</li>
-<li>Fortunately, we have a theorem that does all the work for us, called
-the <strong>Law of Large Numbers</strong></li>
-<li>The law of large numbers states that if \(X_1,\ldots X_n\) are iid from a population with mean \(\mu\) and variance \(\sigma^2\) then \(\bar X_n\) converges in probability to \(\mu\)</li>
-<li>(There are many variations on the LLN; we are using a particularly lazy version, my favorite kind of version)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-5" style="background:;">
-  <hgroup>
-    <h2>Law of large numbers in action</h2>
-  </hgroup>
-  <article data-timings="">
-    <pre><code class="r">n &lt;- 10000; means &lt;- cumsum(rnorm(n)) / (1  : n)
-plot(1 : n, means, type = &quot;l&quot;, lwd = 2, 
-     frame = FALSE, ylab = &quot;cumulative means&quot;, xlab = &quot;sample size&quot;)
-abline(h = 0)
-</code></pre>
-
-<div class="rimage center"><img src="fig/unnamed-chunk-1.png" title="plot of chunk unnamed-chunk-1" alt="plot of chunk unnamed-chunk-1" class="plot" /></div>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-6" style="background:;">
-  <hgroup>
-    <h2>Discussion</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>An estimator is <strong>consistent</strong> if it converges to what you want to estimate
-
-<ul>
-<li>Consistency is neither necessary nor sufficient for one estimator to be better than another</li>
-<li>Typically, good estimators are consistent; it&#39;s not too much to ask that if we go to the trouble of collecting an infinite amount of data that we get the right answer</li>
-</ul></li>
-<li>The LLN basically states that the sample mean is consistent</li>
-<li>The sample variance and the sample standard deviation are consistent as well</li>
-<li>Recall also that the sample mean and the sample variance are unbiased as well</li>
-<li>(The sample standard deviation is biased, by the way)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-7" style="background:;">
-  <hgroup>
-    <h2>The Central Limit Theorem</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>The <strong>Central Limit Theorem</strong> (CLT) is one of the most important theorems in statistics</li>
-<li>For our purposes, the CLT states that the distribution of averages of iid variables, properly normalized, becomes that of a standard normal as the sample size increases</li>
-<li>The CLT applies in an endless variety of settings</li>
-<li>Let \(X_1,\ldots,X_n\) be a collection of iid random variables with mean \(\mu\) and variance \(\sigma^2\)</li>
-<li>Let \(\bar X_n\) be their sample average</li>
-<li>Then \(\frac{\bar X_n - \mu}{\sigma / \sqrt{n}}\) has a distribution like that of a standard normal for large \(n\).</li>
-<li>Remember the form
-\[\frac{\bar X_n - \mu}{\sigma / \sqrt{n}} = 
-\frac{\mbox{Estimate} - \mbox{Mean of estimate}}{\mbox{Std. Err. of estimate}}.
-\]</li>
-<li>Usually, replacing the standard error by its estimated value doesn&#39;t change the CLT</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-8" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Simulate a standard normal random variable by rolling \(n\) (six sided)</li>
-<li>Let \(X_i\) be the outcome for die \(i\)</li>
-<li>Then note that \(\mu = E[X_i] = 3.5\)</li>
-<li>\(Var(X_i) = 2.92\) </li>
-<li>SE \(\sqrt{2.92 / n} = 1.71 / \sqrt{n}\)</li>
-<li>Standardized mean
-\[
-\frac{\bar X_n - 3.5}{1.71/\sqrt{n}}
-\] </li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-9" style="background:;">
-  <hgroup>
-    <h2>Simulation of mean of \(n\) dice</h2>
-  </hgroup>
-  <article data-timings="">
-    <div class="rimage center"><img src="fig/unnamed-chunk-2.png" title="plot of chunk unnamed-chunk-2" alt="plot of chunk unnamed-chunk-2" class="plot" /></div>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-10" style="background:;">
-  <hgroup>
-    <h2>Coin CLT</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Let \(X_i\) be the \(0\) or \(1\) result of the \(i^{th}\) flip of a possibly unfair coin
-
-<ul>
-<li>The sample proportion, say \(\hat p\), is the average of the coin flips</li>
-<li>\(E[X_i] = p\) and \(Var(X_i) = p(1-p)\)</li>
-<li>Standard error of the mean is \(\sqrt{p(1-p)/n}\)</li>
-<li>Then
-\[
-\frac{\hat p - p}{\sqrt{p(1-p)/n}}
-\]
-will be approximately normally distributed</li>
-</ul></li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-11" style="background:;">
-  <article data-timings="">
-    <div class="rimage center"><img src="fig/unnamed-chunk-3.png" title="plot of chunk unnamed-chunk-3" alt="plot of chunk unnamed-chunk-3" class="plot" /></div>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-12" style="background:;">
-  <hgroup>
-    <h2>CLT in practice</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>In practice the CLT is mostly useful as an approximation
-\[
-P\left( \frac{\bar X_n - \mu}{\sigma / \sqrt{n}} \leq z \right) \approx \Phi(z).  
-\]</li>
-<li>Recall \(1.96\) is a good approximation to the \(.975^{th}\) quantile of the standard normal</li>
-<li>Consider
-\[
-\begin{eqnarray*}
-  .95 & \approx & P\left( -1.96 \leq \frac{\bar X_n - \mu}{\sigma / \sqrt{n}} \leq 1.96 \right)\\ \\
-  & =       & P\left(\bar X_n +1.96 \sigma/\sqrt{n} \geq \mu \geq \bar X_n - 1.96\sigma/\sqrt{n} \right),\\
-\end{eqnarray*}
-\]</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-13" style="background:;">
-  <hgroup>
-    <h2>Confidence intervals</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Therefore, according to the CLT, the probability that the random interval \[\bar X_n \pm z_{1-\alpha/2}\sigma / \sqrt{n}\] contains \(\mu\) is approximately 100\((1-\alpha)\)%, where \(z_{1-\alpha/2}\) is the \(1-\alpha/2\) quantile of the standard normal distribution</li>
-<li>This is called a \(100(1 - \alpha)\)% <strong>confidence interval</strong> for \(\mu\)</li>
-<li>We can replace the unknown \(\sigma\) with \(s\)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-14" style="background:;">
-  <hgroup>
-    <h2>Give a confidence interval for the average height of sons</h2>
-  </hgroup>
-  <article data-timings="">
-    <p>in Galton&#39;s data</p>
-
-<pre><code class="r">library(UsingR);data(father.son); x &lt;- father.son$sheight
-(mean(x) + c(-1, 1) * qnorm(.975) * sd(x) / sqrt(length(x))) / 12
-</code></pre>
-
-<pre><code>[1] 5.710 5.738
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-15" style="background:;">
-  <hgroup>
-    <h2>Sample proportions</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>In the event that each \(X_i\) is \(0\) or \(1\) with common success probability \(p\) then \(\sigma^2 = p(1 - p)\)</li>
-<li>The interval takes the form
-\[
-\hat p \pm z_{1 - \alpha/2}  \sqrt{\frac{p(1 - p)}{n}}
-\]</li>
-<li>Replacing \(p\) by \(\hat p\) in the standard error results in what is called a Wald confidence interval for \(p\)</li>
-<li>Also note that \(p(1-p) \leq 1/4\) for \(0 \leq p \leq 1\)</li>
-<li>Let \(\alpha = .05\) so that \(z_{1 -\alpha/2} = 1.96 \approx 2\) then
-\[
-2  \sqrt{\frac{p(1 - p)}{n}} \leq 2 \sqrt{\frac{1}{4n}} = \frac{1}{\sqrt{n}} 
-\]</li>
-<li>Therefore \(\hat p \pm \frac{1}{\sqrt{n}}\) is a quick CI estimate for \(p\)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-16" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Your campaign advisor told you that in a random sample of 100 likely voters,
-56 intent to vote for you. 
-
-<ul>
-<li>Can you relax? Do you have this race in the bag?</li>
-<li>Without access to a computer or calculator, how precise is this estimate?</li>
-</ul></li>
-<li><code>1/sqrt(100)=.1</code> so a back of the envelope calculation gives an approximate 95% interval of <code>(0.46, 0.66)</code>
-
-<ul>
-<li>Not enough for you to relax, better go do more campaigning!</li>
-</ul></li>
-<li>Rough guidelines, 100 for 1 decimal place, 10,000 for 2, 1,000,000 for 3.</li>
-</ul>
-
-<pre><code class="r">round(1 / sqrt(10 ^ (1 : 6)), 3)
-</code></pre>
-
-<pre><code>[1] 0.316 0.100 0.032 0.010 0.003 0.001
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-17" style="background:;">
-  <hgroup>
-    <h2>Poisson interval</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>A nuclear pump failed 5 times out of 94.32 days, give a 95% confidence interval for the failure rate per day?</li>
-<li>\(X \sim Poisson(\lambda t)\).</li>
-<li>Estimate \(\hat \lambda = X/t\)</li>
-<li>\(Var(\hat \lambda) = \lambda / t\) 
-\[
-\frac{\hat \lambda - \lambda}{\sqrt{\hat \lambda / t}} 
-= 
-\frac{X - t \lambda}{\sqrt{X}} 
-\rightarrow N(0,1)
-\]</li>
-<li>This isn&#39;t the best interval.
-
-<ul>
-<li>There are better asymptotic intervals.</li>
-<li>You can get an exact CI in this case.</li>
-</ul></li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-18" style="background:;">
-  <hgroup>
-    <h3>R code</h3>
-  </hgroup>
-  <article data-timings="">
-    <pre><code class="r">x &lt;- 5; t &lt;- 94.32; lambda &lt;- x / t
-round(lambda + c(-1, 1) * qnorm(.975) * sqrt(lambda / t), 3)
-</code></pre>
-
-<pre><code>[1] 0.007 0.099
-</code></pre>
-
-<pre><code class="r">poisson.test(x, T = 94.32)$conf
-</code></pre>
-
-<pre><code>[1] 0.01721 0.12371
-attr(,&quot;conf.level&quot;)
-[1] 0.95
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-19" style="background:;">
-  <hgroup>
-    <h2>In the regression class</h2>
-  </hgroup>
-  <article data-timings="">
-    <pre><code class="r">exp(confint(glm(x ~ 1 + offset(log(t)), family = poisson(link = log))))
-</code></pre>
-
-<pre><code>  2.5 %  97.5 % 
-0.01901 0.11393 
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-    <slide class="backdrop"></slide>
-  </slides>
-  <div class="pagination pagination-small" id='io2012-ptoc' style="display:none;">
-    <ul>
-      <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=1 title='Asymptotics'>
-         1
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=2 title='Numerical limits'>
-         2
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=3 title='Limits of random variables'>
-         3
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=4 title='The Law of Large Numbers'>
-         4
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=5 title='Law of large numbers in action'>
-         5
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=6 title='Discussion'>
-         6
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=7 title='The Central Limit Theorem'>
-         7
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=8 title='Example'>
-         8
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=9 title='Simulation of mean of \(n\) dice'>
-         9
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=10 title='Coin CLT'>
-         10
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=11 title=''>
-         11
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=12 title='CLT in practice'>
-         12
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=13 title='Confidence intervals'>
-         13
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=14 title='Give a confidence interval for the average height of sons'>
-         14
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=15 title='Sample proportions'>
-         15
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=16 title='Example'>
-         16
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=17 title='Poisson interval'>
-         17
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=18 title='R code'>
-         18
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=19 title='In the regression class'>
-         19
-      </a>
-    </li>
-  </ul>
-  </div>  <!--[if IE]>
-    <script 
-      src="http://ajax.googleapis.com/ajax/libs/chrome-frame/1/CFInstall.min.js">  
-    </script>
-    <script>CFInstall.check({mode: 'overlay'});</script>
-  <![endif]-->
-</body>
-  <!-- Load Javascripts for Widgets -->
-  
-  <!-- MathJax: Fall back to local if CDN offline but local image fonts are not supported (saves >100MB) -->
-  <script type="text/x-mathjax-config">
-    MathJax.Hub.Config({
-      tex2jax: {
-        inlineMath: [['$','$'], ['\\(','\\)']],
-        processEscapes: true
-      }
-    });
-  </script>
-  <script type="text/javascript" src="http://cdn.mathjax.org/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
-  <!-- <script src="https://c328740.ssl.cf1.rackcdn.com/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
-  </script> -->
-  <script>window.MathJax || document.write('<script type="text/x-mathjax-config">MathJax.Hub.Config({"HTML-CSS":{imageFont:null}});<\/script><script src="../../librariesNew/widgets/mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"><\/script>')
-</script>
-<!-- LOAD HIGHLIGHTER JS FILES -->
-  <script src="../../librariesNew/highlighters/highlight.js/highlight.pack.js"></script>
-  <script>hljs.initHighlightingOnLoad();</script>
-  <!-- DONE LOADING HIGHLIGHTER JS FILES -->
-   
+<!DOCTYPE html>
+<html>
+<head>
+  <title>A trip to Asymptopia</title>
+  <meta charset="utf-8">
+  <meta name="description" content="A trip to Asymptopia">
+  <meta name="author" content="Brian Caffo, Jeff Leek, Roger Peng">
+  <meta name="generator" content="slidify" />
+  <meta name="apple-mobile-web-app-capable" content="yes">
+  <meta http-equiv="X-UA-Compatible" content="chrome=1">
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/default.css" media="all" >
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/phone.css" 
+    media="only screen and (max-device-width: 480px)" >
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/slidify.css" >
+  <link rel="stylesheet" href="../../librariesNew/highlighters/highlight.js/css/tomorrow.css" />
+  <base target="_blank"> <!-- This amazingness opens all links in a new tab. -->  
+  
+  <!-- Grab CDN jQuery, fall back to local if offline -->
+  <script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.7.min.js"></script>
+  <script>window.jQuery || document.write('<script src="../../librariesNew/widgets/quiz/js/jquery.js"><\/script>')</script> 
+  <script data-main="../../librariesNew/frameworks/io2012/js/slides" 
+    src="../../librariesNew/frameworks/io2012/js/require-1.0.8.min.js">
+  </script>
+  
+  
+
+</head>
+<body style="opacity: 0">
+  <slides class="layout-widescreen">
+    
+    <!-- LOGO SLIDE -->
+        <slide class="title-slide segue nobackground">
+  <aside class="gdbar">
+    <img src="../../assets/img/bloomberg_shield.png">
+  </aside>
+  <hgroup class="auto-fadein">
+    <h1>A trip to Asymptopia</h1>
+    <h2>Statistical Inference</h2>
+    <p>Brian Caffo, Jeff Leek, Roger Peng<br/>Johns Hopkins Bloomberg School of Public Health</p>
+  </hgroup>
+  <article></article>  
+</slide>
+    
+
+    <!-- SLIDES -->
+    <slide class="" id="slide-1" style="background:;">
+  <hgroup>
+    <h2>Asymptotics</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Asymptotics is the term for the behavior of statistics as the sample size (or some other relevant quantity) tends to infinity (or some other relevant number)</li>
+<li>(Asymptopia is my name for the land of asymptotics, where everything works out well and there are no messes. The land of infinite data is nice that way.)</li>
+<li>Asymptotics are incredibly useful for simple statistical inference and approximations </li>
+<li>(Not covered in this class) Asymptotics often lead to nice understanding of procedures</li>
+<li>Asymptotics generally give no assurances about finite sample performance
+
+<ul>
+<li>The kinds of asymptotics that do are orders of magnitude more difficult to work with</li>
+</ul></li>
+<li>Asymptotics form the basis for frequency interpretation of probabilities 
+(the long run proportion of times an event occurs)</li>
+<li>To understand asymptotics, we need a very basic understanding of limits.</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-2" style="background:;">
+  <hgroup>
+    <h2>Numerical limits</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li><p>Imagine a sequence</p>
+
+<ul>
+<li>\(a_1 = .9\),</li>
+<li>\(a_2 = .99\),</li>
+<li>\(a_3 = .999\), ...</li>
+</ul></li>
+<li><p>Clearly this sequence converges to \(1\)</p></li>
+<li><p>Definition of a limit: For any fixed distance we can find a point in the sequence so that the sequence is closer to the limit than that distance from that point on</p></li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-3" style="background:;">
+  <hgroup>
+    <h2>Limits of random variables</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>The problem is harder for random variables</li>
+<li><p>Consider \(\bar X_n\) the sample average of the first \(n\) of a collection of \(iid\) observations</p>
+
+<ul>
+<li>Example \(\bar X_n\) could be the average of the result of \(n\) coin flips (i.e. the sample proportion of heads)</li>
+</ul></li>
+<li><p>We say that \(\bar X_n\) converges in probability to a limit if for any fixed distance the  probability of \(\bar X_n\) being closer (further away) than that distance from the limit converges to one (zero)</p></li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-4" style="background:;">
+  <hgroup>
+    <h2>The Law of Large Numbers</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Establishing that a random sequence converges to a limit is hard</li>
+<li>Fortunately, we have a theorem that does all the work for us, called
+the <strong>Law of Large Numbers</strong></li>
+<li>The law of large numbers states that if \(X_1,\ldots,X_n\) are iid from a population with mean \(\mu\) and variance \(\sigma^2\) then \(\bar X_n\) converges in probability to \(\mu\)</li>
+<li>(There are many variations on the LLN; we are using a particularly lazy version, my favorite kind of version)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-5" style="background:;">
+  <hgroup>
+    <h2>Law of large numbers in action</h2>
+  </hgroup>
+  <article data-timings="">
+    <pre><code class="r">n &lt;- 10000
+means &lt;- cumsum(rnorm(n))/(1:n)
+plot(1:n, means, type = &quot;l&quot;, lwd = 2, frame = FALSE, ylab = &quot;cumulative means&quot;, 
+    xlab = &quot;sample size&quot;)
+abline(h = 0)
+</code></pre>
+
+<p><img src="assets/fig/unnamed-chunk-1.png" alt="plot of chunk unnamed-chunk-1"> </p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-6" style="background:;">
+  <hgroup>
+    <h2>Discussion</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>An estimator is <strong>consistent</strong> if it converges to what you want to estimate
+
+<ul>
+<li>Consistency is neither necessary nor sufficient for one estimator to be better than another</li>
+<li>Typically, good estimators are consistent; it&#39;s not too much to ask that, if we go to the trouble of collecting an infinite amount of data, we get the right answer</li>
+</ul></li>
+<li>The LLN basically states that the sample mean is consistent</li>
+<li>The sample variance and the sample standard deviation are consistent as well</li>
+<li>Recall that the sample mean and the sample variance are also unbiased</li>
+<li>(The sample standard deviation is biased, by the way)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-7" style="background:;">
+  <hgroup>
+    <h2>The Central Limit Theorem</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>The <strong>Central Limit Theorem</strong> (CLT) is one of the most important theorems in statistics</li>
+<li>For our purposes, the CLT states that the distribution of averages of iid variables, properly normalized, becomes that of a standard normal as the sample size increases</li>
+<li>The CLT applies in an endless variety of settings</li>
+<li>Let \(X_1,\ldots,X_n\) be a collection of iid random variables with mean \(\mu\) and variance \(\sigma^2\)</li>
+<li>Let \(\bar X_n\) be their sample average</li>
+<li>Then \(\frac{\bar X_n - \mu}{\sigma / \sqrt{n}}\) has a distribution like that of a standard normal for large \(n\).</li>
+<li>Remember the form
+\[\frac{\bar X_n - \mu}{\sigma / \sqrt{n}} = 
+\frac{\mbox{Estimate} - \mbox{Mean of estimate}}{\mbox{Std. Err. of estimate}}.
+\]</li>
+<li>Usually, replacing the standard error by its estimated value doesn&#39;t change the CLT</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-8" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Simulate a standard normal random variable by rolling \(n\) (six-sided) dice</li>
+<li>Let \(X_i\) be the outcome for die \(i\)</li>
+<li>Then note that \(\mu = E[X_i] = 3.5\)</li>
+<li>\(Var(X_i) = 2.92\) </li>
+<li>SE \(\sqrt{2.92 / n} = 1.71 / \sqrt{n}\)</li>
+<li>Standardized mean
+\[
+\frac{\bar X_n - 3.5}{1.71/\sqrt{n}}
+\] </li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-9" style="background:;">
+  <hgroup>
+    <h2>Simulation of mean of \(n\) dice</h2>
+  </hgroup>
+  <article data-timings="">
+    <p><img src="assets/fig/unnamed-chunk-2.png" alt="plot of chunk unnamed-chunk-2"> </p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-10" style="background:;">
+  <hgroup>
+    <h2>Coin CLT</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Let \(X_i\) be the \(0\) or \(1\) result of the \(i^{th}\) flip of a possibly unfair coin
+
+<ul>
+<li>The sample proportion, say \(\hat p\), is the average of the coin flips</li>
+<li>\(E[X_i] = p\) and \(Var(X_i) = p(1-p)\)</li>
+<li>Standard error of the mean is \(\sqrt{p(1-p)/n}\)</li>
+<li>Then
+\[
+\frac{\hat p - p}{\sqrt{p(1-p)/n}}
+\]
+will be approximately normally distributed</li>
+</ul></li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-11" style="background:;">
+  <article data-timings="">
+    <p><img src="assets/fig/unnamed-chunk-3.png" alt="plot of chunk unnamed-chunk-3"> </p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-12" style="background:;">
+  <hgroup>
+    <h2>CLT in practice</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>In practice the CLT is mostly useful as an approximation
+\[
+P\left( \frac{\bar X_n - \mu}{\sigma / \sqrt{n}} \leq z \right) \approx \Phi(z).  
+\]</li>
+<li>Recall \(1.96\) is a good approximation to the \(.975^{th}\) quantile of the standard normal</li>
+<li>Consider
+\[
+\begin{eqnarray*}
+  .95 & \approx & P\left( -1.96 \leq \frac{\bar X_n - \mu}{\sigma / \sqrt{n}} \leq 1.96 \right)\\ \\
+  & =       & P\left(\bar X_n +1.96 \sigma/\sqrt{n} \geq \mu \geq \bar X_n - 1.96\sigma/\sqrt{n} \right),\\
+\end{eqnarray*}
+\]</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-13" style="background:;">
+  <hgroup>
+    <h2>Confidence intervals</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Therefore, according to the CLT, the probability that the random interval \[\bar X_n \pm z_{1-\alpha/2}\sigma / \sqrt{n}\] contains \(\mu\) is approximately 100\((1-\alpha)\)%, where \(z_{1-\alpha/2}\) is the \(1-\alpha/2\) quantile of the standard normal distribution</li>
+<li>This is called a \(100(1 - \alpha)\)% <strong>confidence interval</strong> for \(\mu\)</li>
+<li>We can replace the unknown \(\sigma\) with \(s\)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-14" style="background:;">
+  <hgroup>
+    <h2>Give a confidence interval for the average height of sons</h2>
+  </hgroup>
+  <article data-timings="">
+    <p>in Galton&#39;s data</p>
+
+<pre><code class="r">library(UsingR)
+data(father.son)
+x &lt;- father.son$sheight
+(mean(x) + c(-1, 1) * qnorm(0.975) * sd(x)/sqrt(length(x)))/12
+</code></pre>
+
+<pre><code>## [1] 5.710 5.738
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-15" style="background:;">
+  <hgroup>
+    <h2>Sample proportions</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>In the event that each \(X_i\) is \(0\) or \(1\) with common success probability \(p\) then \(\sigma^2 = p(1 - p)\)</li>
+<li>The interval takes the form
+\[
+\hat p \pm z_{1 - \alpha/2}  \sqrt{\frac{p(1 - p)}{n}}
+\]</li>
+<li>Replacing \(p\) by \(\hat p\) in the standard error results in what is called a Wald confidence interval for \(p\)</li>
+<li>Also note that \(p(1-p) \leq 1/4\) for \(0 \leq p \leq 1\)</li>
+<li>Let \(\alpha = .05\) so that \(z_{1 -\alpha/2} = 1.96 \approx 2\) then
+\[
+2  \sqrt{\frac{p(1 - p)}{n}} \leq 2 \sqrt{\frac{1}{4n}} = \frac{1}{\sqrt{n}} 
+\]</li>
+<li>Therefore \(\hat p \pm \frac{1}{\sqrt{n}}\) is a quick CI estimate for \(p\)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-16" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Your campaign advisor told you that in a random sample of 100 likely voters,
+56 intend to vote for you. 
+
+<ul>
+<li>Can you relax? Do you have this race in the bag?</li>
+<li>Without access to a computer or calculator, how precise is this estimate?</li>
+</ul></li>
+<li><code>1/sqrt(100)=.1</code> so a back of the envelope calculation gives an approximate 95% interval of <code>(0.46, 0.66)</code>
+
+<ul>
+<li>Not enough for you to relax, better go do more campaigning!</li>
+</ul></li>
+<li>Rough guidelines, 100 for 1 decimal place, 10,000 for 2, 1,000,000 for 3.</li>
+</ul>
+
+<pre><code class="r">round(1/sqrt(10^(1:6)), 3)
+</code></pre>
+
+<pre><code>## [1] 0.316 0.100 0.032 0.010 0.003 0.001
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-17" style="background:;">
+  <hgroup>
+    <h2>Poisson interval</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>A nuclear pump failed 5 times in 94.32 days. Give a 95% confidence interval for the failure rate per day.</li>
+<li>\(X \sim Poisson(\lambda t)\).</li>
+<li>Estimate \(\hat \lambda = X/t\)</li>
+<li>\(Var(\hat \lambda) = \lambda / t\) 
+\[
+\frac{\hat \lambda - \lambda}{\sqrt{\hat \lambda / t}} 
+= 
+\frac{X - t \lambda}{\sqrt{X}} 
+\rightarrow N(0,1)
+\]</li>
+<li>This isn&#39;t the best interval.
+
+<ul>
+<li>There are better asymptotic intervals.</li>
+<li>You can get an exact CI in this case.</li>
+</ul></li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-18" style="background:;">
+  <hgroup>
+    <h3>R code</h3>
+  </hgroup>
+  <article data-timings="">
+    <pre><code class="r">x &lt;- 5
+t &lt;- 94.32
+lambda &lt;- x/t
+round(lambda + c(-1, 1) * qnorm(0.975) * sqrt(lambda/t), 3)
+</code></pre>
+
+<pre><code>## [1] 0.007 0.099
+</code></pre>
+
+<pre><code class="r">poisson.test(x, T = 94.32)$conf
+</code></pre>
+
+<pre><code>## [1] 0.01721 0.12371
+## attr(,&quot;conf.level&quot;)
+## [1] 0.95
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-19" style="background:;">
+  <hgroup>
+    <h2>In the regression class</h2>
+  </hgroup>
+  <article data-timings="">
+    <pre><code class="r">exp(confint(glm(x ~ 1 + offset(log(t)), family = poisson(link = log))))
+</code></pre>
+
+<pre><code>## Waiting for profiling to be done...
+</code></pre>
+
+<pre><code>##   2.5 %  97.5 % 
+## 0.01901 0.11393
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+    <slide class="backdrop"></slide>
+  </slides>
+  <div class="pagination pagination-small" id='io2012-ptoc' style="display:none;">
+    <ul>
+      <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=1 title='Asymptotics'>
+         1
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=2 title='Numerical limits'>
+         2
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=3 title='Limits of random variables'>
+         3
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=4 title='The Law of Large Numbers'>
+         4
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=5 title='Law of large numbers in action'>
+         5
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=6 title='Discussion'>
+         6
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=7 title='The Central Limit Theorem'>
+         7
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=8 title='Example'>
+         8
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=9 title='Simulation of mean of \(n\) dice'>
+         9
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=10 title='Coin CLT'>
+         10
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=11 title=''>
+         11
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=12 title='CLT in practice'>
+         12
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=13 title='Confidence intervals'>
+         13
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=14 title='Give a confidence interval for the average height of sons'>
+         14
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=15 title='Sample proportions'>
+         15
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=16 title='Example'>
+         16
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=17 title='Poisson interval'>
+         17
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=18 title='R code'>
+         18
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=19 title='In the regression class'>
+         19
+      </a>
+    </li>
+  </ul>
+  </div>  <!--[if IE]>
+    <script 
+      src="http://ajax.googleapis.com/ajax/libs/chrome-frame/1/CFInstall.min.js">  
+    </script>
+    <script>CFInstall.check({mode: 'overlay'});</script>
+  <![endif]-->
+</body>
+  <!-- Load Javascripts for Widgets -->
+  
+  <!-- MathJax: Fall back to local if CDN offline but local image fonts are not supported (saves >100MB) -->
+  <script type="text/x-mathjax-config">
+    MathJax.Hub.Config({
+      tex2jax: {
+        inlineMath: [['$','$'], ['\\(','\\)']],
+        processEscapes: true
+      }
+    });
+  </script>
+  <script type="text/javascript" src="http://cdn.mathjax.org/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
+  <!-- <script src="https://c328740.ssl.cf1.rackcdn.com/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
+  </script> -->
+  <script>window.MathJax || document.write('<script type="text/x-mathjax-config">MathJax.Hub.Config({"HTML-CSS":{imageFont:null}});<\/script><script src="../../librariesNew/widgets/mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"><\/script>')
+</script>
+<!-- LOAD HIGHLIGHTER JS FILES -->
+  <script src="../../librariesNew/highlighters/highlight.js/highlight.pack.js"></script>
+  <script>hljs.initHighlightingOnLoad();</script>
+  <!-- DONE LOADING HIGHLIGHTER JS FILES -->
+   
   </html>
\ No newline at end of file
diff --git a/06_StatisticalInference/02_02_Asymptopia/index.md b/06_StatisticalInference/02_02_Asymptopia/index.md
index 1e6b29c78..c2a83dc2c 100644
--- a/06_StatisticalInference/02_02_Asymptopia/index.md
+++ b/06_StatisticalInference/02_02_Asymptopia/index.md
@@ -1,263 +1,270 @@
----
-title       : A trip to Asymptopia
-subtitle    : Statistical Inference
-author      : Brian Caffo, Jeff Leek, Roger Peng
-job         : Johns Hopkins Bloomberg School of Public Health
-logo        : bloomberg_shield.png
-framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
-highlighter : highlight.js  # {highlight.js, prettify, highlight}
-hitheme     : tomorrow      # 
-url:
-  lib: ../../librariesNew
-  assets: ../../assets
-widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
-mode        : selfcontained # {standalone, draft}
----
-
-
-## Asymptotics
-* Asymptotics is the term for the behavior of statistics as the sample size (or some other relevant quantity) limits to infinity (or some other relevant number)
-* (Asymptopia is my name for the land of asymptotics, where everything works out well and there's no messes. The land of infinite data is nice that way.)
-* Asymptotics are incredibly useful for simple statistical inference and approximations 
-* (Not covered in this class) Asymptotics often lead to nice understanding of procedures
-* Asymptotics generally give no assurances about finite sample performance
-  * The kinds of asymptotics that do are orders of magnitude more difficult to work with
-* Asymptotics form the basis for frequency interpretation of probabilities 
-  (the long run proportion of times an event occurs)
-* To understand asymptotics, we need a very basic understanding of limits.
-
-
----
-## Numerical limits
-
-- Imagine a sequence
-
-  - $a_1 = .9$,
-  - $a_2 = .99$,
-  - $a_3 = .999$, ...
-
-- Clearly this sequence converges to $1$
-- Definition of a limit: For any fixed distance we can find a point in the sequence so that the sequence is closer to the limit than that distance from that point on
-
----
-
-## Limits of random variables
-
-- The problem is harder for random variables
-- Consider $\bar X_n$ the sample average of the first $n$ of a collection of $iid$ observations
-
-  - Example $\bar X_n$ could be the average of the result of $n$ coin flips (i.e. the sample proportion of heads)
-
-- We say that $\bar X_n$ converges in probability to a limit if for any fixed distance the  probability of $\bar X_n$ being closer (further away) than that distance from the limit converges to one (zero)
-
----
-
-## The Law of Large Numbers
-
-- Establishing that a random sequence converges to a limit is hard
-- Fortunately, we have a theorem that does all the work for us, called
-    the **Law of Large Numbers**
-- The law of large numbers states that if $X_1,\ldots X_n$ are iid from a population with mean $\mu$ and variance $\sigma^2$ then $\bar X_n$ converges in probability to $\mu$
-- (There are many variations on the LLN; we are using a particularly lazy version, my favorite kind of version)
-
----
-## Law of large numbers in action
-
-```r
-n <- 10000; means <- cumsum(rnorm(n)) / (1  : n)
-plot(1 : n, means, type = "l", lwd = 2, 
-     frame = FALSE, ylab = "cumulative means", xlab = "sample size")
-abline(h = 0)
-```
-
-<div class="rimage center"><img src="fig/unnamed-chunk-1.png" title="plot of chunk unnamed-chunk-1" alt="plot of chunk unnamed-chunk-1" class="plot" /></div>
-
----
-## Discussion
-- An estimator is **consistent** if it converges to what you want to estimate
-  - Consistency is neither necessary nor sufficient for one estimator to be better than another
-  - Typically, good estimators are consistent; it's not too much to ask that if we go to the trouble of collecting an infinite amount of data that we get the right answer
-- The LLN basically states that the sample mean is consistent
-- The sample variance and the sample standard deviation are consistent as well
-- Recall also that the sample mean and the sample variance are unbiased as well
-- (The sample standard deviation is biased, by the way)
-
----
-
-## The Central Limit Theorem
-
-- The **Central Limit Theorem** (CLT) is one of the most important theorems in statistics
-- For our purposes, the CLT states that the distribution of averages of iid variables, properly normalized, becomes that of a standard normal as the sample size increases
-- The CLT applies in an endless variety of settings
-- Let $X_1,\ldots,X_n$ be a collection of iid random variables with mean $\mu$ and variance $\sigma^2$
-- Let $\bar X_n$ be their sample average
-- Then $\frac{\bar X_n - \mu}{\sigma / \sqrt{n}}$ has a distribution like that of a standard normal for large $n$.
-- Remember the form
-$$\frac{\bar X_n - \mu}{\sigma / \sqrt{n}} = 
-    \frac{\mbox{Estimate} - \mbox{Mean of estimate}}{\mbox{Std. Err. of estimate}}.
-$$
-- Usually, replacing the standard error by its estimated value doesn't change the CLT
-
----
-
-## Example
-
-- Simulate a standard normal random variable by rolling $n$ (six sided)
-- Let $X_i$ be the outcome for die $i$
-- Then note that $\mu = E[X_i] = 3.5$
-- $Var(X_i) = 2.92$ 
-- SE $\sqrt{2.92 / n} = 1.71 / \sqrt{n}$
-- Standardized mean
-$$
-    \frac{\bar X_n - 3.5}{1.71/\sqrt{n}}
-$$ 
-
----
-## Simulation of mean of $n$ dice
-<div class="rimage center"><img src="fig/unnamed-chunk-2.png" title="plot of chunk unnamed-chunk-2" alt="plot of chunk unnamed-chunk-2" class="plot" /></div>
-
----
-
-## Coin CLT
-
- - Let $X_i$ be the $0$ or $1$ result of the $i^{th}$ flip of a possibly unfair coin
-- The sample proportion, say $\hat p$, is the average of the coin flips
-- $E[X_i] = p$ and $Var(X_i) = p(1-p)$
-- Standard error of the mean is $\sqrt{p(1-p)/n}$
-- Then
-$$
-    \frac{\hat p - p}{\sqrt{p(1-p)/n}}
-$$
-will be approximately normally distributed
-
----
-
-<div class="rimage center"><img src="fig/unnamed-chunk-3.png" title="plot of chunk unnamed-chunk-3" alt="plot of chunk unnamed-chunk-3" class="plot" /></div>
-
-
----
-
-## CLT in practice
-
-- In practice the CLT is mostly useful as an approximation
-$$
-    P\left( \frac{\bar X_n - \mu}{\sigma / \sqrt{n}} \leq z \right) \approx \Phi(z).  
-$$
-- Recall $1.96$ is a good approximation to the $.975^{th}$ quantile of the standard normal
-- Consider
-$$
-    \begin{eqnarray*}
-      .95 & \approx & P\left( -1.96 \leq \frac{\bar X_n - \mu}{\sigma / \sqrt{n}} \leq 1.96 \right)\\ \\
-      & =       & P\left(\bar X_n +1.96 \sigma/\sqrt{n} \geq \mu \geq \bar X_n - 1.96\sigma/\sqrt{n} \right),\\
-    \end{eqnarray*}
-$$
-
----
-
-## Confidence intervals
-
-- Therefore, according to the CLT, the probability that the random interval $$\bar X_n \pm z_{1-\alpha/2}\sigma / \sqrt{n}$$ contains $\mu$ is approximately 100$(1-\alpha)$%, where $z_{1-\alpha/2}$ is the $1-\alpha/2$ quantile of the standard normal distribution
-- This is called a $100(1 - \alpha)$% **confidence interval** for $\mu$
-- We can replace the unknown $\sigma$ with $s$
-
----
-## Give a confidence interval for the average height of sons
-in Galton's data
-
-```r
-library(UsingR);data(father.son); x <- father.son$sheight
-(mean(x) + c(-1, 1) * qnorm(.975) * sd(x) / sqrt(length(x))) / 12
-```
-
-```
-[1] 5.710 5.738
-```
-
-
----
-
-## Sample proportions
-
-- In the event that each $X_i$ is $0$ or $1$ with common success probability $p$ then $\sigma^2 = p(1 - p)$
-- The interval takes the form
-$$
-    \hat p \pm z_{1 - \alpha/2}  \sqrt{\frac{p(1 - p)}{n}}
-$$
-- Replacing $p$ by $\hat p$ in the standard error results in what is called a Wald confidence interval for $p$
-- Also note that $p(1-p) \leq 1/4$ for $0 \leq p \leq 1$
-- Let $\alpha = .05$ so that $z_{1 -\alpha/2} = 1.96 \approx 2$ then
-$$
-    2  \sqrt{\frac{p(1 - p)}{n}} \leq 2 \sqrt{\frac{1}{4n}} = \frac{1}{\sqrt{n}} 
-$$
-- Therefore $\hat p \pm \frac{1}{\sqrt{n}}$ is a quick CI estimate for $p$
-
----
-## Example
-* Your campaign advisor told you that in a random sample of 100 likely voters,
-  56 intent to vote for you. 
-  * Can you relax? Do you have this race in the bag?
-  * Without access to a computer or calculator, how precise is this estimate?
-* `1/sqrt(100)=.1` so a back of the envelope calculation gives an approximate 95% interval of `(0.46, 0.66)`
-  * Not enough for you to relax, better go do more campaigning!
-* Rough guidelines, 100 for 1 decimal place, 10,000 for 2, 1,000,000 for 3.
-
-```r
-round(1 / sqrt(10 ^ (1 : 6)), 3)
-```
-
-```
-[1] 0.316 0.100 0.032 0.010 0.003 0.001
-```
-
----
-## Poisson interval
-* A nuclear pump failed 5 times out of 94.32 days, give a 95% confidence interval for the failure rate per day?
-* $X \sim Poisson(\lambda t)$.
-* Estimate $\hat \lambda = X/t$
-* $Var(\hat \lambda) = \lambda / t$ 
-$$
-\frac{\hat \lambda - \lambda}{\sqrt{\hat \lambda / t}} 
-= 
-\frac{X - t \lambda}{\sqrt{X}} 
-\rightarrow N(0,1)
-$$
-* This isn't the best interval.
-  * There are better asymptotic intervals.
-  * You can get an exact CI in this case.
-
----
-### R code
-
-```r
-x <- 5; t <- 94.32; lambda <- x / t
-round(lambda + c(-1, 1) * qnorm(.975) * sqrt(lambda / t), 3)
-```
-
-```
-[1] 0.007 0.099
-```
-
-```r
-poisson.test(x, T = 94.32)$conf
-```
-
-```
-[1] 0.01721 0.12371
-attr(,"conf.level")
-[1] 0.95
-```
-
-
----
-## In the regression class
-
-```r
-exp(confint(glm(x ~ 1 + offset(log(t)), family = poisson(link = log))))
-```
-
-```
-  2.5 %  97.5 % 
-0.01901 0.11393 
-```
-
-
+---
+title       : A trip to Asymptopia
+subtitle    : Statistical Inference
+author      : Brian Caffo, Jeff Leek, Roger Peng
+job         : Johns Hopkins Bloomberg School of Public Health
+logo        : bloomberg_shield.png
+framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
+highlighter : highlight.js  # {highlight.js, prettify, highlight}
+hitheme     : tomorrow      # 
+url:
+  lib: ../../librariesNew
+  assets: ../../assets
+widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
+mode        : selfcontained # {standalone, draft}
+---
+## Asymptotics
+* Asymptotics is the term for the behavior of statistics as the sample size (or some other relevant quantity) tends to infinity (or some other relevant number)
+* (Asymptopia is my name for the land of asymptotics, where everything works out well and there are no messes. The land of infinite data is nice that way.)
+* Asymptotics are incredibly useful for simple statistical inference and approximations 
+* (Not covered in this class) Asymptotics often lead to a nice understanding of procedures
+* Asymptotics generally give no assurances about finite sample performance
+  * The kinds of asymptotics that do are orders of magnitude more difficult to work with
+* Asymptotics form the basis for frequency interpretation of probabilities 
+  (the long run proportion of times an event occurs)
+* To understand asymptotics, we need a very basic understanding of limits.
+
+
+---
+## Numerical limits
+
+- Imagine a sequence
+
+  - $a_1 = .9$,
+  - $a_2 = .99$,
+  - $a_3 = .999$, ...
+
+- Clearly this sequence converges to $1$
+- Definition of a limit: For any fixed distance we can find a point in the sequence so that the sequence is closer to the limit than that distance from that point on
+
+---
+
+## Limits of random variables
+
+- The problem is harder for random variables
+- Consider $\bar X_n$ the sample average of the first $n$ of a collection of $iid$ observations
+
+  - Example $\bar X_n$ could be the average of the result of $n$ coin flips (i.e. the sample proportion of heads)
+
+- We say that $\bar X_n$ converges in probability to a limit if for any fixed distance the  probability of $\bar X_n$ being closer (further away) than that distance from the limit converges to one (zero)
+
+---
+
+## The Law of Large Numbers
+
+- Establishing that a random sequence converges to a limit is hard
+- Fortunately, we have a theorem that does all the work for us, called
+    the **Law of Large Numbers**
+- The law of large numbers states that if $X_1,\ldots,X_n$ are iid from a population with mean $\mu$ and variance $\sigma^2$ then $\bar X_n$ converges in probability to $\mu$
+- (There are many variations on the LLN; we are using a particularly lazy version, my favorite kind of version)
+
+---
+## Law of large numbers in action
+
+```r
+n <- 10000
+means <- cumsum(rnorm(n))/(1:n)
+plot(1:n, means, type = "l", lwd = 2, frame = FALSE, ylab = "cumulative means", 
+    xlab = "sample size")
+abline(h = 0)
+```
+
+![plot of chunk unnamed-chunk-1](assets/fig/unnamed-chunk-1.png) 
+
+---
+## Discussion
+- An estimator is **consistent** if it converges to what you want to estimate
+  - Consistency is neither necessary nor sufficient for one estimator to be better than another
+  - Typically, good estimators are consistent; it's not too much to ask that if we go to the trouble of collecting an infinite amount of data that we get the right answer
+- The LLN basically states that the sample mean is consistent
+- The sample variance and the sample standard deviation are consistent as well
+- Recall that the sample mean and the sample variance are also unbiased
+- (The sample standard deviation is biased, by the way)
+
+---
+
+## The Central Limit Theorem
+
+- The **Central Limit Theorem** (CLT) is one of the most important theorems in statistics
+- For our purposes, the CLT states that the distribution of averages of iid variables, properly normalized, becomes that of a standard normal as the sample size increases
+- The CLT applies in an endless variety of settings
+- Let $X_1,\ldots,X_n$ be a collection of iid random variables with mean $\mu$ and variance $\sigma^2$
+- Let $\bar X_n$ be their sample average
+- Then $\frac{\bar X_n - \mu}{\sigma / \sqrt{n}}$ has a distribution like that of a standard normal for large $n$.
+- Remember the form
+$$\frac{\bar X_n - \mu}{\sigma / \sqrt{n}} = 
+    \frac{\mbox{Estimate} - \mbox{Mean of estimate}}{\mbox{Std. Err. of estimate}}.
+$$
+- Usually, replacing the standard error by its estimated value doesn't change the CLT
+
+---
+
+## Example
+
+- Simulate a standard normal random variable by rolling $n$ (six-sided) dice
+- Let $X_i$ be the outcome for die $i$
+- Then note that $\mu = E[X_i] = 3.5$
+- $Var(X_i) = 2.92$ 
+- SE $\sqrt{2.92 / n} = 1.71 / \sqrt{n}$
+- Standardized mean
+$$
+    \frac{\bar X_n - 3.5}{1.71/\sqrt{n}}
+$$ 
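+
+A minimal sketch of the simulation this example describes, assuming 1,000 repetitions and $n = 30$ dice per repetition. Both values, and the names `nosim`, `rolls`, and `z`, are illustrative choices rather than anything taken from the original slides:
+
+```r
+set.seed(1)                        # arbitrary seed so the sketch is reproducible
+nosim <- 1000                      # number of simulated averages (assumed)
+n <- 30                            # dice per average (assumed)
+mu <- mean(1:6)                    # 3.5
+sigma <- sqrt(mean((1:6 - mu)^2))  # sqrt(35/12), about 1.71
+# each row holds one experiment of n fair dice rolls
+rolls <- matrix(sample(1:6, nosim * n, replace = TRUE), nrow = nosim)
+z <- (rowMeans(rolls) - mu) / (sigma / sqrt(n))  # standardized means
+hist(z, freq = FALSE, main = "", xlab = "standardized mean of n dice")
+curve(dnorm(x), add = TRUE, lwd = 2)             # standard normal density for comparison
+```
+
+If the CLT approximation is reasonable at this $n$, the histogram should sit close to the overlaid density.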
+
+---
+## Simulation of mean of $n$ dice
+![plot of chunk unnamed-chunk-2](assets/fig/unnamed-chunk-2.png) 
+
+---
+
+## Coin CLT
+
+- Let $X_i$ be the $0$ or $1$ result of the $i^{th}$ flip of a possibly unfair coin
+- The sample proportion, say $\hat p$, is the average of the coin flips
+- $E[X_i] = p$ and $Var(X_i) = p(1-p)$
+- Standard error of the mean is $\sqrt{p(1-p)/n}$
+- Then
+$$
+    \frac{\hat p - p}{\sqrt{p(1-p)/n}}
+$$
+will be approximately normally distributed
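+
+The same standardization can be tried numerically. This is a sketch under assumed values ($p = 0.9$, $n = 20$ flips, 1,000 repetitions); the names `nosim`, `phat`, and `z` are hypothetical:
+
+```r
+set.seed(2)                                   # arbitrary seed
+p <- 0.9                                      # true success probability (assumed)
+n <- 20                                       # flips per experiment (assumed)
+nosim <- 1000                                 # number of experiments (assumed)
+phat <- rbinom(nosim, size = n, prob = p) / n # sample proportions
+z <- (phat - p) / sqrt(p * (1 - p) / n)       # standardized proportions
+c(mean = mean(z), sd = sd(z))                 # compare with 0 and 1
+hist(z, freq = FALSE, main = "", xlab = "standardized sample proportion")
+curve(dnorm(x), add = TRUE, lwd = 2)          # standard normal density
+```
+
+With $p$ far from $0.5$ and $n$ small, the histogram is likely to look visibly skewed; increasing `n` should bring it closer to the normal curve.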
+
+---
+
+![plot of chunk unnamed-chunk-3](assets/fig/unnamed-chunk-3.png) 
+
+
+---
+
+## CLT in practice
+
+- In practice the CLT is mostly useful as an approximation
+$$
+    P\left( \frac{\bar X_n - \mu}{\sigma / \sqrt{n}} \leq z \right) \approx \Phi(z).  
+$$
+- Recall $1.96$ is a good approximation to the $.975^{th}$ quantile of the standard normal
+- Consider
+$$
+    \begin{eqnarray*}
+      .95 & \approx & P\left( -1.96 \leq \frac{\bar X_n - \mu}{\sigma / \sqrt{n}} \leq 1.96 \right)\\ \\
+      & =       & P\left(\bar X_n +1.96 \sigma/\sqrt{n} \geq \mu \geq \bar X_n - 1.96\sigma/\sqrt{n} \right),\\
+    \end{eqnarray*}
+$$
+
+---
+
+## Confidence intervals
+
+- Therefore, according to the CLT, the probability that the random interval $$\bar X_n \pm z_{1-\alpha/2}\sigma / \sqrt{n}$$ contains $\mu$ is approximately 100$(1-\alpha)$%, where $z_{1-\alpha/2}$ is the $1-\alpha/2$ quantile of the standard normal distribution
+- This is called a $100(1 - \alpha)$% **confidence interval** for $\mu$
+- We can replace the unknown $\sigma$ with $s$
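+
+One way to check the coverage claim is by simulation. The sketch below assumes normal data with $\mu = 10$, $\sigma = 2$, and $n = 50$ (all illustrative values, as are the names `nosim` and `covered`), builds the interval with $s$ in place of $\sigma$, and records how often it contains $\mu$:
+
+```r
+set.seed(3)                # arbitrary seed
+n <- 50                    # sample size (assumed)
+mu <- 10                   # true mean (assumed)
+sigma <- 2                 # true standard deviation (assumed)
+nosim <- 1000              # number of simulated data sets (assumed)
+covered <- replicate(nosim, {
+  x <- rnorm(n, mean = mu, sd = sigma)
+  ci <- mean(x) + c(-1, 1) * qnorm(0.975) * sd(x) / sqrt(n)
+  ci[1] <= mu && mu <= ci[2]   # TRUE when the interval contains mu
+})
+mean(covered)              # proportion of intervals that contain mu
+```
+
+The printed proportion should land near 0.95. It will not be exact, both because of simulation noise and because the interval itself is only approximate.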
+
+---
+## Give a confidence interval for the average height of sons
+in Galton's data
+
+```r
+library(UsingR)
+data(father.son)
+x <- father.son$sheight
+(mean(x) + c(-1, 1) * qnorm(0.975) * sd(x)/sqrt(length(x)))/12
+```
+
+```
+## [1] 5.710 5.738
+```
+
+
+---
+
+## Sample proportions
+
+- In the event that each $X_i$ is $0$ or $1$ with common success probability $p$ then $\sigma^2 = p(1 - p)$
+- The interval takes the form
+$$
+    \hat p \pm z_{1 - \alpha/2}  \sqrt{\frac{p(1 - p)}{n}}
+$$
+- Replacing $p$ by $\hat p$ in the standard error results in what is called a Wald confidence interval for $p$
+- Also note that $p(1-p) \leq 1/4$ for $0 \leq p \leq 1$
+- Let $\alpha = .05$ so that $z_{1 -\alpha/2} = 1.96 \approx 2$ then
+$$
+    2  \sqrt{\frac{p(1 - p)}{n}} \leq 2 \sqrt{\frac{1}{4n}} = \frac{1}{\sqrt{n}} 
+$$
+- Therefore $\hat p \pm \frac{1}{\sqrt{n}}$ is a quick CI estimate for $p$
+
+---
+## Example
+* Your campaign advisor told you that in a random sample of 100 likely voters,
+  56 intend to vote for you. 
+  * Can you relax? Do you have this race in the bag?
+  * Without access to a computer or calculator, how precise is this estimate?
+* `1/sqrt(100)=.1` so a back of the envelope calculation gives an approximate 95% interval of `(0.46, 0.66)`
+  * Not enough for you to relax, better go do more campaigning!
+* Rough guidelines, 100 for 1 decimal place, 10,000 for 2, 1,000,000 for 3.
+
+```r
+round(1/sqrt(10^(1:6)), 3)
+```
+
+```
+## [1] 0.316 0.100 0.032 0.010 0.003 0.001
+```
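+
+For comparison, here is a sketch that computes the Wald interval alongside the quick $\hat p \pm 1/\sqrt{n}$ interval for the polling numbers in this example. The variable names `phat`, `wald`, and `quick` are illustrative:
+
+```r
+n <- 100
+phat <- 56 / n
+# Wald interval: plug phat into the standard error
+wald <- phat + c(-1, 1) * qnorm(0.975) * sqrt(phat * (1 - phat) / n)
+# quick interval: bound p(1 - p) by 1/4 and round 1.96 up to 2
+quick <- phat + c(-1, 1) / sqrt(n)
+round(rbind(wald, quick), 3)
+```
+
+The Wald interval is slightly narrower than the quick one, but both include $0.5$, so the conclusion about campaigning does not change.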
+
+---
+## Poisson interval
+* A nuclear pump failed 5 times in 94.32 days. Give a 95% confidence interval for the failure rate per day.
+* $X \sim Poisson(\lambda t)$.
+* Estimate $\hat \lambda = X/t$
+* $Var(\hat \lambda) = \lambda / t$ 
+$$
+\frac{\hat \lambda - \lambda}{\sqrt{\hat \lambda / t}} 
+= 
+\frac{X - t \lambda}{\sqrt{X}} 
+\rightarrow N(0,1)
+$$
+* This isn't the best interval.
+  * There are better asymptotic intervals.
+  * You can get an exact CI in this case.
+
+---
+### R code
+
+```r
+x <- 5
+t <- 94.32
+lambda <- x/t
+round(lambda + c(-1, 1) * qnorm(0.975) * sqrt(lambda/t), 3)
+```
+
+```
+## [1] 0.007 0.099
+```
+
+```r
+poisson.test(x, T = 94.32)$conf
+```
+
+```
+## [1] 0.01721 0.12371
+## attr(,"conf.level")
+## [1] 0.95
+```
+
+
+---
+## In the regression class
+
+```r
+exp(confint(glm(x ~ 1 + offset(log(t)), family = poisson(link = log))))
+```
+
+```
+## Waiting for profiling to be done...
+```
+
+```
+##   2.5 %  97.5 % 
+## 0.01901 0.11393
+```
+
+
diff --git a/06_StatisticalInference/02_02_Asymptopia/index.pdf b/06_StatisticalInference/02_02_Asymptopia/index.pdf
index 7fcb65b90..d43b4b4a8 100644
Binary files a/06_StatisticalInference/02_02_Asymptopia/index.pdf and b/06_StatisticalInference/02_02_Asymptopia/index.pdf differ
diff --git a/06_StatisticalInference/02_03_tCIs/index.Rmd b/06_StatisticalInference/02_03_tCIs/index.Rmd
index a28c1c306..d2f663f19 100644
--- a/06_StatisticalInference/02_03_tCIs/index.Rmd
+++ b/06_StatisticalInference/02_03_tCIs/index.Rmd
@@ -1,158 +1,161 @@
----
-title       : T Confidence Intervals
-subtitle    : Statistical Inference
-author      : Brian Caffo, Jeff Leek, Roger Peng
-job         : Johns Hopkins Bloomberg School of Public Health
-logo        : bloomberg_shield.png
-framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
-highlighter : highlight.js  # {highlight.js, prettify, highlight}
-hitheme     : tomorrow      # 
-url:
-  lib: ../../librariesNew
-  assets: ../../assets
-widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
-mode        : selfcontained # {standalone, draft}
----
-## Confidence intervals
-
-- In the previous, we discussed creating a confidence interval using the CLT
-- In this lecture, we discuss some methods for small samples, notably Gosset's $t$ distribution
-- To discuss the $t$ distribution we must discuss the Chi-squared distribution
-- Throughout we use the following general procedure for creating CIs
-
-  a. Create a **Pivot** or statistic that does not depend on the parameter of interest
-  
-  b. Solve the probability that the pivot lies between bounds for the parameter
-
----
-
-## The Chi-squared distribution
-
-- Suppose that $S^2$ is the sample variance from a collection of iid $N(\mu,\sigma^2)$ data; then 
-$$
-    \frac{(n - 1) S^2}{\sigma^2} \sim \chi^2_{n-1}
-$$
-which reads: follows a Chi-squared distribution with $n-1$ degrees of freedom
-- The Chi-squared distribution is skewed and has support on $0$ to $\infty$
-- The mean of the Chi-squared is its degrees of freedom 
-- The variance of the Chi-squared distribution is twice the degrees of freedom
-
----
-
-## Confidence interval for the variance
-
-Note that if $\chi^2_{n-1, \alpha}$ is the $\alpha$ quantile of the
-Chi-squared distribution then
-
-$$
-\begin{eqnarray*}
-  1 - \alpha & = & P \left( \chi^2_{n-1, \alpha/2} \leq  \frac{(n - 1) S^2}{\sigma^2} \leq  \chi^2_{n-1,1 - \alpha/2} \right) \\ \\
-& = &  P\left(\frac{(n-1)S^2}{\chi^2_{n-1,1-\alpha/2}} \leq \sigma^2 \leq 
-\frac{(n-1)S^2}{\chi^2_{n-1,\alpha/2}} \right) \\
-\end{eqnarray*}
-$$
-So that 
-$$
-\left[\frac{(n-1)S^2}{\chi^2_{n-1,1-\alpha/2}}, \frac{(n-1)S^2}{\chi^2_{n-1,\alpha/2}}\right]
-$$
-is a $100(1-\alpha)\%$ confidence interval for $\sigma^2$
-
----
-
-## Notes about this interval
-
-- This interval relies heavily on the assumed normality
-- Square-rooting the endpoints yields a CI for $\sigma$
-
----
-## Example
-### Confidence interval for the standard deviation of sons' heights from Galton's data
-```{r}
-library(UsingR); data(father.son); x <- father.son$sheight
-s <- sd(x); n <- length(x)
-round(sqrt( (n-1) * s ^ 2 / qchisq(c(.975, .025), n - 1) ), 3)
-```
-
----
-
-## Gosset's $t$ distribution
-
-- Invented by William Gosset (under the pseudonym "Student") in 1908
-- Has thicker tails than the normal
-- Is indexed by a degrees of freedom; gets more like a standard normal as df gets larger
-- Is obtained as 
-$$
-\frac{Z}{\sqrt{\frac{\chi^2}{df}}}
-$$
-where $Z$ and $\chi^2$ are independent standard normals and
-Chi-squared distributions respectively
-
----
-
-## Result
-
-- Suppose that $(X_1,\ldots,X_n)$ are iid $N(\mu,\sigma^2)$, then:
-  a. $\frac{\bar X - \mu}{\sigma / \sqrt{n}}$ is standard normal
-  b. $\sqrt{\frac{(n - 1) S^2}{\sigma^2 (n - 1)}} = S / \sigma$ is the square root of a Chi-squared divided by its df
-
-- Therefore 
-$$
-\frac{\frac{\bar X - \mu}{\sigma /\sqrt{n}}}{S/\sigma}  
-= \frac{\bar X - \mu}{S/\sqrt{n}}
-$$
-    follows Gosset's $t$ distribution with $n-1$ degrees of freedom
-
----
-
-## Confidence intervals for the mean
-
-- Notice that the $t$ statistic is a pivot, therefore we use it to create a confidence interval for $\mu$
-- Let $t_{df,\alpha}$ be the $\alpha^{th}$ quantile of the t distribution with $df$ degrees of freedom
-$$
-  \begin{eqnarray*}
-&   & 1 - \alpha \\
-& = & P\left(-t_{n-1,1-\alpha/2} \leq \frac{\bar X - \mu}{S/\sqrt{n}} \leq t_{n-1,1-\alpha/2}\right) \\ \\
-& = & P\left(\bar X - t_{n-1,1-\alpha/2} S / \sqrt{n} \leq \mu  
-      \leq \bar X + t_{n-1,1-\alpha/2}S /\sqrt{n}\right)
-  \end{eqnarray*}
-$$
-- Interval is $\bar X \pm t_{n-1,1-\alpha/2} S/\sqrt{n}$
-
----
-
-## Note's about the $t$ interval
-
-- The $t$ interval technically assumes that the data are iid normal, though it is robust to this assumption
-- It works well whenever the distribution of the data is roughly symmetric and mound shaped
-- Paired observations are often analyzed using the $t$ interval by taking differences
-- For large degrees of freedom, $t$ quantiles become the same as standard normal quantiles; therefore this interval converges to the same interval as the CLT yielded
-- For skewed distributions, the spirit of the $t$ interval assumptions are violated
-- Also, for skewed distributions, it doesn't make a lot of sense to center the interval at the mean
-- In this case, consider taking logs or using a different summary like the median
-- For highly discrete data, like binary, other intervals are available
-
----
-
-## Sleep data
-
-In R typing `data(sleep)` brings up the sleep data originally
-analyzed in Gosset's Biometrika paper, which shows the increase in
-hours for 10 patients on two soporific drugs. R treats the data as two
-groups rather than paired.
-
----
-## The data
-```{r}
-data(sleep)
-head(sleep)
-```
-
----
-```{r}
-g1 <- sleep$extra[1 : 10]; g2 <- sleep$extra[11 : 20]
-difference <- g2 - g1
-mn <- mean(difference); s <- sd(difference); n <- 10
-mn + c(-1, 1) * qt(.975, n-1) * s / sqrt(n)
-t.test(difference)$conf.int
-```
-
+---
+title       : T Confidence Intervals
+subtitle    : Statistical Inference
+author      : Brian Caffo, Jeff Leek, Roger Peng
+job         : Johns Hopkins Bloomberg School of Public Health
+logo        : bloomberg_shield.png
+framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
+highlighter : highlight.js  # {highlight.js, prettify, highlight}
+hitheme     : tomorrow      # 
+url:
+  lib: ../../librariesNew
+  assets: ../../assets
+widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
+mode        : selfcontained # {standalone, draft}
+---
+## Confidence intervals
+
+- In the previous lecture, we discussed creating a confidence interval using the CLT
+- In this lecture, we discuss some methods for small samples, notably Gosset's $t$ distribution
+- To discuss the $t$ distribution we must discuss the Chi-squared distribution
+- Throughout we use the following general procedure for creating CIs
+
+  a. Create a **pivot**: a function of the data and the parameter whose distribution does not depend on the parameter of interest
+  
+  b. Write the probability that the pivot lies between two bounds, then solve the inequalities for the parameter
+
+---
+
+## The Chi-squared distribution
+
+- Suppose that $S^2$ is the sample variance from a collection of iid $N(\mu,\sigma^2)$ data; then 
+$$
+    \frac{(n - 1) S^2}{\sigma^2} \sim \chi^2_{n-1}
+$$
+which reads: follows a Chi-squared distribution with $n-1$ degrees of freedom
+- The Chi-squared distribution is skewed and has support on $0$ to $\infty$
+- The mean of the Chi-squared is its degrees of freedom 
+- The variance of the Chi-squared distribution is twice the degrees of freedom
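+
+A minimal simulation sketch of the facts above, assuming normal data; the sample size `n`, the values of `mu` and `sigma`, and the number of replicates `nosim` are illustrative choices, not values from the lecture.
+
+```{r}
+set.seed(1)                       # arbitrary seed, for reproducibility only
+n <- 10; mu <- 5; sigma <- 2; nosim <- 10000
+# Sample variance of each of nosim simulated normal samples of size n
+s2 <- apply(matrix(rnorm(nosim * n, mean = mu, sd = sigma), nosim), 1, var)
+pivot <- (n - 1) * s2 / sigma ^ 2
+# Simulated mean and variance next to the theoretical n - 1 and 2 (n - 1)
+round(c(mean = mean(pivot), df = n - 1, variance = var(pivot), twice_df = 2 * (n - 1)), 2)
+```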
+
+---
+
+## Confidence interval for the variance
+
+Note that if $\chi^2_{n-1, \alpha}$ is the $\alpha$ quantile of the
+Chi-squared distribution then
+
+$$
+\begin{eqnarray*}
+  1 - \alpha & = & P \left( \chi^2_{n-1, \alpha/2} \leq  \frac{(n - 1) S^2}{\sigma^2} \leq  \chi^2_{n-1,1 - \alpha/2} \right) \\ \\
+& = &  P\left(\frac{(n-1)S^2}{\chi^2_{n-1,1-\alpha/2}} \leq \sigma^2 \leq 
+\frac{(n-1)S^2}{\chi^2_{n-1,\alpha/2}} \right) \\
+\end{eqnarray*}
+$$
+So that 
+$$
+\left[\frac{(n-1)S^2}{\chi^2_{n-1,1-\alpha/2}}, \frac{(n-1)S^2}{\chi^2_{n-1,\alpha/2}}\right]
+$$
+is a $100(1-\alpha)\%$ confidence interval for $\sigma^2$
+
+---
+
+## Notes about this interval
+
+- This interval relies heavily on the assumed normality
+- Square-rooting the endpoints yields a CI for $\sigma$
+
+---
+## Example
+### Confidence interval for the standard deviation of sons' heights from Galton's data
+```{r}
+library(UsingR); data(father.son); x <- father.son$sheight
+s <- sd(x); n <- length(x)
+round(sqrt( (n-1) * s ^ 2 / qchisq(c(.975, .025), n - 1) ), 3)
+```
+
+---
+
+## Gosset's $t$ distribution
+
+- Invented by William Gosset (under the pseudonym "Student") in 1908
+- Has thicker tails than the normal
+- Is indexed by its degrees of freedom; gets more like a standard normal as df gets larger
+- Is obtained as 
+$$
+\frac{Z}{\sqrt{\frac{\chi^2}{df}}}
+$$
+where $Z$ and $\chi^2$ are independent standard normal and
+Chi-squared random variables, respectively, and $df$ is the degrees of freedom of the $\chi^2$
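+
+As a rough check of this construction, the sketch below simulates the ratio directly and compares its quantiles with `qt`; `df` and `nosim` are arbitrary illustrative values.
+
+```{r}
+set.seed(2)                        # arbitrary seed, for reproducibility only
+df <- 5; nosim <- 100000
+z <- rnorm(nosim)                  # standard normal draws
+x2 <- rchisq(nosim, df)            # independent Chi-squared draws
+tsim <- z / sqrt(x2 / df)
+p <- c(.025, .5, .975)
+# Simulated quantiles next to the exact t quantiles and the thinner-tailed normal ones
+round(rbind(simulated = quantile(tsim, p), t = qt(p, df), normal = qnorm(p)), 2)
+```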
+
+---
+
+## Result
+
+- Suppose that $(X_1,\ldots,X_n)$ are iid $N(\mu,\sigma^2)$, then:
+  a. $\frac{\bar X - \mu}{\sigma / \sqrt{n}}$ is standard normal
+  b. $\sqrt{\frac{(n - 1) S^2}{\sigma^2 (n - 1)}} = S / \sigma$ is the square root of a Chi-squared divided by its df
+
+- Therefore 
+$$
+\frac{\frac{\bar X - \mu}{\sigma /\sqrt{n}}}{S/\sigma}  
+= \frac{\bar X - \mu}{S/\sqrt{n}}
+$$
+    follows Gosset's $t$ distribution with $n-1$ degrees of freedom, since $\bar X$ and $S^2$ are independent for normal data
+
+---
+
+## Confidence intervals for the mean
+
+- Notice that the $t$ statistic is a pivot, so we can use it to create a confidence interval for $\mu$
+- Let $t_{df,\alpha}$ be the $\alpha^{th}$ quantile of the $t$ distribution with $df$ degrees of freedom
+$$
+  \begin{eqnarray*}
+&   & 1 - \alpha \\
+& = & P\left(-t_{n-1,1-\alpha/2} \leq \frac{\bar X - \mu}{S/\sqrt{n}} \leq t_{n-1,1-\alpha/2}\right) \\ \\
+& = & P\left(\bar X - t_{n-1,1-\alpha/2} S / \sqrt{n} \leq \mu  
+      \leq \bar X + t_{n-1,1-\alpha/2}S /\sqrt{n}\right)
+  \end{eqnarray*}
+$$
+- Interval is $\bar X \pm t_{n-1,1-\alpha/2} S/\sqrt{n}$
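+
+The interval can be wrapped in a small helper; `t_interval` and its argument names are hypothetical, introduced here only for illustration, and the result is compared against base R's `t.test`.
+
+```{r}
+# Hypothetical helper: t confidence interval for the mean of a numeric vector
+t_interval <- function(x, conf.level = 0.95) {
+  n <- length(x)
+  alpha <- 1 - conf.level
+  mean(x) + c(-1, 1) * qt(1 - alpha / 2, n - 1) * sd(x) / sqrt(n)
+}
+x <- c(4.1, 5.3, 3.8, 6.0, 5.1, 4.7)   # made-up values, for illustration only
+t_interval(x)
+as.numeric(t.test(x)$conf.int)         # should agree with the line above
+```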
+
+---
+
+## Notes about the $t$ interval
+
+- The $t$ interval technically assumes that the data are iid normal, though it is fairly robust to departures from this assumption
+- It works well whenever the distribution of the data is roughly symmetric and mound shaped
+- Paired observations are often analyzed using the $t$ interval by taking differences
+- For large degrees of freedom, $t$ quantiles approach the standard normal quantiles; therefore this interval converges to the interval that the CLT yielded
+- For skewed distributions, the spirit of the $t$ interval assumptions is violated
+- Also, for skewed distributions, it doesn't make a lot of sense to center the interval at the mean
+- In this case, consider taking logs or using a different summary like the median
+- For highly discrete data, like binary, other intervals are available
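+
+A quick look at the convergence to normal quantiles noted above; the degrees of freedom shown are arbitrary choices.
+
+```{r}
+df <- c(2, 5, 10, 30, 100, 1000)
+# Upper 97.5% t quantiles shrink toward the standard normal quantile as df grows
+round(rbind(df = df, t = qt(.975, df), normal = qnorm(.975)), 3)
+```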
+
+---
+
+## Sleep data
+
+In R, typing `data(sleep)` brings up the sleep data originally
+analyzed in Gosset's Biometrika paper, which shows the increase in
+hours of sleep for 10 patients on two soporific drugs. R treats the data as two
+groups rather than paired.
+
+---
+## The data
+```{r}
+data(sleep)
+head(sleep)
+```
+
+---
+## Results
+```{r, echo=TRUE}
+g1 <- sleep$extra[1 : 10]; g2 <- sleep$extra[11 : 20]
+difference <- g2 - g1
+mn <- mean(difference); s <- sd(difference); n <- 10
+mn + c(-1, 1) * qt(.975, n-1) * s / sqrt(n)
+t.test(difference)$conf.int
+```
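+
+Since R stores the data as two groups, an equivalent route (sketched here with base R's `t.test` and its `paired` argument) is to pass both vectors and request a paired analysis; it should reproduce the interval for the differences.
+
+```{r}
+data(sleep)
+g1 <- sleep$extra[1 : 10]; g2 <- sleep$extra[11 : 20]
+# Paired t interval for g2 - g1, computed from the two vectors directly
+paired <- t.test(g2, g1, paired = TRUE)$conf.int
+# One-sample t interval on the differences, as in the chunk above
+onesample <- t.test(g2 - g1)$conf.int
+all.equal(as.numeric(paired), as.numeric(onesample))
+```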
+
+
+
diff --git a/06_StatisticalInference/02_03_tCIs/index.html b/06_StatisticalInference/02_03_tCIs/index.html
index 8a5b3d4eb..8b2b6c9d9 100644
--- a/06_StatisticalInference/02_03_tCIs/index.html
+++ b/06_StatisticalInference/02_03_tCIs/index.html
@@ -1,399 +1,402 @@
-<!DOCTYPE html>
-<html>
-<head>
-  <title>T Confidence Intervals</title>
-  <meta charset="utf-8">
-  <meta name="description" content="T Confidence Intervals">
-  <meta name="author" content="Brian Caffo, Jeff Leek, Roger Peng">
-  <meta name="generator" content="slidify" />
-  <meta name="apple-mobile-web-app-capable" content="yes">
-  <meta http-equiv="X-UA-Compatible" content="chrome=1">
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/default.css" media="all" >
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/phone.css" 
-    media="only screen and (max-device-width: 480px)" >
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/slidify.css" >
-  <link rel="stylesheet" href="../../librariesNew/highlighters/highlight.js/css/tomorrow.css" />
-  <base target="_blank"> <!-- This amazingness opens all links in a new tab. -->  
-  
-  <!-- Grab CDN jQuery, fall back to local if offline -->
-  <script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.7.min.js"></script>
-  <script>window.jQuery || document.write('<script src="../../librariesNew/widgets/quiz/js/jquery.js"><\/script>')</script> 
-  <script data-main="../../librariesNew/frameworks/io2012/js/slides" 
-    src="../../librariesNew/frameworks/io2012/js/require-1.0.8.min.js">
-  </script>
-  
-  
-
-</head>
-<body style="opacity: 0">
-  <slides class="layout-widescreen">
-    
-    <!-- LOGO SLIDE -->
-        <slide class="title-slide segue nobackground">
-  <aside class="gdbar">
-    <img src="../../assets/img/bloomberg_shield.png">
-  </aside>
-  <hgroup class="auto-fadein">
-    <h1>T Confidence Intervals</h1>
-    <h2>Statistical Inference</h2>
-    <p>Brian Caffo, Jeff Leek, Roger Peng<br/>Johns Hopkins Bloomberg School of Public Health</p>
-  </hgroup>
-  <article></article>  
-</slide>
-    
-
-    <!-- SLIDES -->
-    <slide class="" id="slide-1" style="background:;">
-  <hgroup>
-    <h2>Confidence intervals</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>In the previous, we discussed creating a confidence interval using the CLT</li>
-<li>In this lecture, we discuss some methods for small samples, notably Gosset&#39;s \(t\) distribution</li>
-<li>To discuss the \(t\) distribution we must discuss the Chi-squared distribution</li>
-<li><p>Throughout we use the following general procedure for creating CIs</p>
-
-<p>a. Create a <strong>Pivot</strong> or statistic that does not depend on the parameter of interest</p>
-
-<p>b. Solve the probability that the pivot lies between bounds for the parameter</p></li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-2" style="background:;">
-  <hgroup>
-    <h2>The Chi-squared distribution</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Suppose that \(S^2\) is the sample variance from a collection of iid \(N(\mu,\sigma^2)\) data; then 
-\[
-\frac{(n - 1) S^2}{\sigma^2} \sim \chi^2_{n-1}
-\]
-which reads: follows a Chi-squared distribution with \(n-1\) degrees of freedom</li>
-<li>The Chi-squared distribution is skewed and has support on \(0\) to \(\infty\)</li>
-<li>The mean of the Chi-squared is its degrees of freedom </li>
-<li>The variance of the Chi-squared distribution is twice the degrees of freedom</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-3" style="background:;">
-  <hgroup>
-    <h2>Confidence interval for the variance</h2>
-  </hgroup>
-  <article data-timings="">
-    <p>Note that if \(\chi^2_{n-1, \alpha}\) is the \(\alpha\) quantile of the
-Chi-squared distribution then</p>
-
-<p>\[
-\begin{eqnarray*}
-  1 - \alpha & = & P \left( \chi^2_{n-1, \alpha/2} \leq  \frac{(n - 1) S^2}{\sigma^2} \leq  \chi^2_{n-1,1 - \alpha/2} \right) \\ \\
-& = &  P\left(\frac{(n-1)S^2}{\chi^2_{n-1,1-\alpha/2}} \leq \sigma^2 \leq 
-\frac{(n-1)S^2}{\chi^2_{n-1,\alpha/2}} \right) \\
-\end{eqnarray*}
-\]
-So that 
-\[
-\left[\frac{(n-1)S^2}{\chi^2_{n-1,1-\alpha/2}}, \frac{(n-1)S^2}{\chi^2_{n-1,\alpha/2}}\right]
-\]
-is a \(100(1-\alpha)\%\) confidence interval for \(\sigma^2\)</p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-4" style="background:;">
-  <hgroup>
-    <h2>Notes about this interval</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>This interval relies heavily on the assumed normality</li>
-<li>Square-rooting the endpoints yields a CI for \(\sigma\)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-5" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <h3>Confidence interval for the standard deviation of sons&#39; heights from Galton&#39;s data</h3>
-
-<pre><code class="r">library(UsingR)
-data(father.son)
-x &lt;- father.son$sheight
-s &lt;- sd(x)
-n &lt;- length(x)
-round(sqrt((n - 1) * s^2/qchisq(c(0.975, 0.025), n - 1)), 3)
-</code></pre>
-
-<pre><code>## [1] 2.701 2.939
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-6" style="background:;">
-  <hgroup>
-    <h2>Gosset&#39;s \(t\) distribution</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Invented by William Gosset (under the pseudonym &quot;Student&quot;) in 1908</li>
-<li>Has thicker tails than the normal</li>
-<li>Is indexed by a degrees of freedom; gets more like a standard normal as df gets larger</li>
-<li>Is obtained as 
-\[
-\frac{Z}{\sqrt{\frac{\chi^2}{df}}}
-\]
-where \(Z\) and \(\chi^2\) are independent standard normals and
-Chi-squared distributions respectively</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-7" style="background:;">
-  <hgroup>
-    <h2>Result</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li><p>Suppose that \((X_1,\ldots,X_n)\) are iid \(N(\mu,\sigma^2)\), then:
-a. \(\frac{\bar X - \mu}{\sigma / \sqrt{n}}\) is standard normal
-b. \(\sqrt{\frac{(n - 1) S^2}{\sigma^2 (n - 1)}} = S / \sigma\) is the square root of a Chi-squared divided by its df</p></li>
-<li><p>Therefore 
-\[
-\frac{\frac{\bar X - \mu}{\sigma /\sqrt{n}}}{S/\sigma}  
-= \frac{\bar X - \mu}{S/\sqrt{n}}
-\]
-follows Gosset&#39;s \(t\) distribution with \(n-1\) degrees of freedom</p></li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-8" style="background:;">
-  <hgroup>
-    <h2>Confidence intervals for the mean</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Notice that the \(t\) statistic is a pivot, therefore we use it to create a confidence interval for \(\mu\)</li>
-<li>Let \(t_{df,\alpha}\) be the \(\alpha^{th}\) quantile of the t distribution with \(df\) degrees of freedom
-\[
-\begin{eqnarray*}
-&   & 1 - \alpha \\
-& = & P\left(-t_{n-1,1-\alpha/2} \leq \frac{\bar X - \mu}{S/\sqrt{n}} \leq t_{n-1,1-\alpha/2}\right) \\ \\
-& = & P\left(\bar X - t_{n-1,1-\alpha/2} S / \sqrt{n} \leq \mu  
-  \leq \bar X + t_{n-1,1-\alpha/2}S /\sqrt{n}\right)
-\end{eqnarray*}
-\]</li>
-<li>Interval is \(\bar X \pm t_{n-1,1-\alpha/2} S/\sqrt{n}\)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-9" style="background:;">
-  <hgroup>
-    <h2>Note&#39;s about the \(t\) interval</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>The \(t\) interval technically assumes that the data are iid normal, though it is robust to this assumption</li>
-<li>It works well whenever the distribution of the data is roughly symmetric and mound shaped</li>
-<li>Paired observations are often analyzed using the \(t\) interval by taking differences</li>
-<li>For large degrees of freedom, \(t\) quantiles become the same as standard normal quantiles; therefore this interval converges to the same interval as the CLT yielded</li>
-<li>For skewed distributions, the spirit of the \(t\) interval assumptions are violated</li>
-<li>Also, for skewed distributions, it doesn&#39;t make a lot of sense to center the interval at the mean</li>
-<li>In this case, consider taking logs or using a different summary like the median</li>
-<li>For highly discrete data, like binary, other intervals are available</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-10" style="background:;">
-  <hgroup>
-    <h2>Sleep data</h2>
-  </hgroup>
-  <article data-timings="">
-    <p>In R typing <code>data(sleep)</code> brings up the sleep data originally
-analyzed in Gosset&#39;s Biometrika paper, which shows the increase in
-hours for 10 patients on two soporific drugs. R treats the data as two
-groups rather than paired.</p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-11" style="background:;">
-  <hgroup>
-    <h2>The data</h2>
-  </hgroup>
-  <article data-timings="">
-    <pre><code class="r">data(sleep)
-head(sleep)
-</code></pre>
-
-<pre><code>##   extra group ID
-## 1   0.7     1  1
-## 2  -1.6     1  2
-## 3  -0.2     1  3
-## 4  -1.2     1  4
-## 5  -0.1     1  5
-## 6   3.4     1  6
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-12" style="background:;">
-  <article data-timings="">
-    <pre><code class="r">g1 &lt;- sleep$extra[1:10]
-g2 &lt;- sleep$extra[11:20]
-difference &lt;- g2 - g1
-mn &lt;- mean(difference)
-s &lt;- sd(difference)
-n &lt;- 10
-mn + c(-1, 1) * qt(0.975, n - 1) * s/sqrt(n)
-</code></pre>
-
-<pre><code>## [1] 0.7001 2.4599
-</code></pre>
-
-<pre><code class="r">t.test(difference)$conf.int
-</code></pre>
-
-<pre><code>## [1] 0.7001 2.4599
-## attr(,&quot;conf.level&quot;)
-## [1] 0.95
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-    <slide class="backdrop"></slide>
-  </slides>
-  <div class="pagination pagination-small" id='io2012-ptoc' style="display:none;">
-    <ul>
-      <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=1 title='Confidence intervals'>
-         1
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=2 title='The Chi-squared distribution'>
-         2
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=3 title='Confidence interval for the variance'>
-         3
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=4 title='Notes about this interval'>
-         4
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=5 title='Example'>
-         5
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=6 title='Gosset&#39;s \(t\) distribution'>
-         6
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=7 title='Result'>
-         7
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=8 title='Confidence intervals for the mean'>
-         8
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=9 title='Note&#39;s about the \(t\) interval'>
-         9
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=10 title='Sleep data'>
-         10
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=11 title='The data'>
-         11
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=12 title=''>
-         12
-      </a>
-    </li>
-  </ul>
-  </div>  <!--[if IE]>
-    <script 
-      src="http://ajax.googleapis.com/ajax/libs/chrome-frame/1/CFInstall.min.js">  
-    </script>
-    <script>CFInstall.check({mode: 'overlay'});</script>
-  <![endif]-->
-</body>
-  <!-- Load Javascripts for Widgets -->
-  
-  <!-- MathJax: Fall back to local if CDN offline but local image fonts are not supported (saves >100MB) -->
-  <script type="text/x-mathjax-config">
-    MathJax.Hub.Config({
-      tex2jax: {
-        inlineMath: [['$','$'], ['\\(','\\)']],
-        processEscapes: true
-      }
-    });
-  </script>
-  <script type="text/javascript" src="http://cdn.mathjax.org/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
-  <!-- <script src="https://c328740.ssl.cf1.rackcdn.com/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
-  </script> -->
-  <script>window.MathJax || document.write('<script type="text/x-mathjax-config">MathJax.Hub.Config({"HTML-CSS":{imageFont:null}});<\/script><script src="../../librariesNew/widgets/mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"><\/script>')
-</script>
-<!-- LOAD HIGHLIGHTER JS FILES -->
-  <script src="../../librariesNew/highlighters/highlight.js/highlight.pack.js"></script>
-  <script>hljs.initHighlightingOnLoad();</script>
-  <!-- DONE LOADING HIGHLIGHTER JS FILES -->
-   
+<!DOCTYPE html>
+<html>
+<head>
+  <title>T Confidence Intervals</title>
+  <meta charset="utf-8">
+  <meta name="description" content="T Confidence Intervals">
+  <meta name="author" content="Brian Caffo, Jeff Leek, Roger Peng">
+  <meta name="generator" content="slidify" />
+  <meta name="apple-mobile-web-app-capable" content="yes">
+  <meta http-equiv="X-UA-Compatible" content="chrome=1">
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/default.css" media="all" >
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/phone.css" 
+    media="only screen and (max-device-width: 480px)" >
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/slidify.css" >
+  <link rel="stylesheet" href="../../librariesNew/highlighters/highlight.js/css/tomorrow.css" />
+  <base target="_blank"> <!-- This amazingness opens all links in a new tab. -->  
+  
+  <!-- Grab CDN jQuery, fall back to local if offline -->
+  <script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.7.min.js"></script>
+  <script>window.jQuery || document.write('<script src="../../librariesNew/widgets/quiz/js/jquery.js"><\/script>')</script> 
+  <script data-main="../../librariesNew/frameworks/io2012/js/slides" 
+    src="../../librariesNew/frameworks/io2012/js/require-1.0.8.min.js">
+  </script>
+  
+  
+
+</head>
+<body style="opacity: 0">
+  <slides class="layout-widescreen">
+    
+    <!-- LOGO SLIDE -->
+        <slide class="title-slide segue nobackground">
+  <aside class="gdbar">
+    <img src="../../assets/img/bloomberg_shield.png">
+  </aside>
+  <hgroup class="auto-fadein">
+    <h1>T Confidence Intervals</h1>
+    <h2>Statistical Inference</h2>
+    <p>Brian Caffo, Jeff Leek, Roger Peng<br/>Johns Hopkins Bloomberg School of Public Health</p>
+  </hgroup>
+  <article></article>  
+</slide>
+    
+
+    <!-- SLIDES -->
+    <slide class="" id="slide-1" style="background:;">
+  <hgroup>
+    <h2>Confidence intervals</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>In the previous lecture, we discussed creating a confidence interval using the CLT</li>
+<li>In this lecture, we discuss some methods for small samples, notably Gosset&#39;s \(t\) distribution</li>
+<li>To discuss the \(t\) distribution we must discuss the Chi-squared distribution</li>
+<li><p>Throughout we use the following general procedure for creating CIs</p>
+
+<p>a. Create a <strong>pivot</strong>: a function of the data and the parameter whose distribution does not depend on the parameter of interest</p>
+
+<p>b. Write the probability that the pivot lies between two bounds, then solve the inequalities for the parameter</p></li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-2" style="background:;">
+  <hgroup>
+    <h2>The Chi-squared distribution</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Suppose that \(S^2\) is the sample variance from a collection of iid \(N(\mu,\sigma^2)\) data; then 
+\[
+\frac{(n - 1) S^2}{\sigma^2} \sim \chi^2_{n-1}
+\]
+which reads: follows a Chi-squared distribution with \(n-1\) degrees of freedom</li>
+<li>The Chi-squared distribution is skewed and has support on \(0\) to \(\infty\)</li>
+<li>The mean of the Chi-squared is its degrees of freedom </li>
+<li>The variance of the Chi-squared distribution is twice the degrees of freedom</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-3" style="background:;">
+  <hgroup>
+    <h2>Confidence interval for the variance</h2>
+  </hgroup>
+  <article data-timings="">
+    <p>Note that if \(\chi^2_{n-1, \alpha}\) is the \(\alpha\) quantile of the
+Chi-squared distribution then</p>
+
+<p>\[
+\begin{eqnarray*}
+  1 - \alpha & = & P \left( \chi^2_{n-1, \alpha/2} \leq  \frac{(n - 1) S^2}{\sigma^2} \leq  \chi^2_{n-1,1 - \alpha/2} \right) \\ \\
+& = &  P\left(\frac{(n-1)S^2}{\chi^2_{n-1,1-\alpha/2}} \leq \sigma^2 \leq 
+\frac{(n-1)S^2}{\chi^2_{n-1,\alpha/2}} \right) \\
+\end{eqnarray*}
+\]
+So that 
+\[
+\left[\frac{(n-1)S^2}{\chi^2_{n-1,1-\alpha/2}}, \frac{(n-1)S^2}{\chi^2_{n-1,\alpha/2}}\right]
+\]
+is a \(100(1-\alpha)\%\) confidence interval for \(\sigma^2\)</p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-4" style="background:;">
+  <hgroup>
+    <h2>Notes about this interval</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>This interval relies heavily on the assumed normality</li>
+<li>Square-rooting the endpoints yields a CI for \(\sigma\)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-5" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
+    <h3>Confidence interval for the standard deviation of sons&#39; heights from Galton&#39;s data</h3>
+
+<pre><code class="r">library(UsingR)
+data(father.son)
+x &lt;- father.son$sheight
+s &lt;- sd(x)
+n &lt;- length(x)
+round(sqrt((n - 1) * s^2/qchisq(c(0.975, 0.025), n - 1)), 3)
+</code></pre>
+
+<pre><code>## [1] 2.701 2.939
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-6" style="background:;">
+  <hgroup>
+    <h2>Gosset&#39;s \(t\) distribution</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Invented by William Gosset (under the pseudonym &quot;Student&quot;) in 1908</li>
+<li>Has thicker tails than the normal</li>
+<li>Is indexed by its degrees of freedom; gets more like a standard normal as df gets larger</li>
+<li>Is obtained as 
+\[
+\frac{Z}{\sqrt{\frac{\chi^2}{df}}}
+\]
+where \(Z\) and \(\chi^2\) are independent standard normal and
+Chi-squared random variables, respectively, and \(df\) is the degrees of freedom of the \(\chi^2\)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-7" style="background:;">
+  <hgroup>
+    <h2>Result</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li><p>Suppose that \((X_1,\ldots,X_n)\) are iid \(N(\mu,\sigma^2)\), then:
+a. \(\frac{\bar X - \mu}{\sigma / \sqrt{n}}\) is standard normal
+b. \(\sqrt{\frac{(n - 1) S^2}{\sigma^2 (n - 1)}} = S / \sigma\) is the square root of a Chi-squared divided by its df</p></li>
+<li><p>Therefore 
+\[
+\frac{\frac{\bar X - \mu}{\sigma /\sqrt{n}}}{S/\sigma}  
+= \frac{\bar X - \mu}{S/\sqrt{n}}
+\]
+follows Gosset&#39;s \(t\) distribution with \(n-1\) degrees of freedom, since \(\bar X\) and \(S^2\) are independent for normal data</p></li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-8" style="background:;">
+  <hgroup>
+    <h2>Confidence intervals for the mean</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Notice that the \(t\) statistic is a pivot, so we can use it to create a confidence interval for \(\mu\)</li>
+<li>Let \(t_{df,\alpha}\) be the \(\alpha^{th}\) quantile of the \(t\) distribution with \(df\) degrees of freedom
+\[
+\begin{eqnarray*}
+&   & 1 - \alpha \\
+& = & P\left(-t_{n-1,1-\alpha/2} \leq \frac{\bar X - \mu}{S/\sqrt{n}} \leq t_{n-1,1-\alpha/2}\right) \\ \\
+& = & P\left(\bar X - t_{n-1,1-\alpha/2} S / \sqrt{n} \leq \mu  
+  \leq \bar X + t_{n-1,1-\alpha/2}S /\sqrt{n}\right)
+\end{eqnarray*}
+\]</li>
+<li>Interval is \(\bar X \pm t_{n-1,1-\alpha/2} S/\sqrt{n}\)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-9" style="background:;">
+  <hgroup>
+    <h2>Notes about the \(t\) interval</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>The \(t\) interval technically assumes that the data are iid normal, though it is fairly robust to departures from this assumption</li>
+<li>It works well whenever the distribution of the data is roughly symmetric and mound shaped</li>
+<li>Paired observations are often analyzed using the \(t\) interval by taking differences</li>
+<li>For large degrees of freedom, \(t\) quantiles approach the standard normal quantiles; therefore this interval converges to the interval that the CLT yielded</li>
+<li>For skewed distributions, the spirit of the \(t\) interval assumptions is violated</li>
+<li>Also, for skewed distributions, it doesn&#39;t make a lot of sense to center the interval at the mean</li>
+<li>In this case, consider taking logs or using a different summary like the median</li>
+<li>For highly discrete data, like binary, other intervals are available</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-10" style="background:;">
+  <hgroup>
+    <h2>Sleep data</h2>
+  </hgroup>
+  <article data-timings="">
+    <p>In R, typing <code>data(sleep)</code> brings up the sleep data originally
+analyzed in Gosset&#39;s Biometrika paper, which shows the increase in
+hours of sleep for 10 patients on two soporific drugs. R treats the data as two
+groups rather than paired.</p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-11" style="background:;">
+  <hgroup>
+    <h2>The data</h2>
+  </hgroup>
+  <article data-timings="">
+    <pre><code class="r">data(sleep)
+head(sleep)
+</code></pre>
+
+<pre><code>##   extra group ID
+## 1   0.7     1  1
+## 2  -1.6     1  2
+## 3  -0.2     1  3
+## 4  -1.2     1  4
+## 5  -0.1     1  5
+## 6   3.4     1  6
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-12" style="background:;">
+  <hgroup>
+    <h2>Results</h2>
+  </hgroup>
+  <article data-timings="">
+    <pre><code class="r">g1 &lt;- sleep$extra[1:10]
+g2 &lt;- sleep$extra[11:20]
+difference &lt;- g2 - g1
+mn &lt;- mean(difference)
+s &lt;- sd(difference)
+n &lt;- 10
+mn + c(-1, 1) * qt(0.975, n - 1) * s/sqrt(n)
+</code></pre>
+
+<pre><code>## [1] 0.7001 2.4599
+</code></pre>
+
+<pre><code class="r">t.test(difference)$conf.int
+</code></pre>
+
+<pre><code>## [1] 0.7001 2.4599
+## attr(,&quot;conf.level&quot;)
+## [1] 0.95
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+    <slide class="backdrop"></slide>
+  </slides>
+  <div class="pagination pagination-small" id='io2012-ptoc' style="display:none;">
+    <ul>
+      <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=1 title='Confidence intervals'>
+         1
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=2 title='The Chi-squared distribution'>
+         2
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=3 title='Confidence interval for the variance'>
+         3
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=4 title='Notes about this interval'>
+         4
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=5 title='Example'>
+         5
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=6 title='Gosset&#39;s \(t\) distribution'>
+         6
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=7 title='Result'>
+         7
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=8 title='Confidence intervals for the mean'>
+         8
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=9 title='Notes about the \(t\) interval'>
+         9
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=10 title='Sleep data'>
+         10
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=11 title='The data'>
+         11
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=12 title='Results'>
+         12
+      </a>
+    </li>
+  </ul>
+  </div>  <!--[if IE]>
+    <script 
+      src="http://ajax.googleapis.com/ajax/libs/chrome-frame/1/CFInstall.min.js">  
+    </script>
+    <script>CFInstall.check({mode: 'overlay'});</script>
+  <![endif]-->
+</body>
+  <!-- Load Javascripts for Widgets -->
+  
+  <!-- MathJax: Fall back to local if CDN offline but local image fonts are not supported (saves >100MB) -->
+  <script type="text/x-mathjax-config">
+    MathJax.Hub.Config({
+      tex2jax: {
+        inlineMath: [['$','$'], ['\\(','\\)']],
+        processEscapes: true
+      }
+    });
+  </script>
+  <script type="text/javascript" src="http://cdn.mathjax.org/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
+  <!-- <script src="https://c328740.ssl.cf1.rackcdn.com/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
+  </script> -->
+  <script>window.MathJax || document.write('<script type="text/x-mathjax-config">MathJax.Hub.Config({"HTML-CSS":{imageFont:null}});<\/script><script src="../../librariesNew/widgets/mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"><\/script>')
+</script>
+<!-- LOAD HIGHLIGHTER JS FILES -->
+  <script src="../../librariesNew/highlighters/highlight.js/highlight.pack.js"></script>
+  <script>hljs.initHighlightingOnLoad();</script>
+  <!-- DONE LOADING HIGHLIGHTER JS FILES -->
+   
   </html>
\ No newline at end of file
diff --git a/06_StatisticalInference/02_03_tCIs/index.md b/06_StatisticalInference/02_03_tCIs/index.md
index 5f8bd59ec..fbcd970b6 100644
--- a/06_StatisticalInference/02_03_tCIs/index.md
+++ b/06_StatisticalInference/02_03_tCIs/index.md
@@ -1,197 +1,200 @@
----
-title       : T Confidence Intervals
-subtitle    : Statistical Inference
-author      : Brian Caffo, Jeff Leek, Roger Peng
-job         : Johns Hopkins Bloomberg School of Public Health
-logo        : bloomberg_shield.png
-framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
-highlighter : highlight.js  # {highlight.js, prettify, highlight}
-hitheme     : tomorrow      # 
-url:
-  lib: ../../librariesNew
-  assets: ../../assets
-widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
-mode        : selfcontained # {standalone, draft}
----
-## Confidence intervals
-
-- In the previous, we discussed creating a confidence interval using the CLT
-- In this lecture, we discuss some methods for small samples, notably Gosset's $t$ distribution
-- To discuss the $t$ distribution we must discuss the Chi-squared distribution
-- Throughout we use the following general procedure for creating CIs
-
-  a. Create a **Pivot** or statistic that does not depend on the parameter of interest
-  
-  b. Solve the probability that the pivot lies between bounds for the parameter
-
----
-
-## The Chi-squared distribution
-
-- Suppose that $S^2$ is the sample variance from a collection of iid $N(\mu,\sigma^2)$ data; then 
-$$
-    \frac{(n - 1) S^2}{\sigma^2} \sim \chi^2_{n-1}
-$$
-which reads: follows a Chi-squared distribution with $n-1$ degrees of freedom
-- The Chi-squared distribution is skewed and has support on $0$ to $\infty$
-- The mean of the Chi-squared is its degrees of freedom 
-- The variance of the Chi-squared distribution is twice the degrees of freedom
-
----
-
-## Confidence interval for the variance
-
-Note that if $\chi^2_{n-1, \alpha}$ is the $\alpha$ quantile of the
-Chi-squared distribution then
-
-$$
-\begin{eqnarray*}
-  1 - \alpha & = & P \left( \chi^2_{n-1, \alpha/2} \leq  \frac{(n - 1) S^2}{\sigma^2} \leq  \chi^2_{n-1,1 - \alpha/2} \right) \\ \\
-& = &  P\left(\frac{(n-1)S^2}{\chi^2_{n-1,1-\alpha/2}} \leq \sigma^2 \leq 
-\frac{(n-1)S^2}{\chi^2_{n-1,\alpha/2}} \right) \\
-\end{eqnarray*}
-$$
-So that 
-$$
-\left[\frac{(n-1)S^2}{\chi^2_{n-1,1-\alpha/2}}, \frac{(n-1)S^2}{\chi^2_{n-1,\alpha/2}}\right]
-$$
-is a $100(1-\alpha)\%$ confidence interval for $\sigma^2$
-
----
-
-## Notes about this interval
-
-- This interval relies heavily on the assumed normality
-- Square-rooting the endpoints yields a CI for $\sigma$
-
----
-## Example
-### Confidence interval for the standard deviation of sons' heights from Galton's data
-
-```r
-library(UsingR)
-data(father.son)
-x <- father.son$sheight
-s <- sd(x)
-n <- length(x)
-round(sqrt((n - 1) * s^2/qchisq(c(0.975, 0.025), n - 1)), 3)
-```
-
-```
-## [1] 2.701 2.939
-```
-
-
----
-
-## Gosset's $t$ distribution
-
-- Invented by William Gosset (under the pseudonym "Student") in 1908
-- Has thicker tails than the normal
-- Is indexed by a degrees of freedom; gets more like a standard normal as df gets larger
-- Is obtained as 
-$$
-\frac{Z}{\sqrt{\frac{\chi^2}{df}}}
-$$
-where $Z$ and $\chi^2$ are independent standard normals and
-Chi-squared distributions respectively
-
----
-
-## Result
-
-- Suppose that $(X_1,\ldots,X_n)$ are iid $N(\mu,\sigma^2)$, then:
-  a. $\frac{\bar X - \mu}{\sigma / \sqrt{n}}$ is standard normal
-  b. $\sqrt{\frac{(n - 1) S^2}{\sigma^2 (n - 1)}} = S / \sigma$ is the square root of a Chi-squared divided by its df
-
-- Therefore 
-$$
-\frac{\frac{\bar X - \mu}{\sigma /\sqrt{n}}}{S/\sigma}  
-= \frac{\bar X - \mu}{S/\sqrt{n}}
-$$
-    follows Gosset's $t$ distribution with $n-1$ degrees of freedom
-
----
-
-## Confidence intervals for the mean
-
-- Notice that the $t$ statistic is a pivot, therefore we use it to create a confidence interval for $\mu$
-- Let $t_{df,\alpha}$ be the $\alpha^{th}$ quantile of the t distribution with $df$ degrees of freedom
-$$
-  \begin{eqnarray*}
-&   & 1 - \alpha \\
-& = & P\left(-t_{n-1,1-\alpha/2} \leq \frac{\bar X - \mu}{S/\sqrt{n}} \leq t_{n-1,1-\alpha/2}\right) \\ \\
-& = & P\left(\bar X - t_{n-1,1-\alpha/2} S / \sqrt{n} \leq \mu  
-      \leq \bar X + t_{n-1,1-\alpha/2}S /\sqrt{n}\right)
-  \end{eqnarray*}
-$$
-- Interval is $\bar X \pm t_{n-1,1-\alpha/2} S/\sqrt{n}$
-
----
-
-## Note's about the $t$ interval
-
-- The $t$ interval technically assumes that the data are iid normal, though it is robust to this assumption
-- It works well whenever the distribution of the data is roughly symmetric and mound shaped
-- Paired observations are often analyzed using the $t$ interval by taking differences
-- For large degrees of freedom, $t$ quantiles become the same as standard normal quantiles; therefore this interval converges to the same interval as the CLT yielded
-- For skewed distributions, the spirit of the $t$ interval assumptions are violated
-- Also, for skewed distributions, it doesn't make a lot of sense to center the interval at the mean
-- In this case, consider taking logs or using a different summary like the median
-- For highly discrete data, like binary, other intervals are available
-
----
-
-## Sleep data
-
-In R typing `data(sleep)` brings up the sleep data originally
-analyzed in Gosset's Biometrika paper, which shows the increase in
-hours for 10 patients on two soporific drugs. R treats the data as two
-groups rather than paired.
-
----
-## The data
-
-```r
-data(sleep)
-head(sleep)
-```
-
-```
-##   extra group ID
-## 1   0.7     1  1
-## 2  -1.6     1  2
-## 3  -0.2     1  3
-## 4  -1.2     1  4
-## 5  -0.1     1  5
-## 6   3.4     1  6
-```
-
-
----
-
-```r
-g1 <- sleep$extra[1:10]
-g2 <- sleep$extra[11:20]
-difference <- g2 - g1
-mn <- mean(difference)
-s <- sd(difference)
-n <- 10
-mn + c(-1, 1) * qt(0.975, n - 1) * s/sqrt(n)
-```
-
-```
-## [1] 0.7001 2.4599
-```
-
-```r
-t.test(difference)$conf.int
-```
-
-```
-## [1] 0.7001 2.4599
-## attr(,"conf.level")
-## [1] 0.95
-```
-
-
+---
+title       : T Confidence Intervals
+subtitle    : Statistical Inference
+author      : Brian Caffo, Jeff Leek, Roger Peng
+job         : Johns Hopkins Bloomberg School of Public Health
+logo        : bloomberg_shield.png
+framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
+highlighter : highlight.js  # {highlight.js, prettify, highlight}
+hitheme     : tomorrow      # 
+url:
+  lib: ../../librariesNew
+  assets: ../../assets
+widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
+mode        : selfcontained # {standalone, draft}
+---
+## Confidence intervals
+
+- In the previous lecture, we discussed creating a confidence interval using the CLT
+- In this lecture, we discuss some methods for small samples, notably Gosset's $t$ distribution
+- To discuss the $t$ distribution we must discuss the Chi-squared distribution
+- Throughout we use the following general procedure for creating CIs
+
+  a. Create a **pivot**: a function of the data and the parameter whose distribution does not depend on the parameter of interest
+  
+  b. Solve the probability statement that the pivot lies between bounds for the parameter
+
+---
+
+## The Chi-squared distribution
+
+- Suppose that $S^2$ is the sample variance from a collection of iid $N(\mu,\sigma^2)$ data; then 
+$$
+    \frac{(n - 1) S^2}{\sigma^2} \sim \chi^2_{n-1}
+$$
+which reads: follows a Chi-squared distribution with $n-1$ degrees of freedom
+- The Chi-squared distribution is skewed and has support on $0$ to $\infty$
+- The mean of the Chi-squared is its degrees of freedom 
+- The variance of the Chi-squared distribution is twice the degrees of freedom
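+
+As a rough check of the mean and variance facts above, the following simulation sketch draws repeated normal samples and computes $(n - 1) S^2 / \sigma^2$ each time. The values of `n` and `sigma`, the seed and the number of replications are illustrative assumptions, not part of the lecture.
+
+```r
+set.seed(1)                # arbitrary seed, for reproducibility only
+n <- 10                    # assumed sample size
+sigma <- 2                 # assumed true standard deviation
+# simulate the statistic (n - 1) S^2 / sigma^2 many times
+stat <- replicate(10000, (n - 1) * var(rnorm(n, sd = sigma)) / sigma^2)
+# the two numbers should be near n - 1 = 9 and 2 * (n - 1) = 18
+c(mean = mean(stat), variance = var(stat))
+```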
+
+---
+
+## Confidence interval for the variance
+
+Note that if $\chi^2_{n-1, \alpha}$ is the $\alpha$ quantile of the
+Chi-squared distribution then
+
+$$
+\begin{eqnarray*}
+  1 - \alpha & = & P \left( \chi^2_{n-1, \alpha/2} \leq  \frac{(n - 1) S^2}{\sigma^2} \leq  \chi^2_{n-1,1 - \alpha/2} \right) \\ \\
+& = &  P\left(\frac{(n-1)S^2}{\chi^2_{n-1,1-\alpha/2}} \leq \sigma^2 \leq 
+\frac{(n-1)S^2}{\chi^2_{n-1,\alpha/2}} \right) \\
+\end{eqnarray*}
+$$
+So that 
+$$
+\left[\frac{(n-1)S^2}{\chi^2_{n-1,1-\alpha/2}}, \frac{(n-1)S^2}{\chi^2_{n-1,\alpha/2}}\right]
+$$
+is a $100(1-\alpha)\%$ confidence interval for $\sigma^2$
+
+---
+
+## Notes about this interval
+
+- This interval relies heavily on the assumed normality
+- Square-rooting the endpoints yields a CI for $\sigma$
+
+---
+## Example
+### Confidence interval for the standard deviation of sons' heights from Galton's data
+
+```r
+library(UsingR)
+data(father.son)
+x <- father.son$sheight
+s <- sd(x)
+n <- length(x)
+round(sqrt((n - 1) * s^2/qchisq(c(0.975, 0.025), n - 1)), 3)
+```
+
+```
+## [1] 2.701 2.939
+```
+
+
+---
+
+## Gosset's $t$ distribution
+
+- Invented by William Gosset (under the pseudonym "Student") in 1908
+- Has thicker tails than the normal
+- Is indexed by its degrees of freedom; gets more like a standard normal as df gets larger
+- Is obtained as 
+$$
+\frac{Z}{\sqrt{\frac{\chi^2}{df}}}
+$$
+where $Z$ and $\chi^2$ are independent standard normal and
+Chi-squared random variables, respectively
+
+---
+
+## Result
+
+- Suppose that $(X_1,\ldots,X_n)$ are iid $N(\mu,\sigma^2)$, then:
+  a. $\frac{\bar X - \mu}{\sigma / \sqrt{n}}$ is standard normal
+  b. $\sqrt{\frac{(n - 1) S^2}{\sigma^2 (n - 1)}} = S / \sigma$ is the square root of a Chi-squared divided by its df
+
+- Therefore, since $\bar X$ and $S^2$ are independent for normal data,
+$$
+\frac{\frac{\bar X - \mu}{\sigma /\sqrt{n}}}{S/\sigma}  
+= \frac{\bar X - \mu}{S/\sqrt{n}}
+$$
+    follows Gosset's $t$ distribution with $n-1$ degrees of freedom
+
+---
+
+## Confidence intervals for the mean
+
+- Notice that the $t$ statistic is a pivot; we can therefore use it to create a confidence interval for $\mu$
+- Let $t_{df,\alpha}$ be the $\alpha$ quantile of the $t$ distribution with $df$ degrees of freedom
+$$
+  \begin{eqnarray*}
+&   & 1 - \alpha \\
+& = & P\left(-t_{n-1,1-\alpha/2} \leq \frac{\bar X - \mu}{S/\sqrt{n}} \leq t_{n-1,1-\alpha/2}\right) \\ \\
+& = & P\left(\bar X - t_{n-1,1-\alpha/2} S / \sqrt{n} \leq \mu  
+      \leq \bar X + t_{n-1,1-\alpha/2}S /\sqrt{n}\right)
+  \end{eqnarray*}
+$$
+- Interval is $\bar X \pm t_{n-1,1-\alpha/2} S/\sqrt{n}$
+
+---
+
+## Notes about the $t$ interval
+
+- The $t$ interval technically assumes that the data are iid normal, though it is robust to departures from this assumption
+- It works well whenever the distribution of the data is roughly symmetric and mound shaped
+- Paired observations are often analyzed using the $t$ interval by taking differences
+- For large degrees of freedom, $t$ quantiles become the same as standard normal quantiles; therefore this interval converges to the same interval as the CLT yielded
+- For skewed distributions, the spirit of the $t$ interval assumptions is violated
+- Also, for skewed distributions, it doesn't make a lot of sense to center the interval at the mean
+- In this case, consider taking logs or using a different summary like the median
+- For highly discrete data, like binary, other intervals are available
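+
+The claim that $t$ quantiles approach standard normal quantiles can be checked numerically. This is a minimal sketch; the particular degrees of freedom listed are arbitrary choices for illustration.
+
+```r
+df <- c(2, 5, 10, 30, 100, 1000)   # illustrative degrees of freedom
+t_q <- qt(0.975, df)               # 97.5th percentile of each t distribution
+z_q <- qnorm(0.975)                # 97.5th percentile of the standard normal
+# the t quantiles decrease toward the normal quantile as df grows
+rbind(df = df, t = round(t_q, 3), z = round(z_q, 3))
+```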
+
+---
+
+## Sleep data
+
+In R, typing `data(sleep)` loads the sleep data originally
+analyzed in Gosset's Biometrika paper, which shows the increase in
+hours of sleep for 10 patients on two soporific drugs. R treats the data as two
+groups rather than paired.
+
+---
+## The data
+
+```r
+data(sleep)
+head(sleep)
+```
+
+```
+##   extra group ID
+## 1   0.7     1  1
+## 2  -1.6     1  2
+## 3  -0.2     1  3
+## 4  -1.2     1  4
+## 5  -0.1     1  5
+## 6   3.4     1  6
+```
+
+
+---
+## Results
+
+```r
+g1 <- sleep$extra[1:10]
+g2 <- sleep$extra[11:20]
+difference <- g2 - g1
+mn <- mean(difference)
+s <- sd(difference)
+n <- 10
+mn + c(-1, 1) * qt(0.975, n - 1) * s/sqrt(n)
+```
+
+```
+## [1] 0.7001 2.4599
+```
+
+```r
+t.test(difference)$conf.int
+```
+
+```
+## [1] 0.7001 2.4599
+## attr(,"conf.level")
+## [1] 0.95
+```
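+
+Because taking differences is the same as a paired analysis, the interval can presumably also be obtained by passing both groups to `t.test` with `paired = TRUE`. This sketch relies on the same assumption as the code above: rows 1 to 10 and 11 to 20 list the same patients in the same order.
+
+```r
+data(sleep)
+g1 <- sleep$extra[1:10]    # first drug, patients 1 to 10
+g2 <- sleep$extra[11:20]   # second drug, same patients in the same order
+# paired interval; should agree with t.test(g2 - g1)$conf.int
+paired_ci <- t.test(g2, g1, paired = TRUE)$conf.int
+paired_ci
+```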
+
+
+
+
diff --git a/06_StatisticalInference/02_03_tCIs/index.pdf b/06_StatisticalInference/02_03_tCIs/index.pdf
index 855c3502b..19947b5e5 100644
Binary files a/06_StatisticalInference/02_03_tCIs/index.pdf and b/06_StatisticalInference/02_03_tCIs/index.pdf differ
diff --git a/06_StatisticalInference/02_04_Likeklihood/assets/fig/unnamed-chunk-1.png b/06_StatisticalInference/02_04_Likeklihood/assets/fig/unnamed-chunk-1.png
new file mode 100644
index 000000000..eea114c31
Binary files /dev/null and b/06_StatisticalInference/02_04_Likeklihood/assets/fig/unnamed-chunk-1.png differ
diff --git a/06_StatisticalInference/02_04_Likeklihood/assets/fig/unnamed-chunk-2.png b/06_StatisticalInference/02_04_Likeklihood/assets/fig/unnamed-chunk-2.png
new file mode 100644
index 000000000..d00011e8a
Binary files /dev/null and b/06_StatisticalInference/02_04_Likeklihood/assets/fig/unnamed-chunk-2.png differ
diff --git a/06_StatisticalInference/02_04_Likeklihood/index.Rmd b/06_StatisticalInference/02_04_Likeklihood/index.Rmd
index 56f723f8d..d62b9a3c8 100644
--- a/06_StatisticalInference/02_04_Likeklihood/index.Rmd
+++ b/06_StatisticalInference/02_04_Likeklihood/index.Rmd
@@ -1,144 +1,128 @@
----
-title       : Likelihood
-subtitle    : Statistical Inference
-author      : Brian Caffo, Roger Peng, Jeff Leek
-job         : Johns Hopkins Bloomberg School of Public Health
-logo        : bloomberg_shield.png
-framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
-highlighter : highlight.js  # {highlight.js, prettify, highlight}
-hitheme     : tomorrow      # 
-url:
-  lib: ../../librariesNew
-  assets: ../../assets
-widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
-mode        : selfcontained # {standalone, draft}
----
-```{r setup, cache = F, echo = F, message = F, warning = F, tidy = F, results='hide'}
-# make this an external chunk that can be included in any file
-options(width = 100)
-opts_chunk$set(message = F, error = F, warning = F, comment = NA, fig.align = 'center', dpi = 100, tidy = F, cache.path = '.cache/', fig.path = 'fig/')
-
-options(xtable.type = 'html')
-knit_hooks$set(inline = function(x) {
-  if(is.numeric(x)) {
-    round(x, getOption('digits'))
-  } else {
-    paste(as.character(x), collapse = ', ')
-  }
-})
-knit_hooks$set(plot = knitr:::hook_plot_html)
-runif(1)
-```
-
-## Likelihood
-
-- A common and fruitful approach to statistics is to assume that the data arises from a family of distributions indexed by a parameter that represents a useful summary of the distribution
-- The **likelihood** of a collection of data is the joint density evaluated as a function of the parameters with the data fixed
-- Likelihood analysis of data uses the likelihood to perform inference regarding the unknown parameter
-
----
-
-## Likelihood
-
-Given a statistical probability mass function or density, say $f(x, \theta)$, where $\theta$ is an unknown parameter, the **likelihood** is $f$ viewed as a function of $\theta$ for a fixed, observed value of $x$. 
-
----
-
-## Interpretations of likelihoods
-
-The likelihood has the following properties:
-
-1. Ratios of likelihood values measure the relative evidence of one value of the unknown parameter to another.
-2. Given a statistical model and observed data, all of the relevant information contained in the data regarding the unknown parameter is contained in the likelihood.
-3. If $\{X_i\}$ are independent random variables, then their likelihoods multiply.  That is, the likelihood of the parameters given all of the $X_i$ is simply the product of the individual likelihoods.
-
----
-
-## Example
-
-- Suppose that we flip a coin with success probability $\theta$
-- Recall that the mass function for $x$
-  $$
-  f(x,\theta) = \theta^x(1 - \theta)^{1 - x}  ~~~\mbox{for}~~~ \theta \in [0,1].
-  $$
-  where $x$ is either $0$ (Tails) or $1$ (Heads) 
-- Suppose that the result is a head
-- The likelihood is
-  $$
-  {\cal L}(\theta, 1) = \theta^1 (1 - \theta)^{1 - 1} = \theta  ~~~\mbox{for} ~~~ \theta \in [0,1].
-  $$
-- Therefore, ${\cal L}(.5, 1) / {\cal L}(.25, 1) = 2$, 
-- There is twice as much evidence supporting the hypothesis that $\theta = .5$ to the hypothesis that $\theta = .25$
-
----
-
-## Example continued
-
-- Suppose now that we flip our coin from the previous example 4 times and get the sequence 1, 0, 1, 1
-- The likelihood is:
-$$
-  \begin{eqnarray*}
-  {\cal L}(\theta, 1,0,1,1) & = & \theta^1 (1 - \theta)^{1 - 1}
-  \theta^0 (1 - \theta)^{1 - 0}  \\
-& \times & \theta^1 (1 - \theta)^{1 - 1} 
-   \theta^1 (1 - \theta)^{1 - 1}\\
-& = &  \theta^3(1 - \theta)^1
-  \end{eqnarray*}
-$$
-- This likelihood only depends on the total number of heads and the total number of tails; we might write ${\cal L}(\theta, 1, 3)$ for shorthand
-- Now consider ${\cal L}(.5, 1, 3) / {\cal L}(.25, 1, 3) = 5.33$
-- There is over five times as much evidence supporting the hypothesis that $\theta = .5$ over that $\theta = .25$
-
----
-
-## Plotting likelihoods
-
-- Generally, we want to consider all the values of $\theta$ between 0 and 1
-- A **likelihood plot** displays $\theta$ by ${\cal L}(\theta,x)$
-- Because the likelihood measures *relative evidence*, dividing the curve by its maximum value (or any other value for that matter) does not change its interpretation
-
----
-```{r, fig.height=4.5, fig.width=4.5}
-pvals <- seq(0, 1, length = 1000)
-plot(pvals, dbinom(3, 4, pvals) / dbinom(3, 4, 3/4), type = "l", frame = FALSE, lwd = 3, xlab = "p", ylab = "likelihood / max likelihood")
-```
-
-
----
-
-## Maximum likelihood
-
-- The value of $\theta$ where the curve reaches its maximum has a special meaning
-- It is the value of $\theta$ that is most well supported by the data
-- This point is called the **maximum likelihood estimate** (or MLE) of $\theta$
-  $$
-  MLE = \mathrm{argmax}_\theta {\cal L}(\theta, x).
-  $$
-- Another interpretation of the MLE is that it is the value of $\theta$ that would make the data that we observed most probable
-
----
-## Some results
-* $X_1, \ldots, X_n \stackrel{iid}{\sim} N(\mu, \sigma^2)$ the MLE of $\mu$ is $\bar X$ and the ML of $\sigma^2$ is the biased sample variance estimate.
-* If $X_1,\ldots, X_n \stackrel{iid}{\sim} Bernoulli(p)$ then the MLE of $p$ is $\bar X$ (the sample proportion of 1s).
-* If $X_i \stackrel{iid}{\sim} Binomial(n_i, p)$ then the MLE of $p$ is $\frac{\sum_{i=1}^n X_i}{\sum_{i=1}^n n_i}$ (the sample proportion of 1s).
-* If $X \stackrel{iid}{\sim} Poisson(\lambda t)$ then the MLE of $\lambda$ is $X/t$.
-* If $X_i \stackrel{iid}{\sim} Poisson(\lambda t_i)$ then the MLE of $\lambda$ is
-  $\frac{\sum_{i=1}^n X_i}{\sum_{i=1}^n t_i}$
-
----
-## Example
-* You saw 5 failure events per 94 days of monitoring a nuclear pump. 
-* Assuming Poisson, plot the likelihood
-
----
-```{r, fig.height=4, fig.width=4, echo= TRUE}
-lambda <- seq(0, .2, length = 1000)
-likelihood <- dpois(5, 94 * lambda) / dpois(5, 5)
-plot(lambda, likelihood, frame = FALSE, lwd = 3, type = "l", xlab = expression(lambda))
-lines(rep(5/94, 2), 0 : 1, col = "red", lwd = 3)
-lines(range(lambda[likelihood > 1/16]), rep(1/16, 2), lwd = 2)
-lines(range(lambda[likelihood > 1/8]), rep(1/8, 2), lwd = 2)
-```
-
-
-
+---
+title       : Likelihood
+subtitle    : Statistical Inference
+author      : Brian Caffo, Roger Peng, Jeff Leek
+job         : Johns Hopkins Bloomberg School of Public Health
+logo        : bloomberg_shield.png
+framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
+highlighter : highlight.js  # {highlight.js, prettify, highlight}
+hitheme     : tomorrow      # 
+url:
+  lib: ../../librariesNew
+  assets: ../../assets
+widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
+mode        : selfcontained # {standalone, draft}
+---
+
+## Likelihood
+
+- A common and fruitful approach to statistics is to assume that the data arises from a family of distributions indexed by a parameter that represents a useful summary of the distribution
+- The **likelihood** of a collection of data is the joint density evaluated as a function of the parameters with the data fixed
+- Likelihood analysis of data uses the likelihood to perform inference regarding the unknown parameter
+
+---
+
+## Likelihood
+
+Given a statistical probability mass function or density, say $f(x, \theta)$, where $\theta$ is an unknown parameter, the **likelihood** is $f$ viewed as a function of $\theta$ for a fixed, observed value of $x$. 
+
+---
+
+## Interpretations of likelihoods
+
+The likelihood has the following properties:
+
+1. Ratios of likelihood values measure the relative evidence of one value of the unknown parameter to another.
+2. Given a statistical model and observed data, all of the relevant information contained in the data regarding the unknown parameter is contained in the likelihood.
+3. If $\{X_i\}$ are independent random variables, then their likelihoods multiply.  That is, the likelihood of the parameters given all of the $X_i$ is simply the product of the individual likelihoods.
+
+---
+
+## Example
+
+- Suppose that we flip a coin with success probability $\theta$
+- Recall that the mass function for $x$ is
+  $$
+  f(x,\theta) = \theta^x(1 - \theta)^{1 - x}  ~~~\mbox{for}~~~ \theta \in [0,1].
+  $$
+  where $x$ is either $0$ (Tails) or $1$ (Heads) 
+- Suppose that the result is a head
+- The likelihood is
+  $$
+  {\cal L}(\theta, 1) = \theta^1 (1 - \theta)^{1 - 1} = \theta  ~~~\mbox{for} ~~~ \theta \in [0,1].
+  $$
+- Therefore, ${\cal L}(.5, 1) / {\cal L}(.25, 1) = 2$
+- There is twice as much evidence supporting the hypothesis that $\theta = .5$ as the hypothesis that $\theta = .25$
+
+---
+
+## Example continued
+
+- Suppose now that we flip our coin from the previous example 4 times and get the sequence 1, 0, 1, 1
+- The likelihood is:
+$$
+  \begin{eqnarray*}
+  {\cal L}(\theta, 1,0,1,1) & = & \theta^1 (1 - \theta)^{1 - 1}
+  \theta^0 (1 - \theta)^{1 - 0}  \\
+& \times & \theta^1 (1 - \theta)^{1 - 1} 
+   \theta^1 (1 - \theta)^{1 - 1}\\
+& = &  \theta^3(1 - \theta)^1
+  \end{eqnarray*}
+$$
+- This likelihood depends only on the total number of heads and the total number of tails; we might write ${\cal L}(\theta, 1, 3)$ (one tail, three heads) for shorthand
+- Now consider ${\cal L}(.5, 1, 3) / {\cal L}(.25, 1, 3) = 5.33$
+- There is over five times as much evidence supporting the hypothesis that $\theta = .5$ as the hypothesis that $\theta = .25$
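+
+The ratio above can be reproduced directly. This is a small sketch; the helper `lik` and its argument names are hypothetical, introduced only for this illustration.
+
+```{r}
+# likelihood of theta for a given number of heads and tails
+lik <- function(theta, heads = 3, tails = 1) theta^heads * (1 - theta)^tails
+# relative evidence for theta = .5 versus theta = .25; equals 16/3
+ratio <- lik(0.5) / lik(0.25)
+round(ratio, 2)
+```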
+
+---
+
+## Plotting likelihoods
+
+- Generally, we want to consider all the values of $\theta$ between 0 and 1
+- A **likelihood plot** displays ${\cal L}(\theta,x)$ against $\theta$
+- Because the likelihood measures *relative evidence*, dividing the curve by its maximum value (or any other value for that matter) does not change its interpretation
+
+---
+```{r, fig.height=4.5, fig.width=4.5}
+pvals <- seq(0, 1, length = 1000)
+plot(pvals, dbinom(3, 4, pvals) / dbinom(3, 4, 3/4), type = "l", frame = FALSE, lwd = 3, xlab = "p", ylab = "likelihood / max likelihood")
+```
+
+
+---
+
+## Maximum likelihood
+
+- The value of $\theta$ where the curve reaches its maximum has a special meaning
+- It is the value of $\theta$ that is most well supported by the data
+- This point is called the **maximum likelihood estimate** (or MLE) of $\theta$
+  $$
+  MLE = \mathrm{argmax}_\theta {\cal L}(\theta, x).
+  $$
+- Another interpretation of the MLE is that it is the value of $\theta$ that would make the data that we observed most probable
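+
+When no closed form is at hand, the maximum can be located numerically. The sketch below assumes the coin-flip likelihood $\theta^3 (1 - \theta)$ from the earlier example and uses base R's `optimize`; the function name `lik` is hypothetical.
+
+```{r}
+# likelihood for three heads and one tail
+lik <- function(theta) theta^3 * (1 - theta)
+# search the unit interval for the maximizer
+fit <- optimize(lik, interval = c(0, 1), maximum = TRUE)
+fit$maximum   # numerically close to the sample proportion 3/4
+```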
+
+---
+## Some results
+* If $X_1, \ldots, X_n \stackrel{iid}{\sim} N(\mu, \sigma^2)$ then the MLE of $\mu$ is $\bar X$ and the MLE of $\sigma^2$ is the biased sample variance estimate.
+* If $X_1,\ldots, X_n \stackrel{iid}{\sim} Bernoulli(p)$ then the MLE of $p$ is $\bar X$ (the sample proportion of 1s).
+* If $X_i \stackrel{iid}{\sim} Binomial(n_i, p)$ then the MLE of $p$ is $\frac{\sum_{i=1}^n X_i}{\sum_{i=1}^n n_i}$ (the overall proportion of successes).
+* If $X \sim Poisson(\lambda t)$ then the MLE of $\lambda$ is $X/t$.
+* If $X_i \stackrel{iid}{\sim} Poisson(\lambda t_i)$ then the MLE of $\lambda$ is
+  $\frac{\sum_{i=1}^n X_i}{\sum_{i=1}^n t_i}$
+
+---
+## Example
+* You saw 5 failure events in 94 days of monitoring a nuclear pump. 
+* Assuming Poisson, plot the likelihood
+
+---
+```{r, fig.height=4, fig.width=4, echo= TRUE}
+lambda <- seq(0, .2, length = 1000)
+likelihood <- dpois(5, 94 * lambda) / dpois(5, 5)
+plot(lambda, likelihood, frame = FALSE, lwd = 3, type = "l", xlab = expression(lambda))
+lines(rep(5/94, 2), 0 : 1, col = "red", lwd = 3)
+lines(range(lambda[likelihood > 1/16]), rep(1/16, 2), lwd = 2)
+lines(range(lambda[likelihood > 1/8]), rep(1/8, 2), lwd = 2)
+```
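+
+The quantities drawn on the plot can also be read off numerically. This is a sketch that reuses the same grid as the plotting code; the name `mle_grid` is hypothetical, and the grid point it returns only approximates the exact MLE $5/94$.
+
+```{r}
+lambda <- seq(0, .2, length = 1000)
+likelihood <- dpois(5, 94 * lambda) / dpois(5, 5)
+# grid value with the highest likelihood, next to the exact MLE 5/94
+mle_grid <- lambda[which.max(likelihood)]
+c(grid = mle_grid, exact = 5/94)
+# endpoints of the horizontal reference line drawn at 1/8
+range(lambda[likelihood > 1/8])
+```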
+
+
+
diff --git a/06_StatisticalInference/02_04_Likeklihood/index.html b/06_StatisticalInference/02_04_Likeklihood/index.html
index d7f7d2684..527e9b425 100644
--- a/06_StatisticalInference/02_04_Likeklihood/index.html
+++ b/06_StatisticalInference/02_04_Likeklihood/index.html
@@ -1,333 +1,334 @@
-<!DOCTYPE html>
-<html>
-<head>
-  <title>Likelihood</title>
-  <meta charset="utf-8">
-  <meta name="description" content="Likelihood">
-  <meta name="author" content="Brian Caffo, Roger Peng, Jeff Leek">
-  <meta name="generator" content="slidify" />
-  <meta name="apple-mobile-web-app-capable" content="yes">
-  <meta http-equiv="X-UA-Compatible" content="chrome=1">
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/default.css" media="all" >
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/phone.css" 
-    media="only screen and (max-device-width: 480px)" >
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/slidify.css" >
-  <link rel="stylesheet" href="../../librariesNew/highlighters/highlight.js/css/tomorrow.css" />
-  <base target="_blank"> <!-- This amazingness opens all links in a new tab. -->  
-  
-  <!-- Grab CDN jQuery, fall back to local if offline -->
-  <script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.7.min.js"></script>
-  <script>window.jQuery || document.write('<script src="../../librariesNew/widgets/quiz/js/jquery.js"><\/script>')</script> 
-  <script data-main="../../librariesNew/frameworks/io2012/js/slides" 
-    src="../../librariesNew/frameworks/io2012/js/require-1.0.8.min.js">
-  </script>
-  
-  
-
-</head>
-<body style="opacity: 0">
-  <slides class="layout-widescreen">
-    
-    <!-- LOGO SLIDE -->
-        <slide class="title-slide segue nobackground">
-  <aside class="gdbar">
-    <img src="../../assets/img/bloomberg_shield.png">
-  </aside>
-  <hgroup class="auto-fadein">
-    <h1>Likelihood</h1>
-    <h2>Statistical Inference</h2>
-    <p>Brian Caffo, Roger Peng, Jeff Leek<br/>Johns Hopkins Bloomberg School of Public Health</p>
-  </hgroup>
-  <article></article>  
-</slide>
-    
-
-    <!-- SLIDES -->
-    <slide class="" id="slide-1" style="background:;">
-  <hgroup>
-    <h2>Likelihood</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>A common and fruitful approach to statistics is to assume that the data arises from a family of distributions indexed by a parameter that represents a useful summary of the distribution</li>
-<li>The <strong>likelihood</strong> of a collection of data is the joint density evaluated as a function of the parameters with the data fixed</li>
-<li>Likelihood analysis of data uses the likelihood to perform inference regarding the unknown parameter</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-2" style="background:;">
-  <hgroup>
-    <h2>Likelihood</h2>
-  </hgroup>
-  <article data-timings="">
-    <p>Given a statistical probability mass function or density, say \(f(x, \theta)\), where \(\theta\) is an unknown parameter, the <strong>likelihood</strong> is \(f\) viewed as a function of \(\theta\) for a fixed, observed value of \(x\). </p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-3" style="background:;">
-  <hgroup>
-    <h2>Interpretations of likelihoods</h2>
-  </hgroup>
-  <article data-timings="">
-    <p>The likelihood has the following properties:</p>
-
-<ol>
-<li>Ratios of likelihood values measure the relative evidence of one value of the unknown parameter to another.</li>
-<li>Given a statistical model and observed data, all of the relevant information contained in the data regarding the unknown parameter is contained in the likelihood.</li>
-<li>If \(\{X_i\}\) are independent random variables, then their likelihoods multiply.  That is, the likelihood of the parameters given all of the \(X_i\) is simply the product of the individual likelihoods.</li>
-</ol>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-4" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Suppose that we flip a coin with success probability \(\theta\)</li>
-<li>Recall that the mass function for \(x\)
-\[
-f(x,\theta) = \theta^x(1 - \theta)^{1 - x}  ~~~\mbox{for}~~~ \theta \in [0,1].
-\]
-where \(x\) is either \(0\) (Tails) or \(1\) (Heads) </li>
-<li>Suppose that the result is a head</li>
-<li>The likelihood is
-\[
-{\cal L}(\theta, 1) = \theta^1 (1 - \theta)^{1 - 1} = \theta  ~~~\mbox{for} ~~~ \theta \in [0,1].
-\]</li>
-<li>Therefore, \({\cal L}(.5, 1) / {\cal L}(.25, 1) = 2\), </li>
-<li>There is twice as much evidence supporting the hypothesis that \(\theta = .5\) to the hypothesis that \(\theta = .25\)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-5" style="background:;">
-  <hgroup>
-    <h2>Example continued</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Suppose now that we flip our coin from the previous example 4 times and get the sequence 1, 0, 1, 1</li>
-<li>The likelihood is:
-\[
-\begin{eqnarray*}
-{\cal L}(\theta, 1,0,1,1) & = & \theta^1 (1 - \theta)^{1 - 1}
-\theta^0 (1 - \theta)^{1 - 0}  \\
-& \times & \theta^1 (1 - \theta)^{1 - 1} 
-\theta^1 (1 - \theta)^{1 - 1}\\
-& = &  \theta^3(1 - \theta)^1
-\end{eqnarray*}
-\]</li>
-<li>This likelihood only depends on the total number of heads and the total number of tails; we might write \({\cal L}(\theta, 1, 3)\) for shorthand</li>
-<li>Now consider \({\cal L}(.5, 1, 3) / {\cal L}(.25, 1, 3) = 5.33\)</li>
-<li>There is over five times as much evidence supporting the hypothesis that \(\theta = .5\) over that \(\theta = .25\)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-6" style="background:;">
-  <hgroup>
-    <h2>Plotting likelihoods</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Generally, we want to consider all the values of \(\theta\) between 0 and 1</li>
-<li>A <strong>likelihood plot</strong> displays \(\theta\) by \({\cal L}(\theta,x)\)</li>
-<li>Because the likelihood measures <em>relative evidence</em>, dividing the curve by its maximum value (or any other value for that matter) does not change its interpretation</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-7" style="background:;">
-  <article data-timings="">
-    <pre><code class="r">pvals &lt;- seq(0, 1, length = 1000)
-plot(pvals, dbinom(3, 4, pvals) / dbinom(3, 4, 3/4), type = &quot;l&quot;, frame = FALSE, lwd = 3, xlab = &quot;p&quot;, ylab = &quot;likelihood / max likelihood&quot;)
-</code></pre>
-
-<div class="rimage center"><img src="fig/unnamed-chunk-1.png" title="plot of chunk unnamed-chunk-1" alt="plot of chunk unnamed-chunk-1" class="plot" /></div>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-8" style="background:;">
-  <hgroup>
-    <h2>Maximum likelihood</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>The value of \(\theta\) where the curve reaches its maximum has a special meaning</li>
-<li>It is the value of \(\theta\) that is most well supported by the data</li>
-<li>This point is called the <strong>maximum likelihood estimate</strong> (or MLE) of \(\theta\)
-\[
-MLE = \mathrm{argmax}_\theta {\cal L}(\theta, x).
-\]</li>
-<li>Another interpretation of the MLE is that it is the value of \(\theta\) that would make the data that we observed most probable</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-9" style="background:;">
-  <hgroup>
-    <h2>Some results</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>\(X_1, \ldots, X_n \stackrel{iid}{\sim} N(\mu, \sigma^2)\) the MLE of \(\mu\) is \(\bar X\) and the ML of \(\sigma^2\) is the biased sample variance estimate.</li>
-<li>If \(X_1,\ldots, X_n \stackrel{iid}{\sim} Bernoulli(p)\) then the MLE of \(p\) is \(\bar X\) (the sample proportion of 1s).</li>
-<li>If \(X_i \stackrel{iid}{\sim} Binomial(n_i, p)\) then the MLE of \(p\) is \(\frac{\sum_{i=1}^n X_i}{\sum_{i=1}^n n_i}\) (the sample proportion of 1s).</li>
-<li>If \(X \stackrel{iid}{\sim} Poisson(\lambda t)\) then the MLE of \(\lambda\) is \(X/t\).</li>
-<li>If \(X_i \stackrel{iid}{\sim} Poisson(\lambda t_i)\) then the MLE of \(\lambda\) is
-\(\frac{\sum_{i=1}^n X_i}{\sum_{i=1}^n t_i}\)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-10" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>You saw 5 failure events per 94 days of monitoring a nuclear pump. </li>
-<li>Assuming Poisson, plot the likelihood</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-11" style="background:;">
-  <article data-timings="">
-    <pre><code class="r">lambda &lt;- seq(0, .2, length = 1000)
-likelihood &lt;- dpois(5, 94 * lambda) / dpois(5, 5)
-plot(lambda, likelihood, frame = FALSE, lwd = 3, type = &quot;l&quot;, xlab = expression(lambda))
-lines(rep(5/94, 2), 0 : 1, col = &quot;red&quot;, lwd = 3)
-lines(range(lambda[likelihood &gt; 1/16]), rep(1/16, 2), lwd = 2)
-lines(range(lambda[likelihood &gt; 1/8]), rep(1/8, 2), lwd = 2)
-</code></pre>
-
-<div class="rimage center"><img src="fig/unnamed-chunk-2.png" title="plot of chunk unnamed-chunk-2" alt="plot of chunk unnamed-chunk-2" class="plot" /></div>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-    <slide class="backdrop"></slide>
-  </slides>
-  <div class="pagination pagination-small" id='io2012-ptoc' style="display:none;">
-    <ul>
-      <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=1 title='Likelihood'>
-         1
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=2 title='Likelihood'>
-         2
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=3 title='Interpretations of likelihoods'>
-         3
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=4 title='Example'>
-         4
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=5 title='Example continued'>
-         5
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=6 title='Plotting likelihoods'>
-         6
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=7 title=''>
-         7
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=8 title='Maximum likelihood'>
-         8
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=9 title='Some results'>
-         9
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=10 title='Example'>
-         10
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=11 title=''>
-         11
-      </a>
-    </li>
-  </ul>
-  </div>  <!--[if IE]>
-    <script 
-      src="http://ajax.googleapis.com/ajax/libs/chrome-frame/1/CFInstall.min.js">  
-    </script>
-    <script>CFInstall.check({mode: 'overlay'});</script>
-  <![endif]-->
-</body>
-  <!-- Load Javascripts for Widgets -->
-  
-  <!-- MathJax: Fall back to local if CDN offline but local image fonts are not supported (saves >100MB) -->
-  <script type="text/x-mathjax-config">
-    MathJax.Hub.Config({
-      tex2jax: {
-        inlineMath: [['$','$'], ['\\(','\\)']],
-        processEscapes: true
-      }
-    });
-  </script>
-  <script type="text/javascript" src="http://cdn.mathjax.org/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
-  <!-- <script src="https://c328740.ssl.cf1.rackcdn.com/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
-  </script> -->
-  <script>window.MathJax || document.write('<script type="text/x-mathjax-config">MathJax.Hub.Config({"HTML-CSS":{imageFont:null}});<\/script><script src="../../librariesNew/widgets/mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"><\/script>')
-</script>
-<!-- LOAD HIGHLIGHTER JS FILES -->
-  <script src="../../librariesNew/highlighters/highlight.js/highlight.pack.js"></script>
-  <script>hljs.initHighlightingOnLoad();</script>
-  <!-- DONE LOADING HIGHLIGHTER JS FILES -->
-   
+<!DOCTYPE html>
+<html>
+<head>
+  <title>Likelihood</title>
+  <meta charset="utf-8">
+  <meta name="description" content="Likelihood">
+  <meta name="author" content="Brian Caffo, Roger Peng, Jeff Leek">
+  <meta name="generator" content="slidify" />
+  <meta name="apple-mobile-web-app-capable" content="yes">
+  <meta http-equiv="X-UA-Compatible" content="chrome=1">
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/default.css" media="all" >
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/phone.css" 
+    media="only screen and (max-device-width: 480px)" >
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/slidify.css" >
+  <link rel="stylesheet" href="../../librariesNew/highlighters/highlight.js/css/tomorrow.css" />
+  <base target="_blank"> <!-- This amazingness opens all links in a new tab. -->  
+  
+  <!-- Grab CDN jQuery, fall back to local if offline -->
+  <script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.7.min.js"></script>
+  <script>window.jQuery || document.write('<script src="../../librariesNew/widgets/quiz/js/jquery.js"><\/script>')</script> 
+  <script data-main="../../librariesNew/frameworks/io2012/js/slides" 
+    src="../../librariesNew/frameworks/io2012/js/require-1.0.8.min.js">
+  </script>
+  
+  
+
+</head>
+<body style="opacity: 0">
+  <slides class="layout-widescreen">
+    
+    <!-- LOGO SLIDE -->
+        <slide class="title-slide segue nobackground">
+  <aside class="gdbar">
+    <img src="../../assets/img/bloomberg_shield.png">
+  </aside>
+  <hgroup class="auto-fadein">
+    <h1>Likelihood</h1>
+    <h2>Statistical Inference</h2>
+    <p>Brian Caffo, Roger Peng, Jeff Leek<br/>Johns Hopkins Bloomberg School of Public Health</p>
+  </hgroup>
+  <article></article>  
+</slide>
+    
+
+    <!-- SLIDES -->
+    <slide class="" id="slide-1" style="background:;">
+  <hgroup>
+    <h2>Likelihood</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>A common and fruitful approach to statistics is to assume that the data arises from a family of distributions indexed by a parameter that represents a useful summary of the distribution</li>
+<li>The <strong>likelihood</strong> of a collection of data is the joint density evaluated as a function of the parameters with the data fixed</li>
+<li>Likelihood analysis of data uses the likelihood to perform inference regarding the unknown parameter</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-2" style="background:;">
+  <hgroup>
+    <h2>Likelihood</h2>
+  </hgroup>
+  <article data-timings="">
+    <p>Given a statistical probability mass function or density, say \(f(x, \theta)\), where \(\theta\) is an unknown parameter, the <strong>likelihood</strong> is \(f\) viewed as a function of \(\theta\) for a fixed, observed value of \(x\). </p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-3" style="background:;">
+  <hgroup>
+    <h2>Interpretations of likelihoods</h2>
+  </hgroup>
+  <article data-timings="">
+    <p>The likelihood has the following properties:</p>
+
+<ol>
+<li>Ratios of likelihood values measure the relative evidence for one value of the unknown parameter versus another.</li>
+<li>Given a statistical model and observed data, all of the relevant information contained in the data regarding the unknown parameter is contained in the likelihood.</li>
+<li>If \(\{X_i\}\) are independent random variables, then their likelihoods multiply.  That is, the likelihood of the parameters given all of the \(X_i\) is simply the product of the individual likelihoods.</li>
+</ol>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-4" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Suppose that we flip a coin with success probability \(\theta\)</li>
+<li>Recall that the mass function for \(x\) is
+\[
+f(x,\theta) = \theta^x(1 - \theta)^{1 - x}  ~~~\mbox{for}~~~ \theta \in [0,1],
+\]
+where \(x\) is either \(0\) (Tails) or \(1\) (Heads) </li>
+<li>Suppose that the result is a head</li>
+<li>The likelihood is
+\[
+{\cal L}(\theta, 1) = \theta^1 (1 - \theta)^{1 - 1} = \theta  ~~~\mbox{for} ~~~ \theta \in [0,1].
+\]</li>
+<li>Therefore, \({\cal L}(.5, 1) / {\cal L}(.25, 1) = 2\)</li>
+<li>There is twice as much evidence supporting the hypothesis that \(\theta = .5\) as the hypothesis that \(\theta = .25\)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-5" style="background:;">
+  <hgroup>
+    <h2>Example continued</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Suppose now that we flip our coin from the previous example 4 times and get the sequence 1, 0, 1, 1</li>
+<li>The likelihood is:
+\[
+\begin{eqnarray*}
+{\cal L}(\theta, 1,0,1,1) & = & \theta^1 (1 - \theta)^{1 - 1}
+\theta^0 (1 - \theta)^{1 - 0}  \\
+& \times & \theta^1 (1 - \theta)^{1 - 1} 
+\theta^1 (1 - \theta)^{1 - 1}\\
+& = &  \theta^3(1 - \theta)^1
+\end{eqnarray*}
+\]</li>
+<li>This likelihood depends only on the total number of tails and the total number of heads; we might write \({\cal L}(\theta, 1, 3)\) as shorthand (1 tail, 3 heads)</li>
+<li>Now consider \({\cal L}(.5, 1, 3) / {\cal L}(.25, 1, 3) = 5.33\)</li>
+<li>There is over five times as much evidence supporting the hypothesis that \(\theta = .5\) as the hypothesis that \(\theta = .25\)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-6" style="background:;">
+  <hgroup>
+    <h2>Plotting likelihoods</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Generally, we want to consider all the values of \(\theta\) between 0 and 1</li>
+<li>A <strong>likelihood plot</strong> displays \({\cal L}(\theta,x)\) as a function of \(\theta\)</li>
+<li>Because the likelihood measures <em>relative evidence</em>, dividing the curve by its maximum value (or any other value for that matter) does not change its interpretation</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-7" style="background:;">
+  <article data-timings="">
+    <pre><code class="r">pvals &lt;- seq(0, 1, length = 1000)
+plot(pvals, dbinom(3, 4, pvals)/dbinom(3, 4, 3/4), type = &quot;l&quot;, frame = FALSE, 
+    lwd = 3, xlab = &quot;p&quot;, ylab = &quot;likelihood / max likelihood&quot;)
+</code></pre>
+
+<p><img src="assets/fig/unnamed-chunk-1.png" alt="plot of chunk unnamed-chunk-1"> </p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-8" style="background:;">
+  <hgroup>
+    <h2>Maximum likelihood</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>The value of \(\theta\) where the curve reaches its maximum has a special meaning</li>
+<li>It is the value of \(\theta\) that is best supported by the data</li>
+<li>This point is called the <strong>maximum likelihood estimate</strong> (or MLE) of \(\theta\)
+\[
+MLE = \mathrm{argmax}_\theta {\cal L}(\theta, x).
+\]</li>
+<li>Another interpretation of the MLE is that it is the value of \(\theta\) that would make the data that we observed most probable</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-9" style="background:;">
+  <hgroup>
+    <h2>Some results</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>If \(X_1, \ldots, X_n \stackrel{iid}{\sim} N(\mu, \sigma^2)\) then the MLE of \(\mu\) is \(\bar X\) and the MLE of \(\sigma^2\) is the biased sample variance estimate.</li>
+<li>If \(X_1,\ldots, X_n \stackrel{iid}{\sim} Bernoulli(p)\) then the MLE of \(p\) is \(\bar X\) (the sample proportion of 1s).</li>
+<li>If \(X_i \stackrel{ind}{\sim} Binomial(n_i, p)\) then the MLE of \(p\) is \(\frac{\sum_{i=1}^n X_i}{\sum_{i=1}^n n_i}\) (the pooled sample proportion of successes).</li>
+<li>If \(X \sim Poisson(\lambda t)\) then the MLE of \(\lambda\) is \(X/t\).</li>
+<li>If \(X_i \stackrel{ind}{\sim} Poisson(\lambda t_i)\) then the MLE of \(\lambda\) is
+\(\frac{\sum_{i=1}^n X_i}{\sum_{i=1}^n t_i}\)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-10" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>You saw 5 failure events in 94 days of monitoring a nuclear pump. </li>
+<li>Assuming a Poisson model, plot the likelihood for the failure rate \(\lambda\)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-11" style="background:;">
+  <article data-timings="">
+    <pre><code class="r">lambda &lt;- seq(0, 0.2, length = 1000)
+likelihood &lt;- dpois(5, 94 * lambda)/dpois(5, 5)
+plot(lambda, likelihood, frame = FALSE, lwd = 3, type = &quot;l&quot;, xlab = expression(lambda))
+lines(rep(5/94, 2), 0:1, col = &quot;red&quot;, lwd = 3)
+lines(range(lambda[likelihood &gt; 1/16]), rep(1/16, 2), lwd = 2)
+lines(range(lambda[likelihood &gt; 1/8]), rep(1/8, 2), lwd = 2)
+</code></pre>
+
+<p><img src="assets/fig/unnamed-chunk-2.png" alt="plot of chunk unnamed-chunk-2"> </p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+    <slide class="backdrop"></slide>
+  </slides>
+  <div class="pagination pagination-small" id='io2012-ptoc' style="display:none;">
+    <ul>
+      <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=1 title='Likelihood'>
+         1
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=2 title='Likelihood'>
+         2
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=3 title='Interpretations of likelihoods'>
+         3
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=4 title='Example'>
+         4
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=5 title='Example continued'>
+         5
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=6 title='Plotting likelihoods'>
+         6
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=7 title=''>
+         7
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=8 title='Maximum likelihood'>
+         8
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=9 title='Some results'>
+         9
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=10 title='Example'>
+         10
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=11 title=''>
+         11
+      </a>
+    </li>
+  </ul>
+  </div>  <!--[if IE]>
+    <script 
+      src="http://ajax.googleapis.com/ajax/libs/chrome-frame/1/CFInstall.min.js">  
+    </script>
+    <script>CFInstall.check({mode: 'overlay'});</script>
+  <![endif]-->
+</body>
+  <!-- Load Javascripts for Widgets -->
+  
+  <!-- MathJax: Fall back to local if CDN offline but local image fonts are not supported (saves >100MB) -->
+  <script type="text/x-mathjax-config">
+    MathJax.Hub.Config({
+      tex2jax: {
+        inlineMath: [['$','$'], ['\\(','\\)']],
+        processEscapes: true
+      }
+    });
+  </script>
+  <script type="text/javascript" src="http://cdn.mathjax.org/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
+  <!-- <script src="https://c328740.ssl.cf1.rackcdn.com/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
+  </script> -->
+  <script>window.MathJax || document.write('<script type="text/x-mathjax-config">MathJax.Hub.Config({"HTML-CSS":{imageFont:null}});<\/script><script src="../../librariesNew/widgets/mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"><\/script>')
+</script>
+<!-- LOAD HIGHLIGHTER JS FILES -->
+  <script src="../../librariesNew/highlighters/highlight.js/highlight.pack.js"></script>
+  <script>hljs.initHighlightingOnLoad();</script>
+  <!-- DONE LOADING HIGHLIGHTER JS FILES -->
+   
   </html>
\ No newline at end of file
diff --git a/06_StatisticalInference/02_04_Likeklihood/index.md b/06_StatisticalInference/02_04_Likeklihood/index.md
index d7cc17b36..1e0739357 100644
--- a/06_StatisticalInference/02_04_Likeklihood/index.md
+++ b/06_StatisticalInference/02_04_Likeklihood/index.md
@@ -1,138 +1,137 @@
----
-title       : Likelihood
-subtitle    : Statistical Inference
-author      : Brian Caffo, Roger Peng, Jeff Leek
-job         : Johns Hopkins Bloomberg School of Public Health
-logo        : bloomberg_shield.png
-framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
-highlighter : highlight.js  # {highlight.js, prettify, highlight}
-hitheme     : tomorrow      # 
-url:
-  lib: ../../librariesNew
-  assets: ../../assets
-widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
-mode        : selfcontained # {standalone, draft}
----
-
-
-
-## Likelihood
-
-- A common and fruitful approach to statistics is to assume that the data arises from a family of distributions indexed by a parameter that represents a useful summary of the distribution
-- The **likelihood** of a collection of data is the joint density evaluated as a function of the parameters with the data fixed
-- Likelihood analysis of data uses the likelihood to perform inference regarding the unknown parameter
-
----
-
-## Likelihood
-
-Given a statistical probability mass function or density, say $f(x, \theta)$, where $\theta$ is an unknown parameter, the **likelihood** is $f$ viewed as a function of $\theta$ for a fixed, observed value of $x$. 
-
----
-
-## Interpretations of likelihoods
-
-The likelihood has the following properties:
-
-1. Ratios of likelihood values measure the relative evidence of one value of the unknown parameter to another.
-2. Given a statistical model and observed data, all of the relevant information contained in the data regarding the unknown parameter is contained in the likelihood.
-3. If $\{X_i\}$ are independent random variables, then their likelihoods multiply.  That is, the likelihood of the parameters given all of the $X_i$ is simply the product of the individual likelihoods.
-
----
-
-## Example
-
-- Suppose that we flip a coin with success probability $\theta$
-- Recall that the mass function for $x$
-  $$
-  f(x,\theta) = \theta^x(1 - \theta)^{1 - x}  ~~~\mbox{for}~~~ \theta \in [0,1].
-  $$
-  where $x$ is either $0$ (Tails) or $1$ (Heads) 
-- Suppose that the result is a head
-- The likelihood is
-  $$
-  {\cal L}(\theta, 1) = \theta^1 (1 - \theta)^{1 - 1} = \theta  ~~~\mbox{for} ~~~ \theta \in [0,1].
-  $$
-- Therefore, ${\cal L}(.5, 1) / {\cal L}(.25, 1) = 2$, 
-- There is twice as much evidence supporting the hypothesis that $\theta = .5$ to the hypothesis that $\theta = .25$
-
----
-
-## Example continued
-
-- Suppose now that we flip our coin from the previous example 4 times and get the sequence 1, 0, 1, 1
-- The likelihood is:
-$$
-  \begin{eqnarray*}
-  {\cal L}(\theta, 1,0,1,1) & = & \theta^1 (1 - \theta)^{1 - 1}
-  \theta^0 (1 - \theta)^{1 - 0}  \\
-& \times & \theta^1 (1 - \theta)^{1 - 1} 
-   \theta^1 (1 - \theta)^{1 - 1}\\
-& = &  \theta^3(1 - \theta)^1
-  \end{eqnarray*}
-$$
-- This likelihood only depends on the total number of heads and the total number of tails; we might write ${\cal L}(\theta, 1, 3)$ for shorthand
-- Now consider ${\cal L}(.5, 1, 3) / {\cal L}(.25, 1, 3) = 5.33$
-- There is over five times as much evidence supporting the hypothesis that $\theta = .5$ over that $\theta = .25$
-
----
-
-## Plotting likelihoods
-
-- Generally, we want to consider all the values of $\theta$ between 0 and 1
-- A **likelihood plot** displays $\theta$ by ${\cal L}(\theta,x)$
-- Because the likelihood measures *relative evidence*, dividing the curve by its maximum value (or any other value for that matter) does not change its interpretation
-
----
-
-```r
-pvals <- seq(0, 1, length = 1000)
-plot(pvals, dbinom(3, 4, pvals) / dbinom(3, 4, 3/4), type = "l", frame = FALSE, lwd = 3, xlab = "p", ylab = "likelihood / max likelihood")
-```
-
-<div class="rimage center"><img src="fig/unnamed-chunk-1.png" title="plot of chunk unnamed-chunk-1" alt="plot of chunk unnamed-chunk-1" class="plot" /></div>
-
-
-
----
-
-## Maximum likelihood
-
-- The value of $\theta$ where the curve reaches its maximum has a special meaning
-- It is the value of $\theta$ that is most well supported by the data
-- This point is called the **maximum likelihood estimate** (or MLE) of $\theta$
-  $$
-  MLE = \mathrm{argmax}_\theta {\cal L}(\theta, x).
-  $$
-- Another interpretation of the MLE is that it is the value of $\theta$ that would make the data that we observed most probable
-
----
-## Some results
-* $X_1, \ldots, X_n \stackrel{iid}{\sim} N(\mu, \sigma^2)$ the MLE of $\mu$ is $\bar X$ and the ML of $\sigma^2$ is the biased sample variance estimate.
-* If $X_1,\ldots, X_n \stackrel{iid}{\sim} Bernoulli(p)$ then the MLE of $p$ is $\bar X$ (the sample proportion of 1s).
-* If $X_i \stackrel{iid}{\sim} Binomial(n_i, p)$ then the MLE of $p$ is $\frac{\sum_{i=1}^n X_i}{\sum_{i=1}^n n_i}$ (the sample proportion of 1s).
-* If $X \stackrel{iid}{\sim} Poisson(\lambda t)$ then the MLE of $\lambda$ is $X/t$.
-* If $X_i \stackrel{iid}{\sim} Poisson(\lambda t_i)$ then the MLE of $\lambda$ is
-  $\frac{\sum_{i=1}^n X_i}{\sum_{i=1}^n t_i}$
-
----
-## Example
-* You saw 5 failure events per 94 days of monitoring a nuclear pump. 
-* Assuming Poisson, plot the likelihood
-
----
-
-```r
-lambda <- seq(0, .2, length = 1000)
-likelihood <- dpois(5, 94 * lambda) / dpois(5, 5)
-plot(lambda, likelihood, frame = FALSE, lwd = 3, type = "l", xlab = expression(lambda))
-lines(rep(5/94, 2), 0 : 1, col = "red", lwd = 3)
-lines(range(lambda[likelihood > 1/16]), rep(1/16, 2), lwd = 2)
-lines(range(lambda[likelihood > 1/8]), rep(1/8, 2), lwd = 2)
-```
-
-<div class="rimage center"><img src="fig/unnamed-chunk-2.png" title="plot of chunk unnamed-chunk-2" alt="plot of chunk unnamed-chunk-2" class="plot" /></div>
-
-
-
-
+---
+title       : Likelihood
+subtitle    : Statistical Inference
+author      : Brian Caffo, Roger Peng, Jeff Leek
+job         : Johns Hopkins Bloomberg School of Public Health
+logo        : bloomberg_shield.png
+framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
+highlighter : highlight.js  # {highlight.js, prettify, highlight}
+hitheme     : tomorrow      # 
+url:
+  lib: ../../librariesNew
+  assets: ../../assets
+widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
+mode        : selfcontained # {standalone, draft}
+---
+
+## Likelihood
+
+- A common and fruitful approach to statistics is to assume that the data arises from a family of distributions indexed by a parameter that represents a useful summary of the distribution
+- The **likelihood** of a collection of data is the joint density evaluated as a function of the parameters with the data fixed
+- Likelihood analysis of data uses the likelihood to perform inference regarding the unknown parameter
+
+---
+
+## Likelihood
+
+Given a statistical probability mass function or density, say $f(x, \theta)$, where $\theta$ is an unknown parameter, the **likelihood** is $f$ viewed as a function of $\theta$ for a fixed, observed value of $x$. 
+
+---
+
+## Interpretations of likelihoods
+
+The likelihood has the following properties:
+
+1. Ratios of likelihood values measure the relative evidence for one value of the unknown parameter versus another.
+2. Given a statistical model and observed data, all of the relevant information contained in the data regarding the unknown parameter is contained in the likelihood.
+3. If $\{X_i\}$ are independent random variables, then their likelihoods multiply.  That is, the likelihood of the parameters given all of the $X_i$ is simply the product of the individual likelihoods.
+
+---
+
+## Example
+
+- Suppose that we flip a coin with success probability $\theta$
+- Recall that the mass function for $x$ is
+  $$
+  f(x,\theta) = \theta^x(1 - \theta)^{1 - x}  ~~~\mbox{for}~~~ \theta \in [0,1],
+  $$
+  where $x$ is either $0$ (Tails) or $1$ (Heads) 
+- Suppose that the result is a head
+- The likelihood is
+  $$
+  {\cal L}(\theta, 1) = \theta^1 (1 - \theta)^{1 - 1} = \theta  ~~~\mbox{for} ~~~ \theta \in [0,1].
+  $$
+- Therefore, ${\cal L}(.5, 1) / {\cal L}(.25, 1) = 2$
+- There is twice as much evidence supporting the hypothesis that $\theta = .5$ as the hypothesis that $\theta = .25$
+
+---
+
+## Example continued
+
+- Suppose now that we flip our coin from the previous example 4 times and get the sequence 1, 0, 1, 1
+- The likelihood is:
+$$
+  \begin{eqnarray*}
+  {\cal L}(\theta, 1,0,1,1) & = & \theta^1 (1 - \theta)^{1 - 1}
+  \theta^0 (1 - \theta)^{1 - 0}  \\
+& \times & \theta^1 (1 - \theta)^{1 - 1} 
+   \theta^1 (1 - \theta)^{1 - 1}\\
+& = &  \theta^3(1 - \theta)^1
+  \end{eqnarray*}
+$$
+- This likelihood depends only on the total number of tails and the total number of heads; we might write ${\cal L}(\theta, 1, 3)$ as shorthand (1 tail, 3 heads)
+- Now consider ${\cal L}(.5, 1, 3) / {\cal L}(.25, 1, 3) = 5.33$
+- There is over five times as much evidence supporting the hypothesis that $\theta = .5$ as the hypothesis that $\theta = .25$
+
+---
+
+## Plotting likelihoods
+
+- Generally, we want to consider all the values of $\theta$ between 0 and 1
+- A **likelihood plot** displays ${\cal L}(\theta,x)$ as a function of $\theta$
+- Because the likelihood measures *relative evidence*, dividing the curve by its maximum value (or any other value for that matter) does not change its interpretation
+
+---
+
+```r
+pvals <- seq(0, 1, length = 1000)
+plot(pvals, dbinom(3, 4, pvals)/dbinom(3, 4, 3/4), type = "l", frame = FALSE, 
+    lwd = 3, xlab = "p", ylab = "likelihood / max likelihood")
+```
+
+![plot of chunk unnamed-chunk-1](assets/fig/unnamed-chunk-1.png) 
+
+
+
+---
+
+## Maximum likelihood
+
+- The value of $\theta$ where the curve reaches its maximum has a special meaning
+- It is the value of $\theta$ that is best supported by the data
+- This point is called the **maximum likelihood estimate** (or MLE) of $\theta$
+  $$
+  MLE = \mathrm{argmax}_\theta {\cal L}(\theta, x).
+  $$
+- Another interpretation of the MLE is that it is the value of $\theta$ that would make the data that we observed most probable
+
+---
+## Some results
+* If $X_1, \ldots, X_n \stackrel{iid}{\sim} N(\mu, \sigma^2)$ then the MLE of $\mu$ is $\bar X$ and the MLE of $\sigma^2$ is the biased sample variance estimate.
+* If $X_1,\ldots, X_n \stackrel{iid}{\sim} Bernoulli(p)$ then the MLE of $p$ is $\bar X$ (the sample proportion of 1s).
+* If $X_i \stackrel{ind}{\sim} Binomial(n_i, p)$ then the MLE of $p$ is $\frac{\sum_{i=1}^n X_i}{\sum_{i=1}^n n_i}$ (the pooled sample proportion of successes).
+* If $X \sim Poisson(\lambda t)$ then the MLE of $\lambda$ is $X/t$.
+* If $X_i \stackrel{ind}{\sim} Poisson(\lambda t_i)$ then the MLE of $\lambda$ is
+  $\frac{\sum_{i=1}^n X_i}{\sum_{i=1}^n t_i}$
+
+---
+## Example
+* You saw 5 failure events per 94 days of monitoring a nuclear pump. 
+* Assuming the failure count is Poisson, plot the likelihood for the failure rate $\lambda$
+
+---
+
+```r
+lambda <- seq(0, 0.2, length = 1000)
+likelihood <- dpois(5, 94 * lambda)/dpois(5, 5)
+plot(lambda, likelihood, frame = FALSE, lwd = 3, type = "l", xlab = expression(lambda))
+lines(rep(5/94, 2), 0:1, col = "red", lwd = 3)
+lines(range(lambda[likelihood > 1/16]), rep(1/16, 2), lwd = 2)
+lines(range(lambda[likelihood > 1/8]), rep(1/8, 2), lwd = 2)
+```
+
+![plot of chunk unnamed-chunk-2](assets/fig/unnamed-chunk-2.png) 
+
+
+
+
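A quick numerical check of two claims in the likelihood slides added above. This is only a sketch: it assumes the coin data are three heads and one tail (which is what the `dbinom(3, 4, ...)` call and the quoted ratio of 5.33 imply), and the names `lik`, `t_days`, `mle_numeric` and `mle_closed` are illustrative, not from the slides.

```r
# Coin example, assuming 3 heads and 1 tail in 4 flips.
lik <- function(theta) theta^3 * (1 - theta)^1

# Likelihood ratio comparing theta = .5 against theta = .25.
ratio <- lik(0.5) / lik(0.25)
print(round(ratio, 2))

# Pump example: 5 failures in 94 days, X ~ Poisson(lambda * t).
x <- 5
t_days <- 94

# Maximise the likelihood numerically over the plotted range of lambda.
fit <- optimize(function(l) dpois(x, t_days * l),
                interval = c(0, 0.2), maximum = TRUE, tol = 1e-8)
mle_numeric <- fit$maximum

# Closed-form MLE from the "Some results" slide: X / t.
mle_closed <- x / t_days

print(c(numeric = mle_numeric, closed_form = mle_closed))
```

The numerical maximiser should land on essentially the same value as $X/t$, which is where the red vertical line is drawn in the pump plot.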
diff --git a/06_StatisticalInference/02_04_Likeklihood/index.pdf b/06_StatisticalInference/02_04_Likeklihood/index.pdf
index 85787788c..4e3388a80 100644
Binary files a/06_StatisticalInference/02_04_Likeklihood/index.pdf and b/06_StatisticalInference/02_04_Likeklihood/index.pdf differ
diff --git a/06_StatisticalInference/02_05_Bayes/index.Rmd b/06_StatisticalInference/02_05_Bayes/index.Rmd
index b1aae880c..7c49de8da 100644
--- a/06_StatisticalInference/02_05_Bayes/index.Rmd
+++ b/06_StatisticalInference/02_05_Bayes/index.Rmd
@@ -1,204 +1,187 @@
----
-title       : Bayesian inference
-subtitle    : Statistical Inference
-author      : Brian Caffo, Roger Peng, Jeff Leek
-job         : Johns Hopkins Bloomberg School of Public Health
-logo        : bloomberg_shield.png
-framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
-highlighter : highlight.js  # {highlight.js, prettify, highlight}
-hitheme     : tomorrow      # 
-url:
-  lib: ../../librariesNew
-  assets: ../../assets
-widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
-mode        : selfcontained # {standalone, draft}
----
-```{r setup, cache = F, echo = F, message = F, warning = F, tidy = F, results='hide'}
-# make this an external chunk that can be included in any file
-options(width = 100)
-opts_chunk$set(message = F, error = F, warning = F, comment = NA, fig.align = 'center', dpi = 100, tidy = F, cache.path = '.cache/', fig.path = 'fig/')
-
-options(xtable.type = 'html')
-knit_hooks$set(inline = function(x) {
-  if(is.numeric(x)) {
-    round(x, getOption('digits'))
-  } else {
-    paste(as.character(x), collapse = ', ')
-  }
-})
-knit_hooks$set(plot = knitr:::hook_plot_html)
-runif(1)
-```
-
-## Bayesian analysis
-- Bayesian statistics posits a *prior* on the parameter
-  of interest
-- All inferences are then performed on the distribution of 
-  the parameter given the data, called the posterior
-- In general,
-  $$
-  \mbox{Posterior} \propto \mbox{Likelihood} \times \mbox{Prior}
-  $$
-- Therefore (as we saw in diagnostic testing) the likelihood is
-  the factor by which our prior beliefs are updated to produce
-  conclusions in the light of the data
-
----
-## Prior specification
-- The beta distribution is the default prior
-  for parameters between $0$ and $1$.
-- The beta density depends on two parameters $\alpha$ and $\beta$
-$$
-\frac{\Gamma(\alpha +  \beta)}{\Gamma(\alpha)\Gamma(\beta)}
- p ^ {\alpha - 1} (1 - p) ^ {\beta - 1} ~~~~\mbox{for} ~~ 0 \leq p \leq 1
-$$
-- The mean of the beta density is $\alpha / (\alpha + \beta)$
-- The variance of the beta density is 
-$$\frac{\alpha \beta}{(\alpha + \beta)^2 (\alpha + \beta + 1)}$$
-- The uniform density is the special case where $\alpha = \beta = 1$
-
----
-
-```
-## Exploring the beta density
-library(manipulate)
-pvals <- seq(0.01, 0.99, length = 1000)
-manipulate(
-    plot(pvals, dbeta(pvals, alpha, beta), type = "l", lwd = 3, frame = FALSE),
-    alpha = slider(0.01, 10, initial = 1, step = .5),
-    beta = slider(0.01, 10, initial = 1, step = .5)
-    )
-```
-
----
-## Posterior 
-- Suppose that we chose values of $\alpha$ and $\beta$ so that
-  the beta prior is indicative of our degree of belief regarding $p$
-  in the absence of data
-- Then using the rule that
-  $$
-  \mbox{Posterior} \propto \mbox{Likelihood} \times \mbox{Prior}
-  $$
-  and throwing out anything that doesn't depend on $p$, we have that
-$$
-\begin{align}
-\mbox{Posterior} &\propto  p^x(1 - p)^{n-x} \times p^{\alpha -1} (1 - p)^{\beta - 1} \\
-                 &  =      p^{x + \alpha - 1} (1 - p)^{n - x + \beta - 1}
-\end{align}
-$$
-- This density is just another beta density with parameters
-  $\tilde \alpha = x + \alpha$ and $\tilde \beta = n - x + \beta$
-
-
----
-## Posterior mean
-
-$$
-\begin{align}
-E[p ~|~ X] & =   \frac{\tilde \alpha}{\tilde \alpha + \tilde \beta}\\ \\
-& =  \frac{x + \alpha}{x + \alpha + n - x + \beta}\\ \\
-& =  \frac{x + \alpha}{n + \alpha + \beta} \\ \\
-& =  \frac{x}{n} \times \frac{n}{n + \alpha + \beta} + \frac{\alpha}{\alpha + \beta} \times \frac{\alpha + \beta}{n + \alpha + \beta} \\ \\
-& =  \mbox{MLE} \times \pi + \mbox{Prior Mean} \times (1 - \pi)
-\end{align}
-$$
-
----
-## Thoughts
-
-- The posterior mean is a mixture of the MLE ($\hat p$) and the
-  prior mean
-- $\pi$ goes to $1$ as $n$ gets large; for large $n$ the data swamps the prior
-- For small $n$, the prior mean dominates 
-- Generalizes how science should ideally work; as data becomes
-  increasingly available, prior beliefs should matter less and less
-- With a prior that is degenerate at a value, no amount of data
-  can overcome the prior
-
----
-## Example
-
-- Suppose that in a random sample of an at-risk population
-$13$ of $20$ subjects had hypertension. Estimate the prevalence
-of hypertension in this population.
-- $x = 13$ and $n=20$
-- Consider a uniform prior, $\alpha = \beta = 1$
-- The posterior is proportional to (see formula above)
-$$
-p^{x + \alpha - 1} (1 - p)^{n - x + \beta - 1} = p^x (1 - p)^{n-x}
-$$
-That is, for the uniform prior, the posterior is the likelihood
-- Consider the instance where $\alpha = \beta = 2$ (recall this prior
-is humped around the point $.5$) the posterior is
-$$
-p^{x + \alpha - 1} (1 - p)^{n - x + \beta - 1} = p^{x + 1} (1 - p)^{n-x + 1}
-$$
-- The "Jeffrey's prior" which has some theoretical benefits
-  puts $\alpha = \beta = .5$
-
----
-```
-pvals <- seq(0.01, 0.99, length = 1000)
-x <- 13; n <- 20
-myPlot <- function(alpha, beta){
-    plot(0 : 1, 0 : 1, type = "n", xlab = "p", ylab = "", frame = FALSE)
-    lines(pvals, dbeta(pvals, alpha, beta) / max(dbeta(pvals, alpha, beta)), 
-            lwd = 3, col = "darkred")
-    lines(pvals, dbinom(x,n,pvals) / dbinom(x,n,x/n), lwd = 3, col = "darkblue")
-    lines(pvals, dbeta(pvals, alpha+x, beta+(n-x)) / max(dbeta(pvals, alpha+x, beta+(n-x))),
-        lwd = 3, col = "darkgreen")
-    title("red=prior,green=posterior,blue=likelihood")
-}
-manipulate(
-    myPlot(alpha, beta),
-    alpha = slider(0.01, 100, initial = 1, step = .5),
-    beta = slider(0.01, 100, initial = 1, step = .5)
-    )
-```
-
----
-## Credible intervals
-- A Bayesian credible interval is the  Bayesian analog of a confidence
-  interval
-- A $95\%$ credible interval, $[a, b]$ would satisfy
-  $$
-  P(p \in [a, b] ~|~ x) = .95
-  $$
-- The best credible intervals chop off the posterior with a horizontal
-  line in the same way we did for likelihoods 
-- These are called highest posterior density (HPD) intervals
-
----
-## Getting HPD intervals for this example
-- Install the \texttt{binom} package, then the command
-```{r}
-library(binom)
-binom.bayes(13, 20, type = "highest")
-```
-gives the HPD interval. 
-- The default credible level is $95\%$ and
-the default prior is the Jeffrey's prior.
-
----
-```
-pvals <- seq(0.01, 0.99, length = 1000)
-x <- 13; n <- 20
-myPlot2 <- function(alpha, beta, cl){
-    plot(pvals, dbeta(pvals, alpha+x, beta+(n-x)), type = "l", lwd = 3,
-    xlab = "p", ylab = "", frame = FALSE)
-    out <- binom.bayes(x, n, type = "highest", 
-        prior.shape1 = alpha, 
-        prior.shape2 = beta, 
-        conf.level = cl)
-    p1 <- out$lower; p2 <- out$upper
-    lines(c(p1, p1, p2, p2), c(0, dbeta(c(p1, p2), alpha+x, beta+(n-x)), 0), 
-        type = "l", lwd = 3, col = "darkred")
-}
-manipulate(
-    myPlot2(alpha, beta, cl),
-    alpha = slider(0.01, 10, initial = 1, step = .5),
-    beta = slider(0.01, 10, initial = 1, step = .5),
-    cl = slider(0.01, 0.99, initial = 0.95, step = .01)
-    )
-```
-
+---
+title       : Bayesian inference
+subtitle    : Statistical Inference
+author      : Brian Caffo, Roger Peng, Jeff Leek
+job         : Johns Hopkins Bloomberg School of Public Health
+logo        : bloomberg_shield.png
+framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
+highlighter : highlight.js  # {highlight.js, prettify, highlight}
+hitheme     : tomorrow      # 
+url:
+  lib: ../../librariesNew
+  assets: ../../assets
+widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
+mode        : selfcontained # {standalone, draft}
+---
+## Bayesian analysis
+- Bayesian statistics posits a *prior* on the parameter
+  of interest
+- All inferences are then performed on the distribution of 
+  the parameter given the data, called the posterior
+- In general,
+  $$
+  \mbox{Posterior} \propto \mbox{Likelihood} \times \mbox{Prior}
+  $$
+- Therefore (as we saw in diagnostic testing) the likelihood is
+  the factor by which our prior beliefs are updated to produce
+  conclusions in the light of the data
+
+---
+## Prior specification
+- The beta distribution is the default prior
+  for parameters between $0$ and $1$.
+- The beta density depends on two parameters $\alpha$ and $\beta$
+$$
+\frac{\Gamma(\alpha +  \beta)}{\Gamma(\alpha)\Gamma(\beta)}
+ p ^ {\alpha - 1} (1 - p) ^ {\beta - 1} ~~~~\mbox{for} ~~ 0 \leq p \leq 1
+$$
+- The mean of the beta density is $\alpha / (\alpha + \beta)$
+- The variance of the beta density is 
+$$\frac{\alpha \beta}{(\alpha + \beta)^2 (\alpha + \beta + 1)}$$
+- The uniform density is the special case where $\alpha = \beta = 1$
+
+---
+
+```
+## Exploring the beta density
+library(manipulate)
+pvals <- seq(0.01, 0.99, length = 1000)
+manipulate(
+    plot(pvals, dbeta(pvals, alpha, beta), type = "l", lwd = 3, frame = FALSE),
+    alpha = slider(0.01, 10, initial = 1, step = .5),
+    beta = slider(0.01, 10, initial = 1, step = .5)
+    )
+```
+
+---
+## Posterior 
+- Suppose that we chose values of $\alpha$ and $\beta$ so that
+  the beta prior is indicative of our degree of belief regarding $p$
+  in the absence of data
+- Then using the rule that
+  $$
+  \mbox{Posterior} \propto \mbox{Likelihood} \times \mbox{Prior}
+  $$
+  and throwing out anything that doesn't depend on $p$, we have that
+$$
+\begin{align}
+\mbox{Posterior} &\propto  p^x(1 - p)^{n-x} \times p^{\alpha -1} (1 - p)^{\beta - 1} \\
+                 &  =      p^{x + \alpha - 1} (1 - p)^{n - x + \beta - 1}
+\end{align}
+$$
+- This density is just another beta density with parameters
+  $\tilde \alpha = x + \alpha$ and $\tilde \beta = n - x + \beta$
+
+
+---
+## Posterior mean
+
+$$
+\begin{align}
+E[p ~|~ X] & =   \frac{\tilde \alpha}{\tilde \alpha + \tilde \beta}\\ \\
+& =  \frac{x + \alpha}{x + \alpha + n - x + \beta}\\ \\
+& =  \frac{x + \alpha}{n + \alpha + \beta} \\ \\
+& =  \frac{x}{n} \times \frac{n}{n + \alpha + \beta} + \frac{\alpha}{\alpha + \beta} \times \frac{\alpha + \beta}{n + \alpha + \beta} \\ \\
+& =  \mbox{MLE} \times \pi + \mbox{Prior Mean} \times (1 - \pi)
+\end{align}
+$$
+
+---
+## Thoughts
+
+- The posterior mean is a mixture of the MLE ($\hat p$) and the
+  prior mean
+- $\pi$ goes to $1$ as $n$ gets large; for large $n$ the data swamps the prior
+- For small $n$, the prior mean dominates 
+- Generalizes how science should ideally work; as data becomes
+  increasingly available, prior beliefs should matter less and less
+- With a prior that is degenerate at a value, no amount of data
+  can overcome the prior
+
+---
+## Example
+
+- Suppose that in a random sample of an at-risk population
+$13$ of $20$ subjects had hypertension. Estimate the prevalence
+of hypertension in this population.
+- $x = 13$ and $n=20$
+- Consider a uniform prior, $\alpha = \beta = 1$
+- The posterior is proportional to (see formula above)
+$$
+p^{x + \alpha - 1} (1 - p)^{n - x + \beta - 1} = p^x (1 - p)^{n-x}
+$$
+That is, for the uniform prior, the posterior is proportional to the likelihood
+- Consider the instance where $\alpha = \beta = 2$ (recall this prior
+is humped around the point $.5$); the posterior is then proportional to
+$$
+p^{x + \alpha - 1} (1 - p)^{n - x + \beta - 1} = p^{x + 1} (1 - p)^{n-x + 1}
+$$
+- The "Jeffreys prior", which has some theoretical benefits,
+  puts $\alpha = \beta = .5$
+
+---
+```
+pvals <- seq(0.01, 0.99, length = 1000)
+x <- 13; n <- 20
+myPlot <- function(alpha, beta){
+    plot(0 : 1, 0 : 1, type = "n", xlab = "p", ylab = "", frame = FALSE)
+    lines(pvals, dbeta(pvals, alpha, beta) / max(dbeta(pvals, alpha, beta)), 
+            lwd = 3, col = "darkred")
+    lines(pvals, dbinom(x,n,pvals) / dbinom(x,n,x/n), lwd = 3, col = "darkblue")
+    lines(pvals, dbeta(pvals, alpha+x, beta+(n-x)) / max(dbeta(pvals, alpha+x, beta+(n-x))),
+        lwd = 3, col = "darkgreen")
+    title("red=prior,green=posterior,blue=likelihood")
+}
+manipulate(
+    myPlot(alpha, beta),
+    alpha = slider(0.01, 100, initial = 1, step = .5),
+    beta = slider(0.01, 100, initial = 1, step = .5)
+    )
+```
+
+---
+## Credible intervals
+- A Bayesian credible interval is the Bayesian analog of a confidence
+  interval
+- A $95\%$ credible interval, $[a, b]$, would satisfy
+  $$
+  P(p \in [a, b] ~|~ x) = .95
+  $$
+- The best credible intervals chop off the posterior with a horizontal
+  line in the same way we did for likelihoods 
+- These are called highest posterior density (HPD) intervals
+
+---
+## Getting HPD intervals for this example
+- Install the `binom` package, then the command
+```{r}
+library(binom)
+binom.bayes(13, 20, type = "highest")
+```
+gives the HPD interval. 
+- The default credible level is $95\%$ and
+the default prior is the Jeffreys prior.
+
+---
+```
+pvals <- seq(0.01, 0.99, length = 1000)
+x <- 13; n <- 20
+myPlot2 <- function(alpha, beta, cl){
+    plot(pvals, dbeta(pvals, alpha+x, beta+(n-x)), type = "l", lwd = 3,
+    xlab = "p", ylab = "", frame = FALSE)
+    out <- binom.bayes(x, n, type = "highest", 
+        prior.shape1 = alpha, 
+        prior.shape2 = beta, 
+        conf.level = cl)
+    p1 <- out$lower; p2 <- out$upper
+    lines(c(p1, p1, p2, p2), c(0, dbeta(c(p1, p2), alpha+x, beta+(n-x)), 0), 
+        type = "l", lwd = 3, col = "darkred")
+}
+manipulate(
+    myPlot2(alpha, beta, cl),
+    alpha = slider(0.01, 10, initial = 1, step = .5),
+    beta = slider(0.01, 10, initial = 1, step = .5),
+    cl = slider(0.01, 0.99, initial = 0.95, step = .01)
+    )
+```
+
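The `manipulate()` chunks in the slides above only run interactively in RStudio, so here is a non-interactive sketch of the same beta-binomial update for the hypertension example. It uses base R only. The variable names (`prior_a`, `prior_b`, `w`, and so on) are illustrative, and the interval computed here is an equal-tailed one from `qbeta()`, which is an assumption of this sketch and not the HPD interval that `binom.bayes(..., type = "highest")` reports.

```r
# Data from the example slide: 13 of 20 subjects had hypertension.
x <- 13
n <- 20

# Jeffreys prior: Beta(0.5, 0.5).
prior_a <- 0.5
prior_b <- 0.5

# Conjugate update: the posterior is Beta(x + alpha, n - x + beta).
post_a <- x + prior_a
post_b <- n - x + prior_b

# Posterior mean, computed directly ...
post_mean <- post_a / (post_a + post_b)

# ... and as the weighted average MLE * pi + prior mean * (1 - pi).
w <- n / (n + prior_a + prior_b)
mix <- (x / n) * w + (prior_a / (prior_a + prior_b)) * (1 - w)

# Equal-tailed 95% credible interval (NOT the HPD interval).
ci <- qbeta(c(0.025, 0.975), post_a, post_b)

print(round(c(post_mean = post_mean, mix = mix), 4))
print(ci)
```

The two versions of the posterior mean agree, and the posterior shape parameters (13.5 and 7.5) match those shown in the `binom.bayes` output in the rendered slides. Because this posterior is skewed, the equal-tailed interval will generally differ slightly from the HPD interval.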
diff --git a/06_StatisticalInference/02_05_Bayes/index.html b/06_StatisticalInference/02_05_Bayes/index.html
index 5fb2d1b05..ac1863c48 100644
--- a/06_StatisticalInference/02_05_Bayes/index.html
+++ b/06_StatisticalInference/02_05_Bayes/index.html
@@ -1,403 +1,407 @@
-<!DOCTYPE html>
-<html>
-<head>
-  <title>Bayesian inference</title>
-  <meta charset="utf-8">
-  <meta name="description" content="Bayesian inference">
-  <meta name="author" content="Brian Caffo, Roger Peng, Jeff Leek">
-  <meta name="generator" content="slidify" />
-  <meta name="apple-mobile-web-app-capable" content="yes">
-  <meta http-equiv="X-UA-Compatible" content="chrome=1">
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/default.css" media="all" >
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/phone.css" 
-    media="only screen and (max-device-width: 480px)" >
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/slidify.css" >
-  <link rel="stylesheet" href="../../librariesNew/highlighters/highlight.js/css/tomorrow.css" />
-  <base target="_blank"> <!-- This amazingness opens all links in a new tab. -->  
-  
-  <!-- Grab CDN jQuery, fall back to local if offline -->
-  <script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.7.min.js"></script>
-  <script>window.jQuery || document.write('<script src="../../librariesNew/widgets/quiz/js/jquery.js"><\/script>')</script> 
-  <script data-main="../../librariesNew/frameworks/io2012/js/slides" 
-    src="../../librariesNew/frameworks/io2012/js/require-1.0.8.min.js">
-  </script>
-  
-  
-
-</head>
-<body style="opacity: 0">
-  <slides class="layout-widescreen">
-    
-    <!-- LOGO SLIDE -->
-        <slide class="title-slide segue nobackground">
-  <aside class="gdbar">
-    <img src="../../assets/img/bloomberg_shield.png">
-  </aside>
-  <hgroup class="auto-fadein">
-    <h1>Bayesian inference</h1>
-    <h2>Statistical Inference</h2>
-    <p>Brian Caffo, Roger Peng, Jeff Leek<br/>Johns Hopkins Bloomberg School of Public Health</p>
-  </hgroup>
-  <article></article>  
-</slide>
-    
-
-    <!-- SLIDES -->
-    <slide class="" id="slide-1" style="background:;">
-  <hgroup>
-    <h2>Bayesian analysis</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Bayesian statistics posits a <em>prior</em> on the parameter
-of interest</li>
-<li>All inferences are then performed on the distribution of 
-the parameter given the data, called the posterior</li>
-<li>In general,
-\[
-\mbox{Posterior} \propto \mbox{Likelihood} \times \mbox{Prior}
-\]</li>
-<li>Therefore (as we saw in diagnostic testing) the likelihood is
-the factor by which our prior beliefs are updated to produce
-conclusions in the light of the data</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-2" style="background:;">
-  <hgroup>
-    <h2>Prior specification</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>The beta distribution is the default prior
-for parameters between \(0\) and \(1\).</li>
-<li>The beta density depends on two parameters \(\alpha\) and \(\beta\)
-\[
-\frac{\Gamma(\alpha +  \beta)}{\Gamma(\alpha)\Gamma(\beta)}
-p ^ {\alpha - 1} (1 - p) ^ {\beta - 1} ~~~~\mbox{for} ~~ 0 \leq p \leq 1
-\]</li>
-<li>The mean of the beta density is \(\alpha / (\alpha + \beta)\)</li>
-<li>The variance of the beta density is 
-\[\frac{\alpha \beta}{(\alpha + \beta)^2 (\alpha + \beta + 1)}\]</li>
-<li>The uniform density is the special case where \(\alpha = \beta = 1\)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-3" style="background:;">
-  <article data-timings="">
-    <pre><code>## Exploring the beta density
-library(manipulate)
-pvals &lt;- seq(0.01, 0.99, length = 1000)
-manipulate(
-    plot(pvals, dbeta(pvals, alpha, beta), type = &quot;l&quot;, lwd = 3, frame = FALSE),
-    alpha = slider(0.01, 10, initial = 1, step = .5),
-    beta = slider(0.01, 10, initial = 1, step = .5)
-    )
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-4" style="background:;">
-  <hgroup>
-    <h2>Posterior</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Suppose that we chose values of \(\alpha\) and \(\beta\) so that
-the beta prior is indicative of our degree of belief regarding \(p\)
-in the absence of data</li>
-<li>Then using the rule that
-\[
-\mbox{Posterior} \propto \mbox{Likelihood} \times \mbox{Prior}
-\]
-and throwing out anything that doesn&#39;t depend on \(p\), we have that
-\[
-\begin{align}
-\mbox{Posterior} &\propto  p^x(1 - p)^{n-x} \times p^{\alpha -1} (1 - p)^{\beta - 1} \\
-             &  =      p^{x + \alpha - 1} (1 - p)^{n - x + \beta - 1}
-\end{align}
-\]</li>
-<li>This density is just another beta density with parameters
-\(\tilde \alpha = x + \alpha\) and \(\tilde \beta = n - x + \beta\)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-5" style="background:;">
-  <hgroup>
-    <h2>Posterior mean</h2>
-  </hgroup>
-  <article data-timings="">
-    <p>\[
-\begin{align}
-E[p ~|~ X] & =   \frac{\tilde \alpha}{\tilde \alpha + \tilde \beta}\\ \\
-& =  \frac{x + \alpha}{x + \alpha + n - x + \beta}\\ \\
-& =  \frac{x + \alpha}{n + \alpha + \beta} \\ \\
-& =  \frac{x}{n} \times \frac{n}{n + \alpha + \beta} + \frac{\alpha}{\alpha + \beta} \times \frac{\alpha + \beta}{n + \alpha + \beta} \\ \\
-& =  \mbox{MLE} \times \pi + \mbox{Prior Mean} \times (1 - \pi)
-\end{align}
-\]</p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-6" style="background:;">
-  <hgroup>
-    <h2>Thoughts</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>The posterior mean is a mixture of the MLE (\(\hat p\)) and the
-prior mean</li>
-<li>\(\pi\) goes to \(1\) as \(n\) gets large; for large \(n\) the data swamps the prior</li>
-<li>For small \(n\), the prior mean dominates </li>
-<li>Generalizes how science should ideally work; as data becomes
-increasingly available, prior beliefs should matter less and less</li>
-<li>With a prior that is degenerate at a value, no amount of data
-can overcome the prior</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-7" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Suppose that in a random sample of an at-risk population
-\(13\) of \(20\) subjects had hypertension. Estimate the prevalence
-of hypertension in this population.</li>
-<li>\(x = 13\) and \(n=20\)</li>
-<li>Consider a uniform prior, \(\alpha = \beta = 1\)</li>
-<li>The posterior is proportional to (see formula above)
-\[
-p^{x + \alpha - 1} (1 - p)^{n - x + \beta - 1} = p^x (1 - p)^{n-x}
-\]
-That is, for the uniform prior, the posterior is the likelihood</li>
-<li>Consider the instance where \(\alpha = \beta = 2\) (recall this prior
-is humped around the point \(.5\)) the posterior is
-\[
-p^{x + \alpha - 1} (1 - p)^{n - x + \beta - 1} = p^{x + 1} (1 - p)^{n-x + 1}
-\]</li>
-<li>The &quot;Jeffrey&#39;s prior&quot; which has some theoretical benefits
-puts \(\alpha = \beta = .5\)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-8" style="background:;">
-  <article data-timings="">
-    <pre><code>pvals &lt;- seq(0.01, 0.99, length = 1000)
-x &lt;- 13; n &lt;- 20
-myPlot &lt;- function(alpha, beta){
-    plot(0 : 1, 0 : 1, type = &quot;n&quot;, xlab = &quot;p&quot;, ylab = &quot;&quot;, frame = FALSE)
-    lines(pvals, dbeta(pvals, alpha, beta) / max(dbeta(pvals, alpha, beta)), 
-            lwd = 3, col = &quot;darkred&quot;)
-    lines(pvals, dbinom(x,n,pvals) / dbinom(x,n,x/n), lwd = 3, col = &quot;darkblue&quot;)
-    lines(pvals, dbeta(pvals, alpha+x, beta+(n-x)) / max(dbeta(pvals, alpha+x, beta+(n-x))),
-        lwd = 3, col = &quot;darkgreen&quot;)
-    title(&quot;red=prior,green=posterior,blue=likelihood&quot;)
-}
-manipulate(
-    myPlot(alpha, beta),
-    alpha = slider(0.01, 10, initial = 1, step = .5),
-    beta = slider(0.01, 10, initial = 1, step = .5)
-    )
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-9" style="background:;">
-  <hgroup>
-    <h2>Credible intervals</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>A Bayesian credible interval is the  Bayesian analog of a confidence
-interval</li>
-<li>A \(95\%\) credible interval, \([a, b]\) would satisfy
-\[
-P(p \in [a, b] ~|~ x) = .95
-\]</li>
-<li>The best credible intervals chop off the posterior with a horizontal
-line in the same way we did for likelihoods </li>
-<li>These are called highest posterior density (HPD) intervals</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-10" style="background:;">
-  <hgroup>
-    <h2>Getting HPD intervals for this example</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Install the \texttt{binom} package, then the command</li>
-</ul>
-
-<pre><code class="r">library(binom)
-binom.bayes(13, 20, type = &quot;highest&quot;)
-</code></pre>
-
-<pre><code>  method  x  n shape1 shape2   mean  lower  upper  sig
-1  bayes 13 20   13.5    7.5 0.6429 0.4423 0.8361 0.05
-</code></pre>
-
-<p>gives the HPD interval. </p>
-
-<ul>
-<li>The default credible level is \(95\%\) and
-the default prior is the Jeffrey&#39;s prior.</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-11" style="background:;">
-  <article data-timings="">
-    <pre><code>pvals &lt;- seq(0.01, 0.99, length = 1000)
-x &lt;- 13; n &lt;- 20
-myPlot2 &lt;- function(alpha, beta, cl){
-    plot(pvals, dbeta(pvals, alpha+x, beta+(n-x)), type = &quot;l&quot;, lwd = 3,
-    xlab = &quot;p&quot;, ylab = &quot;&quot;, frame = FALSE)
-    out &lt;- binom.bayes(x, n, type = &quot;highest&quot;, 
-        prior.shape1 = alpha, 
-        prior.shape2 = beta, 
-        conf.level = cl)
-    p1 &lt;- out$lower; p2 &lt;- out$upper
-    lines(c(p1, p1, p2, p2), c(0, dbeta(c(p1, p2), alpha+x, beta+(n-x)), 0), 
-        type = &quot;l&quot;, lwd = 3, col = &quot;darkred&quot;)
-}
-manipulate(
-    myPlot2(alpha, beta, cl),
-    alpha = slider(0.01, 10, initial = 1, step = .5),
-    beta = slider(0.01, 10, initial = 1, step = .5),
-    cl = slider(0.01, 0.99, initial = 0.95, step = .01)
-    )
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-    <slide class="backdrop"></slide>
-  </slides>
-  <div class="pagination pagination-small" id='io2012-ptoc' style="display:none;">
-    <ul>
-      <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=1 title='Bayesian analysis'>
-         1
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=2 title='Prior specification'>
-         2
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=3 title=''>
-         3
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=4 title='Posterior'>
-         4
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=5 title='Posterior mean'>
-         5
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=6 title='Thoughts'>
-         6
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=7 title='Example'>
-         7
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=8 title=''>
-         8
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=9 title='Credible intervals'>
-         9
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=10 title='Getting HPD intervals for this example'>
-         10
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=11 title=''>
-         11
-      </a>
-    </li>
-  </ul>
-  </div>  <!--[if IE]>
-    <script 
-      src="http://ajax.googleapis.com/ajax/libs/chrome-frame/1/CFInstall.min.js">  
-    </script>
-    <script>CFInstall.check({mode: 'overlay'});</script>
-  <![endif]-->
-</body>
-  <!-- Load Javascripts for Widgets -->
-  
-  <!-- MathJax: Fall back to local if CDN offline but local image fonts are not supported (saves >100MB) -->
-  <script type="text/x-mathjax-config">
-    MathJax.Hub.Config({
-      tex2jax: {
-        inlineMath: [['$','$'], ['\\(','\\)']],
-        processEscapes: true
-      }
-    });
-  </script>
-  <script type="text/javascript" src="http://cdn.mathjax.org/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
-  <!-- <script src="https://c328740.ssl.cf1.rackcdn.com/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
-  </script> -->
-  <script>window.MathJax || document.write('<script type="text/x-mathjax-config">MathJax.Hub.Config({"HTML-CSS":{imageFont:null}});<\/script><script src="../../librariesNew/widgets/mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"><\/script>')
-</script>
-<!-- LOAD HIGHLIGHTER JS FILES -->
-  <script src="../../librariesNew/highlighters/highlight.js/highlight.pack.js"></script>
-  <script>hljs.initHighlightingOnLoad();</script>
-  <!-- DONE LOADING HIGHLIGHTER JS FILES -->
-   
+<!DOCTYPE html>
+<html>
+<head>
+  <title>Bayesian inference</title>
+  <meta charset="utf-8">
+  <meta name="description" content="Bayesian inference">
+  <meta name="author" content="Brian Caffo, Roger Peng, Jeff Leek">
+  <meta name="generator" content="slidify" />
+  <meta name="apple-mobile-web-app-capable" content="yes">
+  <meta http-equiv="X-UA-Compatible" content="chrome=1">
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/default.css" media="all" >
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/phone.css" 
+    media="only screen and (max-device-width: 480px)" >
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/slidify.css" >
+  <link rel="stylesheet" href="../../librariesNew/highlighters/highlight.js/css/tomorrow.css" />
+  <base target="_blank"> <!-- This amazingness opens all links in a new tab. -->  
+  
+  <!-- Grab CDN jQuery, fall back to local if offline -->
+  <script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.7.min.js"></script>
+  <script>window.jQuery || document.write('<script src="../../librariesNew/widgets/quiz/js/jquery.js"><\/script>')</script> 
+  <script data-main="../../librariesNew/frameworks/io2012/js/slides" 
+    src="../../librariesNew/frameworks/io2012/js/require-1.0.8.min.js">
+  </script>
+  
+  
+
+</head>
+<body style="opacity: 0">
+  <slides class="layout-widescreen">
+    
+    <!-- LOGO SLIDE -->
+        <slide class="title-slide segue nobackground">
+  <aside class="gdbar">
+    <img src="../../assets/img/bloomberg_shield.png">
+  </aside>
+  <hgroup class="auto-fadein">
+    <h1>Bayesian inference</h1>
+    <h2>Statistical Inference</h2>
+    <p>Brian Caffo, Roger Peng, Jeff Leek<br/>Johns Hopkins Bloomberg School of Public Health</p>
+  </hgroup>
+  <article></article>  
+</slide>
+    
+
+    <!-- SLIDES -->
+    <slide class="" id="slide-1" style="background:;">
+  <hgroup>
+    <h2>Bayesian analysis</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Bayesian statistics posits a <em>prior</em> on the parameter
+of interest</li>
+<li>All inferences are then performed on the distribution of 
+the parameter given the data, called the posterior</li>
+<li>In general,
+\[
+\mbox{Posterior} \propto \mbox{Likelihood} \times \mbox{Prior}
+\]</li>
+<li>Therefore (as we saw in diagnostic testing) the likelihood is
+the factor by which our prior beliefs are updated to produce
+conclusions in the light of the data</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-2" style="background:;">
+  <hgroup>
+    <h2>Prior specification</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>The beta distribution is the default prior
+for parameters between \(0\) and \(1\).</li>
+<li>The beta density depends on two parameters \(\alpha\) and \(\beta\)
+\[
+\frac{\Gamma(\alpha +  \beta)}{\Gamma(\alpha)\Gamma(\beta)}
+p ^ {\alpha - 1} (1 - p) ^ {\beta - 1} ~~~~\mbox{for} ~~ 0 \leq p \leq 1
+\]</li>
+<li>The mean of the beta density is \(\alpha / (\alpha + \beta)\)</li>
+<li>The variance of the beta density is 
+\[\frac{\alpha \beta}{(\alpha + \beta)^2 (\alpha + \beta + 1)}\]</li>
+<li>The uniform density is the special case where \(\alpha = \beta = 1\)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-3" style="background:;">
+  <article data-timings="">
+    <pre><code>## Exploring the beta density
+library(manipulate)
+pvals &lt;- seq(0.01, 0.99, length = 1000)
+manipulate(
+    plot(pvals, dbeta(pvals, alpha, beta), type = &quot;l&quot;, lwd = 3, frame = FALSE),
+    alpha = slider(0.01, 10, initial = 1, step = .5),
+    beta = slider(0.01, 10, initial = 1, step = .5)
+    )
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-4" style="background:;">
+  <hgroup>
+    <h2>Posterior</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Suppose that we chose values of \(\alpha\) and \(\beta\) so that
+the beta prior is indicative of our degree of belief regarding \(p\)
+in the absence of data</li>
+<li>Then using the rule that
+\[
+\mbox{Posterior} \propto \mbox{Likelihood} \times \mbox{Prior}
+\]
+and throwing out anything that doesn&#39;t depend on \(p\), we have that
+\[
+\begin{align}
+\mbox{Posterior} &\propto  p^x(1 - p)^{n-x} \times p^{\alpha -1} (1 - p)^{\beta - 1} \\
+             &  =      p^{x + \alpha - 1} (1 - p)^{n - x + \beta - 1}
+\end{align}
+\]</li>
+<li>This density is just another beta density with parameters
+\(\tilde \alpha = x + \alpha\) and \(\tilde \beta = n - x + \beta\)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-5" style="background:;">
+  <hgroup>
+    <h2>Posterior mean</h2>
+  </hgroup>
+  <article data-timings="">
+    <p>\[
+\begin{align}
+E[p ~|~ X] & =   \frac{\tilde \alpha}{\tilde \alpha + \tilde \beta}\\ \\
+& =  \frac{x + \alpha}{x + \alpha + n - x + \beta}\\ \\
+& =  \frac{x + \alpha}{n + \alpha + \beta} \\ \\
+& =  \frac{x}{n} \times \frac{n}{n + \alpha + \beta} + \frac{\alpha}{\alpha + \beta} \times \frac{\alpha + \beta}{n + \alpha + \beta} \\ \\
+& =  \mbox{MLE} \times \pi + \mbox{Prior Mean} \times (1 - \pi)
+\end{align}
+\]</p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-6" style="background:;">
+  <hgroup>
+    <h2>Thoughts</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>The posterior mean is a mixture of the MLE (\(\hat p\)) and the
+prior mean</li>
+<li>\(\pi\) goes to \(1\) as \(n\) gets large; for large \(n\) the data swamps the prior</li>
+<li>For small \(n\), the prior mean dominates </li>
+<li>Generalizes how science should ideally work; as data becomes
+increasingly available, prior beliefs should matter less and less</li>
+<li>With a prior that is degenerate at a value, no amount of data
+can overcome the prior</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-7" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Suppose that in a random sample of an at-risk population
+\(13\) of \(20\) subjects had hypertension. Estimate the prevalence
+of hypertension in this population.</li>
+<li>\(x = 13\) and \(n=20\)</li>
+<li>Consider a uniform prior, \(\alpha = \beta = 1\)</li>
+<li>The posterior is proportional to (see formula above)
+\[
+p^{x + \alpha - 1} (1 - p)^{n - x + \beta - 1} = p^x (1 - p)^{n-x}
+\]
+That is, for the uniform prior, the posterior is proportional to the likelihood</li>
+<li>For \(\alpha = \beta = 2\) (recall this prior
+is humped around the point \(.5\)), the posterior is proportional to
+\[
+p^{x + \alpha - 1} (1 - p)^{n - x + \beta - 1} = p^{x + 1} (1 - p)^{n-x + 1}
+\]</li>
+<li>The Jeffreys prior, which has some theoretical benefits,
+puts \(\alpha = \beta = .5\)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-8" style="background:;">
+  <article data-timings="">
+    <pre><code>pvals &lt;- seq(0.01, 0.99, length = 1000)
+x &lt;- 13; n &lt;- 20
+myPlot &lt;- function(alpha, beta){
+    plot(0 : 1, 0 : 1, type = &quot;n&quot;, xlab = &quot;p&quot;, ylab = &quot;&quot;, frame = FALSE)
+    lines(pvals, dbeta(pvals, alpha, beta) / max(dbeta(pvals, alpha, beta)), 
+            lwd = 3, col = &quot;darkred&quot;)
+    lines(pvals, dbinom(x,n,pvals) / dbinom(x,n,x/n), lwd = 3, col = &quot;darkblue&quot;)
+    lines(pvals, dbeta(pvals, alpha+x, beta+(n-x)) / max(dbeta(pvals, alpha+x, beta+(n-x))),
+        lwd = 3, col = &quot;darkgreen&quot;)
+    title(&quot;red=prior,green=posterior,blue=likelihood&quot;)
+}
+manipulate(
+    myPlot(alpha, beta),
+    alpha = slider(0.01, 100, initial = 1, step = .5),
+    beta = slider(0.01, 100, initial = 1, step = .5)
+    )
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-9" style="background:;">
+  <hgroup>
+    <h2>Credible intervals</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>A Bayesian credible interval is the Bayesian analog of a confidence
+interval</li>
+<li>A \(95\%\) credible interval \([a, b]\) satisfies
+\[
+P(p \in [a, b] ~|~ x) = .95
+\]</li>
+<li>The best credible intervals chop off the posterior with a horizontal
+line in the same way we did for likelihoods </li>
+<li>These are called highest posterior density (HPD) intervals</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-10" style="background:;">
+  <hgroup>
+    <h2>Getting HPD intervals for this example</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Install the <code>binom</code> package, then the command</li>
+</ul>
+
+<pre><code class="r">library(binom)
+</code></pre>
+
+<pre><code>## Error: there is no package called &#39;binom&#39;
+</code></pre>
+
+<pre><code class="r">binom.bayes(13, 20, type = &quot;highest&quot;)
+</code></pre>
+
+<pre><code>## Error: could not find function &quot;binom.bayes&quot;
+</code></pre>
+
+<p>gives the HPD interval. </p>
+
+<ul>
+<li>The default credible level is \(95\%\) and
+the default prior is the Jeffreys prior.</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-11" style="background:;">
+  <article data-timings="">
+    <pre><code>pvals &lt;- seq(0.01, 0.99, length = 1000)
+x &lt;- 13; n &lt;- 20
+myPlot2 &lt;- function(alpha, beta, cl){
+    plot(pvals, dbeta(pvals, alpha+x, beta+(n-x)), type = &quot;l&quot;, lwd = 3,
+    xlab = &quot;p&quot;, ylab = &quot;&quot;, frame = FALSE)
+    out &lt;- binom.bayes(x, n, type = &quot;highest&quot;, 
+        prior.shape1 = alpha, 
+        prior.shape2 = beta, 
+        conf.level = cl)
+    p1 &lt;- out$lower; p2 &lt;- out$upper
+    lines(c(p1, p1, p2, p2), c(0, dbeta(c(p1, p2), alpha+x, beta+(n-x)), 0), 
+        type = &quot;l&quot;, lwd = 3, col = &quot;darkred&quot;)
+}
+manipulate(
+    myPlot2(alpha, beta, cl),
+    alpha = slider(0.01, 10, initial = 1, step = .5),
+    beta = slider(0.01, 10, initial = 1, step = .5),
+    cl = slider(0.01, 0.99, initial = 0.95, step = .01)
+    )
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+    <slide class="backdrop"></slide>
+  </slides>
+  <div class="pagination pagination-small" id='io2012-ptoc' style="display:none;">
+    <ul>
+      <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=1 title='Bayesian analysis'>
+         1
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=2 title='Prior specification'>
+         2
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=3 title=''>
+         3
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=4 title='Posterior'>
+         4
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=5 title='Posterior mean'>
+         5
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=6 title='Thoughts'>
+         6
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=7 title='Example'>
+         7
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=8 title=''>
+         8
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=9 title='Credible intervals'>
+         9
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=10 title='Getting HPD intervals for this example'>
+         10
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=11 title=''>
+         11
+      </a>
+    </li>
+  </ul>
+  </div>  <!--[if IE]>
+    <script 
+      src="http://ajax.googleapis.com/ajax/libs/chrome-frame/1/CFInstall.min.js">  
+    </script>
+    <script>CFInstall.check({mode: 'overlay'});</script>
+  <![endif]-->
+</body>
+  <!-- Load Javascripts for Widgets -->
+  
+  <!-- MathJax: Fall back to local if CDN offline but local image fonts are not supported (saves >100MB) -->
+  <script type="text/x-mathjax-config">
+    MathJax.Hub.Config({
+      tex2jax: {
+        inlineMath: [['$','$'], ['\\(','\\)']],
+        processEscapes: true
+      }
+    });
+  </script>
+  <script type="text/javascript" src="http://cdn.mathjax.org/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
+  <!-- <script src="https://c328740.ssl.cf1.rackcdn.com/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
+  </script> -->
+  <script>window.MathJax || document.write('<script type="text/x-mathjax-config">MathJax.Hub.Config({"HTML-CSS":{imageFont:null}});<\/script><script src="../../librariesNew/widgets/mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"><\/script>')
+</script>
+<!-- LOAD HIGHLIGHTER JS FILES -->
+  <script src="../../librariesNew/highlighters/highlight.js/highlight.pack.js"></script>
+  <script>hljs.initHighlightingOnLoad();</script>
+  <!-- DONE LOADING HIGHLIGHTER JS FILES -->
+   
   </html>
\ No newline at end of file
diff --git a/06_StatisticalInference/02_05_Bayes/index.md b/06_StatisticalInference/02_05_Bayes/index.md
index ec53a034d..bb82da2b2 100644
--- a/06_StatisticalInference/02_05_Bayes/index.md
+++ b/06_StatisticalInference/02_05_Bayes/index.md
@@ -1,197 +1,200 @@
----
-title       : Bayesian inference
-subtitle    : Statistical Inference
-author      : Brian Caffo, Roger Peng, Jeff Leek
-job         : Johns Hopkins Bloomberg School of Public Health
-logo        : bloomberg_shield.png
-framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
-highlighter : highlight.js  # {highlight.js, prettify, highlight}
-hitheme     : tomorrow      # 
-url:
-  lib: ../../librariesNew
-  assets: ../../assets
-widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
-mode        : selfcontained # {standalone, draft}
----
-
-
-
-## Bayesian analysis
-- Bayesian statistics posits a *prior* on the parameter
-  of interest
-- All inferences are then performed on the distribution of 
-  the parameter given the data, called the posterior
-- In general,
-  $$
-  \mbox{Posterior} \propto \mbox{Likelihood} \times \mbox{Prior}
-  $$
-- Therefore (as we saw in diagnostic testing) the likelihood is
-  the factor by which our prior beliefs are updated to produce
-  conclusions in the light of the data
-
----
-## Prior specification
-- The beta distribution is the default prior
-  for parameters between $0$ and $1$.
-- The beta density depends on two parameters $\alpha$ and $\beta$
-$$
-\frac{\Gamma(\alpha +  \beta)}{\Gamma(\alpha)\Gamma(\beta)}
- p ^ {\alpha - 1} (1 - p) ^ {\beta - 1} ~~~~\mbox{for} ~~ 0 \leq p \leq 1
-$$
-- The mean of the beta density is $\alpha / (\alpha + \beta)$
-- The variance of the beta density is 
-$$\frac{\alpha \beta}{(\alpha + \beta)^2 (\alpha + \beta + 1)}$$
-- The uniform density is the special case where $\alpha = \beta = 1$
-
----
-
-```
-## Exploring the beta density
-library(manipulate)
-pvals <- seq(0.01, 0.99, length = 1000)
-manipulate(
-    plot(pvals, dbeta(pvals, alpha, beta), type = "l", lwd = 3, frame = FALSE),
-    alpha = slider(0.01, 10, initial = 1, step = .5),
-    beta = slider(0.01, 10, initial = 1, step = .5)
-    )
-```
-
----
-## Posterior 
-- Suppose that we chose values of $\alpha$ and $\beta$ so that
-  the beta prior is indicative of our degree of belief regarding $p$
-  in the absence of data
-- Then using the rule that
-  $$
-  \mbox{Posterior} \propto \mbox{Likelihood} \times \mbox{Prior}
-  $$
-  and throwing out anything that doesn't depend on $p$, we have that
-$$
-\begin{align}
-\mbox{Posterior} &\propto  p^x(1 - p)^{n-x} \times p^{\alpha -1} (1 - p)^{\beta - 1} \\
-                 &  =      p^{x + \alpha - 1} (1 - p)^{n - x + \beta - 1}
-\end{align}
-$$
-- This density is just another beta density with parameters
-  $\tilde \alpha = x + \alpha$ and $\tilde \beta = n - x + \beta$
-
-
----
-## Posterior mean
-
-$$
-\begin{align}
-E[p ~|~ X] & =   \frac{\tilde \alpha}{\tilde \alpha + \tilde \beta}\\ \\
-& =  \frac{x + \alpha}{x + \alpha + n - x + \beta}\\ \\
-& =  \frac{x + \alpha}{n + \alpha + \beta} \\ \\
-& =  \frac{x}{n} \times \frac{n}{n + \alpha + \beta} + \frac{\alpha}{\alpha + \beta} \times \frac{\alpha + \beta}{n + \alpha + \beta} \\ \\
-& =  \mbox{MLE} \times \pi + \mbox{Prior Mean} \times (1 - \pi)
-\end{align}
-$$
-
----
-## Thoughts
-
-- The posterior mean is a mixture of the MLE ($\hat p$) and the
-  prior mean
-- $\pi$ goes to $1$ as $n$ gets large; for large $n$ the data swamps the prior
-- For small $n$, the prior mean dominates 
-- Generalizes how science should ideally work; as data becomes
-  increasingly available, prior beliefs should matter less and less
-- With a prior that is degenerate at a value, no amount of data
-  can overcome the prior
-
----
-## Example
-
-- Suppose that in a random sample of an at-risk population
-$13$ of $20$ subjects had hypertension. Estimate the prevalence
-of hypertension in this population.
-- $x = 13$ and $n=20$
-- Consider a uniform prior, $\alpha = \beta = 1$
-- The posterior is proportional to (see formula above)
-$$
-p^{x + \alpha - 1} (1 - p)^{n - x + \beta - 1} = p^x (1 - p)^{n-x}
-$$
-That is, for the uniform prior, the posterior is the likelihood
-- Consider the instance where $\alpha = \beta = 2$ (recall this prior
-is humped around the point $.5$) the posterior is
-$$
-p^{x + \alpha - 1} (1 - p)^{n - x + \beta - 1} = p^{x + 1} (1 - p)^{n-x + 1}
-$$
-- The "Jeffrey's prior" which has some theoretical benefits
-  puts $\alpha = \beta = .5$
-
----
-```
-pvals <- seq(0.01, 0.99, length = 1000)
-x <- 13; n <- 20
-myPlot <- function(alpha, beta){
-    plot(0 : 1, 0 : 1, type = "n", xlab = "p", ylab = "", frame = FALSE)
-    lines(pvals, dbeta(pvals, alpha, beta) / max(dbeta(pvals, alpha, beta)), 
-            lwd = 3, col = "darkred")
-    lines(pvals, dbinom(x,n,pvals) / dbinom(x,n,x/n), lwd = 3, col = "darkblue")
-    lines(pvals, dbeta(pvals, alpha+x, beta+(n-x)) / max(dbeta(pvals, alpha+x, beta+(n-x))),
-        lwd = 3, col = "darkgreen")
-    title("red=prior,green=posterior,blue=likelihood")
-}
-manipulate(
-    myPlot(alpha, beta),
-    alpha = slider(0.01, 10, initial = 1, step = .5),
-    beta = slider(0.01, 10, initial = 1, step = .5)
-    )
-```
-
----
-## Credible intervals
-- A Bayesian credible interval is the  Bayesian analog of a confidence
-  interval
-- A $95\%$ credible interval, $[a, b]$ would satisfy
-  $$
-  P(p \in [a, b] ~|~ x) = .95
-  $$
-- The best credible intervals chop off the posterior with a horizontal
-  line in the same way we did for likelihoods 
-- These are called highest posterior density (HPD) intervals
-
----
-## Getting HPD intervals for this example
-- Install the \texttt{binom} package, then the command
-
-```r
-library(binom)
-binom.bayes(13, 20, type = "highest")
-```
-
-```
-  method  x  n shape1 shape2   mean  lower  upper  sig
-1  bayes 13 20   13.5    7.5 0.6429 0.4423 0.8361 0.05
-```
-
-gives the HPD interval. 
-- The default credible level is $95\%$ and
-the default prior is the Jeffrey's prior.
-
----
-```
-pvals <- seq(0.01, 0.99, length = 1000)
-x <- 13; n <- 20
-myPlot2 <- function(alpha, beta, cl){
-    plot(pvals, dbeta(pvals, alpha+x, beta+(n-x)), type = "l", lwd = 3,
-    xlab = "p", ylab = "", frame = FALSE)
-    out <- binom.bayes(x, n, type = "highest", 
-        prior.shape1 = alpha, 
-        prior.shape2 = beta, 
-        conf.level = cl)
-    p1 <- out$lower; p2 <- out$upper
-    lines(c(p1, p1, p2, p2), c(0, dbeta(c(p1, p2), alpha+x, beta+(n-x)), 0), 
-        type = "l", lwd = 3, col = "darkred")
-}
-manipulate(
-    myPlot2(alpha, beta, cl),
-    alpha = slider(0.01, 10, initial = 1, step = .5),
-    beta = slider(0.01, 10, initial = 1, step = .5),
-    cl = slider(0.01, 0.99, initial = 0.95, step = .01)
-    )
-```
-
+---
+title       : Bayesian inference
+subtitle    : Statistical Inference
+author      : Brian Caffo, Roger Peng, Jeff Leek
+job         : Johns Hopkins Bloomberg School of Public Health
+logo        : bloomberg_shield.png
+framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
+highlighter : highlight.js  # {highlight.js, prettify, highlight}
+hitheme     : tomorrow      # 
+url:
+  lib: ../../librariesNew
+  assets: ../../assets
+widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
+mode        : selfcontained # {standalone, draft}
+---
+## Bayesian analysis
+- Bayesian statistics posits a *prior* on the parameter
+  of interest
+- All inferences are then performed on the distribution of 
+  the parameter given the data, called the posterior
+- In general,
+  $$
+  \mbox{Posterior} \propto \mbox{Likelihood} \times \mbox{Prior}
+  $$
+- Therefore (as we saw in diagnostic testing) the likelihood is
+  the factor by which our prior beliefs are updated to produce
+  conclusions in the light of the data
+
+---
+## Prior specification
+- The beta distribution is the default prior
+  for parameters between $0$ and $1$.
+- The beta density depends on two parameters $\alpha$ and $\beta$
+$$
+\frac{\Gamma(\alpha +  \beta)}{\Gamma(\alpha)\Gamma(\beta)}
+ p ^ {\alpha - 1} (1 - p) ^ {\beta - 1} ~~~~\mbox{for} ~~ 0 \leq p \leq 1
+$$
+- The mean of the beta density is $\alpha / (\alpha + \beta)$
+- The variance of the beta density is 
+$$\frac{\alpha \beta}{(\alpha + \beta)^2 (\alpha + \beta + 1)}$$
+- The uniform density is the special case where $\alpha = \beta = 1$
+
+---
+
+```
+## Exploring the beta density
+library(manipulate)
+pvals <- seq(0.01, 0.99, length = 1000)
+manipulate(
+    plot(pvals, dbeta(pvals, alpha, beta), type = "l", lwd = 3, frame = FALSE),
+    alpha = slider(0.01, 10, initial = 1, step = .5),
+    beta = slider(0.01, 10, initial = 1, step = .5)
+    )
+```
+
+---
+## Posterior 
+- Suppose that we chose values of $\alpha$ and $\beta$ so that
+  the beta prior is indicative of our degree of belief regarding $p$
+  in the absence of data
+- Then using the rule that
+  $$
+  \mbox{Posterior} \propto \mbox{Likelihood} \times \mbox{Prior}
+  $$
+  and throwing out anything that doesn't depend on $p$, we have that
+$$
+\begin{align}
+\mbox{Posterior} &\propto  p^x(1 - p)^{n-x} \times p^{\alpha -1} (1 - p)^{\beta - 1} \\
+                 &  =      p^{x + \alpha - 1} (1 - p)^{n - x + \beta - 1}
+\end{align}
+$$
+- This density is just another beta density with parameters
+  $\tilde \alpha = x + \alpha$ and $\tilde \beta = n - x + \beta$
+
+
+---
+## Posterior mean
+
+$$
+\begin{align}
+E[p ~|~ X] & =   \frac{\tilde \alpha}{\tilde \alpha + \tilde \beta}\\ \\
+& =  \frac{x + \alpha}{x + \alpha + n - x + \beta}\\ \\
+& =  \frac{x + \alpha}{n + \alpha + \beta} \\ \\
+& =  \frac{x}{n} \times \frac{n}{n + \alpha + \beta} + \frac{\alpha}{\alpha + \beta} \times \frac{\alpha + \beta}{n + \alpha + \beta} \\ \\
+& =  \mbox{MLE} \times \pi + \mbox{Prior Mean} \times (1 - \pi)
+\end{align}
+$$
+
+---
+## Thoughts
+
+- The posterior mean is a mixture of the MLE ($\hat p$) and the
+  prior mean
+- $\pi$ goes to $1$ as $n$ gets large; for large $n$ the data swamps the prior
+- For small $n$, the prior mean dominates 
+- Generalizes how science should ideally work; as data becomes
+  increasingly available, prior beliefs should matter less and less
+- With a prior that is degenerate at a value, no amount of data
+  can overcome the prior
+
+---
+## Example
+
+- Suppose that in a random sample of an at-risk population
+$13$ of $20$ subjects had hypertension. Estimate the prevalence
+of hypertension in this population.
+- $x = 13$ and $n=20$
+- Consider a uniform prior, $\alpha = \beta = 1$
+- The posterior is proportional to (see formula above)
+$$
+p^{x + \alpha - 1} (1 - p)^{n - x + \beta - 1} = p^x (1 - p)^{n-x}
+$$
+That is, for the uniform prior, the posterior is proportional to the likelihood
+- For $\alpha = \beta = 2$ (recall this prior
+is humped around the point $.5$), the posterior is proportional to
+$$
+p^{x + \alpha - 1} (1 - p)^{n - x + \beta - 1} = p^{x + 1} (1 - p)^{n-x + 1}
+$$
+- The Jeffreys prior, which has some theoretical benefits,
+  puts $\alpha = \beta = .5$
+
+---
+```
+pvals <- seq(0.01, 0.99, length = 1000)
+x <- 13; n <- 20
+myPlot <- function(alpha, beta){
+    plot(0 : 1, 0 : 1, type = "n", xlab = "p", ylab = "", frame = FALSE)
+    lines(pvals, dbeta(pvals, alpha, beta) / max(dbeta(pvals, alpha, beta)), 
+            lwd = 3, col = "darkred")
+    lines(pvals, dbinom(x,n,pvals) / dbinom(x,n,x/n), lwd = 3, col = "darkblue")
+    lines(pvals, dbeta(pvals, alpha+x, beta+(n-x)) / max(dbeta(pvals, alpha+x, beta+(n-x))),
+        lwd = 3, col = "darkgreen")
+    title("red=prior,green=posterior,blue=likelihood")
+}
+manipulate(
+    myPlot(alpha, beta),
+    alpha = slider(0.01, 100, initial = 1, step = .5),
+    beta = slider(0.01, 100, initial = 1, step = .5)
+    )
+```
+
+---
+## Credible intervals
+- A Bayesian credible interval is the Bayesian analog of a confidence
+  interval
+- A $95\%$ credible interval $[a, b]$ satisfies
+  $$
+  P(p \in [a, b] ~|~ x) = .95
+  $$
+- The best credible intervals chop off the posterior with a horizontal
+  line in the same way we did for likelihoods 
+- These are called highest posterior density (HPD) intervals
+
+---
+## Getting HPD intervals for this example
+- Install the `binom` package, then the command
+
+```r
+library(binom)
+```
+
+```
+## Error: there is no package called 'binom'
+```
+
+```r
+binom.bayes(13, 20, type = "highest")
+```
+
+```
+## Error: could not find function "binom.bayes"
+```
+
+gives the HPD interval. 
+- The default credible level is $95\%$ and
+the default prior is the Jeffreys prior.
+
+---
+```
+pvals <- seq(0.01, 0.99, length = 1000)
+x <- 13; n <- 20
+myPlot2 <- function(alpha, beta, cl){
+    plot(pvals, dbeta(pvals, alpha+x, beta+(n-x)), type = "l", lwd = 3,
+    xlab = "p", ylab = "", frame = FALSE)
+    out <- binom.bayes(x, n, type = "highest", 
+        prior.shape1 = alpha, 
+        prior.shape2 = beta, 
+        conf.level = cl)
+    p1 <- out$lower; p2 <- out$upper
+    lines(c(p1, p1, p2, p2), c(0, dbeta(c(p1, p2), alpha+x, beta+(n-x)), 0), 
+        type = "l", lwd = 3, col = "darkred")
+}
+manipulate(
+    myPlot2(alpha, beta, cl),
+    alpha = slider(0.01, 10, initial = 1, step = .5),
+    beta = slider(0.01, 10, initial = 1, step = .5),
+    cl = slider(0.01, 0.99, initial = 0.95, step = .01)
+    )
+```
+
diff --git a/06_StatisticalInference/02_05_Bayes/index.pdf b/06_StatisticalInference/02_05_Bayes/index.pdf
index c8043e4a1..ae65bc28f 100644
Binary files a/06_StatisticalInference/02_05_Bayes/index.pdf and b/06_StatisticalInference/02_05_Bayes/index.pdf differ
diff --git a/06_StatisticalInference/03_01_TwoGroupIntervals/assets/fig/unnamed-chunk-3.png b/06_StatisticalInference/03_01_TwoGroupIntervals/assets/fig/unnamed-chunk-3.png
new file mode 100644
index 000000000..e5ca858ce
Binary files /dev/null and b/06_StatisticalInference/03_01_TwoGroupIntervals/assets/fig/unnamed-chunk-3.png differ
diff --git a/06_StatisticalInference/03_01_TwoGroupIntervals/index.Rmd b/06_StatisticalInference/03_01_TwoGroupIntervals/index.Rmd
index d139bf681..226588ff5 100644
--- a/06_StatisticalInference/03_01_TwoGroupIntervals/index.Rmd
+++ b/06_StatisticalInference/03_01_TwoGroupIntervals/index.Rmd
@@ -1,186 +1,170 @@
----
-title       : Two group intervals
-subtitle    : Statistical Inference
-author      : Brian Caffo, Jeff Leek, Roger Peng
-job         : Johns Hopkins Bloomberg School of Public Health
-logo        : bloomberg_shield.png
-framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
-highlighter : highlight.js  # {highlight.js, prettify, highlight}
-hitheme     : tomorrow      # 
-url:
-  lib: ../../librariesNew
-  assets: ../../assets
-widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
-mode        : selfcontained # {standalone, draft}
----
-```{r setup, cache = F, echo = F, message = F, warning = F, tidy = F, results='hide'}
-# make this an external chunk that can be included in any file
-options(width = 100)
-opts_chunk$set(message = F, error = F, warning = F, comment = NA, fig.align = 'center', dpi = 100, tidy = F, cache.path = '.cache/', fig.path = 'fig/')
-
-options(xtable.type = 'html')
-knit_hooks$set(inline = function(x) {
-  if(is.numeric(x)) {
-    round(x, getOption('digits'))
-  } else {
-    paste(as.character(x), collapse = ', ')
-  }
-})
-knit_hooks$set(plot = knitr:::hook_plot_html)
-runif(1)
-```
-
-## Independent group $t$ confidence intervals
-
-- Suppose that we want to compare the mean blood pressure between two groups in a randomized trial; those who received the treatment to those who received a placebo
-- We cannot use the paired t test because the groups are independent and may have different sample sizes
-- We now present methods for comparing independent groups
-
----
-
-## Notation
-
-- Let $X_1,\ldots,X_{n_x}$ be iid $N(\mu_x,\sigma^2)$
-- Let $Y_1,\ldots,Y_{n_y}$ be iid $N(\mu_y, \sigma^2)$
-- Let $\bar X$, $\bar Y$, $S_x$, $S_y$ be the means and standard deviations
-- Using the fact that linear combinations of normals are again normal, we know that $\bar Y - \bar X$ is also normal with mean $\mu_y - \mu_x$ and variance $\sigma^2 (\frac{1}{n_x} + \frac{1}{n_y})$
-- The pooled variance estimator $$S_p^2 = \{(n_x - 1) S_x^2 + (n_y - 1) S_y^2\}/(n_x + n_y - 2)$$ is a good estimator of $\sigma^2$
-
----
-
-## Note
-
-- The pooled estimator is a mixture of the group variances, placing greater weight on whichever has a larger sample size
-- If the sample sizes are the same the pooled variance estimate is the average of the group variances
-- The pooled estimator is unbiased
-$$
-    \begin{eqnarray*}
-    E[S_p^2] & = & \frac{(n_x - 1) E[S_x^2] + (n_y - 1) E[S_y^2]}{n_x + n_y - 2}\\
-            & = & \frac{(n_x - 1)\sigma^2 + (n_y - 1)\sigma^2}{n_x + n_y - 2}
-    \end{eqnarray*}
-$$
-- The pooled variance  estimate is independent of $\bar Y - \bar X$ since $S_x$ is independent of $\bar X$ and $S_y$ is independent of $\bar Y$ and the groups are independent
-
----
-
-## Result
-
-- The sum of two independent Chi-squared random variables is Chi-squared with degrees of freedom equal to the sum of the degrees of freedom of the summands
-- Therefore
-$$
-    \begin{eqnarray*}
-      (n_x + n_y - 2) S_p^2 / \sigma^2 & = & (n_x - 1)S_x^2 /\sigma^2 + (n_y - 1)S_y^2/\sigma^2 \\ \\
-      & = & \chi^2_{n_x - 1} + \chi^2_{n_y-1} \\ \\
-      & = & \chi^2_{n_x + n_y - 2}
-    \end{eqnarray*}
-$$
-
----
-
-## Putting this all together
-
-- The statistic
-$$
-    \frac{\frac{\bar Y - \bar X - (\mu_y - \mu_x)}{\sigma \left(\frac{1}{n_x} + \frac{1}{n_y}\right)^{1/2}}}%
-    {\sqrt{\frac{(n_x + n_y - 2) S_p^2}{(n_x + n_y - 2)\sigma^2}}}
-    = \frac{\bar Y - \bar X - (\mu_y - \mu_x)}{S_p \left(\frac{1}{n_x} + \frac{1}{n_y}\right)^{1/2}}
-$$
-is a standard normal divided by the square root of an independent Chi-squared divided by its degrees of freedom 
-- Therefore this statistic follows Gosset's $t$ distribution with $n_x + n_y - 2$ degrees of freedom
-- Notice the form is (estimator - true value) / SE
-
----
-
-## Confidence interval
-
-- Therefore a $(1 - \alpha)\times 100\%$ confidence interval for $\mu_y - \mu_x$ is 
-$$
-    \bar Y - \bar X \pm t_{n_x + n_y - 2, 1 - \alpha/2}S_p\left(\frac{1}{n_x} + \frac{1}{n_y}\right)^{1/2}
-$$
-- Remember this interval is assuming a constant variance across the two groups
-- If there is some doubt, assume a different variance per group, which we will discuss later
-
----
-
-
-## Example
-### Based on Rosner, Fundamentals of Biostatistics
-
-- Comparing SBP for 8 oral contraceptive users versus 21 controls
-- $\bar X_{OC} = 132.86$ mmHg with $s_{OC} = 15.34$ mmHg
-- $\bar X_{C} = 127.44$ mmHg with $s_{C} = 18.23$ mmHg
-- Pooled variance estimate
-```{r}
-sp <- sqrt((7 * 15.34^2 + 20 * 18.23^2) / (8 + 21 - 2))
-132.86 - 127.44 + c(-1, 1) * qt(.975, 27) * sp * (1 / 8 + 1 / 21)^.5
-```
-
----
-```{r}
-data(sleep)
-x1 <- sleep$extra[sleep$group == 1]
-x2 <- sleep$extra[sleep$group == 2]
-n1 <- length(x1)
-n2 <- length(x2)
-sp <- sqrt( ((n1 - 1) * sd(x1)^2 + (n2-1) * sd(x2)^2) / (n1 + n2-2))
-md <- mean(x1) - mean(x2)
-semd <- sp * sqrt(1 / n1 + 1/n2)
-md + c(-1, 1) * qt(.975, n1 + n2 - 2) * semd
-t.test(x1, x2, paired = FALSE, var.equal = TRUE)$conf
-t.test(x1, x2, paired = TRUE)$conf
-```
-
----
-## Ignoring pairing
-```{r, echo = FALSE, fig.width=5, fig.height=5}
-plot(c(0.5, 2.5), range(x1, x2), type = "n", frame = FALSE, xlab = "group", ylab = "Extra", axes = FALSE)
-axis(2)
-axis(1, at = 1 : 2, labels = c("Group 1", "Group 2"))
-for (i in 1 : n1) lines(c(1, 2), c(x1[i], x2[i]), lwd = 2, col = "red")
-for (i in 1 : n1) points(c(1, 2), c(x1[i], x2[i]), lwd = 2, col = "black", bg = "salmon", pch = 21, cex = 3)
-```
-
----
-
-## Unequal variances
-
-- Under unequal variances
-$$
-    \bar Y - \bar X \sim N\left(\mu_y - \mu_x, \frac{s_x^2}{n_x} + \frac{\sigma_y^2}{n_y}\right)
-$$
-- The statistic 
-$$
-    \frac{\bar Y - \bar X - (\mu_y - \mu_x)}{\left(\frac{s_x^2}{n_x} + \frac{\sigma_y^2}{n_y}\right)^{1/2}}
-$$
-approximately follows Gosset's $t$ distribution with degrees of freedom equal to
-$$
-    \frac{\left(S_x^2 / n_x + S_y^2/n_y\right)^2}
-    {\left(\frac{S_x^2}{n_x}\right)^2 / (n_x - 1) +
-      \left(\frac{S_y^2}{n_y}\right)^2 / (n_y - 1)}
-$$
-
----
-
-## Example
-
-- Comparing SBP for 8 oral contraceptive users versus 21 controls
-- $\bar X_{OC} = 132.86$ mmHg with $s_{OC} = 15.34$ mmHg
-- $\bar X_{C} = 127.44$ mmHg with $s_{C} = 18.23$ mmHg
-- $df=15.04$, $t_{15.04, .975} = 2.13$
-- Interval
-$$
-132.86 - 127.44 \pm 2.13 \left(\frac{15.34^2}{8} + \frac{18.23^2}{21} \right)^{1/2}
-= [-8.91, 19.75]
-$$
-- In R, `t.test(..., var.equal = FALSE)`
-
----
-## Comparing other kinds of data
-* For binomial data, there's lots of ways to compare two groups
-  * Relative risk, risk difference, odds ratio.
-  * Chi-squared tests, normal approximations, exact tests.
-* For count data, there's also Chi-squared tests and exact tests.
-* We'll leave the discussions for comparing groups of data for binary
-  and count data until covering glms in the regression class.
-* In addition, Mathematical Biostatistics Boot Camp 2 covers many special
-  cases relevant to biostatistics.
+---
+title       : Two group intervals
+subtitle    : Statistical Inference
+author      : Brian Caffo, Jeff Leek, Roger Peng
+job         : Johns Hopkins Bloomberg School of Public Health
+logo        : bloomberg_shield.png
+framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
+highlighter : highlight.js  # {highlight.js, prettify, highlight}
+hitheme     : tomorrow      # 
+url:
+  lib: ../../librariesNew
+  assets: ../../assets
+widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
+mode        : selfcontained # {standalone, draft}
+---
+
+## Independent group $t$ confidence intervals
+
+- Suppose that we want to compare the mean blood pressure between two groups in a randomized trial: those who received the treatment and those who received a placebo
+- We cannot use the paired $t$ test because the groups are independent and may have different sample sizes
+- We now present methods for comparing independent groups
+
+---
+
+## Notation
+
+- Let $X_1,\ldots,X_{n_x}$ be iid $N(\mu_x,\sigma^2)$
+- Let $Y_1,\ldots,Y_{n_y}$ be iid $N(\mu_y, \sigma^2)$
+- Let $\bar X$, $\bar Y$, $S_x$, $S_y$ be the means and standard deviations
+- Using the fact that linear combinations of normals are again normal, we know that $\bar Y - \bar X$ is also normal with mean $\mu_y - \mu_x$ and variance $\sigma^2 (\frac{1}{n_x} + \frac{1}{n_y})$
+- The pooled variance estimator $$S_p^2 = \{(n_x - 1) S_x^2 + (n_y - 1) S_y^2\}/(n_x + n_y - 2)$$ is a good estimator of $\sigma^2$
+
+---
+
+## Note
+
+- The pooled estimator is a mixture of the group variances, placing greater weight on whichever has a larger sample size
+- If the sample sizes are the same the pooled variance estimate is the average of the group variances
+- The pooled estimator is unbiased
+$$
+    \begin{eqnarray*}
+    E[S_p^2] & = & \frac{(n_x - 1) E[S_x^2] + (n_y - 1) E[S_y^2]}{n_x + n_y - 2}\\
+            & = & \frac{(n_x - 1)\sigma^2 + (n_y - 1)\sigma^2}{n_x + n_y - 2}\\
+            & = & \sigma^2
+    \end{eqnarray*}
+$$
+- The pooled variance estimate is independent of $\bar Y - \bar X$, since $S_x$ is independent of $\bar X$, $S_y$ is independent of $\bar Y$, and the two groups are independent of each other
+
+---
+
+## Result
+
+- The sum of two independent Chi-squared random variables is Chi-squared with degrees of freedom equal to the sum of the degrees of freedom of the summands
+- Therefore
+$$
+    \begin{eqnarray*}
+      (n_x + n_y - 2) S_p^2 / \sigma^2 & = & (n_x - 1)S_x^2 /\sigma^2 + (n_y - 1)S_y^2/\sigma^2 \\ \\
+      & = & \chi^2_{n_x - 1} + \chi^2_{n_y-1} \\ \\
+      & = & \chi^2_{n_x + n_y - 2}
+    \end{eqnarray*}
+$$
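+
+A quick simulation can make this result concrete. The chunk below is a minimal sketch and not part of the original derivation: the sample sizes, group means, `sigma`, the seed, and the number of replicates are all illustrative assumptions. If the result holds, the simulated statistic should have a mean close to $n_x + n_y - 2$ and a variance close to $2(n_x + n_y - 2)$.
+
+```{r}
+set.seed(1)                      # arbitrary seed, for reproducibility only
+nx <- 8; ny <- 21; sigma <- 2    # assumed sample sizes and common SD
+stat <- replicate(10000, {
+  x <- rnorm(nx, mean = 0, sd = sigma)
+  y <- rnorm(ny, mean = 1, sd = sigma)
+  # pooled variance estimate
+  sp2 <- ((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2)
+  # statistic claimed to be Chi-squared with nx + ny - 2 df
+  (nx + ny - 2) * sp2 / sigma^2
+})
+# compare with the theoretical mean (27) and variance (54)
+c(mean = mean(stat), variance = var(stat))
+```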
+
+---
+
+## Putting this all together
+
+- The statistic
+$$
+    \frac{\frac{\bar Y - \bar X - (\mu_y - \mu_x)}{\sigma \left(\frac{1}{n_x} + \frac{1}{n_y}\right)^{1/2}}}%
+    {\sqrt{\frac{(n_x + n_y - 2) S_p^2}{(n_x + n_y - 2)\sigma^2}}}
+    = \frac{\bar Y - \bar X - (\mu_y - \mu_x)}{S_p \left(\frac{1}{n_x} + \frac{1}{n_y}\right)^{1/2}}
+$$
+is a standard normal divided by the square root of an independent Chi-squared divided by its degrees of freedom 
+- Therefore this statistic follows Gosset's $t$ distribution with $n_x + n_y - 2$ degrees of freedom
+- Notice the form is (estimator - true value) / SE
+
+---
+
+## Confidence interval
+
+- Therefore a $(1 - \alpha)\times 100\%$ confidence interval for $\mu_y - \mu_x$ is 
+$$
+    \bar Y - \bar X \pm t_{n_x + n_y - 2, 1 - \alpha/2}S_p\left(\frac{1}{n_x} + \frac{1}{n_y}\right)^{1/2}
+$$
+- Remember that this interval assumes a constant variance across the two groups
+- If there is some doubt about that assumption, use a different variance for each group, which we will discuss later
+
+---
+
+
+## Example
+### Based on Rosner, Fundamentals of Biostatistics
+
+- Comparing SBP for 8 oral contraceptive users versus 21 controls
+- $\bar X_{OC} = 132.86$ mmHg with $s_{OC} = 15.34$ mmHg
+- $\bar X_{C} = 127.44$ mmHg with $s_{C} = 18.23$ mmHg
+- Pooled standard deviation and the resulting 95% confidence interval
+```{r}
+sp <- sqrt((7 * 15.34^2 + 20 * 18.23^2) / (8 + 21 - 2))
+132.86 - 127.44 + c(-1, 1) * qt(.975, 27) * sp * (1 / 8 + 1 / 21)^.5
+```
+
+---
+```{r}
+data(sleep)
+x1 <- sleep$extra[sleep$group == 1]
+x2 <- sleep$extra[sleep$group == 2]
+n1 <- length(x1)
+n2 <- length(x2)
+sp <- sqrt( ((n1 - 1) * sd(x1)^2 + (n2-1) * sd(x2)^2) / (n1 + n2-2))
+md <- mean(x1) - mean(x2)
+semd <- sp * sqrt(1 / n1 + 1/n2)
+md + c(-1, 1) * qt(.975, n1 + n2 - 2) * semd
+t.test(x1, x2, paired = FALSE, var.equal = TRUE)$conf
+t.test(x1, x2, paired = TRUE)$conf
+```
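+
+The paired interval above can also be reproduced by hand, which helps explain why it is narrower than the independent group interval. The chunk below is a small sketch (the names `d`, `n` and `paired_ci` are introduced here for illustration): it treats the within-subject differences as a single sample and builds the usual one-sample $t$ interval.
+
+```{r}
+data(sleep)
+# within-subject differences, group 1 minus group 2
+# (rows are in the same subject order within each group)
+d <- sleep$extra[sleep$group == 1] - sleep$extra[sleep$group == 2]
+n <- length(d)
+# one-sample t interval on the differences
+paired_ci <- mean(d) + c(-1, 1) * qt(.975, n - 1) * sd(d) / sqrt(n)
+paired_ci
+```
+
+This should agree with `t.test(x1, x2, paired = TRUE)$conf`. In these data the standard error of the mean difference is smaller than the pooled standard error, because subject-to-subject variability cancels within each difference.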
+
+---
+## Ignoring pairing
+```{r, echo = FALSE, fig.width=5, fig.height=5}
+plot(c(0.5, 2.5), range(x1, x2), type = "n", frame = FALSE, xlab = "group", ylab = "Extra", axes = FALSE)
+axis(2)
+axis(1, at = 1 : 2, labels = c("Group 1", "Group 2"))
+for (i in 1 : n1) lines(c(1, 2), c(x1[i], x2[i]), lwd = 2, col = "red")
+for (i in 1 : n1) points(c(1, 2), c(x1[i], x2[i]), lwd = 2, col = "black", bg = "salmon", pch = 21, cex = 3)
+```
+
+---
+
+## Unequal variances
+
+- Under unequal variances
+$$
+    \bar Y - \bar X \sim N\left(\mu_y - \mu_x, \frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}\right)
+$$
+- The statistic 
+$$
+    \frac{\bar Y - \bar X - (\mu_y - \mu_x)}{\left(\frac{S_x^2}{n_x} + \frac{S_y^2}{n_y}\right)^{1/2}}
+$$
+approximately follows Gosset's $t$ distribution with degrees of freedom equal to
+$$
+    \frac{\left(S_x^2 / n_x + S_y^2/n_y\right)^2}
+    {\left(\frac{S_x^2}{n_x}\right)^2 / (n_x - 1) +
+      \left(\frac{S_y^2}{n_y}\right)^2 / (n_y - 1)}
+$$
+
+---
+
+## Example
+
+- Comparing SBP for 8 oral contraceptive users versus 21 controls
+- $\bar X_{OC} = 132.86$ mmHg with $s_{OC} = 15.34$ mmHg
+- $\bar X_{C} = 127.44$ mmHg with $s_{C} = 18.23$ mmHg
+- $df=15.04$, $t_{15.04, .975} = 2.13$
+- Interval
+$$
+132.86 - 127.44 \pm 2.13 \left(\frac{15.34^2}{8} + \frac{18.23^2}{21} \right)^{1/2}
+= [-8.91, 19.75]
+$$
+- In R, `t.test(..., var.equal = FALSE)`
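+
+Because `t.test` needs the raw data, which are not given here, the degrees of freedom and interval above can instead be checked from the summary statistics. The following is a minimal sketch under that assumption; the variable names are chosen for illustration only.
+
+```{r}
+# summary statistics from the slide
+n_oc <- 8;  m_oc <- 132.86; s_oc <- 15.34
+n_c  <- 21; m_c  <- 127.44; s_c  <- 18.23
+# estimated variance of each group mean
+v_oc <- s_oc^2 / n_oc
+v_c  <- s_c^2 / n_c
+# approximate degrees of freedom for the unequal variance interval
+df_welch <- (v_oc + v_c)^2 / (v_oc^2 / (n_oc - 1) + v_c^2 / (n_c - 1))
+# 95% interval for the difference in means (OC minus control)
+welch_ci <- m_oc - m_c + c(-1, 1) * qt(.975, df_welch) * sqrt(v_oc + v_c)
+df_welch
+welch_ci
+```
+
+Up to rounding, the results should agree with the degrees of freedom and interval quoted above.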
+
+---
+## Comparing other kinds of data
+* For binomial data, there are many ways to compare two groups
+  * Relative risk, risk difference, odds ratio.
+  * Chi-squared tests, normal approximations, exact tests.
+* For count data, there are also Chi-squared tests and exact tests.
+* We'll leave the discussion of comparing groups for binary
+  and count data until we cover GLMs in the regression class.
+* In addition, Mathematical Biostatistics Boot Camp 2 covers many special
+  cases relevant to biostatistics.
diff --git a/06_StatisticalInference/03_01_TwoGroupIntervals/index.html b/06_StatisticalInference/03_01_TwoGroupIntervals/index.html
index 910ae14e9..bb7feb5c5 100644
--- a/06_StatisticalInference/03_01_TwoGroupIntervals/index.html
+++ b/06_StatisticalInference/03_01_TwoGroupIntervals/index.html
@@ -1,408 +1,408 @@
-<!DOCTYPE html>
-<html>
-<head>
-  <title>Two group intervals</title>
-  <meta charset="utf-8">
-  <meta name="description" content="Two group intervals">
-  <meta name="author" content="Brian Caffo, Jeff Leek, Roger Peng">
-  <meta name="generator" content="slidify" />
-  <meta name="apple-mobile-web-app-capable" content="yes">
-  <meta http-equiv="X-UA-Compatible" content="chrome=1">
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/default.css" media="all" >
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/phone.css" 
-    media="only screen and (max-device-width: 480px)" >
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/slidify.css" >
-  <link rel="stylesheet" href="../../librariesNew/highlighters/highlight.js/css/tomorrow.css" />
-  <base target="_blank"> <!-- This amazingness opens all links in a new tab. -->  
-  
-  <!-- Grab CDN jQuery, fall back to local if offline -->
-  <script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.7.min.js"></script>
-  <script>window.jQuery || document.write('<script src="../../librariesNew/widgets/quiz/js/jquery.js"><\/script>')</script> 
-  <script data-main="../../librariesNew/frameworks/io2012/js/slides" 
-    src="../../librariesNew/frameworks/io2012/js/require-1.0.8.min.js">
-  </script>
-  
-  
-
-</head>
-<body style="opacity: 0">
-  <slides class="layout-widescreen">
-    
-    <!-- LOGO SLIDE -->
-        <slide class="title-slide segue nobackground">
-  <aside class="gdbar">
-    <img src="../../assets/img/bloomberg_shield.png">
-  </aside>
-  <hgroup class="auto-fadein">
-    <h1>Two group intervals</h1>
-    <h2>Statistical Inference</h2>
-    <p>Brian Caffo, Jeff Leek, Roger Peng<br/>Johns Hopkins Bloomberg School of Public Health</p>
-  </hgroup>
-  <article></article>  
-</slide>
-    
-
-    <!-- SLIDES -->
-    <slide class="" id="slide-1" style="background:;">
-  <hgroup>
-    <h2>Independent group \(t\) confidence intervals</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Suppose that we want to compare the mean blood pressure between two groups in a randomized trial; those who received the treatment to those who received a placebo</li>
-<li>We cannot use the paired t test because the groups are independent and may have different sample sizes</li>
-<li>We now present methods for comparing independent groups</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-2" style="background:;">
-  <hgroup>
-    <h2>Notation</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Let \(X_1,\ldots,X_{n_x}\) be iid \(N(\mu_x,\sigma^2)\)</li>
-<li>Let \(Y_1,\ldots,Y_{n_y}\) be iid \(N(\mu_y, \sigma^2)\)</li>
-<li>Let \(\bar X\), \(\bar Y\), \(S_x\), \(S_y\) be the means and standard deviations</li>
-<li>Using the fact that linear combinations of normals are again normal, we know that \(\bar Y - \bar X\) is also normal with mean \(\mu_y - \mu_x\) and variance \(\sigma^2 (\frac{1}{n_x} + \frac{1}{n_y})\)</li>
-<li>The pooled variance estimator \[S_p^2 = \{(n_x - 1) S_x^2 + (n_y - 1) S_y^2\}/(n_x + n_y - 2)\] is a good estimator of \(\sigma^2\)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-3" style="background:;">
-  <hgroup>
-    <h2>Note</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>The pooled estimator is a mixture of the group variances, placing greater weight on whichever has a larger sample size</li>
-<li>If the sample sizes are the same the pooled variance estimate is the average of the group variances</li>
-<li>The pooled estimator is unbiased
-\[
-\begin{eqnarray*}
-E[S_p^2] & = & \frac{(n_x - 1) E[S_x^2] + (n_y - 1) E[S_y^2]}{n_x + n_y - 2}\\
-        & = & \frac{(n_x - 1)\sigma^2 + (n_y - 1)\sigma^2}{n_x + n_y - 2}
-\end{eqnarray*}
-\]</li>
-<li>The pooled variance  estimate is independent of \(\bar Y - \bar X\) since \(S_x\) is independent of \(\bar X\) and \(S_y\) is independent of \(\bar Y\) and the groups are independent</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-4" style="background:;">
-  <hgroup>
-    <h2>Result</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>The sum of two independent Chi-squared random variables is Chi-squared with degrees of freedom equal to the sum of the degrees of freedom of the summands</li>
-<li>Therefore
-\[
-\begin{eqnarray*}
-  (n_x + n_y - 2) S_p^2 / \sigma^2 & = & (n_x - 1)S_x^2 /\sigma^2 + (n_y - 1)S_y^2/\sigma^2 \\ \\
-  & = & \chi^2_{n_x - 1} + \chi^2_{n_y-1} \\ \\
-  & = & \chi^2_{n_x + n_y - 2}
-\end{eqnarray*}
-\]</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-5" style="background:;">
-  <hgroup>
-    <h2>Putting this all together</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>The statistic
-\[
-\frac{\frac{\bar Y - \bar X - (\mu_y - \mu_x)}{\sigma \left(\frac{1}{n_x} + \frac{1}{n_y}\right)^{1/2}}}%
-{\sqrt{\frac{(n_x + n_y - 2) S_p^2}{(n_x + n_y - 2)\sigma^2}}}
-= \frac{\bar Y - \bar X - (\mu_y - \mu_x)}{S_p \left(\frac{1}{n_x} + \frac{1}{n_y}\right)^{1/2}}
-\]
-is a standard normal divided by the square root of an independent Chi-squared divided by its degrees of freedom </li>
-<li>Therefore this statistic follows Gosset&#39;s \(t\) distribution with \(n_x + n_y - 2\) degrees of freedom</li>
-<li>Notice the form is (estimator - true value) / SE</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-6" style="background:;">
-  <hgroup>
-    <h2>Confidence interval</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Therefore a \((1 - \alpha)\times 100\%\) confidence interval for \(\mu_y - \mu_x\) is 
-\[
-\bar Y - \bar X \pm t_{n_x + n_y - 2, 1 - \alpha/2}S_p\left(\frac{1}{n_x} + \frac{1}{n_y}\right)^{1/2}
-\]</li>
-<li>Remember this interval is assuming a constant variance across the two groups</li>
-<li>If there is some doubt, assume a different variance per group, which we will discuss later</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-7" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <h3>Based on Rosner, Fundamentals of Biostatistics</h3>
-
-<ul>
-<li>Comparing SBP for 8 oral contraceptive users versus 21 controls</li>
-<li>\(\bar X_{OC} = 132.86\) mmHg with \(s_{OC} = 15.34\) mmHg</li>
-<li>\(\bar X_{C} = 127.44\) mmHg with \(s_{C} = 18.23\) mmHg</li>
-<li>Pooled variance estimate</li>
-</ul>
-
-<pre><code class="r">sp &lt;- sqrt((7 * 15.34^2 + 20 * 18.23^2) / (8 + 21 - 2))
-132.86 - 127.44 + c(-1, 1) * qt(.975, 27) * sp * (1 / 8 + 1 / 21)^.5
-</code></pre>
-
-<pre><code>[1] -9.521 20.361
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-8" style="background:;">
-  <article data-timings="">
-    <pre><code class="r">data(sleep)
-x1 &lt;- sleep$extra[sleep$group == 1]
-x2 &lt;- sleep$extra[sleep$group == 2]
-n1 &lt;- length(x1)
-n2 &lt;- length(x2)
-sp &lt;- sqrt( ((n1 - 1) * sd(x1)^2 + (n2-1) * sd(x2)^2) / (n1 + n2-2))
-md &lt;- mean(x1) - mean(x2)
-semd &lt;- sp * sqrt(1 / n1 + 1/n2)
-md + c(-1, 1) * qt(.975, n1 + n2 - 2) * semd
-</code></pre>
-
-<pre><code>[1] -3.3639  0.2039
-</code></pre>
-
-<pre><code class="r">t.test(x1, x2, paired = FALSE, var.equal = TRUE)$conf
-</code></pre>
-
-<pre><code>[1] -3.3639  0.2039
-attr(,&quot;conf.level&quot;)
-[1] 0.95
-</code></pre>
-
-<pre><code class="r">t.test(x1, x2, paired = TRUE)$conf
-</code></pre>
-
-<pre><code>[1] -2.4599 -0.7001
-attr(,&quot;conf.level&quot;)
-[1] 0.95
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-9" style="background:;">
-  <hgroup>
-    <h2>Ignoring pairing</h2>
-  </hgroup>
-  <article data-timings="">
-    <div class="rimage center"><img src="fig/unnamed-chunk-3.png" title="plot of chunk unnamed-chunk-3" alt="plot of chunk unnamed-chunk-3" class="plot" /></div>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-10" style="background:;">
-  <hgroup>
-    <h2>Unequal variances</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Under unequal variances
-\[
-\bar Y - \bar X \sim N\left(\mu_y - \mu_x, \frac{s_x^2}{n_x} + \frac{\sigma_y^2}{n_y}\right)
-\]</li>
-<li>The statistic 
-\[
-\frac{\bar Y - \bar X - (\mu_y - \mu_x)}{\left(\frac{s_x^2}{n_x} + \frac{\sigma_y^2}{n_y}\right)^{1/2}}
-\]
-approximately follows Gosset&#39;s \(t\) distribution with degrees of freedom equal to
-\[
-\frac{\left(S_x^2 / n_x + S_y^2/n_y\right)^2}
-{\left(\frac{S_x^2}{n_x}\right)^2 / (n_x - 1) +
-  \left(\frac{S_y^2}{n_y}\right)^2 / (n_y - 1)}
-\]</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-11" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Comparing SBP for 8 oral contraceptive users versus 21 controls</li>
-<li>\(\bar X_{OC} = 132.86\) mmHg with \(s_{OC} = 15.34\) mmHg</li>
-<li>\(\bar X_{C} = 127.44\) mmHg with \(s_{C} = 18.23\) mmHg</li>
-<li>\(df=15.04\), \(t_{15.04, .975} = 2.13\)</li>
-<li>Interval
-\[
-132.86 - 127.44 \pm 2.13 \left(\frac{15.34^2}{8} + \frac{18.23^2}{21} \right)^{1/2}
-= [-8.91, 19.75]
-\]</li>
-<li>In R, <code>t.test(..., var.equal = FALSE)</code></li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-12" style="background:;">
-  <hgroup>
-    <h2>Comparing other kinds of data</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>For binomial data, there&#39;s lots of ways to compare two groups
-
-<ul>
-<li>Relative risk, risk difference, odds ratio.</li>
-<li>Chi-squared tests, normal approximations, exact tests.</li>
-</ul></li>
-<li>For count data, there&#39;s also Chi-squared tests and exact tests.</li>
-<li>We&#39;ll leave the discussions for comparing groups of data for binary
-and count data until covering glms in the regression class.</li>
-<li>In addition, Mathematical Biostatistics Boot Camp 2 covers many special
-cases relevant to biostatistics.</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-    <slide class="backdrop"></slide>
-  </slides>
-  <div class="pagination pagination-small" id='io2012-ptoc' style="display:none;">
-    <ul>
-      <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=1 title='Independent group \(t\) confidence intervals'>
-         1
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=2 title='Notation'>
-         2
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=3 title='Note'>
-         3
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=4 title='Result'>
-         4
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=5 title='Putting this all together'>
-         5
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=6 title='Confidence interval'>
-         6
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=7 title='Example'>
-         7
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=8 title=''>
-         8
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=9 title='Ignoring pairing'>
-         9
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=10 title='Unequal variances'>
-         10
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=11 title='Example'>
-         11
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=12 title='Comparing other kinds of data'>
-         12
-      </a>
-    </li>
-  </ul>
-  </div>  <!--[if IE]>
-    <script 
-      src="http://ajax.googleapis.com/ajax/libs/chrome-frame/1/CFInstall.min.js">  
-    </script>
-    <script>CFInstall.check({mode: 'overlay'});</script>
-  <![endif]-->
-</body>
-  <!-- Load Javascripts for Widgets -->
-  
-  <!-- MathJax: Fall back to local if CDN offline but local image fonts are not supported (saves >100MB) -->
-  <script type="text/x-mathjax-config">
-    MathJax.Hub.Config({
-      tex2jax: {
-        inlineMath: [['$','$'], ['\\(','\\)']],
-        processEscapes: true
-      }
-    });
-  </script>
-  <script type="text/javascript" src="http://cdn.mathjax.org/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
-  <!-- <script src="https://c328740.ssl.cf1.rackcdn.com/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
-  </script> -->
-  <script>window.MathJax || document.write('<script type="text/x-mathjax-config">MathJax.Hub.Config({"HTML-CSS":{imageFont:null}});<\/script><script src="../../librariesNew/widgets/mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"><\/script>')
-</script>
-<!-- LOAD HIGHLIGHTER JS FILES -->
-  <script src="../../librariesNew/highlighters/highlight.js/highlight.pack.js"></script>
-  <script>hljs.initHighlightingOnLoad();</script>
-  <!-- DONE LOADING HIGHLIGHTER JS FILES -->
-   
+<!DOCTYPE html>
+<html>
+<head>
+  <title>Two group intervals</title>
+  <meta charset="utf-8">
+  <meta name="description" content="Two group intervals">
+  <meta name="author" content="Brian Caffo, Jeff Leek, Roger Peng">
+  <meta name="generator" content="slidify" />
+  <meta name="apple-mobile-web-app-capable" content="yes">
+  <meta http-equiv="X-UA-Compatible" content="chrome=1">
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/default.css" media="all" >
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/phone.css" 
+    media="only screen and (max-device-width: 480px)" >
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/slidify.css" >
+  <link rel="stylesheet" href="../../librariesNew/highlighters/highlight.js/css/tomorrow.css" />
+  <base target="_blank"> <!-- This amazingness opens all links in a new tab. -->  
+  
+  <!-- Grab CDN jQuery, fall back to local if offline -->
+  <script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.7.min.js"></script>
+  <script>window.jQuery || document.write('<script src="../../librariesNew/widgets/quiz/js/jquery.js"><\/script>')</script> 
+  <script data-main="../../librariesNew/frameworks/io2012/js/slides" 
+    src="../../librariesNew/frameworks/io2012/js/require-1.0.8.min.js">
+  </script>
+  
+  
+
+</head>
+<body style="opacity: 0">
+  <slides class="layout-widescreen">
+    
+    <!-- LOGO SLIDE -->
+        <slide class="title-slide segue nobackground">
+  <aside class="gdbar">
+    <img src="../../assets/img/bloomberg_shield.png">
+  </aside>
+  <hgroup class="auto-fadein">
+    <h1>Two group intervals</h1>
+    <h2>Statistical Inference</h2>
+    <p>Brian Caffo, Jeff Leek, Roger Peng<br/>Johns Hopkins Bloomberg School of Public Health</p>
+  </hgroup>
+  <article></article>  
+</slide>
+    
+
+    <!-- SLIDES -->
+    <slide class="" id="slide-1" style="background:;">
+  <hgroup>
+    <h2>Independent group \(t\) confidence intervals</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Suppose that we want to compare the mean blood pressure between two groups in a randomized trial: those who received the treatment and those who received a placebo</li>
+<li>We cannot use the paired \(t\) test because the groups are independent and may have different sample sizes</li>
+<li>We now present methods for comparing independent groups</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-2" style="background:;">
+  <hgroup>
+    <h2>Notation</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Let \(X_1,\ldots,X_{n_x}\) be iid \(N(\mu_x,\sigma^2)\)</li>
+<li>Let \(Y_1,\ldots,Y_{n_y}\) be iid \(N(\mu_y, \sigma^2)\)</li>
+<li>Let \(\bar X\), \(\bar Y\), \(S_x\), \(S_y\) be the means and standard deviations</li>
+<li>Using the fact that linear combinations of normals are again normal, we know that \(\bar Y - \bar X\) is also normal with mean \(\mu_y - \mu_x\) and variance \(\sigma^2 (\frac{1}{n_x} + \frac{1}{n_y})\)</li>
+<li>The pooled variance estimator \[S_p^2 = \{(n_x - 1) S_x^2 + (n_y - 1) S_y^2\}/(n_x + n_y - 2)\] is a good estimator of \(\sigma^2\)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-3" style="background:;">
+  <hgroup>
+    <h2>Note</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>The pooled estimator is a mixture of the group variances, placing greater weight on whichever has a larger sample size</li>
+<li>If the sample sizes are the same the pooled variance estimate is the average of the group variances</li>
+<li>The pooled estimator is unbiased
+\[
+\begin{eqnarray*}
+E[S_p^2] & = & \frac{(n_x - 1) E[S_x^2] + (n_y - 1) E[S_y^2]}{n_x + n_y - 2}\\
+        & = & \frac{(n_x - 1)\sigma^2 + (n_y - 1)\sigma^2}{n_x + n_y - 2}\\
+        & = & \sigma^2
+\end{eqnarray*}
+\]</li>
+<li>The pooled variance estimate is independent of \(\bar Y - \bar X\), since \(S_x\) is independent of \(\bar X\), \(S_y\) is independent of \(\bar Y\), and the two groups are independent of each other</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-4" style="background:;">
+  <hgroup>
+    <h2>Result</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>The sum of two independent Chi-squared random variables is Chi-squared with degrees of freedom equal to the sum of the degrees of freedom of the summands</li>
+<li>Therefore
+\[
+\begin{eqnarray*}
+  (n_x + n_y - 2) S_p^2 / \sigma^2 & = & (n_x - 1)S_x^2 /\sigma^2 + (n_y - 1)S_y^2/\sigma^2 \\ \\
+  & = & \chi^2_{n_x - 1} + \chi^2_{n_y-1} \\ \\
+  & = & \chi^2_{n_x + n_y - 2}
+\end{eqnarray*}
+\]</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-5" style="background:;">
+  <hgroup>
+    <h2>Putting this all together</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>The statistic
+\[
+\frac{\frac{\bar Y - \bar X - (\mu_y - \mu_x)}{\sigma \left(\frac{1}{n_x} + \frac{1}{n_y}\right)^{1/2}}}%
+{\sqrt{\frac{(n_x + n_y - 2) S_p^2}{(n_x + n_y - 2)\sigma^2}}}
+= \frac{\bar Y - \bar X - (\mu_y - \mu_x)}{S_p \left(\frac{1}{n_x} + \frac{1}{n_y}\right)^{1/2}}
+\]
+is a standard normal divided by the square root of an independent Chi-squared divided by its degrees of freedom </li>
+<li>Therefore this statistic follows Gosset&#39;s \(t\) distribution with \(n_x + n_y - 2\) degrees of freedom</li>
+<li>Notice the form is (estimator - true value) / SE</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-6" style="background:;">
+  <hgroup>
+    <h2>Confidence interval</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Therefore a \((1 - \alpha)\times 100\%\) confidence interval for \(\mu_y - \mu_x\) is 
+\[
+\bar Y - \bar X \pm t_{n_x + n_y - 2, 1 - \alpha/2}S_p\left(\frac{1}{n_x} + \frac{1}{n_y}\right)^{1/2}
+\]</li>
+<li>Remember that this interval assumes a constant variance across the two groups</li>
+<li>If there is some doubt about that assumption, use a different variance for each group, which we will discuss later</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-7" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
+    <h3>Based on Rosner, Fundamentals of Biostatistics</h3>
+
+<ul>
+<li>Comparing SBP for 8 oral contraceptive users versus 21 controls</li>
+<li>\(\bar X_{OC} = 132.86\) mmHg with \(s_{OC} = 15.34\) mmHg</li>
+<li>\(\bar X_{C} = 127.44\) mmHg with \(s_{C} = 18.23\) mmHg</li>
+<li>Pooled standard deviation and the resulting 95% confidence interval</li>
+</ul>
+
+<pre><code class="r">sp &lt;- sqrt((7 * 15.34^2 + 20 * 18.23^2)/(8 + 21 - 2))
+132.86 - 127.44 + c(-1, 1) * qt(0.975, 27) * sp * (1/8 + 1/21)^0.5
+</code></pre>
+
+<pre><code>## [1] -9.521 20.361
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-8" style="background:;">
+  <article data-timings="">
+    <pre><code class="r">data(sleep)
+x1 &lt;- sleep$extra[sleep$group == 1]
+x2 &lt;- sleep$extra[sleep$group == 2]
+n1 &lt;- length(x1)
+n2 &lt;- length(x2)
+sp &lt;- sqrt(((n1 - 1) * sd(x1)^2 + (n2 - 1) * sd(x2)^2)/(n1 + n2 - 2))
+md &lt;- mean(x1) - mean(x2)
+semd &lt;- sp * sqrt(1/n1 + 1/n2)
+md + c(-1, 1) * qt(0.975, n1 + n2 - 2) * semd
+</code></pre>
+
+<pre><code>## [1] -3.3639  0.2039
+</code></pre>
+
+<pre><code class="r">t.test(x1, x2, paired = FALSE, var.equal = TRUE)$conf
+</code></pre>
+
+<pre><code>## [1] -3.3639  0.2039
+## attr(,&quot;conf.level&quot;)
+## [1] 0.95
+</code></pre>
+
+<pre><code class="r">t.test(x1, x2, paired = TRUE)$conf
+</code></pre>
+
+<pre><code>## [1] -2.4599 -0.7001
+## attr(,&quot;conf.level&quot;)
+## [1] 0.95
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-9" style="background:;">
+  <hgroup>
+    <h2>Ignoring pairing</h2>
+  </hgroup>
+  <article data-timings="">
+    <p><img src="assets/fig/unnamed-chunk-3.png" alt="plot of chunk unnamed-chunk-3"> </p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-10" style="background:;">
+  <hgroup>
+    <h2>Unequal variances</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Under unequal variances
+\[
+\bar Y - \bar X \sim N\left(\mu_y - \mu_x, \frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}\right)
+\]</li>
+<li>The statistic 
+\[
+\frac{\bar Y - \bar X - (\mu_y - \mu_x)}{\left(\frac{S_x^2}{n_x} + \frac{S_y^2}{n_y}\right)^{1/2}}
+\]
+approximately follows Gosset&#39;s \(t\) distribution with degrees of freedom equal to
+\[
+\frac{\left(S_x^2 / n_x + S_y^2/n_y\right)^2}
+{\left(\frac{S_x^2}{n_x}\right)^2 / (n_x - 1) +
+  \left(\frac{S_y^2}{n_y}\right)^2 / (n_y - 1)}
+\]</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-11" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Comparing SBP for 8 oral contraceptive users versus 21 controls</li>
+<li>\(\bar X_{OC} = 132.86\) mmHg with \(s_{OC} = 15.34\) mmHg</li>
+<li>\(\bar X_{C} = 127.44\) mmHg with \(s_{C} = 18.23\) mmHg</li>
+<li>\(df=15.04\), \(t_{15.04, .975} = 2.13\)</li>
+<li>Interval
+\[
+132.86 - 127.44 \pm 2.13 \left(\frac{15.34^2}{8} + \frac{18.23^2}{21} \right)^{1/2}
+= [-8.91, 19.75]
+\]</li>
+<li>In R, <code>t.test(..., var.equal = FALSE)</code></li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-12" style="background:;">
+  <hgroup>
+    <h2>Comparing other kinds of data</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>For binomial data, there are many ways to compare two groups
+
+<ul>
+<li>Relative risk, risk difference, odds ratio.</li>
+<li>Chi-squared tests, normal approximations, exact tests.</li>
+</ul></li>
+<li>For count data, there are also Chi-squared tests and exact tests.</li>
+<li>We&#39;ll leave the discussion of comparing groups for binary
+and count data until we cover GLMs in the regression class.</li>
+<li>In addition, Mathematical Biostatistics Boot Camp 2 covers many special
+cases relevant to biostatistics.</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+    <slide class="backdrop"></slide>
+  </slides>
+  <div class="pagination pagination-small" id='io2012-ptoc' style="display:none;">
+    <ul>
+      <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=1 title='Independent group \(t\) confidence intervals'>
+         1
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=2 title='Notation'>
+         2
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=3 title='Note'>
+         3
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=4 title='Result'>
+         4
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=5 title='Putting this all together'>
+         5
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=6 title='Confidence interval'>
+         6
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=7 title='Example'>
+         7
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=8 title=''>
+         8
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=9 title='Ignoring pairing'>
+         9
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=10 title='Unequal variances'>
+         10
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=11 title='Example'>
+         11
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=12 title='Comparing other kinds of data'>
+         12
+      </a>
+    </li>
+  </ul>
+  </div>  <!--[if IE]>
+    <script 
+      src="http://ajax.googleapis.com/ajax/libs/chrome-frame/1/CFInstall.min.js">  
+    </script>
+    <script>CFInstall.check({mode: 'overlay'});</script>
+  <![endif]-->
+</body>
+  <!-- Load Javascripts for Widgets -->
+  
+  <!-- MathJax: Fall back to local if CDN offline but local image fonts are not supported (saves >100MB) -->
+  <script type="text/x-mathjax-config">
+    MathJax.Hub.Config({
+      tex2jax: {
+        inlineMath: [['$','$'], ['\\(','\\)']],
+        processEscapes: true
+      }
+    });
+  </script>
+  <script type="text/javascript" src="http://cdn.mathjax.org/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
+  <!-- <script src="https://c328740.ssl.cf1.rackcdn.com/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
+  </script> -->
+  <script>window.MathJax || document.write('<script type="text/x-mathjax-config">MathJax.Hub.Config({"HTML-CSS":{imageFont:null}});<\/script><script src="../../librariesNew/widgets/mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"><\/script>')
+</script>
+<!-- LOAD HIGHLIGHTER JS FILES -->
+  <script src="../../librariesNew/highlighters/highlight.js/highlight.pack.js"></script>
+  <script>hljs.initHighlightingOnLoad();</script>
+  <!-- DONE LOADING HIGHLIGHTER JS FILES -->
+   
   </html>
\ No newline at end of file
diff --git a/06_StatisticalInference/03_01_TwoGroupIntervals/index.md b/06_StatisticalInference/03_01_TwoGroupIntervals/index.md
index 7bcda6797..d856f7887 100644
--- a/06_StatisticalInference/03_01_TwoGroupIntervals/index.md
+++ b/06_StatisticalInference/03_01_TwoGroupIntervals/index.md
@@ -1,197 +1,195 @@
----
-title       : Two group intervals
-subtitle    : Statistical Inference
-author      : Brian Caffo, Jeff Leek, Roger Peng
-job         : Johns Hopkins Bloomberg School of Public Health
-logo        : bloomberg_shield.png
-framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
-highlighter : highlight.js  # {highlight.js, prettify, highlight}
-hitheme     : tomorrow      # 
-url:
-  lib: ../../librariesNew
-  assets: ../../assets
-widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
-mode        : selfcontained # {standalone, draft}
----
-
-
-
-## Independent group $t$ confidence intervals
-
-- Suppose that we want to compare the mean blood pressure between two groups in a randomized trial; those who received the treatment to those who received a placebo
-- We cannot use the paired t test because the groups are independent and may have different sample sizes
-- We now present methods for comparing independent groups
-
----
-
-## Notation
-
-- Let $X_1,\ldots,X_{n_x}$ be iid $N(\mu_x,\sigma^2)$
-- Let $Y_1,\ldots,Y_{n_y}$ be iid $N(\mu_y, \sigma^2)$
-- Let $\bar X$, $\bar Y$, $S_x$, $S_y$ be the means and standard deviations
-- Using the fact that linear combinations of normals are again normal, we know that $\bar Y - \bar X$ is also normal with mean $\mu_y - \mu_x$ and variance $\sigma^2 (\frac{1}{n_x} + \frac{1}{n_y})$
-- The pooled variance estimator $$S_p^2 = \{(n_x - 1) S_x^2 + (n_y - 1) S_y^2\}/(n_x + n_y - 2)$$ is a good estimator of $\sigma^2$
-
----
-
-## Note
-
-- The pooled estimator is a mixture of the group variances, placing greater weight on whichever has a larger sample size
-- If the sample sizes are the same the pooled variance estimate is the average of the group variances
-- The pooled estimator is unbiased
-$$
-    \begin{eqnarray*}
-    E[S_p^2] & = & \frac{(n_x - 1) E[S_x^2] + (n_y - 1) E[S_y^2]}{n_x + n_y - 2}\\
-            & = & \frac{(n_x - 1)\sigma^2 + (n_y - 1)\sigma^2}{n_x + n_y - 2}
-    \end{eqnarray*}
-$$
-- The pooled variance  estimate is independent of $\bar Y - \bar X$ since $S_x$ is independent of $\bar X$ and $S_y$ is independent of $\bar Y$ and the groups are independent
-
----
-
-## Result
-
-- The sum of two independent Chi-squared random variables is Chi-squared with degrees of freedom equal to the sum of the degrees of freedom of the summands
-- Therefore
-$$
-    \begin{eqnarray*}
-      (n_x + n_y - 2) S_p^2 / \sigma^2 & = & (n_x - 1)S_x^2 /\sigma^2 + (n_y - 1)S_y^2/\sigma^2 \\ \\
-      & = & \chi^2_{n_x - 1} + \chi^2_{n_y-1} \\ \\
-      & = & \chi^2_{n_x + n_y - 2}
-    \end{eqnarray*}
-$$
-
----
-
-## Putting this all together
-
-- The statistic
-$$
-    \frac{\frac{\bar Y - \bar X - (\mu_y - \mu_x)}{\sigma \left(\frac{1}{n_x} + \frac{1}{n_y}\right)^{1/2}}}%
-    {\sqrt{\frac{(n_x + n_y - 2) S_p^2}{(n_x + n_y - 2)\sigma^2}}}
-    = \frac{\bar Y - \bar X - (\mu_y - \mu_x)}{S_p \left(\frac{1}{n_x} + \frac{1}{n_y}\right)^{1/2}}
-$$
-is a standard normal divided by the square root of an independent Chi-squared divided by its degrees of freedom 
-- Therefore this statistic follows Gosset's $t$ distribution with $n_x + n_y - 2$ degrees of freedom
-- Notice the form is (estimator - true value) / SE
-
----
-
-## Confidence interval
-
-- Therefore a $(1 - \alpha)\times 100\%$ confidence interval for $\mu_y - \mu_x$ is 
-$$
-    \bar Y - \bar X \pm t_{n_x + n_y - 2, 1 - \alpha/2}S_p\left(\frac{1}{n_x} + \frac{1}{n_y}\right)^{1/2}
-$$
-- Remember this interval is assuming a constant variance across the two groups
-- If there is some doubt, assume a different variance per group, which we will discuss later
-
----
-
-
-## Example
-### Based on Rosner, Fundamentals of Biostatistics
-
-- Comparing SBP for 8 oral contraceptive users versus 21 controls
-- $\bar X_{OC} = 132.86$ mmHg with $s_{OC} = 15.34$ mmHg
-- $\bar X_{C} = 127.44$ mmHg with $s_{C} = 18.23$ mmHg
-- Pooled variance estimate
-
-```r
-sp <- sqrt((7 * 15.34^2 + 20 * 18.23^2) / (8 + 21 - 2))
-132.86 - 127.44 + c(-1, 1) * qt(.975, 27) * sp * (1 / 8 + 1 / 21)^.5
-```
-
-```
-[1] -9.521 20.361
-```
-
-
----
-
-```r
-data(sleep)
-x1 <- sleep$extra[sleep$group == 1]
-x2 <- sleep$extra[sleep$group == 2]
-n1 <- length(x1)
-n2 <- length(x2)
-sp <- sqrt( ((n1 - 1) * sd(x1)^2 + (n2-1) * sd(x2)^2) / (n1 + n2-2))
-md <- mean(x1) - mean(x2)
-semd <- sp * sqrt(1 / n1 + 1/n2)
-md + c(-1, 1) * qt(.975, n1 + n2 - 2) * semd
-```
-
-```
-[1] -3.3639  0.2039
-```
-
-```r
-t.test(x1, x2, paired = FALSE, var.equal = TRUE)$conf
-```
-
-```
-[1] -3.3639  0.2039
-attr(,"conf.level")
-[1] 0.95
-```
-
-```r
-t.test(x1, x2, paired = TRUE)$conf
-```
-
-```
-[1] -2.4599 -0.7001
-attr(,"conf.level")
-[1] 0.95
-```
-
-
----
-## Ignoring pairing
-<div class="rimage center"><img src="fig/unnamed-chunk-3.png" title="plot of chunk unnamed-chunk-3" alt="plot of chunk unnamed-chunk-3" class="plot" /></div>
-
-
----
-
-## Unequal variances
-
-- Under unequal variances
-$$
-    \bar Y - \bar X \sim N\left(\mu_y - \mu_x, \frac{s_x^2}{n_x} + \frac{\sigma_y^2}{n_y}\right)
-$$
-- The statistic 
-$$
-    \frac{\bar Y - \bar X - (\mu_y - \mu_x)}{\left(\frac{s_x^2}{n_x} + \frac{\sigma_y^2}{n_y}\right)^{1/2}}
-$$
-approximately follows Gosset's $t$ distribution with degrees of freedom equal to
-$$
-    \frac{\left(S_x^2 / n_x + S_y^2/n_y\right)^2}
-    {\left(\frac{S_x^2}{n_x}\right)^2 / (n_x - 1) +
-      \left(\frac{S_y^2}{n_y}\right)^2 / (n_y - 1)}
-$$
-
----
-
-## Example
-
-- Comparing SBP for 8 oral contraceptive users versus 21 controls
-- $\bar X_{OC} = 132.86$ mmHg with $s_{OC} = 15.34$ mmHg
-- $\bar X_{C} = 127.44$ mmHg with $s_{C} = 18.23$ mmHg
-- $df=15.04$, $t_{15.04, .975} = 2.13$
-- Interval
-$$
-132.86 - 127.44 \pm 2.13 \left(\frac{15.34^2}{8} + \frac{18.23^2}{21} \right)^{1/2}
-= [-8.91, 19.75]
-$$
-- In R, `t.test(..., var.equal = FALSE)`
-
----
-## Comparing other kinds of data
-* For binomial data, there's lots of ways to compare two groups
-  * Relative risk, risk difference, odds ratio.
-  * Chi-squared tests, normal approximations, exact tests.
-* For count data, there's also Chi-squared tests and exact tests.
-* We'll leave the discussions for comparing groups of data for binary
-  and count data until covering glms in the regression class.
-* In addition, Mathematical Biostatistics Boot Camp 2 covers many special
-  cases relevant to biostatistics.
+---
+title       : Two group intervals
+subtitle    : Statistical Inference
+author      : Brian Caffo, Jeff Leek, Roger Peng
+job         : Johns Hopkins Bloomberg School of Public Health
+logo        : bloomberg_shield.png
+framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
+highlighter : highlight.js  # {highlight.js, prettify, highlight}
+hitheme     : tomorrow      # 
+url:
+  lib: ../../librariesNew
+  assets: ../../assets
+widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
+mode        : selfcontained # {standalone, draft}
+---
+
+## Independent group $t$ confidence intervals
+
+- Suppose that we want to compare the mean blood pressure between two groups in a randomized trial: those who received the treatment and those who received a placebo
+- We cannot use the paired t test because the groups are independent and may have different sample sizes
+- We now present methods for comparing independent groups
+
+---
+
+## Notation
+
+- Let $X_1,\ldots,X_{n_x}$ be iid $N(\mu_x,\sigma^2)$
+- Let $Y_1,\ldots,Y_{n_y}$ be iid $N(\mu_y, \sigma^2)$
+- Let $\bar X$, $\bar Y$, $S_x$, $S_y$ be the means and standard deviations
+- Using the fact that linear combinations of normals are again normal, we know that $\bar Y - \bar X$ is also normal with mean $\mu_y - \mu_x$ and variance $\sigma^2 (\frac{1}{n_x} + \frac{1}{n_y})$
+- The pooled variance estimator $$S_p^2 = \{(n_x - 1) S_x^2 + (n_y - 1) S_y^2\}/(n_x + n_y - 2)$$ is a good estimator of $\sigma^2$
+
+---
+
+## Note
+
+- The pooled estimator is a mixture of the group variances, placing greater weight on whichever has a larger sample size
+- If the sample sizes are the same the pooled variance estimate is the average of the group variances
+- The pooled estimator is unbiased
+$$
+    \begin{eqnarray*}
+    E[S_p^2] & = & \frac{(n_x - 1) E[S_x^2] + (n_y - 1) E[S_y^2]}{n_x + n_y - 2}\\
+            & = & \frac{(n_x - 1)\sigma^2 + (n_y - 1)\sigma^2}{n_x + n_y - 2} = \sigma^2
+    \end{eqnarray*}
+$$
+- The pooled variance estimate is independent of $\bar Y - \bar X$ since $S_x$ is independent of $\bar X$, $S_y$ is independent of $\bar Y$, and the groups are independent
+
+---
+
+## Result
+
+- The sum of two independent Chi-squared random variables is Chi-squared with degrees of freedom equal to the sum of the degrees of freedom of the summands
+- Therefore
+$$
+    \begin{eqnarray*}
+      (n_x + n_y - 2) S_p^2 / \sigma^2 & = & (n_x - 1)S_x^2 /\sigma^2 + (n_y - 1)S_y^2/\sigma^2 \\ \\
+      & = & \chi^2_{n_x - 1} + \chi^2_{n_y-1} \\ \\
+      & = & \chi^2_{n_x + n_y - 2}
+    \end{eqnarray*}
+$$
+
+---
+
+## Putting this all together
+
+- The statistic
+$$
+    \frac{\frac{\bar Y - \bar X - (\mu_y - \mu_x)}{\sigma \left(\frac{1}{n_x} + \frac{1}{n_y}\right)^{1/2}}}%
+    {\sqrt{\frac{(n_x + n_y - 2) S_p^2}{(n_x + n_y - 2)\sigma^2}}}
+    = \frac{\bar Y - \bar X - (\mu_y - \mu_x)}{S_p \left(\frac{1}{n_x} + \frac{1}{n_y}\right)^{1/2}}
+$$
+is a standard normal divided by the square root of an independent Chi-squared divided by its degrees of freedom 
+- Therefore this statistic follows Gosset's $t$ distribution with $n_x + n_y - 2$ degrees of freedom
+- Notice the form is (estimator - true value) / SE
+
+---
+
+## Confidence interval
+
+- Therefore a $(1 - \alpha)\times 100\%$ confidence interval for $\mu_y - \mu_x$ is 
+$$
+    \bar Y - \bar X \pm t_{n_x + n_y - 2, 1 - \alpha/2}S_p\left(\frac{1}{n_x} + \frac{1}{n_y}\right)^{1/2}
+$$
+- Remember this interval is assuming a constant variance across the two groups
+- If there is some doubt, assume a different variance per group, which we will discuss later
+
+---
+
+
+## Example
+### Based on Rosner, Fundamentals of Biostatistics
+
+- Comparing SBP for 8 oral contraceptive users versus 21 controls
+- $\bar X_{OC} = 132.86$ mmHg with $s_{OC} = 15.34$ mmHg
+- $\bar X_{C} = 127.44$ mmHg with $s_{C} = 18.23$ mmHg
+- Pooled variance estimate
+
+```r
+sp <- sqrt((7 * 15.34^2 + 20 * 18.23^2)/(8 + 21 - 2))
+132.86 - 127.44 + c(-1, 1) * qt(0.975, 27) * sp * (1/8 + 1/21)^0.5
+```
+
+```
+## [1] -9.521 20.361
+```
+
+
+---
+
+```r
+data(sleep)
+x1 <- sleep$extra[sleep$group == 1]
+x2 <- sleep$extra[sleep$group == 2]
+n1 <- length(x1)
+n2 <- length(x2)
+sp <- sqrt(((n1 - 1) * sd(x1)^2 + (n2 - 1) * sd(x2)^2)/(n1 + n2 - 2))
+md <- mean(x1) - mean(x2)
+semd <- sp * sqrt(1/n1 + 1/n2)
+md + c(-1, 1) * qt(0.975, n1 + n2 - 2) * semd
+```
+
+```
+## [1] -3.3639  0.2039
+```
+
+```r
+t.test(x1, x2, paired = FALSE, var.equal = TRUE)$conf
+```
+
+```
+## [1] -3.3639  0.2039
+## attr(,"conf.level")
+## [1] 0.95
+```
+
+```r
+t.test(x1, x2, paired = TRUE)$conf
+```
+
+```
+## [1] -2.4599 -0.7001
+## attr(,"conf.level")
+## [1] 0.95
+```
+
+
+---
+## Ignoring pairing
+![plot of chunk unnamed-chunk-3](assets/fig/unnamed-chunk-3.png) 
+
+
+---
+
+## Unequal variances
+
+- Under unequal variances
+$$
+    \bar Y - \bar X \sim N\left(\mu_y - \mu_x, \frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}\right)
+$$
+- The statistic 
+$$
+    \frac{\bar Y - \bar X - (\mu_y - \mu_x)}{\left(\frac{S_x^2}{n_x} + \frac{S_y^2}{n_y}\right)^{1/2}}
+$$
+approximately follows Gosset's $t$ distribution with degrees of freedom equal to
+$$
+    \frac{\left(S_x^2 / n_x + S_y^2/n_y\right)^2}
+    {\left(\frac{S_x^2}{n_x}\right)^2 / (n_x - 1) +
+      \left(\frac{S_y^2}{n_y}\right)^2 / (n_y - 1)}
+$$
+
+---
+
+## Example
+
+- Comparing SBP for 8 oral contraceptive users versus 21 controls
+- $\bar X_{OC} = 132.86$ mmHg with $s_{OC} = 15.34$ mmHg
+- $\bar X_{C} = 127.44$ mmHg with $s_{C} = 18.23$ mmHg
+- $df=15.04$, $t_{15.04, .975} = 2.13$
+- Interval
+$$
+132.86 - 127.44 \pm 2.13 \left(\frac{15.34^2}{8} + \frac{18.23^2}{21} \right)^{1/2}
+= [-8.91, 19.75]
+$$
+- In R, `t.test(..., var.equal = FALSE)`
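+
+The numbers on this slide can be recomputed from the summary statistics alone. The following is a minimal sketch, not the authors' code; the variable names (`v_oc`, `v_c`, `df_welch`, `ci`) are assumptions introduced here for illustration. It should give a df of about 15.04 and an interval close to the one shown above.
+
+```r
+# Summary statistics from the slide (SBP in mmHg)
+n_oc <- 8;  m_oc <- 132.86; s_oc <- 15.34
+n_c  <- 21; m_c  <- 127.44; s_c  <- 18.23
+
+# Estimated variance of each sample mean
+v_oc <- s_oc^2 / n_oc
+v_c  <- s_c^2 / n_c
+
+# Approximate degrees of freedom from the unequal-variance formula
+df_welch <- (v_oc + v_c)^2 / (v_oc^2 / (n_oc - 1) + v_c^2 / (n_c - 1))
+
+# 95% interval for the difference in means (OC minus control)
+ci <- m_oc - m_c + c(-1, 1) * qt(0.975, df_welch) * sqrt(v_oc + v_c)
+round(df_welch, 2)
+round(ci, 2)
+```
+
+With raw data rather than summaries, `t.test(x, y, var.equal = FALSE)` performs the same calculation.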
+
+---
+## Comparing other kinds of data
+* For binomial data, there are many ways to compare two groups
+  * Relative risk, risk difference, odds ratio.
+  * Chi-squared tests, normal approximations, exact tests.
+* For count data, there are also Chi-squared tests and exact tests.
+* We'll leave the discussion of comparing groups for binary
+  and count data until we cover GLMs in the regression class.
+* In addition, Mathematical Biostatistics Boot Camp 2 covers many special
+  cases relevant to biostatistics.
diff --git a/06_StatisticalInference/03_01_TwoGroupIntervals/index.pdf b/06_StatisticalInference/03_01_TwoGroupIntervals/index.pdf
index a15898d7a..b46a7169f 100644
Binary files a/06_StatisticalInference/03_01_TwoGroupIntervals/index.pdf and b/06_StatisticalInference/03_01_TwoGroupIntervals/index.pdf differ
diff --git a/06_StatisticalInference/03_02_HypothesisTesting/index.Rmd b/06_StatisticalInference/03_02_HypothesisTesting/index.Rmd
index 3cfdbb6fd..79a50d8d6 100644
--- a/06_StatisticalInference/03_02_HypothesisTesting/index.Rmd
+++ b/06_StatisticalInference/03_02_HypothesisTesting/index.Rmd
@@ -1,215 +1,199 @@
----
-title       : Hypothesis testing
-subtitle    : Statistical Inference
-author      : Brian Caffo, Jeff Leek, Roger Peng
-job         : Johns Hopkins Bloomberg School of Public Health
-logo        : bloomberg_shield.png
-framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
-highlighter : highlight.js  # {highlight.js, prettify, highlight}
-hitheme     : tomorrow      # 
-url:
-  lib: ../../librariesNew
-  assets: ../../assets
-widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
-mode        : selfcontained # {standalone, draft}
----
-```{r setup, cache = F, echo = F, message = F, warning = F, tidy = F, results='hide'}
-# make this an external chunk that can be included in any file
-options(width = 100)
-opts_chunk$set(message = F, error = F, warning = F, comment = NA, fig.align = 'center', dpi = 100, tidy = F, cache.path = '.cache/', fig.path = 'fig/')
-
-options(xtable.type = 'html')
-knit_hooks$set(inline = function(x) {
-  if(is.numeric(x)) {
-    round(x, getOption('digits'))
-  } else {
-    paste(as.character(x), collapse = ', ')
-  }
-})
-knit_hooks$set(plot = knitr:::hook_plot_html)
-runif(1)
-```
-
-## Hypothesis testing
-* Hypothesis testing is concerned with making decisions using data
-* A null hypothesis is specified that represents the status quo,
-  usually labeled $H_0$
-* The null hypothesis is assumed true and statistical evidence is required
-  to reject it in favor of a research or alternative hypothesis 
-
----
-## Example
-* A respiratory disturbance index of more than $30$ events / hour, say, is 
-  considered evidence of severe sleep disordered breathing (SDB).
-* Suppose that in a sample of $100$ overweight subjects with other
-  risk factors for sleep disordered breathing at a sleep clinic, the
-  mean RDI was $32$ events / hour with a standard deviation of $10$ events / hour.
-* We might want to test the hypothesis that 
-  * $H_0 : \mu = 30$
-  * $H_a : \mu > 30$
-  * where $\mu$ is the population mean RDI.
-
----
-## Hypothesis testing
-* The alternative hypotheses are typically of the form $<$, $>$ or $\neq$
-* Note that there are four possible outcomes of our statistical decision process
-
-Truth | Decide | Result |
----|---|---|
-$H_0$ | $H_0$ | Correctly accept null |
-$H_0$ | $H_a$ | Type I error |
-$H_a$ | $H_a$ | Correctly reject null |
-$H_a$ | $H_0$ | Type II error |
-
----
-## Discussion
-* Consider a court of law; the null hypothesis is that the
-  defendant is innocent
-* We require evidence to reject the null hypothesis (convict)
-* If we require little evidence, then we would increase the
-  percentage of innocent people convicted (type I errors); however we
-  would also increase the percentage of guilty people convicted
-  (correctly rejecting the null)
-* If we require a lot of evidence, then we increase the the
-  percentage of innocent people let free (correctly accepting the
-  null) while we would also increase the percentage of guilty people
-  let free (type II errors)
-
----
-## Example
-* Consider our example again
-* A reasonable strategy would reject the null hypothesis if
-  $\bar X$ was larger than some constant, say $C$
-* Typically, $C$ is chosen so that the probability of a Type I
-  error, $\alpha$, is $.05$ (or some other relevant constant)
-* $\alpha$ = Type I error rate = Probability of rejecting the null hypothesis when, in fact, the null hypothesis is correct
-
----
-## Example continued
-
-
-$$
-\begin{align}
-0.05  & =  P\left(\bar X \geq C ~|~ \mu = 30 \right) \\
-      & =  P\left(\frac{\bar X - 30}{10 / \sqrt{100}} \geq \frac{C - 30}{10/\sqrt{100}} ~|~ \mu = 30\right) \\
-      & =  P\left(Z \geq \frac{C - 30}{1}\right) \\
-\end{align}
-$$
-
-* Hence $(C - 30) / 1 = 1.645$ implying $C = 31.645$
-* Since our mean is $32$ we reject the null hypothesis
-
----
-## Discussion
-* In general we don't convert $C$ back to the original scale
-* We would just reject because the Z-score; which is how many
-  standard errors the sample mean is above the hypothesized mean
-  $$
-  \frac{32 - 30}{10 / \sqrt{100}} = 2
-  $$
-  is greater than $1.645$
-* Or, whenever $\sqrt{n} (\bar X - \mu_0) / s > Z_{1-\alpha}$
-
----
-## General rules
-* The $Z$ test for $H_0:\mu = \mu_0$ versus 
-  * $H_1: \mu < \mu_0$
-  * $H_2: \mu \neq \mu_0$
-  * $H_3: \mu > \mu_0$ 
-* Test statistic $ TS = \frac{\bar{X} - \mu_0}{S / \sqrt{n}} $
-* Reject the null hypothesis when 
-  * $TS \leq -Z_{1 - \alpha}$
-  * $|TS| \geq Z_{1 - \alpha / 2}$
-  * $TS \geq Z_{1 - \alpha}$
-
----
-## Notes
-* We have fixed $\alpha$ to be low, so if we reject $H_0$ (either
-  our model is wrong) or there is a low probability that we have made
-  an error
-* We have not fixed the probability of a type II error, $\beta$;
-  therefore we tend to say ``Fail to reject $H_0$'' rather than
-  accepting $H_0$
-* Statistical significance is no the same as scientific
-  significance
-* The region of TS values for which you reject $H_0$ is called the
-  rejection region
-
----
-## More notes
-* The $Z$ test requires the assumptions of the CLT and for $n$ to be large enough
-  for it to apply
-* If $n$ is small, then a Gossett's $T$ test is performed exactly in the same way,
-  with the normal quantiles replaced by the appropriate Student's $T$ quantiles and
-  $n-1$ df
-* The probability of rejecting the null hypothesis when it is false is called *power*
-* Power is a used a lot to calculate sample sizes for experiments
-
----
-## Example reconsidered
-- Consider our example again. Suppose that $n= 16$ (rather than
-$100$). Then consider that
-$$
-.05 = P\left(\frac{\bar X - 30}{s / \sqrt{16}} \geq t_{1-\alpha, 15} ~|~ \mu = 30 \right)
-$$
-- So that our test statistic is now $\sqrt{16}(32 - 30) / 10 = 0.8 $, while the critical value is $t_{1-\alpha, 15} = 1.75$. 
-- We now fail to reject.
-
----
-## Two sided tests
-* Suppose that we would reject the null hypothesis if in fact the 
-  mean was too large or too small
-* That is, we want to test the alternative $H_a : \mu \neq 30$
-  (doesn't make a lot of sense in our setting)
-* Then note
-$$
- \alpha = P\left(\left. \left|\frac{\bar X - 30}{s /\sqrt{16}}\right| > t_{1-\alpha/2,15} ~\right|~ \mu = 30\right)
-$$
-* That is we will reject if the test statistic, $0.8$, is either
-  too large or too small, but the critical value is calculated using
-  $\alpha / 2$
-* In our example the critical value is $2.13$, so we fail to reject.
-
----
-## T test in R
-```{r}
-library(UsingR); data(father.son)
-t.test(father.son$sheight - father.son$fheight)
-```
-
----
-## Connections with confidence intervals
-* Consider testing $H_0: \mu = \mu_0$ versus $H_a: \mu \neq \mu_0$
-* Take the set of all possible values for which you fail to reject $H_0$, this set is a $(1-\alpha)100\%$ confidence interval for $\mu$
-* The same works in reverse; if a $(1-\alpha)100\%$ interval
-  contains $\mu_0$, then we *fail  to* reject $H_0$
-
----
-## Exact binomial test
-- Recall this problem, *Suppose a friend has $8$ children, $7$ of which are girls and none are twins*
-- Perform the relevant hypothesis test. $H_0 : p = 0.5$ $H_a : p > 0.5$
-  - What is the relevant rejection region so that the probability of rejecting is (less than) 5%?
-  
-Rejection region | Type I error rate |
----|---|
-[0 : 8] | `r pbinom(-1, size = 8, p = .5, lower.tail = FALSE)`
-[1 : 8] | `r pbinom( 0, size = 8, p = .5, lower.tail = FALSE)`
-[2 : 8] | `r pbinom( 1, size = 8, p = .5, lower.tail = FALSE)`
-[3 : 8] | `r pbinom( 2, size = 8, p = .5, lower.tail = FALSE)`
-[4 : 8] | `r pbinom( 3, size = 8, p = .5, lower.tail = FALSE)`
-[5 : 8] | `r pbinom( 4, size = 8, p = .5, lower.tail = FALSE)`
-[6 : 8] | `r pbinom( 5, size = 8, p = .5, lower.tail = FALSE)`
-[7 : 8] | `r pbinom( 6, size = 8, p = .5, lower.tail = FALSE)`
-[8 : 8] | `r pbinom( 7, size = 8, p = .5, lower.tail = FALSE)`
-
----
-## Notes
-* It's impossible to get an exact 5% level test for this case due to the discreteness of the binomial. 
-  * The closest is the rejection region [7 : 8]
-  * Any alpha level lower than `r 1 / 2 ^8` is not attainable.
-* For larger sample sizes, we could do a normal approximation, but you already knew this.
-* Two sided test isn't obvious. 
-  * Given a way to do two sided tests, we could take the set of values of $p_0$ for which we fail to reject to get an exact binomial confidence interval (called the Clopper/Pearson interval, BTW)
-* For these problems, people always create a P-value (next lecture) rather than computing the rejection region.
-
-
+---
+title       : Hypothesis testing
+subtitle    : Statistical Inference
+author      : Brian Caffo, Jeff Leek, Roger Peng
+job         : Johns Hopkins Bloomberg School of Public Health
+logo        : bloomberg_shield.png
+framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
+highlighter : highlight.js  # {highlight.js, prettify, highlight}
+hitheme     : tomorrow      # 
+url:
+  lib: ../../librariesNew
+  assets: ../../assets
+widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
+mode        : selfcontained # {standalone, draft}
+---
+
+## Hypothesis testing
+* Hypothesis testing is concerned with making decisions using data
+* A null hypothesis is specified that represents the status quo,
+  usually labeled $H_0$
+* The null hypothesis is assumed true and statistical evidence is required
+  to reject it in favor of a research or alternative hypothesis 
+
+---
+## Example
+* A respiratory disturbance index of more than $30$ events / hour, say, is 
+  considered evidence of severe sleep disordered breathing (SDB).
+* Suppose that in a sample of $100$ overweight subjects with other
+  risk factors for sleep disordered breathing at a sleep clinic, the
+  mean RDI was $32$ events / hour with a standard deviation of $10$ events / hour.
+* We might want to test the hypothesis that 
+  * $H_0 : \mu = 30$
+  * $H_a : \mu > 30$
+  * where $\mu$ is the population mean RDI.
+
+---
+## Hypothesis testing
+* The alternative hypotheses are typically of the form $<$, $>$ or $\neq$
+* Note that there are four possible outcomes of our statistical decision process
+
+Truth | Decide | Result |
+---|---|---|
+$H_0$ | $H_0$ | Correctly accept null |
+$H_0$ | $H_a$ | Type I error |
+$H_a$ | $H_a$ | Correctly reject null |
+$H_a$ | $H_0$ | Type II error |
+
+---
+## Discussion
+* Consider a court of law; the null hypothesis is that the
+  defendant is innocent
+* We require evidence to reject the null hypothesis (convict)
+* If we require little evidence, then we would increase the
+  percentage of innocent people convicted (type I errors); however we
+  would also increase the percentage of guilty people convicted
+  (correctly rejecting the null)
+* If we require a lot of evidence, then we increase the
+  percentage of innocent people let free (correctly accepting the
+  null) while we would also increase the percentage of guilty people
+  let free (type II errors)
+
+---
+## Example
+* Consider our example again
+* A reasonable strategy would reject the null hypothesis if
+  $\bar X$ was larger than some constant, say $C$
+* Typically, $C$ is chosen so that the probability of a Type I
+  error, $\alpha$, is $.05$ (or some other relevant constant)
+* $\alpha$ = Type I error rate = Probability of rejecting the null hypothesis when, in fact, the null hypothesis is correct
+
+---
+## Example continued
+
+
+$$
+\begin{align}
+0.05  & =  P\left(\bar X \geq C ~|~ \mu = 30 \right) \\
+      & =  P\left(\frac{\bar X - 30}{10 / \sqrt{100}} \geq \frac{C - 30}{10/\sqrt{100}} ~|~ \mu = 30\right) \\
+      & =  P\left(Z \geq \frac{C - 30}{1}\right) \\
+\end{align}
+$$
+
+* Hence $(C - 30) / 1 = 1.645$ implying $C = 31.645$
+* Since our mean is $32$ we reject the null hypothesis
+
+---
+## Discussion
+* In general we don't convert $C$ back to the original scale
+* We would just reject because the Z-score, which is how many
+  standard errors the sample mean is above the hypothesized mean,
+  $$
+  \frac{32 - 30}{10 / \sqrt{100}} = 2
+  $$
+  is greater than $1.645$
+* Or, whenever $\sqrt{n} (\bar X - \mu_0) / s > Z_{1-\alpha}$
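+
+The decision rule can be checked numerically. This is a small sketch using the RDI example values from the earlier slides; the variable names (`z`, `z_crit`, `cutoff`) are assumptions introduced here, with `cutoff` playing the role of $C$.
+
+```r
+# Values from the RDI example
+n <- 100; xbar <- 32; s <- 10; mu0 <- 30; alpha <- 0.05
+
+# Z statistic: number of standard errors xbar lies above mu0
+z <- sqrt(n) * (xbar - mu0) / s
+
+# One-sided critical value, and the equivalent cutoff on the original scale
+z_crit <- qnorm(1 - alpha)
+cutoff <- mu0 + z_crit * s / sqrt(n)
+
+z > z_crit      # TRUE means reject H_0 on the Z scale
+xbar > cutoff   # same decision on the original scale
+```
+
+Both comparisons give the same answer by construction, which is why converting $C$ back to the original scale is unnecessary.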
+
+---
+## General rules
+* The $Z$ test for $H_0:\mu = \mu_0$ versus 
+  * $H_1: \mu < \mu_0$
+  * $H_2: \mu \neq \mu_0$
+  * $H_3: \mu > \mu_0$ 
+* Test statistic $ TS = \frac{\bar{X} - \mu_0}{S / \sqrt{n}} $
+* Reject the null hypothesis when 
+  * $TS \leq -Z_{1 - \alpha}$
+  * $|TS| \geq Z_{1 - \alpha / 2}$
+  * $TS \geq Z_{1 - \alpha}$
+
+---
+## Notes
+* We have fixed $\alpha$ to be low, so if we reject $H_0$, either
+  our model is wrong or there is a low probability that we have made
+  an error
+* We have not fixed the probability of a type II error, $\beta$;
+  therefore we tend to say "Fail to reject $H_0$" rather than
+  accepting $H_0$
+* Statistical significance is not the same as scientific
+  significance
+* The region of TS values for which you reject $H_0$ is called the
+  rejection region
+
+---
+## More notes
+* The $Z$ test requires the assumptions of the CLT and for $n$ to be large enough
+  for it to apply
+* If $n$ is small, then Gosset's $T$ test is performed in exactly the same way,
+  with the normal quantiles replaced by the appropriate Student's $T$ quantiles and
+  $n-1$ df
+* The probability of rejecting the null hypothesis when it is false is called *power*
+* Power is used a lot to calculate sample sizes for experiments
+
+---
+## Example reconsidered
+- Consider our example again. Suppose that $n= 16$ (rather than
+$100$). Then consider that
+$$
+.05 = P\left(\frac{\bar X - 30}{s / \sqrt{16}} \geq t_{1-\alpha, 15} ~|~ \mu = 30 \right)
+$$
+- So that our test statistic is now $\sqrt{16}(32 - 30) / 10 = 0.8 $, while the critical value is $t_{1-\alpha, 15} = 1.75$. 
+- We now fail to reject.
+
+---
+## Two sided tests
+* Suppose that we would reject the null hypothesis if in fact the 
+  mean was too large or too small
+* That is, we want to test the alternative $H_a : \mu \neq 30$
+  (doesn't make a lot of sense in our setting)
+* Then note
+$$
+ \alpha = P\left(\left. \left|\frac{\bar X - 30}{s /\sqrt{16}}\right| > t_{1-\alpha/2,15} ~\right|~ \mu = 30\right)
+$$
+* That is, we will reject if the test statistic, $0.8$, is either
+  too large or too small, but the critical value is calculated using
+  $\alpha / 2$
+* In our example the critical value is $2.13$, so we fail to reject.
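+
+The one sided and two sided $t$ decisions for the $n = 16$ version of the example can be reproduced as follows. This is an illustrative sketch; the names `ts`, `t_one` and `t_two` are assumptions introduced here.
+
+```r
+# Same example with n = 16 instead of 100
+n <- 16; xbar <- 32; s <- 10; mu0 <- 30; alpha <- 0.05
+
+ts    <- sqrt(n) * (xbar - mu0) / s       # test statistic
+t_one <- qt(1 - alpha, df = n - 1)        # one sided critical value
+t_two <- qt(1 - alpha / 2, df = n - 1)    # two sided critical value
+
+ts > t_one        # FALSE: fail to reject against mu > 30
+abs(ts) > t_two   # FALSE: fail to reject against mu != 30
+```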
+
+---
+## T test in R
+```{r}
+library(UsingR); data(father.son)
+t.test(father.son$sheight - father.son$fheight)
+```
+
+---
+## Connections with confidence intervals
+* Consider testing $H_0: \mu = \mu_0$ versus $H_a: \mu \neq \mu_0$
+* Take the set of all possible values for which you fail to reject $H_0$; this set is a $(1-\alpha)100\%$ confidence interval for $\mu$
+* The same works in reverse; if a $(1-\alpha)100\%$ interval
+  contains $\mu_0$, then we *fail to* reject $H_0$
+
+---
+## Exact binomial test
+- Recall this problem, *Suppose a friend has $8$ children, $7$ of which are girls and none are twins*
+- Perform the relevant hypothesis test. $H_0 : p = 0.5$ $H_a : p > 0.5$
+  - What is the relevant rejection region so that the probability of rejecting is (less than) 5%?
+  
+Rejection region | Type I error rate |
+---|---|
+[0 : 8] | `r pbinom(-1, size = 8, p = .5, lower.tail = FALSE)`
+[1 : 8] | `r pbinom( 0, size = 8, p = .5, lower.tail = FALSE)`
+[2 : 8] | `r pbinom( 1, size = 8, p = .5, lower.tail = FALSE)`
+[3 : 8] | `r pbinom( 2, size = 8, p = .5, lower.tail = FALSE)`
+[4 : 8] | `r pbinom( 3, size = 8, p = .5, lower.tail = FALSE)`
+[5 : 8] | `r pbinom( 4, size = 8, p = .5, lower.tail = FALSE)`
+[6 : 8] | `r pbinom( 5, size = 8, p = .5, lower.tail = FALSE)`
+[7 : 8] | `r pbinom( 6, size = 8, p = .5, lower.tail = FALSE)`
+[8 : 8] | `r pbinom( 7, size = 8, p = .5, lower.tail = FALSE)`
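+
+The table can also be built in one vectorized call, which makes it easy to pick out the smallest rejection region with a type I error rate of at most 5%. This is a sketch only; `k` and `alpha_k` are names introduced here for illustration.
+
+```r
+# Type I error rate of the rejection region [k : 8] under H_0: p = 0.5
+k <- 0:8
+alpha_k <- pbinom(k - 1, size = 8, prob = 0.5, lower.tail = FALSE)
+data.frame(region = paste0("[", k, " : 8]"), type_I_error = alpha_k)
+
+# Smallest k whose rejection region has type I error rate at most 5%
+min(k[alpha_k <= 0.05])
+```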
+
+---
+## Notes
+* It's impossible to get an exact 5% level test for this case due to the discreteness of the binomial. 
+  * The closest is the rejection region [7 : 8]
+  * Any alpha level lower than `r 1 / 2 ^8` is not attainable.
+* For larger sample sizes, we could do a normal approximation, but you already knew this.
+* A two sided test isn't obvious. 
+  * Given a way to do two sided tests, we could take the set of values of $p_0$ for which we fail to reject to get an exact binomial confidence interval (called the Clopper/Pearson interval, BTW)
+* For these problems, people usually compute a P-value (next lecture) rather than computing the rejection region.
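+
+In practice, base R's `binom.test` handles these calculations. The sketch below applies it to the example of $7$ girls out of $8$ children; its documentation states that the reported confidence interval follows the Clopper and Pearson procedure.
+
+```{r}
+# One sided exact test of H_0: p = 0.5 versus H_a: p > 0.5
+one_sided <- binom.test(7, 8, p = 0.5, alternative = "greater")
+pval <- one_sided$p.value
+pval
+# Default two sided call; conf.int is the exact binomial interval
+binom.test(7, 8, p = 0.5)$conf.int
+```
+
+The one sided P-value equals the Type I error rate of the region $[7 : 8]$ in the table from the previous slide.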
+
+
diff --git a/06_StatisticalInference/03_02_HypothesisTesting/index.html b/06_StatisticalInference/03_02_HypothesisTesting/index.html
index e8cc58b1c..a855fe1a6 100644
--- a/06_StatisticalInference/03_02_HypothesisTesting/index.html
+++ b/06_StatisticalInference/03_02_HypothesisTesting/index.html
@@ -1,582 +1,583 @@
-<!DOCTYPE html>
-<html>
-<head>
-  <title>Hypothesis testing</title>
-  <meta charset="utf-8">
-  <meta name="description" content="Hypothesis testing">
-  <meta name="author" content="Brian Caffo, Jeff Leek, Roger Peng">
-  <meta name="generator" content="slidify" />
-  <meta name="apple-mobile-web-app-capable" content="yes">
-  <meta http-equiv="X-UA-Compatible" content="chrome=1">
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/default.css" media="all" >
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/phone.css" 
-    media="only screen and (max-device-width: 480px)" >
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/slidify.css" >
-  <link rel="stylesheet" href="../../librariesNew/highlighters/highlight.js/css/tomorrow.css" />
-  <base target="_blank"> <!-- This amazingness opens all links in a new tab. -->  
-  
-  <!-- Grab CDN jQuery, fall back to local if offline -->
-  <script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.7.min.js"></script>
-  <script>window.jQuery || document.write('<script src="../../librariesNew/widgets/quiz/js/jquery.js"><\/script>')</script> 
-  <script data-main="../../librariesNew/frameworks/io2012/js/slides" 
-    src="../../librariesNew/frameworks/io2012/js/require-1.0.8.min.js">
-  </script>
-  
-  
-
-</head>
-<body style="opacity: 0">
-  <slides class="layout-widescreen">
-    
-    <!-- LOGO SLIDE -->
-        <slide class="title-slide segue nobackground">
-  <aside class="gdbar">
-    <img src="../../assets/img/bloomberg_shield.png">
-  </aside>
-  <hgroup class="auto-fadein">
-    <h1>Hypothesis testing</h1>
-    <h2>Statistical Inference</h2>
-    <p>Brian Caffo, Jeff Leek, Roger Peng<br/>Johns Hopkins Bloomberg School of Public Health</p>
-  </hgroup>
-  <article></article>  
-</slide>
-    
-
-    <!-- SLIDES -->
-    <slide class="" id="slide-1" style="background:;">
-  <hgroup>
-    <h2>Hypothesis testing</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Hypothesis testing is concerned with making decisions using data</li>
-<li>A null hypothesis is specified that represents the status quo,
-usually labeled \(H_0\)</li>
-<li>The null hypothesis is assumed true and statistical evidence is required
-to reject it in favor of a research or alternative hypothesis </li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-2" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>A respiratory disturbance index of more than \(30\) events / hour, say, is 
-considered evidence of severe sleep disordered breathing (SDB).</li>
-<li>Suppose that in a sample of \(100\) overweight subjects with other
-risk factors for sleep disordered breathing at a sleep clinic, the
-mean RDI was \(32\) events / hour with a standard deviation of \(10\) events / hour.</li>
-<li>We might want to test the hypothesis that 
-
-<ul>
-<li>\(H_0 : \mu = 30\)</li>
-<li>\(H_a : \mu > 30\)</li>
-<li>where \(\mu\) is the population mean RDI.</li>
-</ul></li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-3" style="background:;">
-  <hgroup>
-    <h2>Hypothesis testing</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>The alternative hypotheses are typically of the form \(<\), \(>\) or \(\neq\)</li>
-<li>Note that there are four possible outcomes of our statistical decision process</li>
-</ul>
-
-<table><thead>
-<tr>
-<th>Truth</th>
-<th>Decide</th>
-<th>Result</th>
-</tr>
-</thead><tbody>
-<tr>
-<td>\(H_0\)</td>
-<td>\(H_0\)</td>
-<td>Correctly accept null</td>
-</tr>
-<tr>
-<td>\(H_0\)</td>
-<td>\(H_a\)</td>
-<td>Type I error</td>
-</tr>
-<tr>
-<td>\(H_a\)</td>
-<td>\(H_a\)</td>
-<td>Correctly reject null</td>
-</tr>
-<tr>
-<td>\(H_a\)</td>
-<td>\(H_0\)</td>
-<td>Type II error</td>
-</tr>
-</tbody></table>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-4" style="background:;">
-  <hgroup>
-    <h2>Discussion</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Consider a court of law; the null hypothesis is that the
-defendant is innocent</li>
-<li>We require evidence to reject the null hypothesis (convict)</li>
-<li>If we require little evidence, then we would increase the
-percentage of innocent people convicted (type I errors); however we
-would also increase the percentage of guilty people convicted
-(correctly rejecting the null)</li>
-<li>If we require a lot of evidence, then we increase the the
-percentage of innocent people let free (correctly accepting the
-null) while we would also increase the percentage of guilty people
-let free (type II errors)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-5" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Consider our example again</li>
-<li>A reasonable strategy would reject the null hypothesis if
-\(\bar X\) was larger than some constant, say \(C\)</li>
-<li>Typically, \(C\) is chosen so that the probability of a Type I
-error, \(\alpha\), is \(.05\) (or some other relevant constant)</li>
-<li>\(\alpha\) = Type I error rate = Probability of rejecting the null hypothesis when, in fact, the null hypothesis is correct</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-6" style="background:;">
-  <hgroup>
-    <h2>Example continued</h2>
-  </hgroup>
-  <article data-timings="">
-    <p>\[
-\begin{align}
-0.05  & =  P\left(\bar X \geq C ~|~ \mu = 30 \right) \\
-      & =  P\left(\frac{\bar X - 30}{10 / \sqrt{100}} \geq \frac{C - 30}{10/\sqrt{100}} ~|~ \mu = 30\right) \\
-      & =  P\left(Z \geq \frac{C - 30}{1}\right) \\
-\end{align}
-\]</p>
-
-<ul>
-<li>Hence \((C - 30) / 1 = 1.645\) implying \(C = 31.645\)</li>
-<li>Since our mean is \(32\) we reject the null hypothesis</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-7" style="background:;">
-  <hgroup>
-    <h2>Discussion</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>In general we don&#39;t convert \(C\) back to the original scale</li>
-<li>We would just reject because the Z-score; which is how many
-standard errors the sample mean is above the hypothesized mean
-\[
-\frac{32 - 30}{10 / \sqrt{100}} = 2
-\]
-is greater than \(1.645\)</li>
-<li>Or, whenever \(\sqrt{n} (\bar X - \mu_0) / s > Z_{1-\alpha}\)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-8" style="background:;">
-  <hgroup>
-    <h2>General rules</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>The \(Z\) test for \(H_0:\mu = \mu_0\) versus 
-
-<ul>
-<li>\(H_1: \mu < \mu_0\)</li>
-<li>\(H_2: \mu \neq \mu_0\)</li>
-<li>\(H_3: \mu > \mu_0\) </li>
-</ul></li>
-<li>Test statistic $ TS = \frac{\bar{X} - \mu_0}{S / \sqrt{n}} $</li>
-<li>Reject the null hypothesis when 
-
-<ul>
-<li>\(TS \leq -Z_{1 - \alpha}\)</li>
-<li>\(|TS| \geq Z_{1 - \alpha / 2}\)</li>
-<li>\(TS \geq Z_{1 - \alpha}\)</li>
-</ul></li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-9" style="background:;">
-  <hgroup>
-    <h2>Notes</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>We have fixed \(\alpha\) to be low, so if we reject \(H_0\) (either
-our model is wrong) or there is a low probability that we have made
-an error</li>
-<li>We have not fixed the probability of a type II error, \(\beta\);
-therefore we tend to say ``Fail to reject \(H_0\)&#39;&#39; rather than
-accepting \(H_0\)</li>
-<li>Statistical significance is no the same as scientific
-significance</li>
-<li>The region of TS values for which you reject \(H_0\) is called the
-rejection region</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-10" style="background:;">
-  <hgroup>
-    <h2>More notes</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>The \(Z\) test requires the assumptions of the CLT and for \(n\) to be large enough
-for it to apply</li>
-<li>If \(n\) is small, then a Gossett&#39;s \(T\) test is performed exactly in the same way,
-with the normal quantiles replaced by the appropriate Student&#39;s \(T\) quantiles and
-\(n-1\) df</li>
-<li>The probability of rejecting the null hypothesis when it is false is called <em>power</em></li>
-<li>Power is a used a lot to calculate sample sizes for experiments</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-11" style="background:;">
-  <hgroup>
-    <h2>Example reconsidered</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Consider our example again. Suppose that \(n= 16\) (rather than
-\(100\)). Then consider that
-\[
-.05 = P\left(\frac{\bar X - 30}{s / \sqrt{16}} \geq t_{1-\alpha, 15} ~|~ \mu = 30 \right)
-\]</li>
-<li>So that our test statistic is now $\sqrt{16}(32 - 30) / 10 = 0.8 $, while the critical value is \(t_{1-\alpha, 15} = 1.75\). </li>
-<li>We now fail to reject.</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-12" style="background:;">
-  <hgroup>
-    <h2>Two sided tests</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Suppose that we would reject the null hypothesis if in fact the 
-mean was too large or too small</li>
-<li>That is, we want to test the alternative \(H_a : \mu \neq 30\)
-(doesn&#39;t make a lot of sense in our setting)</li>
-<li>Then note
-\[
-\alpha = P\left(\left. \left|\frac{\bar X - 30}{s /\sqrt{16}}\right| > t_{1-\alpha/2,15} ~\right|~ \mu = 30\right)
-\]</li>
-<li>That is we will reject if the test statistic, \(0.8\), is either
-too large or too small, but the critical value is calculated using
-\(\alpha / 2\)</li>
-<li>In our example the critical value is \(2.13\), so we fail to reject.</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-13" style="background:;">
-  <hgroup>
-    <h2>T test in R</h2>
-  </hgroup>
-  <article data-timings="">
-    <pre><code class="r">library(UsingR); data(father.son)
-t.test(father.son$sheight - father.son$fheight)
-</code></pre>
-
-<pre><code>
-    One Sample t-test
-
-data:  father.son$sheight - father.son$fheight
-t = 11.79, df = 1077, p-value &lt; 2.2e-16
-alternative hypothesis: true mean is not equal to 0
-95 percent confidence interval:
- 0.831 1.163
-sample estimates:
-mean of x 
-    0.997 
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-14" style="background:;">
-  <hgroup>
-    <h2>Connections with confidence intervals</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Consider testing \(H_0: \mu = \mu_0\) versus \(H_a: \mu \neq \mu_0\)</li>
-<li>Take the set of all possible values for which you fail to reject \(H_0\), this set is a \((1-\alpha)100\%\) confidence interval for \(\mu\)</li>
-<li>The same works in reverse; if a \((1-\alpha)100\%\) interval
-contains \(\mu_0\), then we <em>fail  to</em> reject \(H_0\)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-15" style="background:;">
-  <hgroup>
-    <h2>Exact binomial test</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Recall this problem, <em>Suppose a friend has \(8\) children, \(7\) of which are girls and none are twins</em></li>
-<li>Perform the relevant hypothesis test. \(H_0 : p = 0.5\) \(H_a : p > 0.5\)
-
-<ul>
-<li>What is the relevant rejection region so that the probability of rejecting is (less than) 5%?</li>
-</ul></li>
-</ul>
-
-<table><thead>
-<tr>
-<th>Rejection region</th>
-<th>Type I error rate</th>
-</tr>
-</thead><tbody>
-<tr>
-<td>[0 : 8]</td>
-<td>1</td>
-</tr>
-<tr>
-<td>[1 : 8]</td>
-<td>0.9961</td>
-</tr>
-<tr>
-<td>[2 : 8]</td>
-<td>0.9648</td>
-</tr>
-<tr>
-<td>[3 : 8]</td>
-<td>0.8555</td>
-</tr>
-<tr>
-<td>[4 : 8]</td>
-<td>0.6367</td>
-</tr>
-<tr>
-<td>[5 : 8]</td>
-<td>0.3633</td>
-</tr>
-<tr>
-<td>[6 : 8]</td>
-<td>0.1445</td>
-</tr>
-<tr>
-<td>[7 : 8]</td>
-<td>0.0352</td>
-</tr>
-<tr>
-<td>[8 : 8]</td>
-<td>0.0039</td>
-</tr>
-</tbody></table>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-16" style="background:;">
-  <hgroup>
-    <h2>Notes</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>It&#39;s impossible to get an exact 5% level test for this case due to the discreteness of the binomial. 
-
-<ul>
-<li>The closest is the rejection region [7 : 8]</li>
-<li>Any alpha level lower than 0.0039 is not attainable.</li>
-</ul></li>
-<li>For larger sample sizes, we could do a normal approximation, but you already knew this.</li>
-<li>Two sided test isn&#39;t obvious. 
-
-<ul>
-<li>Given a way to do two sided tests, we could take the set of values of \(p_0\) for which we fail to reject to get an exact binomial confidence interval (called the Clopper/Pearson interval, BTW)</li>
-</ul></li>
-<li>For these problems, people always create a P-value (next lecture) rather than computing the rejection region.</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-    <slide class="backdrop"></slide>
-  </slides>
-  <div class="pagination pagination-small" id='io2012-ptoc' style="display:none;">
-    <ul>
-      <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=1 title='Hypothesis testing'>
-         1
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=2 title='Example'>
-         2
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=3 title='Hypothesis testing'>
-         3
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=4 title='Discussion'>
-         4
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=5 title='Example'>
-         5
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=6 title='Example continued'>
-         6
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=7 title='Discussion'>
-         7
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=8 title='General rules'>
-         8
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=9 title='Notes'>
-         9
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=10 title='More notes'>
-         10
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=11 title='Example reconsidered'>
-         11
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=12 title='Two sided tests'>
-         12
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=13 title='T test in R'>
-         13
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=14 title='Connections with confidence intervals'>
-         14
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=15 title='Exact binomial test'>
-         15
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=16 title='Notes'>
-         16
-      </a>
-    </li>
-  </ul>
-  </div>  <!--[if IE]>
-    <script 
-      src="http://ajax.googleapis.com/ajax/libs/chrome-frame/1/CFInstall.min.js">  
-    </script>
-    <script>CFInstall.check({mode: 'overlay'});</script>
-  <![endif]-->
-</body>
-  <!-- Load Javascripts for Widgets -->
-  
-  <!-- MathJax: Fall back to local if CDN offline but local image fonts are not supported (saves >100MB) -->
-  <script type="text/x-mathjax-config">
-    MathJax.Hub.Config({
-      tex2jax: {
-        inlineMath: [['$','$'], ['\\(','\\)']],
-        processEscapes: true
-      }
-    });
-  </script>
-  <script type="text/javascript" src="http://cdn.mathjax.org/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
-  <!-- <script src="https://c328740.ssl.cf1.rackcdn.com/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
-  </script> -->
-  <script>window.MathJax || document.write('<script type="text/x-mathjax-config">MathJax.Hub.Config({"HTML-CSS":{imageFont:null}});<\/script><script src="../../librariesNew/widgets/mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"><\/script>')
-</script>
-<!-- LOAD HIGHLIGHTER JS FILES -->
-  <script src="../../librariesNew/highlighters/highlight.js/highlight.pack.js"></script>
-  <script>hljs.initHighlightingOnLoad();</script>
-  <!-- DONE LOADING HIGHLIGHTER JS FILES -->
-   
+<!DOCTYPE html>
+<html>
+<head>
+  <title>Hypothesis testing</title>
+  <meta charset="utf-8">
+  <meta name="description" content="Hypothesis testing">
+  <meta name="author" content="Brian Caffo, Jeff Leek, Roger Peng">
+  <meta name="generator" content="slidify" />
+  <meta name="apple-mobile-web-app-capable" content="yes">
+  <meta http-equiv="X-UA-Compatible" content="chrome=1">
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/default.css" media="all" >
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/phone.css" 
+    media="only screen and (max-device-width: 480px)" >
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/slidify.css" >
+  <link rel="stylesheet" href="../../librariesNew/highlighters/highlight.js/css/tomorrow.css" />
+  <base target="_blank"> <!-- This amazingness opens all links in a new tab. -->  
+  
+  <!-- Grab CDN jQuery, fall back to local if offline -->
+  <script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.7.min.js"></script>
+  <script>window.jQuery || document.write('<script src="../../librariesNew/widgets/quiz/js/jquery.js"><\/script>')</script> 
+  <script data-main="../../librariesNew/frameworks/io2012/js/slides" 
+    src="../../librariesNew/frameworks/io2012/js/require-1.0.8.min.js">
+  </script>
+  
+  
+
+</head>
+<body style="opacity: 0">
+  <slides class="layout-widescreen">
+    
+    <!-- LOGO SLIDE -->
+        <slide class="title-slide segue nobackground">
+  <aside class="gdbar">
+    <img src="../../assets/img/bloomberg_shield.png">
+  </aside>
+  <hgroup class="auto-fadein">
+    <h1>Hypothesis testing</h1>
+    <h2>Statistical Inference</h2>
+    <p>Brian Caffo, Jeff Leek, Roger Peng<br/>Johns Hopkins Bloomberg School of Public Health</p>
+  </hgroup>
+  <article></article>  
+</slide>
+    
+
+    <!-- SLIDES -->
+    <slide class="" id="slide-1" style="background:;">
+  <hgroup>
+    <h2>Hypothesis testing</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Hypothesis testing is concerned with making decisions using data</li>
+<li>A null hypothesis is specified that represents the status quo,
+usually labeled \(H_0\)</li>
+<li>The null hypothesis is assumed true and statistical evidence is required
+to reject it in favor of a research or alternative hypothesis </li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-2" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>A respiratory disturbance index of more than \(30\) events / hour, say, is 
+considered evidence of severe sleep disordered breathing (SDB).</li>
+<li>Suppose that in a sample of \(100\) overweight subjects with other
+risk factors for sleep disordered breathing at a sleep clinic, the
+mean RDI was \(32\) events / hour with a standard deviation of \(10\) events / hour.</li>
+<li>We might want to test the hypothesis that 
+
+<ul>
+<li>\(H_0 : \mu = 30\)</li>
+<li>\(H_a : \mu > 30\)</li>
+<li>where \(\mu\) is the population mean RDI.</li>
+</ul></li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-3" style="background:;">
+  <hgroup>
+    <h2>Hypothesis testing</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>The alternative hypotheses are typically of the form \(<\), \(>\) or \(\neq\)</li>
+<li>Note that there are four possible outcomes of our statistical decision process</li>
+</ul>
+
+<table><thead>
+<tr>
+<th>Truth</th>
+<th>Decide</th>
+<th>Result</th>
+</tr>
+</thead><tbody>
+<tr>
+<td>\(H_0\)</td>
+<td>\(H_0\)</td>
+<td>Correctly accept null</td>
+</tr>
+<tr>
+<td>\(H_0\)</td>
+<td>\(H_a\)</td>
+<td>Type I error</td>
+</tr>
+<tr>
+<td>\(H_a\)</td>
+<td>\(H_a\)</td>
+<td>Correctly reject null</td>
+</tr>
+<tr>
+<td>\(H_a\)</td>
+<td>\(H_0\)</td>
+<td>Type II error</td>
+</tr>
+</tbody></table>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-4" style="background:;">
+  <hgroup>
+    <h2>Discussion</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Consider a court of law; the null hypothesis is that the
+defendant is innocent</li>
+<li>We require evidence to reject the null hypothesis (convict)</li>
+<li>If we require little evidence, then we would increase the
+percentage of innocent people convicted (type I errors); however we
+would also increase the percentage of guilty people convicted
+(correctly rejecting the null)</li>
+<li>If we require a lot of evidence, then we increase the
+percentage of innocent people let free (correctly accepting the
+null) while we would also increase the percentage of guilty people
+let free (type II errors)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-5" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Consider our example again</li>
+<li>A reasonable strategy would reject the null hypothesis if
+\(\bar X\) was larger than some constant, say \(C\)</li>
+<li>Typically, \(C\) is chosen so that the probability of a Type I
+error, \(\alpha\), is \(.05\) (or some other relevant constant)</li>
+<li>\(\alpha\) = Type I error rate = Probability of rejecting the null hypothesis when, in fact, the null hypothesis is correct</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-6" style="background:;">
+  <hgroup>
+    <h2>Example continued</h2>
+  </hgroup>
+  <article data-timings="">
+    <p>\[
+\begin{align}
+0.05  & =  P\left(\bar X \geq C ~|~ \mu = 30 \right) \\
+      & =  P\left(\frac{\bar X - 30}{10 / \sqrt{100}} \geq \frac{C - 30}{10/\sqrt{100}} ~|~ \mu = 30\right) \\
+      & =  P\left(Z \geq \frac{C - 30}{1}\right) \\
+\end{align}
+\]</p>
+
+<ul>
+<li>Hence \((C - 30) / 1 = 1.645\) implying \(C = 31.645\)</li>
+<li>Since our mean is \(32\) we reject the null hypothesis</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-7" style="background:;">
+  <hgroup>
+    <h2>Discussion</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>In general we don&#39;t convert \(C\) back to the original scale</li>
+<li>We would just reject because the Z-score, which is how many
+standard errors the sample mean is above the hypothesized mean,
+\[
+\frac{32 - 30}{10 / \sqrt{100}} = 2
+\]
+is greater than \(1.645\)</li>
+<li>Or, whenever \(\sqrt{n} (\bar X - \mu_0) / s > Z_{1-\alpha}\)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-8" style="background:;">
+  <hgroup>
+    <h2>General rules</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>The \(Z\) test for \(H_0:\mu = \mu_0\) versus 
+
+<ul>
+<li>\(H_1: \mu < \mu_0\)</li>
+<li>\(H_2: \mu \neq \mu_0\)</li>
+<li>\(H_3: \mu > \mu_0\) </li>
+</ul></li>
+<li>Test statistic \(TS = \frac{\bar{X} - \mu_0}{S / \sqrt{n}}\)</li>
+<li>Reject the null hypothesis when 
+
+<ul>
+<li>\(TS \leq -Z_{1 - \alpha}\)</li>
+<li>\(|TS| \geq Z_{1 - \alpha / 2}\)</li>
+<li>\(TS \geq Z_{1 - \alpha}\)</li>
+</ul></li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-9" style="background:;">
+  <hgroup>
+    <h2>Notes</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>We have fixed \(\alpha\) to be low, so if we reject \(H_0\), either
+our model is wrong or there is a low probability that we have made
+an error</li>
+<li>We have not fixed the probability of a type II error, \(\beta\);
+therefore we tend to say &quot;Fail to reject \(H_0\)&quot; rather than
+accepting \(H_0\)</li>
+<li>Statistical significance is not the same as scientific
+significance</li>
+<li>The region of TS values for which you reject \(H_0\) is called the
+rejection region</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-10" style="background:;">
+  <hgroup>
+    <h2>More notes</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>The \(Z\) test requires the assumptions of the CLT and for \(n\) to be large enough
+for it to apply</li>
+<li>If \(n\) is small, then Gosset&#39;s \(T\) test is performed in exactly the same way,
+with the normal quantiles replaced by the appropriate Student&#39;s \(T\) quantiles and
+\(n-1\) df</li>
+<li>The probability of rejecting the null hypothesis when it is false is called <em>power</em></li>
+<li>Power is used a lot to calculate sample sizes for experiments</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-11" style="background:;">
+  <hgroup>
+    <h2>Example reconsidered</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Consider our example again. Suppose that \(n= 16\) (rather than
+\(100\)). Then consider that
+\[
+.05 = P\left(\frac{\bar X - 30}{s / \sqrt{16}} \geq t_{1-\alpha, 15} ~|~ \mu = 30 \right)
+\]</li>
+<li>So that our test statistic is now \(\sqrt{16}(32 - 30) / 10 = 0.8\), while the critical value is \(t_{1-\alpha, 15} = 1.75\). </li>
+<li>We now fail to reject.</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-12" style="background:;">
+  <hgroup>
+    <h2>Two sided tests</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Suppose that we would reject the null hypothesis if in fact the 
+mean was too large or too small</li>
+<li>That is, we want to test the alternative \(H_a : \mu \neq 30\)
+(doesn&#39;t make a lot of sense in our setting)</li>
+<li>Then note
+\[
+\alpha = P\left(\left. \left|\frac{\bar X - 30}{s /\sqrt{16}}\right| > t_{1-\alpha/2,15} ~\right|~ \mu = 30\right)
+\]</li>
+<li>That is, we will reject if the test statistic, \(0.8\), is either
+too large or too small, but the critical value is calculated using
+\(\alpha / 2\)</li>
+<li>In our example the critical value is \(2.13\), so we fail to reject.</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-13" style="background:;">
+  <hgroup>
+    <h2>T test in R</h2>
+  </hgroup>
+  <article data-timings="">
+    <pre><code class="r">library(UsingR)
+data(father.son)
+t.test(father.son$sheight - father.son$fheight)
+</code></pre>
+
+<pre><code>## 
+##  One Sample t-test
+## 
+## data:  father.son$sheight - father.son$fheight
+## t = 11.79, df = 1077, p-value &lt; 2.2e-16
+## alternative hypothesis: true mean is not equal to 0
+## 95 percent confidence interval:
+##  0.831 1.163
+## sample estimates:
+## mean of x 
+##     0.997
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-14" style="background:;">
+  <hgroup>
+    <h2>Connections with confidence intervals</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Consider testing \(H_0: \mu = \mu_0\) versus \(H_a: \mu \neq \mu_0\)</li>
+<li>Take the set of all possible values of \(\mu_0\) for which you fail to reject \(H_0\); this set is a \((1-\alpha)100\%\) confidence interval for \(\mu\)</li>
+<li>The same works in reverse; if a \((1-\alpha)100\%\) interval
+contains \(\mu_0\), then we <em>fail to</em> reject \(H_0\)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-15" style="background:;">
+  <hgroup>
+    <h2>Exact binomial test</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Recall this problem: <em>Suppose a friend has \(8\) children, \(7\) of whom are girls and none of whom are twins</em></li>
+<li>Perform the relevant hypothesis test, \(H_0 : p = 0.5\) versus \(H_a : p > 0.5\)
+
+<ul>
+<li>What is the relevant rejection region so that the probability of rejecting under \(H_0\) is at most 5%?</li>
+</ul></li>
+</ul>
+
+<table><thead>
+<tr>
+<th>Rejection region</th>
+<th>Type I error rate</th>
+</tr>
+</thead><tbody>
+<tr>
+<td>[0 : 8]</td>
+<td>1</td>
+</tr>
+<tr>
+<td>[1 : 8]</td>
+<td>0.9961</td>
+</tr>
+<tr>
+<td>[2 : 8]</td>
+<td>0.9648</td>
+</tr>
+<tr>
+<td>[3 : 8]</td>
+<td>0.8555</td>
+</tr>
+<tr>
+<td>[4 : 8]</td>
+<td>0.6367</td>
+</tr>
+<tr>
+<td>[5 : 8]</td>
+<td>0.3633</td>
+</tr>
+<tr>
+<td>[6 : 8]</td>
+<td>0.1445</td>
+</tr>
+<tr>
+<td>[7 : 8]</td>
+<td>0.0352</td>
+</tr>
+<tr>
+<td>[8 : 8]</td>
+<td>0.0039</td>
+</tr>
+</tbody></table>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-16" style="background:;">
+  <hgroup>
+    <h2>Notes</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>It&#39;s impossible to get an exact 5% level test for this case due to the discreteness of the binomial. 
+
+<ul>
+<li>The closest is the rejection region [7 : 8]</li>
+<li>Any alpha level lower than 0.0039 is not attainable.</li>
+</ul></li>
+<li>For larger sample sizes, we could do a normal approximation, but you already knew this.</li>
+<li>A two sided test isn&#39;t obvious. 
+
+<ul>
+<li>Given a way to do two sided tests, we could take the set of values of \(p_0\) for which we fail to reject to get an exact binomial confidence interval (called the Clopper/Pearson interval, BTW)</li>
+</ul></li>
+<li>For these problems, people usually compute a P-value (next lecture) rather than computing the rejection region.</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+    <slide class="backdrop"></slide>
+  </slides>
+  <div class="pagination pagination-small" id='io2012-ptoc' style="display:none;">
+    <ul>
+      <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=1 title='Hypothesis testing'>
+         1
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=2 title='Example'>
+         2
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=3 title='Hypothesis testing'>
+         3
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=4 title='Discussion'>
+         4
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=5 title='Example'>
+         5
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=6 title='Example continued'>
+         6
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=7 title='Discussion'>
+         7
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=8 title='General rules'>
+         8
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=9 title='Notes'>
+         9
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=10 title='More notes'>
+         10
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=11 title='Example reconsidered'>
+         11
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=12 title='Two sided tests'>
+         12
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=13 title='T test in R'>
+         13
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=14 title='Connections with confidence intervals'>
+         14
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=15 title='Exact binomial test'>
+         15
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=16 title='Notes'>
+         16
+      </a>
+    </li>
+  </ul>
+  </div>  <!--[if IE]>
+    <script 
+      src="http://ajax.googleapis.com/ajax/libs/chrome-frame/1/CFInstall.min.js">  
+    </script>
+    <script>CFInstall.check({mode: 'overlay'});</script>
+  <![endif]-->
+</body>
+  <!-- Load Javascripts for Widgets -->
+  
+  <!-- MathJax: Fall back to local if CDN offline but local image fonts are not supported (saves >100MB) -->
+  <script type="text/x-mathjax-config">
+    MathJax.Hub.Config({
+      tex2jax: {
+        inlineMath: [['$','$'], ['\\(','\\)']],
+        processEscapes: true
+      }
+    });
+  </script>
+  <script type="text/javascript" src="http://cdn.mathjax.org/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
+  <!-- <script src="https://c328740.ssl.cf1.rackcdn.com/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
+  </script> -->
+  <script>window.MathJax || document.write('<script type="text/x-mathjax-config">MathJax.Hub.Config({"HTML-CSS":{imageFont:null}});<\/script><script src="../../librariesNew/widgets/mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"><\/script>')
+</script>
+<!-- LOAD HIGHLIGHTER JS FILES -->
+  <script src="../../librariesNew/highlighters/highlight.js/highlight.pack.js"></script>
+  <script>hljs.initHighlightingOnLoad();</script>
+  <!-- DONE LOADING HIGHLIGHTER JS FILES -->
+   
   </html>
\ No newline at end of file
diff --git a/06_StatisticalInference/03_02_HypothesisTesting/index.md b/06_StatisticalInference/03_02_HypothesisTesting/index.md
index 94bf98520..7f584af2c 100644
--- a/06_StatisticalInference/03_02_HypothesisTesting/index.md
+++ b/06_StatisticalInference/03_02_HypothesisTesting/index.md
@@ -1,217 +1,216 @@
----
-title       : Hypothesis testing
-subtitle    : Statistical Inference
-author      : Brian Caffo, Jeff Leek, Roger Peng
-job         : Johns Hopkins Bloomberg School of Public Health
-logo        : bloomberg_shield.png
-framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
-highlighter : highlight.js  # {highlight.js, prettify, highlight}
-hitheme     : tomorrow      # 
-url:
-  lib: ../../librariesNew
-  assets: ../../assets
-widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
-mode        : selfcontained # {standalone, draft}
----
-
-
-
-## Hypothesis testing
-* Hypothesis testing is concerned with making decisions using data
-* A null hypothesis is specified that represents the status quo,
-  usually labeled $H_0$
-* The null hypothesis is assumed true and statistical evidence is required
-  to reject it in favor of a research or alternative hypothesis 
-
----
-## Example
-* A respiratory disturbance index of more than $30$ events / hour, say, is 
-  considered evidence of severe sleep disordered breathing (SDB).
-* Suppose that in a sample of $100$ overweight subjects with other
-  risk factors for sleep disordered breathing at a sleep clinic, the
-  mean RDI was $32$ events / hour with a standard deviation of $10$ events / hour.
-* We might want to test the hypothesis that 
-  * $H_0 : \mu = 30$
-  * $H_a : \mu > 30$
-  * where $\mu$ is the population mean RDI.
-
----
-## Hypothesis testing
-* The alternative hypotheses are typically of the form $<$, $>$ or $\neq$
-* Note that there are four possible outcomes of our statistical decision process
-
-Truth | Decide | Result |
----|---|---|
-$H_0$ | $H_0$ | Correctly accept null |
-$H_0$ | $H_a$ | Type I error |
-$H_a$ | $H_a$ | Correctly reject null |
-$H_a$ | $H_0$ | Type II error |
-
----
-## Discussion
-* Consider a court of law; the null hypothesis is that the
-  defendant is innocent
-* We require evidence to reject the null hypothesis (convict)
-* If we require little evidence, then we would increase the
-  percentage of innocent people convicted (type I errors); however we
-  would also increase the percentage of guilty people convicted
-  (correctly rejecting the null)
-* If we require a lot of evidence, then we increase the the
-  percentage of innocent people let free (correctly accepting the
-  null) while we would also increase the percentage of guilty people
-  let free (type II errors)
-
----
-## Example
-* Consider our example again
-* A reasonable strategy would reject the null hypothesis if
-  $\bar X$ was larger than some constant, say $C$
-* Typically, $C$ is chosen so that the probability of a Type I
-  error, $\alpha$, is $.05$ (or some other relevant constant)
-* $\alpha$ = Type I error rate = Probability of rejecting the null hypothesis when, in fact, the null hypothesis is correct
-
----
-## Example continued
-
-
-$$
-\begin{align}
-0.05  & =  P\left(\bar X \geq C ~|~ \mu = 30 \right) \\
-      & =  P\left(\frac{\bar X - 30}{10 / \sqrt{100}} \geq \frac{C - 30}{10/\sqrt{100}} ~|~ \mu = 30\right) \\
-      & =  P\left(Z \geq \frac{C - 30}{1}\right) \\
-\end{align}
-$$
-
-* Hence $(C - 30) / 1 = 1.645$ implying $C = 31.645$
-* Since our mean is $32$ we reject the null hypothesis
-
----
-## Discussion
-* In general we don't convert $C$ back to the original scale
-* We would just reject because the Z-score; which is how many
-  standard errors the sample mean is above the hypothesized mean
-  $$
-  \frac{32 - 30}{10 / \sqrt{100}} = 2
-  $$
-  is greater than $1.645$
-* Or, whenever $\sqrt{n} (\bar X - \mu_0) / s > Z_{1-\alpha}$
-
----
-## General rules
-* The $Z$ test for $H_0:\mu = \mu_0$ versus 
-  * $H_1: \mu < \mu_0$
-  * $H_2: \mu \neq \mu_0$
-  * $H_3: \mu > \mu_0$ 
-* Test statistic $ TS = \frac{\bar{X} - \mu_0}{S / \sqrt{n}} $
-* Reject the null hypothesis when 
-  * $TS \leq -Z_{1 - \alpha}$
-  * $|TS| \geq Z_{1 - \alpha / 2}$
-  * $TS \geq Z_{1 - \alpha}$
-
----
-## Notes
-* We have fixed $\alpha$ to be low, so if we reject $H_0$ (either
-  our model is wrong) or there is a low probability that we have made
-  an error
-* We have not fixed the probability of a type II error, $\beta$;
-  therefore we tend to say ``Fail to reject $H_0$'' rather than
-  accepting $H_0$
-* Statistical significance is no the same as scientific
-  significance
-* The region of TS values for which you reject $H_0$ is called the
-  rejection region
-
----
-## More notes
-* The $Z$ test requires the assumptions of the CLT and for $n$ to be large enough
-  for it to apply
-* If $n$ is small, then a Gossett's $T$ test is performed exactly in the same way,
-  with the normal quantiles replaced by the appropriate Student's $T$ quantiles and
-  $n-1$ df
-* The probability of rejecting the null hypothesis when it is false is called *power*
-* Power is a used a lot to calculate sample sizes for experiments
-
----
-## Example reconsidered
-- Consider our example again. Suppose that $n= 16$ (rather than
-$100$). Then consider that
-$$
-.05 = P\left(\frac{\bar X - 30}{s / \sqrt{16}} \geq t_{1-\alpha, 15} ~|~ \mu = 30 \right)
-$$
-- So that our test statistic is now $\sqrt{16}(32 - 30) / 10 = 0.8 $, while the critical value is $t_{1-\alpha, 15} = 1.75$. 
-- We now fail to reject.
-
----
-## Two sided tests
-* Suppose that we would reject the null hypothesis if in fact the 
-  mean was too large or too small
-* That is, we want to test the alternative $H_a : \mu \neq 30$
-  (doesn't make a lot of sense in our setting)
-* Then note
-$$
- \alpha = P\left(\left. \left|\frac{\bar X - 30}{s /\sqrt{16}}\right| > t_{1-\alpha/2,15} ~\right|~ \mu = 30\right)
-$$
-* That is we will reject if the test statistic, $0.8$, is either
-  too large or too small, but the critical value is calculated using
-  $\alpha / 2$
-* In our example the critical value is $2.13$, so we fail to reject.
-
----
-## T test in R
-
-```r
-library(UsingR); data(father.son)
-t.test(father.son$sheight - father.son$fheight)
-```
-
-```
-
-	One Sample t-test
-
-data:  father.son$sheight - father.son$fheight
-t = 11.79, df = 1077, p-value < 2.2e-16
-alternative hypothesis: true mean is not equal to 0
-95 percent confidence interval:
- 0.831 1.163
-sample estimates:
-mean of x 
-    0.997 
-```
-
-
----
-## Connections with confidence intervals
-* Consider testing $H_0: \mu = \mu_0$ versus $H_a: \mu \neq \mu_0$
-* Take the set of all possible values for which you fail to reject $H_0$, this set is a $(1-\alpha)100\%$ confidence interval for $\mu$
-* The same works in reverse; if a $(1-\alpha)100\%$ interval
-  contains $\mu_0$, then we *fail  to* reject $H_0$
-
----
-## Exact binomial test
-- Recall this problem, *Suppose a friend has $8$ children, $7$ of which are girls and none are twins*
-- Perform the relevant hypothesis test. $H_0 : p = 0.5$ $H_a : p > 0.5$
-  - What is the relevant rejection region so that the probability of rejecting is (less than) 5%?
-  
-Rejection region | Type I error rate |
----|---|
-[0 : 8] | 1
-[1 : 8] | 0.9961
-[2 : 8] | 0.9648
-[3 : 8] | 0.8555
-[4 : 8] | 0.6367
-[5 : 8] | 0.3633
-[6 : 8] | 0.1445
-[7 : 8] | 0.0352
-[8 : 8] | 0.0039
-
----
-## Notes
-* It's impossible to get an exact 5% level test for this case due to the discreteness of the binomial. 
-  * The closest is the rejection region [7 : 8]
-  * Any alpha level lower than 0.0039 is not attainable.
-* For larger sample sizes, we could do a normal approximation, but you already knew this.
-* Two sided test isn't obvious. 
-  * Given a way to do two sided tests, we could take the set of values of $p_0$ for which we fail to reject to get an exact binomial confidence interval (called the Clopper/Pearson interval, BTW)
-* For these problems, people always create a P-value (next lecture) rather than computing the rejection region.
-
-
+---
+title       : Hypothesis testing
+subtitle    : Statistical Inference
+author      : Brian Caffo, Jeff Leek, Roger Peng
+job         : Johns Hopkins Bloomberg School of Public Health
+logo        : bloomberg_shield.png
+framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
+highlighter : highlight.js  # {highlight.js, prettify, highlight}
+hitheme     : tomorrow      # 
+url:
+  lib: ../../librariesNew
+  assets: ../../assets
+widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
+mode        : selfcontained # {standalone, draft}
+---
+
+## Hypothesis testing
+* Hypothesis testing is concerned with making decisions using data
+* A null hypothesis is specified that represents the status quo,
+  usually labeled $H_0$
+* The null hypothesis is assumed true and statistical evidence is required
+  to reject it in favor of a research or alternative hypothesis 
+
+---
+## Example
+* A respiratory disturbance index of more than $30$ events / hour, say, is 
+  considered evidence of severe sleep disordered breathing (SDB).
+* Suppose that in a sample of $100$ overweight subjects with other
+  risk factors for sleep disordered breathing at a sleep clinic, the
+  mean RDI was $32$ events / hour with a standard deviation of $10$ events / hour.
+* We might want to test the hypothesis that 
+  * $H_0 : \mu = 30$
+  * $H_a : \mu > 30$
+  * where $\mu$ is the population mean RDI.
+
+---
+## Hypothesis testing
+* The alternative hypotheses are typically of the form $<$, $>$ or $\neq$
+* Note that there are four possible outcomes of our statistical decision process
+
+Truth | Decide | Result |
+---|---|---|
+$H_0$ | $H_0$ | Correctly accept null |
+$H_0$ | $H_a$ | Type I error |
+$H_a$ | $H_a$ | Correctly reject null |
+$H_a$ | $H_0$ | Type II error |
+
+---
+## Discussion
+* Consider a court of law; the null hypothesis is that the
+  defendant is innocent
+* We require evidence to reject the null hypothesis (convict)
+* If we require little evidence, then we would increase the
+  percentage of innocent people convicted (type I errors); however we
+  would also increase the percentage of guilty people convicted
+  (correctly rejecting the null)
+* If we require a lot of evidence, then we increase the
+  percentage of innocent people let free (correctly accepting the
+  null) while we would also increase the percentage of guilty people
+  let free (type II errors)
+
+---
+## Example
+* Consider our example again
+* A reasonable strategy would reject the null hypothesis if
+  $\bar X$ was larger than some constant, say $C$
+* Typically, $C$ is chosen so that the probability of a Type I
+  error, $\alpha$, is $.05$ (or some other relevant constant)
+* $\alpha$ = Type I error rate = Probability of rejecting the null hypothesis when, in fact, the null hypothesis is correct
+
+---
+## Example continued
+
+
+$$
+\begin{align}
+0.05  & =  P\left(\bar X \geq C ~|~ \mu = 30 \right) \\
+      & =  P\left(\frac{\bar X - 30}{10 / \sqrt{100}} \geq \frac{C - 30}{10/\sqrt{100}} ~|~ \mu = 30\right) \\
+      & =  P\left(Z \geq \frac{C - 30}{1}\right) \\
+\end{align}
+$$
+
+* Hence $(C - 30) / 1 = 1.645$ implying $C = 31.645$
+* Since our mean is $32$ we reject the null hypothesis
+
+---
+## Discussion
+* In general we don't convert $C$ back to the original scale
+* We would just reject because the Z-score, which is how many
+  standard errors the sample mean is above the hypothesized mean,
+  $$
+  \frac{32 - 30}{10 / \sqrt{100}} = 2
+  $$
+  is greater than $1.645$
+* Or, whenever $\sqrt{n} (\bar X - \mu_0) / s > Z_{1-\alpha}$
+
+---
+## General rules
+* The $Z$ test for $H_0:\mu = \mu_0$ versus 
+  * $H_1: \mu < \mu_0$
+  * $H_2: \mu \neq \mu_0$
+  * $H_3: \mu > \mu_0$ 
+* Test statistic $ TS = \frac{\bar{X} - \mu_0}{S / \sqrt{n}} $
+* Reject the null hypothesis when 
+  * $TS \leq -Z_{1 - \alpha}$
+  * $|TS| \geq Z_{1 - \alpha / 2}$
+  * $TS \geq Z_{1 - \alpha}$
+
+---
+## Notes
+* We have fixed $\alpha$ to be low, so if we reject $H_0$, either
+  our model is wrong or there is a low probability that we have made
+  an error
+* We have not fixed the probability of a type II error, $\beta$;
+  therefore we tend to say "Fail to reject $H_0$" rather than
+  accepting $H_0$
+* Statistical significance is not the same as scientific
+  significance
+* The region of TS values for which you reject $H_0$ is called the
+  rejection region
+
+---
+## More notes
+* The $Z$ test requires the assumptions of the CLT and for $n$ to be large enough
+  for it to apply
+* If $n$ is small, then Gosset's $T$ test is performed in exactly the same way,
+  with the normal quantiles replaced by the appropriate Student's $T$ quantiles and
+  $n-1$ df
+* The probability of rejecting the null hypothesis when it is false is called *power*
+* Power is used a lot to calculate sample sizes for experiments
+
+---
+## Example reconsidered
+- Consider our example again. Suppose that $n= 16$ (rather than
+$100$). Then consider that
+$$
+.05 = P\left(\frac{\bar X - 30}{s / \sqrt{16}} \geq t_{1-\alpha, 15} ~|~ \mu = 30 \right)
+$$
+- So that our test statistic is now $\sqrt{16}(32 - 30) / 10 = 0.8 $, while the critical value is $t_{1-\alpha, 15} = 1.75$. 
+- We now fail to reject.
+
+---
+## Two sided tests
+* Suppose that we would reject the null hypothesis if in fact the 
+  mean was too large or too small
+* That is, we want to test the alternative $H_a : \mu \neq 30$
+  (doesn't make a lot of sense in our setting)
+* Then note
+$$
+ \alpha = P\left(\left. \left|\frac{\bar X - 30}{s /\sqrt{16}}\right| > t_{1-\alpha/2,15} ~\right|~ \mu = 30\right)
+$$
+* That is we will reject if the test statistic, $0.8$, is either
+  too large or too small, but the critical value is calculated using
+  $\alpha / 2$
+* In our example the critical value is $2.13$, so we fail to reject.
+
+---
+## T test in R
+
+```r
+library(UsingR)
+data(father.son)
+t.test(father.son$sheight - father.son$fheight)
+```
+
+```
+## 
+## 	One Sample t-test
+## 
+## data:  father.son$sheight - father.son$fheight
+## t = 11.79, df = 1077, p-value < 2.2e-16
+## alternative hypothesis: true mean is not equal to 0
+## 95 percent confidence interval:
+##  0.831 1.163
+## sample estimates:
+## mean of x 
+##     0.997
+```
+
+
+---
+## Connections with confidence intervals
+* Consider testing $H_0: \mu = \mu_0$ versus $H_a: \mu \neq \mu_0$
+* Take the set of all possible values of $\mu_0$ for which you fail to reject $H_0$; this set is a $(1-\alpha)100\%$ confidence interval for $\mu$
+* The same works in reverse; if a $(1-\alpha)100\%$ interval
+  contains $\mu_0$, then we *fail to* reject $H_0$
+
+---
+## Exact binomial test
+- Recall this problem, *Suppose a friend has $8$ children, $7$ of which are girls and none are twins*
+- Perform the relevant hypothesis test. $H_0 : p = 0.5$ $H_a : p > 0.5$
+  - What is the relevant rejection region so that the probability of rejecting is (less than) 5%?
+  
+Rejection region | Type I error rate |
+---|---|
+[0 : 8] | 1
+[1 : 8] | 0.9961
+[2 : 8] | 0.9648
+[3 : 8] | 0.8555
+[4 : 8] | 0.6367
+[5 : 8] | 0.3633
+[6 : 8] | 0.1445
+[7 : 8] | 0.0352
+[8 : 8] | 0.0039
+
+---
+## Notes
+* It's impossible to get an exact 5% level test for this case due to the discreteness of the binomial. 
+  * The closest is the rejection region [7 : 8]
+  * Any alpha level lower than 0.0039 is not attainable.
+* For larger sample sizes, we could do a normal approximation, but you already knew this.
+* Two sided test isn't obvious. 
+  * Given a way to do two sided tests, we could take the set of values of $p_0$ for which we fail to reject to get an exact binomial confidence interval (called the Clopper/Pearson interval, BTW)
+* For these problems, people always create a P-value (next lecture) rather than computing the rejection region.
+
+
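The critical values and the binomial table quoted in the hypothesis-testing slides above can be checked with base R. The sketch below is not part of the commit. It assumes only the numbers stated in the slides (sample mean 32, standard deviation 10, n = 100 or 16, alpha = 0.05), and the variable names are illustrative.

```r
# Illustrative check of the slide numbers; not part of the original deck.
alpha <- 0.05
xbar <- 32; mu0 <- 30; s <- 10

# Z test with n = 100: test statistic and one sided critical value
z_stat <- sqrt(100) * (xbar - mu0) / s      # 2
z_crit <- qnorm(1 - alpha)                  # about 1.645
reject_z <- z_stat > z_crit                 # TRUE

# T test with n = 16: same statistic, Student's t quantiles with 15 df
t_stat <- sqrt(16) * (xbar - mu0) / s       # 0.8
t_crit_one <- qt(1 - alpha, df = 15)        # about 1.75
t_crit_two <- qt(1 - alpha / 2, df = 15)    # about 2.13
reject_t <- t_stat > t_crit_one             # FALSE

# Exact binomial test, H_a: p > 0.5, n = 8.
# The Type I error rate of the rejection region [k : 8] is P(X >= k | p = 0.5).
k <- 0:8
type1 <- pbinom(k - 1, size = 8, prob = 0.5, lower.tail = FALSE)
round(type1, 4)
```

The last line should reproduce the Type I error column of the rejection-region table, with [7 : 8] the largest region whose error rate stays below 5%.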
diff --git a/06_StatisticalInference/03_02_HypothesisTesting/index.pdf b/06_StatisticalInference/03_02_HypothesisTesting/index.pdf
index 9e5f2ae42..c5db5a783 100644
Binary files a/06_StatisticalInference/03_02_HypothesisTesting/index.pdf and b/06_StatisticalInference/03_02_HypothesisTesting/index.pdf differ
diff --git a/06_StatisticalInference/03_03_pValues/index.Rmd b/06_StatisticalInference/03_03_pValues/index.Rmd
index bc6f09908..ea0bb8398 100644
--- a/06_StatisticalInference/03_03_pValues/index.Rmd
+++ b/06_StatisticalInference/03_03_pValues/index.Rmd
@@ -1,107 +1,92 @@
----
-title       : P-values
-subtitle    : Statistical inference
-author      : Brian Caffo, Jeffrey Leek, Roger Peng 
-job         : Johns Hopkins Bloomberg School of Public Health
-logo        : bloomberg_shield.png
-framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
-highlighter : highlight.js  # {highlight.js, prettify, highlight}
-hitheme     : tomorrow   # 
-url:
-  lib: ../../librariesNew
-  assets: ../../assets
-widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
-mode        : selfcontained # {standalone, draft}
----
-```{r setup, cache = F, echo = F, message = F, warning = F, tidy = F}
-# make this an external chunk that can be included in any file
-options(width = 100)
-opts_chunk$set(message = F, error = F, warning = F, comment = NA, fig.align = 'center', dpi = 100, tidy = F, cache.path = '.cache/', fig.path = 'fig/')
-
-options(xtable.type = 'html')
-knit_hooks$set(inline = function(x) {
-  if(is.numeric(x)) {
-    round(x, getOption('digits'))
-  } else {
-    paste(as.character(x), collapse = ', ')
-  }
-})
-knit_hooks$set(plot = knitr:::hook_plot_html)
-```
-
-## P-values
-
-* Most common measure of "statistical significance"
-* Their ubiquity, along with concern over their interpretation and use
-  makes them controversial among statisticians
-  * [http://warnercnr.colostate.edu/~anderson/thompson1.html](http://warnercnr.colostate.edu/~anderson/thompson1.html)
-  * Also see *Statistical Evidence: A Likelihood Paradigm* by Richard Royall 
-  * *Toward Evidence-Based Medical Statistics. 1: The P Value Fallacy* by Steve Goodman
-  * The hilariously titled: *The Earth is Round (p < .05)* by Cohen.
-* Some positive comments
-  * [simply statistics](http://simplystatistics.org/2012/01/06/p-values-and-hypothesis-testing-get-a-bad-rap-but-we/)
-  * [normal deviate](http://normaldeviate.wordpress.com/2013/03/14/double-misunderstandings-about-p-values/)
-  * [Error statistics](http://errorstatistics.com/2013/06/14/p-values-cant-be-trusted-except-when-used-to-argue-that-p-values-cant-be-trusted/)
-
----
-
-
-## What is a P-value? 
-
-__Idea__: Suppose nothing is going on - how unusual is it to see the estimate we got?
-
-__Approach__: 
-
-1. Define the hypothetical distribution of a data summary (statistic) when "nothing is going on" (_null hypothesis_)
-2. Calculate the summary/statistic with the data we have (_test statistic_)
-3. Compare what we calculated to our hypothetical distribution and see if the value is "extreme" (_p-value_)
-
----
-## P-values
-* The P-value is the probability under the null hypothesis of obtaining evidence as extreme or more extreme than would be observed by chance alone
-* If the P-value is small, then either $H_0$ is true and we have observed a rare event or $H_0$ is false
-*  In our example the $T$ statistic was $0.8$. 
-  * What's the probability of getting a $T$ statistic as large as $0.8$?
-```{r}
-pt(0.8, 15, lower.tail = FALSE) 
-```
-* Therefore, the probability of seeing evidence as extreme or more extreme than that actually obtained under $H_0$ is `r pt(0.8, 15, lower.tail = FALSE)`
-
----
-## The attained significance level
-* Our test statistic was $2$ for $H_0 : \mu_0  = 30$ versus $H_a:\mu > 30$.
-* Notice that we rejected the one sided test when $\alpha = 0.05$, would we reject if $\alpha = 0.01$, how about $0.001$?
-* The smallest value for alpha that you still reject the null hypothesis is called the *attained significance level*
-* This is equivalent, but philosophically a little different from, the *P-value*
-
----
-## Notes
-* By reporting a P-value the reader can perform the hypothesis
-  test at whatever $\alpha$ level he or she choses
-* If the P-value is less than $\alpha$ you reject the null hypothesis 
-* For two sided hypothesis test, double the smaller of the two one
-  sided hypothesis test Pvalues
-
----
-## Revisiting an earlier example
-- Suppose a friend has $8$ children, $7$ of which are girls and none are twins
-- If each gender has an independent $50$% probability for each birth, what's the probability of getting $7$ or more girls out of $8$ births?
-```{r}
-choose(8, 7) * .5 ^ 8 + choose(8, 8) * .5 ^ 8 
-pbinom(6, size = 8, prob = .5, lower.tail = FALSE)
-```
-
----
-## Poisson example
-- Suppose that a hospital has an infection rate of 10 infections per 100 person/days at risk (rate of 0.1) during the last monitoring period.
-- Assume that an infection rate of 0.05 is an important benchmark. 
-- Given the model, could the observed rate being larger than 0.05 be attributed to chance?
-- Under $H_0: \lambda = 0.05$ so that $\lambda_0 100 = 5$
-- Consider $H_a: \lambda > 0.05$.
-
-```{r}
-ppois(9, 5, lower.tail = FALSE)
-```
-
-
-
+---
+title       : P-values
+subtitle    : Statistical inference
+author      : Brian Caffo, Jeffrey Leek, Roger Peng 
+job         : Johns Hopkins Bloomberg School of Public Health
+logo        : bloomberg_shield.png
+framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
+highlighter : highlight.js  # {highlight.js, prettify, highlight}
+hitheme     : tomorrow   # 
+url:
+  lib: ../../librariesNew
+  assets: ../../assets
+widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
+mode        : selfcontained # {standalone, draft}
+---
+  
+## P-values
+
+* Most common measure of "statistical significance"
+* Their ubiquity, along with concern over their interpretation and use,
+  makes them controversial among statisticians
+  * [http://warnercnr.colostate.edu/~anderson/thompson1.html](http://warnercnr.colostate.edu/~anderson/thompson1.html)
+  * Also see *Statistical Evidence: A Likelihood Paradigm* by Richard Royall 
+  * *Toward Evidence-Based Medical Statistics. 1: The P Value Fallacy* by Steve Goodman
+  * The hilariously titled: *The Earth is Round (p < .05)* by Cohen.
+* Some positive comments
+  * [simply statistics](http://simplystatistics.org/2012/01/06/p-values-and-hypothesis-testing-get-a-bad-rap-but-we/)
+  * [normal deviate](http://normaldeviate.wordpress.com/2013/03/14/double-misunderstandings-about-p-values/)
+  * [Error statistics](http://errorstatistics.com/2013/06/14/p-values-cant-be-trusted-except-when-used-to-argue-that-p-values-cant-be-trusted/)
+
+---
+
+
+## What is a P-value? 
+
+__Idea__: Suppose nothing is going on - how unusual is it to see the estimate we got?
+
+__Approach__: 
+
+1. Define the hypothetical distribution of a data summary (statistic) when "nothing is going on" (_null hypothesis_)
+2. Calculate the summary/statistic with the data we have (_test statistic_)
+3. Compare what we calculated to our hypothetical distribution and see if the value is "extreme" (_p-value_)
+
+---
+## P-values
+* The P-value is the probability under the null hypothesis of obtaining evidence as extreme or more extreme than that actually observed
+* If the P-value is small, then either $H_0$ is true and we have observed a rare event or $H_0$ is false
+*  In our example the $T$ statistic was $0.8$. 
+  * What's the probability of getting a $T$ statistic as large as $0.8$?
+```{r}
+pt(0.8, 15, lower.tail = FALSE) 
+```
+* Therefore, the probability of seeing evidence as extreme or more extreme than that actually obtained under $H_0$ is `r pt(0.8, 15, lower.tail = FALSE)`
+
+---
+## The attained significance level
+* Our test statistic was $2$ for $H_0 : \mu = 30$ versus $H_a:\mu > 30$.
+* Notice that we rejected the one sided test when $\alpha = 0.05$. Would we reject if $\alpha = 0.01$? How about $0.001$?
+* The smallest value of $\alpha$ at which you still reject the null hypothesis is called the *attained significance level*
+* This is equivalent, but philosophically a little different from, the *P-value*
+
+---
+## Notes
+* By reporting a P-value the reader can perform the hypothesis
+  test at whatever $\alpha$ level he or she chooses
+* If the P-value is less than $\alpha$ you reject the null hypothesis 
+* For a two sided hypothesis test, double the smaller of the two one
+  sided hypothesis test P-values
+
+---
+## Revisiting an earlier example
+- Suppose a friend has $8$ children, $7$ of which are girls and none are twins
+- If each gender has an independent $50$% probability for each birth, what's the probability of getting $7$ or more girls out of $8$ births?
+```{r}
+choose(8, 7) * .5 ^ 8 + choose(8, 8) * .5 ^ 8 
+pbinom(6, size = 8, prob = .5, lower.tail = FALSE)
+```
+
+---
+## Poisson example
+- Suppose that a hospital has an infection rate of 10 infections per 100 person/days at risk (rate of 0.1) during the last monitoring period.
+- Assume that an infection rate of 0.05 is an important benchmark. 
+- Given the model, could the observed rate being larger than 0.05 be attributed to chance?
+- Under $H_0: \lambda = 0.05$ so that $\lambda_0 \times 100 = 5$
+- Consider $H_a: \lambda > 0.05$.
+
+```{r}
+ppois(9, 5, lower.tail = FALSE)
+```
+
+
+
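The attained significance level and the two sided doubling rule from the P-value slides above can be illustrated with the same running example. This is a sketch rather than part of the commit. It reuses the Z statistic of 2 and the T statistic of 0.8 with 15 degrees of freedom from the slides, and the variable names are illustrative.

```r
# Illustrative sketch; not part of the original slides.

# Attained significance level for the Z statistic of 2 (H_a: mu > 30):
# the one sided P-value is the smallest alpha at which we still reject.
p_one_sided <- pnorm(2, lower.tail = FALSE)     # about 0.023
reject_05 <- p_one_sided < 0.05                 # TRUE
reject_01 <- p_one_sided < 0.01                 # FALSE

# Two sided P-value for the T statistic of 0.8 with 15 df:
# double the smaller of the two one sided P-values.
p_upper <- pt(0.8, df = 15, lower.tail = FALSE)
p_lower <- pt(0.8, df = 15, lower.tail = TRUE)
p_two_sided <- 2 * min(p_upper, p_lower)        # about 0.436
```

Under these assumptions, the one sided test rejects at the 5% level but not at the 1% level, which matches the question posed on the attained significance level slide.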
diff --git a/06_StatisticalInference/03_03_pValues/index.html b/06_StatisticalInference/03_03_pValues/index.html
index a92ee0403..8d31071c3 100644
--- a/06_StatisticalInference/03_03_pValues/index.html
+++ b/06_StatisticalInference/03_03_pValues/index.html
@@ -1,280 +1,280 @@
-<!DOCTYPE html>
-<html>
-<head>
-  <title>P-values</title>
-  <meta charset="utf-8">
-  <meta name="description" content="P-values">
-  <meta name="author" content="Brian Caffo, Jeffrey Leek, Roger Peng">
-  <meta name="generator" content="slidify" />
-  <meta name="apple-mobile-web-app-capable" content="yes">
-  <meta http-equiv="X-UA-Compatible" content="chrome=1">
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/default.css" media="all" >
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/phone.css" 
-    media="only screen and (max-device-width: 480px)" >
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/slidify.css" >
-  <link rel="stylesheet" href="../../librariesNew/highlighters/highlight.js/css/tomorrow.css" />
-  <base target="_blank"> <!-- This amazingness opens all links in a new tab. -->  
-  
-  <!-- Grab CDN jQuery, fall back to local if offline -->
-  <script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.7.min.js"></script>
-  <script>window.jQuery || document.write('<script src="../../librariesNew/widgets/quiz/js/jquery.js"><\/script>')</script> 
-  <script data-main="../../librariesNew/frameworks/io2012/js/slides" 
-    src="../../librariesNew/frameworks/io2012/js/require-1.0.8.min.js">
-  </script>
-  
-  
-
-</head>
-<body style="opacity: 0">
-  <slides class="layout-widescreen">
-    
-    <!-- LOGO SLIDE -->
-        <slide class="title-slide segue nobackground">
-  <aside class="gdbar">
-    <img src="../../assets/img/bloomberg_shield.png">
-  </aside>
-  <hgroup class="auto-fadein">
-    <h1>P-values</h1>
-    <h2>Statistical inference</h2>
-    <p>Brian Caffo, Jeffrey Leek, Roger Peng<br/>Johns Hopkins Bloomberg School of Public Health</p>
-  </hgroup>
-  <article></article>  
-</slide>
-    
-
-    <!-- SLIDES -->
-    <slide class="" id="slide-1" style="background:;">
-  <hgroup>
-    <h2>P-values</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Most common measure of &quot;statistical significance&quot;</li>
-<li>Their ubiquity, along with concern over their interpretation and use
-makes them controversial among statisticians
-
-<ul>
-<li><a href="http://warnercnr.colostate.edu/%7Eanderson/thompson1.html">http://warnercnr.colostate.edu/~anderson/thompson1.html</a></li>
-<li>Also see <em>Statistical Evidence: A Likelihood Paradigm</em> by Richard Royall </li>
-<li><em>Toward Evidence-Based Medical Statistics. 1: The P Value Fallacy</em> by Steve Goodman</li>
-<li>The hilariously titled: <em>The Earth is Round (p &lt; .05)</em> by Cohen.</li>
-</ul></li>
-<li>Some positive comments
-
-<ul>
-<li><a href="http://simplystatistics.org/2012/01/06/p-values-and-hypothesis-testing-get-a-bad-rap-but-we/">simply statistics</a></li>
-<li><a href="http://normaldeviate.wordpress.com/2013/03/14/double-misunderstandings-about-p-values/">normal deviate</a></li>
-<li><a href="http://errorstatistics.com/2013/06/14/p-values-cant-be-trusted-except-when-used-to-argue-that-p-values-cant-be-trusted/">Error statistics</a></li>
-</ul></li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-2" style="background:;">
-  <hgroup>
-    <h2>What is a P-value?</h2>
-  </hgroup>
-  <article data-timings="">
-    <p><strong>Idea</strong>: Suppose nothing is going on - how unusual is it to see the estimate we got?</p>
-
-<p><strong>Approach</strong>: </p>
-
-<ol>
-<li>Define the hypothetical distribution of a data summary (statistic) when &quot;nothing is going on&quot; (<em>null hypothesis</em>)</li>
-<li>Calculate the summary/statistic with the data we have (<em>test statistic</em>)</li>
-<li>Compare what we calculated to our hypothetical distribution and see if the value is &quot;extreme&quot; (<em>p-value</em>)</li>
-</ol>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-3" style="background:;">
-  <hgroup>
-    <h2>P-values</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>The P-value is the probability under the null hypothesis of obtaining evidence as extreme or more extreme than would be observed by chance alone</li>
-<li>If the P-value is small, then either \(H_0\) is true and we have observed a rare event or \(H_0\) is false</li>
-<li> In our example the \(T\) statistic was \(0.8\). 
-
-<ul>
-<li>What&#39;s the probability of getting a \(T\) statistic as large as \(0.8\)?</li>
-</ul></li>
-</ul>
-
-<pre><code class="r">pt(0.8, 15, lower.tail = FALSE) 
-</code></pre>
-
-<pre><code>[1] 0.2181
-</code></pre>
-
-<ul>
-<li>Therefore, the probability of seeing evidence as extreme or more extreme than that actually obtained under \(H_0\) is 0.2181</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-4" style="background:;">
-  <hgroup>
-    <h2>The attained significance level</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Our test statistic was \(2\) for \(H_0 : \mu_0  = 30\) versus \(H_a:\mu > 30\).</li>
-<li>Notice that we rejected the one sided test when \(\alpha = 0.05\), would we reject if \(\alpha = 0.01\), how about \(0.001\)?</li>
-<li>The smallest value for alpha that you still reject the null hypothesis is called the <em>attained significance level</em></li>
-<li>This is equivalent, but philosophically a little different from, the <em>P-value</em></li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-5" style="background:;">
-  <hgroup>
-    <h2>Notes</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>By reporting a P-value the reader can perform the hypothesis
-test at whatever \(\alpha\) level he or she choses</li>
-<li>If the P-value is less than \(\alpha\) you reject the null hypothesis </li>
-<li>For two sided hypothesis test, double the smaller of the two one
-sided hypothesis test Pvalues</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-6" style="background:;">
-  <hgroup>
-    <h2>Revisiting an earlier example</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Suppose a friend has \(8\) children, \(7\) of which are girls and none are twins</li>
-<li>If each gender has an independent \(50\)% probability for each birth, what&#39;s the probability of getting \(7\) or more girls out of \(8\) births?</li>
-</ul>
-
-<pre><code class="r">choose(8, 7) * .5 ^ 8 + choose(8, 8) * .5 ^ 8 
-</code></pre>
-
-<pre><code>[1] 0.03516
-</code></pre>
-
-<pre><code class="r">pbinom(6, size = 8, prob = .5, lower.tail = FALSE)
-</code></pre>
-
-<pre><code>[1] 0.03516
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-7" style="background:;">
-  <hgroup>
-    <h2>Poisson example</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Suppose that a hospital has an infection rate of 10 infections per 100 person/days at risk (rate of 0.1) during the last monitoring period.</li>
-<li>Assume that an infection rate of 0.05 is an important benchmark. </li>
-<li>Given the model, could the observed rate being larger than 0.05 be attributed to chance?</li>
-<li>Under \(H_0: \lambda = 0.05\) so that \(\lambda_0 100 = 5\)</li>
-<li>Consider \(H_a: \lambda > 0.05\).</li>
-</ul>
-
-<pre><code class="r">ppois(9, 5, lower.tail = FALSE)
-</code></pre>
-
-<pre><code>[1] 0.03183
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-    <slide class="backdrop"></slide>
-  </slides>
-  <div class="pagination pagination-small" id='io2012-ptoc' style="display:none;">
-    <ul>
-      <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=1 title='P-values'>
-         1
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=2 title='What is a P-value?'>
-         2
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=3 title='P-values'>
-         3
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=4 title='The attained significance level'>
-         4
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=5 title='Notes'>
-         5
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=6 title='Revisiting an earlier example'>
-         6
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=7 title='Poisson example'>
-         7
-      </a>
-    </li>
-  </ul>
-  </div>  <!--[if IE]>
-    <script 
-      src="http://ajax.googleapis.com/ajax/libs/chrome-frame/1/CFInstall.min.js">  
-    </script>
-    <script>CFInstall.check({mode: 'overlay'});</script>
-  <![endif]-->
-</body>
-  <!-- Load Javascripts for Widgets -->
-  
-  <!-- MathJax: Fall back to local if CDN offline but local image fonts are not supported (saves >100MB) -->
-  <script type="text/x-mathjax-config">
-    MathJax.Hub.Config({
-      tex2jax: {
-        inlineMath: [['$','$'], ['\\(','\\)']],
-        processEscapes: true
-      }
-    });
-  </script>
-  <script type="text/javascript" src="http://cdn.mathjax.org/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
-  <!-- <script src="https://c328740.ssl.cf1.rackcdn.com/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
-  </script> -->
-  <script>window.MathJax || document.write('<script type="text/x-mathjax-config">MathJax.Hub.Config({"HTML-CSS":{imageFont:null}});<\/script><script src="../../librariesNew/widgets/mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"><\/script>')
-</script>
-<!-- LOAD HIGHLIGHTER JS FILES -->
-  <script src="../../librariesNew/highlighters/highlight.js/highlight.pack.js"></script>
-  <script>hljs.initHighlightingOnLoad();</script>
-  <!-- DONE LOADING HIGHLIGHTER JS FILES -->
-   
+<!DOCTYPE html>
+<html>
+<head>
+  <title>P-values</title>
+  <meta charset="utf-8">
+  <meta name="description" content="P-values">
+  <meta name="author" content="Brian Caffo, Jeffrey Leek, Roger Peng">
+  <meta name="generator" content="slidify" />
+  <meta name="apple-mobile-web-app-capable" content="yes">
+  <meta http-equiv="X-UA-Compatible" content="chrome=1">
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/default.css" media="all" >
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/phone.css" 
+    media="only screen and (max-device-width: 480px)" >
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/slidify.css" >
+  <link rel="stylesheet" href="../../librariesNew/highlighters/highlight.js/css/tomorrow.css" />
+  <base target="_blank"> <!-- This amazingness opens all links in a new tab. -->  
+  
+  <!-- Grab CDN jQuery, fall back to local if offline -->
+  <script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.7.min.js"></script>
+  <script>window.jQuery || document.write('<script src="../../librariesNew/widgets/quiz/js/jquery.js"><\/script>')</script> 
+  <script data-main="../../librariesNew/frameworks/io2012/js/slides" 
+    src="../../librariesNew/frameworks/io2012/js/require-1.0.8.min.js">
+  </script>
+  
+  
+
+</head>
+<body style="opacity: 0">
+  <slides class="layout-widescreen">
+    
+    <!-- LOGO SLIDE -->
+        <slide class="title-slide segue nobackground">
+  <aside class="gdbar">
+    <img src="../../assets/img/bloomberg_shield.png">
+  </aside>
+  <hgroup class="auto-fadein">
+    <h1>P-values</h1>
+    <h2>Statistical inference</h2>
+    <p>Brian Caffo, Jeffrey Leek, Roger Peng<br/>Johns Hopkins Bloomberg School of Public Health</p>
+  </hgroup>
+  <article></article>  
+</slide>
+    
+
+    <!-- SLIDES -->
+    <slide class="" id="slide-1" style="background:;">
+  <hgroup>
+    <h2>P-values</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Most common measure of &quot;statistical significance&quot;</li>
+<li>Their ubiquity, along with concern over their interpretation and use,
+makes them controversial among statisticians
+
+<ul>
+<li><a href="http://warnercnr.colostate.edu/%7Eanderson/thompson1.html">http://warnercnr.colostate.edu/~anderson/thompson1.html</a></li>
+<li>Also see <em>Statistical Evidence: A Likelihood Paradigm</em> by Richard Royall </li>
+<li><em>Toward Evidence-Based Medical Statistics. 1: The P Value Fallacy</em> by Steve Goodman</li>
+<li>The hilariously titled: <em>The Earth is Round (p &lt; .05)</em> by Cohen.</li>
+</ul></li>
+<li>Some positive comments
+
+<ul>
+<li><a href="http://simplystatistics.org/2012/01/06/p-values-and-hypothesis-testing-get-a-bad-rap-but-we/">simply statistics</a></li>
+<li><a href="http://normaldeviate.wordpress.com/2013/03/14/double-misunderstandings-about-p-values/">normal deviate</a></li>
+<li><a href="http://errorstatistics.com/2013/06/14/p-values-cant-be-trusted-except-when-used-to-argue-that-p-values-cant-be-trusted/">Error statistics</a></li>
+</ul></li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-2" style="background:;">
+  <hgroup>
+    <h2>What is a P-value?</h2>
+  </hgroup>
+  <article data-timings="">
+    <p><strong>Idea</strong>: Suppose nothing is going on - how unusual is it to see the estimate we got?</p>
+
+<p><strong>Approach</strong>: </p>
+
+<ol>
+<li>Define the hypothetical distribution of a data summary (statistic) when &quot;nothing is going on&quot; (<em>null hypothesis</em>)</li>
+<li>Calculate the summary/statistic with the data we have (<em>test statistic</em>)</li>
+<li>Compare what we calculated to our hypothetical distribution and see if the value is &quot;extreme&quot; (<em>p-value</em>)</li>
+</ol>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-3" style="background:;">
+  <hgroup>
+    <h2>P-values</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>The P-value is the probability, under the null hypothesis, of obtaining evidence as extreme or more extreme than that actually observed</li>
+<li>If the P-value is small, then either \(H_0\) is true and we have observed a rare event or \(H_0\) is false</li>
+<li> In our example the \(T\) statistic was \(0.8\). 
+
+<ul>
+<li>What&#39;s the probability of getting a \(T\) statistic as large as \(0.8\)?</li>
+</ul></li>
+</ul>
+
+<pre><code class="r">pt(0.8, 15, lower.tail = FALSE)
+</code></pre>
+
+<pre><code>## [1] 0.2181
+</code></pre>
+
+<ul>
+<li>Therefore, the probability of seeing evidence as extreme or more extreme than that actually obtained under \(H_0\) is 0.2181</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-4" style="background:;">
+  <hgroup>
+    <h2>The attained significance level</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Our test statistic was \(2\) for \(H_0 : \mu = 30\) versus \(H_a:\mu > 30\).</li>
+<li>Notice that we rejected the one-sided test when \(\alpha = 0.05\). Would we reject if \(\alpha = 0.01\)? How about \(0.001\)?</li>
+<li>The smallest value of \(\alpha\) at which you still reject the null hypothesis is called the <em>attained significance level</em></li>
+<li>This is equivalent, but philosophically a little different from, the <em>P-value</em></li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-5" style="background:;">
+  <hgroup>
+    <h2>Notes</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Reporting a P-value lets the reader perform the hypothesis
+test at whatever \(\alpha\) level he or she chooses</li>
+<li>If the P-value is less than \(\alpha\) you reject the null hypothesis </li>
+<li>For a two-sided hypothesis test, double the smaller of the two
+one-sided hypothesis test P-values</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-6" style="background:;">
+  <hgroup>
+    <h2>Revisiting an earlier example</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Suppose a friend has \(8\) children, \(7\) of whom are girls and none of whom are twins</li>
+<li>If each gender has an independent \(50\)% probability for each birth, what&#39;s the probability of getting \(7\) or more girls out of \(8\) births?</li>
+</ul>
+
+<pre><code class="r">choose(8, 7) * 0.5^8 + choose(8, 8) * 0.5^8
+</code></pre>
+
+<pre><code>## [1] 0.03516
+</code></pre>
+
+<pre><code class="r">pbinom(6, size = 8, prob = 0.5, lower.tail = FALSE)
+</code></pre>
+
+<pre><code>## [1] 0.03516
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-7" style="background:;">
+  <hgroup>
+    <h2>Poisson example</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Suppose that a hospital has an infection rate of 10 infections per 100 person-days at risk (rate of 0.1) during the last monitoring period.</li>
+<li>Assume that an infection rate of 0.05 is an important benchmark. </li>
+<li>Given the model, could the observed rate being larger than 0.05 be attributed to chance?</li>
+<li>Under \(H_0: \lambda = 0.05\), the expected count over 100 person-days is \(100 \lambda_0 = 5\)</li>
+<li>Consider \(H_a: \lambda > 0.05\).</li>
+</ul>
+
+<pre><code class="r">ppois(9, 5, lower.tail = FALSE)
+</code></pre>
+
+<pre><code>## [1] 0.03183
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+    <slide class="backdrop"></slide>
+  </slides>
+  <div class="pagination pagination-small" id='io2012-ptoc' style="display:none;">
+    <ul>
+      <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=1 title='P-values'>
+         1
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=2 title='What is a P-value?'>
+         2
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=3 title='P-values'>
+         3
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=4 title='The attained significance level'>
+         4
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=5 title='Notes'>
+         5
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=6 title='Revisiting an earlier example'>
+         6
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=7 title='Poisson example'>
+         7
+      </a>
+    </li>
+  </ul>
+  </div>  <!--[if IE]>
+    <script 
+      src="http://ajax.googleapis.com/ajax/libs/chrome-frame/1/CFInstall.min.js">  
+    </script>
+    <script>CFInstall.check({mode: 'overlay'});</script>
+  <![endif]-->
+</body>
+  <!-- Load Javascripts for Widgets -->
+  
+  <!-- MathJax: Fall back to local if CDN offline but local image fonts are not supported (saves >100MB) -->
+  <script type="text/x-mathjax-config">
+    MathJax.Hub.Config({
+      tex2jax: {
+        inlineMath: [['$','$'], ['\\(','\\)']],
+        processEscapes: true
+      }
+    });
+  </script>
+  <script type="text/javascript" src="http://cdn.mathjax.org/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
+  <!-- <script src="https://c328740.ssl.cf1.rackcdn.com/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
+  </script> -->
+  <script>window.MathJax || document.write('<script type="text/x-mathjax-config">MathJax.Hub.Config({"HTML-CSS":{imageFont:null}});<\/script><script src="../../librariesNew/widgets/mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"><\/script>')
+</script>
+<!-- LOAD HIGHLIGHTER JS FILES -->
+  <script src="../../librariesNew/highlighters/highlight.js/highlight.pack.js"></script>
+  <script>hljs.initHighlightingOnLoad();</script>
+  <!-- DONE LOADING HIGHLIGHTER JS FILES -->
+   
   </html>
\ No newline at end of file
diff --git a/06_StatisticalInference/03_03_pValues/index.md b/06_StatisticalInference/03_03_pValues/index.md
index 3cec2d0b8..5dabc5349 100644
--- a/06_StatisticalInference/03_03_pValues/index.md
+++ b/06_StatisticalInference/03_03_pValues/index.md
@@ -1,119 +1,117 @@
----
-title       : P-values
-subtitle    : Statistical inference
-author      : Brian Caffo, Jeffrey Leek, Roger Peng 
-job         : Johns Hopkins Bloomberg School of Public Health
-logo        : bloomberg_shield.png
-framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
-highlighter : highlight.js  # {highlight.js, prettify, highlight}
-hitheme     : tomorrow   # 
-url:
-  lib: ../../librariesNew
-  assets: ../../assets
-widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
-mode        : selfcontained # {standalone, draft}
----
-
-
-
-## P-values
-
-* Most common measure of "statistical significance"
-* Their ubiquity, along with concern over their interpretation and use
-  makes them controversial among statisticians
-  * [http://warnercnr.colostate.edu/~anderson/thompson1.html](http://warnercnr.colostate.edu/~anderson/thompson1.html)
-  * Also see *Statistical Evidence: A Likelihood Paradigm* by Richard Royall 
-  * *Toward Evidence-Based Medical Statistics. 1: The P Value Fallacy* by Steve Goodman
-  * The hilariously titled: *The Earth is Round (p < .05)* by Cohen.
-* Some positive comments
-  * [simply statistics](http://simplystatistics.org/2012/01/06/p-values-and-hypothesis-testing-get-a-bad-rap-but-we/)
-  * [normal deviate](http://normaldeviate.wordpress.com/2013/03/14/double-misunderstandings-about-p-values/)
-  * [Error statistics](http://errorstatistics.com/2013/06/14/p-values-cant-be-trusted-except-when-used-to-argue-that-p-values-cant-be-trusted/)
-
----
-
-
-## What is a P-value? 
-
-__Idea__: Suppose nothing is going on - how unusual is it to see the estimate we got?
-
-__Approach__: 
-
-1. Define the hypothetical distribution of a data summary (statistic) when "nothing is going on" (_null hypothesis_)
-2. Calculate the summary/statistic with the data we have (_test statistic_)
-3. Compare what we calculated to our hypothetical distribution and see if the value is "extreme" (_p-value_)
-
----
-## P-values
-* The P-value is the probability under the null hypothesis of obtaining evidence as extreme or more extreme than would be observed by chance alone
-* If the P-value is small, then either $H_0$ is true and we have observed a rare event or $H_0$ is false
-*  In our example the $T$ statistic was $0.8$. 
-  * What's the probability of getting a $T$ statistic as large as $0.8$?
-
-```r
-pt(0.8, 15, lower.tail = FALSE) 
-```
-
-```
-[1] 0.2181
-```
-
-* Therefore, the probability of seeing evidence as extreme or more extreme than that actually obtained under $H_0$ is 0.2181
-
----
-## The attained significance level
-* Our test statistic was $2$ for $H_0 : \mu_0  = 30$ versus $H_a:\mu > 30$.
-* Notice that we rejected the one sided test when $\alpha = 0.05$, would we reject if $\alpha = 0.01$, how about $0.001$?
-* The smallest value for alpha that you still reject the null hypothesis is called the *attained significance level*
-* This is equivalent, but philosophically a little different from, the *P-value*
-
----
-## Notes
-* By reporting a P-value the reader can perform the hypothesis
-  test at whatever $\alpha$ level he or she choses
-* If the P-value is less than $\alpha$ you reject the null hypothesis 
-* For two sided hypothesis test, double the smaller of the two one
-  sided hypothesis test Pvalues
-
----
-## Revisiting an earlier example
-- Suppose a friend has $8$ children, $7$ of which are girls and none are twins
-- If each gender has an independent $50$% probability for each birth, what's the probability of getting $7$ or more girls out of $8$ births?
-
-```r
-choose(8, 7) * .5 ^ 8 + choose(8, 8) * .5 ^ 8 
-```
-
-```
-[1] 0.03516
-```
-
-```r
-pbinom(6, size = 8, prob = .5, lower.tail = FALSE)
-```
-
-```
-[1] 0.03516
-```
-
-
----
-## Poisson example
-- Suppose that a hospital has an infection rate of 10 infections per 100 person/days at risk (rate of 0.1) during the last monitoring period.
-- Assume that an infection rate of 0.05 is an important benchmark. 
-- Given the model, could the observed rate being larger than 0.05 be attributed to chance?
-- Under $H_0: \lambda = 0.05$ so that $\lambda_0 100 = 5$
-- Consider $H_a: \lambda > 0.05$.
-
-
-```r
-ppois(9, 5, lower.tail = FALSE)
-```
-
-```
-[1] 0.03183
-```
-
-
-
-
+---
+title       : P-values
+subtitle    : Statistical inference
+author      : Brian Caffo, Jeffrey Leek, Roger Peng 
+job         : Johns Hopkins Bloomberg School of Public Health
+logo        : bloomberg_shield.png
+framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
+highlighter : highlight.js  # {highlight.js, prettify, highlight}
+hitheme     : tomorrow   # 
+url:
+  lib: ../../librariesNew
+  assets: ../../assets
+widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
+mode        : selfcontained # {standalone, draft}
+---
+
+## P-values
+
+* Most common measure of "statistical significance"
+* Their ubiquity, along with concern over their interpretation and use,
+  makes them controversial among statisticians
+  * [http://warnercnr.colostate.edu/~anderson/thompson1.html](http://warnercnr.colostate.edu/~anderson/thompson1.html)
+  * Also see *Statistical Evidence: A Likelihood Paradigm* by Richard Royall 
+  * *Toward Evidence-Based Medical Statistics. 1: The P Value Fallacy* by Steve Goodman
+  * The hilariously titled: *The Earth is Round (p < .05)* by Cohen.
+* Some positive comments
+  * [simply statistics](http://simplystatistics.org/2012/01/06/p-values-and-hypothesis-testing-get-a-bad-rap-but-we/)
+  * [normal deviate](http://normaldeviate.wordpress.com/2013/03/14/double-misunderstandings-about-p-values/)
+  * [Error statistics](http://errorstatistics.com/2013/06/14/p-values-cant-be-trusted-except-when-used-to-argue-that-p-values-cant-be-trusted/)
+
+---
+
+
+## What is a P-value? 
+
+__Idea__: Suppose nothing is going on - how unusual is it to see the estimate we got?
+
+__Approach__: 
+
+1. Define the hypothetical distribution of a data summary (statistic) when "nothing is going on" (_null hypothesis_)
+2. Calculate the summary/statistic with the data we have (_test statistic_)
+3. Compare what we calculated to our hypothetical distribution and see if the value is "extreme" (_p-value_)
+
+---
+## P-values
+* The P-value is the probability, under the null hypothesis, of obtaining evidence as extreme or more extreme than that actually observed
+* If the P-value is small, then either $H_0$ is true and we have observed a rare event or $H_0$ is false
+*  In our example the $T$ statistic was $0.8$. 
+  * What's the probability of getting a $T$ statistic as large as $0.8$?
+
+```r
+pt(0.8, 15, lower.tail = FALSE)
+```
+
+```
+## [1] 0.2181
+```
+
+* Therefore, the probability of seeing evidence as extreme or more extreme than that actually obtained under $H_0$ is 0.2181
+
+---
+## The attained significance level
+* Our test statistic was $2$ for $H_0 : \mu = 30$ versus $H_a:\mu > 30$.
+* Notice that we rejected the one-sided test when $\alpha = 0.05$. Would we reject if $\alpha = 0.01$? How about $0.001$?
+* The smallest value of $\alpha$ at which you still reject the null hypothesis is called the *attained significance level*
+* This is equivalent, but philosophically a little different from, the *P-value*
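+
+As a minimal sketch of the attained significance level, the check below compares a one-sided P-value with several $\alpha$ levels. The 15 degrees of freedom are an assumption borrowed from the earlier $T$ example, since this slide does not state them, and the variable names are illustrative.
+
+```r
+# Assumed: test statistic of 2 on 15 degrees of freedom (df is an assumption)
+p_value <- pt(2, df = 15, lower.tail = FALSE)
+alphas <- c(0.05, 0.01, 0.001)
+# Reject H_0 at a given alpha exactly when the P-value is below it
+rejects <- p_value < alphas
+rejects
+```
+
+Under that assumption the test rejects at $0.05$ but not at $0.01$ or $0.001$, so the attained significance level lies between $0.01$ and $0.05$.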
+
+---
+## Notes
+* Reporting a P-value lets the reader perform the hypothesis
+  test at whatever $\alpha$ level he or she chooses
+* If the P-value is less than $\alpha$ you reject the null hypothesis 
+* For a two-sided hypothesis test, double the smaller of the two
+  one-sided hypothesis test P-values
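+
+A minimal sketch of the doubling rule, reusing the earlier $T$ statistic of $0.8$ on 15 degrees of freedom (the variable names are illustrative):
+
+```r
+t_stat <- 0.8
+deg_free <- 15
+# The two one-sided P-values (upper and lower tail)
+p_upper <- pt(t_stat, deg_free, lower.tail = FALSE)
+p_lower <- pt(t_stat, deg_free, lower.tail = TRUE)
+# Two-sided P-value: double the smaller one-sided P-value
+p_two_sided <- 2 * min(p_upper, p_lower)
+p_two_sided
+```
+
+Because the $t$ distribution is symmetric, this matches `2 * pt(abs(t_stat), deg_free, lower.tail = FALSE)`.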
+
+---
+## Revisiting an earlier example
+- Suppose a friend has $8$ children, $7$ of whom are girls and none of whom are twins
+- If each gender has an independent $50$% probability for each birth, what's the probability of getting $7$ or more girls out of $8$ births?
+
+```r
+choose(8, 7) * 0.5^8 + choose(8, 8) * 0.5^8
+```
+
+```
+## [1] 0.03516
+```
+
+```r
+pbinom(6, size = 8, prob = 0.5, lower.tail = FALSE)
+```
+
+```
+## [1] 0.03516
+```
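+
+As a cross-check, base R's `binom.test` should report the same one-sided P-value. This is a sketch added for comparison, not part of the original calculation:
+
+```r
+# Exact one-sided binomial test: 7 girls in 8 births, null probability 0.5
+p_exact <- binom.test(7, 8, p = 0.5, alternative = "greater")$p.value
+# Upper-tail probability P(X >= 7) computed directly
+p_tail <- pbinom(6, size = 8, prob = 0.5, lower.tail = FALSE)
+all.equal(p_exact, p_tail)
+```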
+
+
+---
+## Poisson example
+- Suppose that a hospital has an infection rate of 10 infections per 100 person-days at risk (rate of 0.1) during the last monitoring period.
+- Assume that an infection rate of 0.05 is an important benchmark. 
+- Given the model, could the observed rate being larger than 0.05 be attributed to chance?
+- Under $H_0: \lambda = 0.05$, the expected count over 100 person-days is $100 \lambda_0 = 5$
+- Consider $H_a: \lambda > 0.05$.
+
+
+```r
+ppois(9, 5, lower.tail = FALSE)
+```
+
+```
+## [1] 0.03183
+```
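+
+The same one-sided P-value can also be obtained from base R's `poisson.test`, which takes the observed count, the time at risk and the null rate directly. This is a sketch for comparison; the variable names are illustrative:
+
+```r
+# 10 infections observed over 100 person-days, null rate 0.05 per person-day
+p_exact <- poisson.test(10, T = 100, r = 0.05, alternative = "greater")$p.value
+# Upper-tail probability P(X >= 10) when the expected count is 5
+p_tail <- ppois(9, 5, lower.tail = FALSE)
+all.equal(p_exact, p_tail)
+```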
+
+
+
+
diff --git a/06_StatisticalInference/03_03_pValues/index.pdf b/06_StatisticalInference/03_03_pValues/index.pdf
index 85d8bd9d1..ba31db25c 100644
Binary files a/06_StatisticalInference/03_03_pValues/index.pdf and b/06_StatisticalInference/03_03_pValues/index.pdf differ
diff --git a/06_StatisticalInference/03_04_Power/assets/fig/unnamed-chunk-2.png b/06_StatisticalInference/03_04_Power/assets/fig/unnamed-chunk-2.png
new file mode 100644
index 000000000..c1c383311
Binary files /dev/null and b/06_StatisticalInference/03_04_Power/assets/fig/unnamed-chunk-2.png differ
diff --git a/06_StatisticalInference/03_04_Power/index.Rmd b/06_StatisticalInference/03_04_Power/index.Rmd
index ae0446258..53898e8cc 100644
--- a/06_StatisticalInference/03_04_Power/index.Rmd
+++ b/06_StatisticalInference/03_04_Power/index.Rmd
@@ -1,153 +1,137 @@
----
-title       : Power
-subtitle    : Statistical Inference
-author      : Brian Caffo, Jeff Leek, Roger Peng
-job         : Johns Hopkins Bloomberg School of Public Health
-logo        : bloomberg_shield.png
-framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
-highlighter : highlight.js  # {highlight.js, prettify, highlight}
-hitheme     : tomorrow      # 
-url:
-  lib: ../../librariesNew
-  assets: ../../assets
-widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
-mode        : selfcontained # {standalone, draft}
----
-```{r setup, cache = F, echo = F, message = F, warning = F, tidy = F, results='hide'}
-# make this an external chunk that can be included in any file
-options(width = 100)
-opts_chunk$set(message = F, error = F, warning = F, comment = NA, fig.align = 'center', dpi = 100, tidy = F, cache.path = '.cache/', fig.path = 'fig/')
-
-options(xtable.type = 'html')
-knit_hooks$set(inline = function(x) {
-  if(is.numeric(x)) {
-    round(x, getOption('digits'))
-  } else {
-    paste(as.character(x), collapse = ', ')
-  }
-})
-knit_hooks$set(plot = knitr:::hook_plot_html)
-runif(1)
-```
-
-## Power
-- Power is the probability of rejecting the null hypothesis when it is false
-- Ergo, power (as it's name would suggest) is a good thing; you want more power
-- A type II error (a bad thing, as its name would suggest) is failing to reject the null hypothesis when it's false; the probability of a type II error is usually called $\beta$
-- Note Power  $= 1 - \beta$
-
----
-## Notes
-- Consider our previous example involving RDI
-- $H_0: \mu = 30$ versus $H_a: \mu > 30$
-- Then power is 
-$$P\left(\frac{\bar X - 30}{s /\sqrt{n}} > t_{1-\alpha,n-1} ~|~ \mu = \mu_a \right)$$
-- Note that this is a function that depends on the specific value of $\mu_a$!
-- Notice as $\mu_a$ approaches $30$ the power approaches $\alpha$
-
-
----
-## Calculating power for Gaussian data
-Assume that $n$ is large and that we know $\sigma$
-$$
-\begin{align}
-1 -\beta & = 
-P\left(\frac{\bar X - 30}{\sigma /\sqrt{n}} > z_{1-\alpha} ~|~ \mu = \mu_a \right)\\
-& = P\left(\frac{\bar X - \mu_a + \mu_a - 30}{\sigma /\sqrt{n}} > z_{1-\alpha} ~|~ \mu = \mu_a \right)\\ \\
-& = P\left(\frac{\bar X - \mu_a}{\sigma /\sqrt{n}} > z_{1-\alpha} - \frac{\mu_a - 30}{\sigma /\sqrt{n}} ~|~ \mu = \mu_a \right)\\ \\
-& = P\left(Z > z_{1-\alpha} - \frac{\mu_a - 30}{\sigma /\sqrt{n}} ~|~ \mu = \mu_a \right)\\ \\
-\end{align}
-$$
-
----
-## Example continued
--  Suppose that we wanted to detect a increase in mean RDI
-  of at least 2 events / hour (above 30). 
-- Assume normality and that the sample in question will have a standard deviation of $4$;
-- What would be the power if we took a sample size of $16$?
-  - $Z_{1-\alpha} = 1.645$ 
-  - $\frac{\mu_a - 30}{\sigma /\sqrt{n}} = 2 / (4 /\sqrt{16}) = 2$ 
-  - $P(Z > 1.645 - 2) = P(Z > -0.355) = 64\%$
-```{r}
-pnorm(-0.355, lower.tail = FALSE)
-```
-
----
-## Note
-- Consider $H_0 : \mu = \mu_0$ and $H_a : \mu > \mu_0$ with $\mu = \mu_a$ under $H_a$.
-- Under $H_0$ the statistic $Z = \frac{\sqrt{n}(\bar X - \mu_0)}{\sigma}$ is $N(0, 1)$
-- Under $H_a$ $Z$ is $N\left( \frac{\sqrt{n}(\mu_a - \mu_0)}{\sigma}, 1\right)$
-- We reject if $Z > Z_{1-\alpha}$
-
-```
-sigma <- 10; mu_0 = 0; mu_a = 2; n <- 100; alpha = .05
-plot(c(-3, 6),c(0, dnorm(0)), type = "n", frame = false, xlab = "Z value", ylab = "")
-xvals <- seq(-3, 6, length = 1000)
-lines(xvals, dnorm(xvals), type = "l", lwd = 3)
-lines(xvals, dnorm(xvals, mean = sqrt(n) * (mu_a - mu_0) / sigma), lwd =3)
-abline(v = qnorm(1 - alpha))
-```
-
----
-```{r, fig.height=5, fig.width=5, echo = FALSE}
-sigma <- 10; mu_0 = 0; mu_a = 2; n <- 100; alpha = .05
-plot(c(-3, 6),c(0, dnorm(0)), type = "n", frame = false, xlab = "Z value", ylab = "")
-xvals <- seq(-3, 6, length = 1000)
-lines(xvals, dnorm(xvals), type = "l", lwd = 3)
-lines(xvals, dnorm(xvals, mean = sqrt(n) * (mu_a - mu_0) / sigma), lwd =3)
-abline(v = qnorm(1 - alpha))
-```
-
-
----
-## Question
-- When testing $H_a : \mu > \mu_0$, notice if power is $1 - \beta$, then 
-$$1 - \beta = P\left(Z > z_{1-\alpha} - \frac{\mu_a - \mu_0}{\sigma /\sqrt{n}} ~|~ \mu = \mu_a \right) = P(Z > z_{\beta})$$
-- This yields the equation
-$$z_{1-\alpha} - \frac{\sqrt{n}(\mu_a - \mu_0)}{\sigma} = z_{\beta}$$
-- Unknowns: $\mu_a$, $\sigma$, $n$, $\beta$
-- Knowns: $\mu_0$, $\alpha$
-- Specify any 3 of the unknowns and you can solve for the remainder
-
----
-## Notes
-- The calculation for $H_a:\mu < \mu_0$ is similar
-- For $H_a: \mu \neq \mu_0$ calculate the one sided power using
-  $\alpha / 2$ (this is only approximately right, it excludes the probability of
-  getting a large TS in the opposite direction of the truth)
-- Power goes up as $\alpha$ gets larger
-- Power of a one sided test is greater than the power of the
-  associated two sided test
-- Power goes up as $\mu_1$ gets further away from $\mu_0$
-- Power goes up as $n$ goes up
-- Power doesn't need $\mu_a$, $\sigma$ and $n$, instead only $\frac{\sqrt{n}(\mu_a - \mu_0)}{\sigma}$
-  - The quantity $\frac{\mu_a - \mu_0}{\sigma}$ is called the effect size, the difference in the means in standard deviation units.
-  - Being unit free, it has some hope of interpretability across settings
-
----
-## T-test power
--  Consider calculating power for a Gossett's $T$ test for our example
--  The power is
-  $$
-  P\left(\frac{\bar X - \mu_0}{S /\sqrt{n}} > t_{1-\alpha, n-1} ~|~ \mu = \mu_a \right)
-  $$
-- Calcuting this requires the non-central t distribution.
-- `power.t.test` does this very well
-  - Omit one of the arguments and it solves for it
-
----
-## Example
-```{r}
-power.t.test(n = 16, delta = 2 / 4, sd=1, type = "one.sample",  alt = "one.sided")$power
-power.t.test(n = 16, delta = 2, sd=4, type = "one.sample",  alt = "one.sided")$power
-power.t.test(n = 16, delta = 100, sd=200, type = "one.sample", alt = "one.sided")$power
-```
-
----
-## Example
-```{r}
-power.t.test(power = .8, delta = 2 / 4, sd=1, type = "one.sample",  alt = "one.sided")$n
-power.t.test(power = .8, delta = 2, sd=4, type = "one.sample",  alt = "one.sided")$n
-power.t.test(power = .8, delta = 100, sd=200, type = "one.sample", alt = "one.sided")$n
-```
-
+---
+title       : Power
+subtitle    : Statistical Inference
+author      : Brian Caffo, Jeff Leek, Roger Peng
+job         : Johns Hopkins Bloomberg School of Public Health
+logo        : bloomberg_shield.png
+framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
+highlighter : highlight.js  # {highlight.js, prettify, highlight}
+hitheme     : tomorrow      # 
+url:
+  lib: ../../librariesNew
+  assets: ../../assets
+widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
+mode        : selfcontained # {standalone, draft}
+---
+
+## Power
+- Power is the probability of rejecting the null hypothesis when it is false
+- Ergo, power (as its name would suggest) is a good thing; you want more power
+- A type II error (a bad thing, as its name would suggest) is failing to reject the null hypothesis when it's false; the probability of a type II error is usually called $\beta$
+- Note Power  $= 1 - \beta$
+
+---
+## Notes
+- Consider our previous example involving RDI
+- $H_0: \mu = 30$ versus $H_a: \mu > 30$
+- Then power is 
+$$P\left(\frac{\bar X - 30}{s /\sqrt{n}} > t_{1-\alpha,n-1} ~|~ \mu = \mu_a \right)$$
+- Note that this is a function that depends on the specific value of $\mu_a$!
+- Notice as $\mu_a$ approaches $30$ the power approaches $\alpha$
+
+
+---
+## Calculating power for Gaussian data
+Assume that $n$ is large and that we know $\sigma$
+$$
+\begin{align}
+1 -\beta & = 
+P\left(\frac{\bar X - 30}{\sigma /\sqrt{n}} > z_{1-\alpha} ~|~ \mu = \mu_a \right)\\
+& = P\left(\frac{\bar X - \mu_a + \mu_a - 30}{\sigma /\sqrt{n}} > z_{1-\alpha} ~|~ \mu = \mu_a \right)\\ \\
+& = P\left(\frac{\bar X - \mu_a}{\sigma /\sqrt{n}} > z_{1-\alpha} - \frac{\mu_a - 30}{\sigma /\sqrt{n}} ~|~ \mu = \mu_a \right)\\ \\
+& = P\left(Z > z_{1-\alpha} - \frac{\mu_a - 30}{\sigma /\sqrt{n}} ~|~ \mu = \mu_a \right)\\ \\
+\end{align}
+$$
+
+---
+## Example continued
+-  Suppose that we wanted to detect an increase in mean RDI
+  of at least 2 events / hour (above 30). 
+- Assume normality and that the sample in question will have a standard deviation of $4$.
+- What would be the power if we took a sample size of $16$?
+  - $z_{1-\alpha} = 1.645$ 
+  - $\frac{\mu_a - 30}{\sigma /\sqrt{n}} = 2 / (4 /\sqrt{16}) = 2$ 
+  - $P(Z > 1.645 - 2) = P(Z > -0.355) = 64\%$
+```{r}
+pnorm(-0.355, lower.tail = FALSE)
+```
+
+---
+## Note 
+- Consider $H_0 : \mu = \mu_0$ and $H_a : \mu > \mu_0$ with $\mu = \mu_a$ under $H_a$.
+- Under $H_0$ the statistic $Z = \frac{\sqrt{n}(\bar X - \mu_0)}{\sigma}$ is $N(0, 1)$
+- Under $H_a$ $Z$ is $N\left( \frac{\sqrt{n}(\mu_a - \mu_0)}{\sigma}, 1\right)$
+- We reject if $Z > z_{1-\alpha}$
+
+```
+sigma <- 10; mu_0 = 0; mu_a = 2; n <- 100; alpha = .05
+plot(c(-3, 6),c(0, dnorm(0)), type = "n", frame = FALSE, xlab = "Z value", ylab = "")
+xvals <- seq(-3, 6, length = 1000)
+lines(xvals, dnorm(xvals), type = "l", lwd = 3)
+lines(xvals, dnorm(xvals, mean = sqrt(n) * (mu_a - mu_0) / sigma), lwd =3)
+abline(v = qnorm(1 - alpha))
+```
+
+---
+```{r, fig.height=5, fig.width=5, echo = FALSE}
+sigma <- 10; mu_0 = 0; mu_a = 2; n <- 100; alpha = .05
+plot(c(-3, 6),c(0, dnorm(0)), type = "n", frame = FALSE, xlab = "Z value", ylab = "")
+xvals <- seq(-3, 6, length = 1000)
+lines(xvals, dnorm(xvals), type = "l", lwd = 3)
+lines(xvals, dnorm(xvals, mean = sqrt(n) * (mu_a - mu_0) / sigma), lwd =3)
+abline(v = qnorm(1 - alpha))
+```
+
+
+---
+## Question
+- When testing $H_a : \mu > \mu_0$, notice if power is $1 - \beta$, then 
+$$1 - \beta = P\left(Z > z_{1-\alpha} - \frac{\mu_a - \mu_0}{\sigma /\sqrt{n}} ~|~ \mu = \mu_a \right) = P(Z > z_{\beta})$$
+- This yields the equation
+$$z_{1-\alpha} - \frac{\sqrt{n}(\mu_a - \mu_0)}{\sigma} = z_{\beta}$$
+- Unknowns: $\mu_a$, $\sigma$, $n$, $\beta$
+- Knowns: $\mu_0$, $\alpha$
+- Specify any 3 of the unknowns and you can solve for the remainder
+
+---
+## Notes
+- The calculation for $H_a:\mu < \mu_0$ is similar
+- For $H_a: \mu \neq \mu_0$ calculate the one sided power using
+  $\alpha / 2$ (this is only approximately right, since it excludes the probability of
+  getting a large test statistic in the opposite direction of the truth)
+- Power goes up as $\alpha$ gets larger
+- Power of a one sided test is greater than the power of the
+  associated two sided test
+- Power goes up as $\mu_a$ gets further away from $\mu_0$
+- Power goes up as $n$ goes up
+- Power doesn't depend on $\mu_a$, $\sigma$ and $n$ separately, only on $\frac{\sqrt{n}(\mu_a - \mu_0)}{\sigma}$
+  - The quantity $\frac{\mu_a - \mu_0}{\sigma}$ is called the effect size, the difference in the means in standard deviation units.
+  - Being unit free, it has some hope of interpretability across settings
+
+---
+## T-test power
+-  Consider calculating power for Gosset's $T$ test for our example
+-  The power is
+  $$
+  P\left(\frac{\bar X - \mu_0}{S /\sqrt{n}} > t_{1-\alpha, n-1} ~|~ \mu = \mu_a \right)
+  $$
+- Calculating this requires the non-central t distribution.
+- `power.t.test` does this very well
+  - Omit one of the arguments and it solves for it
+
+---
+## Example
+```{r}
+power.t.test(n = 16, delta = 2 / 4, sd=1, type = "one.sample",  alt = "one.sided")$power
+power.t.test(n = 16, delta = 2, sd=4, type = "one.sample",  alt = "one.sided")$power
+power.t.test(n = 16, delta = 100, sd=200, type = "one.sample", alt = "one.sided")$power
+```
+
+---
+## Example
+```{r}
+power.t.test(power = .8, delta = 2 / 4, sd=1, type = "one.sample",  alt = "one.sided")$n
+power.t.test(power = .8, delta = 2, sd=4, type = "one.sample",  alt = "one.sided")$n
+power.t.test(power = .8, delta = 100, sd=200, type = "one.sample", alt = "one.sided")$n
+```
+
diff --git a/06_StatisticalInference/03_04_Power/index.html b/06_StatisticalInference/03_04_Power/index.html
index 84415fc6d..6465ed50f 100644
--- a/06_StatisticalInference/03_04_Power/index.html
+++ b/06_StatisticalInference/03_04_Power/index.html
@@ -1,382 +1,382 @@
-<!DOCTYPE html>
-<html>
-<head>
-  <title>Power</title>
-  <meta charset="utf-8">
-  <meta name="description" content="Power">
-  <meta name="author" content="Brian Caffo, Jeff Leek, Roger Peng">
-  <meta name="generator" content="slidify" />
-  <meta name="apple-mobile-web-app-capable" content="yes">
-  <meta http-equiv="X-UA-Compatible" content="chrome=1">
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/default.css" media="all" >
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/phone.css" 
-    media="only screen and (max-device-width: 480px)" >
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/slidify.css" >
-  <link rel="stylesheet" href="../../librariesNew/highlighters/highlight.js/css/tomorrow.css" />
-  <base target="_blank"> <!-- This amazingness opens all links in a new tab. -->  
-  
-  <!-- Grab CDN jQuery, fall back to local if offline -->
-  <script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.7.min.js"></script>
-  <script>window.jQuery || document.write('<script src="../../librariesNew/widgets/quiz/js/jquery.js"><\/script>')</script> 
-  <script data-main="../../librariesNew/frameworks/io2012/js/slides" 
-    src="../../librariesNew/frameworks/io2012/js/require-1.0.8.min.js">
-  </script>
-  
-  
-
-</head>
-<body style="opacity: 0">
-  <slides class="layout-widescreen">
-    
-    <!-- LOGO SLIDE -->
-        <slide class="title-slide segue nobackground">
-  <aside class="gdbar">
-    <img src="../../assets/img/bloomberg_shield.png">
-  </aside>
-  <hgroup class="auto-fadein">
-    <h1>Power</h1>
-    <h2>Statistical Inference</h2>
-    <p>Brian Caffo, Jeff Leek, Roger Peng<br/>Johns Hopkins Bloomberg School of Public Health</p>
-  </hgroup>
-  <article></article>  
-</slide>
-    
-
-    <!-- SLIDES -->
-    <slide class="" id="slide-1" style="background:;">
-  <hgroup>
-    <h2>Power</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Power is the probability of rejecting the null hypothesis when it is false</li>
-<li>Ergo, power (as it&#39;s name would suggest) is a good thing; you want more power</li>
-<li>A type II error (a bad thing, as its name would suggest) is failing to reject the null hypothesis when it&#39;s false; the probability of a type II error is usually called \(\beta\)</li>
-<li>Note Power  \(= 1 - \beta\)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-2" style="background:;">
-  <hgroup>
-    <h2>Notes</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Consider our previous example involving RDI</li>
-<li>\(H_0: \mu = 30\) versus \(H_a: \mu > 30\)</li>
-<li>Then power is 
-\[P\left(\frac{\bar X - 30}{s /\sqrt{n}} > t_{1-\alpha,n-1} ~|~ \mu = \mu_a \right)\]</li>
-<li>Note that this is a function that depends on the specific value of \(\mu_a\)!</li>
-<li>Notice as \(\mu_a\) approaches \(30\) the power approaches \(\alpha\)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-3" style="background:;">
-  <hgroup>
-    <h2>Calculating power for Gaussian data</h2>
-  </hgroup>
-  <article data-timings="">
-    <p>Assume that \(n\) is large and that we know \(\sigma\)
-\[
-\begin{align}
-1 -\beta & = 
-P\left(\frac{\bar X - 30}{\sigma /\sqrt{n}} > z_{1-\alpha} ~|~ \mu = \mu_a \right)\\
-& = P\left(\frac{\bar X - \mu_a + \mu_a - 30}{\sigma /\sqrt{n}} > z_{1-\alpha} ~|~ \mu = \mu_a \right)\\ \\
-& = P\left(\frac{\bar X - \mu_a}{\sigma /\sqrt{n}} > z_{1-\alpha} - \frac{\mu_a - 30}{\sigma /\sqrt{n}} ~|~ \mu = \mu_a \right)\\ \\
-& = P\left(Z > z_{1-\alpha} - \frac{\mu_a - 30}{\sigma /\sqrt{n}} ~|~ \mu = \mu_a \right)\\ \\
-\end{align}
-\]</p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-4" style="background:;">
-  <hgroup>
-    <h2>Example continued</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li> Suppose that we wanted to detect a increase in mean RDI
-of at least 2 events / hour (above 30). </li>
-<li>Assume normality and that the sample in question will have a standard deviation of \(4\);</li>
-<li>What would be the power if we took a sample size of \(16\)?
-
-<ul>
-<li>\(Z_{1-\alpha} = 1.645\) </li>
-<li>\(\frac{\mu_a - 30}{\sigma /\sqrt{n}} = 2 / (4 /\sqrt{16}) = 2\) </li>
-<li>\(P(Z > 1.645 - 2) = P(Z > -0.355) = 64\%\)</li>
-</ul></li>
-</ul>
-
-<pre><code class="r">pnorm(-0.355, lower.tail = FALSE)
-</code></pre>
-
-<pre><code>[1] 0.6387
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-5" style="background:;">
-  <hgroup>
-    <h2>Note</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Consider \(H_0 : \mu = \mu_0\) and \(H_a : \mu > \mu_0\) with \(\mu = \mu_a\) under \(H_a\).</li>
-<li>Under \(H_0\) the statistic \(Z = \frac{\sqrt{n}(\bar X - \mu_0)}{\sigma}\) is \(N(0, 1)\)</li>
-<li>Under \(H_a\) \(Z\) is \(N\left( \frac{\sqrt{n}(\mu_a - \mu_0)}{\sigma}, 1\right)\)</li>
-<li>We reject if \(Z > Z_{1-\alpha}\)</li>
-</ul>
-
-<pre><code>sigma &lt;- 10; mu_0 = 0; mu_a = 2; n &lt;- 100; alpha = .05
-plot(c(-3, 6),c(0, dnorm(0)), type = &quot;n&quot;, frame = false, xlab = &quot;Z value&quot;, ylab = &quot;&quot;)
-xvals &lt;- seq(-3, 6, length = 1000)
-lines(xvals, dnorm(xvals), type = &quot;l&quot;, lwd = 3)
-lines(xvals, dnorm(xvals, mean = sqrt(n) * (mu_a - mu_0) / sigma), lwd =3)
-abline(v = qnorm(1 - alpha))
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-6" style="background:;">
-  <article data-timings="">
-    <div class="rimage center"><img src="fig/unnamed-chunk-2.png" title="plot of chunk unnamed-chunk-2" alt="plot of chunk unnamed-chunk-2" class="plot" /></div>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-7" style="background:;">
-  <hgroup>
-    <h2>Question</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>When testing \(H_a : \mu > \mu_0\), notice if power is \(1 - \beta\), then 
-\[1 - \beta = P\left(Z > z_{1-\alpha} - \frac{\mu_a - \mu_0}{\sigma /\sqrt{n}} ~|~ \mu = \mu_a \right) = P(Z > z_{\beta})\]</li>
-<li>This yields the equation
-\[z_{1-\alpha} - \frac{\sqrt{n}(\mu_a - \mu_0)}{\sigma} = z_{\beta}\]</li>
-<li>Unknowns: \(\mu_a\), \(\sigma\), \(n\), \(\beta\)</li>
-<li>Knowns: \(\mu_0\), \(\alpha\)</li>
-<li>Specify any 3 of the unknowns and you can solve for the remainder</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-8" style="background:;">
-  <hgroup>
-    <h2>Notes</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>The calculation for \(H_a:\mu < \mu_0\) is similar</li>
-<li>For \(H_a: \mu \neq \mu_0\) calculate the one sided power using
-\(\alpha / 2\) (this is only approximately right, it excludes the probability of
-getting a large TS in the opposite direction of the truth)</li>
-<li>Power goes up as \(\alpha\) gets larger</li>
-<li>Power of a one sided test is greater than the power of the
-associated two sided test</li>
-<li>Power goes up as \(\mu_1\) gets further away from \(\mu_0\)</li>
-<li>Power goes up as \(n\) goes up</li>
-<li>Power doesn&#39;t need \(\mu_a\), \(\sigma\) and \(n\), instead only \(\frac{\sqrt{n}(\mu_a - \mu_0)}{\sigma}\)
-
-<ul>
-<li>The quantity \(\frac{\mu_a - \mu_0}{\sigma}\) is called the effect size, the difference in the means in standard deviation units.</li>
-<li>Being unit free, it has some hope of interpretability across settings</li>
-</ul></li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-9" style="background:;">
-  <hgroup>
-    <h2>T-test power</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li> Consider calculating power for a Gossett&#39;s \(T\) test for our example</li>
-<li> The power is
-\[
-P\left(\frac{\bar X - \mu_0}{S /\sqrt{n}} > t_{1-\alpha, n-1} ~|~ \mu = \mu_a \right)
-\]</li>
-<li>Calcuting this requires the non-central t distribution.</li>
-<li><code>power.t.test</code> does this very well
-
-<ul>
-<li>Omit one of the arguments and it solves for it</li>
-</ul></li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-10" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <pre><code class="r">power.t.test(n = 16, delta = 2 / 4, sd=1, type = &quot;one.sample&quot;,  alt = &quot;one.sided&quot;)$power
-</code></pre>
-
-<pre><code>[1] 0.604
-</code></pre>
-
-<pre><code class="r">power.t.test(n = 16, delta = 2, sd=4, type = &quot;one.sample&quot;,  alt = &quot;one.sided&quot;)$power
-</code></pre>
-
-<pre><code>[1] 0.604
-</code></pre>
-
-<pre><code class="r">power.t.test(n = 16, delta = 100, sd=200, type = &quot;one.sample&quot;, alt = &quot;one.sided&quot;)$power
-</code></pre>
-
-<pre><code>[1] 0.604
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-11" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <pre><code class="r">power.t.test(power = .8, delta = 2 / 4, sd=1, type = &quot;one.sample&quot;,  alt = &quot;one.sided&quot;)$n
-</code></pre>
-
-<pre><code>[1] 26.14
-</code></pre>
-
-<pre><code class="r">power.t.test(power = .8, delta = 2, sd=4, type = &quot;one.sample&quot;,  alt = &quot;one.sided&quot;)$n
-</code></pre>
-
-<pre><code>[1] 26.14
-</code></pre>
-
-<pre><code class="r">power.t.test(power = .8, delta = 100, sd=200, type = &quot;one.sample&quot;, alt = &quot;one.sided&quot;)$n
-</code></pre>
-
-<pre><code>[1] 26.14
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-    <slide class="backdrop"></slide>
-  </slides>
-  <div class="pagination pagination-small" id='io2012-ptoc' style="display:none;">
-    <ul>
-      <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=1 title='Power'>
-         1
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=2 title='Notes'>
-         2
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=3 title='Calculating power for Gaussian data'>
-         3
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=4 title='Example continued'>
-         4
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=5 title='Note'>
-         5
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=6 title=''>
-         6
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=7 title='Question'>
-         7
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=8 title='Notes'>
-         8
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=9 title='T-test power'>
-         9
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=10 title='Example'>
-         10
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=11 title='Example'>
-         11
-      </a>
-    </li>
-  </ul>
-  </div>  <!--[if IE]>
-    <script 
-      src="http://ajax.googleapis.com/ajax/libs/chrome-frame/1/CFInstall.min.js">  
-    </script>
-    <script>CFInstall.check({mode: 'overlay'});</script>
-  <![endif]-->
-</body>
-  <!-- Load Javascripts for Widgets -->
-  
-  <!-- MathJax: Fall back to local if CDN offline but local image fonts are not supported (saves >100MB) -->
-  <script type="text/x-mathjax-config">
-    MathJax.Hub.Config({
-      tex2jax: {
-        inlineMath: [['$','$'], ['\\(','\\)']],
-        processEscapes: true
-      }
-    });
-  </script>
-  <script type="text/javascript" src="http://cdn.mathjax.org/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
-  <!-- <script src="https://c328740.ssl.cf1.rackcdn.com/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
-  </script> -->
-  <script>window.MathJax || document.write('<script type="text/x-mathjax-config">MathJax.Hub.Config({"HTML-CSS":{imageFont:null}});<\/script><script src="../../librariesNew/widgets/mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"><\/script>')
-</script>
-<!-- LOAD HIGHLIGHTER JS FILES -->
-  <script src="../../librariesNew/highlighters/highlight.js/highlight.pack.js"></script>
-  <script>hljs.initHighlightingOnLoad();</script>
-  <!-- DONE LOADING HIGHLIGHTER JS FILES -->
-   
+<!DOCTYPE html>
+<html>
+<head>
+  <title>Power</title>
+  <meta charset="utf-8">
+  <meta name="description" content="Power">
+  <meta name="author" content="Brian Caffo, Jeff Leek, Roger Peng">
+  <meta name="generator" content="slidify" />
+  <meta name="apple-mobile-web-app-capable" content="yes">
+  <meta http-equiv="X-UA-Compatible" content="chrome=1">
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/default.css" media="all" >
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/phone.css" 
+    media="only screen and (max-device-width: 480px)" >
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/slidify.css" >
+  <link rel="stylesheet" href="../../librariesNew/highlighters/highlight.js/css/tomorrow.css" />
+  <base target="_blank"> <!-- This amazingness opens all links in a new tab. -->  
+  
+  <!-- Grab CDN jQuery, fall back to local if offline -->
+  <script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.7.min.js"></script>
+  <script>window.jQuery || document.write('<script src="../../librariesNew/widgets/quiz/js/jquery.js"><\/script>')</script> 
+  <script data-main="../../librariesNew/frameworks/io2012/js/slides" 
+    src="../../librariesNew/frameworks/io2012/js/require-1.0.8.min.js">
+  </script>
+  
+  
+
+</head>
+<body style="opacity: 0">
+  <slides class="layout-widescreen">
+    
+    <!-- LOGO SLIDE -->
+        <slide class="title-slide segue nobackground">
+  <aside class="gdbar">
+    <img src="../../assets/img/bloomberg_shield.png">
+  </aside>
+  <hgroup class="auto-fadein">
+    <h1>Power</h1>
+    <h2>Statistical Inference</h2>
+    <p>Brian Caffo, Jeff Leek, Roger Peng<br/>Johns Hopkins Bloomberg School of Public Health</p>
+  </hgroup>
+  <article></article>  
+</slide>
+    
+
+    <!-- SLIDES -->
+    <slide class="" id="slide-1" style="background:;">
+  <hgroup>
+    <h2>Power</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Power is the probability of rejecting the null hypothesis when it is false</li>
+<li>Ergo, power (as its name would suggest) is a good thing; you want more power</li>
+<li>A type II error (a bad thing, as its name would suggest) is failing to reject the null hypothesis when it&#39;s false; the probability of a type II error is usually called \(\beta\)</li>
+<li>Note Power  \(= 1 - \beta\)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-2" style="background:;">
+  <hgroup>
+    <h2>Notes</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Consider our previous example involving RDI</li>
+<li>\(H_0: \mu = 30\) versus \(H_a: \mu > 30\)</li>
+<li>Then power is 
+\[P\left(\frac{\bar X - 30}{s /\sqrt{n}} > t_{1-\alpha,n-1} ~|~ \mu = \mu_a \right)\]</li>
+<li>Note that this is a function that depends on the specific value of \(\mu_a\)!</li>
+<li>Notice as \(\mu_a\) approaches \(30\) the power approaches \(\alpha\)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-3" style="background:;">
+  <hgroup>
+    <h2>Calculating power for Gaussian data</h2>
+  </hgroup>
+  <article data-timings="">
+    <p>Assume that \(n\) is large and that we know \(\sigma\)
+\[
+\begin{align}
+1 -\beta & = 
+P\left(\frac{\bar X - 30}{\sigma /\sqrt{n}} > z_{1-\alpha} ~|~ \mu = \mu_a \right)\\
+& = P\left(\frac{\bar X - \mu_a + \mu_a - 30}{\sigma /\sqrt{n}} > z_{1-\alpha} ~|~ \mu = \mu_a \right)\\ \\
+& = P\left(\frac{\bar X - \mu_a}{\sigma /\sqrt{n}} > z_{1-\alpha} - \frac{\mu_a - 30}{\sigma /\sqrt{n}} ~|~ \mu = \mu_a \right)\\ \\
+& = P\left(Z > z_{1-\alpha} - \frac{\mu_a - 30}{\sigma /\sqrt{n}} ~|~ \mu = \mu_a \right)\\ \\
+\end{align}
+\]</p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-4" style="background:;">
+  <hgroup>
+    <h2>Example continued</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li> Suppose that we wanted to detect an increase in mean RDI
+of at least 2 events / hour (above 30). </li>
+<li>Assume normality and that the sample in question will have a standard deviation of \(4\).</li>
+<li>What would be the power if we took a sample size of \(16\)?
+
+<ul>
+<li>\(z_{1-\alpha} = 1.645\) </li>
+<li>\(\frac{\mu_a - 30}{\sigma /\sqrt{n}} = 2 / (4 /\sqrt{16}) = 2\) </li>
+<li>\(P(Z > 1.645 - 2) = P(Z > -0.355) = 64\%\)</li>
+</ul></li>
+</ul>
+
+<pre><code class="r">pnorm(-0.355, lower.tail = FALSE)
+</code></pre>
+
+<pre><code>## [1] 0.6387
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-5" style="background:;">
+  <hgroup>
+    <h2>Note</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Consider \(H_0 : \mu = \mu_0\) and \(H_a : \mu > \mu_0\) with \(\mu = \mu_a\) under \(H_a\).</li>
+<li>Under \(H_0\) the statistic \(Z = \frac{\sqrt{n}(\bar X - \mu_0)}{\sigma}\) is \(N(0, 1)\)</li>
+<li>Under \(H_a\) \(Z\) is \(N\left( \frac{\sqrt{n}(\mu_a - \mu_0)}{\sigma}, 1\right)\)</li>
+<li>We reject if \(Z > z_{1-\alpha}\)</li>
+</ul>
+
+<pre><code>sigma &lt;- 10; mu_0 = 0; mu_a = 2; n &lt;- 100; alpha = .05
+plot(c(-3, 6),c(0, dnorm(0)), type = &quot;n&quot;, frame = FALSE, xlab = &quot;Z value&quot;, ylab = &quot;&quot;)
+xvals &lt;- seq(-3, 6, length = 1000)
+lines(xvals, dnorm(xvals), type = &quot;l&quot;, lwd = 3)
+lines(xvals, dnorm(xvals, mean = sqrt(n) * (mu_a - mu_0) / sigma), lwd =3)
+abline(v = qnorm(1 - alpha))
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-6" style="background:;">
+  <article data-timings="">
+    <p><img src="assets/fig/unnamed-chunk-2.png" alt="plot of chunk unnamed-chunk-2"> </p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-7" style="background:;">
+  <hgroup>
+    <h2>Question</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>When testing \(H_a : \mu > \mu_0\), notice if power is \(1 - \beta\), then 
+\[1 - \beta = P\left(Z > z_{1-\alpha} - \frac{\mu_a - \mu_0}{\sigma /\sqrt{n}} ~|~ \mu = \mu_a \right) = P(Z > z_{\beta})\]</li>
+<li>This yields the equation
+\[z_{1-\alpha} - \frac{\sqrt{n}(\mu_a - \mu_0)}{\sigma} = z_{\beta}\]</li>
+<li>Unknowns: \(\mu_a\), \(\sigma\), \(n\), \(\beta\)</li>
+<li>Knowns: \(\mu_0\), \(\alpha\)</li>
+<li>Specify any 3 of the unknowns and you can solve for the remainder</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-8" style="background:;">
+  <hgroup>
+    <h2>Notes</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>The calculation for \(H_a:\mu < \mu_0\) is similar</li>
+<li>For \(H_a: \mu \neq \mu_0\) calculate the one sided power using
+\(\alpha / 2\) (this is only approximately right, since it excludes the probability of
+getting a large test statistic in the opposite direction of the truth)</li>
+<li>Power goes up as \(\alpha\) gets larger</li>
+<li>Power of a one sided test is greater than the power of the
+associated two sided test</li>
+<li>Power goes up as \(\mu_a\) gets further away from \(\mu_0\)</li>
+<li>Power goes up as \(n\) goes up</li>
+<li>Power doesn&#39;t depend on \(\mu_a\), \(\sigma\) and \(n\) separately, only on \(\frac{\sqrt{n}(\mu_a - \mu_0)}{\sigma}\)
+
+<ul>
+<li>The quantity \(\frac{\mu_a - \mu_0}{\sigma}\) is called the effect size, the difference in the means in standard deviation units.</li>
+<li>Being unit free, it has some hope of interpretability across settings</li>
+</ul></li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-9" style="background:;">
+  <hgroup>
+    <h2>T-test power</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li> Consider calculating power for Gosset&#39;s \(T\) test for our example</li>
+<li> The power is
+\[
+P\left(\frac{\bar X - \mu_0}{S /\sqrt{n}} > t_{1-\alpha, n-1} ~|~ \mu = \mu_a \right)
+\]</li>
+<li>Calculating this requires the non-central t distribution.</li>
+<li><code>power.t.test</code> does this very well
+
+<ul>
+<li>Omit one of the arguments and it solves for it</li>
+</ul></li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-10" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
+    <pre><code class="r">power.t.test(n = 16, delta = 2/4, sd = 1, type = &quot;one.sample&quot;, alt = &quot;one.sided&quot;)$power
+</code></pre>
+
+<pre><code>## [1] 0.604
+</code></pre>
+
+<pre><code class="r">power.t.test(n = 16, delta = 2, sd = 4, type = &quot;one.sample&quot;, alt = &quot;one.sided&quot;)$power
+</code></pre>
+
+<pre><code>## [1] 0.604
+</code></pre>
+
+<pre><code class="r">power.t.test(n = 16, delta = 100, sd = 200, type = &quot;one.sample&quot;, alt = &quot;one.sided&quot;)$power
+</code></pre>
+
+<pre><code>## [1] 0.604
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-11" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
+    <pre><code class="r">power.t.test(power = 0.8, delta = 2/4, sd = 1, type = &quot;one.sample&quot;, alt = &quot;one.sided&quot;)$n
+</code></pre>
+
+<pre><code>## [1] 26.14
+</code></pre>
+
+<pre><code class="r">power.t.test(power = 0.8, delta = 2, sd = 4, type = &quot;one.sample&quot;, alt = &quot;one.sided&quot;)$n
+</code></pre>
+
+<pre><code>## [1] 26.14
+</code></pre>
+
+<pre><code class="r">power.t.test(power = 0.8, delta = 100, sd = 200, type = &quot;one.sample&quot;, alt = &quot;one.sided&quot;)$n
+</code></pre>
+
+<pre><code>## [1] 26.14
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+    <slide class="backdrop"></slide>
+  </slides>
+  <div class="pagination pagination-small" id='io2012-ptoc' style="display:none;">
+    <ul>
+      <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=1 title='Power'>
+         1
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=2 title='Notes'>
+         2
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=3 title='Calculating power for Gaussian data'>
+         3
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=4 title='Example continued'>
+         4
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=5 title='Note'>
+         5
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=6 title=''>
+         6
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=7 title='Question'>
+         7
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=8 title='Notes'>
+         8
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=9 title='T-test power'>
+         9
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=10 title='Example'>
+         10
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=11 title='Example'>
+         11
+      </a>
+    </li>
+  </ul>
+  </div>  <!--[if IE]>
+    <script 
+      src="http://ajax.googleapis.com/ajax/libs/chrome-frame/1/CFInstall.min.js">  
+    </script>
+    <script>CFInstall.check({mode: 'overlay'});</script>
+  <![endif]-->
+</body>
+  <!-- Load Javascripts for Widgets -->
+  
+  <!-- MathJax: Fall back to local if CDN offline but local image fonts are not supported (saves >100MB) -->
+  <script type="text/x-mathjax-config">
+    MathJax.Hub.Config({
+      tex2jax: {
+        inlineMath: [['$','$'], ['\\(','\\)']],
+        processEscapes: true
+      }
+    });
+  </script>
+  <script type="text/javascript" src="http://cdn.mathjax.org/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
+  <!-- <script src="https://c328740.ssl.cf1.rackcdn.com/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
+  </script> -->
+  <script>window.MathJax || document.write('<script type="text/x-mathjax-config">MathJax.Hub.Config({"HTML-CSS":{imageFont:null}});<\/script><script src="../../librariesNew/widgets/mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"><\/script>')
+</script>
+<!-- LOAD HIGHLIGHTER JS FILES -->
+  <script src="../../librariesNew/highlighters/highlight.js/highlight.pack.js"></script>
+  <script>hljs.initHighlightingOnLoad();</script>
+  <!-- DONE LOADING HIGHLIGHTER JS FILES -->
+   
   </html>
\ No newline at end of file
diff --git a/06_StatisticalInference/03_04_Power/index.md b/06_StatisticalInference/03_04_Power/index.md
index 1b289c777..37484c7b4 100644
--- a/06_StatisticalInference/03_04_Power/index.md
+++ b/06_StatisticalInference/03_04_Power/index.md
@@ -1,179 +1,177 @@
----
-title       : Power
-subtitle    : Statistical Inference
-author      : Brian Caffo, Jeff Leek, Roger Peng
-job         : Johns Hopkins Bloomberg School of Public Health
-logo        : bloomberg_shield.png
-framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
-highlighter : highlight.js  # {highlight.js, prettify, highlight}
-hitheme     : tomorrow      # 
-url:
-  lib: ../../librariesNew
-  assets: ../../assets
-widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
-mode        : selfcontained # {standalone, draft}
----
-
-
-
-## Power
-- Power is the probability of rejecting the null hypothesis when it is false
-- Ergo, power (as it's name would suggest) is a good thing; you want more power
-- A type II error (a bad thing, as its name would suggest) is failing to reject the null hypothesis when it's false; the probability of a type II error is usually called $\beta$
-- Note Power  $= 1 - \beta$
-
----
-## Notes
-- Consider our previous example involving RDI
-- $H_0: \mu = 30$ versus $H_a: \mu > 30$
-- Then power is 
-$$P\left(\frac{\bar X - 30}{s /\sqrt{n}} > t_{1-\alpha,n-1} ~|~ \mu = \mu_a \right)$$
-- Note that this is a function that depends on the specific value of $\mu_a$!
-- Notice as $\mu_a$ approaches $30$ the power approaches $\alpha$
-
-
----
-## Calculating power for Gaussian data
-Assume that $n$ is large and that we know $\sigma$
-$$
-\begin{align}
-1 -\beta & = 
-P\left(\frac{\bar X - 30}{\sigma /\sqrt{n}} > z_{1-\alpha} ~|~ \mu = \mu_a \right)\\
-& = P\left(\frac{\bar X - \mu_a + \mu_a - 30}{\sigma /\sqrt{n}} > z_{1-\alpha} ~|~ \mu = \mu_a \right)\\ \\
-& = P\left(\frac{\bar X - \mu_a}{\sigma /\sqrt{n}} > z_{1-\alpha} - \frac{\mu_a - 30}{\sigma /\sqrt{n}} ~|~ \mu = \mu_a \right)\\ \\
-& = P\left(Z > z_{1-\alpha} - \frac{\mu_a - 30}{\sigma /\sqrt{n}} ~|~ \mu = \mu_a \right)\\ \\
-\end{align}
-$$
-
----
-## Example continued
--  Suppose that we wanted to detect a increase in mean RDI
-  of at least 2 events / hour (above 30). 
-- Assume normality and that the sample in question will have a standard deviation of $4$;
-- What would be the power if we took a sample size of $16$?
-  - $Z_{1-\alpha} = 1.645$ 
-  - $\frac{\mu_a - 30}{\sigma /\sqrt{n}} = 2 / (4 /\sqrt{16}) = 2$ 
-  - $P(Z > 1.645 - 2) = P(Z > -0.355) = 64\%$
-
-```r
-pnorm(-0.355, lower.tail = FALSE)
-```
-
-```
-[1] 0.6387
-```
-
-
----
-## Note
-- Consider $H_0 : \mu = \mu_0$ and $H_a : \mu > \mu_0$ with $\mu = \mu_a$ under $H_a$.
-- Under $H_0$ the statistic $Z = \frac{\sqrt{n}(\bar X - \mu_0)}{\sigma}$ is $N(0, 1)$
-- Under $H_a$ $Z$ is $N\left( \frac{\sqrt{n}(\mu_a - \mu_0)}{\sigma}, 1\right)$
-- We reject if $Z > Z_{1-\alpha}$
-
-```
-sigma <- 10; mu_0 = 0; mu_a = 2; n <- 100; alpha = .05
-plot(c(-3, 6),c(0, dnorm(0)), type = "n", frame = false, xlab = "Z value", ylab = "")
-xvals <- seq(-3, 6, length = 1000)
-lines(xvals, dnorm(xvals), type = "l", lwd = 3)
-lines(xvals, dnorm(xvals, mean = sqrt(n) * (mu_a - mu_0) / sigma), lwd =3)
-abline(v = qnorm(1 - alpha))
-```
-
----
-<div class="rimage center"><img src="fig/unnamed-chunk-2.png" title="plot of chunk unnamed-chunk-2" alt="plot of chunk unnamed-chunk-2" class="plot" /></div>
-
-
-
----
-## Question
-- When testing $H_a : \mu > \mu_0$, notice if power is $1 - \beta$, then 
-$$1 - \beta = P\left(Z > z_{1-\alpha} - \frac{\mu_a - \mu_0}{\sigma /\sqrt{n}} ~|~ \mu = \mu_a \right) = P(Z > z_{\beta})$$
-- This yields the equation
-$$z_{1-\alpha} - \frac{\sqrt{n}(\mu_a - \mu_0)}{\sigma} = z_{\beta}$$
-- Unknowns: $\mu_a$, $\sigma$, $n$, $\beta$
-- Knowns: $\mu_0$, $\alpha$
-- Specify any 3 of the unknowns and you can solve for the remainder
-
----
-## Notes
-- The calculation for $H_a:\mu < \mu_0$ is similar
-- For $H_a: \mu \neq \mu_0$ calculate the one sided power using
-  $\alpha / 2$ (this is only approximately right, it excludes the probability of
-  getting a large TS in the opposite direction of the truth)
-- Power goes up as $\alpha$ gets larger
-- Power of a one sided test is greater than the power of the
-  associated two sided test
-- Power goes up as $\mu_1$ gets further away from $\mu_0$
-- Power goes up as $n$ goes up
-- Power doesn't need $\mu_a$, $\sigma$ and $n$, instead only $\frac{\sqrt{n}(\mu_a - \mu_0)}{\sigma}$
-  - The quantity $\frac{\mu_a - \mu_0}{\sigma}$ is called the effect size, the difference in the means in standard deviation units.
-  - Being unit free, it has some hope of interpretability across settings
-
----
-## T-test power
--  Consider calculating power for a Gossett's $T$ test for our example
--  The power is
-  $$
-  P\left(\frac{\bar X - \mu_0}{S /\sqrt{n}} > t_{1-\alpha, n-1} ~|~ \mu = \mu_a \right)
-  $$
-- Calcuting this requires the non-central t distribution.
-- `power.t.test` does this very well
-  - Omit one of the arguments and it solves for it
-
----
-## Example
-
-```r
-power.t.test(n = 16, delta = 2 / 4, sd=1, type = "one.sample",  alt = "one.sided")$power
-```
-
-```
-[1] 0.604
-```
-
-```r
-power.t.test(n = 16, delta = 2, sd=4, type = "one.sample",  alt = "one.sided")$power
-```
-
-```
-[1] 0.604
-```
-
-```r
-power.t.test(n = 16, delta = 100, sd=200, type = "one.sample", alt = "one.sided")$power
-```
-
-```
-[1] 0.604
-```
-
-
----
-## Example
-
-```r
-power.t.test(power = .8, delta = 2 / 4, sd=1, type = "one.sample",  alt = "one.sided")$n
-```
-
-```
-[1] 26.14
-```
-
-```r
-power.t.test(power = .8, delta = 2, sd=4, type = "one.sample",  alt = "one.sided")$n
-```
-
-```
-[1] 26.14
-```
-
-```r
-power.t.test(power = .8, delta = 100, sd=200, type = "one.sample", alt = "one.sided")$n
-```
-
-```
-[1] 26.14
-```
-
-
+---
+title       : Power
+subtitle    : Statistical Inference
+author      : Brian Caffo, Jeff Leek, Roger Peng
+job         : Johns Hopkins Bloomberg School of Public Health
+logo        : bloomberg_shield.png
+framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
+highlighter : highlight.js  # {highlight.js, prettify, highlight}
+hitheme     : tomorrow      # 
+url:
+  lib: ../../librariesNew
+  assets: ../../assets
+widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
+mode        : selfcontained # {standalone, draft}
+---
+
+## Power
+- Power is the probability of rejecting the null hypothesis when it is false
+- Ergo, power (as its name would suggest) is a good thing; you want more power
+- A type II error (a bad thing, as its name would suggest) is failing to reject the null hypothesis when it's false; the probability of a type II error is usually called $\beta$
+- Note that power $= 1 - \beta$
+
+---
+## Notes
+- Consider our previous example involving RDI
+- $H_0: \mu = 30$ versus $H_a: \mu > 30$
+- Then power is 
+$$P\left(\frac{\bar X - 30}{s /\sqrt{n}} > t_{1-\alpha,n-1} ~|~ \mu = \mu_a \right)$$
+- Note that this is a function that depends on the specific value of $\mu_a$!
+- Notice as $\mu_a$ approaches $30$ the power approaches $\alpha$
+
+
+---
+## Calculating power for Gaussian data
+Assume that $n$ is large and that we know $\sigma$
+$$
+\begin{align}
+1 -\beta & = 
+P\left(\frac{\bar X - 30}{\sigma /\sqrt{n}} > z_{1-\alpha} ~|~ \mu = \mu_a \right)\\
+& = P\left(\frac{\bar X - \mu_a + \mu_a - 30}{\sigma /\sqrt{n}} > z_{1-\alpha} ~|~ \mu = \mu_a \right)\\ \\
+& = P\left(\frac{\bar X - \mu_a}{\sigma /\sqrt{n}} > z_{1-\alpha} - \frac{\mu_a - 30}{\sigma /\sqrt{n}} ~|~ \mu = \mu_a \right)\\ \\
+& = P\left(Z > z_{1-\alpha} - \frac{\mu_a - 30}{\sigma /\sqrt{n}} ~|~ \mu = \mu_a \right)\\ \\
+\end{align}
+$$
+
+---
+## Example continued
+-  Suppose that we wanted to detect an increase in mean RDI
+  of at least 2 events / hour (above 30). 
+- Assume normality and that the sample in question will have a standard deviation of $4$.
+- What would be the power if we took a sample size of $16$?
+  - With $\alpha = 0.05$, $z_{1-\alpha} = 1.645$ 
+  - $\frac{\mu_a - 30}{\sigma /\sqrt{n}} = 2 / (4 /\sqrt{16}) = 2$ 
+  - $P(Z > 1.645 - 2) = P(Z > -0.355) = 64\%$
+
+```r
+pnorm(-0.355, lower.tail = FALSE)
+```
+
+```
+## [1] 0.6387
+```
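+
+The same calculation can be wrapped in a small helper. This is only a sketch under the slide's assumptions (known $\sigma$, one-sided alternative $\mu > \mu_0$); the name `z_power` is illustrative and not part of the course code.
+
+```r
+# Power of the one-sided z test of H_0: mu = mu_0 versus H_a: mu > mu_0
+# z_power is a hypothetical helper name
+z_power <- function(mu_a, mu_0, sigma, n, alpha = 0.05) {
+  shift <- sqrt(n) * (mu_a - mu_0) / sigma   # mean of Z under H_a
+  pnorm(qnorm(1 - alpha) - shift, lower.tail = FALSE)
+}
+
+# RDI example: detect an increase of 2 above 30 with sigma = 4 and n = 16
+z_power(mu_a = 32, mu_0 = 30, sigma = 4, n = 16)
+```
+
+The result should agree with the 64% above up to the rounding of $z_{1-\alpha}$, and the power falls to $\alpha$ as `mu_a` approaches `mu_0`.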
+
+
+---
+## Note 
+- Consider $H_0 : \mu = \mu_0$ and $H_a : \mu > \mu_0$ with $\mu = \mu_a$ under $H_a$.
+- Under $H_0$ the statistic $Z = \frac{\sqrt{n}(\bar X - \mu_0)}{\sigma}$ is $N(0, 1)$
+- Under $H_a$ $Z$ is $N\left( \frac{\sqrt{n}(\mu_a - \mu_0)}{\sigma}, 1\right)$
+- We reject if $Z > z_{1-\alpha}$
+
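+```r
+# Illustrative settings: sigma known, one-sided test of mu_0 against mu_a
+sigma <- 10; mu_0 <- 0; mu_a <- 2; n <- 100; alpha <- 0.05
+# Empty frame wide enough for both densities
+plot(c(-3, 6), c(0, dnorm(0)), type = "n", frame = FALSE, xlab = "Z value", ylab = "")
+xvals <- seq(-3, 6, length.out = 1000)
+# Density of Z under H_0: N(0, 1)
+lines(xvals, dnorm(xvals), type = "l", lwd = 3)
+# Density of Z under H_a: N(sqrt(n) * (mu_a - mu_0) / sigma, 1)
+lines(xvals, dnorm(xvals, mean = sqrt(n) * (mu_a - mu_0) / sigma), lwd = 3)
+# Rejection threshold z_{1 - alpha}
+abline(v = qnorm(1 - alpha))
+```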
+
+---
+![plot of chunk unnamed-chunk-2](assets/fig/unnamed-chunk-2.png) 
+
+
+
+---
+## Question
+- When testing $H_a : \mu > \mu_0$, notice if power is $1 - \beta$, then 
+$$1 - \beta = P\left(Z > z_{1-\alpha} - \frac{\mu_a - \mu_0}{\sigma /\sqrt{n}} ~|~ \mu = \mu_a \right) = P(Z > z_{\beta})$$
+- This yields the equation
+$$z_{1-\alpha} - \frac{\sqrt{n}(\mu_a - \mu_0)}{\sigma} = z_{\beta}$$
+- Unknowns: $\mu_a$, $\sigma$, $n$, $\beta$
+- Knowns: $\mu_0$, $\alpha$
+- Specify any 3 of the unknowns and you can solve for the remainder
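+
+For a target power, the equation above can be solved for $n$ in closed form. A minimal sketch, assuming the RDI values used earlier ($\mu_0 = 30$, $\mu_a = 32$, $\sigma = 4$), $\alpha = 0.05$ and a target power of $0.8$; the variable names are illustrative.
+
+```r
+mu_0 <- 30; mu_a <- 32; sigma <- 4
+alpha <- 0.05; power <- 0.8            # power = 1 - beta
+
+# z_beta = qnorm(1 - power), so z_{1-alpha} - z_beta = qnorm(1 - alpha) + qnorm(power)
+n <- ((qnorm(1 - alpha) + qnorm(power)) * sigma / (mu_a - mu_0))^2
+ceiling(n)                              # round up to a whole number of subjects
+```
+
+This normal approximation gives a slightly smaller $n$ than a $t$-based calculation, since it treats $\sigma$ as known.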
+
+---
+## Notes
+- The calculation for $H_a:\mu < \mu_0$ is similar
+- For $H_a: \mu \neq \mu_0$ calculate the one sided power using
+  $\alpha / 2$ (this is only approximately right; it excludes the probability of
+  getting a large test statistic in the opposite direction of the truth)
+- Power goes up as $\alpha$ gets larger
+- Power of a one sided test is greater than the power of the
+  associated two sided test
+- Power goes up as $\mu_a$ gets further away from $\mu_0$
+- Power goes up as $n$ goes up
+- Power does not depend on $\mu_a$, $\sigma$ and $n$ separately, only through $\frac{\sqrt{n}(\mu_a - \mu_0)}{\sigma}$
+  - The quantity $\frac{\mu_a - \mu_0}{\sigma}$ is called the effect size, the difference in the means in standard deviation units.
+  - Being unit free, it has some hope of interpretability across settings
+
+---
+## T-test power
+-  Consider calculating power for Gosset's $T$ test for our example
+-  The power is
+  $$
+  P\left(\frac{\bar X - \mu_0}{S /\sqrt{n}} > t_{1-\alpha, n-1} ~|~ \mu = \mu_a \right)
+  $$
+- Calculating this requires the non-central t distribution.
+- `power.t.test` does this very well
+  - Omit one of the arguments and it solves for it
+
+---
+## Example
+
+```r
+power.t.test(n = 16, delta = 2/4, sd = 1, type = "one.sample", alt = "one.sided")$power
+```
+
+```
+## [1] 0.604
+```
+
+```r
+power.t.test(n = 16, delta = 2, sd = 4, type = "one.sample", alt = "one.sided")$power
+```
+
+```
+## [1] 0.604
+```
+
+```r
+power.t.test(n = 16, delta = 100, sd = 200, type = "one.sample", alt = "one.sided")$power
+```
+
+```
+## [1] 0.604
+```
+
+
+---
+## Example
+
+```r
+power.t.test(power = 0.8, delta = 2/4, sd = 1, type = "one.sample", alt = "one.sided")$n
+```
+
+```
+## [1] 26.14
+```
+
+```r
+power.t.test(power = 0.8, delta = 2, sd = 4, type = "one.sample", alt = "one.sided")$n
+```
+
+```
+## [1] 26.14
+```
+
+```r
+power.t.test(power = 0.8, delta = 100, sd = 200, type = "one.sample", alt = "one.sided")$n
+```
+
+```
+## [1] 26.14
+```
+
+
diff --git a/06_StatisticalInference/03_04_Power/index.pdf b/06_StatisticalInference/03_04_Power/index.pdf
index a5daf8abd..b5e3024dc 100644
Binary files a/06_StatisticalInference/03_04_Power/index.pdf and b/06_StatisticalInference/03_04_Power/index.pdf differ
diff --git a/06_StatisticalInference/03_05_MultipleTesting/assets/fig/unnamed-chunk-3.png b/06_StatisticalInference/03_05_MultipleTesting/assets/fig/unnamed-chunk-3.png
new file mode 100644
index 000000000..556c3a44b
Binary files /dev/null and b/06_StatisticalInference/03_05_MultipleTesting/assets/fig/unnamed-chunk-3.png differ
diff --git a/06_StatisticalInference/03_05_MultipleTesting/index.Rmd b/06_StatisticalInference/03_05_MultipleTesting/index.Rmd
index 6c19901a0..4d5cc68a4 100644
--- a/06_StatisticalInference/03_05_MultipleTesting/index.Rmd
+++ b/06_StatisticalInference/03_05_MultipleTesting/index.Rmd
@@ -1,271 +1,253 @@
----
-title       : Multiple testing
-subtitle    : Statistical Inference 
-author      : Brian Caffo, Jeffrey Leek, Roger Peng
-job         : Johns Hopkins Bloomberg School of Public Health
-logo        : bloomberg_shield.png
-framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
-highlighter : highlight.js  # {highlight.js, prettify, highlight}
-hitheme     : tomorrow   # 
-url:
-  lib: ../../librariesNew
-  assets: ../../assets
-widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
-mode        : selfcontained # {standalone, draft}
----
-
-
-```{r setup, cache = F, echo = F, message = F, warning = F, tidy = F}
-# make this an external chunk that can be included in any file
-options(width = 100)
-opts_chunk$set(message = F, error = F, warning = F, comment = NA, fig.align = 'center', dpi = 100, tidy = F, cache.path = '.cache/', fig.path = 'fig/')
-
-options(xtable.type = 'html')
-knit_hooks$set(inline = function(x) {
-  if(is.numeric(x)) {
-    round(x, getOption('digits'))
-  } else {
-    paste(as.character(x), collapse = ', ')
-  }
-})
-knit_hooks$set(plot = knitr:::hook_plot_html)
-```
-
-## Key ideas
-
-* Hypothesis testing/significance analysis is commonly overused
-* Correcting for multiple testing avoids false positives or discoveries
-* Two key components
-  * Error measure
-  * Correction
-
-
----
-
-## Three eras of statistics
-
-__The age of Quetelet and his successors, in which huge census-level data sets were brought to bear on simple but important questions__: Are there more male than female births? Is the rate of insanity rising?
-
-The classical period of Pearson, Fisher, Neyman, Hotelling, and their successors, intellectual giants who __developed a theory of optimal inference capable of wringing every drop of information out of a scientific experiment__. The questions dealt with still tended to be simple Is treatment A better than treatment B? 
-
-__The era of scientific mass production__, in which new technologies typified by the microarray allow a single team of scientists to produce data sets of a size Quetelet would envy. But now the flood of data is accompanied by a deluge of questions, perhaps thousands of estimates or hypothesis tests that the statistician is charged with answering together; not at all what the classical masters had in mind. Which variables matter among the thousands measured? How do you relate unrelated information?
-
-[http://www-stat.stanford.edu/~ckirby/brad/papers/2010LSIexcerpt.pdf](http://www-stat.stanford.edu/~ckirby/brad/papers/2010LSIexcerpt.pdf)
-
----
-
-## Reasons for multiple testing
-
-<img class=center src=fig/datasources.png height=450>
-
-
----
-
-## Why correct for multiple tests?
-
-<img class=center src=fig/jellybeans1.png height=450>
-
-
-[http://xkcd.com/882/](http://xkcd.com/882/)
-
----
-
-## Why correct for multiple tests?
-
-<img class=center src=fig/jellybeans2.png height=400>
-
-[http://xkcd.com/882/](http://xkcd.com/882/)
-
-
----
-
-## Types of errors
-
-Suppose you are testing a hypothesis that a parameter $\beta$ equals zero versus the alternative that it does not equal zero. These are the possible outcomes. 
-</br></br>
-
-                    | $\beta=0$   | $\beta\neq0$   |  Hypotheses
---------------------|-------------|----------------|---------
-Claim $\beta=0$     |      $U$    |      $T$       |  $m-R$
-Claim $\beta\neq 0$ |      $V$    |      $S$       |  $R$
-    Claims          |     $m_0$   |      $m-m_0$   |  $m$
-
-</br></br>
-
-__Type I error or false positive ($V$)__ Say that the parameter does not equal zero when it does
-
-__Type II error or false negative ($T$)__ Say that the parameter equals zero when it doesn't 
-
-
----
-
-## Error rates
-
-__False positive rate__ - The rate at which false results ($\beta = 0$) are called significant: $E\left[\frac{V}{m_0}\right]$*
-
-__Family wise error rate (FWER)__ - The probability of at least one false positive ${\rm Pr}(V \geq 1)$
-
-__False discovery rate (FDR)__ - The rate at which claims of significance are false $E\left[\frac{V}{R}\right]$
-
-* The false positive rate is closely related to the type I error rate [http://en.wikipedia.org/wiki/False_positive_rate](http://en.wikipedia.org/wiki/False_positive_rate)
-
----
-
-## Controlling the false positive rate
-
-If P-values are correctly calculated calling all $P < \alpha$ significant will control the false positive rate at level $\alpha$ on average. 
-
-<redtext>Problem</redtext>: Suppose that you perform 10,000 tests and $\beta = 0$ for all of them. 
-
-Suppose that you call all $P < 0.05$ significant. 
-
-The expected number of false positives is: $10,000 \times 0.05 = 500$  false positives. 
-
-__How do we avoid so many false positives?__
-
-
----
-
-## Controlling family-wise error rate (FWER)
-
-
-The [Bonferroni correction](http://en.wikipedia.org/wiki/Bonferroni_correction) is the oldest multiple testing correction. 
-
-__Basic idea__: 
-* Suppose you do $m$ tests
-* You want to control FWER at level $\alpha$ so $Pr(V \geq 1) < \alpha$
-* Calculate P-values normally
-* Set $\alpha_{fwer} = \alpha/m$
-* Call all $P$-values less than $\alpha_{fwer}$ significant
-
-__Pros__: Easy to calculate, conservative
-__Cons__: May be very conservative
-
-
----
-
-## Controlling false discovery rate (FDR)
-
-This is the most popular correction when performing _lots_ of tests say in genomics, imaging, astronomy, or other signal-processing disciplines. 
-
-__Basic idea__: 
-* Suppose you do $m$ tests
-* You want to control FDR at level $\alpha$ so $E\left[\frac{V}{R}\right]$
-* Calculate P-values normally
-* Order the P-values from smallest to largest $P_{(1)},...,P_{(m)}$
-* Call any $P_{(i)} \leq \alpha \times \frac{i}{m}$ significant
-
-__Pros__: Still pretty easy to calculate, less conservative (maybe much less)
-
-__Cons__: Allows for more false positives, may behave strangely under dependence
-
----
-
-## Example with 10 P-values
-
-<img class=center src=fig/example10pvals.png height=450>
-
-Controlling all error rates at $\alpha = 0.20$
-
----
-
-## Adjusted P-values
-
-* One approach is to adjust the threshold $\alpha$
-* A different approach is to calculate "adjusted p-values"
-* They _are not p-values_ anymore
-* But they can be used directly without adjusting $\alpha$
-
-__Example__: 
-* Suppose P-values are $P_1,\ldots,P_m$
-* You could adjust them by taking $P_i^{fwer} = \max{m \times P_i,1}$ for each P-value.
-* Then if you call all $P_i^{fwer} < \alpha$ significant you will control the FWER. 
-
----
-
-## Case study I: no true positives
-
-```{r createPvals,cache=TRUE}
-set.seed(1010093)
-pValues <- rep(NA,1000)
-for(i in 1:1000){
-  y <- rnorm(20)
-  x <- rnorm(20)
-  pValues[i] <- summary(lm(y ~ x))$coeff[2,4]
-}
-
-# Controls false positive rate
-sum(pValues < 0.05)
-```
-
----
-
-## Case study I: no true positives
-
-```{r, dependson="createPvals"}
-# Controls FWER 
-sum(p.adjust(pValues,method="bonferroni") < 0.05)
-# Controls FDR 
-sum(p.adjust(pValues,method="BH") < 0.05)
-```
-
-
----
-
-## Case study II: 50% true positives
-
-```{r createPvals2,cache=TRUE}
-set.seed(1010093)
-pValues <- rep(NA,1000)
-for(i in 1:1000){
-  x <- rnorm(20)
-  # First 500 beta=0, last 500 beta=2
-  if(i <= 500){y <- rnorm(20)}else{ y <- rnorm(20,mean=2*x)}
-  pValues[i] <- summary(lm(y ~ x))$coeff[2,4]
-}
-trueStatus <- rep(c("zero","not zero"),each=500)
-table(pValues < 0.05, trueStatus)
-```
-
----
-
-
-## Case study II: 50% true positives
-
-```{r, dependson="createPvals2"}
-# Controls FWER 
-table(p.adjust(pValues,method="bonferroni") < 0.05,trueStatus)
-# Controls FDR 
-table(p.adjust(pValues,method="BH") < 0.05,trueStatus)
-```
-
-
----
-
-
-## Case study II: 50% true positives
-
-__P-values versus adjusted P-values__
-```{r, dependson="createPvals2",fig.height=4,fig.width=8}
-par(mfrow=c(1,2))
-plot(pValues,p.adjust(pValues,method="bonferroni"),pch=19)
-plot(pValues,p.adjust(pValues,method="BH"),pch=19)
-```
-
-
----
-
-
-## Notes and resources
-
-__Notes__:
-* Multiple testing is an entire subfield
-* A basic Bonferroni/BH correction is usually enough
-* If there is strong dependence between tests there may be problems
-  * Consider method="BY"
-
-__Further resources__:
-* [Multiple testing procedures with applications to genomics](http://www.amazon.com/Multiple-Procedures-Applications-Genomics-Statistics/dp/0387493166/ref=sr_1_2/102-3292576-129059?ie=UTF8&s=books&qid=1187394873&sr=1-2)
-* [Statistical significance for genome-wide studies](http://www.pnas.org/content/100/16/9440.full)
-* [Introduction to multiple testing](http://ies.ed.gov/ncee/pubs/20084018/app_b.asp)
-
+---
+title       : Multiple testing
+subtitle    : Statistical Inference 
+author      : Brian Caffo, Jeffrey Leek, Roger Peng
+job         : Johns Hopkins Bloomberg School of Public Health
+logo        : bloomberg_shield.png
+framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
+highlighter : highlight.js  # {highlight.js, prettify, highlight}
+hitheme     : tomorrow   # 
+url:
+  lib: ../../librariesNew
+  assets: ../../assets
+widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
+mode        : selfcontained # {standalone, draft}
+---
+## Key ideas
+
+* Hypothesis testing/significance analysis is commonly overused
+* Correcting for multiple testing helps control false positives or false discoveries
+* Two key components
+  * Error measure
+  * Correction
+
+
+---
+
+## Three eras of statistics
+
+__The age of Quetelet and his successors, in which huge census-level data sets were brought to bear on simple but important questions__: Are there more male than female births? Is the rate of insanity rising?
+
+The classical period of Pearson, Fisher, Neyman, Hotelling, and their successors, intellectual giants who __developed a theory of optimal inference capable of wringing every drop of information out of a scientific experiment__. The questions dealt with still tended to be simple: Is treatment A better than treatment B? 
+
+__The era of scientific mass production__, in which new technologies typified by the microarray allow a single team of scientists to produce data sets of a size Quetelet would envy. But now the flood of data is accompanied by a deluge of questions, perhaps thousands of estimates or hypothesis tests that the statistician is charged with answering together; not at all what the classical masters had in mind. Which variables matter among the thousands measured? How do you relate unrelated information?
+
+[http://www-stat.stanford.edu/~ckirby/brad/papers/2010LSIexcerpt.pdf](http://www-stat.stanford.edu/~ckirby/brad/papers/2010LSIexcerpt.pdf)
+
+---
+
+## Reasons for multiple testing
+
+<img class=center src=fig/datasources.png height=450>
+
+
+---
+
+## Why correct for multiple tests?
+
+<img class=center src=fig/jellybeans1.png height=450>
+
+
+[http://xkcd.com/882/](http://xkcd.com/882/)
+
+---
+
+## Why correct for multiple tests?
+
+<img class=center src=fig/jellybeans2.png height=400>
+
+[http://xkcd.com/882/](http://xkcd.com/882/)
+
+
+---
+
+## Types of errors
+
+Suppose you are testing a hypothesis that a parameter $\beta$ equals zero versus the alternative that it does not equal zero. These are the possible outcomes. 
+</br></br>
+
+                    | $\beta=0$   | $\beta\neq0$   |  Claims
+--------------------|-------------|----------------|---------
+Claim $\beta=0$     |      $U$    |      $T$       |  $m-R$
+Claim $\beta\neq 0$ |      $V$    |      $S$       |  $R$
+    Hypotheses      |     $m_0$   |      $m-m_0$   |  $m$
+
+</br></br>
+
+__Type I error or false positive ($V$)__ Say that the parameter does not equal zero when it does
+
+__Type II error or false negative ($T$)__ Say that the parameter equals zero when it doesn't 
+
+
+---
+
+## Error rates
+
+__False positive rate__ - The rate at which truly null results ($\beta = 0$) are called significant: $E\left[\frac{V}{m_0}\right]$*
+
+__Family wise error rate (FWER)__ - The probability of at least one false positive ${\rm Pr}(V \geq 1)$
+
+__False discovery rate (FDR)__ - The rate at which claims of significance are false $E\left[\frac{V}{R}\right]$
+
+* The false positive rate is closely related to the type I error rate [http://en.wikipedia.org/wiki/False_positive_rate](http://en.wikipedia.org/wiki/False_positive_rate)
+
+---
+
+## Controlling the false positive rate
+
+If P-values are correctly calculated, calling all $P < \alpha$ significant will control the false positive rate at level $\alpha$ on average. 
+
+<redtext>Problem</redtext>: Suppose that you perform 10,000 tests and $\beta = 0$ for all of them. 
+
+Suppose that you call all $P < 0.05$ significant. 
+
+The expected number of false positives is: $10,000 \times 0.05 = 500$  false positives. 
+
+__How do we avoid so many false positives?__
+
+
+---
+
+## Controlling family-wise error rate (FWER)
+
+
+The [Bonferroni correction](http://en.wikipedia.org/wiki/Bonferroni_correction) is the oldest multiple testing correction. 
+
+__Basic idea__: 
+* Suppose you do $m$ tests
+* You want to control FWER at level $\alpha$ so $Pr(V \geq 1) < \alpha$
+* Calculate P-values normally
+* Set $\alpha_{fwer} = \alpha/m$
+* Call all $P$-values less than $\alpha_{fwer}$ significant
+
+__Pros__: Easy to calculate, conservative
+
+__Cons__: May be very conservative
+
+
+---
+
+## Controlling false discovery rate (FDR)
+
+This is the most popular correction when performing _lots_ of tests, say in genomics, imaging, astronomy, or other signal-processing disciplines. 
+
+__Basic idea__: 
+* Suppose you do $m$ tests
+* You want to control FDR at level $\alpha$ so $E\left[\frac{V}{R}\right] \leq \alpha$
+* Calculate P-values normally
+* Order the P-values from smallest to largest $P_{(1)},...,P_{(m)}$
+* Find the largest $i$ with $P_{(i)} \leq \alpha \times \frac{i}{m}$ and call $P_{(1)},\ldots,P_{(i)}$ significant
+
+__Pros__: Still pretty easy to calculate, less conservative (maybe much less)
+
+__Cons__: Allows for more false positives, may behave strangely under dependence
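+
+The steps above can be written out directly and checked against `p.adjust`. A minimal sketch: `bh_reject` is an illustrative helper and the ten P-values are made up for the example, not course data. It uses the step-up form of the rule, in which every P-value up to the largest rank meeting the bound is called significant.
+
+```{r}
+# Benjamini-Hochberg step-up rule: TRUE where the P-value is called significant
+bh_reject <- function(p, alpha = 0.05) {
+  m <- length(p)
+  ord <- order(p)                                    # indices of sorted P-values
+  below <- which(p[ord] <= alpha * seq_len(m) / m)   # ranks i with P_(i) <= alpha * i / m
+  reject <- rep(FALSE, m)
+  if (length(below) > 0) reject[ord[seq_len(max(below))]] <- TRUE
+  reject
+}
+
+# Hypothetical P-values, for illustration only
+p <- c(0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216)
+sum(bh_reject(p, alpha = 0.2))
+# Same decisions as thresholding the BH-adjusted P-values
+all(bh_reject(p, alpha = 0.2) == (p.adjust(p, method = "BH") <= 0.2))
+```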
+
+---
+
+## Example with 10 P-values
+
+<img class=center src=fig/example10pvals.png height=450>
+
+Controlling all error rates at $\alpha = 0.20$
+
+---
+
+## Adjusted P-values
+
+* One approach is to adjust the threshold $\alpha$
+* A different approach is to calculate "adjusted p-values"
+* They _are not p-values_ anymore
+* But they can be used directly without adjusting $\alpha$
+
+__Example__: 
+* Suppose P-values are $P_1,\ldots,P_m$
+* You could adjust them by taking $P_i^{fwer} = \min(m \times P_i, 1)$ for each P-value.
+* Then if you call all $P_i^{fwer} < \alpha$ significant you will control the FWER. 
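+
+The Bonferroni adjustment, capped at 1, can be checked against `p.adjust`. A small sketch using made-up P-values (an assumption for illustration, not course data):
+
+```{r}
+# Hypothetical P-values, for illustration only
+p <- c(0.0004, 0.003, 0.02, 0.3, 0.8)
+m <- length(p)
+
+p_fwer <- pmin(m * p, 1)    # multiply by the number of tests, cap at 1
+all.equal(p_fwer, p.adjust(p, method = "bonferroni"))
+sum(p_fwer < 0.05)          # number called significant with FWER control at 0.05
+```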
+
+---
+
+## Case study I: no true positives
+
+```{r createPvals,cache=TRUE}
+set.seed(1010093)
+pValues <- rep(NA,1000)
+for(i in 1:1000){
+  y <- rnorm(20)
+  x <- rnorm(20)
+  pValues[i] <- summary(lm(y ~ x))$coeff[2,4]
+}
+
+# Controls false positive rate
+sum(pValues < 0.05)
+```
+
+---
+
+## Case study I: no true positives
+
+```{r, dependson="createPvals"}
+# Controls FWER 
+sum(p.adjust(pValues,method="bonferroni") < 0.05)
+# Controls FDR 
+sum(p.adjust(pValues,method="BH") < 0.05)
+```
+
+
+---
+
+## Case study II: 50% true positives
+
+```{r createPvals2,cache=TRUE}
+set.seed(1010093)
+pValues <- rep(NA,1000)
+for(i in 1:1000){
+  x <- rnorm(20)
+  # First 500 beta=0, last 500 beta=2
+  if(i <= 500){y <- rnorm(20)}else{ y <- rnorm(20,mean=2*x)}
+  pValues[i] <- summary(lm(y ~ x))$coeff[2,4]
+}
+trueStatus <- rep(c("zero","not zero"),each=500)
+table(pValues < 0.05, trueStatus)
+```
+
+---
+
+
+## Case study II: 50% true positives
+
+```{r, dependson="createPvals2"}
+# Controls FWER 
+table(p.adjust(pValues,method="bonferroni") < 0.05,trueStatus)
+# Controls FDR 
+table(p.adjust(pValues,method="BH") < 0.05,trueStatus)
+```
+
+
+---
+
+
+## Case study II: 50% true positives
+
+__P-values versus adjusted P-values__
+```{r, dependson="createPvals2",fig.height=4,fig.width=8}
+par(mfrow=c(1,2))
+plot(pValues,p.adjust(pValues,method="bonferroni"),pch=19)
+plot(pValues,p.adjust(pValues,method="BH"),pch=19)
+```
+
+
+---
+
+
+## Notes and resources
+
+__Notes__:
+* Multiple testing is an entire subfield
+* A basic Bonferroni/BH correction is usually enough
+* If there is strong dependence between tests there may be problems
+  * Consider method="BY"
+
+__Further resources__:
+* [Multiple testing procedures with applications to genomics](http://www.amazon.com/Multiple-Procedures-Applications-Genomics-Statistics/dp/0387493166/ref=sr_1_2/102-3292576-129059?ie=UTF8&s=books&qid=1187394873&sr=1-2)
+* [Statistical significance for genome-wide studies](http://www.pnas.org/content/100/16/9440.full)
+* [Introduction to multiple testing](http://ies.ed.gov/ncee/pubs/20084018/app_b.asp)
+
diff --git a/06_StatisticalInference/03_05_MultipleTesting/index.html b/06_StatisticalInference/03_05_MultipleTesting/index.html
index bb3a271db..dfc498eeb 100644
--- a/06_StatisticalInference/03_05_MultipleTesting/index.html
+++ b/06_StatisticalInference/03_05_MultipleTesting/index.html
@@ -1,581 +1,585 @@
-<!DOCTYPE html>
-<html>
-<head>
-  <title>Multiple testing</title>
-  <meta charset="utf-8">
-  <meta name="description" content="Multiple testing">
-  <meta name="author" content="Brian Caffo, Jeffrey Leek, Roger Peng">
-  <meta name="generator" content="slidify" />
-  <meta name="apple-mobile-web-app-capable" content="yes">
-  <meta http-equiv="X-UA-Compatible" content="chrome=1">
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/default.css" media="all" >
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/phone.css" 
-    media="only screen and (max-device-width: 480px)" >
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/slidify.css" >
-  <link rel="stylesheet" href="../../librariesNew/highlighters/highlight.js/css/tomorrow.css" />
-  <base target="_blank"> <!-- This amazingness opens all links in a new tab. -->  
-  
-  <!-- Grab CDN jQuery, fall back to local if offline -->
-  <script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.7.min.js"></script>
-  <script>window.jQuery || document.write('<script src="../../librariesNew/widgets/quiz/js/jquery.js"><\/script>')</script> 
-  <script data-main="../../librariesNew/frameworks/io2012/js/slides" 
-    src="../../librariesNew/frameworks/io2012/js/require-1.0.8.min.js">
-  </script>
-  
-  
-
-</head>
-<body style="opacity: 0">
-  <slides class="layout-widescreen">
-    
-    <!-- LOGO SLIDE -->
-        <slide class="title-slide segue nobackground">
-  <aside class="gdbar">
-    <img src="../../assets/img/bloomberg_shield.png">
-  </aside>
-  <hgroup class="auto-fadein">
-    <h1>Multiple testing</h1>
-    <h2>Statistical Inference</h2>
-    <p>Brian Caffo, Jeffrey Leek, Roger Peng<br/>Johns Hopkins Bloomberg School of Public Health</p>
-  </hgroup>
-  <article></article>  
-</slide>
-    
-
-    <!-- SLIDES -->
-    <slide class="" id="slide-1" style="background:;">
-  <hgroup>
-    <h2>Key ideas</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Hypothesis testing/significance analysis is commonly overused</li>
-<li>Correcting for multiple testing avoids false positives or discoveries</li>
-<li>Two key components
-
-<ul>
-<li>Error measure</li>
-<li>Correction</li>
-</ul></li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-2" style="background:;">
-  <hgroup>
-    <h2>Three eras of statistics</h2>
-  </hgroup>
-  <article data-timings="">
-    <p><strong>The age of Quetelet and his successors, in which huge census-level data sets were brought to bear on simple but important questions</strong>: Are there more male than female births? Is the rate of insanity rising?</p>
-
-<p>The classical period of Pearson, Fisher, Neyman, Hotelling, and their successors, intellectual giants who <strong>developed a theory of optimal inference capable of wringing every drop of information out of a scientific experiment</strong>. The questions dealt with still tended to be simple Is treatment A better than treatment B? </p>
-
-<p><strong>The era of scientific mass production</strong>, in which new technologies typified by the microarray allow a single team of scientists to produce data sets of a size Quetelet would envy. But now the flood of data is accompanied by a deluge of questions, perhaps thousands of estimates or hypothesis tests that the statistician is charged with answering together; not at all what the classical masters had in mind. Which variables matter among the thousands measured? How do you relate unrelated information?</p>
-
-<p><a href="http://www-stat.stanford.edu/%7Eckirby/brad/papers/2010LSIexcerpt.pdf">http://www-stat.stanford.edu/~ckirby/brad/papers/2010LSIexcerpt.pdf</a></p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-3" style="background:;">
-  <hgroup>
-    <h2>Reasons for multiple testing</h2>
-  </hgroup>
-  <article data-timings="">
-    <p><img class=center src=fig/datasources.png height=450></p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-4" style="background:;">
-  <hgroup>
-    <h2>Why correct for multiple tests?</h2>
-  </hgroup>
-  <article data-timings="">
-    <p><img class=center src=fig/jellybeans1.png height=450></p>
-
-<p><a href="http://xkcd.com/882/">http://xkcd.com/882/</a></p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-5" style="background:;">
-  <hgroup>
-    <h2>Why correct for multiple tests?</h2>
-  </hgroup>
-  <article data-timings="">
-    <p><img class=center src=fig/jellybeans2.png height=400></p>
-
-<p><a href="http://xkcd.com/882/">http://xkcd.com/882/</a></p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-6" style="background:;">
-  <hgroup>
-    <h2>Types of errors</h2>
-  </hgroup>
-  <article data-timings="">
-    <p>Suppose you are testing a hypothesis that a parameter \(\beta\) equals zero versus the alternative that it does not equal zero. These are the possible outcomes. 
-</br></br></p>
-
-<table><thead>
-<tr>
-<th></th>
-<th>\(\beta=0\)</th>
-<th>\(\beta\neq0\)</th>
-<th>Hypotheses</th>
-</tr>
-</thead><tbody>
-<tr>
-<td>Claim \(\beta=0\)</td>
-<td>\(U\)</td>
-<td>\(T\)</td>
-<td>\(m-R\)</td>
-</tr>
-<tr>
-<td>Claim \(\beta\neq 0\)</td>
-<td>\(V\)</td>
-<td>\(S\)</td>
-<td>\(R\)</td>
-</tr>
-<tr>
-<td>Claims</td>
-<td>\(m_0\)</td>
-<td>\(m-m_0\)</td>
-<td>\(m\)</td>
-</tr>
-</tbody></table>
-
-<p></br></br></p>
-
-<p><strong>Type I error or false positive (\(V\))</strong> Say that the parameter does not equal zero when it does</p>
-
-<p><strong>Type II error or false negative (\(T\))</strong> Say that the parameter equals zero when it doesn&#39;t </p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-7" style="background:;">
-  <hgroup>
-    <h2>Error rates</h2>
-  </hgroup>
-  <article data-timings="">
-    <p><strong>False positive rate</strong> - The rate at which false results (\(\beta = 0\)) are called significant: \(E\left[\frac{V}{m_0}\right]\)*</p>
-
-<p><strong>Family wise error rate (FWER)</strong> - The probability of at least one false positive \({\rm Pr}(V \geq 1)\)</p>
-
-<p><strong>False discovery rate (FDR)</strong> - The rate at which claims of significance are false \(E\left[\frac{V}{R}\right]\)</p>
-
-<ul>
-<li>The false positive rate is closely related to the type I error rate <a href="http://en.wikipedia.org/wiki/False_positive_rate">http://en.wikipedia.org/wiki/False_positive_rate</a></li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-8" style="background:;">
-  <hgroup>
-    <h2>Controlling the false positive rate</h2>
-  </hgroup>
-  <article data-timings="">
-    <p>If P-values are correctly calculated calling all \(P < \alpha\) significant will control the false positive rate at level \(\alpha\) on average. </p>
-
-<p><redtext>Problem</redtext>: Suppose that you perform 10,000 tests and \(\beta = 0\) for all of them. </p>
-
-<p>Suppose that you call all \(P < 0.05\) significant. </p>
-
-<p>The expected number of false positives is: \(10,000 \times 0.05 = 500\)  false positives. </p>
-
-<p><strong>How do we avoid so many false positives?</strong></p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-9" style="background:;">
-  <hgroup>
-    <h2>Controlling family-wise error rate (FWER)</h2>
-  </hgroup>
-  <article data-timings="">
-    <p>The <a href="http://en.wikipedia.org/wiki/Bonferroni_correction">Bonferroni correction</a> is the oldest multiple testing correction. </p>
-
-<p><strong>Basic idea</strong>: </p>
-
-<ul>
-<li>Suppose you do \(m\) tests</li>
-<li>You want to control FWER at level \(\alpha\) so \(Pr(V \geq 1) < \alpha\)</li>
-<li>Calculate P-values normally</li>
-<li>Set \(\alpha_{fwer} = \alpha/m\)</li>
-<li>Call all \(P\)-values less than \(\alpha_{fwer}\) significant</li>
-</ul>
-
-<p><strong>Pros</strong>: Easy to calculate, conservative
-<strong>Cons</strong>: May be very conservative</p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-10" style="background:;">
-  <hgroup>
-    <h2>Controlling false discovery rate (FDR)</h2>
-  </hgroup>
-  <article data-timings="">
-    <p>This is the most popular correction when performing <em>lots</em> of tests say in genomics, imaging, astronomy, or other signal-processing disciplines. </p>
-
-<p><strong>Basic idea</strong>: </p>
-
-<ul>
-<li>Suppose you do \(m\) tests</li>
-<li>You want to control FDR at level \(\alpha\) so \(E\left[\frac{V}{R}\right]\)</li>
-<li>Calculate P-values normally</li>
-<li>Order the P-values from smallest to largest \(P_{(1)},...,P_{(m)}\)</li>
-<li>Call any \(P_{(i)} \leq \alpha \times \frac{i}{m}\) significant</li>
-</ul>
-
-<p><strong>Pros</strong>: Still pretty easy to calculate, less conservative (maybe much less)</p>
-
-<p><strong>Cons</strong>: Allows for more false positives, may behave strangely under dependence</p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-11" style="background:;">
-  <hgroup>
-    <h2>Example with 10 P-values</h2>
-  </hgroup>
-  <article data-timings="">
-    <p><img class=center src=fig/example10pvals.png height=450></p>
-
-<p>Controlling all error rates at \(\alpha = 0.20\)</p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-12" style="background:;">
-  <hgroup>
-    <h2>Adjusted P-values</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>One approach is to adjust the threshold \(\alpha\)</li>
-<li>A different approach is to calculate &quot;adjusted p-values&quot;</li>
-<li>They <em>are not p-values</em> anymore</li>
-<li>But they can be used directly without adjusting \(\alpha\)</li>
-</ul>
-
-<p><strong>Example</strong>: </p>
-
-<ul>
-<li>Suppose P-values are \(P_1,\ldots,P_m\)</li>
-<li>You could adjust them by taking \(P_i^{fwer} = \max{m \times P_i,1}\) for each P-value.</li>
-<li>Then if you call all \(P_i^{fwer} < \alpha\) significant you will control the FWER. </li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-13" style="background:;">
-  <hgroup>
-    <h2>Case study I: no true positives</h2>
-  </hgroup>
-  <article data-timings="">
-    <pre><code class="r">set.seed(1010093)
-pValues &lt;- rep(NA,1000)
-for(i in 1:1000){
-  y &lt;- rnorm(20)
-  x &lt;- rnorm(20)
-  pValues[i] &lt;- summary(lm(y ~ x))$coeff[2,4]
-}
-
-# Controls false positive rate
-sum(pValues &lt; 0.05)
-</code></pre>
-
-<pre><code>[1] 51
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-14" style="background:;">
-  <hgroup>
-    <h2>Case study I: no true positives</h2>
-  </hgroup>
-  <article data-timings="">
-    <pre><code class="r"># Controls FWER 
-sum(p.adjust(pValues,method=&quot;bonferroni&quot;) &lt; 0.05)
-</code></pre>
-
-<pre><code>[1] 0
-</code></pre>
-
-<pre><code class="r"># Controls FDR 
-sum(p.adjust(pValues,method=&quot;BH&quot;) &lt; 0.05)
-</code></pre>
-
-<pre><code>[1] 0
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-15" style="background:;">
-  <hgroup>
-    <h2>Case study II: 50% true positives</h2>
-  </hgroup>
-  <article data-timings="">
-    <pre><code class="r">set.seed(1010093)
-pValues &lt;- rep(NA,1000)
-for(i in 1:1000){
-  x &lt;- rnorm(20)
-  # First 500 beta=0, last 500 beta=2
-  if(i &lt;= 500){y &lt;- rnorm(20)}else{ y &lt;- rnorm(20,mean=2*x)}
-  pValues[i] &lt;- summary(lm(y ~ x))$coeff[2,4]
-}
-trueStatus &lt;- rep(c(&quot;zero&quot;,&quot;not zero&quot;),each=500)
-table(pValues &lt; 0.05, trueStatus)
-</code></pre>
-
-<pre><code>       trueStatus
-        not zero zero
-  FALSE        0  476
-  TRUE       500   24
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-16" style="background:;">
-  <hgroup>
-    <h2>Case study II: 50% true positives</h2>
-  </hgroup>
-  <article data-timings="">
-    <pre><code class="r"># Controls FWER 
-table(p.adjust(pValues,method=&quot;bonferroni&quot;) &lt; 0.05,trueStatus)
-</code></pre>
-
-<pre><code>       trueStatus
-        not zero zero
-  FALSE       23  500
-  TRUE       477    0
-</code></pre>
-
-<pre><code class="r"># Controls FDR 
-table(p.adjust(pValues,method=&quot;BH&quot;) &lt; 0.05,trueStatus)
-</code></pre>
-
-<pre><code>       trueStatus
-        not zero zero
-  FALSE        0  487
-  TRUE       500   13
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-17" style="background:;">
-  <hgroup>
-    <h2>Case study II: 50% true positives</h2>
-  </hgroup>
-  <article data-timings="">
-    <p><strong>P-values versus adjusted P-values</strong></p>
-
-<pre><code class="r">par(mfrow=c(1,2))
-plot(pValues,p.adjust(pValues,method=&quot;bonferroni&quot;),pch=19)
-plot(pValues,p.adjust(pValues,method=&quot;BH&quot;),pch=19)
-</code></pre>
-
-<div class="rimage center"><img src="fig/unnamed-chunk-3.png" title="plot of chunk unnamed-chunk-3" alt="plot of chunk unnamed-chunk-3" class="plot" /></div>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-18" style="background:;">
-  <hgroup>
-    <h2>Notes and resources</h2>
-  </hgroup>
-  <article data-timings="">
-    <p><strong>Notes</strong>:</p>
-
-<ul>
-<li>Multiple testing is an entire subfield</li>
-<li>A basic Bonferroni/BH correction is usually enough</li>
-<li>If there is strong dependence between tests there may be problems
-
-<ul>
-<li>Consider method=&quot;BY&quot;</li>
-</ul></li>
-</ul>
-
-<p><strong>Further resources</strong>:</p>
-
-<ul>
-<li><a href="http://www.amazon.com/Multiple-Procedures-Applications-Genomics-Statistics/dp/0387493166/ref=sr_1_2/102-3292576-129059?ie=UTF8&amp;s=books&amp;qid=1187394873&amp;sr=1-2">Multiple testing procedures with applications to genomics</a></li>
-<li><a href="http://www.pnas.org/content/100/16/9440.full">Statistical significance for genome-wide studies</a></li>
-<li><a href="http://ies.ed.gov/ncee/pubs/20084018/app_b.asp">Introduction to multiple testing</a></li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-    <slide class="backdrop"></slide>
-  </slides>
-  <div class="pagination pagination-small" id='io2012-ptoc' style="display:none;">
-    <ul>
-      <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=1 title='Key ideas'>
-         1
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=2 title='Three eras of statistics'>
-         2
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=3 title='Reasons for multiple testing'>
-         3
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=4 title='Why correct for multiple tests?'>
-         4
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=5 title='Why correct for multiple tests?'>
-         5
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=6 title='Types of errors'>
-         6
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=7 title='Error rates'>
-         7
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=8 title='Controlling the false positive rate'>
-         8
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=9 title='Controlling family-wise error rate (FWER)'>
-         9
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=10 title='Controlling false discovery rate (FDR)'>
-         10
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=11 title='Example with 10 P-values'>
-         11
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=12 title='Adjusted P-values'>
-         12
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=13 title='Case study I: no true positives'>
-         13
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=14 title='Case study I: no true positives'>
-         14
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=15 title='Case study II: 50% true positives'>
-         15
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=16 title='Case study II: 50% true positives'>
-         16
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=17 title='Case study II: 50% true positives'>
-         17
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=18 title='Notes and resources'>
-         18
-      </a>
-    </li>
-  </ul>
-  </div>  <!--[if IE]>
-    <script 
-      src="http://ajax.googleapis.com/ajax/libs/chrome-frame/1/CFInstall.min.js">  
-    </script>
-    <script>CFInstall.check({mode: 'overlay'});</script>
-  <![endif]-->
-</body>
-  <!-- Load Javascripts for Widgets -->
-  
-  <!-- MathJax: Fall back to local if CDN offline but local image fonts are not supported (saves >100MB) -->
-  <script type="text/x-mathjax-config">
-    MathJax.Hub.Config({
-      tex2jax: {
-        inlineMath: [['$','$'], ['\\(','\\)']],
-        processEscapes: true
-      }
-    });
-  </script>
-  <script type="text/javascript" src="http://cdn.mathjax.org/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
-  <!-- <script src="https://c328740.ssl.cf1.rackcdn.com/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
-  </script> -->
-  <script>window.MathJax || document.write('<script type="text/x-mathjax-config">MathJax.Hub.Config({"HTML-CSS":{imageFont:null}});<\/script><script src="../../librariesNew/widgets/mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"><\/script>')
-</script>
-<!-- LOAD HIGHLIGHTER JS FILES -->
-  <script src="../../librariesNew/highlighters/highlight.js/highlight.pack.js"></script>
-  <script>hljs.initHighlightingOnLoad();</script>
-  <!-- DONE LOADING HIGHLIGHTER JS FILES -->
-   
+<!DOCTYPE html>
+<html>
+<head>
+  <title>Multiple testing</title>
+  <meta charset="utf-8">
+  <meta name="description" content="Multiple testing">
+  <meta name="author" content="Brian Caffo, Jeffrey Leek, Roger Peng">
+  <meta name="generator" content="slidify" />
+  <meta name="apple-mobile-web-app-capable" content="yes">
+  <meta http-equiv="X-UA-Compatible" content="chrome=1">
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/default.css" media="all" >
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/phone.css" 
+    media="only screen and (max-device-width: 480px)" >
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/slidify.css" >
+  <link rel="stylesheet" href="../../librariesNew/highlighters/highlight.js/css/tomorrow.css" />
+  <base target="_blank"> <!-- This amazingness opens all links in a new tab. -->  
+  
+  <!-- Grab CDN jQuery, fall back to local if offline -->
+  <script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.7.min.js"></script>
+  <script>window.jQuery || document.write('<script src="../../librariesNew/widgets/quiz/js/jquery.js"><\/script>')</script> 
+  <script data-main="../../librariesNew/frameworks/io2012/js/slides" 
+    src="../../librariesNew/frameworks/io2012/js/require-1.0.8.min.js">
+  </script>
+  
+  
+
+</head>
+<body style="opacity: 0">
+  <slides class="layout-widescreen">
+    
+    <!-- LOGO SLIDE -->
+        <slide class="title-slide segue nobackground">
+  <aside class="gdbar">
+    <img src="../../assets/img/bloomberg_shield.png">
+  </aside>
+  <hgroup class="auto-fadein">
+    <h1>Multiple testing</h1>
+    <h2>Statistical Inference</h2>
+    <p>Brian Caffo, Jeffrey Leek, Roger Peng<br/>Johns Hopkins Bloomberg School of Public Health</p>
+  </hgroup>
+  <article></article>  
+</slide>
+    
+
+    <!-- SLIDES -->
+    <slide class="" id="slide-1" style="background:;">
+  <hgroup>
+    <h2>Key ideas</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Hypothesis testing/significance analysis is commonly overused</li>
+<li>Correcting for multiple testing helps avoid false positives (false discoveries)</li>
+<li>Two key components
+
+<ul>
+<li>Error measure</li>
+<li>Correction</li>
+</ul></li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-2" style="background:;">
+  <hgroup>
+    <h2>Three eras of statistics</h2>
+  </hgroup>
+  <article data-timings="">
+    <p><strong>The age of Quetelet and his successors, in which huge census-level data sets were brought to bear on simple but important questions</strong>: Are there more male than female births? Is the rate of insanity rising?</p>
+
+<p>The classical period of Pearson, Fisher, Neyman, Hotelling, and their successors, intellectual giants who <strong>developed a theory of optimal inference capable of wringing every drop of information out of a scientific experiment</strong>. The questions dealt with still tended to be simple: Is treatment A better than treatment B? </p>
+
+<p><strong>The era of scientific mass production</strong>, in which new technologies typified by the microarray allow a single team of scientists to produce data sets of a size Quetelet would envy. But now the flood of data is accompanied by a deluge of questions, perhaps thousands of estimates or hypothesis tests that the statistician is charged with answering together; not at all what the classical masters had in mind. Which variables matter among the thousands measured? How do you relate unrelated information?</p>
+
+<p><a href="http://www-stat.stanford.edu/%7Eckirby/brad/papers/2010LSIexcerpt.pdf">http://www-stat.stanford.edu/~ckirby/brad/papers/2010LSIexcerpt.pdf</a></p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-3" style="background:;">
+  <hgroup>
+    <h2>Reasons for multiple testing</h2>
+  </hgroup>
+  <article data-timings="">
+    <p><img class=center src=fig/datasources.png height=450></p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-4" style="background:;">
+  <hgroup>
+    <h2>Why correct for multiple tests?</h2>
+  </hgroup>
+  <article data-timings="">
+    <p><img class=center src=fig/jellybeans1.png height=450></p>
+
+<p><a href="http://xkcd.com/882/">http://xkcd.com/882/</a></p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-5" style="background:;">
+  <hgroup>
+    <h2>Why correct for multiple tests?</h2>
+  </hgroup>
+  <article data-timings="">
+    <p><img class=center src=fig/jellybeans2.png height=400></p>
+
+<p><a href="http://xkcd.com/882/">http://xkcd.com/882/</a></p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-6" style="background:;">
+  <hgroup>
+    <h2>Types of errors</h2>
+  </hgroup>
+  <article data-timings="">
+    <p>Suppose you are testing a hypothesis that a parameter \(\beta\) equals zero versus the alternative that it does not equal zero. These are the possible outcomes. 
+<br/><br/></p>
+
+<table><thead>
+<tr>
+<th></th>
+<th>\(\beta=0\)</th>
+<th>\(\beta\neq0\)</th>
+<th>Claims</th>
+</tr>
+</thead><tbody>
+<tr>
+<td>Claim \(\beta=0\)</td>
+<td>\(U\)</td>
+<td>\(T\)</td>
+<td>\(m-R\)</td>
+</tr>
+<tr>
+<td>Claim \(\beta\neq 0\)</td>
+<td>\(V\)</td>
+<td>\(S\)</td>
+<td>\(R\)</td>
+</tr>
+<tr>
+<td>Hypotheses</td>
+<td>\(m_0\)</td>
+<td>\(m-m_0\)</td>
+<td>\(m\)</td>
+</tr>
+</tbody></table>
+
+<p><br/><br/></p>
+
+<p><strong>Type I error or false positive (\(V\))</strong> Say that the parameter does not equal zero when it does</p>
+
+<p><strong>Type II error or false negative (\(T\))</strong> Say that the parameter equals zero when it doesn&#39;t </p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-7" style="background:;">
+  <hgroup>
+    <h2>Error rates</h2>
+  </hgroup>
+  <article data-timings="">
+    <p><strong>False positive rate</strong> - The rate at which false results (\(\beta = 0\)) are called significant: \(E\left[\frac{V}{m_0}\right]\)*</p>
+
+<p><strong>Family wise error rate (FWER)</strong> - The probability of at least one false positive \({\rm Pr}(V \geq 1)\)</p>
+
+<p><strong>False discovery rate (FDR)</strong> - The rate at which claims of significance are false \(E\left[\frac{V}{R}\right]\)</p>
+
+<ul>
+<li>The false positive rate is closely related to the type I error rate <a href="http://en.wikipedia.org/wiki/False_positive_rate">http://en.wikipedia.org/wiki/False_positive_rate</a></li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-8" style="background:;">
+  <hgroup>
+    <h2>Controlling the false positive rate</h2>
+  </hgroup>
+  <article data-timings="">
+    <p>If P-values are correctly calculated, calling all \(P < \alpha\) significant will control the false positive rate at level \(\alpha\) on average. </p>
+
+<p><redtext>Problem</redtext>: Suppose that you perform 10,000 tests and \(\beta = 0\) for all of them. </p>
+
+<p>Suppose that you call all \(P < 0.05\) significant. </p>
+
+<p>The expected number of false positives is \(10{,}000 \times 0.05 = 500\). </p>
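+
+<p>A minimal simulation sketch of this arithmetic, assuming that correctly calculated P-values are uniform on \([0,1]\) when \(\beta = 0\). The seed and the names <code>m</code>, <code>alpha</code>, and <code>nullP</code> are illustrative choices, not part of the original slides.</p>
+
+<pre><code class="r">set.seed(1)                # arbitrary seed, only for reproducibility
+m &lt;- 10000                 # number of tests, all with beta = 0
+alpha &lt;- 0.05
+nullP &lt;- runif(m)          # assumption: null P-values are Uniform(0, 1)
+
+m * alpha                  # expected number of false positives: 500
+sum(nullP &lt; alpha)         # simulated count, which should land near 500
+</code></pre>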
+
+<p><strong>How do we avoid so many false positives?</strong></p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-9" style="background:;">
+  <hgroup>
+    <h2>Controlling family-wise error rate (FWER)</h2>
+  </hgroup>
+  <article data-timings="">
+    <p>The <a href="http://en.wikipedia.org/wiki/Bonferroni_correction">Bonferroni correction</a> is the oldest multiple testing correction. </p>
+
+<p><strong>Basic idea</strong>: </p>
+
+<ul>
+<li>Suppose you do \(m\) tests</li>
+<li>You want to control FWER at level \(\alpha\) so \(Pr(V \geq 1) < \alpha\)</li>
+<li>Calculate P-values normally</li>
+<li>Set \(\alpha_{fwer} = \alpha/m\)</li>
+<li>Call all \(P\)-values less than \(\alpha_{fwer}\) significant</li>
+</ul>
+
+<p><strong>Pros</strong>: Easy to calculate, conservative</p>
+
+<p><strong>Cons</strong>: May be very conservative</p>
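+
+<p>The steps above can be sketched in a few lines of R. The five P-values are made up for illustration, and <code>alphaFwer</code> is a hypothetical variable name.</p>
+
+<pre><code class="r">p &lt;- c(0.0001, 0.004, 0.012, 0.03, 0.2)   # hypothetical P-values
+alpha &lt;- 0.05
+m &lt;- length(p)
+
+alphaFwer &lt;- alpha / m     # Bonferroni threshold: 0.05 / 5 = 0.01
+p &lt; alphaFwer              # only the first two P-values fall below 0.01
+</code></pre>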
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-10" style="background:;">
+  <hgroup>
+    <h2>Controlling false discovery rate (FDR)</h2>
+  </hgroup>
+  <article data-timings="">
+    <p>This is the most popular correction when performing <em>lots</em> of tests, say in genomics, imaging, astronomy, or other signal-processing disciplines. </p>
+
+<p><strong>Basic idea</strong>: </p>
+
+<ul>
+<li>Suppose you do \(m\) tests</li>
+<li>You want to control FDR at level \(\alpha\) so \(E\left[\frac{V}{R}\right] \leq \alpha\)</li>
+<li>Calculate P-values normally</li>
+<li>Order the P-values from smallest to largest \(P_{(1)},...,P_{(m)}\)</li>
+<li>Find the largest \(i\) with \(P_{(i)} \leq \alpha \times \frac{i}{m}\) and call \(P_{(1)},...,P_{(i)}\) significant</li>
+</ul>
+
+<p><strong>Pros</strong>: Still pretty easy to calculate, less conservative (maybe much less)</p>
+
+<p><strong>Cons</strong>: Allows for more false positives, may behave strangely under dependence</p>
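+
+<p>A minimal sketch of the Benjamini-Hochberg (BH) step-up rule, checked against <code>p.adjust</code>: find the largest \(i\) with \(P_{(i)} \leq \alpha \times \frac{i}{m}\), then call the \(i\) smallest P-values significant. The P-values and the names <code>ord</code>, <code>passes</code>, <code>k</code>, and <code>significant</code> are illustrative assumptions.</p>
+
+<pre><code class="r">p &lt;- c(0.039, 0.001, 0.5, 0.012, 0.035)   # hypothetical P-values
+alpha &lt;- 0.05
+m &lt;- length(p)
+
+ord &lt;- order(p)                                # indices of P-values, smallest first
+passes &lt;- p[ord] &lt;= alpha * seq_len(m) / m     # compare P_(i) with alpha * i / m
+k &lt;- if (any(passes)) max(which(passes)) else 0
+
+significant &lt;- rep(FALSE, m)
+if (k &gt; 0) significant[ord[seq_len(k)]] &lt;- TRUE
+significant
+
+# Same calls as thresholding the BH-adjusted P-values
+identical(significant, p.adjust(p, method = &quot;BH&quot;) &lt;= alpha)
+</code></pre>
+
+<p>In this made-up example, 0.035 exceeds its own threshold of 0.03 but is still called significant because the larger value 0.039 passes its threshold of 0.04.</p>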
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-11" style="background:;">
+  <hgroup>
+    <h2>Example with 10 P-values</h2>
+  </hgroup>
+  <article data-timings="">
+    <p><img class=center src=fig/example10pvals.png height=450></p>
+
+<p>Controlling all error rates at \(\alpha = 0.20\)</p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-12" style="background:;">
+  <hgroup>
+    <h2>Adjusted P-values</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>One approach is to adjust the threshold \(\alpha\)</li>
+<li>A different approach is to calculate &quot;adjusted p-values&quot;</li>
+<li>They <em>are not p-values</em> anymore</li>
+<li>But they can be used directly without adjusting \(\alpha\)</li>
+</ul>
+
+<p><strong>Example</strong>: </p>
+
+<ul>
+<li>Suppose P-values are \(P_1,\ldots,P_m\)</li>
+<li>You could adjust them by taking \(P_i^{fwer} = \min(m \times P_i, 1)\) for each P-value.</li>
+<li>Then if you call all \(P_i^{fwer} < \alpha\) significant you will control the FWER. </li>
+</ul>
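+
+<p>A short sketch of this adjustment, assuming \(m \times P_i\) is capped at 1 as <code>p.adjust</code> does for <code>method = &quot;bonferroni&quot;</code>. The P-values and the name <code>pFwer</code> are hypothetical.</p>
+
+<pre><code class="r">p &lt;- c(0.0001, 0.004, 0.012, 0.03, 0.2)   # hypothetical P-values
+m &lt;- length(p)
+
+pFwer &lt;- pmin(m * p, 1)    # 0.0005, 0.02, 0.06, 0.15, 1
+pFwer &lt; 0.05               # compare directly with alpha, no threshold change
+
+# Agrees with the built-in Bonferroni adjustment
+all.equal(pFwer, p.adjust(p, method = &quot;bonferroni&quot;))
+</code></pre>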
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-13" style="background:;">
+  <hgroup>
+    <h2>Case study I: no true positives</h2>
+  </hgroup>
+  <article data-timings="">
+    <pre><code class="r">set.seed(1010093)
+pValues &lt;- rep(NA, 1000)
+for (i in 1:1000) {
+    y &lt;- rnorm(20)
+    x &lt;- rnorm(20)
+    pValues[i] &lt;- summary(lm(y ~ x))$coeff[2, 4]
+}
+
+# Controls false positive rate
+sum(pValues &lt; 0.05)
+</code></pre>
+
+<pre><code>## [1] 51
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-14" style="background:;">
+  <hgroup>
+    <h2>Case study I: no true positives</h2>
+  </hgroup>
+  <article data-timings="">
+    <pre><code class="r"># Controls FWER
+sum(p.adjust(pValues, method = &quot;bonferroni&quot;) &lt; 0.05)
+</code></pre>
+
+<pre><code>## [1] 0
+</code></pre>
+
+<pre><code class="r"># Controls FDR
+sum(p.adjust(pValues, method = &quot;BH&quot;) &lt; 0.05)
+</code></pre>
+
+<pre><code>## [1] 0
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-15" style="background:;">
+  <hgroup>
+    <h2>Case study II: 50% true positives</h2>
+  </hgroup>
+  <article data-timings="">
+    <pre><code class="r">set.seed(1010093)
+pValues &lt;- rep(NA, 1000)
+for (i in 1:1000) {
+    x &lt;- rnorm(20)
+    # First 500 beta=0, last 500 beta=2
+    if (i &lt;= 500) {
+        y &lt;- rnorm(20)
+    } else {
+        y &lt;- rnorm(20, mean = 2 * x)
+    }
+    pValues[i] &lt;- summary(lm(y ~ x))$coeff[2, 4]
+}
+trueStatus &lt;- rep(c(&quot;zero&quot;, &quot;not zero&quot;), each = 500)
+table(pValues &lt; 0.05, trueStatus)
+</code></pre>
+
+<pre><code>##        trueStatus
+##         not zero zero
+##   FALSE        0  476
+##   TRUE       500   24
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-16" style="background:;">
+  <hgroup>
+    <h2>Case study II: 50% true positives</h2>
+  </hgroup>
+  <article data-timings="">
+    <pre><code class="r"># Controls FWER
+table(p.adjust(pValues, method = &quot;bonferroni&quot;) &lt; 0.05, trueStatus)
+</code></pre>
+
+<pre><code>##        trueStatus
+##         not zero zero
+##   FALSE       23  500
+##   TRUE       477    0
+</code></pre>
+
+<pre><code class="r"># Controls FDR
+table(p.adjust(pValues, method = &quot;BH&quot;) &lt; 0.05, trueStatus)
+</code></pre>
+
+<pre><code>##        trueStatus
+##         not zero zero
+##   FALSE        0  487
+##   TRUE       500   13
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-17" style="background:;">
+  <hgroup>
+    <h2>Case study II: 50% true positives</h2>
+  </hgroup>
+  <article data-timings="">
+    <p><strong>P-values versus adjusted P-values</strong></p>
+
+<pre><code class="r">par(mfrow = c(1, 2))
+plot(pValues, p.adjust(pValues, method = &quot;bonferroni&quot;), pch = 19)
+plot(pValues, p.adjust(pValues, method = &quot;BH&quot;), pch = 19)
+</code></pre>
+
+<p><img src="assets/fig/unnamed-chunk-3.png" alt="plot of chunk unnamed-chunk-3"> </p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-18" style="background:;">
+  <hgroup>
+    <h2>Notes and resources</h2>
+  </hgroup>
+  <article data-timings="">
+    <p><strong>Notes</strong>:</p>
+
+<ul>
+<li>Multiple testing is an entire subfield</li>
+<li>A basic Bonferroni/BH correction is usually enough</li>
+<li>If there is strong dependence between tests, there may be problems
+
+<ul>
+<li>Consider <code>p.adjust</code> with method=&quot;BY&quot;</li>
+</ul></li>
+</ul>
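+
+<p>A brief sketch comparing the two FDR adjustments available in <code>p.adjust</code>, using made-up P-values. The Benjamini-Yekutieli (BY) adjustment is never smaller than the BH one, so it is the more conservative choice under dependence.</p>
+
+<pre><code class="r">p &lt;- c(0.001, 0.012, 0.035, 0.039, 0.5)   # hypothetical P-values
+pBH &lt;- p.adjust(p, method = &quot;BH&quot;)
+pBY &lt;- p.adjust(p, method = &quot;BY&quot;)
+
+cbind(p, pBH, pBY)         # side-by-side view of raw and adjusted values
+all(pBY &gt;= pBH)            # BY-adjusted values are at least as large as BH
+</code></pre>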
+
+<p><strong>Further resources</strong>:</p>
+
+<ul>
+<li><a href="http://www.amazon.com/Multiple-Procedures-Applications-Genomics-Statistics/dp/0387493166/ref=sr_1_2/102-3292576-129059?ie=UTF8&amp;s=books&amp;qid=1187394873&amp;sr=1-2">Multiple testing procedures with applications to genomics</a></li>
+<li><a href="http://www.pnas.org/content/100/16/9440.full">Statistical significance for genome-wide studies</a></li>
+<li><a href="http://ies.ed.gov/ncee/pubs/20084018/app_b.asp">Introduction to multiple testing</a></li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+    <slide class="backdrop"></slide>
+  </slides>
+  <div class="pagination pagination-small" id='io2012-ptoc' style="display:none;">
+    <ul>
+      <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=1 title='Key ideas'>
+         1
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=2 title='Three eras of statistics'>
+         2
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=3 title='Reasons for multiple testing'>
+         3
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=4 title='Why correct for multiple tests?'>
+         4
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=5 title='Why correct for multiple tests?'>
+         5
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=6 title='Types of errors'>
+         6
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=7 title='Error rates'>
+         7
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=8 title='Controlling the false positive rate'>
+         8
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=9 title='Controlling family-wise error rate (FWER)'>
+         9
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=10 title='Controlling false discovery rate (FDR)'>
+         10
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=11 title='Example with 10 P-values'>
+         11
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=12 title='Adjusted P-values'>
+         12
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=13 title='Case study I: no true positives'>
+         13
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=14 title='Case study I: no true positives'>
+         14
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=15 title='Case study II: 50% true positives'>
+         15
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=16 title='Case study II: 50% true positives'>
+         16
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=17 title='Case study II: 50% true positives'>
+         17
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=18 title='Notes and resources'>
+         18
+      </a>
+    </li>
+  </ul>
+  </div>  <!--[if IE]>
+    <script 
+      src="http://ajax.googleapis.com/ajax/libs/chrome-frame/1/CFInstall.min.js">  
+    </script>
+    <script>CFInstall.check({mode: 'overlay'});</script>
+  <![endif]-->
+</body>
+  <!-- Load Javascripts for Widgets -->
+  
+  <!-- MathJax: Fall back to local if CDN offline but local image fonts are not supported (saves >100MB) -->
+  <script type="text/x-mathjax-config">
+    MathJax.Hub.Config({
+      tex2jax: {
+        inlineMath: [['$','$'], ['\\(','\\)']],
+        processEscapes: true
+      }
+    });
+  </script>
+  <script type="text/javascript" src="http://cdn.mathjax.org/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
+  <!-- <script src="https://c328740.ssl.cf1.rackcdn.com/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
+  </script> -->
+  <script>window.MathJax || document.write('<script type="text/x-mathjax-config">MathJax.Hub.Config({"HTML-CSS":{imageFont:null}});<\/script><script src="../../librariesNew/widgets/mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"><\/script>')
+</script>
+<!-- LOAD HIGHLIGHTER JS FILES -->
+  <script src="../../librariesNew/highlighters/highlight.js/highlight.pack.js"></script>
+  <script>hljs.initHighlightingOnLoad();</script>
+  <!-- DONE LOADING HIGHLIGHTER JS FILES -->
+   
   </html>
\ No newline at end of file
diff --git a/06_StatisticalInference/03_05_MultipleTesting/index.md b/06_StatisticalInference/03_05_MultipleTesting/index.md
index 317eae1a7..08f1afa2f 100644
--- a/06_StatisticalInference/03_05_MultipleTesting/index.md
+++ b/06_StatisticalInference/03_05_MultipleTesting/index.md
@@ -1,309 +1,308 @@
----
-title       : Multiple testing
-subtitle    : Statistical Inference 
-author      : Brian Caffo, Jeffrey Leek, Roger Peng
-job         : Johns Hopkins Bloomberg School of Public Health
-logo        : bloomberg_shield.png
-framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
-highlighter : highlight.js  # {highlight.js, prettify, highlight}
-hitheme     : tomorrow   # 
-url:
-  lib: ../../librariesNew
-  assets: ../../assets
-widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
-mode        : selfcontained # {standalone, draft}
----
-
-
-
-
-
-## Key ideas
-
-* Hypothesis testing/significance analysis is commonly overused
-* Correcting for multiple testing avoids false positives or discoveries
-* Two key components
-  * Error measure
-  * Correction
-
-
----
-
-## Three eras of statistics
-
-__The age of Quetelet and his successors, in which huge census-level data sets were brought to bear on simple but important questions__: Are there more male than female births? Is the rate of insanity rising?
-
-The classical period of Pearson, Fisher, Neyman, Hotelling, and their successors, intellectual giants who __developed a theory of optimal inference capable of wringing every drop of information out of a scientific experiment__. The questions dealt with still tended to be simple Is treatment A better than treatment B? 
-
-__The era of scientific mass production__, in which new technologies typified by the microarray allow a single team of scientists to produce data sets of a size Quetelet would envy. But now the flood of data is accompanied by a deluge of questions, perhaps thousands of estimates or hypothesis tests that the statistician is charged with answering together; not at all what the classical masters had in mind. Which variables matter among the thousands measured? How do you relate unrelated information?
-
-[http://www-stat.stanford.edu/~ckirby/brad/papers/2010LSIexcerpt.pdf](http://www-stat.stanford.edu/~ckirby/brad/papers/2010LSIexcerpt.pdf)
-
----
-
-## Reasons for multiple testing
-
-<img class=center src=fig/datasources.png height=450>
-
-
----
-
-## Why correct for multiple tests?
-
-<img class=center src=fig/jellybeans1.png height=450>
-
-
-[http://xkcd.com/882/](http://xkcd.com/882/)
-
----
-
-## Why correct for multiple tests?
-
-<img class=center src=fig/jellybeans2.png height=400>
-
-[http://xkcd.com/882/](http://xkcd.com/882/)
-
-
----
-
-## Types of errors
-
-Suppose you are testing a hypothesis that a parameter $\beta$ equals zero versus the alternative that it does not equal zero. These are the possible outcomes. 
-</br></br>
-
-                    | $\beta=0$   | $\beta\neq0$   |  Hypotheses
---------------------|-------------|----------------|---------
-Claim $\beta=0$     |      $U$    |      $T$       |  $m-R$
-Claim $\beta\neq 0$ |      $V$    |      $S$       |  $R$
-    Claims          |     $m_0$   |      $m-m_0$   |  $m$
-
-</br></br>
-
-__Type I error or false positive ($V$)__ Say that the parameter does not equal zero when it does
-
-__Type II error or false negative ($T$)__ Say that the parameter equals zero when it doesn't 
-
-
----
-
-## Error rates
-
-__False positive rate__ - The rate at which false results ($\beta = 0$) are called significant: $E\left[\frac{V}{m_0}\right]$*
-
-__Family wise error rate (FWER)__ - The probability of at least one false positive ${\rm Pr}(V \geq 1)$
-
-__False discovery rate (FDR)__ - The rate at which claims of significance are false $E\left[\frac{V}{R}\right]$
-
-* The false positive rate is closely related to the type I error rate [http://en.wikipedia.org/wiki/False_positive_rate](http://en.wikipedia.org/wiki/False_positive_rate)
-
----
-
-## Controlling the false positive rate
-
-If P-values are correctly calculated calling all $P < \alpha$ significant will control the false positive rate at level $\alpha$ on average. 
-
-<redtext>Problem</redtext>: Suppose that you perform 10,000 tests and $\beta = 0$ for all of them. 
-
-Suppose that you call all $P < 0.05$ significant. 
-
-The expected number of false positives is: $10,000 \times 0.05 = 500$  false positives. 
-
-__How do we avoid so many false positives?__
-
-
----
-
-## Controlling family-wise error rate (FWER)
-
-
-The [Bonferroni correction](http://en.wikipedia.org/wiki/Bonferroni_correction) is the oldest multiple testing correction. 
-
-__Basic idea__: 
-* Suppose you do $m$ tests
-* You want to control FWER at level $\alpha$ so $Pr(V \geq 1) < \alpha$
-* Calculate P-values normally
-* Set $\alpha_{fwer} = \alpha/m$
-* Call all $P$-values less than $\alpha_{fwer}$ significant
-
-__Pros__: Easy to calculate, conservative
-__Cons__: May be very conservative
-
-
----
-
-## Controlling false discovery rate (FDR)
-
-This is the most popular correction when performing _lots_ of tests say in genomics, imaging, astronomy, or other signal-processing disciplines. 
-
-__Basic idea__: 
-* Suppose you do $m$ tests
-* You want to control FDR at level $\alpha$ so $E\left[\frac{V}{R}\right]$
-* Calculate P-values normally
-* Order the P-values from smallest to largest $P_{(1)},...,P_{(m)}$
-* Call any $P_{(i)} \leq \alpha \times \frac{i}{m}$ significant
-
-__Pros__: Still pretty easy to calculate, less conservative (maybe much less)
-
-__Cons__: Allows for more false positives, may behave strangely under dependence
-
----
-
-## Example with 10 P-values
-
-<img class=center src=fig/example10pvals.png height=450>
-
-Controlling all error rates at $\alpha = 0.20$
-
----
-
-## Adjusted P-values
-
-* One approach is to adjust the threshold $\alpha$
-* A different approach is to calculate "adjusted p-values"
-* They _are not p-values_ anymore
-* But they can be used directly without adjusting $\alpha$
-
-__Example__: 
-* Suppose P-values are $P_1,\ldots,P_m$
-* You could adjust them by taking $P_i^{fwer} = \max{m \times P_i,1}$ for each P-value.
-* Then if you call all $P_i^{fwer} < \alpha$ significant you will control the FWER. 
-
----
-
-## Case study I: no true positives
-
-
-```r
-set.seed(1010093)
-pValues <- rep(NA,1000)
-for(i in 1:1000){
-  y <- rnorm(20)
-  x <- rnorm(20)
-  pValues[i] <- summary(lm(y ~ x))$coeff[2,4]
-}
-
-# Controls false positive rate
-sum(pValues < 0.05)
-```
-
-```
-[1] 51
-```
-
-
----
-
-## Case study I: no true positives
-
-
-```r
-# Controls FWER 
-sum(p.adjust(pValues,method="bonferroni") < 0.05)
-```
-
-```
-[1] 0
-```
-
-```r
-# Controls FDR 
-sum(p.adjust(pValues,method="BH") < 0.05)
-```
-
-```
-[1] 0
-```
-
-
-
----
-
-## Case study II: 50% true positives
-
-
-```r
-set.seed(1010093)
-pValues <- rep(NA,1000)
-for(i in 1:1000){
-  x <- rnorm(20)
-  # First 500 beta=0, last 500 beta=2
-  if(i <= 500){y <- rnorm(20)}else{ y <- rnorm(20,mean=2*x)}
-  pValues[i] <- summary(lm(y ~ x))$coeff[2,4]
-}
-trueStatus <- rep(c("zero","not zero"),each=500)
-table(pValues < 0.05, trueStatus)
-```
-
-```
-       trueStatus
-        not zero zero
-  FALSE        0  476
-  TRUE       500   24
-```
-
-
----
-
-
-## Case study II: 50% true positives
-
-
-```r
-# Controls FWER 
-table(p.adjust(pValues,method="bonferroni") < 0.05,trueStatus)
-```
-
-```
-       trueStatus
-        not zero zero
-  FALSE       23  500
-  TRUE       477    0
-```
-
-```r
-# Controls FDR 
-table(p.adjust(pValues,method="BH") < 0.05,trueStatus)
-```
-
-```
-       trueStatus
-        not zero zero
-  FALSE        0  487
-  TRUE       500   13
-```
-
-
-
----
-
-
-## Case study II: 50% true positives
-
-__P-values versus adjusted P-values__
-
-```r
-par(mfrow=c(1,2))
-plot(pValues,p.adjust(pValues,method="bonferroni"),pch=19)
-plot(pValues,p.adjust(pValues,method="BH"),pch=19)
-```
-
-<div class="rimage center"><img src="fig/unnamed-chunk-3.png" title="plot of chunk unnamed-chunk-3" alt="plot of chunk unnamed-chunk-3" class="plot" /></div>
-
-
-
----
-
-
-## Notes and resources
-
-__Notes__:
-* Multiple testing is an entire subfield
-* A basic Bonferroni/BH correction is usually enough
-* If there is strong dependence between tests there may be problems
-  * Consider method="BY"
-
-__Further resources__:
-* [Multiple testing procedures with applications to genomics](http://www.amazon.com/Multiple-Procedures-Applications-Genomics-Statistics/dp/0387493166/ref=sr_1_2/102-3292576-129059?ie=UTF8&s=books&qid=1187394873&sr=1-2)
-* [Statistical significance for genome-wide studies](http://www.pnas.org/content/100/16/9440.full)
-* [Introduction to multiple testing](http://ies.ed.gov/ncee/pubs/20084018/app_b.asp)
-
+---
+title       : Multiple testing
+subtitle    : Statistical Inference 
+author      : Brian Caffo, Jeffrey Leek, Roger Peng
+job         : Johns Hopkins Bloomberg School of Public Health
+logo        : bloomberg_shield.png
+framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
+highlighter : highlight.js  # {highlight.js, prettify, highlight}
+hitheme     : tomorrow   # 
+url:
+  lib: ../../librariesNew
+  assets: ../../assets
+widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
+mode        : selfcontained # {standalone, draft}
+---
+## Key ideas
+
+* Hypothesis testing/significance analysis is commonly overused
+* Correcting for multiple testing avoids false positives or discoveries
+* Two key components
+  * Error measure
+  * Correction
+
+
+---
+
+## Three eras of statistics
+
+__The age of Quetelet and his successors, in which huge census-level data sets were brought to bear on simple but important questions__: Are there more male than female births? Is the rate of insanity rising?
+
+The classical period of Pearson, Fisher, Neyman, Hotelling, and their successors, intellectual giants who __developed a theory of optimal inference capable of wringing every drop of information out of a scientific experiment__. The questions dealt with still tended to be simple: Is treatment A better than treatment B? 
+
+__The era of scientific mass production__, in which new technologies typified by the microarray allow a single team of scientists to produce data sets of a size Quetelet would envy. But now the flood of data is accompanied by a deluge of questions, perhaps thousands of estimates or hypothesis tests that the statistician is charged with answering together; not at all what the classical masters had in mind. Which variables matter among the thousands measured? How do you relate unrelated information?
+
+[http://www-stat.stanford.edu/~ckirby/brad/papers/2010LSIexcerpt.pdf](http://www-stat.stanford.edu/~ckirby/brad/papers/2010LSIexcerpt.pdf)
+
+---
+
+## Reasons for multiple testing
+
+<img class=center src=fig/datasources.png height=450>
+
+
+---
+
+## Why correct for multiple tests?
+
+<img class=center src=fig/jellybeans1.png height=450>
+
+
+[http://xkcd.com/882/](http://xkcd.com/882/)
+
+---
+
+## Why correct for multiple tests?
+
+<img class=center src=fig/jellybeans2.png height=400>
+
+[http://xkcd.com/882/](http://xkcd.com/882/)
+
+
+---
+
+## Types of errors
+
+Suppose you are testing a hypothesis that a parameter $\beta$ equals zero versus the alternative that it does not equal zero. These are the possible outcomes. 
+</br></br>
+
+                    | $\beta=0$   | $\beta\neq0$   |  Claims
+--------------------|-------------|----------------|---------
+Claim $\beta=0$     |      $U$    |      $T$       |  $m-R$
+Claim $\beta\neq 0$ |      $V$    |      $S$       |  $R$
+    Hypotheses      |     $m_0$   |      $m-m_0$   |  $m$
+
+</br></br>
+
+__Type I error or false positive ($V$)__ Claim that the parameter does not equal zero when it actually equals zero
+
+__Type II error or false negative ($T$)__ Claim that the parameter equals zero when it actually does not
+
+
+---
+
+## Error rates
+
+__False positive rate__ - The rate at which false results ($\beta = 0$) are called significant: $E\left[\frac{V}{m_0}\right]$*
+
+__Family-wise error rate (FWER)__ - The probability of at least one false positive: ${\rm Pr}(V \geq 1)$
+
+__False discovery rate (FDR)__ - The rate at which claims of significance are false $E\left[\frac{V}{R}\right]$
+
+* The false positive rate is closely related to the type I error rate [http://en.wikipedia.org/wiki/False_positive_rate](http://en.wikipedia.org/wiki/False_positive_rate)
+
+---
+
+## Controlling the false positive rate
+
+If P-values are correctly calculated, calling all $P < \alpha$ significant will control the false positive rate at level $\alpha$ on average. 
+
+<redtext>Problem</redtext>: Suppose that you perform 10,000 tests and $\beta = 0$ for all of them. 
+
+Suppose that you call all $P < 0.05$ significant. 
+
+The expected number of false positives is $10,000 \times 0.05 = 500$. 
+
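+The arithmetic behind this problem can be written out next to the family-wise error rate. The second line additionally assumes the 10,000 tests are independent, which the slide does not state:
+
+$$
+E[V] = m \times \alpha = 10,000 \times 0.05 = 500
+$$
+
+$$
+{\rm Pr}(V \geq 1) = 1 - (1 - \alpha)^m = 1 - 0.95^{10,000} \approx 1
+$$
+
+Under that assumption, at least one false positive is essentially guaranteed.
+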
+__How do we avoid so many false positives?__
+
+
+---
+
+## Controlling family-wise error rate (FWER)
+
+
+The [Bonferroni correction](http://en.wikipedia.org/wiki/Bonferroni_correction) is the oldest multiple testing correction. 
+
+__Basic idea__: 
+* Suppose you do $m$ tests
+* You want to control FWER at level $\alpha$ so $Pr(V \geq 1) < \alpha$
+* Calculate P-values normally
+* Set $\alpha_{fwer} = \alpha/m$
+* Call all $P$-values less than $\alpha_{fwer}$ significant
+
+__Pros__: Easy to calculate, conservative
+
+__Cons__: May be very conservative
+
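+The threshold steps above can be sketched in a few lines of R. The five P-values are made up for illustration and are not from the slides; `alphaFwer` is just a local name for $\alpha_{fwer}$:
+
+```r
+# Hypothetical P-values, chosen only to illustrate the steps
+pExample <- c(0.001, 0.008, 0.02, 0.045, 0.3)
+m <- length(pExample)
+alpha <- 0.05
+# Bonferroni threshold alpha / m
+alphaFwer <- alpha / m
+# Call P-values below the threshold significant
+bonfSignificant <- pExample < alphaFwer
+sum(bonfSignificant)
+```
+
+With these made-up values, only the two smallest P-values fall below $\alpha/m = 0.01$.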
+
+---
+
+## Controlling false discovery rate (FDR)
+
+This is the most popular correction when performing _lots_ of tests, say in genomics, imaging, astronomy, or other signal-processing disciplines. 
+
+__Basic idea__: 
+* Suppose you do $m$ tests
+* You want to control FDR at level $\alpha$ so $E\left[\frac{V}{R}\right] \leq \alpha$
+* Calculate P-values normally
+* Order the P-values from smallest to largest $P_{(1)},...,P_{(m)}$
+* Find the largest $i$ with $P_{(i)} \leq \alpha \times \frac{i}{m}$ and call $P_{(1)},\ldots,P_{(i)}$ significant
+
+__Pros__: Still pretty easy to calculate, less conservative (maybe much less)
+
+__Cons__: Allows for more false positives, may behave strangely under dependence
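+
+The Benjamini-Hochberg (BH) rule is a step-up procedure: it looks for the largest ordered P-value that sits at or below its own threshold and then calls everything up to that rank significant. A minimal sketch, using made-up P-values that are not from the slides and checking the result against `p.adjust`:
+
+```r
+# Hypothetical P-values, chosen only to illustrate the steps
+pExample <- c(0.001, 0.008, 0.02, 0.045, 0.3)
+m <- length(pExample)
+alpha <- 0.05
+# Order the P-values from smallest to largest
+ord <- order(pExample)
+pSorted <- pExample[ord]
+# Compare P_(i) with alpha * i / m
+below <- pSorted <= alpha * seq_len(m) / m
+# Largest rank that meets its threshold (0 if none does)
+k <- if (any(below)) max(which(below)) else 0
+bhSignificant <- rep(FALSE, m)
+bhSignificant[ord[seq_len(k)]] <- TRUE
+# Same calls as thresholding BH-adjusted P-values at alpha
+all(bhSignificant == (p.adjust(pExample, method = "BH") <= alpha))
+```
+
+Here three of the five made-up P-values are called significant, one more than a Bonferroni threshold of $\alpha/m$ would allow.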
+
+---
+
+## Example with 10 P-values
+
+<img class=center src=fig/example10pvals.png height=450>
+
+Controlling all error rates at $\alpha = 0.20$
+
+---
+
+## Adjusted P-values
+
+* One approach is to adjust the threshold $\alpha$
+* A different approach is to calculate "adjusted p-values"
+* They _are not p-values_ anymore
+* But they can be used directly without adjusting $\alpha$
+
+__Example__: 
+* Suppose P-values are $P_1,\ldots,P_m$
+* You could adjust them by taking $P_i^{fwer} = \min\{m \times P_i, 1\}$ for each P-value.
+* Then if you call all $P_i^{fwer} < \alpha$ significant you will control the FWER. 
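+
+As a quick check of this idea, the FWER-adjusted values can be computed by hand, taking the smaller of $m \times P_i$ and 1, and compared with `p.adjust`. The P-values below are made up for illustration, and `pFwer` is only a local name:
+
+```r
+# Hypothetical P-values, chosen only to illustrate the adjustment
+pExample <- c(0.001, 0.008, 0.02, 0.045, 0.3)
+m <- length(pExample)
+# Multiply by the number of tests and cap at 1
+pFwer <- pmin(m * pExample, 1)
+# Should agree with the built-in Bonferroni adjustment
+isTRUE(all.equal(pFwer, p.adjust(pExample, method = "bonferroni")))
+# Adjusted values are compared with alpha directly
+pFwer < 0.05
+```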
+
+---
+
+## Case study I: no true positives
+
+
+```r
+set.seed(1010093)
+pValues <- rep(NA, 1000)
+for (i in 1:1000) {
+    y <- rnorm(20)
+    x <- rnorm(20)
+    pValues[i] <- summary(lm(y ~ x))$coeff[2, 4]
+}
+
+# Controls false positive rate
+sum(pValues < 0.05)
+```
+
+```
+## [1] 51
+```
+
+
+---
+
+## Case study I: no true positives
+
+
+```r
+# Controls FWER
+sum(p.adjust(pValues, method = "bonferroni") < 0.05)
+```
+
+```
+## [1] 0
+```
+
+```r
+# Controls FDR
+sum(p.adjust(pValues, method = "BH") < 0.05)
+```
+
+```
+## [1] 0
+```
+
+
+
+---
+
+## Case study II: 50% true positives
+
+
+```r
+set.seed(1010093)
+pValues <- rep(NA, 1000)
+for (i in 1:1000) {
+    x <- rnorm(20)
+    # First 500 beta=0, last 500 beta=2
+    if (i <= 500) {
+        y <- rnorm(20)
+    } else {
+        y <- rnorm(20, mean = 2 * x)
+    }
+    pValues[i] <- summary(lm(y ~ x))$coeff[2, 4]
+}
+trueStatus <- rep(c("zero", "not zero"), each = 500)
+table(pValues < 0.05, trueStatus)
+```
+
+```
+##        trueStatus
+##         not zero zero
+##   FALSE        0  476
+##   TRUE       500   24
+```
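+
+Reading the table with the notation from the error-type slide gives $V = 24$, $S = 500$, $R = 524$ and $m_0 = 500$. The observed proportions in this single simulation (not the expectations that define the error rates) are:
+
+$$
+\frac{V}{m_0} = \frac{24}{500} = 0.048, \qquad \frac{V}{R} = \frac{24}{524} \approx 0.046
+$$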
+
+
+---
+
+
+## Case study II: 50% true positives
+
+
+```r
+# Controls FWER
+table(p.adjust(pValues, method = "bonferroni") < 0.05, trueStatus)
+```
+
+```
+##        trueStatus
+##         not zero zero
+##   FALSE       23  500
+##   TRUE       477    0
+```
+
+```r
+# Controls FDR
+table(p.adjust(pValues, method = "BH") < 0.05, trueStatus)
+```
+
+```
+##        trueStatus
+##         not zero zero
+##   FALSE        0  487
+##   TRUE       500   13
+```
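+
+The same reading of these two tables, for this one simulation only: Bonferroni makes no false positives ($V = 0$) but misses $T = 23$ of the 500 true effects, while BH finds all 500 at the cost of $V = 13$ false positives among $R = 513$ claims:
+
+$$
+\mbox{Bonferroni: } \frac{T}{m - m_0} = \frac{23}{500} = 0.046, \qquad \mbox{BH: } \frac{V}{R} = \frac{13}{513} \approx 0.025
+$$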
+
+
+
+---
+
+
+## Case study II: 50% true positives
+
+__P-values versus adjusted P-values__
+
+```r
+par(mfrow = c(1, 2))
+plot(pValues, p.adjust(pValues, method = "bonferroni"), pch = 19)
+plot(pValues, p.adjust(pValues, method = "BH"), pch = 19)
+```
+
+![plot of chunk unnamed-chunk-3](assets/fig/unnamed-chunk-3.png) 
+
+
+
+---
+
+
+## Notes and resources
+
+__Notes__:
+* Multiple testing is an entire subfield
+* A basic Bonferroni/BH correction is usually enough
+* If there is strong dependence between tests there may be problems
+  * Consider method="BY"
+
+__Further resources__:
+* [Multiple testing procedures with applications to genomics](http://www.amazon.com/Multiple-Procedures-Applications-Genomics-Statistics/dp/0387493166/ref=sr_1_2/102-3292576-129059?ie=UTF8&s=books&qid=1187394873&sr=1-2)
+* [Statistical significance for genome-wide studies](http://www.pnas.org/content/100/16/9440.full)
+* [Introduction to multiple testing](http://ies.ed.gov/ncee/pubs/20084018/app_b.asp)
+
diff --git a/06_StatisticalInference/03_05_MultipleTesting/index.pdf b/06_StatisticalInference/03_05_MultipleTesting/index.pdf
index 190c24c34..88d17ad14 100644
Binary files a/06_StatisticalInference/03_05_MultipleTesting/index.pdf and b/06_StatisticalInference/03_05_MultipleTesting/index.pdf differ
diff --git a/06_StatisticalInference/03_06_resampledInference/assets/fig/unnamed-chunk-4.png b/06_StatisticalInference/03_06_resampledInference/assets/fig/unnamed-chunk-4.png
new file mode 100644
index 000000000..d88aaaddd
Binary files /dev/null and b/06_StatisticalInference/03_06_resampledInference/assets/fig/unnamed-chunk-4.png differ
diff --git a/06_StatisticalInference/03_06_resampledInference/assets/fig/unnamed-chunk-5.png b/06_StatisticalInference/03_06_resampledInference/assets/fig/unnamed-chunk-5.png
new file mode 100644
index 000000000..57f92bdae
Binary files /dev/null and b/06_StatisticalInference/03_06_resampledInference/assets/fig/unnamed-chunk-5.png differ
diff --git a/06_StatisticalInference/03_06_resampledInference/assets/fig/unnamed-chunk-7.png b/06_StatisticalInference/03_06_resampledInference/assets/fig/unnamed-chunk-7.png
new file mode 100644
index 000000000..8542ea3a4
Binary files /dev/null and b/06_StatisticalInference/03_06_resampledInference/assets/fig/unnamed-chunk-7.png differ
diff --git a/06_StatisticalInference/03_06_resampledInference/index.Rmd b/06_StatisticalInference/03_06_resampledInference/index.Rmd
index 775099c76..3f25e8c38 100644
--- a/06_StatisticalInference/03_06_resampledInference/index.Rmd
+++ b/06_StatisticalInference/03_06_resampledInference/index.Rmd
@@ -1,259 +1,243 @@
----
-title       : Resampled inference
-subtitle    : Statistical Inference
-author      : Brian Caffo, Jeff Leek, Roger Peng
-job         : Johns Hopkins Bloomberg School of Public Health
-logo        : bloomberg_shield.png
-framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
-highlighter : highlight.js  # {highlight.js, prettify, highlight}
-hitheme     : tomorrow      # 
-url:
-  lib: ../../librariesNew
-  assets: ../../assets
-widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
-mode        : selfcontained # {standalone, draft}
-
----
-```{r setup, cache = F, echo = F, message = F, warning = F, tidy = F, results='hide'}
-# make this an external chunk that can be included in any file
-options(width = 100)
-opts_chunk$set(message = F, error = F, warning = F, comment = NA, fig.align = 'center', dpi = 100, tidy = F, cache.path = '.cache/', fig.path = 'fig/')
-
-options(xtable.type = 'html')
-knit_hooks$set(inline = function(x) {
-  if(is.numeric(x)) {
-    round(x, getOption('digits'))
-  } else {
-    paste(as.character(x), collapse = ', ')
-  }
-})
-knit_hooks$set(plot = knitr:::hook_plot_html)
-runif(1)
-```
-
-## The jackknife
-
-- The jackknife is a tool for estimating standard errors  and the bias of estimators 
-- As its name suggests, the jackknife is a small, handy tool; in contrast to the bootstrap, which is then the moral equivalent of a giant workshop full of tools
-- Both the jackknife and the bootstrap involve *resampling* data; that is, repeatedly creating new data sets from the original data
-
----
-
-## The jackknife
-
-- The jackknife deletes each observation and calculates an estimate based on the remaining $n-1$ of them
-- It uses this collection of estimates to do things like estimate the bias and the standard error
-- Note that estimating the bias and having a standard error are not needed for things like sample means, which we know are unbiased estimates of population means and what their standard errors are
-
----
-
-## The jackknife
-
-- We'll consider the jackknife for univariate data
-- Let $X_1,\ldots,X_n$ be a collection of data used to estimate a parameter $\theta$
-- Let $\hat \theta$ be the estimate based on the full data set
-- Let $\hat \theta_{i}$ be the estimate of $\theta$ obtained by *deleting observation $i$*
-- Let $\bar \theta = \frac{1}{n}\sum_{i=1}^n \hat \theta_{i}$
-
----
-
-## Continued
-
-- Then, the jackknife estimate of the bias is
-   $$
-   (n - 1) \left(\bar \theta - \hat \theta\right)
-   $$
-   (how far the average delete-one estimate is from the actual estimate)
-- The jackknife estimate of the standard error is
-   $$
-   \left[\frac{n-1}{n}\sum_{i=1}^n (\hat \theta_i - \bar\theta )^2\right]^{1/2}
-   $$
-(the deviance of the delete-one estimates from the average delete-one estimate)
-
----
-
-## Example
-### We want to estimate the bias and standard error of the median
-
-```{r, results='hide'}
-library(UsingR)
-data(father.son)
-x <- father.son$sheight
-n <- length(x)
-theta <- median(x)
-jk <- sapply(1 : n,
-             function(i) median(x[-i])
-             )
-thetaBar <- mean(jk)
-biasEst <- (n - 1) * (thetaBar - theta) 
-seEst <- sqrt((n - 1) * mean((jk - thetaBar)^2))
-```
-
----
-
-## Example
-
-```{r}
-c(biasEst, seEst)
-library(bootstrap)
-temp <- jackknife(x, median)
-c(temp$jack.bias, temp$jack.se)
-```
-
----
-
-## Example
-
-- Both methods (of course) yield an estimated bias of `r temp$jack.bias` and a se of `r temp$jack.se`
-- Odd little fact: the jackknife estimate of the bias for the median is always $0$ when the number of observations is even
-- It has been shown that the jackknife is a linear approximation to the bootstrap
-- Generally do not use the jackknife for sample quantiles like the median; as it has been shown to have some poor properties
-
----
-
-## Pseudo observations
-
-- Another interesting way to think about the jackknife uses pseudo observations
-- Let
-$$
-      \mbox{Pseudo Obs} = n \hat \theta - (n - 1) \hat \theta_{i}
-$$
-- Think of these as ``whatever observation $i$ contributes to the estimate of $\theta$''
-- Note when $\hat \theta$ is the sample mean, the pseudo observations are the data themselves
-- Then the sample standard error of these observations is the previous jackknife estimated standard error.
-- The mean of these observations is a bias-corrected estimate of $\theta$
-
----
-
-## The bootstrap
-
-- The bootstrap is a tremendously useful tool for constructing confidence intervals and calculating standard errors for difficult statistics
-- For example, how would one derive a confidence interval for the median?
-- The bootstrap procedure follows from the so called bootstrap principle
-
----
-
-## The bootstrap principle
-
-- Suppose that I have a statistic that estimates some population parameter, but I don't know its sampling distribution
-- The bootstrap principle suggests using the distribution defined by the data to approximate its sampling distribution
-
----
-
-## The bootstrap in practice
-
-- In practice, the bootstrap principle is always carried out using simulation
-- We will cover only a few aspects of bootstrap resampling
-- The general procedure follows by first simulating complete data sets from the observed data with replacement
-
-  - This is approximately drawing from the sampling distribution of that statistic, at least as far as the data is able to approximate the true population distribution
-
-- Calculate the statistic for each simulated data set
-- Use the simulated statistics to either define a confidence interval or take the standard deviation to calculate a standard error
-
----
-## Nonparametric bootstrap algorithm example
-
-- Bootstrap procedure for calculating confidence interval for the median from a data set of $n$ observations
-
-  i. Sample $n$ observations **with replacement** from the observed data resulting in one simulated complete data set
-  
-  ii. Take the median of the simulated data set
-  
-  iii. Repeat these two steps $B$ times, resulting in $B$ simulated medians
-  
-  iv. These medians are approximately drawn from the sampling distribution of the median of $n$ observations; therefore we can
-  
-    - Draw a histogram of them
-    - Calculate their standard deviation to estimate the standard error of the median
-    - Take the $2.5^{th}$ and $97.5^{th}$ percentiles as a confidence interval for the median
-
----
-
-## Example code
-
-```{r}
-B <- 1000
-resamples <- matrix(sample(x,
-                           n * B,
-                           replace = TRUE),
-                    B, n)
-medians <- apply(resamples, 1, median)
-sd(medians)
-quantile(medians, c(.025, .975))
-```
-
----
-## Histogram of bootstrap resamples
-
-```{r, fig.height=5, fig.width=5}
-hist(medians)
-```
-
----
-
-## Notes on the bootstrap
-
-- The bootstrap is non-parametric
-- Better percentile bootstrap confidence intervals correct for bias
-- There are lots of variations on bootstrap procedures; the book "An Introduction to the Bootstrap"" by Efron and Tibshirani is a great place to start for both bootstrap and jackknife information
-
-
----
-## Group comparisons
-- Consider comparing two independent groups.
-- Example, comparing sprays B and C
-
-```{r, fig.height=4, fig.width=4}
-data(InsectSprays)
-boxplot(count ~ spray, data = InsectSprays)
-```
-
----
-## Permutation tests
--  Consider the null hypothesis that the distribution of the observations from each group is the same
--  Then, the group labels are irrelevant
--  We then discard the group levels and permute the combined data
--  Split the permuted data into two groups with $n_A$ and $n_B$
-  observations (say by always treating the first $n_A$ observations as
-  the first group)
--  Evaluate the probability of getting a statistic as large or
-  large than the one observed
--  An example statistic would be the difference in the averages between the two groups;
-  one could also use a t-statistic 
-
----
-## Variations on permutation testing
-Data type | Statistic | Test name 
----|---|---|
-Ranks | rank sum | rank sum test
-Binary | hypergeometric prob | Fisher's exact test
-Raw data | | ordinary permutation test
-
-- Also, so-called *randomization tests* are exactly permutation tests, with a different motivation.
-- For matched data, one can randomize the signs
-  - For ranks, this results in the signed rank test
-- Permutation strategies work for regression as well
-  - Permuting a regressor of interest
-- Permutation tests work very well in multivariate settings
-
----
-## Permutation test for pesticide data
-```{r}
-subdata <- InsectSprays[InsectSprays$spray %in% c("B", "C"),]
-y <- subdata$count
-group <- as.character(subdata$spray)
-testStat <- function(w, g) mean(w[g == "B"]) - mean(w[g == "C"])
-observedStat <- testStat(y, group)
-permutations <- sapply(1 : 10000, function(i) testStat(y, sample(group)))
-observedStat
-mean(permutations > observedStat)
-```
-
----
-## Histogram of permutations
-```{r, echo= FALSE, fig.width=5, fig.height=5}
-hist(permutations)
-```
-
-
+---
+title       : Resampled inference
+subtitle    : Statistical Inference
+author      : Brian Caffo, Jeff Leek, Roger Peng
+job         : Johns Hopkins Bloomberg School of Public Health
+logo        : bloomberg_shield.png
+framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
+highlighter : highlight.js  # {highlight.js, prettify, highlight}
+hitheme     : tomorrow      # 
+url:
+  lib: ../../librariesNew
+  assets: ../../assets
+widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
+mode        : selfcontained # {standalone, draft}
+
+---
+
+## The jackknife
+
+- The jackknife is a tool for estimating standard errors  and the bias of estimators 
+- As its name suggests, the jackknife is a small, handy tool; the bootstrap, by contrast, is the moral equivalent of a giant workshop full of tools
+- Both the jackknife and the bootstrap involve *resampling* data; that is, repeatedly creating new data sets from the original data
+
+---
+
+## The jackknife
+
+- The jackknife deletes each observation and calculates an estimate based on the remaining $n-1$ of them
+- It uses this collection of estimates to do things like estimate the bias and the standard error
+- Note that jackknife estimates of the bias and standard error are not needed for things like sample means, which we know are unbiased estimates of population means and whose standard errors we already know
+
+---
+
+## The jackknife
+
+- We'll consider the jackknife for univariate data
+- Let $X_1,\ldots,X_n$ be a collection of data used to estimate a parameter $\theta$
+- Let $\hat \theta$ be the estimate based on the full data set
+- Let $\hat \theta_{i}$ be the estimate of $\theta$ obtained by *deleting observation $i$*
+- Let $\bar \theta = \frac{1}{n}\sum_{i=1}^n \hat \theta_{i}$
+
+---
+
+## Continued
+
+- Then, the jackknife estimate of the bias is
+   $$
+   (n - 1) \left(\bar \theta - \hat \theta\right)
+   $$
+   (how far the average delete-one estimate is from the actual estimate)
+- The jackknife estimate of the standard error is
+   $$
+   \left[\frac{n-1}{n}\sum_{i=1}^n (\hat \theta_i - \bar\theta )^2\right]^{1/2}
+   $$
+   (the deviation of the delete-one estimates from the average delete-one estimate)
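+
+---
+
+## Check: the sample mean
+
+As a sanity check on the two formulas (a worked case added for illustration, using only the definitions above), take $\hat \theta = \bar X$. Deleting observation $i$ gives
+
+$$
+\hat \theta_i = \frac{n \bar X - X_i}{n - 1}, \qquad \bar \theta = \frac{1}{n}\sum_{i=1}^n \hat \theta_i = \bar X
+$$
+
+so the jackknife estimate of the bias is $(n - 1)(\bar X - \bar X) = 0$. Since $\hat \theta_i - \bar \theta = (\bar X - X_i) / (n - 1)$, the jackknife estimate of the standard error is
+
+$$
+\left[\frac{n-1}{n}\sum_{i=1}^n \frac{(X_i - \bar X)^2}{(n - 1)^2}\right]^{1/2} = \left[\frac{1}{n} \cdot \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar X)^2\right]^{1/2} = \frac{s}{\sqrt{n}}
+$$
+
+where $s$ is the usual sample standard deviation. For the mean, the jackknife simply reproduces what we already know.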
+
+---
+
+## Example
+### We want to estimate the bias and standard error of the median
+
+```{r, results='hide'}
+library(UsingR)
+data(father.son)
+x <- father.son$sheight
+n <- length(x)
+theta <- median(x)
+jk <- sapply(1 : n,
+             function(i) median(x[-i])
+             )
+thetaBar <- mean(jk)
+biasEst <- (n - 1) * (thetaBar - theta) 
+seEst <- sqrt((n - 1) * mean((jk - thetaBar)^2))
+```
+
+---
+
+## Example test
+
+```{r}
+c(biasEst, seEst)
+library(bootstrap)
+temp <- jackknife(x, median)
+c(temp$jack.bias, temp$jack.se)
+```
+
+---
+
+## Example
+
+- Both methods (of course) yield an estimated bias of `r temp$jack.bias` and a se of `r temp$jack.se`
+- Odd little fact: the jackknife estimate of the bias for the median is always $0$ when the number of observations is even
+- It has been shown that the jackknife is a linear approximation to the bootstrap
+- Generally, do not use the jackknife for sample quantiles like the median, as it has been shown to have some poor properties
+
+---
+
+## Pseudo observations
+
+- Another interesting way to think about the jackknife uses pseudo observations
+- Let
+$$
+      \mbox{Pseudo Obs} = n \hat \theta - (n - 1) \hat \theta_{i}
+$$
+- Think of these as "whatever observation $i$ contributes to the estimate of $\theta$"
+- Note that when $\hat \theta$ is the sample mean, the pseudo observations are the data themselves
+- The standard error of the mean of these observations (their sample standard deviation divided by $\sqrt{n}$) is the previous jackknife estimated standard error
+- The mean of these observations is a bias-corrected estimate of $\theta$
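+
+---
+
+## Sketch: pseudo observations in R
+
+A minimal sketch of the two claims above, under stated assumptions: the data `z` are simulated purely for illustration, and the estimator `est` (the variance with divisor $m$ rather than $m - 1$) is a hypothetical choice picked because it is biased. None of these names come from the earlier examples.
+
+```{r}
+# Hypothetical data, for illustration only (not the father.son heights)
+set.seed(1)
+z <- rexp(25)
+m <- length(z)
+# A biased estimator: variance with divisor m instead of m - 1
+est <- function(v) mean((v - mean(v))^2)
+thetaHat <- est(z)
+deleteOne <- sapply(1 : m, function(i) est(z[-i]))
+# Pseudo observations: m * thetaHat - (m - 1) * thetaHat_i
+pseudo <- m * thetaHat - (m - 1) * deleteOne
+# Their mean is thetaHat minus the jackknife estimate of the bias
+biasJack <- (m - 1) * (mean(deleteOne) - thetaHat)
+all.equal(mean(pseudo), thetaHat - biasJack)
+# Their standard deviation over sqrt(m) is the jackknife standard error
+seJack <- sqrt((m - 1) * mean((deleteOne - mean(deleteOne))^2))
+all.equal(sd(pseudo) / sqrt(m), seJack)
+```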
+
+---
+
+## The bootstrap
+
+- The bootstrap is a tremendously useful tool for constructing confidence intervals and calculating standard errors for difficult statistics
+- For example, how would one derive a confidence interval for the median?
+- The bootstrap procedure follows from the so-called bootstrap principle
+
+---
+
+## The bootstrap principle
+
+- Suppose that I have a statistic that estimates some population parameter, but I don't know its sampling distribution
+- The bootstrap principle suggests using the distribution defined by the data to approximate its sampling distribution
+
+---
+
+## The bootstrap in practice
+
+- In practice, the bootstrap principle is always carried out using simulation
+- We will cover only a few aspects of bootstrap resampling
+- The general procedure follows by first simulating complete data sets from the observed data with replacement
+
+  - This is approximately drawing from the sampling distribution of that statistic, at least as far as the data is able to approximate the true population distribution
+
+- Calculate the statistic for each simulated data set
+- Use the simulated statistics to either define a confidence interval or take the standard deviation to calculate a standard error
+
+---
+## Nonparametric bootstrap algorithm example
+
+- Bootstrap procedure for calculating a confidence interval for the median from a data set of $n$ observations
+
+  i. Sample $n$ observations **with replacement** from the observed data resulting in one simulated complete data set
+  
+  ii. Take the median of the simulated data set
+  
+  iii. Repeat these two steps $B$ times, resulting in $B$ simulated medians
+  
+  iv. These medians are approximately drawn from the sampling distribution of the median of $n$ observations; therefore we can
+  
+    - Draw a histogram of them
+    - Calculate their standard deviation to estimate the standard error of the median
+    - Take the $2.5^{th}$ and $97.5^{th}$ percentiles as a confidence interval for the median
+
+---
+
+## Example code
+
+```{r}
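+B <- 1000                                  # number of bootstrap resamples
+resamples <- matrix(sample(x,              # draw n * B values with replacement
+                           n * B,
+                           replace = TRUE),
+                    B, n)                  # one resample of size n per row
+medians <- apply(resamples, 1, median)     # median of each resample
+sd(medians)                                # bootstrap standard error of the median
+quantile(medians, c(.025, .975))           # 95% percentile interval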
+```
+
+---
+## Histogram of bootstrap resamples
+
+```{r, fig.height=5, fig.width=5}
+hist(medians)
+```
+
+---
+
+## Notes on the bootstrap
+
+- The bootstrap is non-parametric
+- Better percentile bootstrap confidence intervals correct for bias
+- There are lots of variations on bootstrap procedures; the book "An Introduction to the Bootstrap" by Efron and Tibshirani is a great place to start for both bootstrap and jackknife information
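+
+---
+
+## Sketch: a bias-corrected interval
+
+One commonly used bias-correcting variant is the bias-corrected and accelerated (BCa) interval. The sketch below is an assumption-laden illustration rather than part of the original example: it relies on the `boot` package (not used elsewhere in these slides), and the data `w` and the helper `medianStat` are hypothetical.
+
+```{r}
+library(boot)  # assumption: the boot package is installed
+set.seed(2)
+w <- rnorm(100, mean = 68, sd = 3)   # hypothetical heights, for illustration
+# boot() calls the statistic with the data and a vector of resampled indices
+medianStat <- function(d, idx) median(d[idx])
+bootOut <- boot(w, medianStat, R = 2000)
+# Plain percentile interval next to the BCa interval
+boot.ci(bootOut, conf = 0.95, type = c("perc", "bca"))
+```
+
+Comparing the two printed intervals shows how much the correction moves the endpoints for a given sample.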
+
+
+---
+## Group comparisons
+- Consider comparing two independent groups.
+- Example: comparing sprays B and C
+
+```{r, fig.height=4, fig.width=4}
+data(InsectSprays)
+boxplot(count ~ spray, data = InsectSprays)
+```
+
+---
+## Permutation tests
+-  Consider the null hypothesis that the distribution of the observations from each group is the same
+-  Then, the group labels are irrelevant
+-  We then discard the group labels and permute the combined data
+-  Split the permuted data into two groups with $n_A$ and $n_B$
+  observations (say by always treating the first $n_A$ observations as
+  the first group)
+-  Evaluate the probability of getting a statistic as large or
+  larger than the one observed
+-  An example statistic would be the difference in the averages between the two groups;
+  one could also use a t-statistic 
+
+---
+## Variations on permutation testing
+Data type | Statistic | Test name 
+---|---|---|
+Ranks | rank sum | rank sum test
+Binary | hypergeometric prob | Fisher's exact test
+Raw data | | ordinary permutation test
+
+- Also, so-called *randomization tests* are exactly permutation tests, with a different motivation.
+- For matched data, one can randomize the signs
+  - For ranks, this results in the signed rank test
+- Permutation strategies work for regression as well
+  - Permuting a regressor of interest
+- Permutation tests work very well in multivariate settings
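+
+---
+
+## Sketch: randomizing signs for matched data
+
+A minimal sketch of the sign-randomization idea for matched data, under stated assumptions: it uses the built-in `sleep` data (not otherwise part of these slides), where the same 10 subjects were measured under two drugs, and the names `pairedDiff`, `obsMean` and `flipped` are hypothetical.
+
+```{r}
+set.seed(4)
+# Paired differences: drug 2 minus drug 1 for each of the 10 subjects
+pairedDiff <- sleep$extra[sleep$group == 2] - sleep$extra[sleep$group == 1]
+obsMean <- mean(pairedDiff)
+# Under the null, each difference is equally likely to be positive or negative
+flipped <- replicate(10000,
+                     mean(pairedDiff * sample(c(-1, 1), length(pairedDiff),
+                                              replace = TRUE)))
+# Two-sided Monte Carlo p-value, counting the observed signs as one draw
+(1 + sum(abs(flipped) >= abs(obsMean))) / (10000 + 1)
+```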
+
+---
+## Permutation test for pesticide data
+```{r}
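+# Keep only sprays B and C
+subdata <- InsectSprays[InsectSprays$spray %in% c("B", "C"),]
+y <- subdata$count
+group <- as.character(subdata$spray)
+# Test statistic: difference in mean counts, B minus C
+testStat <- function(w, g) mean(w[g == "B"]) - mean(w[g == "C"])
+observedStat <- testStat(y, group)
+# Shuffle the labels 10,000 times and recompute the statistic each time
+permutations <- sapply(1 : 10000, function(i) testStat(y, sample(group)))
+observedStat
+# Proportion of permuted statistics exceeding the observed one
+mean(permutations > observedStat)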
+```
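+
+---
+
+## Sketch: a permutation p-value that is never exactly zero
+
+With a finite number of random permutations, the proportion exceeding the observed statistic can come out as exactly $0$. A common adjustment, sketched here as an assumption rather than as part of the original analysis, counts ties as extreme and treats the observed labelling as one of the permutations. The names `bc`, `diffMeans`, `obs`, `perm` and `nPerm` are hypothetical.
+
+```{r}
+set.seed(3)
+bc <- InsectSprays[InsectSprays$spray %in% c("B", "C"), ]
+labels <- as.character(bc$spray)
+diffMeans <- function(w, g) mean(w[g == "B"]) - mean(w[g == "C"])
+obs <- diffMeans(bc$count, labels)
+nPerm <- 10000
+perm <- replicate(nPerm, diffMeans(bc$count, sample(labels)))
+# One-sided: permuted statistics at least as large as the observed one
+pOneSided <- (1 + sum(perm >= obs)) / (nPerm + 1)
+# Two-sided: compare absolute differences instead
+pTwoSided <- (1 + sum(abs(perm) >= abs(obs))) / (nPerm + 1)
+c(pOneSided, pTwoSided)
+```
+
+The smallest value either estimate can take is $1 / (10000 + 1)$, which reflects the resolution of the simulation rather than claiming a probability of zero.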
+
+---
+## Histogram of permutations
+```{r, echo= FALSE, fig.width=5, fig.height=5}
+hist(permutations)
+```
+
+
diff --git a/06_StatisticalInference/03_06_resampledInference/index.html b/06_StatisticalInference/03_06_resampledInference/index.html
index 15515b719..ce1da45ea 100644
--- a/06_StatisticalInference/03_06_resampledInference/index.html
+++ b/06_StatisticalInference/03_06_resampledInference/index.html
@@ -1,614 +1,609 @@
-<!DOCTYPE html>
-<html>
-<head>
-  <title>Resampled inference</title>
-  <meta charset="utf-8">
-  <meta name="description" content="Resampled inference">
-  <meta name="author" content="Brian Caffo, Jeff Leek, Roger Peng">
-  <meta name="generator" content="slidify" />
-  <meta name="apple-mobile-web-app-capable" content="yes">
-  <meta http-equiv="X-UA-Compatible" content="chrome=1">
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/default.css" media="all" >
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/phone.css" 
-    media="only screen and (max-device-width: 480px)" >
-  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/slidify.css" >
-  <link rel="stylesheet" href="../../librariesNew/highlighters/highlight.js/css/tomorrow.css" />
-  <base target="_blank"> <!-- This amazingness opens all links in a new tab. -->  
-  
-  <!-- Grab CDN jQuery, fall back to local if offline -->
-  <script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.7.min.js"></script>
-  <script>window.jQuery || document.write('<script src="../../librariesNew/widgets/quiz/js/jquery.js"><\/script>')</script> 
-  <script data-main="../../librariesNew/frameworks/io2012/js/slides" 
-    src="../../librariesNew/frameworks/io2012/js/require-1.0.8.min.js">
-  </script>
-  
-  
-
-</head>
-<body style="opacity: 0">
-  <slides class="layout-widescreen">
-    
-    <!-- LOGO SLIDE -->
-        <slide class="title-slide segue nobackground">
-  <aside class="gdbar">
-    <img src="../../assets/img/bloomberg_shield.png">
-  </aside>
-  <hgroup class="auto-fadein">
-    <h1>Resampled inference</h1>
-    <h2>Statistical Inference</h2>
-    <p>Brian Caffo, Jeff Leek, Roger Peng<br/>Johns Hopkins Bloomberg School of Public Health</p>
-  </hgroup>
-  <article></article>  
-</slide>
-    
-
-    <!-- SLIDES -->
-    <slide class="" id="slide-1" style="background:;">
-  <hgroup>
-    <h2>The jackknife</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>The jackknife is a tool for estimating standard errors  and the bias of estimators </li>
-<li>As its name suggests, the jackknife is a small, handy tool; in contrast to the bootstrap, which is then the moral equivalent of a giant workshop full of tools</li>
-<li>Both the jackknife and the bootstrap involve <em>resampling</em> data; that is, repeatedly creating new data sets from the original data</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-2" style="background:;">
-  <hgroup>
-    <h2>The jackknife</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>The jackknife deletes each observation and calculates an estimate based on the remaining \(n-1\) of them</li>
-<li>It uses this collection of estimates to do things like estimate the bias and the standard error</li>
-<li>Note that estimating the bias and having a standard error are not needed for things like sample means, which we know are unbiased estimates of population means and what their standard errors are</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-3" style="background:;">
-  <hgroup>
-    <h2>The jackknife</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>We&#39;ll consider the jackknife for univariate data</li>
-<li>Let \(X_1,\ldots,X_n\) be a collection of data used to estimate a parameter \(\theta\)</li>
-<li>Let \(\hat \theta\) be the estimate based on the full data set</li>
-<li>Let \(\hat \theta_{i}\) be the estimate of \(\theta\) obtained by <em>deleting observation \(i\)</em></li>
-<li>Let \(\bar \theta = \frac{1}{n}\sum_{i=1}^n \hat \theta_{i}\)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-4" style="background:;">
-  <hgroup>
-    <h2>Continued</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Then, the jackknife estimate of the bias is
-\[
-(n - 1) \left(\bar \theta - \hat \theta\right)
-\]
-(how far the average delete-one estimate is from the actual estimate)</li>
-<li>The jackknife estimate of the standard error is
-\[
-\left[\frac{n-1}{n}\sum_{i=1}^n (\hat \theta_i - \bar\theta )^2\right]^{1/2}
-\]
-(the deviance of the delete-one estimates from the average delete-one estimate)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-5" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <h3>We want to estimate the bias and standard error of the median</h3>
-
-<pre><code class="r">library(UsingR)
-data(father.son)
-x &lt;- father.son$sheight
-n &lt;- length(x)
-theta &lt;- median(x)
-jk &lt;- sapply(1 : n,
-             function(i) median(x[-i])
-             )
-thetaBar &lt;- mean(jk)
-biasEst &lt;- (n - 1) * (thetaBar - theta) 
-seEst &lt;- sqrt((n - 1) * mean((jk - thetaBar)^2))
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-6" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <pre><code class="r">c(biasEst, seEst)
-</code></pre>
-
-<pre><code>[1] 0.0000 0.1014
-</code></pre>
-
-<pre><code class="r">library(bootstrap)
-temp &lt;- jackknife(x, median)
-c(temp$jack.bias, temp$jack.se)
-</code></pre>
-
-<pre><code>[1] 0.0000 0.1014
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-7" style="background:;">
-  <hgroup>
-    <h2>Example</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Both methods (of course) yield an estimated bias of 0 and a se of 0.1014</li>
-<li>Odd little fact: the jackknife estimate of the bias for the median is always \(0\) when the number of observations is even</li>
-<li>It has been shown that the jackknife is a linear approximation to the bootstrap</li>
-<li>Generally do not use the jackknife for sample quantiles like the median; as it has been shown to have some poor properties</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-8" style="background:;">
-  <hgroup>
-    <h2>Pseudo observations</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Another interesting way to think about the jackknife uses pseudo observations</li>
-<li>Let
-\[
-  \mbox{Pseudo Obs} = n \hat \theta - (n - 1) \hat \theta_{i}
-\]</li>
-<li>Think of these as ``whatever observation \(i\) contributes to the estimate of \(\theta\)&#39;&#39;</li>
-<li>Note when \(\hat \theta\) is the sample mean, the pseudo observations are the data themselves</li>
-<li>Then the sample standard error of these observations is the previous jackknife estimated standard error.</li>
-<li>The mean of these observations is a bias-corrected estimate of \(\theta\)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-9" style="background:;">
-  <hgroup>
-    <h2>The bootstrap</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>The bootstrap is a tremendously useful tool for constructing confidence intervals and calculating standard errors for difficult statistics</li>
-<li>For example, how would one derive a confidence interval for the median?</li>
-<li>The bootstrap procedure follows from the so called bootstrap principle</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-10" style="background:;">
-  <hgroup>
-    <h2>The bootstrap principle</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Suppose that I have a statistic that estimates some population parameter, but I don&#39;t know its sampling distribution</li>
-<li>The bootstrap principle suggests using the distribution defined by the data to approximate its sampling distribution</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-11" style="background:;">
-  <hgroup>
-    <h2>The bootstrap in practice</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>In practice, the bootstrap principle is always carried out using simulation</li>
-<li>We will cover only a few aspects of bootstrap resampling</li>
-<li><p>The general procedure follows by first simulating complete data sets from the observed data with replacement</p>
-
-<ul>
-<li>This is approximately drawing from the sampling distribution of that statistic, at least as far as the data is able to approximate the true population distribution</li>
-</ul></li>
-<li><p>Calculate the statistic for each simulated data set</p></li>
-<li><p>Use the simulated statistics to either define a confidence interval or take the standard deviation to calculate a standard error</p></li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-12" style="background:;">
-  <hgroup>
-    <h2>Nonparametric bootstrap algorithm example</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li><p>Bootstrap procedure for calculating confidence interval for the median from a data set of \(n\) observations</p>
-
-<p>i. Sample \(n\) observations <strong>with replacement</strong> from the observed data resulting in one simulated complete data set</p>
-
-<p>ii. Take the median of the simulated data set</p>
-
-<p>iii. Repeat these two steps \(B\) times, resulting in \(B\) simulated medians</p>
-
-<p>iv. These medians are approximately drawn from the sampling distribution of the median of \(n\) observations; therefore we can</p>
-
-<ul>
-<li>Draw a histogram of them</li>
-<li>Calculate their standard deviation to estimate the standard error of the median</li>
-<li>Take the \(2.5^{th}\) and \(97.5^{th}\) percentiles as a confidence interval for the median</li>
-</ul></li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-13" style="background:;">
-  <hgroup>
-    <h2>Example code</h2>
-  </hgroup>
-  <article data-timings="">
-    <pre><code class="r">B &lt;- 1000
-resamples &lt;- matrix(sample(x,
-                           n * B,
-                           replace = TRUE),
-                    B, n)
-medians &lt;- apply(resamples, 1, median)
-sd(medians)
-</code></pre>
-
-<pre><code>[1] 0.08546
-</code></pre>
-
-<pre><code class="r">quantile(medians, c(.025, .975))
-</code></pre>
-
-<pre><code> 2.5% 97.5% 
-68.43 68.82 
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-14" style="background:;">
-  <hgroup>
-    <h2>Histogram of bootstrap resamples</h2>
-  </hgroup>
-  <article data-timings="">
-    <pre><code class="r">hist(medians)
-</code></pre>
-
-<div class="rimage center"><img src="fig/unnamed-chunk-4.png" title="plot of chunk unnamed-chunk-4" alt="plot of chunk unnamed-chunk-4" class="plot" /></div>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-15" style="background:;">
-  <hgroup>
-    <h2>Notes on the bootstrap</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>The bootstrap is non-parametric</li>
-<li>Better percentile bootstrap confidence intervals correct for bias</li>
-<li>There are lots of variations on bootstrap procedures; the book &quot;An Introduction to the Bootstrap&quot;&quot; by Efron and Tibshirani is a great place to start for both bootstrap and jackknife information</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-16" style="background:;">
-  <hgroup>
-    <h2>Group comparisons</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li>Consider comparing two independent groups.</li>
-<li>Example, comparing sprays B and C</li>
-</ul>
-
-<pre><code class="r">data(InsectSprays)
-boxplot(count ~ spray, data = InsectSprays)
-</code></pre>
-
-<div class="rimage center"><img src="fig/unnamed-chunk-5.png" title="plot of chunk unnamed-chunk-5" alt="plot of chunk unnamed-chunk-5" class="plot" /></div>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-17" style="background:;">
-  <hgroup>
-    <h2>Permutation tests</h2>
-  </hgroup>
-  <article data-timings="">
-    <ul>
-<li> Consider the null hypothesis that the distribution of the observations from each group is the same</li>
-<li> Then, the group labels are irrelevant</li>
-<li> We then discard the group levels and permute the combined data</li>
-<li> Split the permuted data into two groups with \(n_A\) and \(n_B\)
-observations (say by always treating the first \(n_A\) observations as
-the first group)</li>
-<li> Evaluate the probability of getting a statistic as large or
-large than the one observed</li>
-<li> An example statistic would be the difference in the averages between the two groups;
-one could also use a t-statistic </li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-18" style="background:;">
-  <hgroup>
-    <h2>Variations on permutation testing</h2>
-  </hgroup>
-  <article data-timings="">
-    <table><thead>
-<tr>
-<th>Data type</th>
-<th>Statistic</th>
-<th>Test name</th>
-</tr>
-</thead><tbody>
-<tr>
-<td>Ranks</td>
-<td>rank sum</td>
-<td>rank sum test</td>
-</tr>
-<tr>
-<td>Binary</td>
-<td>hypergeometric prob</td>
-<td>Fisher&#39;s exact test</td>
-</tr>
-<tr>
-<td>Raw data</td>
-<td></td>
-<td>ordinary permutation test</td>
-</tr>
-</tbody></table>
-
-<ul>
-<li>Also, so-called <em>randomization tests</em> are exactly permutation tests, with a different motivation.</li>
-<li>For matched data, one can randomize the signs
-
-<ul>
-<li>For ranks, this results in the signed rank test</li>
-</ul></li>
-<li>Permutation strategies work for regression as well
-
-<ul>
-<li>Permuting a regressor of interest</li>
-</ul></li>
-<li>Permutation tests work very well in multivariate settings</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-19" style="background:;">
-  <hgroup>
-    <h2>Permutation test for pesticide data</h2>
-  </hgroup>
-  <article data-timings="">
-    <pre><code class="r">subdata &lt;- InsectSprays[InsectSprays$spray %in% c(&quot;B&quot;, &quot;C&quot;),]
-y &lt;- subdata$count
-group &lt;- as.character(subdata$spray)
-testStat &lt;- function(w, g) mean(w[g == &quot;B&quot;]) - mean(w[g == &quot;C&quot;])
-observedStat &lt;- testStat(y, group)
-permutations &lt;- sapply(1 : 10000, function(i) testStat(y, sample(group)))
-observedStat
-</code></pre>
-
-<pre><code>[1] 13.25
-</code></pre>
-
-<pre><code class="r">mean(permutations &gt; observedStat)
-</code></pre>
-
-<pre><code>[1] 0
-</code></pre>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-<slide class="" id="slide-20" style="background:;">
-  <hgroup>
-    <h2>Histogram of permutations</h2>
-  </hgroup>
-  <article data-timings="">
-    <div class="rimage center"><img src="fig/unnamed-chunk-7.png" title="plot of chunk unnamed-chunk-7" alt="plot of chunk unnamed-chunk-7" class="plot" /></div>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-    <slide class="backdrop"></slide>
-  </slides>
-  <div class="pagination pagination-small" id='io2012-ptoc' style="display:none;">
-    <ul>
-      <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=1 title='The jackknife'>
-         1
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=2 title='The jackknife'>
-         2
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=3 title='The jackknife'>
-         3
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=4 title='Continued'>
-         4
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=5 title='Example'>
-         5
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=6 title='Example'>
-         6
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=7 title='Example'>
-         7
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=8 title='Pseudo observations'>
-         8
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=9 title='The bootstrap'>
-         9
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=10 title='The bootstrap principle'>
-         10
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=11 title='The bootstrap in practice'>
-         11
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=12 title='Nonparametric bootstrap algorithm example'>
-         12
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=13 title='Example code'>
-         13
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=14 title='Histogram of bootstrap resamples'>
-         14
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=15 title='Notes on the bootstrap'>
-         15
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=16 title='Group comparisons'>
-         16
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=17 title='Permutation tests'>
-         17
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=18 title='Variations on permutation testing'>
-         18
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=19 title='Permutation test for pesticide data'>
-         19
-      </a>
-    </li>
-    <li>
-      <a href="#" target="_self" rel='tooltip' 
-        data-slide=20 title='Histogram of permutations'>
-         20
-      </a>
-    </li>
-  </ul>
-  </div>  <!--[if IE]>
-    <script 
-      src="http://ajax.googleapis.com/ajax/libs/chrome-frame/1/CFInstall.min.js">  
-    </script>
-    <script>CFInstall.check({mode: 'overlay'});</script>
-  <![endif]-->
-</body>
-  <!-- Load Javascripts for Widgets -->
-  
-  <!-- MathJax: Fall back to local if CDN offline but local image fonts are not supported (saves >100MB) -->
-  <script type="text/x-mathjax-config">
-    MathJax.Hub.Config({
-      tex2jax: {
-        inlineMath: [['$','$'], ['\\(','\\)']],
-        processEscapes: true
-      }
-    });
-  </script>
-  <script type="text/javascript" src="http://cdn.mathjax.org/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
-  <!-- <script src="https://c328740.ssl.cf1.rackcdn.com/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
-  </script> -->
-  <script>window.MathJax || document.write('<script type="text/x-mathjax-config">MathJax.Hub.Config({"HTML-CSS":{imageFont:null}});<\/script><script src="../../librariesNew/widgets/mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"><\/script>')
-</script>
-<!-- LOAD HIGHLIGHTER JS FILES -->
-  <script src="../../librariesNew/highlighters/highlight.js/highlight.pack.js"></script>
-  <script>hljs.initHighlightingOnLoad();</script>
-  <!-- DONE LOADING HIGHLIGHTER JS FILES -->
-   
+<!DOCTYPE html>
+<html>
+<head>
+  <title>Resampled inference</title>
+  <meta charset="utf-8">
+  <meta name="description" content="Resampled inference">
+  <meta name="author" content="Brian Caffo, Jeff Leek, Roger Peng">
+  <meta name="generator" content="slidify" />
+  <meta name="apple-mobile-web-app-capable" content="yes">
+  <meta http-equiv="X-UA-Compatible" content="chrome=1">
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/default.css" media="all" >
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/phone.css" 
+    media="only screen and (max-device-width: 480px)" >
+  <link rel="stylesheet" href="../../librariesNew/frameworks/io2012/css/slidify.css" >
+  <link rel="stylesheet" href="../../librariesNew/highlighters/highlight.js/css/tomorrow.css" />
+  <base target="_blank"> <!-- This amazingness opens all links in a new tab. -->  
+  
+  <!-- Grab CDN jQuery, fall back to local if offline -->
+  <script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.7.min.js"></script>
+  <script>window.jQuery || document.write('<script src="../../librariesNew/widgets/quiz/js/jquery.js"><\/script>')</script> 
+  <script data-main="../../librariesNew/frameworks/io2012/js/slides" 
+    src="../../librariesNew/frameworks/io2012/js/require-1.0.8.min.js">
+  </script>
+  
+  
+
+</head>
+<body style="opacity: 0">
+  <slides class="layout-widescreen">
+    
+    <!-- LOGO SLIDE -->
+        <slide class="title-slide segue nobackground">
+  <aside class="gdbar">
+    <img src="../../assets/img/bloomberg_shield.png">
+  </aside>
+  <hgroup class="auto-fadein">
+    <h1>Resampled inference</h1>
+    <h2>Statistical Inference</h2>
+    <p>Brian Caffo, Jeff Leek, Roger Peng<br/>Johns Hopkins Bloomberg School of Public Health</p>
+  </hgroup>
+  <article></article>  
+</slide>
+    
+
+    <!-- SLIDES -->
+    <slide class="" id="slide-1" style="background:;">
+  <hgroup>
+    <h2>The jackknife</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>The jackknife is a tool for estimating standard errors  and the bias of estimators </li>
+<li>As its name suggests, the jackknife is a small, handy tool; the bootstrap, by contrast, is the moral equivalent of a giant workshop full of tools</li>
+<li>Both the jackknife and the bootstrap involve <em>resampling</em> data; that is, repeatedly creating new data sets from the original data</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-2" style="background:;">
+  <hgroup>
+    <h2>The jackknife</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>The jackknife deletes each observation in turn and calculates an estimate based on the remaining \(n-1\) observations</li>
+<li>It uses this collection of estimates to do things like estimate the bias and the standard error</li>
+<li>Note that jackknife estimates of the bias and standard error are not needed for statistics like the sample mean, which we know is an unbiased estimate of the population mean and whose standard error we already know</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-3" style="background:;">
+  <hgroup>
+    <h2>The jackknife</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>We&#39;ll consider the jackknife for univariate data</li>
+<li>Let \(X_1,\ldots,X_n\) be a collection of data used to estimate a parameter \(\theta\)</li>
+<li>Let \(\hat \theta\) be the estimate based on the full data set</li>
+<li>Let \(\hat \theta_{i}\) be the estimate of \(\theta\) obtained by <em>deleting observation \(i\)</em></li>
+<li>Let \(\bar \theta = \frac{1}{n}\sum_{i=1}^n \hat \theta_{i}\)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-4" style="background:;">
+  <hgroup>
+    <h2>Continued</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Then, the jackknife estimate of the bias is
+\[
+(n - 1) \left(\bar \theta - \hat \theta\right)
+\]
+(how far the average delete-one estimate is from the actual estimate)</li>
+<li>The jackknife estimate of the standard error is
+\[
+\left[\frac{n-1}{n}\sum_{i=1}^n (\hat \theta_i - \bar\theta )^2\right]^{1/2}
+\]
+(the deviation of the delete-one estimates from the average delete-one estimate)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-5" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
+    <h3>We want to estimate the bias and standard error of the median</h3>
+
+<pre><code class="r">library(UsingR)
+data(father.son)
+x &lt;- father.son$sheight
+n &lt;- length(x)
+theta &lt;- median(x)
+jk &lt;- sapply(1:n, function(i) median(x[-i]))
+thetaBar &lt;- mean(jk)
+biasEst &lt;- (n - 1) * (thetaBar - theta)
+seEst &lt;- sqrt((n - 1) * mean((jk - thetaBar)^2))
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-6" style="background:;">
+  <hgroup>
+    <h2>Example test</h2>
+  </hgroup>
+  <article data-timings="">
+    <pre><code class="r">c(biasEst, seEst)
+</code></pre>
+
+<pre><code>## [1] 0.0000 0.1014
+</code></pre>
+
+<pre><code class="r">library(bootstrap)
+temp &lt;- jackknife(x, median)
+c(temp$jack.bias, temp$jack.se)
+</code></pre>
+
+<pre><code>## [1] 0.0000 0.1014
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-7" style="background:;">
+  <hgroup>
+    <h2>Example</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Both methods (of course) yield an estimated bias of 0 and a standard error of 0.1014</li>
+<li>Odd little fact: the jackknife estimate of the bias for the median is always \(0\) when the number of observations is even</li>
+<li>It has been shown that the jackknife is a linear approximation to the bootstrap</li>
+<li>Generally, do not use the jackknife for sample quantiles like the median, as it has been shown to have some poor properties</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-8" style="background:;">
+  <hgroup>
+    <h2>Pseudo observations</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Another interesting way to think about the jackknife uses pseudo observations</li>
+<li>Let
+\[
+  \mbox{Pseudo Obs} = n \hat \theta - (n - 1) \hat \theta_{i}
+\]</li>
+<li>Think of these as &quot;whatever observation \(i\) contributes to the estimate of \(\theta\)&quot;</li>
+<li>Note that when \(\hat \theta\) is the sample mean, the pseudo observations are the data themselves</li>
+<li>The sample standard error of these pseudo observations (their standard deviation divided by \(\sqrt{n}\)) is the jackknife estimated standard error from before</li>
+<li>The mean of these observations is a bias-corrected estimate of \(\theta\)</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-9" style="background:;">
+  <hgroup>
+    <h2>The bootstrap</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>The bootstrap is a tremendously useful tool for constructing confidence intervals and calculating standard errors for difficult statistics</li>
+<li>For example, how would one derive a confidence interval for the median?</li>
+<li>The bootstrap procedure follows from the so-called bootstrap principle</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-10" style="background:;">
+  <hgroup>
+    <h2>The bootstrap principle</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Suppose that I have a statistic that estimates some population parameter, but I don&#39;t know its sampling distribution</li>
+<li>The bootstrap principle suggests using the distribution defined by the data to approximate its sampling distribution</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-11" style="background:;">
+  <hgroup>
+    <h2>The bootstrap in practice</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>In practice, the bootstrap principle is always carried out using simulation</li>
+<li>We will cover only a few aspects of bootstrap resampling</li>
+<li><p>The general procedure follows by first simulating complete data sets from the observed data with replacement</p>
+
+<ul>
+<li>This is approximately drawing from the sampling distribution of that statistic, at least as far as the data is able to approximate the true population distribution</li>
+</ul></li>
+<li><p>Calculate the statistic for each simulated data set</p></li>
+<li><p>Use the simulated statistics to either define a confidence interval or take the standard deviation to calculate a standard error</p></li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-12" style="background:;">
+  <hgroup>
+    <h2>Nonparametric bootstrap algorithm example</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li><p>Bootstrap procedure for calculating a confidence interval for the median from a data set of \(n\) observations</p>
+
+<p>i. Sample \(n\) observations <strong>with replacement</strong> from the observed data resulting in one simulated complete data set</p>
+
+<p>ii. Take the median of the simulated data set</p>
+
+<p>iii. Repeat these two steps \(B\) times, resulting in \(B\) simulated medians</p>
+
+<p>iv. These medians are approximately drawn from the sampling distribution of the median of \(n\) observations; therefore we can</p>
+
+<ul>
+<li>Draw a histogram of them</li>
+<li>Calculate their standard deviation to estimate the standard error of the median</li>
+<li>Take the \(2.5^{th}\) and \(97.5^{th}\) percentiles as a 95% confidence interval for the median</li>
+</ul></li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-13" style="background:;">
+  <hgroup>
+    <h2>Example code</h2>
+  </hgroup>
+  <article data-timings="">
+    <pre><code class="r">B &lt;- 1000
+resamples &lt;- matrix(sample(x, n * B, replace = TRUE), B, n)
+medians &lt;- apply(resamples, 1, median)
+sd(medians)
+</code></pre>
+
+<pre><code>## [1] 0.08834
+</code></pre>
+
+<pre><code class="r">quantile(medians, c(0.025, 0.975))
+</code></pre>
+
+<pre><code>##  2.5% 97.5% 
+## 68.41 68.82
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-14" style="background:;">
+  <hgroup>
+    <h2>Histogram of bootstrap resamples</h2>
+  </hgroup>
+  <article data-timings="">
+    <pre><code class="r">hist(medians)
+</code></pre>
+
+<p><img src="assets/fig/unnamed-chunk-4.png" alt="plot of chunk unnamed-chunk-4"> </p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-15" style="background:;">
+  <hgroup>
+    <h2>Notes on the bootstrap</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>The bootstrap is non-parametric</li>
+<li>Better percentile bootstrap confidence intervals correct for bias</li>
+<li>There are lots of variations on bootstrap procedures; the book &quot;An Introduction to the Bootstrap&quot; by Efron and Tibshirani is a great place to start for both bootstrap and jackknife information</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-16" style="background:;">
+  <hgroup>
+    <h2>Group comparisons</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li>Consider comparing two independent groups.</li>
+<li>Example: comparing sprays B and C</li>
+</ul>
+
+<pre><code class="r">data(InsectSprays)
+boxplot(count ~ spray, data = InsectSprays)
+</code></pre>
+
+<p><img src="assets/fig/unnamed-chunk-5.png" alt="plot of chunk unnamed-chunk-5"> </p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-17" style="background:;">
+  <hgroup>
+    <h2>Permutation tests</h2>
+  </hgroup>
+  <article data-timings="">
+    <ul>
+<li> Consider the null hypothesis that the distribution of the observations from each group is the same</li>
+<li> Then, the group labels are irrelevant</li>
+<li> We then discard the group labels and permute the combined data</li>
+<li> Split the permuted data into two groups with \(n_A\) and \(n_B\)
+observations (say by always treating the first \(n_A\) observations as
+the first group)</li>
+<li> Evaluate the probability of getting a statistic as large or
+larger than the one observed</li>
+<li> An example statistic would be the difference in the averages between the two groups;
+one could also use a t-statistic </li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-18" style="background:;">
+  <hgroup>
+    <h2>Variations on permutation testing</h2>
+  </hgroup>
+  <article data-timings="">
+    <table><thead>
+<tr>
+<th>Data type</th>
+<th>Statistic</th>
+<th>Test name</th>
+</tr>
+</thead><tbody>
+<tr>
+<td>Ranks</td>
+<td>rank sum</td>
+<td>rank sum test</td>
+</tr>
+<tr>
+<td>Binary</td>
+<td>hypergeometric prob</td>
+<td>Fisher&#39;s exact test</td>
+</tr>
+<tr>
+<td>Raw data</td>
+<td></td>
+<td>ordinary permutation test</td>
+</tr>
+</tbody></table>
+
+<ul>
+<li>Also, so-called <em>randomization tests</em> are exactly permutation tests, with a different motivation.</li>
+<li>For matched data, one can randomize the signs
+
+<ul>
+<li>For ranks, this results in the signed rank test</li>
+</ul></li>
+<li>Permutation strategies work for regression as well
+
+<ul>
+<li>Permuting a regressor of interest</li>
+</ul></li>
+<li>Permutation tests work very well in multivariate settings</li>
+</ul>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-19" style="background:;">
+  <hgroup>
+    <h2>Permutation test for pesticide data</h2>
+  </hgroup>
+  <article data-timings="">
+    <pre><code class="r">subdata &lt;- InsectSprays[InsectSprays$spray %in% c(&quot;B&quot;, &quot;C&quot;), ]
+y &lt;- subdata$count
+group &lt;- as.character(subdata$spray)
+testStat &lt;- function(w, g) mean(w[g == &quot;B&quot;]) - mean(w[g == &quot;C&quot;])
+observedStat &lt;- testStat(y, group)
+permutations &lt;- sapply(1:10000, function(i) testStat(y, sample(group)))
+observedStat
+</code></pre>
+
+<pre><code>## [1] 13.25
+</code></pre>
+
+<pre><code class="r">mean(permutations &gt; observedStat)
+</code></pre>
+
+<pre><code>## [1] 0
+</code></pre>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-20" style="background:;">
+  <hgroup>
+    <h2>Histogram of permutations</h2>
+  </hgroup>
+  <article data-timings="">
+    <p><img src="assets/fig/unnamed-chunk-7.png" alt="plot of chunk unnamed-chunk-7"> </p>
+
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+    <slide class="backdrop"></slide>
+  </slides>
+  <div class="pagination pagination-small" id='io2012-ptoc' style="display:none;">
+    <ul>
+      <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=1 title='The jackknife'>
+         1
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=2 title='The jackknife'>
+         2
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=3 title='The jackknife'>
+         3
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=4 title='Continued'>
+         4
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=5 title='Example'>
+         5
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=6 title='Example test'>
+         6
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=7 title='Example'>
+         7
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=8 title='Pseudo observations'>
+         8
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=9 title='The bootstrap'>
+         9
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=10 title='The bootstrap principle'>
+         10
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=11 title='The bootstrap in practice'>
+         11
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=12 title='Nonparametric bootstrap algorithm example'>
+         12
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=13 title='Example code'>
+         13
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=14 title='Histogram of bootstrap resamples'>
+         14
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=15 title='Notes on the bootstrap'>
+         15
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=16 title='Group comparisons'>
+         16
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=17 title='Permutation tests'>
+         17
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=18 title='Variations on permutation testing'>
+         18
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=19 title='Permutation test for pesticide data'>
+         19
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=20 title='Histogram of permutations'>
+         20
+      </a>
+    </li>
+  </ul>
+  </div>  <!--[if IE]>
+    <script 
+      src="http://ajax.googleapis.com/ajax/libs/chrome-frame/1/CFInstall.min.js">  
+    </script>
+    <script>CFInstall.check({mode: 'overlay'});</script>
+  <![endif]-->
+</body>
+  <!-- Load Javascripts for Widgets -->
+  
+  <!-- MathJax: Fall back to local if CDN offline but local image fonts are not supported (saves >100MB) -->
+  <script type="text/x-mathjax-config">
+    MathJax.Hub.Config({
+      tex2jax: {
+        inlineMath: [['$','$'], ['\\(','\\)']],
+        processEscapes: true
+      }
+    });
+  </script>
+  <script type="text/javascript" src="http://cdn.mathjax.org/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
+  <!-- <script src="https://c328740.ssl.cf1.rackcdn.com/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
+  </script> -->
+  <script>window.MathJax || document.write('<script type="text/x-mathjax-config">MathJax.Hub.Config({"HTML-CSS":{imageFont:null}});<\/script><script src="../../librariesNew/widgets/mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"><\/script>')
+</script>
+<!-- LOAD HIGHLIGHTER JS FILES -->
+  <script src="../../librariesNew/highlighters/highlight.js/highlight.pack.js"></script>
+  <script>hljs.initHighlightingOnLoad();</script>
+  <!-- DONE LOADING HIGHLIGHTER JS FILES -->
+   
   </html>
\ No newline at end of file
diff --git a/06_StatisticalInference/03_06_resampledInference/index.md b/06_StatisticalInference/03_06_resampledInference/index.md
index d1544df72..448841d3a 100644
--- a/06_StatisticalInference/03_06_resampledInference/index.md
+++ b/06_StatisticalInference/03_06_resampledInference/index.md
@@ -1,294 +1,287 @@
----
-title       : Resampled inference
-subtitle    : Statistical Inference
-author      : Brian Caffo, Jeff Leek, Roger Peng
-job         : Johns Hopkins Bloomberg School of Public Health
-logo        : bloomberg_shield.png
-framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
-highlighter : highlight.js  # {highlight.js, prettify, highlight}
-hitheme     : tomorrow      # 
-url:
-  lib: ../../librariesNew
-  assets: ../../assets
-widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
-mode        : selfcontained # {standalone, draft}
-
----
-
-
-
-## The jackknife
-
-- The jackknife is a tool for estimating standard errors  and the bias of estimators 
-- As its name suggests, the jackknife is a small, handy tool; in contrast to the bootstrap, which is then the moral equivalent of a giant workshop full of tools
-- Both the jackknife and the bootstrap involve *resampling* data; that is, repeatedly creating new data sets from the original data
-
----
-
-## The jackknife
-
-- The jackknife deletes each observation and calculates an estimate based on the remaining $n-1$ of them
-- It uses this collection of estimates to do things like estimate the bias and the standard error
-- Note that estimating the bias and having a standard error are not needed for things like sample means, which we know are unbiased estimates of population means and what their standard errors are
-
----
-
-## The jackknife
-
-- We'll consider the jackknife for univariate data
-- Let $X_1,\ldots,X_n$ be a collection of data used to estimate a parameter $\theta$
-- Let $\hat \theta$ be the estimate based on the full data set
-- Let $\hat \theta_{i}$ be the estimate of $\theta$ obtained by *deleting observation $i$*
-- Let $\bar \theta = \frac{1}{n}\sum_{i=1}^n \hat \theta_{i}$
-
----
-
-## Continued
-
-- Then, the jackknife estimate of the bias is
-   $$
-   (n - 1) \left(\bar \theta - \hat \theta\right)
-   $$
-   (how far the average delete-one estimate is from the actual estimate)
-- The jackknife estimate of the standard error is
-   $$
-   \left[\frac{n-1}{n}\sum_{i=1}^n (\hat \theta_i - \bar\theta )^2\right]^{1/2}
-   $$
-(the deviance of the delete-one estimates from the average delete-one estimate)
-
----
-
-## Example
-### We want to estimate the bias and standard error of the median
-
-
-```r
-library(UsingR)
-data(father.son)
-x <- father.son$sheight
-n <- length(x)
-theta <- median(x)
-jk <- sapply(1 : n,
-             function(i) median(x[-i])
-             )
-thetaBar <- mean(jk)
-biasEst <- (n - 1) * (thetaBar - theta) 
-seEst <- sqrt((n - 1) * mean((jk - thetaBar)^2))
-```
-
-
----
-
-## Example
-
-
-```r
-c(biasEst, seEst)
-```
-
-```
-[1] 0.0000 0.1014
-```
-
-```r
-library(bootstrap)
-temp <- jackknife(x, median)
-c(temp$jack.bias, temp$jack.se)
-```
-
-```
-[1] 0.0000 0.1014
-```
-
-
----
-
-## Example
-
-- Both methods (of course) yield an estimated bias of 0 and a se of 0.1014
-- Odd little fact: the jackknife estimate of the bias for the median is always $0$ when the number of observations is even
-- It has been shown that the jackknife is a linear approximation to the bootstrap
-- Generally do not use the jackknife for sample quantiles like the median; as it has been shown to have some poor properties
-
----
-
-## Pseudo observations
-
-- Another interesting way to think about the jackknife uses pseudo observations
-- Let
-$$
-      \mbox{Pseudo Obs} = n \hat \theta - (n - 1) \hat \theta_{i}
-$$
-- Think of these as ``whatever observation $i$ contributes to the estimate of $\theta$''
-- Note when $\hat \theta$ is the sample mean, the pseudo observations are the data themselves
-- Then the sample standard error of these observations is the previous jackknife estimated standard error.
-- The mean of these observations is a bias-corrected estimate of $\theta$
-
----
-
-## The bootstrap
-
-- The bootstrap is a tremendously useful tool for constructing confidence intervals and calculating standard errors for difficult statistics
-- For example, how would one derive a confidence interval for the median?
-- The bootstrap procedure follows from the so called bootstrap principle
-
----
-
-## The bootstrap principle
-
-- Suppose that I have a statistic that estimates some population parameter, but I don't know its sampling distribution
-- The bootstrap principle suggests using the distribution defined by the data to approximate its sampling distribution
-
----
-
-## The bootstrap in practice
-
-- In practice, the bootstrap principle is always carried out using simulation
-- We will cover only a few aspects of bootstrap resampling
-- The general procedure follows by first simulating complete data sets from the observed data with replacement
-
-  - This is approximately drawing from the sampling distribution of that statistic, at least as far as the data is able to approximate the true population distribution
-
-- Calculate the statistic for each simulated data set
-- Use the simulated statistics to either define a confidence interval or take the standard deviation to calculate a standard error
-
----
-## Nonparametric bootstrap algorithm example
-
-- Bootstrap procedure for calculating confidence interval for the median from a data set of $n$ observations
-
-  i. Sample $n$ observations **with replacement** from the observed data resulting in one simulated complete data set
-  
-  ii. Take the median of the simulated data set
-  
-  iii. Repeat these two steps $B$ times, resulting in $B$ simulated medians
-  
-  iv. These medians are approximately drawn from the sampling distribution of the median of $n$ observations; therefore we can
-  
-    - Draw a histogram of them
-    - Calculate their standard deviation to estimate the standard error of the median
-    - Take the $2.5^{th}$ and $97.5^{th}$ percentiles as a confidence interval for the median
-
----
-
-## Example code
-
-
-```r
-B <- 1000
-resamples <- matrix(sample(x,
-                           n * B,
-                           replace = TRUE),
-                    B, n)
-medians <- apply(resamples, 1, median)
-sd(medians)
-```
-
-```
-[1] 0.08546
-```
-
-```r
-quantile(medians, c(.025, .975))
-```
-
-```
- 2.5% 97.5% 
-68.43 68.82 
-```
-
-
----
-## Histogram of bootstrap resamples
-
-
-```r
-hist(medians)
-```
-
-<div class="rimage center"><img src="fig/unnamed-chunk-4.png" title="plot of chunk unnamed-chunk-4" alt="plot of chunk unnamed-chunk-4" class="plot" /></div>
-
-
----
-
-## Notes on the bootstrap
-
-- The bootstrap is non-parametric
-- Better percentile bootstrap confidence intervals correct for bias
-- There are lots of variations on bootstrap procedures; the book "An Introduction to the Bootstrap"" by Efron and Tibshirani is a great place to start for both bootstrap and jackknife information
-
-
----
-## Group comparisons
-- Consider comparing two independent groups.
-- Example, comparing sprays B and C
-
-
-```r
-data(InsectSprays)
-boxplot(count ~ spray, data = InsectSprays)
-```
-
-<div class="rimage center"><img src="fig/unnamed-chunk-5.png" title="plot of chunk unnamed-chunk-5" alt="plot of chunk unnamed-chunk-5" class="plot" /></div>
-
-
----
-## Permutation tests
--  Consider the null hypothesis that the distribution of the observations from each group is the same
--  Then, the group labels are irrelevant
--  We then discard the group levels and permute the combined data
--  Split the permuted data into two groups with $n_A$ and $n_B$
-  observations (say by always treating the first $n_A$ observations as
-  the first group)
--  Evaluate the probability of getting a statistic as large or
-  large than the one observed
--  An example statistic would be the difference in the averages between the two groups;
-  one could also use a t-statistic 
-
----
-## Variations on permutation testing
-Data type | Statistic | Test name 
----|---|---|
-Ranks | rank sum | rank sum test
-Binary | hypergeometric prob | Fisher's exact test
-Raw data | | ordinary permutation test
-
-- Also, so-called *randomization tests* are exactly permutation tests, with a different motivation.
-- For matched data, one can randomize the signs
-  - For ranks, this results in the signed rank test
-- Permutation strategies work for regression as well
-  - Permuting a regressor of interest
-- Permutation tests work very well in multivariate settings
-
----
-## Permutation test for pesticide data
-
-```r
-subdata <- InsectSprays[InsectSprays$spray %in% c("B", "C"),]
-y <- subdata$count
-group <- as.character(subdata$spray)
-testStat <- function(w, g) mean(w[g == "B"]) - mean(w[g == "C"])
-observedStat <- testStat(y, group)
-permutations <- sapply(1 : 10000, function(i) testStat(y, sample(group)))
-observedStat
-```
-
-```
-[1] 13.25
-```
-
-```r
-mean(permutations > observedStat)
-```
-
-```
-[1] 0
-```
-
-
----
-## Histogram of permutations
-<div class="rimage center"><img src="fig/unnamed-chunk-7.png" title="plot of chunk unnamed-chunk-7" alt="plot of chunk unnamed-chunk-7" class="plot" /></div>
-
-
-
+---
+title       : Resampled inference
+subtitle    : Statistical Inference
+author      : Brian Caffo, Jeff Leek, Roger Peng
+job         : Johns Hopkins Bloomberg School of Public Health
+logo        : bloomberg_shield.png
+framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
+highlighter : highlight.js  # {highlight.js, prettify, highlight}
+hitheme     : tomorrow      # 
+url:
+  lib: ../../librariesNew
+  assets: ../../assets
+widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
+mode        : selfcontained # {standalone, draft}
+
+---
+
+## The jackknife
+
+- The jackknife is a tool for estimating the standard error and the bias of an estimator
+- As its name suggests, the jackknife is a small, handy tool; the bootstrap, by contrast, is the moral equivalent of a giant workshop full of tools
+- Both the jackknife and the bootstrap involve *resampling* data; that is, repeatedly creating new data sets from the original data
+
+---
+
+## The jackknife
+
+- The jackknife deletes each observation in turn and calculates an estimate based on the remaining $n-1$ observations
+- It uses this collection of estimates to do things like estimate the bias and the standard error
+- Note that jackknife estimates of the bias and standard error are not needed for statistics like the sample mean, which we know is an unbiased estimate of the population mean and whose standard error we already know
+
+---
+
+## The jackknife
+
+- We'll consider the jackknife for univariate data
+- Let $X_1,\ldots,X_n$ be a collection of data used to estimate a parameter $\theta$
+- Let $\hat \theta$ be the estimate based on the full data set
+- Let $\hat \theta_{i}$ be the estimate of $\theta$ obtained by *deleting observation $i$*
+- Let $\bar \theta = \frac{1}{n}\sum_{i=1}^n \hat \theta_{i}$
+
+---
+
+## Continued
+
+- Then, the jackknife estimate of the bias is
+   $$
+   (n - 1) \left(\bar \theta - \hat \theta\right)
+   $$
+   (how far the average delete-one estimate is from the actual estimate)
+- The jackknife estimate of the standard error is
+   $$
+   \left[\frac{n-1}{n}\sum_{i=1}^n (\hat \theta_i - \bar\theta )^2\right]^{1/2}
+   $$
+(the deviation of the delete-one estimates from the average delete-one estimate)
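+
+As a quick sanity check of the two formulas above, the sketch below applies them to the sample mean, where the answers are already known: the bias estimate should be 0 and the standard error should equal `sd(x) / sqrt(n)`. The simulated `x` and the variable names are assumptions made for illustration only.
+
+```r
+# Illustrative simulated data (an assumption, not the course data set)
+set.seed(42)
+x <- rnorm(30, mean = 5, sd = 2)
+n <- length(x)
+
+# Full-data estimate and the n delete-one estimates
+thetaHat <- mean(x)
+jk <- sapply(1:n, function(i) mean(x[-i]))
+thetaBar <- mean(jk)
+
+# Jackknife bias and standard error, following the formulas above
+biasEst <- (n - 1) * (thetaBar - thetaHat)
+seEst <- sqrt((n - 1) / n * sum((jk - thetaBar)^2))
+
+# Compare with the textbook standard error of the mean
+c(biasEst = biasEst, seEst = seEst, seMean = sd(x) / sqrt(n))
+```
+
+For the mean the agreement is exact up to floating point error, so the jackknife adds nothing there; it is useful for statistics that lack such formulas.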
+
+---
+
+## Example
+### We want to estimate the bias and standard error of the median
+
+
+```r
+library(UsingR)
+data(father.son)
+x <- father.son$sheight
+n <- length(x)
+theta <- median(x)
+jk <- sapply(1:n, function(i) median(x[-i]))
+thetaBar <- mean(jk)
+biasEst <- (n - 1) * (thetaBar - theta)
+seEst <- sqrt((n - 1) * mean((jk - thetaBar)^2))
+```
+
+
+---
+
+## Example test
+
+
+```r
+c(biasEst, seEst)
+```
+
+```
+## [1] 0.0000 0.1014
+```
+
+```r
+library(bootstrap)
+temp <- jackknife(x, median)
+c(temp$jack.bias, temp$jack.se)
+```
+
+```
+## [1] 0.0000 0.1014
+```
+
+
+---
+
+## Example
+
+- Both methods (of course) yield an estimated bias of 0 and a standard error of 0.1014
+- Odd little fact: the jackknife estimate of the bias for the median is always $0$ when the number of observations is even
+- It has been shown that the jackknife is a linear approximation to the bootstrap
+- Generally, do not use the jackknife for sample quantiles like the median, as it has been shown to have some poor properties
+
+---
+
+## Pseudo observations
+
+- Another interesting way to think about the jackknife uses pseudo observations
+- Let
+$$
+      \mbox{Pseudo Obs} = n \hat \theta - (n - 1) \hat \theta_{i}
+$$
+- Think of these as "whatever observation $i$ contributes to the estimate of $\theta$"
+- Note that when $\hat \theta$ is the sample mean, the pseudo observations are the data themselves
+- The sample standard error of these pseudo observations (their standard deviation divided by $\sqrt{n}$) is the jackknife estimated standard error from before
+- The mean of these observations is a bias-corrected estimate of $\theta$
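+
+The pseudo observation view can be checked numerically. The sketch below is a hedged illustration on simulated data: it uses the plug-in variance (divisor $n$) as the estimator, a choice made here because its bias is known; `plugInVar` and the other names are hypothetical and not part of the slides.
+
+```r
+# Illustrative simulated data (an assumption, not the course data set)
+set.seed(1)
+x <- rnorm(25, mean = 10, sd = 2)
+n <- length(x)
+
+# Plug-in variance: divides by n, so it is biased downward
+plugInVar <- function(v) mean((v - mean(v))^2)
+
+thetaHat <- plugInVar(x)
+jk <- sapply(1:n, function(i) plugInVar(x[-i]))
+
+# Pseudo observations: n * thetaHat - (n - 1) * thetaHat_i
+pseudo <- n * thetaHat - (n - 1) * jk
+
+# Their mean is the bias-corrected estimate; for this estimator
+# it matches the usual unbiased sample variance var(x)
+biasCorrected <- mean(pseudo)
+
+# Their standard error matches the jackknife standard error formula
+sePseudo <- sd(pseudo) / sqrt(n)
+seJack <- sqrt((n - 1) * mean((jk - mean(jk))^2))
+
+c(biasCorrected = biasCorrected, unbiasedVar = var(x))
+c(sePseudo = sePseudo, seJack = seJack)
+
+# For the sample mean, the pseudo observations are the data themselves
+pseudoMean <- n * mean(x) - (n - 1) * sapply(1:n, function(i) mean(x[-i]))
+```
+
+Both pairs agree up to floating point error, and `pseudoMean` reproduces `x`.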
+
+---
+
+## The bootstrap
+
+- The bootstrap is a tremendously useful tool for constructing confidence intervals and calculating standard errors for difficult statistics
+- For example, how would one derive a confidence interval for the median?
+- The bootstrap procedure follows from the so-called bootstrap principle
+
+---
+
+## The bootstrap principle
+
+- Suppose that I have a statistic that estimates some population parameter, but I don't know its sampling distribution
+- The bootstrap principle suggests using the distribution defined by the data to approximate its sampling distribution
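+
+One way to build intuition for the principle is to try it on a statistic whose sampling distribution we do know. The sketch below is only an illustration on simulated data (the exponential sample, `B`, and the variable names are assumptions): it resamples from the observed data and compares the bootstrap standard error of the mean with the usual formula `sd(x) / sqrt(n)`.
+
+```r
+# Illustrative skewed sample (an assumption, not the course data set)
+set.seed(123)
+x <- rexp(100, rate = 1)
+n <- length(x)
+
+# Resample from the distribution defined by the data:
+# n draws with replacement, repeated B times
+B <- 2000
+bootMeans <- replicate(B, mean(sample(x, n, replace = TRUE)))
+
+# The spread of the resampled means approximates the standard error
+c(bootSE = sd(bootMeans), analyticSE = sd(x) / sqrt(n))
+```
+
+The two numbers should be close but not identical, since the bootstrap value carries simulation noise.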
+
+---
+
+## The bootstrap in practice
+
+- In practice, the bootstrap principle is always carried out using simulation
+- We will cover only a few aspects of bootstrap resampling
+- The general procedure follows by first simulating complete data sets from the observed data with replacement
+
+  - This is approximately drawing from the sampling distribution of that statistic, at least as far as the data is able to approximate the true population distribution
+
+- Calculate the statistic for each simulated data set
+- Use the simulated statistics to either define a confidence interval or take the standard deviation to calculate a standard error
+
+---
+## Nonparametric bootstrap algorithm example
+
+- Bootstrap procedure for calculating a confidence interval for the median from a data set of $n$ observations
+
+  i. Sample $n$ observations **with replacement** from the observed data resulting in one simulated complete data set
+  
+  ii. Take the median of the simulated data set
+  
+  iii. Repeat these two steps $B$ times, resulting in $B$ simulated medians
+  
+  iv. These medians are approximately drawn from the sampling distribution of the median of $n$ observations; therefore we can
+  
+    - Draw a histogram of them
+    - Calculate their standard deviation to estimate the standard error of the median
+    - Take the $2.5^{th}$ and $97.5^{th}$ percentiles as a confidence interval for the median
+
+---
+
+## Example code
+
+
+```r
+B <- 1000
+resamples <- matrix(sample(x, n * B, replace = TRUE), B, n)
+medians <- apply(resamples, 1, median)
+sd(medians)
+```
+
+```
+## [1] 0.08834
+```
+
+```r
+quantile(medians, c(0.025, 0.975))
+```
+
+```
+##  2.5% 97.5% 
+## 68.41 68.82
+```
+
+
+---
+## Histogram of bootstrap resamples
+
+
+```r
+hist(medians)
+```
+
+![plot of chunk unnamed-chunk-4](assets/fig/unnamed-chunk-4.png) 
+
+
+---
+
+## Notes on the bootstrap
+
+- The bootstrap is non-parametric
+- Better percentile bootstrap confidence intervals correct for bias
+- There are lots of variations on bootstrap procedures; the book "An Introduction to the Bootstrap" by Efron and Tibshirani is a great place to start for both bootstrap and jackknife information
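+
+As a hedged sketch of a bias-corrected interval: the `boot` package (a recommended package shipped with R) offers the BCa interval through `boot.ci`. The data `xSim` are simulated purely for illustration and are not the data used in these slides; the names `medianStat`, `bootOut` and `bootCI` are introduced here.
+
+```r
+library(boot)
+set.seed(1)
+xSim <- rnorm(200, mean = 68, sd = 3)   # hypothetical data
+## boot() expects a statistic taking the data and a vector of resampled indices
+medianStat <- function(d, idx) median(d[idx])
+bootOut <- boot(xSim, statistic = medianStat, R = 2000)
+## percentile and bias-corrected and accelerated (BCa) intervals
+bootCI <- boot.ci(bootOut, conf = 0.95, type = c("perc", "bca"))
+bootCI
+```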
+
+
+---
+## Group comparisons
+- Consider comparing two independent groups.
+- Example, comparing sprays B and C
+
+
+```r
+data(InsectSprays)
+boxplot(count ~ spray, data = InsectSprays)
+```
+
+![plot of chunk unnamed-chunk-5](assets/fig/unnamed-chunk-5.png) 
+
+
+---
+## Permutation tests
+-  Consider the null hypothesis that the distribution of the observations from each group is the same
+-  Then, the group labels are irrelevant
+-  We then discard the group labels and permute the combined data
+-  Split the permuted data into two groups with $n_A$ and $n_B$
+  observations (say by always treating the first $n_A$ observations as
+  the first group)
+-  Evaluate the probability of getting a statistic as large or
+  larger than the one observed
+-  An example statistic would be the difference in the averages between the two groups;
+  one could also use a t-statistic 
+
+---
+## Variations on permutation testing
+Data type | Statistic | Test name 
+---|---|---|
+Ranks | rank sum | rank sum test
+Binary | hypergeometric prob | Fisher's exact test
+Raw data | | ordinary permutation test
+
+- Also, so-called *randomization tests* are exactly permutation tests, with a different motivation.
+- For matched data, one can randomize the signs
+  - For ranks, this results in the signed rank test
+- Permutation strategies work for regression as well
+  - Permuting a regressor of interest
+- Permutation tests work very well in multivariate settings
+
+---
+## Permutation test for pesticide data
+
+```r
+subdata <- InsectSprays[InsectSprays$spray %in% c("B", "C"), ]
+y <- subdata$count
+group <- as.character(subdata$spray)
+testStat <- function(w, g) mean(w[g == "B"]) - mean(w[g == "C"])
+observedStat <- testStat(y, group)
+permutations <- sapply(1:10000, function(i) testStat(y, sample(group)))
+observedStat
+```
+
+```
+## [1] 13.25
+```
+
+```r
+mean(permutations > observedStat)
+```
+
+```
+## [1] 0
+```
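+
+A Monte Carlo p-value of exactly 0 reflects the finite number of permutations rather than an impossible event. One common adjustment, sketched here under the same setup, counts the observed statistic as one of the permutations; `nPerm` and `pAdjusted` are names introduced for this illustration.
+
+```r
+data(InsectSprays)
+set.seed(1)
+subdata <- InsectSprays[InsectSprays$spray %in% c("B", "C"), ]
+y <- subdata$count
+group <- as.character(subdata$spray)
+testStat <- function(w, g) mean(w[g == "B"]) - mean(w[g == "C"])
+observedStat <- testStat(y, group)
+nPerm <- 10000
+permutations <- sapply(1:nPerm, function(i) testStat(y, sample(group)))
+## add one to numerator and denominator so the estimate is never exactly 0
+pAdjusted <- (sum(permutations >= observedStat) + 1) / (nPerm + 1)
+pAdjusted
+```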
+
+
+---
+## Histogram of permutations
+![plot of chunk unnamed-chunk-7](assets/fig/unnamed-chunk-7.png) 
+
+
+
diff --git a/06_StatisticalInference/03_06_resampledInference/index.pdf b/06_StatisticalInference/03_06_resampledInference/index.pdf
index df08fe3df..ce8822a3c 100644
Binary files a/06_StatisticalInference/03_06_resampledInference/index.pdf and b/06_StatisticalInference/03_06_resampledInference/index.pdf differ
diff --git a/06_StatisticalInference/cp.R b/06_StatisticalInference/cp.R
new file mode 100644
index 000000000..bea948ee7
--- /dev/null
+++ b/06_StatisticalInference/cp.R
@@ -0,0 +1,18 @@
+## A program for copying the index.pdf files and naming them
+## appropriately in the lectures directory
+## Brian Caffo
+## 
+## Has to be run within the directory and won't overwrite
+## unless you change this to TRUE
+overwrite = FALSE
+
+## Get the directory names (they all start with 0)
+dirNames <- dir(pattern = "^0")
+
+## Loop over them and copy the files
+sapply(dirNames, function(x) 
+  file.copy(from = paste(x, "/index.pdf", sep = ""),
+            to = paste("lectures/", x, ".pdf", sep = ""),
+            overwrite = overwrite
+              )
+  )
diff --git a/06_StatisticalInference/homework/hw1.Rmd b/06_StatisticalInference/homework/hw1.Rmd
index 7f31c63a0..f5476f5c7 100644
--- a/06_StatisticalInference/homework/hw1.Rmd
+++ b/06_StatisticalInference/homework/hw1.Rmd
@@ -40,22 +40,22 @@ Creating Data Products
 
 --- &radio
 
-Consider influenza epidemics for two parent heterosexual families. Suppose that the probability is 15% that at least one of the parents has contracted the disease. The probability that the father has contracted influenza is 6% while that the mother contracted the disease is 5%. What is the probability that both contracted influenza expressed as a whole number percentage?
+Consider influenza epidemics for two parent heterosexual families. Suppose that the probability is 15% that at least one of the parents has contracted the disease. The probability that the father has contracted influenza is 10% while that the mother contracted the disease is 9%. What is the probability that both contracted influenza expressed as a whole number percentage?
 
 1. 15%
-2. 6%
-3. 5%
-4. _2%_
+2. 10%
+3. 9%
+4. _4%_
 
 *** .hint
-$A = Father$, $P(A) = .06$, $B = Mother$, $P(B) = .05$ 
+$A = Father$, $P(A) = .10$, $B = Mother$, $P(B) = .09$ 
 $P(A\cup B) = .15$, 
 
 *** .explanation
-$P(A\cup B) = P(A) + P(B) - 2 P(AB)$ thus
-$$.15 = .06 + .05 - 2 P(AB)$$
+$P(A\cup B) = P(A) + P(B) - P(AB)$ thus
+$$.15 = .10 + .09 - P(AB)$$
 ```{r}
-(0.15 - .06 - .05) / 2
+.10 + .09 - .15
 ```
 
 ---  &radio
diff --git a/06_StatisticalInference/homework/hw2.Rmd b/06_StatisticalInference/homework/hw2.Rmd
index 3a568425c..b38eab299 100644
--- a/06_StatisticalInference/homework/hw2.Rmd
+++ b/06_StatisticalInference/homework/hw2.Rmd
@@ -157,10 +157,10 @@ Let $p=.5$ and $X$ be binomial
 
 *** .explanation
 
-<span class="answer">`r round(pbinom(4, prob = .5, size = 6, lower.tail = TRUE) * 100, 1)`</span>
+<span class="answer">`r round(pbinom(4, prob = .5, size = 6, lower.tail = FALSE) * 100, 1)`</span>
 
 ```{r}
-round(pbinom(4, prob = .5, size = 6, lower.tail = TRUE) * 100, 1)
+round(pbinom(4, prob = .5, size = 6, lower.tail = FALSE) * 100, 1)
 ```
 
 --- &multitext
@@ -210,9 +210,9 @@ If you roll ten standard dice, take their average, then repeat this process over
 $$Var(\bar X) = \sigma^2 /n$$
 
 *** .explanation
-The answer will be <span class="answer">`r round( mean(1 : 6 - 3.5) ^2 / 100, 3)`</span> 
-since the variance of the sampling distribution of the mean is $\sigma^2/12$
-and the variance of a die roll is 
+The answer will be <span class="answer">`r round( mean( (1 : 6 - 3.5) ^2) / 10, 3)`</span> 
+since the variance of the sampling distribution of the mean is $\sigma^2/10$
+where $\sigma^2$ is the variance of a single die roll, which is 
 
 ```{r}
 mean((1 : 6 - 3.5)^2)
diff --git a/06_StatisticalInference/homework/hw4.Rmd b/06_StatisticalInference/homework/hw4.Rmd
index bf5a8da3b..ae4a27b02 100644
--- a/06_StatisticalInference/homework/hw4.Rmd
+++ b/06_StatisticalInference/homework/hw4.Rmd
@@ -35,3 +35,316 @@ knit_hooks$set(plot = knitr:::hook_plot_html)
 Creating Data Products
 - Please help improve this with pull requests here
 (https://github.com/bcaffo/courses)
+
+
+--- &multitext
+Load the data set `mtcars` in the `datasets` R package. You want
+to test whether the MPG is $\mu_0$ or smaller using a one sided
+5% level test. ($H_0 : \mu = \mu_0$ versus $H_a : \mu < \mu_0$).
+Using that data set and a Z test:
+
+1. what is the smallest value of $\mu_0$ that you would reject for?
+
+Give the answer to two decimal places.
+
+*** .hint
+This is the inversion of a one sided hypothesis test. It yields confidence
+bounds. (Note inverting a two sided test yields confidence intervals.) 
+Think about the derivation of the confidence interval.
+
+*** .explanation
+We want to solve 
+$$
+\frac{\sqrt{n}(\bar{X} - \mu_0)}{s} = Z_{0.05}
+$$
+Or $$\mu_0 = \bar{X} - Z_{0.05} s / \sqrt{n} = \bar{X} + Z_{0.95} s / \sqrt{n}$$. Note that the quantile is negative.
+
+```{r}
+mn <- mean(mtcars$mpg); s <- sd(mtcars$mpg); z <- qnorm(.05)
+mu0 <- mn - z * s / sqrt(nrow(mtcars))
+```
+Note, it's easy to get tripped up in this problem on signs. If you get a value
+that's less than $\bar X$, then clearly it's wrong, since you'd never reject for
+$H_0:\mu = \mu_0$ versus $H_a : \mu < \mu_0$ if $\mu_0$ was less than your observed mean.
+Also note the answer to "What is the largest value for which you would reject for?" is
+infinity.
+
+<span class="answer">`r round(mu0, 2) `</span>
+
+
+--- &multitext
+Consider again the `mtcars` dataset. Use a two group t-test to test
+the hypothesis that the 4 and 6 cyl cars have the same mpg.  Use
+a two sided test with unequal variances.
+
+1. Do you reject at the 5% level (enter 0 for no, 1 for yes).
+2. What is the P-value to 4 decimal places expressed as a proportion?
+
+
+*** .hint
+Use `t.test` with the options `var.equal=FALSE`, `paired=FALSE`,  `alternative` as `two.sided`. 
+
+*** .explanation
+
+```{r}
+m4 <- mtcars$mpg[mtcars$cyl == 4]
+m6 <- mtcars$mpg[mtcars$cyl == 6]
+p <- t.test(m4, m6, paired = FALSE, alternative="two.sided", var.equal=FALSE)$p.value
+```
+The answer to 1. is <span class="answer">`r as.integer(p < .05)`</span> <br>
+The answer to 2. is <span class="answer">`r round(p, 4)`</span>
+
+
+--- &multitext
+A sample of 100 men yielded an average PSA level of 3.0 with a sd of 1.1. What
+are the complete set of values that a 5% two sided Z test of $H_0 : \mu = \mu_0$ 
+would reject the null hypothesis for?
+
+1. Enter the lower value to 2 decimal places.
+2. Enter the upper value to 2 decimal places. 
+
+*** .hint
+This is equivalent to the confidence interval.
+
+*** .explanation
+The answer to 1 is
+ <span class="answer">`r round(3 - qnorm(.975) * 1.1 / sqrt(100), 2)`</span><br>
+The answer to 2 is <span class="answer">`r round(3 + qnorm(.975) * 1.1 / sqrt(100), 2)`</span>
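+
+Both endpoints can also be computed in one vectorized call, with the lower endpoint subtracting and the upper endpoint adding the margin of error. A minimal sketch; the name `psaInterval` is introduced here for illustration.
+
+```{r}
+## mean -/+ normal quantile * standard error
+psaInterval <- 3 + c(-1, 1) * qnorm(.975) * 1.1 / sqrt(100)
+round(psaInterval, 2)
+```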
+
+
+--- &multitext
+You believe the coin that you're flipping is biased towards heads. You get 55 heads out of 
+100 flips. 
+
+1. What's the exact relevant pvalue to 4 decimal places (expressed as a proportion)?
+2. Would you reject a 1 sided hypothesis at $\alpha = .05$? (0 for no 1 for yes)?
+
+*** .hint
+Use `pbinom` for a hypothesis that $p=.5$ versus $p>.5$ where $p$ is the binomial success
+probability.
+
+*** .explanation
+Note you have to start at 54, as `lower.tail = FALSE` gives the strictly greater than
+probabilities
+```{r}
+ans <- round(pbinom(54, prob = .5, size = 100, lower.tail = FALSE),4)
+```
+The answer to 1 is <span class="answer">`r ans`</span><br>
+The answer to 2 is <span class="answer">`r as.integer(ans < .05)`</span><br>
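+
+As a cross-check, the same tail probability should come out of base R's `binom.test` with a one sided alternative. This is a sketch; `pExact` is a name introduced here.
+
+```{r}
+## P(X >= 55) under p = .5 from the exact binomial test
+pExact <- binom.test(55, 100, p = .5, alternative = "greater")$p.value
+round(pExact, 4)
+```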
+
+
+--- &multitext
+
+A web site was monitored for a year and it received 520 hits per day. In the first
+30 days in the next year, the site received 15,800 hits. Assuming that web hits
+are Poisson.
+
+1. Give an exact one sided P-value to the hypothesis that web hits are up this year over last
+to four significant digits (expressed as a proportion).
+2. Does the one sided test reject (0 for no 1 for yes)?
+
+
+
+*** .hint
+Consider using `ppois` with $\lambda=520 * 30$.  Note this is nearly exactly Gaussian, 
+so one could get away with the Gaussian calculation.
+
+*** .explanation
+This test comes with the important assumption that web hits are a Poisson process.
+
+```{r}
+pv <- ppois(15800 - 1, lambda = 520 * 30, lower.tail = FALSE)
+```
+
+The answer to 1 is <span class="answer">`r round(pv, 4)`</span><br>
+The answer to 2 is <span class="answer">`r as.integer(pv < .05)`</span><br>
+
+Also, compare with the Gaussian approximation where $\hat \lambda  \sim N(\lambda, \lambda / t)$
+```{r}
+pnorm(15800 / 30, mean = 520, sd = sqrt(520 / 30), lower.tail = FALSE)
+```
+As $t\rightarrow \infty$ this becomes more Gaussian. The approximation is pretty good for this
+setting.
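+
+The exact calculation can also be sketched with base R's `poisson.test`, where `T` is the monitoring time in days and `r` is the null rate per day. The name `pvExact` is introduced here for illustration.
+
+```{r}
+## one sided exact Poisson test of rate 520 per day over 30 days
+pvExact <- poisson.test(15800, T = 30, r = 520, alternative = "greater")$p.value
+round(pvExact, 4)
+```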
+
+
+--- &multitext
+
+Suppose that in an AB test, one advertising scheme led to an average of 10 purchases per day for a sample of 100 days, while the other led to 11 purchases per day, also for a sample of 100 days.
+Assume a common standard deviation of 4 purchases per day.
+Assuming that the groups are independent and that the days are iid, perform a Z test of
+equivalence. 
+
+1. What is the P-value reported to 3 digits expressed as a proportion?
+2. Do you reject the test? (0 for no 1 for yes).
+
+*** .hint
+The standard error is 
+$$
+s \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}
+$$
+
+*** .explanation
+```{r}
+m1 <- 10; m2 <- 11
+n1 <- n2 <- 100
+s <- 4
+se <- s * sqrt(1 / n1 + 1 / n2)
+ts <- (m2 - m1) / se
+pv <- 2 * pnorm(-abs(ts))
+```
+
+The answer to 1 is <span class="answer">`r round(pv, 3)`</span><br>
+The answer to 2 is <span class="answer">`r as.integer(pv < .05)`</span>
+
+
+--- &radio
+
+A confidence interval for the mean contains:
+
+1. _All of the values of the hypothesized mean for which we would fail to reject with 
+$\alpha = 1 - Conf. Level$._
+2. All of the values of the hypothesized mean for which we would fail to reject with 
+$2 \alpha = 1 - Conf. Level$.
+3. All of the values of the hypothesized mean for which we would reject with 
+$\alpha = 1 - Conf. Level$.
+4. All of the values of the hypothesized mean for which we would fail to reject with 
+$2 \alpha = 1 - Conf. Level$.
+
+*** .hint
+This is directly from the notes. Note that a confidence interval gives
+values of $\mu$ that are supported by the data whereas a test rejects
+for values of $\mu$ different from $\mu_0$. 
+
+*** .explanation
+The only complicated part of this is the 2. Note that a 95% interval corresponds
+to a 5% level two sided test. So it's $\alpha = 1 - Conf.Level$. The confusion is that
+for both the two sided test and confidence interval, one needs to calculate
+$Z_{1 - \alpha / 2}$ (or the relevant T quantile).
+
+
+--- &multitext
+Consider two problems previous. Assuming that 10 purchases per day is a benchmark null value, 
+that days are iid and that the standard deviation is 4 purchases per day. Suppose that you
+plan on sampling 100 days. What would be the power for a one sided 5% 
+Z mean test that purchases per day
+have increased under the alternative of $\mu = 11$ purchases per day?
+
+
+1. Give your result as a proportion to 3 decimal places.
+
+*** .hint
+Under $H_0$ $\bar X \sim N(10, .4)$. Under $H_a$ $\bar X \sim N(11, .4)$. We reject
+when $\bar X \geq 10 + Z_{.95} .4$.
+
+
+*** .explanation
+The hint pretty much gives it away.
+```{r}
+power <- pnorm(10 + qnorm(.95) * .4, mean = 11, sd = .4, lower.tail = FALSE)
+```
+The answer is <span class="answer">`r round(power, 3)`</span>
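+
+A quick simulation can serve as a sanity check on the closed form. This is a sketch: the number of simulated means and the seed are arbitrary, and `xbarSim` and `powerSim` are names introduced here.
+
+```{r}
+set.seed(1)
+## simulate sample means under the alternative, N(11, sd = .4)
+xbarSim <- rnorm(100000, mean = 11, sd = .4)
+## proportion of simulated means that fall in the rejection region
+powerSim <- mean(xbarSim >= 10 + qnorm(.95) * .4)
+powerSim
+```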
+
+
+--- &multitext
+Researchers would like to conduct a study of healthy adults to detect a four year mean brain volume loss of .01 mm3. Assume that the standard deviation of four year volume loss in this population is .04 mm3. 
+
+1. What is the necessary sample size for the study for a 5% one sided test versus a null hypothesis of no volume loss to achieve 80% power? (Always round up)
+
+
+
+*** .hint
+Under $H_0$ $\bar X$ is $N(0, .04 / \sqrt{n})$ and is $N(.01, .04 / \sqrt{n})$ under $H_a$. 
+We will reject if 
+$$
+\bar X \geq  z_{.95} s / \sqrt{n}
+$$ 
+which has probability 0.05 under $H_0$. Under $H_a$ it has probability
+$$
+P\left( \frac{\bar X - 0.01}{s / \sqrt{n}} \geq  z_{.95} - \frac{.01}{s / \sqrt{n}} \right)
+= P\left( Z \geq z_{.95} - \frac{.01}{s / \sqrt{n}}\right)
+$$
+
+*** .explanation
+Looking at the hint we set
+$$
+z_{.95} - \frac{.01}{s / \sqrt{n}} = z_{.2}
+$$
+$$
+n = \frac{(z_{.95} - z_{.2})^2 s^2}{.01^2} = \frac{ (z_{.95} + z_{.8})^2 s^2}{.01^2}
+$$
+So we get
+```{r}
+n <- (qnorm(.95) + qnorm(.8)) ^ 2 * .04 ^ 2 / .01^2
+```
+The answer is <span class="answer">`r ceiling(n)`</span>
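+
+One way to check the rounding is to plug the sample size back into the power formula: the power should reach 80% at the rounded-up value and fall just short one observation earlier. A sketch, with `zPower` and `nNeeded` as names introduced here.
+
+```{r}
+## power of the one sided 5% Z test as a function of the sample size
+zPower <- function(n, delta = .01, sigma = .04, alpha = .05)
+  pnorm(qnorm(1 - alpha) - delta / (sigma / sqrt(n)), lower.tail = FALSE)
+nNeeded <- ceiling((qnorm(.95) + qnorm(.8)) ^ 2 * .04 ^ 2 / .01 ^ 2)
+## power just below and at the chosen sample size
+c(zPower(nNeeded - 1), zPower(nNeeded))
+```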
+
+
+--- &radio
+
+In a court of law, all things being equal, if via policy you require a lower
+standard of evidence to convict people then
+
+1. Fewer guilty people will be convicted.
+2. _More innocent people will be convicted._
+3. More innocent people will not be convicted.
+
+
+*** .hint
+Think about it.
+
+*** .explanation
+If you require less evidence to convict, then you will convict more people, guilty and
+innocent. Relate this property back to hypothesis tests.
+
+
+--- &multitext
+Consider the `mtcars` data set. 
+
+1. Give the p-value for a t-test assuming 
+constant variance comparing MPG for 6 and 8 cylinder cars as a proportion to 3 decimal places.
+2. Give the associated P-value for a z test.
+3. Give the common standard deviation estimate for MPG across cylinders to 3 decimal places.
+4. Would the t test reject at the two sided 0.05 level (0 for no 1 for yes)?
+
+
+*** .hint
+```{r}
+mpg8 <- mtcars$mpg[mtcars$cyl == 8]
+mpg6 <- mtcars$mpg[mtcars$cyl == 6]
+m8 <- mean(mpg8); m6 <- mean(mpg6)
+s8 <- sd(mpg8); s6 <- sd(mpg6)
+n8 <- length(mpg8); n6 <- length(mpg6)
+```
+
+*** .explanation
+```{r}
+p <- t.test(mpg8, mpg6, paired = FALSE, alternative="two.sided", var.equal=TRUE)$p.value
+mixprob <- (n8 - 1) / (n8 + n6 - 2)
+s <- sqrt(mixprob * s8 ^ 2  +  (1 - mixprob) * s6 ^ 2)
+z <- (m8 - m6) / (s * sqrt(1 / n8 + 1 / n6))
+pz <- 2 * pnorm(-abs(z))
+## Hand calculating the T just to check
+#2 * pt(-abs(z), df = n8 + n6 - 2)
+```
+
+1. <span class="answer">`r round(p,  3)`</span> <br>
+2. <span class="answer">`r round(pz, 3)`</span><br>
+3. <span class="answer">`r round(s, 3)`</span><br>
+4. <span class="answer">`r as.integer(p < .05)`</span>
+
+
+--- &radio
+The Bonferroni correction controls this
+
+1. False discovery rate
+2. _The familywise error rate_
+3. The rate of true rejections
+4. The number of true rejections
+
+*** .hint
+This is pretty much straight out of the notes
+
+*** .explanation
+The Bonferroni correction is the classic correction for the familywise error rate.
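+
+In R the correction is available through `p.adjust`. A minimal sketch with made-up p-values (the vector `pValues` is hypothetical and not from the notes):
+
+```{r}
+## hypothetical p-values from m = 5 tests
+pValues <- c(0.001, 0.012, 0.030, 0.049, 0.200)
+## Bonferroni multiplies each p-value by m, capping at 1
+pBonf <- p.adjust(pValues, method = "bonferroni")
+pBonf
+## tests still rejected at the 0.05 familywise level
+pBonf < 0.05
+```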
+
+
diff --git a/06_StatisticalInference/homework/hw4.html b/06_StatisticalInference/homework/hw4.html
index 565621a6d..cc90107c4 100644
--- a/06_StatisticalInference/homework/hw4.html
+++ b/06_StatisticalInference/homework/hw4.html
@@ -59,6 +59,500 @@ <h2>About these slides</h2>
   <!-- Presenter Notes -->
 </slide>
 
+<slide class="" id="slide-2" style="background:;">
+  <article data-timings="">
+    
+<div class="quiz-text quiz-multitext well">
+  <p>Load the data set <code>mtcars</code> in the <code>datasets</code> R package. You want
+to test whether the MPG is \(\mu_0\) or smaller using a one sided
+5% level test. (\(H_0 : \mu = \mu_0\) versus \(H_a : \mu < \mu_0\)).
+Using that data set and a Z test:</p>
+
+<ol>
+<li>what is the smallest value of \(\mu_0\) that you would reject for?</li>
+</ol>
+
+<p>Give the answer to two decimal places.</p>
+
+  <button class="quiz-submit btn btn-primary">Submit</button>
+  <button class="quiz-toggle-hint btn btn-info">Show Hint</button>
+  <button class="quiz-show-answer btn btn-success">Show Answer</button>
+  <button class="quiz-clear btn btn-danger">Clear</button>
+  
+  <div class="quiz-hint">
+  <p>This is the inversion of a one sided hypothesis test. It yields confidence
+bounds. (Note inverting a two sided test yields confidence intervals.) 
+Think about the derivation of the confidence interval.</p>
+
+</div>
+<div class="quiz-explanation">
+  <p>We want to solve 
+\[
+\frac{\sqrt{n}(\bar{X} - \mu_0)}{s} = Z_{0.05}
+\]
+Or \[\mu_0 = \bar{X} - Z_{0.05} s / \sqrt{n} = \bar{X} + Z_{0.95} s / \sqrt{n}\]. Note that the quantile is negative.</p>
+
+<pre><code class="r">mn &lt;- mean(mtcars$mpg); s &lt;- sd(mtcars$mpg); z &lt;- qnorm(.05)
+mu0 &lt;- mn - z * s / sqrt(nrow(mtcars))
+</code></pre>
+
+<p>Note, it&#39;s easy to get tripped up in this problem on signs. If you get a value
+that&#39;s less than \(\bar X\), then clearly it&#39;s wrong, since you&#39;d never reject for
+\(H_0:\mu = \mu_0\) versus \(H_a : \mu < \mu_0\) if \(\mu_0\) was less than your observed mean.
+Also note the answer to &quot;What is the largest value for which you would reject for?&quot; is
+infinity.</p>
+
+<p><span class="answer">21.84</span></p>
+
+</div>
+</div>
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-3" style="background:;">
+  <article data-timings="">
+    
+<div class="quiz-text quiz-multitext well">
+  <p>Consider again the <code>mtcars</code> dataset. Use a two group t-test to test
+the hypothesis that the 4 and 6 cyl cars have the same mpg.  Use
+a two sided test with unequal variances.</p>
+
+<ol>
+<li>Do you reject at the 5% level (enter 0 for no, 1 for yes).</li>
+<li>What is the P-value to 4 decimal places expressed as a proportion?</li>
+</ol>
+
+  <button class="quiz-submit btn btn-primary">Submit</button>
+  <button class="quiz-toggle-hint btn btn-info">Show Hint</button>
+  <button class="quiz-show-answer btn btn-success">Show Answer</button>
+  <button class="quiz-clear btn btn-danger">Clear</button>
+  
+  <div class="quiz-hint">
+  <p>Use <code>t.test</code> with the options <code>var.equal=FALSE</code>, <code>paired=FALSE</code>,  <code>alternative</code> as <code>two.sided</code>. </p>
+
+</div>
+<div class="quiz-explanation">
+  <pre><code class="r">m4 &lt;- mtcars$mpg[mtcars$cyl == 4]
+m6 &lt;- mtcars$mpg[mtcars$cyl == 6]
+p &lt;- t.test(m4, m6, paired = FALSE, alternative=&quot;two.sided&quot;, var.equal=FALSE)$p.value
+</code></pre>
+
+<p>The answer to 1. is <span class="answer">1</span> <br>
+The answer to 2. is <span class="answer">4e-04</span></p>
+
+</div>
+</div>
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-4" style="background:;">
+  <article data-timings="">
+    
+<div class="quiz-text quiz-multitext well">
+  <p>A sample of 100 men yielded an average PSA level of 3.0 with a sd of 1.1. What
+are the complete set of values that a 5% two sided Z test of \(H_0 : \mu = \mu_0\) 
+would reject the null hypothesis for?</p>
+
+<ol>
+<li>Enter the lower value to 2 decimal places.</li>
+<li>Enter the upper value to 2 decimal places. </li>
+</ol>
+
+  <button class="quiz-submit btn btn-primary">Submit</button>
+  <button class="quiz-toggle-hint btn btn-info">Show Hint</button>
+  <button class="quiz-show-answer btn btn-success">Show Answer</button>
+  <button class="quiz-clear btn btn-danger">Clear</button>
+  
+  <div class="quiz-hint">
+  <p>This is equivalent to the confidence interval.</p>
+
+</div>
+<div class="quiz-explanation">
+  <p>The answer to 1 is
+ <span class="answer">2.78</span><br>
+The answer to 2 is <span class="answer">3.22</span></p>
+
+</div>
+</div>
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-5" style="background:;">
+  <article data-timings="">
+    
+<div class="quiz-text quiz-multitext well">
+  <p>You believe the coin that you&#39;re flipping is biased towards heads. You get 55 heads out of 
+100 flips. </p>
+
+<ol>
+<li>What&#39;s the exact relevant pvalue to 4 decimal places (expressed as a proportion)?</li>
+<li>Would you reject a 1 sided hypothesis at \(\alpha = .05\)? (0 for no 1 for yes)?</li>
+</ol>
+
+  <button class="quiz-submit btn btn-primary">Submit</button>
+  <button class="quiz-toggle-hint btn btn-info">Show Hint</button>
+  <button class="quiz-show-answer btn btn-success">Show Answer</button>
+  <button class="quiz-clear btn btn-danger">Clear</button>
+  
+  <div class="quiz-hint">
+  <p>Use <code>pbinom</code> for a hypothesis that \(p=.5\) versus \(p>.5\) where \(p\) is the binomial success
+probability.</p>
+
+</div>
+<div class="quiz-explanation">
+  <p>Note you have to start at 54, as <code>lower.tail = FALSE</code> gives the strictly greater than
+probabilities</p>
+
+<pre><code class="r">ans &lt;- round(pbinom(54, prob = .5, size = 100, lower.tail = FALSE),4)
+</code></pre>
+
+<p>The answer to 1 is <span class="answer">0.1841</span><br>
+The answer to 2 is <span class="answer">0</span><br></p>
+
+</div>
+</div>
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-6" style="background:;">
+  <article data-timings="">
+    
+<div class="quiz-text quiz-multitext well">
+  <p>A web site was monitored for a year and it received 520 hits per day. In the first
+30 days in the next year, the site received 15,800 hits. Assuming that web hits
+are Poisson.</p>
+
+<ol>
+<li>Give an exact one sided P-value to the hypothesis that web hits are up this year over last
+to four significant digits (expressed as a proportion).</li>
+<li>Does the one sided test reject (0 for no 1 for yes)?</li>
+</ol>
+
+  <button class="quiz-submit btn btn-primary">Submit</button>
+  <button class="quiz-toggle-hint btn btn-info">Show Hint</button>
+  <button class="quiz-show-answer btn btn-success">Show Answer</button>
+  <button class="quiz-clear btn btn-danger">Clear</button>
+  
+  <div class="quiz-hint">
+  <p>Consider using <code>ppois</code> with \(\lambda=520 * 30\).  Note this is nearly exactly Gaussian, 
+so one could get away with the Gaussian calculation.</p>
+
+</div>
+<div class="quiz-explanation">
+  <p>This test comes with the important assumption that web hits are a Poisson process.</p>
+
+<pre><code class="r">pv &lt;- ppois(15800 - 1, lambda = 520 * 30, lower.tail = FALSE)
+</code></pre>
+
+<p>The answer to 1 is <span class="answer">0.0553</span><br>
+The answer to 2 is <span class="answer">0</span><br></p>
+
+<p>Also, compare with the Gaussian approximation where \(\hat \lambda  \sim N(\lambda, \lambda / t)\)</p>
+
+<pre><code class="r">pnorm(15800 / 30, mean = 520, sd = sqrt(520 / 30), lower.tail = FALSE)
+</code></pre>
+
+<pre><code>[1] 0.05466
+</code></pre>
+
+<p>As \(t\rightarrow \infty\) this becomes more Gaussian. The approximation is pretty good for this
+setting.</p>
+
+</div>
+</div>
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-7" style="background:;">
+  <article data-timings="">
+    
+<div class="quiz-text quiz-multitext well">
+  <p>Suppose that in an AB test, one advertising scheme led to an average of 10 purchases per day for a sample of 100 days, while the other led to 11 purchases per day, also for a sample of 100 days.
+Assume a common standard deviation of 4 purchases per day.
+Assuming that the groups are independent and that the days are iid, perform a Z test of
+equivalence. </p>
+
+<ol>
+<li>What is the P-value reported to 3 digits expressed as a proportion?</li>
+<li>Do you reject the test? (0 for no 1 for yes).</li>
+</ol>
+
+  <button class="quiz-submit btn btn-primary">Submit</button>
+  <button class="quiz-toggle-hint btn btn-info">Show Hint</button>
+  <button class="quiz-show-answer btn btn-success">Show Answer</button>
+  <button class="quiz-clear btn btn-danger">Clear</button>
+  
+  <div class="quiz-hint">
+  <p>The standard error is 
+\[
+s \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}
+\]</p>
+
+</div>
+<div class="quiz-explanation">
+  <pre><code class="r">m1 &lt;- 10; m2 &lt;- 11
+n1 &lt;- n2 &lt;- 100
+s &lt;- 4
+se &lt;- s * sqrt(1 / n1 + 1 / n2)
+ts &lt;- (m2 - m1) / se
+pv &lt;- 2 * pnorm(-abs(ts))
+</code></pre>
+
+<p>The answer to 1 is <span class="answer">0.077</span><br>
+The answer to 2 is <span class="answer">0</span></p>
+
+</div>
+</div>
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-8" style="background:;">
+  <article data-timings="">
+    
+<div class="quiz quiz-single well ">
+  <p>A confidence interval for the mean contains:</p>
+
+<ol>
+<li><em>All of the values of the hypothesized mean for which we would fail to reject with 
+\(\alpha = 1 - Conf. Level\).</em></li>
+<li>All of the values of the hypothesized mean for which we would fail to reject with 
+\(2 \alpha = 1 - Conf. Level\).</li>
+<li>All of the values of the hypothesized mean for which we would reject with 
+\(\alpha = 1 - Conf. Level\).</li>
+<li>All of the values of the hypothesized mean for which we would fail to reject with 
+\(2 \alpha = 1 - Conf. Level\).</li>
+</ol>
+
+  <button class="quiz-submit btn btn-primary">Submit</button>
+  <button class="quiz-toggle-hint btn btn-info">Show Hint</button>
+  <button class="quiz-show-answer btn btn-success">Show Answer</button>
+  <button class="quiz-clear btn btn-danger">Clear</button>
+  
+  <div class="quiz-hint">
+  <p>This is directly from the notes. Note that a confidence interval gives
+values of \(\mu\) that are supported by the data whereas a test rejects
+for values of \(\mu\) different from \(\mu_0\). </p>
+
+</div>
+<div class="quiz-explanation">
+  <p>The only complicated part of this is the 2. Note that a 95% interval corresponds
+to a 5% level two sided test. So it&#39;s \(\alpha = 1 - Conf.Level\). The confusion is that
+for both the two sided test and confidence interval, one needs to calculate
+\(Z_{1 - \alpha / 2}\) (or the relevant T quantile).</p>
+
+</div>
+</div>
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-9" style="background:;">
+  <article data-timings="">
+    
+<div class="quiz-text quiz-multitext well">
+  <p>Consider two problems previous. Assuming that 10 purchases per day is a benchmark null value, 
+that days are iid and that the standard deviation is 4 purchases per day. Suppose that you
+plan on sampling 100 days. What would be the power for a one sided 5% 
+Z mean test that purchases per day
+have increased under the alternative of \(\mu = 11\) purchases per day?</p>
+
+<ol>
+<li>Give your result as a proportion to 3 decimal places.</li>
+</ol>
+
+  <button class="quiz-submit btn btn-primary">Submit</button>
+  <button class="quiz-toggle-hint btn btn-info">Show Hint</button>
+  <button class="quiz-show-answer btn btn-success">Show Answer</button>
+  <button class="quiz-clear btn btn-danger">Clear</button>
+  
+  <div class="quiz-hint">
+  <p>Under \(H_0\) \(\bar X \sim N(10, .4)\). Under \(H_a\) \(\bar X \sim N(11, .4)\). We reject
+when \(\bar X \geq 10 + Z_{.95} .4\).</p>
+
+</div>
+<div class="quiz-explanation">
+  <p>The hint pretty much gives it away.</p>
+
+<pre><code class="r">power &lt;- pnorm(10 + qnorm(.95) * .4, mean = 11, sd = .4, lower.tail = FALSE)
+</code></pre>
+
+<p>The answer is <span class="answer">0.804</span></p>
+
+</div>
+</div>
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-10" style="background:;">
+  <article data-timings="">
+    
+<div class="quiz-text quiz-multitext well">
+  <p>Researchers would like to conduct a study of healthy adults to detect a four year mean brain volume loss of .01 mm3. Assume that the standard deviation of four year volume loss in this population is .04 mm3. </p>
+
+<ol>
+<li>What is the necessary sample size for the study for a 5% one sided test versus a null hypothesis of no volume loss to achieve 80% power? (Always round up)</li>
+</ol>
+
+  <button class="quiz-submit btn btn-primary">Submit</button>
+  <button class="quiz-toggle-hint btn btn-info">Show Hint</button>
+  <button class="quiz-show-answer btn btn-success">Show Answer</button>
+  <button class="quiz-clear btn btn-danger">Clear</button>
+  
+  <div class="quiz-hint">
+  <p>Under \(H_0\) \(\bar X\) is \(N(0, .04 / \sqrt{n})\) and is \(N(.01, .04 / \sqrt{n})\) under \(H_a\). 
+We will reject if 
+\[
+\bar X \geq  z_{.95} s / \sqrt{n}
+\] 
+which has probability 0.05 under \(H_0\). Under \(H_a\) it has probability
+\[
+P\left( \frac{\bar X - 0.01}{s / \sqrt{n}} \geq  z_{.95} - \frac{.01}{s / \sqrt{n}} \right)
+= P\left( Z \geq z_{.95} - \frac{.01}{s / \sqrt{n}}\right)
+\]</p>
+
+</div>
+<div class="quiz-explanation">
+  <p>Looking at the hint we set
+\[
+z_{.95} - \frac{.01}{s / \sqrt{n}} = z_{.2}
+\]
+\[
+n = \frac{(z_{.95} - z_{.2})^2 s^2}{.01^2} = \frac{ (z_{.95} + z_{.8})^2 s^2}{.01^2}
+\]
+So we get</p>
+
+<pre><code class="r">n &lt;- (qnorm(.95) + qnorm(.8)) ^ 2 * .04 ^ 2 / .01^2
+</code></pre>
+
+<p>The answer is <span class="answer">99</span></p>
+
+</div>
+</div>
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-11" style="background:;">
+  <article data-timings="">
+    
+<div class="quiz quiz-single well ">
+  <p>In a court of law, all things being equal, if via policy you require a lower
+standard of evidence to convict people then</p>
+
+<ol>
+<li>Fewer guilty people will be convicted.</li>
+<li><em>More innocent people will be convicted.</em></li>
+<li>More innocent people will not be convicted.</li>
+</ol>
+
+  <button class="quiz-submit btn btn-primary">Submit</button>
+  <button class="quiz-toggle-hint btn btn-info">Show Hint</button>
+  <button class="quiz-show-answer btn btn-success">Show Answer</button>
+  <button class="quiz-clear btn btn-danger">Clear</button>
+  
+  <div class="quiz-hint">
+  <p>Think about it.</p>
+
+</div>
+<div class="quiz-explanation">
+  <p>If you require less evidence to convict, then you will convict more people, guilty and
+innocent. Relate this property back to hypothesis tests.</p>
+
+</div>
+</div>
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-12" style="background:;">
+  <article data-timings="">
+    
+<div class="quiz-text quiz-multitext well">
+  <p>Consider the <code>mtcars</code> data set. </p>
+
+<ol>
+<li>Give the p-value for a t-test assuming 
+constant variance comparing MPG for 6 and 8 cylinder cars as a proportion to 3 decimal places.</li>
+<li>Give the associated P-value for a z test.</li>
+<li>Give the common standard deviation estimate for MPG across cylinders to 3 decimal places.</li>
+<li>Would the t test reject at the two sided 0.05 level (0 for no 1 for yes)?</li>
+</ol>
+
+  <button class="quiz-submit btn btn-primary">Submit</button>
+  <button class="quiz-toggle-hint btn btn-info">Show Hint</button>
+  <button class="quiz-show-answer btn btn-success">Show Answer</button>
+  <button class="quiz-clear btn btn-danger">Clear</button>
+  
+  <div class="quiz-hint">
+  <pre><code class="r">mpg8 &lt;- mtcars$mpg[mtcars$cyl == 8]
+mpg6 &lt;- mtcars$mpg[mtcars$cyl == 6]
+m8 &lt;- mean(mpg8); m6 &lt;- mean(mpg6)
+s8 &lt;- sd(mpg8); s6 &lt;- sd(mpg6)
+n8 &lt;- length(mpg8); n6 &lt;- length(mpg6)
+</code></pre>
+
+</div>
+<div class="quiz-explanation">
+  <pre><code class="r">p &lt;- t.test(mpg8, mpg6, paired = FALSE, alternative=&quot;two.sided&quot;, var.equal=TRUE)$p.value
+mixprob &lt;- (n8 - 1) / (n8 + n6 - 2)
+s &lt;- sqrt(mixprob * s8 ^ 2  +  (1 - mixprob) * s6 ^ 2)
+z &lt;- (m8 - m6) / (s * sqrt(1 / n8 + 1 / n6))
+pz &lt;- 2 * pnorm(-abs(z))
+## Hand calculating the T just to check
+#2 * pt(-abs(z), df = n8 + n6 - 2)
+</code></pre>
+
+<ol>
+<li><span class="answer">0</span> <br></li>
+<li><span class="answer">0</span><br></li>
+<li><span class="answer">2.27</span><br></li>
+<li><span class="answer">1</span></li>
+</ol>
+
+</div>
+</div>
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
+<slide class="" id="slide-13" style="background:;">
+  <article data-timings="">
+    
+<div class="quiz quiz-single well ">
+  <p>The Bonferroni correction controls this</p>
+
+<ol>
+<li>False discovery rate</li>
+<li><em>The familywise error rate</em></li>
+<li>The rate of true rejections</li>
+<li>The number of true rejections</li>
+</ol>
+
+  <button class="quiz-submit btn btn-primary">Submit</button>
+  <button class="quiz-toggle-hint btn btn-info">Show Hint</button>
+  <button class="quiz-show-answer btn btn-success">Show Answer</button>
+  <button class="quiz-clear btn btn-danger">Clear</button>
+  
+  <div class="quiz-hint">
+  <p>This is pretty much straight out of the notes</p>
+
+</div>
+<div class="quiz-explanation">
+  <p>The Bonferroni correction is the classic correction for the familywise error rate.</p>
+
+</div>
+</div>
+  </article>
+  <!-- Presenter Notes -->
+</slide>
+
     <slide class="backdrop"></slide>
   </slides>
   <div class="pagination pagination-small" id='io2012-ptoc' style="display:none;">
@@ -69,6 +563,78 @@ <h2>About these slides</h2>
          1
       </a>
     </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=2 title=''>
+         2
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=3 title=''>
+         3
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=4 title=''>
+         4
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=5 title=''>
+         5
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=6 title=''>
+         6
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=7 title=''>
+         7
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=8 title=''>
+         8
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=9 title=''>
+         9
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=10 title=''>
+         10
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=11 title=''>
+         11
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=12 title=''>
+         12
+      </a>
+    </li>
+    <li>
+      <a href="#" target="_self" rel='tooltip' 
+        data-slide=13 title=''>
+         13
+      </a>
+    </li>
   </ul>
   </div>  <!--[if IE]>
     <script 
diff --git a/06_StatisticalInference/homework/hw4.md b/06_StatisticalInference/homework/hw4.md
index a22e64543..980b0373e 100644
--- a/06_StatisticalInference/homework/hw4.md
+++ b/06_StatisticalInference/homework/hw4.md
@@ -21,3 +21,340 @@ mode        : selfcontained # {standalone, draft}
 Creating Data Products
 - Please help improve this with pull requests here
 (https://github.com/bcaffo/courses)
+
+
+--- &multitext
+Load the data set `mtcars` in the `datasets` R package. You want
+to test whether the MPG is $\mu_0$ or smaller using a one sided
+5% level test. ($H_0 : \mu = \mu_0$ versus $H_a : \mu < \mu_0$).
+Using that data set and a Z test:
+
+1. What is the smallest value of $\mu_0$ that you would reject for?
+
+Give your answer to two decimal places.
+
+*** .hint
+This is the inversion of a one sided hypothesis test. It yields confidence
+bounds. (Note inverting a two sided test yields confidence intervals.) 
+Think about the derivation of the confidence interval.
+
+*** .explanation
+We want to solve 
+$$
+\frac{\sqrt{n}(\bar{X} - \mu_0)}{s} = Z_{0.05}
+$$
+Or $$\mu_0 = \bar{X} - Z_{0.05} s / \sqrt{n} = \bar{X} + Z_{0.95} s / \sqrt{n}$$. Note that the quantile is negative.
+
+
+```r
+mn <- mean(mtcars$mpg); s <- sd(mtcars$mpg); z <- qnorm(.05)
+mu0 <- mn - z * s / sqrt(nrow(mtcars))
+```
+
+Note, it's easy to get tripped up in this problem on signs. If you get a value
+that's less than $\bar X$, then clearly it's wrong, since you'd never reject for
+$H_0:\mu = \mu_0$ versus $H_a : \mu < \mu_0$ if $\mu_0$ was less than your observed mean.
+Also note the answer to "What is the largest value for which you would reject?" is
+infinity.
+
+<span class="answer">21.84</span>
+
+
+--- &multitext
+Consider again the `mtcars` dataset. Use a two group t-test to test
+the hypothesis that the 4 and 6 cyl cars have the same mpg.  Use
+a two sided test with unequal variances.
+
+1. Do you reject at the 5% level (enter 0 for no, 1 for yes).
+2. What is the P-value to 4 decimal places expressed as a proportion?
+
+
+*** .hint
+Use `t.test` with the options `var.equal=FALSE`, `paired=FALSE`,  `alternative` as `two.sided`. 
+
+*** .explanation
+
+
+```r
+m4 <- mtcars$mpg[mtcars$cyl == 4]
+m6 <- mtcars$mpg[mtcars$cyl == 6]
+p <- t.test(m4, m6, paired = FALSE, alternative="two.sided", var.equal=FALSE)$p.value
+```
+
+The answer to 1. is <span class="answer">1</span> <br>
+The answer to 2. is <span class="answer">4e-04</span>
+
+
+--- &multitext
+A sample of 100 men yielded an average PSA level of 3.0 with a sd of 1.1. What
+is the complete set of values of $\mu_0$ for which a 5% two sided Z test of $H_0 : \mu = \mu_0$ 
+would fail to reject the null hypothesis?
+
+1. Enter the lower value to 2 decimal places.
+2. Enter the upper value to 2 decimal places. 
+
+*** .hint
+This is equivalent to the confidence interval.
+
+*** .explanation
+The answer to 1 is
+ <span class="answer">2.78</span><br>
+The answer to 2 is <span class="answer">3.22</span>
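+
+A minimal sketch of the calculation, assuming the usual Z interval $\bar X \pm Z_{0.975} s / \sqrt{n}$ is the intended method (the variable names are illustrative, not from the original notes):
+
+```r
+# Summary statistics from the problem statement
+n <- 100; xbar <- 3.0; s <- 1.1
+# A two sided 5% Z test fails to reject mu0 inside xbar +/- Z_{.975} * s / sqrt(n)
+bounds <- xbar + c(-1, 1) * qnorm(.975) * s / sqrt(n)
+round(bounds, 2)
+```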
+
+
+--- &multitext
+You believe the coin that you're flipping is biased towards heads. You get 55 heads out of 
+100 flips. 
+
+1. What's the exact relevant P-value to 4 decimal places (expressed as a proportion)?
+2. Would you reject a 1 sided hypothesis at $\alpha = .05$ (0 for no 1 for yes)?
+
+*** .hint
+Use `pbinom` for a hypothesis that $p=.5$ versus $p>.5$ where $p$ is the binomial success
+probability.
+
+*** .explanation
+Note that you have to start at 54, since `lower.tail = FALSE` gives the probability of
+strictly greater than the first argument.
+
+```r
+ans <- round(pbinom(54, prob = .5, size = 100, lower.tail = FALSE),4)
+```
+
+The answer to 1 is <span class="answer">0.1841</span><br>
+The answer to 2 is <span class="answer">0</span><br>
+
+
+--- &multitext
+
+A web site was monitored for a year and it received an average of 520 hits per day. In the first
+30 days of the next year, the site received 15,800 hits. Assume that web hits
+are Poisson.
+
+1. Give an exact one sided P-value for the hypothesis that web hits are up this year over last
+to four significant digits (expressed as a proportion).
+2. Does the one sided test reject (0 for no 1 for yes)?
+
+
+
+*** .hint
+Consider using `ppois` with $\lambda=520 * 30$.  Note this is nearly exactly Gaussian, 
+so one could get away with the Gaussian calculation.
+
+*** .explanation
+This test comes with the important assumption that web hits are a Poisson process.
+
+
+```r
+pv <- ppois(15800 - 1, lambda = 520 * 30, lower.tail = FALSE)
+```
+
+
+The answer to 1 is <span class="answer">0.0553</span><br>
+The answer to 2 is <span class="answer">0</span><br>
+
+Also, compare with the Gaussian approximation where $\hat \lambda  \sim N(\lambda, \lambda / t)$
+
+```r
+pnorm(15800 / 30, mean = 520, sd = sqrt(520 / 30), lower.tail = FALSE)
+```
+
+```
+[1] 0.05466
+```
+
+As $t\rightarrow \infty$ this becomes more Gaussian. The approximation is pretty good for this
+setting.
+
+
+--- &multitext
+
+Suppose that in an AB test, one advertising scheme led to an average of 10 purchases per day for a sample of 100 days, while the other led to 11 purchases per day, also for a sample of 100 days.
+Assume a common standard deviation of 4 purchases per day.
+Assuming that the groups are independent and that the days are iid, perform a Z test of
+equal means. 
+
+1. What is the P-value reported to 3 digits expressed as a proportion?
+2. Do you reject the null hypothesis? (0 for no 1 for yes).
+
+*** .hint
+The standard error is 
+$$
+s \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}
+$$
+
+*** .explanation
+
+```r
+m1 <- 10; m2 <- 11
+n1 <- n2 <- 100
+s <- 4
+se <- s * sqrt(1 / n1 + 1 / n2)
+ts <- (m2 - m1) / se
+pv <- 2 * pnorm(-abs(ts))
+```
+
+
+The answer to 1 is <span class="answer">0.077</span><br>
+The answer to 2 is <span class="answer">0</span>
+
+
+--- &radio
+
+A confidence interval for the mean contains:
+
+1. _All of the values of the hypothesized mean for which we would fail to reject with 
+$\alpha = 1 - Conf. Level$._
+2. All of the values of the hypothesized mean for which we would fail to reject with 
+$2 \alpha = 1 - Conf. Level$.
+3. All of the values of the hypothesized mean for which we would reject with 
+$\alpha = 1 - Conf. Level$.
+4. All of the values of the hypothesized mean for which we would reject with 
+$2 \alpha = 1 - Conf. Level$.
+
+*** .hint
+This is directly from the notes. Note that a confidence interval gives
+values of $\mu$ that are supported by the data whereas a test rejects
+for values of $\mu$ different from $\mu_0$. 
+
+*** .explanation
+The only complicated part of this is the factor of 2. Note that a 95% interval corresponds
+to a 5% level two sided test. So it's $\alpha = 1 - Conf.Level$. The confusion is that
+for both the two sided test and confidence interval, one needs to calculate
+$Z_{1 - \alpha / 2}$ (or the relevant T quantile).
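+
+As a hedged illustration of this duality (using `mtcars$mpg` and a T interval purely as an example; none of these variable names come from the notes), testing at either endpoint of a 95% interval gives a P-value of exactly 0.05:
+
+```r
+x <- mtcars$mpg
+ci <- t.test(x, conf.level = 0.95)$conf.int
+# Two sided tests of mu0 at each endpoint of the interval: P-value equals alpha
+p_at_ends <- sapply(ci, function(mu0) t.test(x, mu = mu0)$p.value)
+# A hypothesized mean inside the interval is not rejected ...
+p_inside <- t.test(x, mu = mean(x))$p.value
+# ... while one outside the interval is rejected at the 5% level
+p_outside <- t.test(x, mu = ci[2] + 1)$p.value
+```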
+
+
+--- &multitext
+Consider the AB test setting from two problems previous. Assume that 10 purchases per day is a benchmark null value, 
+that days are iid and that the standard deviation is 4 purchases per day. Suppose that you
+plan on sampling 100 days. What would be the power for a one sided 5% 
+Z mean test that purchases per day
+have increased under the alternative of $\mu = 11$ purchases per day?
+
+
+1. Give your result as a proportion to 3 decimal places.
+
+*** .hint
+Under $H_0$ $\bar X \sim N(10, .4)$. Under $H_a$ $\bar X \sim N(11, .4)$. We reject
+when $\bar X \geq 10 + Z_{.95} .4$.
+
+
+*** .explanation
+The hint pretty much gives it away.
+
+```r
+power <- pnorm(10 + qnorm(.95) * .4, mean = 11, sd = .4, lower.tail = FALSE)
+```
+
+The answer is <span class="answer">0.804</span>
+
+
+--- &multitext
+Researchers would like to conduct a study of healthy adults to detect a four year mean brain volume loss of .01 mm3. Assume that the standard deviation of four year volume loss in this population is .04 mm3. 
+
+1. What is the necessary sample size for the study for a 5% one sided test versus a null hypothesis of no volume loss to achieve 80% power? (Always round up)
+
+
+
+*** .hint
+Under $H_0$ $\bar X$ is $N(0, .04 / \sqrt{n})$ and is $N(.01, .04 / \sqrt{n})$ under $H_a$. 
+We will reject if 
+$$
+\bar X \geq  z_{.95} s / \sqrt{n}
+$$ 
+which has probability 0.05 under $H_0$. Under $H_a$ it has probability
+$$
+P\left( \frac{\bar X - 0.01}{s / \sqrt{n}} \geq  z_{.95} - \frac{.01}{s / \sqrt{n}} \right)
+= P\left( Z \geq z_{.95} - \frac{.01}{s / \sqrt{n}}\right)
+$$
+
+*** .explanation
+Looking at the hint we set
+$$
+z_{.95} - \frac{.01}{s / \sqrt{n}} = z_{.2}
+$$
+$$
+$$
+n = \frac{(z_{.95} - z_{.2})^2 s^2}{.01^2} = \frac{ (z_{.95} + z_{.8})^2 s^2}{.01^2}
+$$
+So we get
+
+```r
+n <- ceiling((qnorm(.95) + qnorm(.8)) ^ 2 * .04 ^ 2 / .01 ^ 2)
+```
+
+The answer is <span class="answer">99</span>
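+
+One way to sanity check the rounding, sketched under the same normal model (`power_at` is a hypothetical helper, not part of the notes), is to compare the power just below and at the stated sample size:
+
+```r
+power_at <- function(n, delta = .01, sigma = .04, alpha = .05) {
+  # P(Xbar >= z_{1 - alpha} * sigma / sqrt(n)) when the true mean is delta
+  pnorm(qnorm(1 - alpha) - delta / (sigma / sqrt(n)), lower.tail = FALSE)
+}
+# Power with one subject fewer, and with the stated sample size
+c(power_at(98), power_at(99))
+```
+
+The first value falls just short of 0.8 and the second just reaches it, which is why the result is rounded up.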
+
+
+--- &radio
+
+In a court of law, all things being equal, if via policy you require a lower
+standard of evidence to convict people then
+
+1. Fewer guilty people will be convicted.
+2. _More innocent people will be convicted._
+3. More innocent people will not be convicted.
+
+
+*** .hint
+Think about it.
+
+*** .explanation
+If you require less evidence to convict, then you will convict more people, guilty and
+innocent. Relate this property back to hypothesis tests.
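+
+To relate this back to hypothesis tests, here is a rough sketch that reuses the numbers from the purchases power problem (null mean 10, alternative mean 11, standard error .4); the grid of $\alpha$ values is an arbitrary choice for illustration. Loosening the standard of evidence (a larger $\alpha$) raises both the Type I error rate and the power:
+
+```r
+alpha <- c(.01, .05, .10)
+# Rejection cutoff for a one sided Z test at each level
+cutoff <- 10 + qnorm(1 - alpha) * .4
+# Probability of rejecting when the null is true ("convicting the innocent")
+type1 <- pnorm(cutoff, mean = 10, sd = .4, lower.tail = FALSE)
+# Probability of rejecting when the alternative is true ("convicting the guilty")
+power <- pnorm(cutoff, mean = 11, sd = .4, lower.tail = FALSE)
+cbind(alpha, type1, power)
+```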
+
+
+--- &multitext
+Consider the `mtcars` data set. 
+
+1. Give the p-value for a t-test assuming 
+constant variance comparing MPG for 6 and 8 cylinder cars as a proportion to 3 decimal places.
+2. Give the associated P-value for a z test.
+3. Give the common standard deviation estimate for MPG across cylinders to 3 decimal places.
+4. Would the t test reject at the two sided 0.05 level (0 for no 1 for yes)?
+
+
+*** .hint
+
+```r
+mpg8 <- mtcars$mpg[mtcars$cyl == 8]
+mpg6 <- mtcars$mpg[mtcars$cyl == 6]
+m8 <- mean(mpg8); m6 <- mean(mpg6)
+s8 <- sd(mpg8); s6 <- sd(mpg6)
+n8 <- length(mpg8); n6 <- length(mpg6)
+```
+
+
+*** .explanation
+
+```r
+p <- t.test(mpg8, mpg6, paired = FALSE, alternative="two.sided", var.equal=TRUE)$p.value
+mixprob <- (n8 - 1) / (n8 + n6 - 2)
+s <- sqrt(mixprob * s8 ^ 2  +  (1 - mixprob) * s6 ^ 2)
+z <- (m8 - m6) / (s * sqrt(1 / n8 + 1 / n6))
+pz <- 2 * pnorm(-abs(z))
+## Hand calculating the T just to check
+#2 * pt(-abs(z), df = n8 + n6 - 2)
+```
+
+
+1. <span class="answer">0</span> <br>
+2. <span class="answer">0</span><br>
+3. <span class="answer">2.27</span><br>
+4. <span class="answer">1</span>
+
+
+--- &radio
+The Bonferroni correction controls this
+
+1. False discovery rate
+2. _The familywise error rate_
+3. The rate of true rejections
+4. The number of true rejections
+
+*** .hint
+This is pretty much straight out of the notes
+
+*** .explanation
+The Bonferroni correction is the classic correction for the familywise error rate.
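+
+A small sketch of the correction in R, using made-up P-values purely for illustration: comparing each P-value to $\alpha / m$ gives the same decisions as comparing the `p.adjust` Bonferroni-adjusted P-values to $\alpha$.
+
+```r
+p <- c(0.001, 0.012, 0.030, 0.200)  # hypothetical P-values from m = 4 tests
+alpha <- 0.05
+# Bonferroni by hand: reject when p < alpha / m
+reject_manual <- p < alpha / length(p)
+# Equivalent: adjust the P-values (p * m, capped at 1) and compare to alpha
+reject_adjust <- p.adjust(p, method = "bonferroni") < alpha
+```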
+
+
diff --git a/06_StatisticalInference/lectures/01_01_Introduction.pdf b/06_StatisticalInference/lectures/01_01_Introduction.pdf
new file mode 100644
index 000000000..b50714770
Binary files /dev/null and b/06_StatisticalInference/lectures/01_01_Introduction.pdf differ
diff --git a/06_StatisticalInference/lectures/01_02_Probability.pdf b/06_StatisticalInference/lectures/01_02_Probability.pdf
new file mode 100644
index 000000000..fddebae9e
Binary files /dev/null and b/06_StatisticalInference/lectures/01_02_Probability.pdf differ
diff --git a/06_StatisticalInference/lectures/01_03_Expectations.pdf b/06_StatisticalInference/lectures/01_03_Expectations.pdf
new file mode 100644
index 000000000..aa71b7bd4
Binary files /dev/null and b/06_StatisticalInference/lectures/01_03_Expectations.pdf differ
diff --git a/06_StatisticalInference/lectures/01_04_Independence.pdf b/06_StatisticalInference/lectures/01_04_Independence.pdf
new file mode 100644
index 000000000..fd2201506
Binary files /dev/null and b/06_StatisticalInference/lectures/01_04_Independence.pdf differ
diff --git a/06_StatisticalInference/lectures/01_05_ConditionalProbability.pdf b/06_StatisticalInference/lectures/01_05_ConditionalProbability.pdf
new file mode 100644
index 000000000..a5a7edead
Binary files /dev/null and b/06_StatisticalInference/lectures/01_05_ConditionalProbability.pdf differ
diff --git a/06_StatisticalInference/lectures/02_01_CommonDistributions.pdf b/06_StatisticalInference/lectures/02_01_CommonDistributions.pdf
new file mode 100644
index 000000000..899e8bd29
Binary files /dev/null and b/06_StatisticalInference/lectures/02_01_CommonDistributions.pdf differ
diff --git a/06_StatisticalInference/lectures/02_02_Asymptopia.pdf b/06_StatisticalInference/lectures/02_02_Asymptopia.pdf
new file mode 100644
index 000000000..d43b4b4a8
Binary files /dev/null and b/06_StatisticalInference/lectures/02_02_Asymptopia.pdf differ
diff --git a/06_StatisticalInference/lectures/02_03_tCIs.pdf b/06_StatisticalInference/lectures/02_03_tCIs.pdf
new file mode 100644
index 000000000..19947b5e5
Binary files /dev/null and b/06_StatisticalInference/lectures/02_03_tCIs.pdf differ
diff --git a/06_StatisticalInference/lectures/02_04_Likeklihood.pdf b/06_StatisticalInference/lectures/02_04_Likeklihood.pdf
new file mode 100644
index 000000000..4e3388a80
Binary files /dev/null and b/06_StatisticalInference/lectures/02_04_Likeklihood.pdf differ
diff --git a/06_StatisticalInference/lectures/02_05_Bayes.pdf b/06_StatisticalInference/lectures/02_05_Bayes.pdf
new file mode 100644
index 000000000..ae65bc28f
Binary files /dev/null and b/06_StatisticalInference/lectures/02_05_Bayes.pdf differ
diff --git a/06_StatisticalInference/lectures/03_01_TwoGroupIntervals.pdf b/06_StatisticalInference/lectures/03_01_TwoGroupIntervals.pdf
new file mode 100644
index 000000000..b46a7169f
Binary files /dev/null and b/06_StatisticalInference/lectures/03_01_TwoGroupIntervals.pdf differ
diff --git a/06_StatisticalInference/lectures/03_02_HypothesisTesting.pdf b/06_StatisticalInference/lectures/03_02_HypothesisTesting.pdf
new file mode 100644
index 000000000..c5db5a783
Binary files /dev/null and b/06_StatisticalInference/lectures/03_02_HypothesisTesting.pdf differ
diff --git a/06_StatisticalInference/lectures/03_03_pValues.pdf b/06_StatisticalInference/lectures/03_03_pValues.pdf
new file mode 100644
index 000000000..ba31db25c
Binary files /dev/null and b/06_StatisticalInference/lectures/03_03_pValues.pdf differ
diff --git a/06_StatisticalInference/lectures/03_04_Power.pdf b/06_StatisticalInference/lectures/03_04_Power.pdf
new file mode 100644
index 000000000..b5e3024dc
Binary files /dev/null and b/06_StatisticalInference/lectures/03_04_Power.pdf differ
diff --git a/06_StatisticalInference/lectures/03_05_MultipleTesting.pdf b/06_StatisticalInference/lectures/03_05_MultipleTesting.pdf
new file mode 100644
index 000000000..88d17ad14
Binary files /dev/null and b/06_StatisticalInference/lectures/03_05_MultipleTesting.pdf differ
diff --git a/06_StatisticalInference/lectures/03_06_resampledInference.pdf b/06_StatisticalInference/lectures/03_06_resampledInference.pdf
new file mode 100644
index 000000000..ce8822a3c
Binary files /dev/null and b/06_StatisticalInference/lectures/03_06_resampledInference.pdf differ
diff --git a/06_StatisticalInference/makefile b/06_StatisticalInference/makefile
new file mode 100644
index 000000000..0e6c6de5c
--- /dev/null
+++ b/06_StatisticalInference/makefile
@@ -0,0 +1,28 @@
+DELAY = 1000
+RMD_FILES  = $(wildcard */index.Rmd)
+HTML_FILES = $(patsubst %.Rmd, %.html, $(RMD_FILES))
+PDF_FILES  = $(patsubst %.html, %.pdf, $(HTML_FILES))
+PDF_FILES2 = $(patsubst %/index.pdf, lectures/%.pdf, $(PDF_FILES))
+
+lectures: $(PDF_FILES2)
+lectures/%.pdf: %/index.pdf
+	cp $< $@
+
+files:
+	@echo $(RMD_FILES)
+	@echo $(HTML_FILES)
+	@echo $(PDF_FILES)
+	
+html: $(HTML_FILES)
+pdf: $(PDF_FILES)
+all: html pdf
+
+zip: $(PDF_FILES)
+	zip all_pdf_files.zip $^
+	
+#%/index.pdf: %/index.html
+#	casperjs makepdf.js $< $@ $(DELAY)
+
+#%/index.html: %/index.Rmd
+#	cd $(dir $<) && Rscript -e "slidify::slidify('index.Rmd')" && cd ..
+
diff --git a/06_StatisticalInference/makepdf.js b/06_StatisticalInference/makepdf.js
new file mode 100644
index 000000000..c01526f94
--- /dev/null
+++ b/06_StatisticalInference/makepdf.js
@@ -0,0 +1,10 @@
+var casper = require('casper').create({viewportSize:{width:1500,height:1000}});
+var args = casper.cli.args;
+var imgfile = (args[1] || Math.random().toString(36).slice(2))
+casper.start(args[0], function() {
+  this.wait(args[2], function(){
+    this.captureSelector(imgfile, "slides");
+  });
+});
+ 
+casper.run();
\ No newline at end of file
diff --git a/06_StatisticalInference/mathNotation/index.Rmd b/06_StatisticalInference/mathNotation/index.Rmd
deleted file mode 100644
index b5034e043..000000000
--- a/06_StatisticalInference/mathNotation/index.Rmd
+++ /dev/null
@@ -1,118 +0,0 @@
----
-title       : Math notation in R markdown
-subtitle    : 
-author      :  
-job         : Johns Hopkins Bloomberg School of Public Health
-logo        : bloomberg_shield.png
-framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
-highlighter : highlight.js  # {highlight.js, prettify, highlight}
-hitheme     : zenburn   # 
-url:
-  lib: ../../libraries
-  assets: ../../assets
-widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
-mode        : selfcontained # {standalone, draft}
----
-
-
-```{r setup, cache = F, echo = F, message = F, warning = F, tidy = F}
-# make this an external chunk that can be included in any file
-options(width = 70)
-opts_chunk$set(message = F, error = F, warning = F, comment = NA, fig.align = 'center', dpi = 100, tidy = F, cache = T, cache.path = '.cache/', fig.path = 'fig/')
-
-options(xtable.type = 'html')
-knit_hooks$set(inline = function(x) {
-  if(is.numeric(x)) {
-    round(x, getOption('digits'))
-  } else {
-    paste(as.character(x), collapse = ', ')
-  }
-})
-knit_hooks$set(plot = knitr:::hook_plot_html)
-```
-
-## Why math notation in R markdown?
-
-* Math notation is the standard for technical descriptions of machine learning/statistical models.
-* You may want to intersperse your technical decriptions with plain language descriptions. 
-* Math notation allows you to be precise
-  * "We fit a linear model with terms for age, sex" versus $Y_i = \alpha + \beta_a A_i + \beta_s S_i + \epsilon_i$
-* Math notation allows you to be concise
-  * "We estimated the intercept to be 3.3" versus $\hat{\alpha}=3.3$. 
-
----
-
-## What notation system does R markdown use?
-
-* R markdown uses the same system as [Latex](http://en.wikipedia.org/wiki/LaTeX) 
-* The basic idea:
-  * You write your document in R markdown
-  * You include symbols for math notation
-  * You indicate math by wrapping the symbols with certain text
-* Often the symbols are intuitive: _\alpha_ gives you $\alpha$
-
----
-
-## How to write math inline
-
-Including math in a sentence involves wrapping the symbols in $ symbols. For
-example if you write this in an R markdown file:
-
-<center> "The intercept was estimated as `$\hat{\alpha} = 4$`" </center>
-
-Then you get the following text after running _knit2html_ or _slidify_ on your document. 
-
-<center> "The intercept was estimated as $\hat{\alpha} = 4$" </center>
-
----
-
-## How to write math on a separate line
-
-Sometimes you have several equations you would like to line up. The
-way that you do that is with a double dollar sign \$$ and the align command.
-
-For example if you write
-
-<img class=center src=../../assets/img/mathNotation/aligned.png width='400px'/>
-
-Then you get the following text after running _knit2html_ or _slidify_ on your document. 
-
-$$
-  \begin{aligned}
-  y &= \beta_0 + \beta_1 + x_1 + \epsilon\\
-  x &= \gamma z \\
-  z &\sim N(0,1)
-  \end{aligned}
-$$
-
----
-
-## Common symbols
-
-* Subscripts to get $a_{b}$ write: `$a_{b}$`
-* Superscripts write $a^{b}$ write: `$a^{b}$`
-* Greek letters like $\alpha, \beta, \ldots$ write: `$\alpha, \beta, \ldots$`
-* Sums like $\sum_{n=1}^N$ write: `$\sum_{n=1}^N$`
-* Multiplication like $\times$ write: `$\times$`
-* Products like $\prod_{n=1}^N$ write: `$\prod_{n=1}^N$`
-* Inequalities like $<, \leq, \geq$ write: `$<, \leq, \geq$`
-* Distributed like $\sim$ write: `$\sim$`
-* Hats like $\widehat{\alpha}$ write: `$\widehat{\alpha}$`
-* Averages like $\bar{x}$ write: `$\bar{x}$`
-* Fractions like $\frac{a}{b}$ write: `$\frac{a}{b}$`
-* Big parentheses like $\left(\frac{a}{b}\right)$ write: `$\left(\frac{a}{b}\right)$`
-
-
----
-
-## For more information
-
-* Rstudio's equations page:
-  * http://www.rstudio.com/ide/docs/authoring/using_markdown_equations
-* Lists of Latex symbols
-  * http://www.rpi.edu/dept/arc/training/latex/LaTeX_symbols.pdf
-  * http://www.giss.nasa.gov/tools/latex/ltx-2.html
-  * http://omega.albany.edu:8008/Symbols.html
-
-
-
diff --git a/06_StatisticalInference/mathNotation/index.html b/06_StatisticalInference/mathNotation/index.html
deleted file mode 100644
index 2f9fa1bf3..000000000
--- a/06_StatisticalInference/mathNotation/index.html
+++ /dev/null
@@ -1,220 +0,0 @@
-<!DOCTYPE html>
-<html>
-<head>
-  <title>Math notation in R markdown</title>
-  <meta charset="utf-8">
-  <meta name="description" content="Math notation in R markdown">
-  <meta name="author" content="">
-  <meta name="generator" content="slidify" />
-  <meta name="apple-mobile-web-app-capable" content="yes">
-  <meta http-equiv="X-UA-Compatible" content="chrome=1">
-  <link rel="stylesheet" href="../../libraries/frameworks/io2012/css/default.css" media="all" >
-  <link rel="stylesheet" href="../../libraries/frameworks/io2012/phone.css" 
-    media="only screen and (max-device-width: 480px)" >
-  <link rel="stylesheet" href="../../libraries/frameworks/io2012/css/slidify.css" >
-  <link rel="stylesheet" href="../../libraries/highlighters/highlight.js/css/zenburn.css" />
-  <base target="_blank"> <!-- This amazingness opens all links in a new tab. -->
-  <script data-main="../../libraries/frameworks/io2012/js/slides" 
-    src="../../libraries/frameworks/io2012/js/require-1.0.8.min.js">
-  </script>
-  
-    <link rel="stylesheet" href = "../../assets/css/custom.css">
-<link rel="stylesheet" href = "../../assets/css/custom.css.BACKUP.546.css">
-<link rel="stylesheet" href = "../../assets/css/custom.css.BASE.546.css">
-<link rel="stylesheet" href = "../../assets/css/custom.css.LOCAL.546.css">
-<link rel="stylesheet" href = "../../assets/css/custom.css.orig">
-<link rel="stylesheet" href = "../../assets/css/custom.css.REMOTE.546.css">
-<link rel="stylesheet" href = "../../assets/css/ribbons.css">
-
-</head>
-<body style="opacity: 0">
-  <slides class="layout-widescreen">
-    
-    <!-- LOGO SLIDE -->
-    <!-- END LOGO SLIDE -->
-    
-
-    <!-- TITLE SLIDE -->
-    <!-- Should I move this to a Local Layout File? -->
-    <slide class="title-slide segue nobackground">
-      <aside class="gdbar">
-        <img src="../../assets/img/bloomberg_shield.png">
-      </aside>
-      <hgroup class="auto-fadein">
-        <h1>Math notation in R markdown</h1>
-        <h2></h2>
-        <p><br/>Johns Hopkins Bloomberg School of Public Health</p>
-      </hgroup>
-          </slide>
-
-    <!-- SLIDES -->
-      <slide class="" id="slide-1" style="background:;">
-  <hgroup>
-    <h2>Why math notation in R markdown?</h2>
-  </hgroup>
-  <article>
-    <ul>
-<li>Math notation is the standard for technical descriptions of machine learning/statistical models.</li>
-<li>You may want to intersperse your technical decriptions with plain language descriptions. </li>
-<li>Math notation allows you to be precise
-
-<ul>
-<li>&quot;We fit a linear model with terms for age, sex&quot; versus \(Y_i = \alpha + \beta_a A_i + \beta_s S_i + \epsilon_i\)</li>
-</ul></li>
-<li>Math notation allows you to be concise
-
-<ul>
-<li>&quot;We estimated the intercept to be 3.3&quot; versus \(\hat{\alpha}=3.3\). </li>
-</ul></li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-      <slide class="" id="slide-2" style="background:;">
-  <hgroup>
-    <h2>What notation system does R markdown use?</h2>
-  </hgroup>
-  <article>
-    <ul>
-<li>R markdown uses the same system as <a href="http://en.wikipedia.org/wiki/LaTeX">Latex</a> </li>
-<li>The basic idea:
-
-<ul>
-<li>You write your document in R markdown</li>
-<li>You include symbols for math notation</li>
-<li>You indicate math by wrapping the symbols with certain text</li>
-</ul></li>
-<li>Often the symbols are intuitive: <em>\alpha</em> gives you \(\alpha\)</li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-      <slide class="" id="slide-3" style="background:;">
-  <hgroup>
-    <h2>How to write math inline</h2>
-  </hgroup>
-  <article>
-    <p>Including math in a sentence involves wrapping the symbols in $ symbols. For
-example if you write this in an R markdown file:</p>
-
-<p><center> &quot;The intercept was estimated as <code>$\hat{\alpha} = 4$</code>&quot; </center></p>
-
-<p>Then you get the following text after running <em>knit2html</em> or <em>slidify</em> on your document. </p>
-
-<p><center> &quot;The intercept was estimated as \(\hat{\alpha} = 4\)&quot; </center></p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-      <slide class="" id="slide-4" style="background:;">
-  <hgroup>
-    <h2>How to write math on a separate line</h2>
-  </hgroup>
-  <article>
-    <p>Sometimes you have several equations you would like to line up. The
-way that you do that is with a double dollar sign $$ and the align command.</p>
-
-<p>For example if you write</p>
-
-<p><img class=center src=../../assets/img/mathNotation/aligned.png width='400px'/></p>
-
-<p>Then you get the following text after running <em>knit2html</em> or <em>slidify</em> on your document. </p>
-
-<p>\[
-  \begin{aligned}
-  y &= \beta_0 + \beta_1 + x_1 + \epsilon\\
-  x &= \gamma z \\
-  z &\sim N(0,1)
-  \end{aligned}
-\]</p>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-      <slide class="" id="slide-5" style="background:;">
-  <hgroup>
-    <h2>Common symbols</h2>
-  </hgroup>
-  <article>
-    <ul>
-<li>Subscripts to get \(a_{b}\) write: <code>$a_{b}$</code></li>
-<li>Superscripts write \(a^{b}\) write: <code>$a^{b}$</code></li>
-<li>Greek letters like \(\alpha, \beta, \ldots\) write: <code>$\alpha, \beta, \ldots$</code></li>
-<li>Sums like \(\sum_{n=1}^N\) write: <code>$\sum_{n=1}^N$</code></li>
-<li>Multiplication like \(\times\) write: <code>$\times$</code></li>
-<li>Products like \(\prod_{n=1}^N\) write: <code>$\prod_{n=1}^N$</code></li>
-<li>Inequalities like \(<, \leq, \geq\) write: <code>$&lt;, \leq, \geq$</code></li>
-<li>Distributed like \(\sim\) write: <code>$\sim$</code></li>
-<li>Hats like \(\widehat{\alpha}\) write: <code>$\widehat{\alpha}$</code></li>
-<li>Averages like \(\bar{x}\) write: <code>$\bar{x}$</code></li>
-<li>Fractions like \(\frac{a}{b}\) write: <code>$\frac{a}{b}$</code></li>
-<li>Big parentheses like \(\left(\frac{a}{b}\right)\) write: <code>$\left(\frac{a}{b}\right)$</code></li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-      <slide class="" id="slide-6" style="background:;">
-  <hgroup>
-    <h2>For more information</h2>
-  </hgroup>
-  <article>
-    <ul>
-<li>RStudio&#39;s equations page:
-
-<ul>
-<li><a href="http://www.rstudio.com/ide/docs/authoring/using_markdown_equations">http://www.rstudio.com/ide/docs/authoring/using_markdown_equations</a></li>
-</ul></li>
-<li>Lists of LaTeX symbols
-
-<ul>
-<li><a href="http://www.rpi.edu/dept/arc/training/latex/LaTeX_symbols.pdf">http://www.rpi.edu/dept/arc/training/latex/LaTeX_symbols.pdf</a></li>
-<li><a href="http://www.giss.nasa.gov/tools/latex/ltx-2.html">http://www.giss.nasa.gov/tools/latex/ltx-2.html</a></li>
-<li><a href="http://omega.albany.edu:8008/Symbols.html">http://omega.albany.edu:8008/Symbols.html</a></li>
-</ul></li>
-</ul>
-
-  </article>
-  <!-- Presenter Notes -->
-</slide>
-
-    <slide class="backdrop"></slide>
-  </slides>
-
-  <!--[if IE]>
-    <script 
-      src="http://ajax.googleapis.com/ajax/libs/chrome-frame/1/CFInstall.min.js">  
-    </script>
-    <script>CFInstall.check({mode: 'overlay'});</script>
-  <![endif]-->
-</body>
-<!-- Grab CDN jQuery, fall back to local if offline -->
-<script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.7.min.js"></script>
-<script>window.jQuery || document.write('<script src="../../libraries/widgets/quiz/js/jquery-1.7.min.js"><\/script>')</script>
-<!-- Load Javascripts for Widgets -->
-<!-- MathJax: Fall back to local if CDN offline but local image fonts are not supported (saves >100MB) -->
-<script type="text/x-mathjax-config">
-  MathJax.Hub.Config({
-    tex2jax: {
-      inlineMath: [['$','$'], ['\\(','\\)']],
-      processEscapes: true
-    }
-  });
-</script>
-<script type="text/javascript" src="http://cdn.mathjax.org/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
-<!-- <script src="https://c328740.ssl.cf1.rackcdn.com/mathjax/2.0-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
-</script> -->
-<script>window.MathJax || document.write('<script type="text/x-mathjax-config">MathJax.Hub.Config({"HTML-CSS":{imageFont:null}});<\/script><script src="../../libraries/widgets/mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"><\/script>')
-</script>
-<!-- LOAD HIGHLIGHTER JS FILES -->
-<script src="../../libraries/highlighters/highlight.js/highlight.pack.js"></script>
-<script>hljs.initHighlightingOnLoad();</script>
-<!-- DONE LOADING HIGHLIGHTER JS FILES -->
-</html>
\ No newline at end of file
diff --git a/06_StatisticalInference/mathNotation/index.md b/06_StatisticalInference/mathNotation/index.md
deleted file mode 100644
index fcb77d8e8..000000000
--- a/06_StatisticalInference/mathNotation/index.md
+++ /dev/null
@@ -1,105 +0,0 @@
----
-title       : Math notation in R markdown
-subtitle    : 
-author      :  
-job         : Johns Hopkins Bloomberg School of Public Health
-logo        : bloomberg_shield.png
-framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
-highlighter : highlight.js  # {highlight.js, prettify, highlight}
-hitheme     : zenburn   # 
-url:
-  lib: ../../libraries
-  assets: ../../assets
-widgets     : [mathjax]            # {mathjax, quiz, bootstrap}
-mode        : selfcontained # {standalone, draft}
----
-
-
-
-
-
-## Why math notation in R markdown?
-
-* Math notation is the standard for technical descriptions of machine learning/statistical models.
-* You may want to intersperse your technical descriptions with plain-language descriptions. 
-* Math notation allows you to be precise
-  * "We fit a linear model with terms for age, sex" versus $Y_i = \alpha + \beta_a A_i + \beta_s S_i + \epsilon_i$
-* Math notation allows you to be concise
-  * "We estimated the intercept to be 3.3" versus $\hat{\alpha}=3.3$. 
-
----
-
-## What notation system does R markdown use?
-
-* R markdown uses the same system as [LaTeX](http://en.wikipedia.org/wiki/LaTeX) 
-* The basic idea:
-  * You write your document in R markdown
-  * You include symbols for math notation
-  * You indicate math by wrapping the symbols with certain text
-* Often the symbols are intuitive: _\alpha_ gives you $\alpha$
-
----
-
-## How to write math inline
-
-Including math in a sentence involves wrapping the symbols in $ symbols. For
-example, if you write this in an R markdown file:
-
-<center> "The intercept was estimated as `$\hat{\alpha} = 4$`" </center>
-
-Then you get the following text after running _knit2html_ or _slidify_ on your document. 
-
-<center> "The intercept was estimated as $\hat{\alpha} = 4$" </center>
-
----
-
-## How to write math on a separate line
-
-Sometimes you have several equations you would like to line up. The
-way to do that is with a double dollar sign \$$ and the aligned environment.
-
-For example, if you write
-
-<img class=center src=../../assets/img/mathNotation/aligned.png width='400px'/>
-
-Then you get the following text after running _knit2html_ or _slidify_ on your document. 
-
-$$
-  \begin{aligned}
-  y &= \beta_0 + \beta_1 x_1 + \epsilon\\
-  x &= \gamma z \\
-  z &\sim N(0,1)
-  \end{aligned}
-$$
-
----
-
-## Common symbols
-
-* Subscripts to get $a_{b}$ write: `$a_{b}$`
-* Superscripts to get $a^{b}$ write: `$a^{b}$`
-* Greek letters like $\alpha, \beta, \ldots$ write: `$\alpha, \beta, \ldots$`
-* Sums like $\sum_{n=1}^N$ write: `$\sum_{n=1}^N$`
-* Multiplication like $\times$ write: `$\times$`
-* Products like $\prod_{n=1}^N$ write: `$\prod_{n=1}^N$`
-* Inequalities like $<, \leq, \geq$ write: `$<, \leq, \geq$`
-* Distributed like $\sim$ write: `$\sim$`
-* Hats like $\widehat{\alpha}$ write: `$\widehat{\alpha}$`
-* Averages like $\bar{x}$ write: `$\bar{x}$`
-* Fractions like $\frac{a}{b}$ write: `$\frac{a}{b}$`
-* Big parentheses like $\left(\frac{a}{b}\right)$ write: `$\left(\frac{a}{b}\right)$`
-
-
----
-
-## For more information
-
-* RStudio's equations page:
-  * http://www.rstudio.com/ide/docs/authoring/using_markdown_equations
-* Lists of LaTeX symbols
-  * http://www.rpi.edu/dept/arc/training/latex/LaTeX_symbols.pdf
-  * http://www.giss.nasa.gov/tools/latex/ltx-2.html
-  * http://omega.albany.edu:8008/Symbols.html
-
-
-
diff --git a/09_DevelopingDataProducts/rStudioPresent/index.md b/09_DevelopingDataProducts/rStudioPresent/index.md
index b998542ae..6b5c3334e 100644
--- a/09_DevelopingDataProducts/rStudioPresent/index.md
+++ b/09_DevelopingDataProducts/rStudioPresent/index.md
@@ -1,7 +1,7 @@
 RStudio Presenter
 ===
 author: Brian Caffo, Jeff Leek Roger Peng
-date: May 21 2014
+date: May 24 2014
 transition: rotate
 
 <small>