pwills.com feed: Peter Wills (peter@pwills.com)
Types as Propositions (2018-11-30): http://www.pwills.com/blog/posts/2018/11/30/types
<p>Some of the most meaningful mathematical realizations that I’ve had have been unexpected connections between two topics; that is, realizing that two concepts that first appeared quite distinct are in fact one and the same. In our first linear algebra courses, we learn that manipulating matrices is, in fact, equivalent to solving systems of equations. In quantum mechanics, we see that <a href="https://en.wikipedia.org/wiki/Observable">physically observable quantities</a> are, mathematically speaking, linear operators (I still don’t quite grok this one). And, my personal favorite example, we learn in functional analysis that the linear functionals in the dual space of a Hilbert space are themselves in perfect correspondence with the functions in the original space.<sup id="fnref:fnote1"><a href="#fn:fnote1" class="footnote">1</a></sup></p> <p>Recently, I’ve stumbled upon another such result, which has captured my attention for a while. The result, often referred to as the Curry-Howard correspondence, is the statement that propositions in a formal logical system are equivalent to types in the simply typed lambda calculus. Loosely, this means that <strong>logical statements are equivalent to data types</strong>!</p> <p>Let’s unpack that a bit; “propositions” are just statements in a logical system.<sup id="fnref:fnote15"><a href="#fn:fnote15" class="footnote">2</a></sup> In mathematics, for example, one might put forward the proposition “no even numbers are prime,” or “14 is greater than 18”. 
Note that propositions need not be <em>true</em>; in fact, some logical systems support propositions that cannot even be determined to be true or false.<sup id="fnref:fnote2"><a href="#fn:fnote2" class="footnote">3</a></sup> “Types” can be thought of as types in a computing language; <code class="highlighter-rouge">Integer</code>, <code class="highlighter-rouge">Boolean</code>, and so on. We will have much more to say about types as we move forward, but for now, hold in your mind the conventional notion of types as defined in a language such as Java or Python (or better yet, Haskell).</p> <p>How on earth could these two be in correspondence? On the surface, they appear to be entirely separate concepts. In this post, I’ll spend some time unpacking what this equivalence is actually saying, using a simple example. I am far from a full understanding of it, but as usual, I write about it in the hopes that I’ll be forced to clarify what I <em>do</em> understand, or even better, be corrected by someone more knowledgeable than myself.</p> <p>Speaking of those more knowledgeable than myself, there are various resources online that I found very helpful in understanding the correspondence: <a href="https://www.youtube.com/watch?v=IOiZatlZtGU&amp;t=1176s">Philip Wadler’s talk</a> on the subject is a great starting point, and there are a number of <a href="http://lambda-the-ultimate.org/node/1532">useful</a> <a href="https://stackoverflow.com/questions/2969140/what-are-the-most-interesting-equivalences-arising-from-the-curry-howard-isomorp">discussions</a> <a href="https://stackoverflow.com/questions/2829347/a-question-about-logic-and-the-curry-howard-correspondence">available</a> on StackExchange and various functional programming forums.</p> <h2 id="an-example">An Example</h2> <p>I was confused by the idea of propositions as types when I first encountered it, and after learning more, I believe that the root of my confusion lies in the fact that types such as <code 
class="highlighter-rouge">Integer</code>, <code class="highlighter-rouge">Boolean</code>, and <code class="highlighter-rouge">String</code>, which we are familiar with from programming, correspond to very trivial propositions, making them poor examples. We’ll have to introduce something a bit fancier; a <em>conditional type</em>. For example, <code class="highlighter-rouge">OddInt</code> might be odd Integers, and <code class="highlighter-rouge">PrimeInt</code> might be prime integers. We’ll approximate these conditional types with custom classes in Scala. Classes and types are <a href="https://stackoverflow.com/questions/5031640/what-is-the-difference-between-a-class-and-a-type-in-scala-and-java">different beasts</a>, of course, but we will ignore that distinction in this post.<sup id="fnref:fnote3"><a href="#fn:fnote3" class="footnote">4</a></sup></p> <p>Let’s consider one conditional type in particular: <code class="highlighter-rouge">BigInteger</code>. This type (actually a class in this example) is defined as follows:</p> <figure class="highlight"><pre><code class="language-scala" data-lang="scala"><span class="k">class</span> <span class="nc">BigInteger</span> <span class="o">(</span><span class="k">val</span> <span class="n">value</span><span class="k">:</span> <span class="kt">Int</span><span class="o">)</span> <span class="o">{</span> <span class="k">private</span> <span class="k">final</span> <span class="k">val</span> <span class="nc">LOWER_BOUND</span> <span class="k">=</span> <span class="mi">10000</span> <span class="k">if</span> <span class="o">(</span><span class="n">value</span> <span class="o">&lt;</span> <span class="nc">LOWER_BOUND</span><span class="o">)</span> <span class="o">{</span> <span class="k">throw</span> <span class="k">new</span> <span class="nc">IllegalArgumentException</span><span class="o">(</span><span class="s">"Too small!"</span><span class="o">)</span> <span class="o">}</span> <span class="k">override</span> <span 
class="k">def</span> <span class="n">toString</span> <span class="k">=</span> <span class="n">s</span><span class="s">"BigInteger($value)"</span> <span class="o">}</span></code></pre></figure> <p>One could then instantiate a <code class="highlighter-rouge">BigInteger</code> as follows:</p> <figure class="highlight"><pre><code class="language-scala" data-lang="scala"><span class="k">val</span> <span class="n">big</span> <span class="k">=</span> <span class="k">new</span> <span class="nc">BigInteger</span><span class="o">(</span><span class="mi">10001</span><span class="o">)</span> <span class="c1">// res0: BigInteger(10001) </span> <span class="k">val</span> <span class="n">small</span> <span class="k">=</span> <span class="k">new</span> <span class="nc">BigInteger</span><span class="o">(</span><span class="mi">500</span><span class="o">)</span> <span class="o">//</span> <span class="n">java</span><span class="o">.</span><span class="n">lang</span><span class="o">.</span><span class="nc">IllegalArgumentException</span><span class="k">:</span> <span class="kt">Too</span> <span class="kt">small!</span></code></pre></figure> <p>Now the fundamental question: what proposition corresponds to this type? In simple scenarios like this, the corresponding proposition is that the type can be <em>inhabited</em>; that is, there exists a value that satisfies that type. For example, the type <code class="highlighter-rouge">BigInteger</code> corresponds to the claim “there exists an integer $$i$$ for which $$i &gt; 10,000$$”. Obviously, such an integer exists, and the fact that we can instantiate this type indicates that it corresponds to a true proposition. Alternatively, consider a type <code class="highlighter-rouge">WeirdInteger</code>, which is an integer satisfying <code class="highlighter-rouge">i &lt; 3 &amp;&amp; i &gt; 5</code>. 
We can define the type well enough, but there are no values which satisfy it; it is an uninhabitable type, and so corresponds to a false proposition.</p> <h2 id="functions-and-implication">Functions and Implication</h2> <p>Let’s make things a little more interesting. In programming languages, there are not only primitive types like <code class="highlighter-rouge">Integer</code> and <code class="highlighter-rouge">Boolean</code>, but there are also <strong>function types</strong>, which are the types of functions. For example, in Scala, the function <code class="highlighter-rouge">def f(x: Int) = x.toString</code> has type <code class="highlighter-rouge">Int =&gt; String</code>, which is to say it is a function that maps integers to strings.</p> <p>What sort of propositions would <em>functions</em> correspond to? It turns out that functions naturally map to <em>implication</em>. In some ways, the correspondence here is very natural. Consider the conditional type <code class="highlighter-rouge">BigInteger</code>, and the conditional type <code class="highlighter-rouge">BiggerInteger</code>. 
The definition of the latter should look familiar, from above:</p> <figure class="highlight"><pre><code class="language-scala" data-lang="scala"><span class="k">class</span> <span class="nc">BiggerInteger</span> <span class="o">(</span><span class="k">val</span> <span class="n">value</span><span class="k">:</span> <span class="kt">Int</span><span class="o">)</span> <span class="o">{</span> <span class="k">private</span> <span class="k">final</span> <span class="k">val</span> <span class="nc">LOWER_BOUND</span> <span class="k">=</span> <span class="mi">20000</span> <span class="k">if</span> <span class="o">(</span><span class="n">value</span> <span class="o">&lt;</span> <span class="nc">LOWER_BOUND</span><span class="o">)</span> <span class="o">{</span> <span class="k">throw</span> <span class="k">new</span> <span class="nc">IllegalArgumentException</span><span class="o">(</span><span class="s">"Too small!"</span><span class="o">)</span> <span class="o">}</span> <span class="k">override</span> <span class="k">def</span> <span class="n">toString</span> <span class="k">=</span> <span class="n">s</span><span class="s">"BiggerInteger($value)"</span> <span class="o">}</span></code></pre></figure> <p>Now, we can write a function that maps <code class="highlighter-rouge">BigInteger</code> to <code class="highlighter-rouge">BiggerInteger</code>:</p> <figure class="highlight"><pre><code class="language-scala" data-lang="scala"><span class="k">def</span> <span class="n">makeBigger</span><span class="o">(</span><span class="n">b</span><span class="k">:</span> <span class="kt">BigInteger</span><span class="o">)</span><span class="k">:</span> <span class="kt">BiggerInteger</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">BiggerInteger</span><span class="o">(</span><span class="n">b</span><span class="o">.</span><span class="n">value</span> <span class="o">*</span> <span class="mi">2</span><span class="o">)</span></code></pre></figure> <p>Recall that 
the proposition corresponding to the type <code class="highlighter-rouge">BigInteger</code> is the statement “there exists an integer greater than 10,000”, and the proposition corresponding to <code class="highlighter-rouge">BiggerInteger</code> is the statement “there exists an integer greater than 20,000”; the proposition corresponding to the function type <code class="highlighter-rouge">BigInteger =&gt; BiggerInteger</code> is then just the statement “the existence of an integer above 10,000 implies the existence of an integer above 20,000”. And note that, as it should be for an implication, we do not care whether there actually <em>does</em> exist an integer above 10,000; we simply know that <em>if</em> one exists, then its existence implies the existence of an integer above 20,000.</p> <p>To be a bit more explicit, the function that we wrote above can be thought of as a <strong>proof</strong> of the implication; in particular, if we suppose that there exists an $$i$$ such that $$i &gt; 10,000$$, then clearly $$2i &gt; 20,000$$, and so if we let $$j=2i$$, then we have proven the existence of a $$j$$ such that $$j &gt; 20,000$$. This is what the theoretical computer scientists mean when they say that “programs are proofs”.</p> <p>Of course, Scala is not a proof-checking language, and cannot tell during compilation that the function <code class="highlighter-rouge">makeBigger</code> is valid; we would need a much richer type system to be able to validate such functions. 
Consider that the following function compiles with no problem, although there are no input values for which it will not throw a (runtime) exception:</p> <figure class="highlight"><pre><code class="language-scala" data-lang="scala"><span class="k">def</span> <span class="n">wonky</span><span class="o">(</span><span class="n">b</span><span class="k">:</span> <span class="kt">BigInteger</span><span class="o">)</span><span class="k">:</span> <span class="kt">BiggerInteger</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">BiggerInteger</span><span class="o">(</span><span class="n">b</span><span class="o">.</span><span class="n">value</span> <span class="o">%</span> <span class="mi">1000</span><span class="o">)</span></code></pre></figure> <h3 id="wait-what">Wait… what?</h3> <p>If you think about it a bit more, it’s sort of a weird example; you could map <em>any</em> type to <code class="highlighter-rouge">BiggerInteger</code>, just by doing <code class="highlighter-rouge">def f[A](a:A): BiggerInteger = new BiggerInteger(20001)</code>. This is because the proposition that corresponds to <code class="highlighter-rouge">BiggerInteger</code> is true (the type is inhabitable), and if B is true, then A implies B for any A at all.</p> <p>Common languages such as Haskell only express very trivial propositions with their types; there does exist one uninhabitable type (<code class="highlighter-rouge">void</code>), but I have not found much use for it in practice. 
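<p>To make the uninhabited case concrete, here is a rough Python analogue of the <code class="highlighter-rouge">WeirdInteger</code> type from above (my own sketch, mirroring the Scala classes; note that Python checks the constraint at runtime, not compile time):</p>

```python
class WeirdInteger:
    """An integer i satisfying i < 3 and i > 5 -- a constraint nothing meets."""

    def __init__(self, value: int):
        if not (value < 3 and value > 5):  # no integer passes this check
            raise ValueError("No integer is both < 3 and > 5!")
        self.value = value

# Every attempted construction fails, so the type is uninhabited:
# it corresponds to a false proposition.
inhabited = False
for i in range(-1000, 1000):
    try:
        WeirdInteger(i)
        inhabited = True
    except ValueError:
        pass
print(inhabited)  # remains False: the type has no values
```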
The benefit of using conditional types for these examples is that we can explore at least some types which have corresponding <em>false</em> propositions, such as <code class="highlighter-rouge">WeirdInteger</code>, which is an integer <code class="highlighter-rouge">i</code> satisfying <code class="highlighter-rouge">i &lt; 3 &amp;&amp; i &gt; 5</code>.</p> <h2 id="in-conclusion">In Conclusion</h2> <p>Seeing all this, you can begin to get a sense of how computer-assisted proof techniques might arise out of it. If the fact that a program compiles is equivalent to the truth of the corresponding proposition, then all we need is a language with a rich enough type system to express interesting statements. Examples of languages used in this way include <a href="https://coq.inria.fr/">Coq</a> and <a href="https://en.wikipedia.org/wiki/Agda_(programming_language)">Agda</a>. A thorough discussion of such languages is beyond both the scope of this post and my understanding.</p> <p>I think what keeps me interested in this subject is that it still remains quite opaque to me; I’ve struggled to even come up with these simple (and flawed) examples of how the Curry-Howard correspondence plays out in practice. I hope that anyone reading this who understands the subject better than I do will leave a detailed list of my misunderstandings, so that I can better grasp this mysterious and fascinating topic.</p> <!-------------------------------- FOOTER ----------------------------> <!-- Wish we could put this in _includes/scripts.html. But it doesn't run from --> <!-- there. It needs to be run at the bottom of the file, rather than at the --> <!-- top; perhaps that has something to do with it. Anyways, I'll just include --> <!-- this chunk of HTML at the footer of all my posts, even though its fugly. --> <div id="disqus_thread"></div> <script> /** * RECOMMENDED CONFIGURATION VARIABLES: EDIT AND UNCOMMENT THE SECTION BELOW TO INSERT DYNAMIC VALUES FROM YOUR PLATFORM OR CMS. 
* LEARN WHY DEFINING THESE VARIABLES IS IMPORTANT: https://disqus.com/admin/universalcode/#configuration-variables*/ /* var disqus_config = function () { this.page.url = PAGE_URL; // Replace PAGE_URL with your page's canonical URL variable this.page.identifier = PAGE_IDENTIFIER; // Replace PAGE_IDENTIFIER with your page's unique identifier variable }; */ (function() { // DON'T EDIT BELOW THIS LINE var d = document, s = d.createElement('script'); s.src = 'https://pwills-com.disqus.com/embed.js'; s.setAttribute('data-timestamp', +new Date()); (d.head || d.body).appendChild(s); })(); </script> <noscript>Please enable JavaScript to view the <a href="https://disqus.com/?ref_noscript">comments powered by Disqus.</a></noscript> <div class="footnotes"> <ol> <li id="fn:fnote1"> <p>This statement is difficult to understand without background in functional analysis, but it is in fact one of the most beautiful examples of such an equivalence result. <a href="#fnref:fnote1" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:fnote15"> <p>I’m being a bit sloppy here. The type of logic we’re talking about here is not classical logic, but rather in the sense of <a href="https://en.wikipedia.org/wiki/Natural_deduction">natural deduction</a>. <a href="#fnref:fnote15" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:fnote2"> <p>Such systems are called undecidable; see <a href="https://en.wikipedia.org/wiki/Decidability_(logic)">the wiki entry on decidability</a> for more information. <a href="#fnref:fnote2" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:fnote3"> <p>We won’t be careful about whether the idea of conditional types presented here corresponds well with conditional types as they are actually implemented in programming languages such as <a href="https://github.com/Microsoft/TypeScript/pull/21316">Typescript</a>. 
<a href="#fnref:fnote3" class="reversefootnote">&#8617;</a></p> </li> </ol> </div>
Inverse Transform Sampling in Python, by Peter Wills (peter@pwills.com), 2018-06-24: http://www.pwills.com/blog/posts/2018/06/24/sampling
<p>When doing data work, we often need to sample random variables. This is easy to do if one wishes to sample from a Gaussian, or a uniform random variable, or a variety of other common distributions, but what if we want to sample from an arbitrary distribution? There is no obvious way to do this within <code class="highlighter-rouge">scipy.stats</code>. So, I built a small library, <a href="https://www.github.com/peterewills/itsample"><code class="highlighter-rouge">inverse-transform-sample</code></a>, that allows for sampling from arbitrary user-provided distributions. 
In use, it looks like this:</p> <figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span> <span class="n">pdf</span> <span class="o">=</span> <span class="k">lambda</span> <span class="n">x</span><span class="p">:</span> <span class="n">np</span><span class="o">.</span><span class="n">exp</span><span class="p">(</span><span class="o">-</span><span class="n">x</span><span class="o">**</span><span class="mi">2</span><span class="o">/</span><span class="mi">2</span><span class="p">)</span> <span class="c"># unit Gaussian, not normalized</span> <span class="kn">from</span> <span class="nn">itsample</span> <span class="kn">import</span> <span class="n">sample</span> <span class="n">samples</span> <span class="o">=</span> <span class="n">sample</span><span class="p">(</span><span class="n">pdf</span><span class="p">,</span><span class="mi">1000</span><span class="p">)</span> <span class="c"># generate 1000 samples from pdf </span></code></pre></figure> <p>The code is available <a href="https://www.github.com/peterewills/itsample">on GitHub</a>. In this post, I’ll outline the theory of <a href="https://en.wikipedia.org/wiki/Inverse_transform_sampling">inverse transform sampling</a>, discuss computational details, and describe some of the challenges faced in implementation.</p> <h2 id="introduction-to-inverse-transform-sampling">Introduction to Inverse Transform Sampling</h2> <p>Suppose we have a probability density function $$p(x)$$, which has an associated cumulative distribution function (CDF) $$F(x)$$, defined as usual by</p> <script type="math/tex; mode=display">F(x) = \int_{-\infty}^x p(s)ds.</script> <p>Recall that the cumulative distribution function $$F(x)$$ tells us <em>the probability that a random sample from $$p$$ is less than or equal to $$x$$</em>.</p> <p>Let’s take a second to notice something here. 
If we knew, for some $$x$$, that $$F(x)=t$$, then drawing $$x$$ from $$p$$ is in some way <strong>equivalent to drawing $$t$$ from a uniform random variable on $$[0,1]$$</strong>, since the CDF for a uniform random variable is $$F_u(t) = t$$.<sup id="fnref:fnote1"><a href="#fn:fnote1" class="footnote">1</a></sup></p> <p>That realization is the basis for inverse transform sampling. The procedure is:</p> <ol> <li>Draw a sample $$t$$ uniformly from the interval $$[0,1]$$.</li> <li>Solve the equation $$F(x)=t$$ for $$x$$ (invert the CDF).</li> <li>Return the resulting $$x$$ as the sample from $$p$$.</li> </ol> <h2 id="computational-considerations">Computational Considerations</h2> <p>Most of the computational work done in the above algorithm comes in at step 2, in which the CDF is inverted.<sup id="fnref:fnote2"><a href="#fn:fnote2" class="footnote">2</a></sup> Consider Newton’s method, a typical routine for finding numerical solutions to equations: the approach is iterative, and so the function to be inverted, in our case the CDF $$F(x)$$, is evaluated many times. Now, in our case, since $$F$$ is a (numerically computed) integral of $$p$$, this means that we will have to run our numerical quadrature routine once for each evaluation of $$F$$. Since we need <em>many</em> evaluations of $$F$$ for a single sample, this can lead to a significant slowdown in sampling.</p> <p>Again, the pain point here is that our CDF $$F(x)$$ is slow to evaluate, because each evaluation requires numerical quadrature. What we need is an approximation of the CDF that is fast to evaluate, as well as accurate.</p> <h3 id="chebyshev-approximation-of-the-cdf">Chebyshev Approximation of the CDF</h3> <p>I snooped around on the internet a bit, and found <a href="https://github.com/scipy/scipy/issues/3747">this feature request</a> for scipy, which is related to this same issue. 
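<p>The three-step procedure above translates almost directly into code. Here is a minimal quadrature-based sketch (the function names are my own, not the <code class="highlighter-rouge">itsample</code> API); it also makes the cost visible, since the CDF calls <code class="highlighter-rouge">quad</code> on every evaluation and the root-finder evaluates the CDF many times per sample:</p>

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq

def sample_by_inversion(pdf, lo, hi, n_samples, seed=0):
    """Inverse transform sampling: draw t ~ U[0,1], then solve F(x) = t."""
    norm = quad(pdf, lo, hi)[0]                 # normalize the (unnormalized) PDF
    cdf = lambda x: quad(pdf, lo, x)[0] / norm  # numerically computed CDF
    ts = np.random.default_rng(seed).uniform(0, 1, n_samples)  # step 1
    # steps 2 & 3: invert the CDF for each uniform draw
    return np.array([brentq(lambda x: cdf(x) - t, lo, hi) for t in ts])

samples = sample_by_inversion(lambda x: np.exp(-x**2 / 2), -6, 6, 200)
```

<p>For the unit Gaussian this recovers a sample mean near 0 and standard deviation near 1, at the cost of one full quadrature per CDF evaluation.</p>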
Although it never got off the ground, I found an interesting link to <a href="https://arxiv.org/pdf/1307.1223.pdf">a 2013 paper by Olver &amp; Townsend</a>, in which they suggest using Chebyshev polynomials to approximate the PDF. The advantage of this approach is that the integral of a series of Chebyshev polynomials is known analytically - that is, if we know the Chebyshev expansion of the PDF, we automatically know the Chebyshev expansion of the CDF as well. This should allow us to rapidly invert the (Chebyshev approximation of the) CDF, and thus sample from the distribution efficiently.</p> <h3 id="other-approaches">Other Approaches</h3> <p>There are also less mathematically sophisticated approaches that immediately present themselves. One might consider solving $$F(x)=t$$ on a grid of $$t$$ values, and then building the function $$F^{-1}(x)$$ by interpolation. One could even simply transform the provided PDF into a histogram, and then use the functionality built in to <code class="highlighter-rouge">scipy.stats</code> for sampling from a provided histogram (more on that later). However, due to time constraints, <code class="highlighter-rouge">inverse-transform-sample</code> only includes the numerical quadrature and Chebyshev approaches.</p> <h2 id="implementation-in-python">Implementation in Python</h2> <p>The implementation of this approach is not horribly sophisticated, but in exchange it exhibits that wonderful readability characteristic of Python code. 
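<p>The Chebyshev idea can be sketched with numpy’s built-in polynomial tools (a sketch of the concept only, not the <code class="highlighter-rouge">itsample</code> implementation): fit the PDF once, integrate the series exactly, and the resulting CDF is cheap to evaluate.</p>

```python
import numpy as np
from numpy.polynomial import Chebyshev
from scipy.optimize import brentq

def make_chebyshev_cdf(pdf, lo, hi, deg=64):
    """Fit the PDF with a Chebyshev series; its antiderivative is known exactly."""
    cheb_pdf = Chebyshev.interpolate(pdf, deg, domain=[lo, hi])
    antideriv = cheb_pdf.integ()   # analytic integral of the Chebyshev series
    total = antideriv(hi) - antideriv(lo)
    return lambda x: (antideriv(x) - antideriv(lo)) / total

cdf = make_chebyshev_cdf(lambda x: np.exp(-x**2 / 2), -6, 6)
# Sampling is the same inversion as before, but each CDF call is now cheap.
ts = np.random.default_rng(0).uniform(0, 1, 300)
samples = np.array([brentq(lambda x: cdf(x) - t, -6, 6) for t in ts])
```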
The complexity is the highest in the methods implementing the Chebyshev-based approach; those without a background in numerical analysis may wonder, for example, why the function is evaluated on <a href="https://en.wikipedia.org/wiki/Chebyshev_nodes">that particularly strange set of nodes</a>.</p> <p>In the quadrature-based approach, the numerical quadrature and the root-finding are both done via the <code class="highlighter-rouge">scipy</code> library (<code class="highlighter-rouge">scipy.integrate.quad</code> and <code class="highlighter-rouge">scipy.optimize.root</code>, respectively). When using this approach, one can set the boundaries of the PDF to be infinite, as <code class="highlighter-rouge">scipy.integrate.quad</code> supports improper integrals. In the <a href="https://github.com/peterewills/itsample/blob/master/example.ipynb">notebook of examples</a>, we show that the samples generated by this approach do, at least in the eyeball norm, conform to the provided PDF. As we expected, this approach is slow - it takes about 7 seconds to generate 5,000 samples from a unit normal.</p> <p>As with the quadrature and root-finding, pre-rolled functionality from <code class="highlighter-rouge">scipy</code> was used to both compute and evaluate the Chebyshev approximants. When approximating a PDF using Chebyshev polynomials, finite bounds must be provided. A user-specified tolerance determines the order of the Chebyshev approximation; however, rather than computing a true error, we simply use the size of the last few Chebyshev coefficients as an approximation. 
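<p>That coefficient-size heuristic looks roughly like this (my own sketch, using numpy’s Chebyshev class rather than the itsample internals): keep doubling the degree until the trailing coefficients fall below the tolerance.</p>

```python
import numpy as np
from numpy.polynomial import Chebyshev

def fit_to_tolerance(pdf, lo, hi, tol=1e-8, max_deg=256):
    """Double the Chebyshev degree until the trailing coefficients fall below tol."""
    deg = 16
    while deg <= max_deg:
        cheb = Chebyshev.interpolate(pdf, deg, domain=[lo, hi])
        if np.max(np.abs(cheb.coef[-3:])) < tol:  # tail size proxies the true error
            return cheb
        deg *= 2
    raise RuntimeError("tolerance not reached at max_deg")

cheb = fit_to_tolerance(lambda x: np.exp(-x**2 / 2), -6, 6)
```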
Since this approach differs from the previous one only in the way that the CDF is constructed, we use the same function <code class="highlighter-rouge">sample</code> for both approaches; an option <code class="highlighter-rouge">chebyshev=True</code> will generate a Chebyshev approximant of the CDF, rather than using numerical quadrature.</p> <p>I hoped that the Chebyshev approach would improve on this by an order of magnitude or two; however, my hopes were thwarted. The implementation of the Chebyshev approach is faster by perhaps a factor of 2 or 3, but does not offer the kind of improvement I had hoped for. What happened? In testing, a single evaluation of the Chebyshev CDF was not much faster than a single evaluation of the quadrature CDF. The advantage of the Chebyshev CDF comes when one wishes to evaluate a long, vectorized set of inputs; in this case, the Chebyshev CDF is orders of magnitude faster than quadrature. But <code class="highlighter-rouge">scipy.optimize.root</code> does not appear to take advantage of vectorization, which makes sense - in simple iteration schemes, the value at which the next iteration occurs depends on the outcome of the current iteration, so there is not a simple way to vectorize the algorithm.</p> <h2 id="conclusion">Conclusion</h2> <p>I suspect that the reason this feature is absent from large-scale libraries like <code class="highlighter-rouge">scipy</code> and <code class="highlighter-rouge">numpy</code> is that it is difficult to build a sampler that is both fast and accurate over a large enough class of PDFs. My approach sacrifices speed; other approximation schemes may be very fast, but may not provide the accuracy guarantees needed by some users.</p> <p>What we’re left with is a library that is useful for generating small numbers (less than 100,000) of samples. 
It’s worth noting that Olver &amp; Townsend seem to be able to use the Chebyshev approach to sample orders of magnitude faster than my implementation, but sadly their Matlab code is nowhere to be found in the Matlab library <a href="http://www.chebfun.org/"><code class="highlighter-rouge">chebfun</code></a>, which is the location advertised in their work. Presumably they implemented their own root-finder, or Chebyshev approximation scheme, or both. There’s a lot of space for improvement here, but I simply ran out of time and energy on this one; if you feel inspired, <a href="https://github.com/peterewills/itsample#contributing">fork the repo</a> and submit a pull request!</p> <div class="footnotes"> <ol> <li id="fn:fnote1"> <p>This is only true for $$t\in [0,1]$$. For $$t&lt;0$$, $$F_u(t)=0$$, and for $$t&gt;1$$, $$F_u(t)=1$$. <a href="#fnref:fnote1" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:fnote2"> <p>The inverse of the CDF is often called the percentile point function, or PPF. <a href="#fnref:fnote2" class="reversefootnote">&#8617;</a></p> </li> </ol> </div>
The Meaning of Entropy, by Peter Wills (peter@pwills.com), 2018-02-06: http://www.pwills.com/blog/posts/2018/02/06/entropy
<p><strong>Entropy</strong> is a word that we see a lot in various forms. Its classical use comes from thermodynamics: e.g. 
“the entropy in the universe is always increasing.” With the recent boom in statistics and machine learning, the word has also seen a surge in use in information-theoretic contexts: e.g. “minimize the cross-entropy of the validation set.”</p> <p>It’s been an ongoing investigation for me, trying to figure out just what the hell this information-theoretic entropy is all about, and how it connects to the notion I’m familiar with from statistical mechanics. Reading through the wonderful book <a href="https://www.amazon.com/Data-Analysis-Bayesian-Devinderjit-Sivia/dp/0198568320">Data Analysis: a Bayesian Tutorial</a> by D. S. Sivia, I found the first connection between these two notions that really clicked for me. I’m going to run through the basic argument here, in the hope that reframing it in my own words will help me understand it more thoroughly.</p> <h2 id="entropy-in-thermodynamics">Entropy in Thermodynamics</h2> <p>Let’s start with the more intuitive notion, which is that of thermodynamic entropy. This notion, when poorly explained, can seem opaque or quixotic; however, when viewed through the right lens, it is straightforward, and the law of increasing entropy becomes a highly intuitive result.</p> <h3 id="counting-microstates">Counting Microstates</h3> <p>Imagine, if you will, the bedroom of a teenager. We want to talk about the entropy of two different states: the state of being “messy” and the state of being “clean.” We will call these <strong>macrostates</strong>; they describe the macroscopic (large-scale) view of the room. However, there are also many different microstates. One can resolve these on a variety of scales, but let’s just say they correspond to the location/position of each individual object in the room. 
To review:</p> <table> <thead> <tr> <th>Type</th> <th>Definition</th> <th>Example</th> </tr> </thead> <tbody> <tr> <td>Macrostate</td> <td>Overall Description</td> <td>“Messy”</td> </tr> <tr> <td>Microstate</td> <td>Fine-Scale Description</td> <td>“Underwear on lamp, shoes in bed, etc.”</td> </tr> </tbody> </table> <h3 id="the-boltzmann-entropy">The Boltzmann Entropy</h3> <p>One might notice an interesting fact: that there are many more possible microstates that correspond to “messy” than there are microstates that correspond to “clean.” <strong>This is exactly what we mean when we say that a messy room has higher entropy.</strong> In particular, the entropy of a macrostate is <strong>the log of the number of microstates that correspond to that macrostate.</strong> We call this the Boltzmann entropy, and denote it by $$S_B$$. If there are $$\Omega$$ possible microstates that correspond to the macrostate of being “messy,” then we define the entropy of this state as<sup id="fnref:fnote2"><a href="#fn:fnote2" class="footnote">1</a></sup></p> <script type="math/tex; mode=display">S_B(\text{messy}) = \log(\Omega).</script> <p>This is essentially all we need to know here.<sup id="fnref:fnote1"><a href="#fn:fnote1" class="footnote">2</a></sup> The entropy tells us how many different ways there are to get a certain state. A pyramid of oranges in a supermarket has lower entropy than the oranges fallen all over the floor, because there are many configurations of oranges that we would call “oranges all over the floor,” but very few that we would call “a nicely organized pyramid of oranges.”</p> <p>In this context, the law of increasing entropy becomes almost tautological. If things are moving around in our bedroom at random, and we call <em>most</em> of those configurations “messy,” then the room will tend towards messiness rather than cleanliness.
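</p>

<p>To make the microstate-counting concrete, here is a toy sketch in Python; all the numbers are invented purely for illustration. Say the room contains 10 objects, each of which can sit either in its one proper place or in any of 4 wrong places. Then exactly one microstate is “clean,” and every other microstate is “messy”:</p>

```python
from math import log

# Toy model (numbers invented for illustration): 10 objects, each with
# 1 correct location and 4 incorrect ones, so 5**10 microstates in total.
total_microstates = 5 ** 10

omega_clean = 1                                # every object in its place
omega_messy = total_microstates - omega_clean  # everything else

# Boltzmann entropy S_B = log(Omega) for each macrostate
S_clean = log(omega_clean)  # log(1) = 0
S_messy = log(omega_messy)  # just under 10 * log(5)

print(f"S_B(clean) = {S_clean:.2f}, S_B(messy) = {S_messy:.2f}")
```

<p>The “messy” macrostate wins not because any single messy microstate is more likely than the clean one, but simply because there are overwhelmingly more of them.</p>

<p>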
We sometimes use the terms “order” and “disorder” to refer to states of relatively low and high entropy, respectively.</p> <h2 id="entropy-in-information-theory">Entropy in Information Theory</h2> <p>One also frequently encounters a notion of entropy in statistics and information theory. This is called the <em>Shannon entropy</em>, and the motivation for this post is my persistent puzzlement over the connection between Boltzmann’s notion of entropy and Shannon’s. Prior to reading <a href="https://www.amazon.com/Data-Analysis-Bayesian-Devinderjit-Sivia/dp/0198568320">D. Sivia’s manual</a>, I only knew the definition of Shannon entropy, but his work presented such a clear exposition of the connection to Boltzmann’s ideas that I felt compelled to share it.</p> <h3 id="permutations-and-probabilities">Permutations and Probabilities</h3> <p>We’ll work with a thought experiment.<sup id="fnref:fnote3"><a href="#fn:fnote3" class="footnote">3</a></sup> Suppose we have $$N$$ subjects that we organize into $$M$$ groups, with $$N\gg M$$. Let $$n_i$$ indicate the number of subjects that are in the $$i^\text{th}$$ group, for $$i=1,\ldots,M$$. Of course,</p> <script type="math/tex; mode=display">\sum_{i=1}^M n_i = N,</script> <p>and if we choose a person at random the probability that they are in group $$i$$ is</p> <script type="math/tex; mode=display">p_i = \frac{n_i}{N}.</script> <p>The <strong>Shannon entropy</strong> of such a discrete distribution is defined as</p> <script type="math/tex; mode=display">S = -\sum_{i=1}^M p_i\log(p_i)</script> <p>But why? Why $$p\log(p)$$? Let’s look and see.</p> <p>A macrostate of this system is defined by the sizes of the groups $$n_i$$; equivalently, it is defined by the probability distribution $$p_i$$. A microstate of this system specifies the group of each subject: which group subject $$j$$ belongs to, for each $$j=1,\ldots,N$$. How many microstates correspond to a given macrostate?
To fill the first group, we must choose $$n_1$$ members from the $$N$$ subjects, so the number of ways of assigning subjects to this group is</p> <script type="math/tex; mode=display">{N\choose n_1} = \frac{N!}{n_1!(N-n_1)!}</script> <p>For the second group, there are $$N - n_1$$ remaining subjects, and we must assign $$n_2$$ of them, and so on. Thus, the total number of ways of arranging the $$N$$ subjects into the groups of size $$n_i$$ is</p> <script type="math/tex; mode=display">\Omega = {N\choose n_1}{N-n_1 \choose n_2}\ldots {N-n_1-\ldots-n_{M-1}\choose n_M}.</script> <p>This horrendous list of binomial coefficients can be simplified down to just</p> <script type="math/tex; mode=display">\Omega = \frac{N!}{n_1!n_2!\ldots n_M!}.</script> <p>The Boltzmann entropy of this macrostate is then</p> <script type="math/tex; mode=display">S_B = \log(\Omega) = \log(N!) - \sum_{i=1}^M \log(n_i!)</script> <h3 id="from-boltzmann-to-shannon">From Boltzmann to Shannon</h3> <p><strong>We will now show that the Boltzmann entropy is (approximately) a scaling of the Shannon entropy</strong>; in particular, $$S_B \approx N\,S$$. Things are going to get slightly complicated in the algebra, but hang on. If you’d prefer, you can take my word for it, and skip to the next section.</p> <p>We will use the Stirling approximation $$\log(n!)\approx n\log(n)$$<sup id="fnref:fnote4"><a href="#fn:fnote4" class="footnote">4</a></sup> to simplify:</p> <script type="math/tex; mode=display">S_B \approx N\log(N) - \sum_{i=1}^M n_i\log(n_i)</script> <p>Since the probability $$p_i=n_i/N$$, we can re-express $$S_B$$ in terms of $$p_i$$ via</p> <script type="math/tex; mode=display">S_B \approx N\log(N)-N\sum_{i=1}^M p_i\log(Np_i)</script> <p>Since $$\sum_ip_i=1$$, we have</p> <script type="math/tex; mode=display">S_B \approx -N\sum_{i=1}^M p_i\log(p_i) = N \, S.</script> <p>Phew!
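</p>

<p>If you’d rather not take my word for it, the whole derivation is easy to check numerically. The sketch below (Python; the group sizes are invented for illustration) computes $$\log\Omega$$ via log-factorials, with no Stirling approximation, and compares it against $$N$$ times the Shannon entropy:</p>

```python
from math import comb, factorial, lgamma, log

# Sanity-check the binomial-product identity on a small exact example:
# C(6,3) * C(3,2) * C(1,1) should equal 6! / (3! 2! 1!)
assert (comb(6, 3) * comb(3, 2) * comb(1, 1)
        == factorial(6) // (factorial(3) * factorial(2) * factorial(1)))

def log_factorial(n):
    # lgamma(n + 1) = log(n!) up to floating-point error, so no Stirling
    # approximation is needed here
    return lgamma(n + 1)

# Invented example: N = 10,000 subjects split into M = 4 groups
group_sizes = [4000, 3000, 2000, 1000]
N = sum(group_sizes)

# Boltzmann entropy: S_B = log(N!) - sum_i log(n_i!)
S_B = log_factorial(N) - sum(log_factorial(n) for n in group_sizes)

# Shannon entropy: S = -sum_i p_i log(p_i), with p_i = n_i / N
S = -sum((n / N) * log(n / N) for n in group_sizes)

print(S_B / N, S)  # close, and closer still as N grows
```

<p>For these sizes the two values agree to roughly $$0.1\%$$, and the agreement improves as $$N$$ grows, which is exactly what the Stirling-based argument predicts.</p>

<p>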
So, the Boltzmann entropy $$S_B$$ of having $$N$$ subjects in $$M$$ groups with sizes $$n_i$$ is (approximately) $$N$$ times the Shannon entropy.</p> <h2 id="who-cares">Who Cares?</h2> <p>Admittedly, this kind of theoretical revelation will probably not change the way you deploy cross-entropy in your machine learning projects. It is primarily used because its gradients behave well, which is important in the stochastic gradient-descent algorithms favored by modern deep-learning architectures. However, I personally have a strong dislike of using tools that I don’t have a theoretical understanding of; hopefully you now have a better grip on the theoretical underpinnings of cross-entropy, and its relationship to statistical mechanics.</p> <div class="footnotes"> <ol> <li id="fn:fnote2"> <p>Often a constant will be included in this definition, so that $$S=k_B \log(\Omega)$$. This constant is arbitrary, as it simply rescales the units of our entropy, and it will only serve to get in the way of our analysis, so we omit it. <a href="#fnref:fnote2" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:fnote1"> <p>All we need to know for the purpose of establishing a connection between thermodynamic and information-theoretic entropy; of course there is much more to know, and there are many alternative ways of conceptualizing entropy. However, none of these have ever been intuitive to me in the way that Boltzmann’s definition of entropy is. <a href="#fnref:fnote1" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:fnote3"> <p>We have slightly rephrased Sivia’s presentation to fit our purposes here. <a href="#fnref:fnote3" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:fnote4"> <p>The most commonly used form of Stirling’s approximation is the more precise $$\log(n!)\approx n\log(n)-n$$, but we use a coarser form here.
<a href="#fnref:fnote4" class="reversefootnote">&#8617;</a></p> </li> </ol> </div>Peter Willspeter@pwills.comA Website is Born!2017-12-20T00:00:00+00:002017-12-20T00:00:00+00:00http://www.pwills.com/blog/posts/2017/12/20/website<p>I learned a lot while building this website; I hope to share it so that it might be helpful for anyone trying to do the same. I’m sure you’ll notice that I’m far from an expert in the subjects we’re going to explore here; this is my first foray into web development. If you have any corrections, or things I’ve misunderstood, I’d love to hear about it! Just post a comment.</p> <p>The site is built using <a href="https://jekyllrb.com/">Jekyll</a>, using the theme <a href="https://mmistakes.github.io/minimal-mistakes/">Minimal Mistakes</a>. I host it on <a href="https://pages.github.com/">Github pages</a>, and purchased and manage my domain through <a href="https://domains.google/#/">Google Domains</a>. We’ll go through each of these steps in detail. I’ll assume that you have the up-to-date versions of Ruby and Jekyll on your local machine.
I’m going through all this in macOS, which may affect some of the shell commands I give, but translating to Windows shouldn’t be too hard.</p> <h2 id="making-a-site-with-minimal-mistakes">Making a site with Minimal Mistakes</h2> <p>The website for Minimal Mistakes includes a great quick-start guide; I recommend the <a href="https://mmistakes.github.io/minimal-mistakes/docs/quick-start-guide/#starting-from-jekyll-new">Starting with <code class="highlighter-rouge">jekyll new</code></a> section as a place to start. Using this you should be able to establish a base site with some simple demonstration content.</p> <h3 id="enabling-mathjax">Enabling MathJax</h3> <p>In order to enable <a href="https://www.mathjax.org">MathJax</a>, which renders the mathematical equations you see in my posts, you’ll need to edit the file <code class="highlighter-rouge">scripts.html</code> contained in the folder <code class="highlighter-rouge">_includes/</code> to include a line enabling MathJax. However, you’ll want to avoid overwriting the contents of the default <code class="highlighter-rouge">scripts.html</code>.</p> <p>So, we need to find where <code class="highlighter-rouge">bundle</code> is storing the Gem for Minimal Mistakes.
To find this, do</p> <div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bundle show minimal-mistakes-jekyll </code></pre></div></div> <p>If you just want to navigate directly to that directory, do</p> <div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cd $(bundle show minimal-mistakes-jekyll) </code></pre></div></div> <p>Now you can copy the default <code class="highlighter-rouge">scripts.html</code> into your site:</p> <div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cp _includes/scripts.html /path/to/site/_includes/scripts.html </code></pre></div></div> <p>Open the copied <code class="highlighter-rouge">scripts.html</code> in your editor of choice,<sup id="fnref:fnote1"><a href="#fn:fnote1" class="footnote">1</a></sup> and add the following lines at the end:</p> <figure class="highlight"><pre><code class="language-html" data-lang="html"> {% if page.mathjax %} <span class="nt">&lt;script </span><span class="na">type=</span><span class="s">"text/javascript"</span> <span class="na">async</span> <span class="na">src=</span><span class="s">"https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-MML-AM_CHTML"</span><span class="nt">&gt;</span> <span class="nt">&lt;/script&gt;</span> {% endif %} </code></pre></figure> <p>And you’re done!<sup id="fnref:fnote2"><a href="#fn:fnote2" class="footnote">2</a></sup> Now, you can type <code class="highlighter-rouge">$$x_1$$</code> to see <script type="math/tex">x_1</script>, and so on. The <code class="highlighter-rouge">$$...$$</code> syntax will generate inline math if used inline, and will generate a display equation if used on its own line. 
So, if one enters</p> <div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$$f(a) = \frac{1}{2\pi i} \oint_\gamma \frac{f(z)}{z-a} dz$$ </code></pre></div></div> <p>Then the rendered equation appears like so:</p> <script type="math/tex; mode=display">f(a) = \frac{1}{2\pi i} \oint_\gamma \frac{f(z)}{z-a} dz</script> <h3 id="customize-font-sizes">Customize Font Sizes</h3> <p>I found the fonts a bit oversized, so I wanted to change the size for the posts. To do this, you need to copy <strong>the entire folder</strong> which contains all the relevant scss files:</p> <div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cd $(bundle show minimal-mistakes-jekyll) cp -r _sass /path/to/site </code></pre></div></div> <p>Now, after much digging through the GitHub issues,<sup id="fnref:fnote3"><a href="#fn:fnote3" class="footnote">3</a></sup> I found that the file to edit here is <code class="highlighter-rouge">_sass/_reset.scss</code>. In my site, the relevant chunk of text looks like</p> <figure class="highlight"><pre><code class="language-html" data-lang="html"> @include breakpoint($medium) { font-size: 13px; } @include breakpoint($large) { font-size: 15px; } @include breakpoint($x-large) { font-size: 18px; }</code></pre></figure> <p>Once this file has been edited, you should see the font size reduced in your page.</p> <h2 id="getting-it-on-github-pages">Getting it on GitHub Pages</h2> <p>Okay, now we write a bunch of nonsense, find some beautiful pictures at <a href="https://unsplash.com">Unsplash</a> to use as headers, and we’re ready to publish the thing on GitHub Pages.
I’ll first go through as though we don’t want to use a custom domain, so that the website will be exposed at <code class="highlighter-rouge">USERNAME.github.io</code>.</p> <h3 id="enabling-jekyll-remote-theme">Enabling <code class="highlighter-rouge">jekyll-remote-theme</code></h3> <p>First of all, make sure that you’re using the <code class="highlighter-rouge">jekyll-remote-theme</code> plugin, which allows you to use any GitHub-hosted Jekyll theme, rather than only the few that are officially supported. This process is outlined on the Minimal Mistakes website, but I’ll go through it here.</p> <p>First, <strong>in your <code class="highlighter-rouge">_config.yml</code> file</strong>, enable the plugin by including it in the <code class="highlighter-rouge">plugins</code> list, via</p> <div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>plugins: - jekyll-remote-theme </code></pre></div></div> <p>If you have other plugins you want to use (I use <code class="highlighter-rouge">jekyll-feed</code>), then add them to this list as well. Designate the <code class="highlighter-rouge">remote_theme</code> variable, but do so <strong>after setting the theme</strong>, so that you have in your config file</p> <div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>theme: "minimal-mistakes-jekyll" remote_theme: "mmistakes/minimal-mistakes" </code></pre></div></div> <p>Finally, in your <code class="highlighter-rouge">Gemfile</code>, add <code class="highlighter-rouge">gem "jekyll-remote-theme"</code>.</p> <h3 id="push-it-to-the-repository">Push it to the repository</h3> <p>GitHub Pages looks for a repository that follows the naming convention <code class="highlighter-rouge">USERNAME.github.io</code>.
So, for example, since my GitHub username is <code class="highlighter-rouge">peterewills</code>, the repository for the source of this site is at <code class="highlighter-rouge">https://www.github.com/peterewills/peterewills.github.io</code>.</p> <p>Once you’ve created such a repository, initialize a git repo on your site by going into <code class="highlighter-rouge">path/to/your/site</code> and doing <code class="highlighter-rouge">git init</code>. Then, do</p> <div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git remote add origin https://www.github.com/USERNAME/USERNAME.github.io </code></pre></div></div> <p>and then commit and push. (If you’re unfamiliar with using git, I recommend <a href="https://git-scm.com/docs/gittutorial">either</a> of <a href="https://try.github.io/levels/1/challenges/1">these</a> tutorials.) You’ll get an email that your page build was successful, but you’re “using an unsupported theme.” Don’t worry about this; it happens whenever you use <code class="highlighter-rouge">remote-theme</code>.</p> <p>You now should be able to navigate to <code class="highlighter-rouge">USERNAME.github.io</code> and see your page!</p> <h2 id="using-a-custom-domain">Using a Custom Domain</h2> <p>Suppose you’d prefer to use a custom domain, such as <code class="highlighter-rouge">mydomain.pizza</code> (this is actually a real, and available, domain name). There are lots of ways to do this; I did it through <a href="https://domains.google.com">Google Domains</a>, so I’ll go through those steps.</p> <p>First, you go to <a href="https://domains.google.com">Google Domains</a>, pick out the domain you want, and register it. For this example, we’ll assume you went with <code class="highlighter-rouge">mydomain.pizza</code>. You should now see it appear under the <strong>My Domains</strong> tab on the right side of the page. 
You should see a domain called <code class="highlighter-rouge">mydomain.pizza</code> and a <strong>DNS</strong> option. This is what we need to edit.</p> <p>We need to configure the DNS behavior of our domain so that it points at the IP address where GitHub Pages is hosting it. On the DNS page, scroll down to <strong>Custom Resource Records</strong>. You’ll want to add three custom resource records; two “host” resource records (designated by an A) and one “alias” resource record (designated by CNAME). GitHub pages exposes its sites at IP addresses 192.30.252.153 and 192.30.252.154. So, you’ll want to add both of these as host resource records. You’ll want to add your GitHub Pages url <code class="highlighter-rouge">USERNAME.github.io</code> as an alias record. By the time you’ve added the three, your list of resource records should look like the example below.</p> <p><img src="/assets/images/custom_resource.png" alt="" /></p> <p>So, now your url (<code class="highlighter-rouge">mydomain.pizza</code>) knows that it is an alias for <code class="highlighter-rouge">USERNAME.github.io</code>, but we still have to specify this aliasing on the GitHub end of things.</p> <p>To do this, simply make a text file called <code class="highlighter-rouge">CNAME</code> and include on the first line</p> <div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mydomain.pizza </code></pre></div></div> <p>This is the entire contents of the text file <code class="highlighter-rouge">CNAME</code>. Once this is pushed to the repository <code class="highlighter-rouge">USERNAME/USERNAME.github.io</code>, the appropriate settings should automatically update themselves. 
To check this, go to the repository settings, scroll down to the “GitHub Pages” settings, and look under “Custom domain.” You should see something like the following:</p> <p><img src="/assets/images/github_repo.png" alt="" /></p> <p>If the DNS record of your Google domain has not yet been updated, then you will see <code class="highlighter-rouge">Your site is ready to be published mydomain.pizza</code> on a yellow background. Note that it sometimes takes up to 48 hours for DNS records to update, so be patient.</p> <h2 id="conclusion">Conclusion</h2> <p>Once the DNS records have updated, you should be able to see your site at <code class="highlighter-rouge">mydomain.pizza</code>. You can check out <a href="https://www.github.com/peterewills/peterewills.github.io">the repository for my site</a> to see examples of what I’ve gone through here, including my <code class="highlighter-rouge">CNAME</code> file, my <code class="highlighter-rouge">_includes/scripts.html</code> file that enables MathJax, and my <code class="highlighter-rouge">_config.yml</code> file. Please let me know, either by email or in the comments, if you have any questions or corrections!</p> <div class="footnotes"> <ol> <li id="fn:fnote1"> <p>Presumably emacs. <a href="#fnref:fnote1" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:fnote2"> <p>Some <a href="http://dasonk.com/blog/2012/10/09/Using-Jekyll-and-Mathjax">older blog posts</a> discuss the process of adding kramdown as the markdown rendering engine, but this is default behavior for Jekyll 3.x, so there’s no need to do this step. <a href="#fnref:fnote2" class="reversefootnote">&#8617;</a></p> </li> <li id="fn:fnote3"> <p>Michael, the guy who built Minimal Mistakes, is really wonderful about responding to issues on GitHub, which are really used as a support forum for people using the theme who have no experience in web development (such as myself). <a href="#fnref:fnote3" class="reversefootnote">&#8617;</a></p> </li> </ol> </div>Peter Willspeter@pwills.com