DS Interview Study Guide Part II: Software Engineering

This post continues my series on data science interviews. One of the major difficulties of doing data science interviews is that you must show expertise in a w...

DS Interview Study Guide Part I: Statistics

As I have gone through a couple rounds of interviews for data scientist positions, I’ve been compiling notes on what I consider to be the essential areas of ...

New Paper: Metrics For Graph Comparison

I just put a new paper up on the arXiv, and so I thought I would share it here. This was the final paper I wrote for my Ph.D., and it’s the one I’m most prou...

Types as Propositions

Some of the most meaningful mathematical realizations that I’ve had have been unexpected connections between two topics; that is, realizing that two concepts...

Inverse Transform Sampling in Python

When doing data work, we often need to sample random variables. This is easy to do if one wishes to sample from a Gaussian, or a uniform random variable, or ...

Algorithmic Musical Genre Classification

In this project, I construct a data pipeline that takes in raw .wav files and then uses machine learning to predict the genre of the track. We first do a fr...

The Meaning of Entropy

Entropy is a word that we see a lot in various forms. Its classical use comes from thermodynamics: e.g. “the entropy in the universe is always increasin...

Building a Personal Site with Jekyll & Minimal Mistakes

I learned a lot while building this website; I hope to share it so that it might be helpful for anyone trying to do the same. I’m sure you’ll notice that I’m...

Anomaly Detection in Dynamic Networks

“Data analysis” is a hugely popular thing these days, for obvious reasons. When most people think of “data,” they think of a table where the columns are vari...

NetComp: Network Comparison in Python

As I worked on my research on network data analysis, it became clear that there was a need for a Python library that implemented the analytical tools I was i...