(A Creative Blog Name Here)

Code, math, and other things I find useful

Variational lower-bound for hidden Markov models

if you define the forward messages as the filtering distribution (like Beal), then the normalization constant from the update is \(p(y_t | y_{1:t-1})\).

If you define the forward messages as the joint \(p(y_{1:t}, x_t)\) like Emily does (and like Matt and we do), then the ...

Convergent Series and lim inf

This little result came up when proving the convergence of a stochastic gradient algorithm and I want to write it down to remember it after discussions with Matt Johnson and Alex Tank.

Let \(a_1, a_2, \ldots\) be a positive sequence of numbers. If \(\sum_{n=1}^\infty \frac{1}{n ...