Stan working with Stan
Stan is a language used for statistical inference. Here we will investigate estimating the unknown length of a pendulum based on measurements of velocity and angle $\theta$. Two good examples to get started with Stan
are
Here Stan will be used to estimate parameters - specifically the length of pendulum. Consider the pendulum equation
$ \frac{d^2 \theta}{d t^2} + \frac{g}{l} \sin(\theta) = 0$,
where $\theta$ is the angle of the pendulum, $g$ gravitational constant, $l$ is the pendulum length and $t$ is time. With damping added, the equation is now
$ \frac{d^2 \theta}{d t^2} + \frac{g}{l} \sin(\theta) + s \frac{d \theta}{d t} = 0$,
where $s$ is a damping constant.
Letting $v = \frac{d \theta}{dt}$, we get the following system of differential equations:
$\frac{d v}{dt} = - \frac{g}{l} \sin(\theta) - s v$
$\frac{d \theta}{dt} = v$.
We can encode the DEs in Stan
as follows:
where dz_dt
returns
$ \begin{bmatrix} \frac{d v}{dt} \\ \frac{d\theta}{dt}\end{bmatrix} $
evaluated at a particular $\theta$ and $v$. x_r
and x_i
are unused arguments that could be used to provide additional arguments to the function. The ODE is solved in
where the arguments to the ode solvers are provided by the data
block:
$N$ in this case is the total measurement counts where the $i^{\text{th}}$ measurement is taken at time ts[i]
. The measurements along with an initial guess known as the prior will be used to estimate the length. The assumed initial conditions provided in y_init
and y
are the measurements of $v$ and $\theta$. In transformed parameters
the first two arguments of rep_array(0.0,0), rep_array(0,0), 1e-6, 1e-5, 1e6
are NULL
equivalent while the rest are ode solver specific. The user provided data
variables y_init
and y
are related to internal parameters z_init
in the model
block.
The model
block takes the state z
solved for in transformed parameters
and adds sensor noise. In addition, it provides initial guess or prior for the assumed pendulum length as well the initial conditions defined in z_init
. The full .stan
file can be found in pendulum.stan.
To estimate parameters we need data to work with - that is the measurements. To generated the measurements, we first need to solve the DE model and then add noise to it. Consider the matlab
script (scavanged version of this):
This script creates a csv file pend_noise.csv
with the noisy measurements of velocity and theta. A plot below shows the curve of $v$ and $\theta$ given that at time $t=0$
$v = 0$
$\theta = \pi - 0.01$
By time $20$ the state has been mostly damped out. We now have everything we need to estimate the length. We run Stan
through R
. The R
script is below and can be found in script.r.
In the terminal, open up R
and run
We now study a couple of cases:
Case 1:
The initial conditions used to generate the measurements and those in the prior are similair. Both the assumed in Stan
’s model
as well as the matlab script:
$\theta = \pi - 0.01$
$v = 0$
In Stan
the conditions have an assumed variance of $0.01^2$ for each parameter. The pendulum length is $1.9$ with variance $0.1^2$ while the true pendulum is of length $2$. The results of the script are
mean se_mean sd 10% 50% 90% n_eff Rhat
params[1] 1.866 0.054 0.076 1.820 1.823 1.998 2 39.819
z_init[1] 0.217 0.302 0.434 -0.038 -0.027 1.049 2 16.054
z_init[2] 1.452 0.527 0.746 1.017 1.023 2.762 2 66.696
The high Rhat
suggests that something is wrong. Debugging Stan
code and correctly interpreting the results is a lengthy topic not fully explored here. Further material can be found in the mc-stan
forums or in books like “Bayesian Data Analysis Third Edition” by Gelman et. al. We run the following script in R
to produce plots in order to further debug:
The first plot is the computed by Stan
posterior distribution of the length, initial velocity and initial theta. The red is the true initial state specified within the matlab script while black is the computed posterior mean. For both theta and velocity we see a multimodal distribution.
The plot shows that there is no clear guess as to what the initial conditions or the lengths is. The overlay of all the computed simulated $\theta$ s by Stan
in the non warm-up phase is below [hence the fat curves].
There are two main simulated trajectories of $\theta$. This is because the initial condition and its variance assumed in Stan
in variable z_init
is too close to $\pi$ as seen in figure below
The red line is at $\pi$. Part of the area is over $\pi$ meaning that some conditions are less than $\pi$ while others are greater than. This results in two different types of behaviours of the pendulum swinging down leftwards or rightwards.
Although through the matlab script we technically know the true pendulum length, that knowledge is unknown to the Stan
script. If the previous figures in Stan
is all that we had to work with then we would have lack confidence in our estimate of the pendulum length.
Case 2:
The variance is now shrunk to avoid having significant probablity for initial $\theta$ greater than $\pi$ to match the actual physical initial conditions defined in matlab script [pendulum falling towards the left]. Variance is now $0.0001^2$ for the prior:
After running Stan
again we obtain results that more closely match the true values of the parameters:
mean se_mean sd 10% 50% 90% n_eff Rhat
params[1] 1.999 0 0.002 1.997 1.999 2.001 63 1.000
z_init[1] 0.000 0 0.000 0.000 0.000 0.000 249 0.998
z_init[2] 3.132 0 0.000 3.132 3.132 3.132 154 1.001
In conclusion, this somewhat contrived examples goes to show that knowledge of the priors and model is very important for conducting a successful experiment.