Kernel density estimation

Sometimes you need to estimate a probability distribution from a set of discrete points. You could build a histogram of the measurements, but that provides little information about the regions in phase space with no measurements (it is very likely you won’t have enough points to span the whole phase space). So the data set must be smoothed, as we did with time series. As in that case, we can describe the smoothing by a convolution with a kernel. In this case, the formula is very simple $$f(x)=\frac{1}{N}\sum_i^N K(x-x_i)$$

The choice of K is an art, but the standard choice is the gaussian kernel as we’ve seen before. Let’s try this out on some simulated data

 

Gaussian mixture we will sample from
Gaussian mixture we will sample from

And now let’s apply KDE to some sampled data.

 

100 samples, sigma 1
100 samples, sigma 1
1000 samples, sigma 1
1000 samples, sigma 1

The choice of standard deviation makes a big difference in the final result. For low amounts of data we want a reasonably high sigma, to smoothen out the large variations in the data. But if we have a lot of points, a lower sigma will more faithfully represent the original distribution:

1000 samples, sigma 0.1
1000 samples, sigma 0.1

 

 

An introduction to smoothing time series in python. Part IV: Particle Filter

Last time we started talking about state space models with the Kalman Filter. Our aim has been to find a smoothed trajectory for some given noisy observed data. In the case of state space models, we incorporate a model for the underlying physical system to calculate a likelihood for each trajectory and select the optimal one. When it comes to implementing this idea, we run into the problem of how to represent the probability distributions for the state of the system: it would be unfeasible to calculate the probability for every single point in phase space at all times. The Kalman filter solves this by approximating the target probability density function (abbreviated pdf) with a Gaussian, which has only two parameters. This approximation works surprisingly well but it might not be optimal for cases where the underlying pdf is multimodal.

Today we will be looking at another idea to implement state space models – the particle filter. Our goal is still to find the maximum of the posterior distribution $P(x|y,\theta)$, given by bayes’ formula: $$P(x|y,\theta)\propto P(y | x, \theta)P(x|\theta)P(\theta)$$ Now the idea is to approximate this distribution by randomly drawing points from it. Because we will only look at one time step at a time, the sequence of points we sample will be a markov chain; and because the method relies on random sampling we call it a markov chain monte carlo (MCMC) method. Continue reading “An introduction to smoothing time series in python. Part IV: Particle Filter”

Simulating tissues with pressure

One small project I did was to code up a simulation of a growing tissue which feels pressure and where each cell has a dynamic state which depends on its neighbors and the pressure it feels. The idea is to reproduce some essential properties of morphogenesis. You can look at the code here. I am going to talk about the most interesting parts of the code.

I initialize the cells in an ordered lattice, with random perturbations in their positions, except those which are in the borders (bottom, left, right). Those are static and do not evolve in the simulation like the others. They are there just to represent the pressure from the rest of the body (huge) on the simulated tissue (tiny). This is not a very realistic assumption because many developmental systems have a size of the order of the body size, but we have to start somewhere!

All cells are connected with springs, which simulate adhesive and pressure forces in the tissue. If left alone, the system relaxes into a hexagonal configuration, since this minimises the spring potential energy. I integrate the harmonic oscillator equations using a fourth order Adams Moulton algorithm.

Now, it is important to realize that there are two time scales in the system: the pressure equilibration and cell lifetimes. We can assume the mechanical pressure equilibrates very fast, while cell divisions take their time. So what we do is run the oscillator system until equlibrium for each time step of the cellular state evolution, which we will talk about later.

Springs connect each cell. The larger colored circle is of the same size as the rest length of the springs. Thus, overlapping circles mean the spring wants to extend, while spaces mean the spring wants to contract. The color denotes the automaton state of the cell.
Springs connect each cell. The larger colored circle is of the same size as the rest length of the springs. Thus, overlapping circles mean the spring wants to extend, while spaces mean the spring wants to contract. The colors are explained below.

Continue reading “Simulating tissues with pressure”