Suppose you are faced with a high dimensional dataset and want to find some structure in the data: often there are only a few causes, but lots of different data points are generated due to noise corruption. How can we infer these causes? Here I’m going to cover the simplest method to do this inference: we will assume the data is generated by a linear transformation of the hidden causes. In this case, it is quite simple to recover the parameters of this transformation and therefore determine the hidden (or latent) variables which represent their cause.

# Tag: code

## Parallel programming with opencl and python: parallel scan

This post will continue the subject of how to implement common algorithms in a parallel processor, which I started to discuss here. Today we come to the second pattern, the scan. An example is the cumulative sum, where you iterate over an array and calculate the sum of all elements up to the current one. Like reduce, the algorithm we’ll talk about is not exclusive for the sum operation, but for any binary associative operation (the max is another common example). There are two ways to do a parallel scan: the hills steele scan, which needs $\log n$ steps; and the blelloch scan, requiring $2 \log n$ steps. The blelloch scan is useful if you have more data than processors, because it only needs to do $\mathcal{O}(n)$ operations (this quantity is also referred to as the work efficiency); while the hillis steele scan needs $\mathcal{O}(n \log n)$ operations. So let’s look at how to implement both of them with opencl kernels.

Continue reading “Parallel programming with opencl and python: parallel scan”

## Parallel programming with opencl and python: parallel reduce

Once you know how to use python to run opencl kernels on your device (read Part I and Part II of this series) you need to start thinking about the programming patterns you will use. While many tasks are inherently parallel (like calculating the value of a function for N different values) and you can just straightforwardly run N copies on your processors, most interesting tasks involve dependencies in the data. For instance if you want to simply sum N numbers in the simplest possible way, the thread doing the summing needs to know about all N numbers, so you can only run one thread, leaving most of your cores unused. So what we need to come up with are clever ways to decompose the problem into individual parts which can be run in parallel, and then combine them all in the end.

Continue reading “Parallel programming with opencl and python: parallel reduce”

## Parallel programming with opencl and python: vectors and concurrency

Last time we saw how to run some simple code on the gpu. Now let’s look at some particular aspects related to parallel programming we should be aware of. Since gpus are massively parallel processors, you’d expect you could write your kernel code for a single data piece, and by running enough copies of the kernel you’d be maximizing your device’s performance. Well, you’d be wrong! I’m going to focus on the three most obvious issues which could hamper your parallel code’s performance:

- Each of the individual cores is actually a vector processor, which means it can perform an operation on multiple numbers at a time.
- At some point the individual threads might need to write to the same position in memory (i.e. to accumulate a value). To make sure the result is correct, they need to take turns doing it, which means they spend time waiting for each other doing nothing.
- Most code is limited by memory bandwidth, not compute performance. This means that the gpu can’t get the data to the processing cores as fast as they can actually perform the computation required.

Continue reading “Parallel programming with opencl and python: vectors and concurrency”

## Parallel programming with opencl and python

In the next few posts I’ll cover my experiences with learning how to program efficient parallel programs on gpus using opencl. Because the machine I got was a mac pro with the top of the line gpus (7 teraflops) I needed to use opencl, which is a bit complex and confusing at first glance. It also requires a lot of boilerplate code which makes it really hard to just jump in and start experimenting. I ultimately decided to use pyopencl, which allows us to do the boring boilerplate stuff in just a few lines of python and focus on the actual parallel programs (the kernels).

First, a few pointers on what I read. A great introduction to the abstract concepts of parallel programming is the udacity course Introduction to parallel programming. They use C and CUDA to illustrate the concepts, which means you can’t directly apply what you see there on a computer with a non nvidia gpu. To learn the opencl api itself, I used the book OpenCL in Action: How to Accelerate Graphics and Computation. As for pyopencl, the documentation is a great place to start. You can also find all the python code I used in github. Continue reading “Parallel programming with opencl and python”

## Quick introduction to gaussian mixture models with python

Usually we like to model probability distributions with gaussian distributions. Not only are they the maximum entropy distributions if we only know the mean and variance of a dataset, the central limit theorem guarantees that random variables which are the result of summing many different random variables will be gaussian distributed too. But what to do when we have multimodal distributions like this one?

A gaussian distribution would not represent this very well. So what’s the next best thing? Add another gaussian! A gaussian mixture model is defined by a sum of gaussians $$P(x)=\sum_i w_i \, \mathcal{G}(\mu_i, \Sigma_i)$$ with means $\mu$ and covariance matrices $\Sigma$. Continue reading “Quick introduction to gaussian mixture models with python”

## An introduction to the metropolis method with python

I already talked about MCMC methods before, but today I want to cover one of the most well known methods of all, Metropolis-Hastings. The goal is to obtain samples according to to the equilibrium distribution of a given physical system, the Boltzmann distribution. Incidentally, we can also rewrite arbitrary probability distributions in this form, which is what allows for the cross pollination of methods between probabilistic inference and statistical mechanics (look at my older post on this). Since we don’t know how to sample directly from the boltzmann distribution in general, we need to use some sampling method. Continue reading “An introduction to the metropolis method with python”

## Pinch to zoom in libgdx

So I was a bit confused how to reproduce the multitouch gesture you often see in mobile gallery apps using libgdx. The idea is to zoom and recenter the viewport such that the points where your fingers are anchored are always the same (in game coordinates). Assuming you don’t need to rotate, here is the code I came up with:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 |
public class MyGestures implements GestureListener { /* more stuff.... */ @Override public boolean pinch(Vector2 initialPointer1, Vector2 initialPointer2, Vector2 pointer1, Vector2 pointer2) { //grab all the positions touchPos.set(initialPointer1.x, initialPointer1.y, 0); camera.unproject(touchPos); float x1n = touchPos.x; float y1n = touchPos.y; touchPos.set(initialPointer2.x, initialPointer2.y, 0); camera.unproject(touchPos); float x2n = touchPos.x; float y2n = touchPos.y; touchPos.set(pointer1.x, pointer1.y, 0); camera.unproject(touchPos); float x1p = touchPos.x; float y1p = touchPos.y; touchPos.set(pointer2.x, pointer2.y, 0); camera.unproject(touchPos); float x2p = touchPos.x; float y2p = touchPos.y; float dx1 = x1n - x2n; float dy1 = y1n - y2n; float initialDistance = (float) Math.sqrt(dx1*dx1+dy1*dy1); float dx2 = x1p - x2p; float dy2 = y1p - y2p; float distance = (float) Math.sqrt(dx2*dx2+dy2*dy2); if(zooming == false) { zooming = true; cx = (_x1 + _x2)/2; cy = (_y1 + _y2)/2; px = camera.position.x; py = camera.position.y; initZoom = camera.zoom; } else { float nextZoom = (initialDistance/distance)*scale; /* do some ifs here to check if nextZoom is too zoomed in or out*/ camera.zoom = nextZoom; camera.update(); Vector3 pos = new Vector3((pointer1.x + pointer2.x)/2, (pointer1.y + pointer2.y)/2, 0f); camera.unproject(pos); dx = cx - pos.x; dy = cy - pos.y; /* do some ifs here to check if we are in bounds*/ camera.translate(dx, dy); camera.update(); } return false; } } |

Of course, you shouldn’t put all this stuff into this method: each logical piece of code should be in its own method (and in minesweeper most of it is actually on another object, since I like to have only code relating to gesture handling on the gesture handler object)

## How to do inverse transformation sampling in scipy and numpy

Let’s say you have some data which follows a certain probability distribution. You can create a histogram and visualize the probability distribution, but now you want to sample from it. How do you go about doing this with python?

The short answer:

1 2 3 4 5 6 7 8 9 10 |
import numpy as np import scipy.interpolate as interpolate def inverse_transform_sampling(data, n_bins=40, n_samples=1000): hist, bin_edges = np.histogram(data, bins=n_bins, density=True) cum_values = np.zeros(bin_edges.shape) cum_values[1:] = np.cumsum(hist*np.diff(bin_edges)) inv_cdf = interpolate.interp1d(cum_values, bin_edges) r = np.random.rand(n_samples) return inv_cdf(r) |

The long answer:

You do inverse transform sampling, which is just a method to rescale a uniform random variable to have the probability distribution we want. The idea is that the cumulative distribution function for the histogram you have maps the random variable’s space of possible values to the region [0,1]. If you invert it, you can sample uniform random numbers and transform them to your target distribution!

To implement this, we calculate the CDF for each bin in the histogram (red points above) and interpolate it using scipy’s interpolate functions. Then we just need to sample uniform random points and pass them through the inverse CDF! Here is how it looks:

## A lovely new minesweeper on android I made

Today I am finally releasing my little minesweeper for android! I’ve been working on this as a hobby for the past few weekends, and now it is finally smooth enough to let other people see it! The problem with most minesweeper applications in the market is that they are either really ugly or haven’t really figured out how to adapt the original mouse controls to a touchscreen. I set out to solve these two problems so I can play some mines on my phone!

To solve the ugliness problem, I drew some tiles in photoshop in a very minimal style, to disturb the eyes as little as possible and let you focus on the game. Here is how it turned out.

To navigate the board, you can use the normal multitouch gestures like pan and pinch to zoom. To place a flag, you can long press a tile or you can double tap an open tile and drag to a closed tile (these gestures won’t let you win speed competitions, but they’re pretty good if you’re lazily solving the board)

You also get some pretty sweet statistics when you win or lose!