More minimal speech recognition using Tensorflow

A GRU and a fully connected layer with softmax output can be used to recognise words in an audio stream (sequence to sequence). I recorded a random sequence of 100 utterances of 4 words ("left", "right", "stop", "go") with a sampling rate of 11025 Hz. Furthermore, 300 seconds of background noise were recorded. All audio data was grouped into chunks of 512 samples. The logarithm of the Fourier spectrum of each chunk was used as input for the GRU. The initial state of the GRU (128 units) was set to zero. The network was trained in a sequence-to-sequence setting to recognise words given a sequence of audio chunks.
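
The feature extraction described above can be sketched in a few lines of plain Python. This is only an illustration, not the code used for the post: the function names, the naive DFT, the small offset `eps` that avoids taking the logarithm of zero, and the synthetic test tone are all assumptions.

```python
import cmath
import math

CHUNK_SIZE = 512  # samples per chunk, as in the text


def split_chunks(samples, size=CHUNK_SIZE):
    """Split a list of samples into consecutive chunks, dropping an incomplete tail."""
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def log_spectrum(chunk, eps=1e-6):
    """Log magnitude of DFT bins 0..n/2 (naive O(n^2) transform, fine for a demo)."""
    n = len(chunk)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(chunk))
        spectrum.append(math.log(abs(coeff) + eps))  # eps (assumed) avoids log(0)
    return spectrum


# Synthetic test tone with 8 cycles per chunk instead of recorded audio.
samples = [math.sin(2 * math.pi * 8 * t / CHUNK_SIZE) for t in range(2 * CHUNK_SIZE)]
features = [log_spectrum(chunk) for chunk in split_chunks(samples)]
```

Each entry of `features` is one input vector for the recurrent network. In practice a fast Fourier transform from a numerical library would replace the naive transform.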

Speech recognition

The example was trained using gradient descent with learning rate alpha = 0.05. The background noise was cycled randomly and words were inserted with pauses of random length between them. The network was trained to output the word label 10 times after recognising a word. Training took about one hour on a CPU. The resulting example is more minimalistic than the Tensorflow audio recognition example. The algorithm is very similar to an exercise in Andrew Ng's Deep Learning Specialization.
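
As a rough sketch of how such per-chunk training targets could be built: the function below assumes that label 0 stands for background and that the chunk index at which each word ends is known. Both conventions and all names are hypothetical; the post does not specify them.

```python
def target_labels(num_chunks, word_ends, hold=10, background=0):
    """Per-chunk labels: each word label is emitted for `hold` chunks after the word ends.

    word_ends is a list of (end_chunk_index, label) pairs (assumed format).
    """
    labels = [background] * num_chunks
    for end, label in word_ends:
        for i in range(end, min(end + hold, num_chunks)):
            labels[i] = label
    return labels


# Two words: label 2 ends at chunk 5, label 3 ends at chunk 25 of a 30 chunk sequence.
labels = target_labels(30, [(5, 2), (25, 3)])
```

The second word is cut off by the end of the sequence, so it receives fewer than 10 labelled chunks.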


See CourseDuck for a curated list of machine learning online courses.

Minimal speech recognition using Tensorflow

An LSTM can be used to recognise words from audio data (sequence to one). I recorded a random sequence of 320 utterances of 5 words ("left", "right", "stop", "go", and "?" for background noise) with a sampling rate of 16 kHz. The audio data was split into chunks of 512 samples. The word start and end were identified using a rising and a falling threshold for the root-mean-square value of each chunk. The logarithm of the Fourier spectrum of each chunk was used as input for the LSTM. The initial state and output of the LSTM (16 units each) were set to zero. The final output state of the LSTM was used as input for a fully connected layer with 5 output units followed by a softmax classifier. The network was trained in a sequence-to-one setting to recognise a word given a sequence of audio chunks as shown in the following figure.
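
The word segmentation with a rising and a falling threshold can be illustrated as follows. This is a minimal sketch: the threshold values and the function names are assumptions and not taken from the original recording setup.

```python
import math


def rms(chunk):
    """Root-mean-square value of one chunk of samples."""
    return math.sqrt(sum(x * x for x in chunk) / len(chunk))


def detect_words(chunks, rising=0.1, falling=0.05):
    """Return (start, end) chunk indices of detected words; end is exclusive.

    A word starts when the RMS value reaches `rising` and ends when it drops
    below `falling` (both thresholds are assumed values).
    """
    words = []
    start = None
    for i, chunk in enumerate(chunks):
        level = rms(chunk)
        if start is None and level >= rising:
            start = i
        elif start is not None and level < falling:
            words.append((start, i))
            start = None
    if start is not None:  # the recording ended in the middle of a word
        words.append((start, len(chunks)))
    return words


# Toy chunks with RMS values 0, 0.5, 0.07, 0 and 0.2.
chunks = [[0.0] * 4, [0.5] * 4, [0.07] * 4, [0.0] * 4, [0.2] * 4]
words = detect_words(chunks)
```

Using two thresholds keeps a word from being split when its level dips briefly: the third chunk is quieter than the rising threshold but still above the falling one, so it stays part of the first word.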

Speech recognition

The example was trained using gradient descent with learning rate alpha = 0.001. The training took seven hours on a CPU. The sequence-to-one setting handles words of different lengths gracefully. The resulting example is more minimalistic than the Tensorflow audio recognition example.

In the following video the speech recognition is used to control a robot.

A divide-and-conquer implementation of the GJK algorithm

The Gilbert-Johnson-Keerthi (GJK) algorithm is a method for collision detection of convex polyhedra. The method is used in multiple rigid body simulations for computer games.

The algorithm takes as input:


  • a set of points defining the convex polyhedron A
  • a set of points defining the convex polyhedron B

The algorithm returns:

  • the distance of the polyhedra
  • two closest points
  • the two simplices (convex subsets of up to 4 points in 3 dimensions) of A and B containing the two closest points

An n-dimensional simplex is the convex hull of n+1 points as shown in the figure below.

n-dimensional simplices for different values of n

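The algorithm makes use of the fact that the distance between two sets A and B is the distance of their Minkowski difference M(A,B) to the origin:

$$d(A,B) = \min_{a \in A,\, b \in B} \lVert a - b \rVert = \min_{m \in M(A,B)} \lVert m \rVert, \qquad M(A,B) = \{a - b \mid a \in A,\ b \in B\}$$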


The GJK algorithm iteratively updates a simplex until the closest point to the origin is found.

The algorithm iterates using support points. Given a set M and a vector d, the support point is defined as the furthest point of M in direction d:

$$s(M,d) = \operatorname{argmax}_{m \in M} d^\top m$$
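
A small Python sketch of the support point of a finite point set and of the Minkowski difference may help here. It also shows a standard property that the Scheme code further down relies on: the support point of M(A,B) in direction d is the support point of A in direction d minus the support point of B in direction -d, so M never has to be built explicitly. All function names are illustrative.

```python
def dot(u, v):
    """Inner product of two vectors given as tuples."""
    return sum(a * b for a, b in zip(u, v))


def support_point(points, direction):
    """Furthest point of a finite point set in the given direction."""
    return max(points, key=lambda p: dot(p, direction))


def minkowski_difference(a, b):
    """All differences p - q with p from A and q from B."""
    return [tuple(x - y for x, y in zip(p, q)) for p in a for q in b]


def support_of_difference(a, b, direction):
    """Support point of M(A,B) computed without building M."""
    opposite = tuple(-x for x in direction)
    p = support_point(a, direction)
    q = support_point(b, opposite)
    return tuple(x - y for x, y in zip(p, q))


square_a = [(0, 0), (1, 0), (1, 1), (0, 1)]
square_b = [(3, 0), (4, 0), (4, 1), (3, 1)]
d = (1, 0.5)
direct = support_point(minkowski_difference(square_a, square_b), d)
shortcut = support_of_difference(square_a, square_b, d)
```

For convex polyhedra it is sufficient to search the vertices, because a linear function attains its maximum over a convex hull at a vertex.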

The GJK algorithm detects the two closest points of A and B as follows:

  1. Choose any point m in the Minkowski difference M(A,B) of A and B.
  2. Set the initial simplex w0 to w0={m}.
  3. Let d be the closest point of wk to the origin.
  4. If -d·s(wk,-d) >= -d·s(M,-d) then return d (the new support point makes no further progress towards the origin).
  5. Set w'k+1=wk∪{s(M,-d)}
  6. Set wk+1 to the smallest simplex in w'k+1 still containing s(M,-d).
  7. Continue with step 3.

Note that step 3 requires finding the closest point of the simplex wk to the origin. Most implementations of the GJK algorithm seem to use the following approach:

  • Check if the origin is contained in the 3D simplex.
  • If not, check whether the origin is near one of the 4 faces of the simplex.
  • If the origin is not in the Voronoi region of any face, check whether the origin is near one of the 6 edges.
  • If it is not in the Voronoi region of an edge either, find the closest of the 4 points.

A much more compact implementation can be obtained using a divide-and-conquer approach:

  • Let wk={wk0,wk1,...,wkn}
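  • Solve the least squares equation system $H t = -w_{k0}$ for t, where the columns of H are the edge vectors $w_{ki} - w_{k0}$ for i=1,...,n.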
  • If all ti>=0 and t1+t2+...+tn<=1 (or n=0), then wk0+Ht is the closest point.
  • Otherwise take the closest point of all sub-simplices with n-1 dimensions using the approach above (i.e. recursion).

Note that this approach will visit each edge up to two times, and each point up to three times. The performance is not optimal, but it makes for a much more concise implementation.
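
The recursion can be sketched in Python as follows. This is a simplified illustration for the closest point of a single simplex to the origin. It assumes that the points of the simplex are affinely independent and solves the normal equations with a small Gauss-Jordan elimination; the helper names are made up for this example.

```python
def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    rows = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(n):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return [rows[i][n] / rows[i][i] for i in range(n)]


def closest_point(simplex):
    """Closest point of a simplex (list of points) to the origin."""
    w0, rest = simplex[0], simplex[1:]
    edges = [[b - a for a, b in zip(w0, w)] for w in rest]  # columns of H
    if edges:
        gram = [[dot(e, f) for f in edges] for e in edges]  # H^T H
        rhs = [-dot(e, w0) for e in edges]                  # -H^T w0
        t = solve(gram, rhs)
    else:
        t = []  # a single point has no free parameters
    if all(x >= 0 for x in t) and sum(t) <= 1:
        # The projection of the origin lies inside the simplex: w0 + H t.
        return [a + sum(x * e[i] for x, e in zip(t, edges)) for i, a in enumerate(w0)]
    # Otherwise recurse into all sub-simplices with one point left out.
    faces = [simplex[:i] + simplex[i + 1:] for i in range(len(simplex))]
    return min((closest_point(face) for face in faces), key=lambda p: dot(p, p))


# The origin projects outside this triangle, so the result lies on an edge.
on_edge = closest_point([(1, -1, 0), (1, 1, 0), (3, 0, 0)])
# The origin projects into the interior of this triangle.
inside = closest_point([(1, 0, 1), (-1, 1, 1), (-1, -1, 1)])
```

The first call falls back to the edge between (1,-1,0) and (1,1,0), the second one accepts the least squares solution directly.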

The least squares solution is:

$$t = -(H^\top H)^{-1} H^\top w_{k0}$$

Here follows an implementation in Scheme (GNU Guile). By using pairs of points from A and B instead of the Minkowski difference, one can keep track of the information required to determine the pair of closest points of A and B (instead of the closest point of M to the origin).

(use-modules (oop goops) (srfi srfi-1) (srfi srfi-26))

(define-method (- (a <list>))
  "Negate vector."
  (map - a))

(define-method (+ (a <list>) (b <list>))
  "Add two vectors."
  (map + a b))

(define-method (+ (a <list>) (b <real>))
  "Add vector and scalar."
  (map (cut + <> b) a))

(define-method (+ (a <real>) (b <list>))
  "Add vector and scalar."
  (map (cut + a <>) b))

(define-method (- (a <list>) (b <list>))
  "Subtract a vector from another."
  (map - a b))

(define-method (- (a <list>) (b <real>))
  "Subtract a scalar from a vector."
  (map (cut - <> b) a))

(define-method (- (a <real>) (b <list>))
  "Subtract a vector from a scalar."
  (map (cut - a <>) b))

(define-method (* (a <list>) (b <number>))
  "Multiply a vector with a scalar."
  (map (cut * <> b) a))

(define-method (* (a <number>) (b <list>))
  "Multiply a scalar with a vector."
  (map (cut * <> a) b))

(define (argop op fun lst)
  (let* [(vals  (map fun lst))
         (opval (apply op vals))]
    (list-ref (reverse lst) (1- (length (member opval vals))))))

(define (argmin fun lst) (argop min fun lst))

(define (argmax fun lst) (argop max fun lst))

(define (leave-one-out lst)
  (map (lambda (i) (append (take lst i) (drop lst (1+ i)))) (iota (length lst))))

(define (inner-product a b)
  "Compute inner product of two vectors."
  (reduce + 0 (map * a b)))

(define (norm v)
  "Return norm of a vector."
  (sqrt (inner-product v v)))

(define (transpose mat)
  "Transpose a matrix"
  (if (null? mat)
    '()
    (map (lambda (i) (map (cut list-ref <> i) mat)) (iota (length (car mat))))))

(define (dot mat vec)
  "Multiply a matrix with another matrix or a vector"
  (map (cut inner-product <> vec) mat))

(define (permutations lst)
  "Return all permutations of list LST. The permutations are ordered so that every alternate permutation is even."
  (if (zero? (length lst))
    '(())
    (concatenate
      (map
        (lambda (item index)
          (let [(remaining (delete item lst))
                (order     (if (even? index) identity reverse))]
            (map (cut cons item <>) (permutations (order remaining)))))
        lst
        (iota (length lst))))))

(define (determinant mat)
  "Compute determinant of a matrix"
  (let* [(n       (length mat))
         (indices (iota n))
         (perms   (permutations indices))]
    (reduce + 0
      (map
        (lambda (perm k)
          (* (reduce * 1 (map (lambda (j i) (list-ref (list-ref mat j) i))
                              indices perm))
             (if (even? k) 1 -1)))
        perms
        (iota (length perms))))))

(define (submatrix mat row column)
  "Return submatrix with specified ROW and COLUMN removed."
  (let [(rows    (delete row    (iota (length mat))))
        (columns (delete column (iota (length (car mat)))))]
    (map (lambda (j) (map (lambda (i) (list-ref (list-ref mat j) i)) columns)) rows)))

(define (inverse mat)
  "Compute inverse of matrix"
  (let [(det     (determinant mat))
        (indices (iota (length mat)))
        (sgn     (lambda (v j i) (if (eq? (even? j) (even? i)) v (- v))))]
    (map (lambda (j)
           (map (lambda (i) (sgn (/ (determinant (submatrix mat i j)) det) j i))
                indices))
         indices)))

(define (least-squares design-matrix observation)
  "Least-squares solver"
  (if (null? design-matrix)
    '()
    (let [(design-matrix-transposed (transpose design-matrix))]
      (dot (inverse (dot design-matrix-transposed design-matrix))
           (dot design-matrix-transposed observation)))))

(define (support-point direction points)
  "Get outermost point of POINTS in given DIRECTION."
  (argmax (cut inner-product direction <>) points))

(define (center-of-gravity points)
  "Compute average of given points"
  (* (reduce + #f points) (/ 1 (length points))))

(define (closest-simplex-points simplex-a simplex-b)
  "Determine closest point pair of two simplices"
  (let* [(observation   (- (car simplex-a) (car simplex-b)))
         (design-matrix (- observation (transpose (- (cdr simplex-a)
                                                     (cdr simplex-b)))))
         (factors       (least-squares design-matrix observation))]
      (if (and (every positive? factors) (< (reduce + 0 factors) 1))
        (cons (cons (fold + (car simplex-a)
                          (map * factors
                               (map (cut - <> (car simplex-a)) (cdr simplex-a))))
                    (fold + (car simplex-b)
                          (map * factors
                               (map (cut - <> (car simplex-b)) (cdr simplex-b)))))
              (cons simplex-a simplex-b))
        (argmin (lambda (result) (norm (- (caar result) (cdar result))))
                (map closest-simplex-points
                     (leave-one-out simplex-a)
                     (leave-one-out simplex-b))))))

(define (gjk-algorithm convex-a convex-b)
  "Get pair of closest points of two convex hulls each defined by a set of points"
  (let [(simplex-a '())
        (simplex-b '())
        (closest (cons (center-of-gravity convex-a) (center-of-gravity convex-b)))]
    (while #t
      (let* [(direction  (- (car closest) (cdr closest)))
             (candidates (cons (support-point (- direction) convex-a)
                               (support-point direction convex-b)))]
        (if (>= (+ (inner-product direction (- direction)) 1e-12)
                (inner-product (- (car candidates) (cdr candidates)) (- direction)))
          (break closest))
        (let [(result (closest-simplex-points (cons (car candidates) simplex-a)
                                              (cons (cdr candidates) simplex-b)))]
          (set! closest (car result))
          (set! simplex-a (cadr result))
          (set! simplex-b (cddr result)))))))

Here is an example of two colliding cuboids simulated using this implementation:

Any feedback, comments, and suggestions are welcome.



I noticed that Baraff just uses a separating plane algorithm to achieve the same!

Minimal OpenGL example in C

OpenGL is a powerful cross-platform standard for 3D visualisation. OpenGL libraries use a domain-specific language (shader language) to describe element-wise operations on vertices (vertex shader) and pixel values (fragment shader). More recent OpenGL versions also support geometry shaders and tessellation shaders (see OpenGL article on Wikipedia).

The learning curve for OpenGL is quite steep at the beginning. The reason is that a program to draw a triangle is almost as complex as a program drawing thousands of triangles. It is also important to add code for retrieving error messages, because otherwise a mistake in a shader or an OpenGL call tends to result in a blank window without any explanation.

I haven't found many minimal examples to understand OpenGL, so I am posting one here. The example draws a coloured triangle on the screen.

#include <math.h>
#include <stdio.h>
#include <GL/glew.h>
#include <GL/glut.h>

const char *vertexSource = "#version 130\n\
in mediump vec3 point;\n\
in mediump vec2 texcoord;\n\
out mediump vec2 UV;\n\
void main()\n\
{\n\
  gl_Position = vec4(point, 1);\n\
  UV = texcoord;\n\
}";

const char *fragmentSource = "#version 130\n\
in mediump vec2 UV;\n\
out mediump vec3 fragColor;\n\
uniform sampler2D tex;\n\
void main()\n\
{\n\
  fragColor = texture(tex, UV).rgb;\n\
}";

GLuint vao;
GLuint vbo;
GLuint idx;
GLuint tex;
GLuint program;
int width = 320;
int height = 240;

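// Redraw the window: clear it, draw the triangle, and show the back buffer.
void onDisplay(void)
{
  glClearColor(0.0f, 0.0f, 0.0f, 0.0f);
  glClear(GL_COLOR_BUFFER_BIT);
  glUseProgram(program);
  glDrawElements(GL_TRIANGLES, 3, GL_UNSIGNED_INT, (void *)0);
  glutSwapBuffers();
}

// Adapt the viewport when the window is resized.
void onResize(int w, int h)
{
  width = w; height = h;
  glViewport(0, 0, (GLsizei)w, (GLsizei)h);
}

// Print the most recent OpenGL error (if any) together with a context string.
void printError(const char *context)
{
  GLenum error = glGetError();
  if (error != GL_NO_ERROR) {
    fprintf(stderr, "%s: %s\n", context, gluErrorString(error));
  }
}

// Print the info log if compiling a shader or linking a program failed.
void printStatus(const char *step, GLuint context, GLuint status)
{
  GLint result = GL_FALSE;
  if (status == GL_COMPILE_STATUS)
    glGetShaderiv(context, status, &result);
  else
    glGetProgramiv(context, status, &result);
  if (result == GL_FALSE) {
    char buffer[1024] = "";
    if (status == GL_COMPILE_STATUS)
      glGetShaderInfoLog(context, 1024, NULL, buffer);
    else
      glGetProgramInfoLog(context, 1024, NULL, buffer);
    if (buffer[0])
      fprintf(stderr, "%s: %s\n", step, buffer);
  }
}

void printCompileStatus(const char *step, GLuint context)
{
  printStatus(step, context, GL_COMPILE_STATUS);
}

void printLinkStatus(const char *step, GLuint context)
{
  printStatus(step, context, GL_LINK_STATUS);
}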

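// Each vertex consists of a 3D position followed by a 2D texture coordinate.
GLfloat vertices[] = {
   0.5f,  0.5f,  0.0f, 1.0f, 1.0f,
  -0.5f,  0.5f,  0.0f, 0.0f, 1.0f,
  -0.5f, -0.5f,  0.0f, 0.0f, 0.0f
};

unsigned int indices[] = { 0, 1, 2 };

// 2x2 texture with three colour components per pixel (uploaded as GL_BGR).
float pixels[] = {
  0.0f, 0.0f, 1.0f, 0.0f, 1.0f, 0.0f,
  1.0f, 0.0f, 0.0f, 1.0f, 1.0f, 1.0f
};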

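int main(int argc, char** argv)
{
  glutInit(&argc, argv);
  glutInitDisplayMode(GLUT_DOUBLE | GLUT_RGB);
  glutInitWindowSize(width, height);
  glutCreateWindow("mini");

  // GLEW needs a current OpenGL context, so it is initialised after creating the window.
  glewExperimental = GL_TRUE;
  glewInit();

  GLuint vertexShader = glCreateShader(GL_VERTEX_SHADER);
  glShaderSource(vertexShader, 1, &vertexSource, NULL);
  glCompileShader(vertexShader);
  printCompileStatus("Vertex shader", vertexShader);

  GLuint fragmentShader = glCreateShader(GL_FRAGMENT_SHADER);
  glShaderSource(fragmentShader, 1, &fragmentSource, NULL);
  glCompileShader(fragmentShader);
  printCompileStatus("Fragment shader", fragmentShader);

  program = glCreateProgram();
  glAttachShader(program, vertexShader);
  glAttachShader(program, fragmentShader);
  glLinkProgram(program);
  printLinkStatus("Shader program", program);

  glGenVertexArrays(1, &vao);
  glBindVertexArray(vao);

  glGenBuffers(1, &vbo);
  glBindBuffer(GL_ARRAY_BUFFER, vbo);
  glBufferData(GL_ARRAY_BUFFER, sizeof(vertices), vertices, GL_STATIC_DRAW);

  glGenBuffers(1, &idx);
  glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, idx);
  glBufferData(GL_ELEMENT_ARRAY_BUFFER, sizeof(indices), indices, GL_STATIC_DRAW);

  // Describe the interleaved vertex layout: 3 floats position, 2 floats texture coordinate.
  GLint pointLocation = glGetAttribLocation(program, "point");
  GLint texcoordLocation = glGetAttribLocation(program, "texcoord");
  glVertexAttribPointer(pointLocation, 3, GL_FLOAT, GL_FALSE, 5 * sizeof(float), (void *)0);
  glVertexAttribPointer(texcoordLocation, 2, GL_FLOAT, GL_FALSE, 5 * sizeof(float), (void *)(3 * sizeof(float)));

  glUseProgram(program);
  glEnableVertexAttribArray(pointLocation);
  glEnableVertexAttribArray(texcoordLocation);

  glGenTextures(1, &tex);
  glActiveTexture(GL_TEXTURE0);
  glBindTexture(GL_TEXTURE_2D, tex);
  glUniform1i(glGetUniformLocation(program, "tex"), 0);
  glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, 2, 2, 0, GL_BGR, GL_FLOAT, pixels);
  // Without mipmaps the minification filter must not be a mipmap filter.
  glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
  glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
  glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
  glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
  printError("Setup");

  glutDisplayFunc(onDisplay);
  glutReshapeFunc(onResize);
  glutMainLoop();

  // Clean up (only reached if the GLUT implementation returns from the main loop).
  glDisableVertexAttribArray(texcoordLocation);
  glDisableVertexAttribArray(pointLocation);

  glBindTexture(GL_TEXTURE_2D, 0);
  glDeleteTextures(1, &tex);

  glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, 0);
  glDeleteBuffers(1, &idx);

  glBindBuffer(GL_ARRAY_BUFFER, 0);
  glDeleteBuffers(1, &vbo);

  glBindVertexArray(0);
  glDeleteVertexArrays(1, &vao);

  glDetachShader(program, vertexShader);
  glDetachShader(program, fragmentShader);
  glDeleteProgram(program);
  glDeleteShader(vertexShader);
  glDeleteShader(fragmentShader);
  return 0;
}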

The example uses the widely supported OpenGL version 3.0 (whose shading language GLSL 1.30 has the version tag 130). You can compile and run the example as follows:

gcc -o raw-opengl raw-opengl.c -lGL -lGLEW -lGLU -lglut
./raw-opengl


Any feedback, comments, and suggestions are welcome.


Steps towards a space simulator

I am quite interested in how simulators such as the Orbiter space simulator are implemented. A spacecraft can be seen as a rigid object with a moment of inertia tensor. Without any forces or torques acting on the object, its angular momentum does not change. In general, the moment of inertia tensor causes the direction of the rotation vector to change over time even though the angular momentum stays constant. This motion can be simulated numerically using a higher-order integration method such as 4th order Runge-Kutta. Here is a video showing the resulting simulation of a cuboid tumbling in space:
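
The tumbling motion can be illustrated with a short Python sketch that integrates Euler's equations for a torque-free rigid body with 4th order Runge-Kutta. The sketch works in the body frame and assumes that the coordinate axes are the principal axes, so that the inertia tensor is diagonal. The inertia values, the time step, and all names are arbitrary choices for this example and not taken from the simulation in the video.

```python
def cross(u, v):
    """Cross product of two 3D vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def omega_dot(omega, inertia):
    """Euler's equations without torque: I dw/dt = (I w) x w (principal axes)."""
    momentum = tuple(i * w for i, w in zip(inertia, omega))
    torque_like = cross(momentum, omega)
    return tuple(t / i for t, i in zip(torque_like, inertia))


def rk4_step(omega, inertia, dt):
    """One 4th order Runge-Kutta step for the angular velocity."""
    def add(u, v, scale):
        return tuple(a + scale * b for a, b in zip(u, v))
    k1 = omega_dot(omega, inertia)
    k2 = omega_dot(add(omega, k1, dt / 2), inertia)
    k3 = omega_dot(add(omega, k2, dt / 2), inertia)
    k4 = omega_dot(add(omega, k3, dt), inertia)
    return tuple(w + dt / 6 * (a + 2 * b + 2 * c + d)
                 for w, a, b, c, d in zip(omega, k1, k2, k3, k4))


def momentum_squared(omega, inertia):
    """Squared magnitude of the angular momentum."""
    return sum((i * w) ** 2 for i, w in zip(inertia, omega))


def energy(omega, inertia):
    """Rotational kinetic energy."""
    return 0.5 * sum(i * w * w for i, w in zip(inertia, omega))


inertia = (1.0, 2.0, 3.0)   # principal moments of inertia (assumed values)
start = (0.1, 1.0, 0.1)     # rotation mostly about the intermediate axis
omega = start
for _ in range(5000):
    omega = rk4_step(omega, inertia, 0.001)
```

The angular velocity changes from step to step, while the magnitude of the angular momentum and the rotational energy should stay constant up to the integration error. Checking both quantities is a simple way to judge the accuracy of the integrator.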

Brian Vincent Mirtich's thesis demonstrates how to simulate collisions of two convex polyhedra. Furthermore, micro-collisions are used as a simple but powerful method to simulate resting contacts. If the micro-collisions are sufficiently small, a resting object can be approximated with sufficient accuracy:

One still needs to implement friction (also shown in Mirtich's thesis), which requires a numerical integral to compute the friction occurring during a micro-collision. Collisions of polyhedra are demonstrated in Mirtich's thesis as well; however, it might be simpler to make use of the GJK algorithm. Planetary bodies, spacecraft, and other non-convex objects could be handled by dividing them into multiple convex objects. It would also be interesting to integrate soft body physics as shown in Rigs of Rods. However, the accuracy of Rigs of Rods is not sufficiently high for space simulation. E.g. an object tumbling in space would not preserve its momentum.


In the following examples, dynamic Coulomb friction with the ground is simulated.