OpenGL Visualization with LWJGL

Using LWJGL’s OpenGL bindings and Fastmath to render data from NASA’s CGI Moon Kit

(Cross posting article published at Clojure Civitas)

Getting dependencies

First we need to get some libraries and we can use add-libs to fetch them.

(add-libs {'org.lwjgl/lwjgl                      {:mvn/version "3.3.6"}
           'org.lwjgl/lwjgl$natives-linux        {:mvn/version "3.3.6"}
           'org.lwjgl/lwjgl-opengl               {:mvn/version "3.3.6"}
           'org.lwjgl/lwjgl-opengl$natives-linux {:mvn/version "3.3.6"}
           'org.lwjgl/lwjgl-glfw                 {:mvn/version "3.3.6"}
           'org.lwjgl/lwjgl-glfw$natives-linux   {:mvn/version "3.3.6"}
           'org.lwjgl/lwjgl-stb                  {:mvn/version "3.3.6"}
           'org.lwjgl/lwjgl-stb$natives-linux    {:mvn/version "3.3.6"}
           'generateme/fastmath                  {:mvn/version "3.0.0-alpha3"}})
(require '[clojure.java.io :as io]
         '[clojure.math :refer (PI to-radians)]
         '[fastmath.vector :refer (vec3 sub add mult normalize)])
(import '[javax.imageio ImageIO]
        '[org.lwjgl BufferUtils]
        '[org.lwjgl.glfw GLFW]
        '[org.lwjgl.opengl GL GL11 GL13 GL15 GL20 GL30]
        '[org.lwjgl.stb STBImageWrite])

Creating the window

Next we choose the window width and height. We also define the radius of the Moon in kilometers, which is needed later for the sphere mesh.

(def window-width 640)
(def window-height 480)
(def radius 1737.4)

We define a function to get the temporary directory.

(defn tmpdir
  []
  (System/getProperty "java.io.tmpdir"))

And then a function to get a temporary file name.

(defn tmpname
  []
  (str (tmpdir) "/civitas-" (java.util.UUID/randomUUID) ".tmp"))

The following function is used to create screenshots for this article. We read the pixels, write them to a temporary PNG file using the STB library, and then read the file back as an image using ImageIO.

(defn screenshot
  []
  (let [filename (tmpname)
        buffer   (java.nio.ByteBuffer/allocateDirect (* 4 window-width window-height))]
    (GL11/glReadPixels 0 0 window-width window-height
                       GL11/GL_RGBA GL11/GL_UNSIGNED_BYTE buffer)
    (STBImageWrite/stbi_write_png filename window-width window-height 4
                                  buffer (* 4 window-width))
    (-> filename io/file (ImageIO/read))))

We need to initialize the GLFW library.

(GLFW/glfwInit)

Now we create an invisible window. If you want a visible window instead, leave out the hint which sets the visibility to false.

(def window
  (do
    (GLFW/glfwDefaultWindowHints)
    (GLFW/glfwWindowHint GLFW/GLFW_VISIBLE GLFW/GLFW_FALSE)
    (GLFW/glfwCreateWindow window-width window-height "Invisible Window" 0 0)))

If you have a visible window, you can show it as follows.

(GLFW/glfwShowWindow window)

Note that if you are using a visible window, you always need to swap buffers after rendering.

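(GLFW/glfwSwapBuffers window)

Next we make the OpenGL context of the window current and let LWJGL detect the available OpenGL capabilities. This has to happen before any OpenGL function is called.

(do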
  (GLFW/glfwMakeContextCurrent window)
  (GL/createCapabilities))

Basic rendering

Clearing the window

A simple test is to set a clear color and clear the window.

(do
  (GL11/glClearColor 1.0 0.5 0.25 1.0)
  (GL11/glClear GL11/GL_COLOR_BUFFER_BIT)
  (screenshot))

screenshot 0

Creating shader programs

We define a convenience function to compile a shader and handle any errors.

(defn make-shader [source shader-type]
  (let [shader (GL20/glCreateShader shader-type)]
    (GL20/glShaderSource shader source)
    (GL20/glCompileShader shader)
    (when (zero? (GL20/glGetShaderi shader GL20/GL_COMPILE_STATUS))
      (throw (Exception. (GL20/glGetShaderInfoLog shader 1024))))
    shader))

We also define a convenience function to link a program and handle any errors.

(defn make-program [& shaders]
  (let [program (GL20/glCreateProgram)]
    (doseq [shader shaders]
           (GL20/glAttachShader program shader)
           (GL20/glDeleteShader shader))
    (GL20/glLinkProgram program)
    (when (zero? (GL20/glGetProgrami program GL20/GL_LINK_STATUS))
      (throw (Exception. (GL20/glGetProgramInfoLog program 1024))))
    program))

The following code shows a simple vertex shader which passes through vertex coordinates.

(def vertex-source "
#version 130

in vec3 point;

void main()
{
  gl_Position = vec4(point, 1);
}")

In the fragment shader we use the pixel coordinates to output a color ramp. The uniform variable iResolution will later be set to the window resolution.

(def fragment-source "
#version 130

uniform vec2 iResolution;
out vec4 fragColor;

void main()
{
  fragColor = vec4(gl_FragCoord.xy / iResolution.xy, 0, 1);
}")
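
To make the color ramp concrete, the following sketch mirrors the arithmetic of the fragment shader on the CPU. The function name ramp-color is hypothetical and not part of the article's code. It assumes the OpenGL convention that gl_FragCoord refers to pixel centers, so the lower left pixel has the coordinates (0.5, 0.5).

```clojure
;; Hypothetical CPU-side mirror of the fragment shader above (illustration only).
(defn ramp-color
  "Return [r g b a] for the pixel in column x and row y, counted from the lower left."
  [x y width height]
  ;; gl_FragCoord refers to the pixel center, hence the 0.5 offset.
  [(/ (+ x 0.5) width) (/ (+ y 0.5) height) 0.0 1.0])

(ramp-color 0 0 640 480)     ; lower left pixel: almost black
(ramp-color 639 479 640 480) ; upper right pixel: almost yellow (red plus green)
```

Red grows from left to right and green from bottom to top, which is the ramp one would expect to see once the quad is rendered.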

Let’s compile the shaders and link the program.

(do
  (def vertex-shader (make-shader vertex-source GL20/GL_VERTEX_SHADER))
  (def fragment-shader (make-shader fragment-source GL20/GL_FRAGMENT_SHADER))
  (def program (make-program vertex-shader fragment-shader)))

Note: It is beyond the scope of this article, but you can set up a Clojure function to test an OpenGL shader function by using a probing fragment shader and rendering to a one pixel texture. Please see my article Test Driven Development with OpenGL for more information!

Creating vertex buffer data

To provide the shader program with vertex data we are going to define just a single quad consisting of four vertices.

First we define a macro and use it to define convenience functions for converting arrays to LWJGL buffer objects.

(defmacro def-make-buffer [method create-buffer]
  `(defn ~method [data#]
     (let [buffer# (~create-buffer (count data#))]
       (.put buffer# data#)
       (.flip buffer#)
       buffer#)))
(do
  (def-make-buffer make-float-buffer BufferUtils/createFloatBuffer)
  (def-make-buffer make-int-buffer BufferUtils/createIntBuffer)
  (def-make-buffer make-byte-buffer BufferUtils/createByteBuffer))
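
The call to .flip in the macro is easy to overlook, so here is a small sketch of what it does. It uses a plain heap buffer from java.nio instead of LWJGL's BufferUtils (an assumption made to keep the snippet free of dependencies); the position and limit semantics are the same for both kinds of buffer.

```clojure
(import '[java.nio FloatBuffer])

(def data (float-array [1.0 2.0 3.0]))
(def buffer (FloatBuffer/allocate (count data)))

;; Writing advances the position to the end of the data.
(.put ^FloatBuffer buffer ^floats data)
(def remaining-before-flip (.remaining ^FloatBuffer buffer)) ; nothing left to read

;; flip sets the limit to the current position and rewinds the position to zero.
(.flip ^FloatBuffer buffer)
(def remaining-after-flip (.remaining ^FloatBuffer buffer))  ; all three floats are readable
```

A consumer such as glBufferData reads from the current position up to the limit, so without the flip it would see an empty buffer.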

We define a simple background quad spanning the entire window. We use normalised device coordinates (NDC) which are between -1 and 1.

(def vertices
  (float-array [ 1.0  1.0 0.0
                -1.0  1.0 0.0
                -1.0 -1.0 0.0
                 1.0 -1.0 0.0]))

The index array defines the order of the vertices.

(def indices
  (int-array [0 1 2 3]))

Setting up the vertex buffer

We add a convenience function to set up the VAO, VBO, and IBO.

  • We define a vertex array object (VAO) which acts like a context for the vertex and index buffer.
  • We define a vertex buffer object (VBO) which contains the vertex data.
  • We also define an index buffer object (IBO) which contains the index data.

(defn setup-vao [vertices indices]
  (let [vao (GL30/glGenVertexArrays)
        vbo (GL15/glGenBuffers)
        ibo (GL15/glGenBuffers)]
    (GL30/glBindVertexArray vao)
    (GL15/glBindBuffer GL15/GL_ARRAY_BUFFER vbo)
    (GL15/glBufferData GL15/GL_ARRAY_BUFFER (make-float-buffer vertices)
                       GL15/GL_STATIC_DRAW)
    (GL15/glBindBuffer GL15/GL_ELEMENT_ARRAY_BUFFER ibo)
    (GL15/glBufferData GL15/GL_ELEMENT_ARRAY_BUFFER (make-int-buffer indices)
                       GL15/GL_STATIC_DRAW)
    {:vao vao :vbo vbo :ibo ibo}))

Now we use the function to set up the VAO, VBO, and IBO.

(def vao (setup-vao vertices indices))

The data of each vertex is defined by 3 floats (x, y, z). We need to specify the layout of the vertex buffer object so that OpenGL knows how to interpret it.

(do
  (GL20/glVertexAttribPointer (GL20/glGetAttribLocation program "point") 3
                              GL11/GL_FLOAT false (* 3 Float/BYTES) (* 0 Float/BYTES))
  (GL20/glEnableVertexAttribArray 0))
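
The stride and offset arguments are plain byte arithmetic. The following sketch spells it out for a tightly packed buffer of x, y, z floats; the names floats-per-vertex, stride, and vertex-offset are hypothetical helpers for illustration.

```clojure
;; Illustration only: byte layout of a tightly packed buffer of x, y, z floats.
(def floats-per-vertex 3)
(def stride (* floats-per-vertex Float/BYTES)) ; 12 bytes from one vertex to the next

(defn vertex-offset
  "Byte offset of the first component of vertex i (hypothetical helper)."
  [i]
  (* i stride))

(map vertex-offset (range 4)) ; => (0 12 24 36), the byte offsets of the four quad vertices
```

The last argument of glVertexAttribPointer is the offset of the attribute within one vertex, which is zero here because the position is the only attribute.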

Rendering the quad

We select the program and define the uniform variable iResolution.

(do
  (GL20/glUseProgram program)
  (GL20/glUniform2f (GL20/glGetUniformLocation program "iResolution")
                    window-width window-height))

Since the correct VAO is already bound from the earlier example, we are now ready to draw the quad.

(GL11/glDrawElements GL11/GL_QUADS (count indices) GL11/GL_UNSIGNED_INT 0)
(screenshot)

screenshot 1

This time the quad shows a color ramp!

Finishing up

We only delete the program since we are going to reuse the VAO in the next example.

(GL20/glDeleteProgram program)

Rendering a Texture

Getting the NASA data

We define a function to download a file from the web.

(defn download [url target]
  (with-open [in (io/input-stream url)
              out (io/output-stream target)]
    (io/copy in out)))

If it does not exist, we download the lunar color map from the NASA CGI Moon Kit.

(do
  (def moon-tif "src/opengl_visualization/lroc_color_poles_2k.tif")
  (when (not (.exists (io/file moon-tif)))
    (download
      "https://svs.gsfc.nasa.gov/vis/a000000/a004700/a004720/lroc_color_poles_2k.tif"
      moon-tif)))

Create a texture

Next we load the image using ImageIO.

(do
  (def color (ImageIO/read (io/file moon-tif)))
  (def color-raster (.getRaster color))
  (def color-width (.getWidth color-raster))
  (def color-height (.getHeight color-raster))
  (def color-channels (.getNumBands color-raster))
  (def color-pixels (int-array (* color-width color-height color-channels)))
  (.getPixels color-raster 0 0 color-width color-height color-pixels)
  [color-width color-height color-channels])
; [2048 1024 3]

Then we create an OpenGL texture from the RGB data.

(do
  (def texture-color (GL11/glGenTextures))
  (GL11/glBindTexture GL11/GL_TEXTURE_2D texture-color)
  (GL11/glTexImage2D GL11/GL_TEXTURE_2D 0 GL11/GL_RGBA color-width color-height 0
                     GL11/GL_RGB GL11/GL_UNSIGNED_BYTE
                     (make-byte-buffer (byte-array (map unchecked-byte color-pixels))))
  (GL11/glTexParameteri GL11/GL_TEXTURE_2D GL11/GL_TEXTURE_MIN_FILTER GL11/GL_LINEAR)
  (GL11/glTexParameteri GL11/GL_TEXTURE_2D GL11/GL_TEXTURE_MAG_FILTER GL11/GL_LINEAR)
  (GL11/glTexParameteri GL11/GL_TEXTURE_2D GL11/GL_TEXTURE_WRAP_S GL11/GL_REPEAT)
  (GL11/glTexParameteri GL11/GL_TEXTURE_2D GL11/GL_TEXTURE_WRAP_T GL11/GL_REPEAT)
  (GL11/glBindTexture GL11/GL_TEXTURE_2D 0))
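
The conversion with unchecked-byte deserves a short explanation. The raster returns each channel as an int between 0 and 255, while Java bytes are signed. The sketch below (variable names are illustrative) shows that the unchecked conversion keeps the bit pattern, which is what OpenGL interprets as an unsigned byte.

```clojure
;; Java bytes are signed (-128 to 127), while the raster returns ints from 0 to 255.
(def as-byte (unchecked-byte 200))    ; wraps around to -56 instead of throwing
(def restored (bit-and as-byte 0xff)) ; the bit pattern is unchanged, giving 200 again

;; The checked conversion rejects values above 127.
(def checked-conversion-fails?
  (try (byte 200) false
       (catch IllegalArgumentException _ true)))
```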

Rendering the texture

We are going to use the vertex pass through shader again.

(def vertex-tex "
#version 130

in vec3 point;

void main()
{
  gl_Position = vec4(point, 1);
}")

The fragment shader now uses the texture function to look up color values from a texture.

(def fragment-tex "
#version 130

uniform vec2 iResolution;
uniform sampler2D moon;
out vec4 fragColor;

void main()
{
  fragColor = texture(moon, gl_FragCoord.xy / iResolution.xy);
}")

We compile and link the shaders to create a program.

(do
  (def vertex-tex-shader (make-shader vertex-tex GL20/GL_VERTEX_SHADER))
  (def fragment-tex-shader (make-shader fragment-tex GL20/GL_FRAGMENT_SHADER))
  (def tex-program (make-program vertex-tex-shader fragment-tex-shader)))

We need to set up the layout of the vertex data again.

(do
  (GL20/glVertexAttribPointer (GL20/glGetAttribLocation tex-program "point") 3
                              GL11/GL_FLOAT false (* 3 Float/BYTES) (* 0 Float/BYTES))
  (GL20/glEnableVertexAttribArray 0))

We set the resolution and bind the texture to the texture slot number 0.

(do
  (GL20/glUseProgram tex-program)
  (GL20/glUniform2f (GL20/glGetUniformLocation tex-program "iResolution")
                    window-width window-height)
  (GL20/glUniform1i (GL20/glGetUniformLocation tex-program "moon") 0)
  (GL13/glActiveTexture GL13/GL_TEXTURE0)
  (GL11/glBindTexture GL11/GL_TEXTURE_2D texture-color))

The quad now is textured!

(do
  (GL11/glDrawElements GL11/GL_QUADS (count indices) GL11/GL_UNSIGNED_INT 0)
  (screenshot))

screenshot 2

Finishing up

We create a convenience function to tear down the VAO, VBO, and IBO.

(defn teardown-vao [{:keys [vao vbo ibo]}]
  (GL15/glBindBuffer GL15/GL_ELEMENT_ARRAY_BUFFER 0)
  (GL15/glDeleteBuffers ibo)
  (GL15/glBindBuffer GL15/GL_ARRAY_BUFFER 0)
  (GL15/glDeleteBuffers vbo)
  (GL30/glBindVertexArray 0)
  (GL30/glDeleteVertexArrays vao))

We tear down the quad.

(teardown-vao vao)

We also delete the program.

(GL20/glDeleteProgram tex-program)

Render a 3D cube

Create vertex data

If we want to render a cube, we need to define 8 vertices.

(def vertices-cube
  (float-array [-1.0 -1.0 -1.0
                 1.0 -1.0 -1.0
                 1.0  1.0 -1.0
                -1.0  1.0 -1.0
                -1.0 -1.0  1.0
                 1.0 -1.0  1.0
                 1.0  1.0  1.0
                -1.0  1.0  1.0]))

The cube is made up of 6 quads, with 4 vertex indices per quad. So we require 6 * 4 = 24 indices.

(def indices-cube
  (int-array [0 1 2 3
              7 6 5 4
              0 3 7 4
              5 6 2 1
              3 2 6 7
              4 5 1 0]))
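
A quick sanity check for such an index list is to count how often each corner is used. The sketch below repeats the indices as a plain vector (named cube-indices here so that the snippet stands alone) and verifies that every corner belongs to exactly three faces.

```clojure
;; Same index list as indices-cube above, repeated so the snippet stands alone.
(def cube-indices
  [0 1 2 3
   7 6 5 4
   0 3 7 4
   5 6 2 1
   3 2 6 7
   4 5 1 0])

;; Each of the 8 corners should appear in exactly 3 of the 6 faces.
(def corner-usage (frequencies cube-indices))

(count corner-usage)      ; => 8
(set (vals corner-usage)) ; => #{3}
```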

Initialize vertex buffer array

We use the function from earlier to set up the VAO, VBO, and IBO.

(def vao-cube (setup-vao vertices-cube indices-cube))

Shader program mapping texture onto cube

We first define a vertex shader, which takes cube coordinates, rotates, translates, and projects them.

(def vertex-moon "
#version 130

uniform float fov;
uniform float alpha;
uniform float beta;
uniform float distance;
uniform vec2 iResolution;
in vec3 point;
out vec3 vpoint;

void main()
{
  // Rotate and translate vertex
  mat3 rot_y = mat3(vec3(cos(alpha), 0, sin(alpha)),
                    vec3(0, 1, 0),
                    vec3(-sin(alpha), 0, cos(alpha)));
  mat3 rot_x = mat3(vec3(1, 0, 0),
                    vec3(0, cos(beta), -sin(beta)),
                    vec3(0, sin(beta), cos(beta)));
  vec3 p = rot_x * rot_y * point + vec3(0, 0, distance);

  // Project vertex creating normalized device coordinates
  float f = 1.0 / tan(fov / 2.0);
  float aspect = iResolution.x / iResolution.y;
  float proj_x = p.x / p.z * f;
  float proj_y = p.y / p.z * f * aspect;
  float proj_z = p.z / (2.0 * distance);

  // Output to shader pipeline.
  gl_Position = vec4(proj_x, proj_y, proj_z, 1);
  vpoint = point;
}")
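
To check the projection at the REPL, the shader math can be ported to Clojure. The following is a sketch under the assumption that I have expanded the matrix products correctly (the mat3 constructor of GLSL takes columns); the function project and the camera map are hypothetical and not part of the article's code.

```clojure
(require '[clojure.math :as m])

;; Hypothetical CPU-side port of the vertex shader, for checking values at the REPL.
;; GLSL's mat3 constructor takes columns, so rot_y * point expands as written below.
(defn project
  [{:keys [fov alpha beta distance width height]} [x y z]]
  (let [;; rot_y * point
        x1 (- (* x (m/cos alpha)) (* z (m/sin alpha)))
        y1 y
        z1 (+ (* x (m/sin alpha)) (* z (m/cos alpha)))
        ;; rot_x * (rot_y * point)
        x2 x1
        y2 (+ (* y1 (m/cos beta)) (* z1 (m/sin beta)))
        z2 (- (* z1 (m/cos beta)) (* y1 (m/sin beta)))
        ;; translate away from the camera
        pz (+ z2 distance)
        f  (/ 1.0 (m/tan (/ fov 2.0)))
        aspect (/ (double width) height)]
    [(* (/ x2 pz) f)
     (* (/ y2 pz) f aspect)
     (/ pz (* 2.0 distance))]))

(def camera {:fov (m/to-radians 25.0) :alpha 0.0 :beta 0.0
             :distance 10.0 :width 640 :height 480})

(project camera [0.0 0.0 0.0])  ; => [0.0 0.0 0.5], the cube center lands at the window center
(project camera [1.0 1.0 -1.0]) ; a corner on the near side of the cube
```

Points closer to the camera get a smaller depth value than the center, and the y coordinate is scaled by the aspect ratio so that the cube does not appear stretched in a non-square window.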

The fragment shader maps the texture onto the cube.

(def fragment-moon "
#version 130

#define PI 3.1415926535897932384626433832795

uniform sampler2D moon;
in vec3 vpoint;
out vec4 fragColor;

vec2 lonlat(vec3 p)
{
  float lon = atan(p.x, -p.z) / (2.0 * PI) + 0.5;
  float lat = atan(p.y, length(p.xz)) / PI + 0.5;
  return vec2(lon, lat);
}

vec3 color(vec2 lonlat)
{
  return texture(moon, lonlat).rgb;
}

void main()
{
  fragColor = vec4(color(lonlat(vpoint)).rgb, 1);
}")
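
The lonlat function converts a 3D point to coordinates in the equirectangular texture. A Clojure port makes it easy to try a few points; this is a sketch for illustration, and the Clojure function name simply mirrors the shader function.

```clojure
(require '[clojure.math :as m])

;; Hypothetical Clojure port of the shader's lonlat function (illustration only).
(defn lonlat
  "Map a 3D point to equirectangular texture coordinates [u v] in the range 0 to 1."
  [[x y z]]
  [(+ (/ (m/atan2 x (- z)) (* 2.0 m/PI)) 0.5)
   (+ (/ (m/atan2 y (m/hypot x z)) m/PI) 0.5)])

(lonlat [0.0 0.0 -1.0]) ; => [0.5 0.5], the texture center
(lonlat [1.0 0.0 0.0])  ; => [0.75 0.5], a quarter turn around the equator
```

With alpha and beta set to zero, the point with z = -1 is the one closest to the camera, so the center of the texture faces the viewer.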

We compile and link the shaders.

(do
  (def vertex-shader-moon (make-shader vertex-moon GL30/GL_VERTEX_SHADER))
  (def fragment-shader-moon (make-shader fragment-moon GL30/GL_FRAGMENT_SHADER))
  (def program-moon (make-program vertex-shader-moon fragment-shader-moon)))

We need to set up the memory layout again.

(do
  (GL20/glVertexAttribPointer (GL20/glGetAttribLocation program-moon "point") 3
                              GL11/GL_FLOAT false (* 3 Float/BYTES) (* 0 Float/BYTES))
  (GL20/glEnableVertexAttribArray 0))

Rendering the cube

This shader program requires setup of several uniforms and a texture.

(do
  (GL20/glUseProgram program-moon)
  (GL20/glUniform2f (GL20/glGetUniformLocation program-moon "iResolution")
                    window-width window-height)
  (GL20/glUniform1f (GL20/glGetUniformLocation program-moon "fov") (to-radians 25.0))
  (GL20/glUniform1f (GL20/glGetUniformLocation program-moon "alpha") (to-radians 30.0))
  (GL20/glUniform1f (GL20/glGetUniformLocation program-moon "beta") (to-radians -20.0))
  (GL20/glUniform1f (GL20/glGetUniformLocation program-moon "distance") 10.0)
  (GL20/glUniform1i (GL20/glGetUniformLocation program-moon "moon") 0)
  (GL13/glActiveTexture GL13/GL_TEXTURE0)
  (GL11/glBindTexture GL11/GL_TEXTURE_2D texture-color))

We enable back face culling to only render the front faces of the cube. Then we clear the window and render the cube.

(do
  (GL11/glEnable GL11/GL_CULL_FACE)
  (GL11/glCullFace GL11/GL_BACK)
  (GL11/glClearColor 0.0 0.0 0.0 1.0)
  (GL11/glClear GL11/GL_COLOR_BUFFER_BIT)
  (GL11/glDrawElements GL11/GL_QUADS (count indices-cube) GL11/GL_UNSIGNED_INT 0)
  (screenshot))

screenshot 3

This looks interesting but it is not a good approximation of the moon.

Finishing up

To finish up we delete the vertex data for the cube.

(teardown-vao vao-cube)

Approximating a sphere

Creating the vertex data

First we partition the vertex data and convert the triplets to 8 Fastmath vectors.

(def points
  (map #(apply vec3 %)
       (partition 3 vertices-cube)))
points
; ([-1.0 -1.0 -1.0]
;  [1.0 -1.0 -1.0]
;  [1.0 1.0 -1.0]
;  [-1.0 1.0 -1.0]
;  [-1.0 -1.0 1.0]
;  [1.0 -1.0 1.0]
;  [1.0 1.0 1.0]
;  [-1.0 1.0 1.0])

Then we use the index array to get the coordinates of the first corner of each face resulting in 6 Fastmath vectors.

(def corners
  (map (fn [[i _ _ _]] (nth points i))
       (partition 4 indices-cube)))
corners
; ([-1.0 -1.0 -1.0]
;  [-1.0 1.0 1.0]
;  [-1.0 -1.0 -1.0]
;  [1.0 -1.0 1.0]
;  [-1.0 1.0 -1.0]
;  [-1.0 -1.0 1.0])

We get the first spanning vector of each face by subtracting the first corner from the second.

(def u-vectors
  (map (fn [[i j _ _]] (sub (nth points j) (nth points i)))
       (partition 4 indices-cube)))
u-vectors
; ([2.0 0.0 0.0]
;  [2.0 0.0 0.0]
;  [0.0 2.0 0.0]
;  [0.0 2.0 0.0]
;  [2.0 0.0 0.0]
;  [2.0 0.0 0.0])

We get the second spanning vector of each face by subtracting the first corner from the fourth.

(def v-vectors
  (map (fn [[i _ _ l]] (sub (nth points l) (nth points i)))
       (partition 4 indices-cube)))
v-vectors
; ([0.0 2.0 0.0]
;  [0.0 -2.0 0.0]
;  [0.0 0.0 2.0]
;  [0.0 0.0 -2.0]
;  [0.0 0.0 2.0]
;  [0.0 0.0 -2.0])

We can now use vector math to subsample the faces and project the points onto a sphere by normalizing the vectors and multiplying with the moon radius.

(defn sphere-points [n c u v]
  (for [j (range (inc n)) i (range (inc n))]
       (mult (normalize (add c (add (mult u (/ i n)) (mult v (/ j n))))) radius)))

Subdividing once results in 9 corners for a cube face.

(sphere-points 2 (nth corners 0) (nth u-vectors 0) (nth v-vectors 0))
; ([-1003.088357690056 -1003.088357690056 -1003.088357690056]
;  [0.0 -1228.5273216335077 -1228.5273216335077]
;  [1003.088357690056 -1003.088357690056 -1003.088357690056]
;  [-1228.5273216335077 0.0 -1228.5273216335077]
;  [0.0 0.0 -1737.4]
;  [1228.5273216335077 0.0 -1228.5273216335077]
;  [-1003.088357690056 1003.088357690056 -1003.088357690056]
;  [0.0 1228.5273216335077 -1228.5273216335077]
;  [1003.088357690056 1003.088357690056 -1003.088357690056])
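
The projection step can also be checked without Fastmath. The sketch below uses plain Clojure vectors; to-sphere and distance-from-center are hypothetical helpers which only illustrate that normalizing and scaling puts every point at the same distance from the center.

```clojure
;; Dependency-free sketch of the projection step (the article itself uses Fastmath).
(def moon-radius 1737.4)

(defn distance-from-center [[x y z]]
  (Math/sqrt (+ (* x x) (* y y) (* z z))))

(defn to-sphere
  "Scale the point so that it lies on a sphere of the given radius."
  [radius point]
  (let [len (distance-from-center point)]
    (mapv #(* radius (/ % len)) point)))

(to-sphere moon-radius [0.0 0.0 -1.0])                          ; face center, stays on the axis
(distance-from-center (to-sphere moon-radius [-1.0 -1.0 -1.0])) ; cube corner, now at the moon radius
```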

We also need a function to generate the indices for the quads.

(defn sphere-indices [n face]
  (for [j (range n) i (range n)]
       (let [offset (+ (* face (inc n) (inc n)) (* j (inc n)) i)]
         [offset (inc offset) (+ offset n 2) (+ offset n 1)])))

Subdividing once results in 4 quads for a cube face.

(sphere-indices 2 0)
; ([0 1 4 3] [1 2 5 4] [3 4 7 6] [4 5 8 7])
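
The size of the resulting mesh follows directly from the two functions above: each face has (n + 1) × (n + 1) vertices and n × n quads with four indices each. The helper names below are hypothetical. Note that vertices on shared cube edges are stored once per face, so they are duplicated.

```clojure
;; Hypothetical helpers (not from the article) giving the mesh size for subdivision n.
(defn vertex-count [n] (* 6 (inc n) (inc n)))
(defn index-count [n] (* 6 n n 4))

(vertex-count 2)  ; => 54
(index-count 2)   ; => 96
(vertex-count 16) ; => 1734
(index-count 16)  ; => 6144
```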

Rendering a coarse approximation of the sphere

We subdivide once (n=2) and create a VAO with the data.

(do
  (def n 2)
  (def vertices-sphere (float-array (flatten (map (partial sphere-points n)
                                                  corners u-vectors v-vectors))))
  (def indices-sphere (int-array (flatten (map (partial sphere-indices n) (range 6)))))
  (def vao-sphere (setup-vao vertices-sphere indices-sphere)))

The layout needs to be configured again.

(do
  (GL20/glVertexAttribPointer (GL20/glGetAttribLocation program-moon "point") 3
                              GL11/GL_FLOAT false (* 3 Float/BYTES) (* 0 Float/BYTES))
  (GL20/glEnableVertexAttribArray 0))

The distance needs to be increased, because the points are on a sphere with the radius of the moon.

(GL20/glUniform1f (GL20/glGetUniformLocation program-moon "distance") (* radius 10.0))

Rendering the mesh now results in a better approximation of a sphere.

(do
  (GL11/glClear GL11/GL_COLOR_BUFFER_BIT)
  (GL11/glDrawElements GL11/GL_QUADS (count indices-sphere) GL11/GL_UNSIGNED_INT 0)
  (screenshot))

screenshot 4

(teardown-vao vao-sphere)

Rendering a fine approximation of the sphere

To get a high quality approximation we subdivide more and create a VAO with the data.

(do
  (def n2 16)
  (def vertices-sphere-high (float-array (flatten (map (partial sphere-points n2) corners u-vectors v-vectors))))
  (def indices-sphere-high (int-array (flatten (map (partial sphere-indices n2) (range 6)))))
  (def vao-sphere-high (setup-vao vertices-sphere-high indices-sphere-high)))

We set up the vertex layout again.

(do
  (GL20/glVertexAttribPointer (GL20/glGetAttribLocation program-moon "point") 3
                              GL11/GL_FLOAT false (* 3 Float/BYTES) (* 0 Float/BYTES))
  (GL20/glEnableVertexAttribArray 0))

Rendering the mesh now results in a spherical mesh with a texture.

(do
  (GL11/glClear GL11/GL_COLOR_BUFFER_BIT)
  (GL11/glDrawElements GL11/GL_QUADS (count indices-sphere-high) GL11/GL_UNSIGNED_INT 0)
  (screenshot))

screenshot 5

(GL20/glDeleteProgram program-moon)

Adding ambient and diffuse reflection

In order to introduce lighting we add ambient and diffuse lighting to the fragment shader. We use the ambient and diffuse lighting from the Phong shading model.

  • The ambient light is a constant value.
  • The diffuse light is calculated using the dot product of the light vector and the normal vector.

(def fragment-moon-diffuse "
#version 130

#define PI 3.1415926535897932384626433832795

uniform vec3 light;
uniform float ambient;
uniform float diffuse;
uniform sampler2D moon;
in vec3 vpoint;
out vec4 fragColor;

vec2 lonlat(vec3 p)
{
  float lon = atan(p.x, -p.z) / (2.0 * PI) + 0.5;
  float lat = atan(p.y, length(p.xz)) / PI + 0.5;
  return vec2(lon, lat);
}

vec3 color(vec2 lonlat)
{
  return texture(moon, lonlat).rgb;
}

void main()
{
  float phong = ambient + diffuse * max(0.0, dot(light, normalize(vpoint)));
  fragColor = vec4(color(lonlat(vpoint)) * phong, 1);
}")
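
The lighting term is simple enough to evaluate by hand. The following sketch reproduces it in plain Clojure; dot3, unit, and lighting are illustrative names, and the values for ambient and diffuse are the ones used further below.

```clojure
;; CPU-side sketch of the shader's lighting term (names are illustrative).
(defn dot3 [a b] (reduce + (map * a b)))

(defn unit [v]
  (let [len (Math/sqrt (dot3 v v))]
    (mapv #(/ % len) v)))

(defn lighting
  "ambient + diffuse * max(0, light . normal), as in the fragment shader."
  [ambient diffuse light normal]
  (+ ambient (* diffuse (max 0.0 (dot3 light normal)))))

(def light-dir (unit [-1.0 0.0 -1.0]))

(lighting 0.0 1.6 light-dir [0.0 0.0 -1.0]) ; surface facing the camera: about 1.13
(lighting 0.0 1.6 light-dir [1.0 0.0 0.0])  ; surface facing away from the light: 0.0
```

The max function clamps the dot product at zero, so the side facing away from the light receives only the ambient term, which is zero in this article.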

We reuse the vertex shader from the previous example and the new fragment shader.

(do
  (def vertex-shader-diffuse (make-shader vertex-moon GL30/GL_VERTEX_SHADER))
  (def fragment-shader-diffuse (make-shader fragment-moon-diffuse GL30/GL_FRAGMENT_SHADER))
  (def program-diffuse (make-program vertex-shader-diffuse fragment-shader-diffuse)))

We set up the vertex data layout again.

(do
  (GL20/glVertexAttribPointer (GL20/glGetAttribLocation program-diffuse "point") 3
                              GL11/GL_FLOAT false (* 3 Float/BYTES) (* 0 Float/BYTES))
  (GL20/glEnableVertexAttribArray 0))

A normalized light vector is defined.

(def light (normalize (vec3 -1 0 -1)))

Before rendering we need to set up the various uniform values.

(do
  (GL20/glUseProgram program-diffuse)
  (GL20/glUniform2f (GL20/glGetUniformLocation program-diffuse "iResolution")
                    window-width window-height)
  (GL20/glUniform1f (GL20/glGetUniformLocation program-diffuse "fov") (to-radians 20.0))
  (GL20/glUniform1f (GL20/glGetUniformLocation program-diffuse "alpha") (to-radians 0.0))
  (GL20/glUniform1f (GL20/glGetUniformLocation program-diffuse "beta") (to-radians 0.0))
  (GL20/glUniform1f (GL20/glGetUniformLocation program-diffuse "distance") (* radius 10.0))
  (GL20/glUniform1f (GL20/glGetUniformLocation program-diffuse "ambient") 0.0)
  (GL20/glUniform1f (GL20/glGetUniformLocation program-diffuse "diffuse") 1.6)
  (GL20/glUniform3f (GL20/glGetUniformLocation program-diffuse "light")
                    (light 0) (light 1) (light 2))
  (GL20/glUniform1i (GL20/glGetUniformLocation program-diffuse "moon") 0)
  (GL13/glActiveTexture GL13/GL_TEXTURE0)
  (GL11/glBindTexture GL11/GL_TEXTURE_2D texture-color))

Finally we are ready to render the mesh with diffuse shading.

(do
  (GL11/glClear GL11/GL_COLOR_BUFFER_BIT)
  (GL11/glDrawElements GL11/GL_QUADS (count indices-sphere-high) GL11/GL_UNSIGNED_INT 0)
  (screenshot))

screenshot 6

Afterwards we delete the shader program.

(GL20/glDeleteProgram program-diffuse)

Using normal mapping

Load elevation data into texture

In the final section we also want to add normal mapping in order to get realistic shading of craters.

The lunar elevation data is downloaded from NASA’s website.

(do
  (def moon-ldem "src/opengl_visualization/ldem_4.tif")
  (when (not (.exists (io/file moon-ldem)))
    (download "https://svs.gsfc.nasa.gov/vis/a000000/a004700/a004720/ldem_4.tif"
              moon-ldem)))

The image is read using ImageIO and the floating point elevation data is extracted.

(do
  (def ldem (ImageIO/read (io/file moon-ldem)))
  (def ldem-raster (.getRaster ldem))
  (def ldem-width (.getWidth ldem))
  (def ldem-height (.getHeight ldem))
  (def ldem-pixels (float-array (* ldem-width ldem-height)))
  (do (.getPixels ldem-raster 0 0 ldem-width ldem-height ldem-pixels) nil)
  (def resolution (/ (* 2.0 PI radius) ldem-width))
  [ldem-width ldem-height])
; [1440 720]
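
The value of resolution is the distance on the ground covered by one pixel at the equator. As a worked example, using the radius and the image width from above (the names moon-radius, map-width, and km-per-pixel are chosen for this sketch):

```clojure
(def moon-radius 1737.4) ; km, as defined earlier
(def map-width 1440)     ; pixels, the width of ldem_4.tif reported above

;; Circumference divided by the number of pixels around the equator.
(def km-per-pixel (/ (* 2.0 Math/PI moon-radius) map-width))
;; roughly 7.58 km per pixel
```

With 1440 pixels for 360 degrees, the map has 4 pixels per degree of longitude.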

The floating point pixel data is converted into a texture.

(do
  (def texture-ldem (GL11/glGenTextures))
  (GL11/glBindTexture GL11/GL_TEXTURE_2D texture-ldem)
  (GL11/glTexParameteri GL11/GL_TEXTURE_2D GL11/GL_TEXTURE_MIN_FILTER GL11/GL_LINEAR)
  (GL11/glTexParameteri GL11/GL_TEXTURE_2D GL11/GL_TEXTURE_MAG_FILTER GL11/GL_LINEAR)
  (GL11/glTexParameteri GL11/GL_TEXTURE_2D GL11/GL_TEXTURE_WRAP_S GL11/GL_REPEAT)
  (GL11/glTexParameteri GL11/GL_TEXTURE_2D GL11/GL_TEXTURE_WRAP_T GL11/GL_REPEAT)
  (GL11/glTexImage2D GL11/GL_TEXTURE_2D 0 GL30/GL_R32F ldem-width ldem-height 0
                     GL11/GL_RED GL11/GL_FLOAT ldem-pixels))

Create shader program with normal mapping

We reuse the vertex shader from the previous section.

The fragment shader this time is more involved.

  • A horizon matrix with normal, tangent, and bitangent vectors is computed.
  • The elevation is sampled in four directions from the current 3D point.
  • The elevation values are used to create two surface vectors.
  • The cross product of the surface vectors is computed and normalized to get the normal vector.
  • This perturbed normal vector is now used to compute diffuse lighting.

(def fragment-normal "
#version 130

#define PI 3.1415926535897932384626433832795

uniform vec3 light;
uniform float ambient;
uniform float diffuse;
uniform float resolution;
uniform sampler2D moon;
uniform sampler2D ldem;
in vec3 vpoint;
in mat3 horizon;
out vec4 fragColor;

vec3 orthogonal_vector(vec3 n)
{
  vec3 b;
  if (abs(n.x) <= abs(n.y)) {
    if (abs(n.x) <= abs(n.z))
      b = vec3(1, 0, 0);
    else
      b = vec3(0, 0, 1);
  } else {
    if (abs(n.y) <= abs(n.z))
      b = vec3(0, 1, 0);
    else
      b = vec3(0, 0, 1);
  };
  return normalize(cross(n, b));
}

mat3 oriented_matrix(vec3 n)
{
  vec3 o1 = orthogonal_vector(n);
  vec3 o2 = cross(n, o1);
  return mat3(n, o1, o2);
}

vec2 lonlat(vec3 p)
{
  float lon = atan(p.x, -p.z) / (2.0 * PI) + 0.5;
  float lat = atan(p.y, length(p.xz)) / PI + 0.5;
  return vec2(lon, lat);
}

vec3 color(vec2 lonlat)
{
  return texture(moon, lonlat).rgb;
}

float elevation(vec3 p)
{
  return texture(ldem, lonlat(p)).r;
}

vec3 normal(mat3 horizon, vec3 p)
{
  vec3 pl = p + horizon * vec3(0, -1,  0) * resolution;
  vec3 pr = p + horizon * vec3(0,  1,  0) * resolution;
  vec3 pu = p + horizon * vec3(0,  0, -1) * resolution;
  vec3 pd = p + horizon * vec3(0,  0,  1) * resolution;
  vec3 u = horizon * vec3(elevation(pr) - elevation(pl), 2 * resolution, 0);
  vec3 v = horizon * vec3(elevation(pd) - elevation(pu), 0, 2 * resolution);
  return normalize(cross(u, v));
}

void main()
{
  mat3 horizon = oriented_matrix(normalize(vpoint));
  float phong = ambient + diffuse * max(0.0, dot(light, normal(horizon, vpoint)));
  fragColor = vec4(color(lonlat(vpoint)).rgb * phong, 1);
}")
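
The core of the normal function can be tried out in the local horizon frame, where the first axis points up and the other two axes are tangent to the surface. The sketch below leaves out the rotation with the horizon matrix, assuming that the matrix is a pure rotation so that it can be applied after the cross product. The names cross3, unit, and local-normal are illustrative.

```clojure
;; Sketch of the normal computation in the local horizon frame (first axis = up).
(defn cross3 [[ax ay az] [bx by bz]]
  [(- (* ay bz) (* az by))
   (- (* az bx) (* ax bz))
   (- (* ax by) (* ay bx))])

(defn unit [v]
  (let [len (Math/sqrt (reduce + (map * v v)))]
    (mapv #(/ % len) v)))

(defn local-normal
  "dh1 and dh2 are the elevation differences across 2 * resolution along the two tangent axes."
  [dh1 dh2 resolution]
  (unit (cross3 [dh1 (* 2.0 resolution) 0.0]
                [dh2 0.0 (* 2.0 resolution)])))

(local-normal 0.0 0.0 7.58) ; flat terrain: the normal points straight up
(local-normal 1.0 0.0 7.58) ; rising terrain tilts the normal against the slope
```

For flat terrain the result is the unperturbed surface normal, and any elevation difference tilts the normal away from the direction in which the terrain rises.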

We reuse the vertex shader from the previous example and the new fragment shader.

(do
  (def vertex-shader-normal (make-shader vertex-moon GL30/GL_VERTEX_SHADER))
  (def fragment-shader-normal (make-shader fragment-normal GL30/GL_FRAGMENT_SHADER))
  (def program-normal (make-program vertex-shader-normal fragment-shader-normal)))

We set up the vertex data layout again.

(do
  (GL20/glVertexAttribPointer (GL20/glGetAttribLocation program-normal "point") 3
                              GL11/GL_FLOAT false (* 3 Float/BYTES) (* 0 Float/BYTES))
  (GL20/glEnableVertexAttribArray 0))

Apart from the uniform values we also need to set up two textures this time: the color texture and the elevation texture.

(do
  (GL20/glUseProgram program-normal)
  (GL20/glUniform2f (GL20/glGetUniformLocation program-normal "iResolution")
                    window-width window-height)
  (GL20/glUniform1f (GL20/glGetUniformLocation program-normal "fov") (to-radians 20.0))
  (GL20/glUniform1f (GL20/glGetUniformLocation program-normal "alpha") (to-radians 0.0))
  (GL20/glUniform1f (GL20/glGetUniformLocation program-normal "beta") (to-radians 0.0))
  (GL20/glUniform1f (GL20/glGetUniformLocation program-normal "distance") (* radius 10.0))
  (GL20/glUniform1f (GL20/glGetUniformLocation program-normal "resolution") resolution)
  (GL20/glUniform1f (GL20/glGetUniformLocation program-normal "ambient") 0.0)
  (GL20/glUniform1f (GL20/glGetUniformLocation program-normal "diffuse") 1.6)
  (GL20/glUniform3f (GL20/glGetUniformLocation program-normal "light")
                    (light 0) (light 1) (light 2))
  (GL20/glUniform1i (GL20/glGetUniformLocation program-normal "moon") 0)
  (GL20/glUniform1i (GL20/glGetUniformLocation program-normal "ldem") 1)
  (GL13/glActiveTexture GL13/GL_TEXTURE0)
  (GL11/glBindTexture GL11/GL_TEXTURE_2D texture-color)
  (GL13/glActiveTexture GL13/GL_TEXTURE1)
  (GL11/glBindTexture GL11/GL_TEXTURE_2D texture-ldem))

Finally we are ready to render the mesh with normal mapping.

(do
  (GL11/glClear GL11/GL_COLOR_BUFFER_BIT)
  (GL11/glDrawElements GL11/GL_QUADS (count indices-sphere-high) GL11/GL_UNSIGNED_INT 0)
  (screenshot))

screenshot 7

Afterwards we delete the shader program and the vertex data.

(GL20/glDeleteProgram program-normal)
(teardown-vao vao-sphere-high)

And the textures.

(GL11/glDeleteTextures texture-color)
(GL11/glDeleteTextures texture-ldem)

Finalizing GLFW

When we are finished, we destroy the window.

(GLFW/glfwDestroyWindow window)

Finally we terminate use of the GLFW library.

(GLFW/glfwTerminate)

I hope you liked this 3D graphics example.

Note that in practice you will

  • use higher resolution data and map the data onto texture tiles
  • generate textures containing normal maps offline
  • create a multiresolution map
  • use tessellation to increase the mesh resolution
  • use elevation data to deform the mesh

Thanks to Timothy Pratley for helping to get this post online.

Developing a Space Flight Simulator in Clojure

In 2017 I discovered the Orbiter 2016 space flight simulator, which was free of charge but proprietary at the time, and it inspired me to develop a space flight simulator myself. I prototyped some rigid body physics in C and later in GNU Guile and also prototyped loading and rendering of Wavefront OBJ files. I used GNU Guile (a Scheme implementation) because it has a good native interface and of course it has hygienic macros. Eventually I got interested in Clojure because it has more generic multi-methods as well as fast hash maps and vectors. I finally decided to develop the game for real in Clojure. I have been developing a space flight simulator in Clojure for almost 5 years now. While using Clojure I have come to appreciate the immutable values and safe parallelism using atoms, agents, and refs.

In the beginning I decided to work on the hard parts first, which for me were 3D rendering of a planet, an atmosphere, shadows, and volumetric clouds. I read the OpenGL Superbible to get an understanding of what functionality OpenGL provides. When Orbiter was eventually open sourced and released under the MIT license, I inspected the source code and discovered that about 90% of the code is graphics-related. So starting with the graphics problems was not a bad decision.

Software dependencies

The following software is used for development. The software libraries run on both GNU/Linux and Microsoft Windows.

  • Clojure the programming language
  • LWJGL provides Java wrappers for various libraries
    • lwjgl-opengl for 3D graphics
    • lwjgl-glfw for windowing and input devices
    • lwjgl-nuklear for graphical user interfaces
    • lwjgl-stb for image I/O and using truetype fonts
    • lwjgl-assimp to load glTF 3D models with animation data
  • Jolt Physics to simulate wheeled vehicles and collisions with meshes
  • Fastmath for fast matrix and vector math as well as spline interpolation
  • Comb for templating shader code
  • Instaparse to parse NASA Planetary Constant Kernel (PCK) files
  • Gloss to parse NASA Double Precision Array Files (DAF)
  • Coffi as a foreign function interface
  • core.memoize for least recently used caching of function results
  • Apache Commons Compress to read map tiles from tar files
  • Malli to add schemas to functions
  • Immuconf to load the configuration file
  • Progrock a progress bar for long running builds
  • Claypoole to implement parallel for loops
  • Midje for test-driven development
  • tools.build to build the project
  • clj-async-profiler Clojure profiler creating flame graphs
  • slf4j-timbre Java logging implementation for Clojure

The deps.edn file contains operating system dependent LWJGL bindings. For example on GNU/Linux the deps.edn file contains the following:

{:deps {; ...
        org.lwjgl/lwjgl {:mvn/version "3.3.6"}
        org.lwjgl/lwjgl$natives-linux {:mvn/version "3.3.6"}
        org.lwjgl/lwjgl-opengl {:mvn/version "3.3.6"}
        org.lwjgl/lwjgl-opengl$natives-linux {:mvn/version "3.3.6"}
        org.lwjgl/lwjgl-glfw {:mvn/version "3.3.6"}
        org.lwjgl/lwjgl-glfw$natives-linux {:mvn/version "3.3.6"}
        org.lwjgl/lwjgl-nuklear {:mvn/version "3.3.6"}
        org.lwjgl/lwjgl-nuklear$natives-linux {:mvn/version "3.3.6"}
        org.lwjgl/lwjgl-stb {:mvn/version "3.3.6"}
        org.lwjgl/lwjgl-stb$natives-linux {:mvn/version "3.3.6"}
        org.lwjgl/lwjgl-assimp {:mvn/version "3.3.6"}
        org.lwjgl/lwjgl-assimp$natives-linux {:mvn/version "3.3.6"}}
        ; ...
        }

In order to manage the different dependencies for Microsoft Windows, a separate Git branch is maintained.

Atmosphere rendering

For the atmosphere, Bruneton’s precomputed atmospheric scattering was used. The implementation uses a 2D transmittance table, a 2D surface scattering table, a 4D Rayleigh scattering table, and a 4D Mie scattering table. The tables are computed using several iterations of numerical integration. Higher order functions for integration over a sphere and over a line segment were implemented in Clojure. For example, integration over a ray in 3D space (using fastmath vectors) was implemented as follows:

(defn integral-ray
  "Integrate given function over a ray in 3D space"
  {:malli/schema [:=> [:cat ray N :double [:=> [:cat [:vector :double]] :some]] :some]}
  [{::keys [origin direction]} steps distance fun]
  (let [stepsize      (/ distance steps)
        samples       (mapv #(* (+ 0.5 %) stepsize) (range steps))
        interpolate   (fn interpolate [s] (add origin (mult direction s)))
        direction-len (mag direction)]
    (reduce add (mapv #(-> % interpolate fun (mult (* stepsize direction-len))) samples))))
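The sampling scheme used by integral-ray is the midpoint rule: each interval is sampled at its centre and weighted with the step size. The following is a minimal sketch of the same idea for a scalar function over a line segment, using plain Clojure numbers instead of fastmath vectors. The name integral-segment is hypothetical and not taken from the project.

```clojure
(defn integral-segment
  "Midpoint-rule integral of a scalar function over the interval [a, b]"
  [a b steps fun]
  (let [stepsize (/ (- (double b) (double a)) steps)
        ;; sample the centre of each interval, like (+ 0.5 %) in integral-ray
        samples  (mapv #(+ a (* (+ 0.5 %) stepsize)) (range steps))]
    (* stepsize (reduce + (map fun samples)))))

;; integral of x^2 over [0, 1], which is close to 1/3
(integral-segment 0.0 1.0 100 (fn [x] (* x x)))
```

The midpoint rule is exact for linear functions and its error shrinks quadratically with the step size, which is one reason to sample interval centres rather than interval borders.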

Precomputing the atmospheric tables takes several hours even though pmap was used. When sampling the multi-dimensional functions, pmap was used as a top-level loop and map was used for interior loops. Using java.nio.ByteBuffer the floating point values were converted to a byte array and then written to disk using a clojure.java.io/output-stream:

(defn floats->bytes
  "Convert float array to byte buffer"
  [^floats float-data]
  (let [n           (count float-data)
        byte-buffer (.order (ByteBuffer/allocate (* n 4)) ByteOrder/LITTLE_ENDIAN)]
    (.put (.asFloatBuffer byte-buffer) float-data)
    (.array byte-buffer)))

(defn spit-bytes
  "Write bytes to a file"
  {:malli/schema [:=> [:cat non-empty-string bytes?] :nil]}
  [^String file-name ^bytes byte-data]
  (with-open [out (io/output-stream file-name)]
    (.write out byte-data)))

(defn spit-floats
  "Write floating point numbers to a file"
  {:malli/schema [:=> [:cat non-empty-string seqable?] :nil]}
  [^String file-name ^floats float-data]
  (spit-bytes file-name (floats->bytes float-data)))
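Loading the tables again requires the inverse conversion. The following is a minimal sketch of how the little-endian files could be read back into float arrays. The names bytes->floats and slurp-floats are assumptions chosen to mirror the functions above and are not necessarily the ones used in the project.

```clojure
(import '[java.io File]
        '[java.nio ByteBuffer ByteOrder FloatBuffer]
        '[java.nio.file Files])

(defn bytes->floats
  "Convert a little-endian byte array back to a float array"
  [^bytes byte-data]
  (let [n            (quot (alength byte-data) 4)
        float-buffer (.asFloatBuffer (.order (ByteBuffer/wrap byte-data) ByteOrder/LITTLE_ENDIAN))
        result       (float-array n)]
    ;; bulk copy from the buffer view into the float array
    (.get ^FloatBuffer float-buffer result)
    result))

(defn slurp-floats
  "Read floating point numbers from a file (hypothetical counterpart of spit-floats)"
  [^String file-name]
  (bytes->floats (Files/readAllBytes (.toPath (File. file-name)))))
```

The byte order has to match the one used when writing, otherwise the values are silently garbled.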

When launching the game, the lookup tables get loaded and copied into OpenGL textures. Shader functions are used to look up and interpolate values from the tables. When rendering the planet surface or the spacecraft, the atmosphere essentially gets superimposed using ray tracing. After rendering the planet, a background quad is rendered to display the remaining part of the atmosphere above the horizon.

Templating OpenGL shaders

It is possible to make programming with OpenGL shaders more flexible by using a templating library such as Comb. The following shader defines multiple octaves of noise on a base noise function:

#version 410 core

float <%= base-function %>(vec3 idx);

float <%= method-name %>(vec3 idx)
{
  float result = 0.0;
<% (doseq [multiplier octaves] %>
  result += <%= multiplier %> * <%= base-function %>(idx);
  idx *= 2;
<% ) %>
  return result;
}

One can then for example define the function fbm_noise using octaves of the base function noise as follows:

(def noise-octaves
  "Shader function to sum octaves of noise"
  (template/fn [method-name base-function octaves] (slurp "resources/shaders/core/noise-octaves.glsl")))

; ...

(def fbm-noise-shader (noise-octaves "fbm_noise" "noise" [0.57 0.28 0.15]))

Planet rendering

To render the planet, NASA Bluemarble data, NASA Blackmarble data, and NASA Elevation data were used. The images were converted to a multiresolution pyramid of map tiles. The following functions were implemented for color map tiles and for elevation tiles:

  • a function to load and cache map tiles of given 2D tile index and level of detail
  • a function to extract a pixel from a map tile
  • a function to extract the pixel for a specific longitude and latitude

The functions for extracting a pixel for given longitude and latitude then were used to generate a cube map with a quad tree of tiles for each face. For each tile, the following files were generated:

  • A daytime texture
  • A night time texture
  • An image of 3D vectors defining a surface mesh
  • A water mask
  • A normal map

Altogether 655350 files were generated. Because the Steam ContentBuilder does not support a large number of files, each row of tile data was aggregated into a tar file. The Apache Commons Compress library allows you to open a tar file to get a list of entries and then perform random access on the contents of the tar file. A Clojure LRU cache was used to maintain a cache of open tar files for improved performance.
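The number of files can be reproduced with a short calculation. The sketch below assumes a complete quad tree with 8 levels of detail per cube face. The number of levels is not stated above and is an assumption, chosen because it yields exactly the file count mentioned.

```clojure
(def faces 6)           ; faces of the cube map
(def files-per-tile 5)  ; day, night, surface mesh, water mask, normal map

(defn tiles-per-face
  "Number of tiles in a complete quad tree with the given number of levels"
  [levels]
  ;; level l contains 4^l tiles, i.e. 1 shifted left by 2*l bits
  (reduce + (map #(bit-shift-left 1 (* 2 %)) (range levels))))

(* faces files-per-tile (tiles-per-face 8))  ; => 655350
```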

At run time, a future is created, which returns an updated tile tree, a list of tiles to drop, and a path list of the tiles to load into OpenGL. When the future is realized, the main thread deletes the OpenGL textures from the drop list, and then uses the path list to get the new loaded images from the tile tree, load them into OpenGL textures, and create an updated tile tree with the new OpenGL textures added. The following functions to manipulate quad trees were implemented to realize this:

(defn quadtree-add
  "Add tiles to quad tree"
  {:malli/schema [:=> [:cat [:maybe :map] [:sequential [:vector :keyword]] [:sequential :map]] [:maybe :map]]}
  [tree paths tiles]
  (reduce (fn add-tile-to-quadtree [tree [path tile]] (assoc-in tree path tile)) tree (mapv vector paths tiles)))

(defn quadtree-extract
  "Extract a list of tiles from quad tree"
  {:malli/schema [:=> [:cat [:maybe :map] [:sequential [:vector :keyword]]] [:vector :map]]}
  [tree paths]
  (mapv (partial get-in tree) paths))

(defn quadtree-drop
  "Drop tiles specified by path list from quad tree"
  {:malli/schema [:=> [:cat [:maybe :map] [:sequential [:vector :keyword]]] [:maybe :map]]}
  [tree paths]
  (reduce dissoc-in tree paths))

(defn quadtree-update
  "Update tiles with specified paths using a function with optional arguments from lists"
  {:malli/schema [:=> [:cat [:maybe :map] [:sequential [:vector :keyword]] fn? [:* :any]] [:maybe :map]]}
  [tree paths fun & arglists]
  (reduce (fn update-tile-in-quadtree
            [tree [path & args]]
            (apply update-in tree path fun args)) tree (apply map list paths arglists)))
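The quad tree is just a nested map, so the functions above reduce to assoc-in, get-in, and update-in on key paths. The following sketch walks through the four operations on a tiny tree. The keys :face0, :q0, :q1, :q2 and the tile contents are made up for illustration. Note that dissoc-in is not part of clojure.core, so a minimal hypothetical version is included here. The project presumably gets it from a utility library.

```clojure
(defn dissoc-in
  "Remove the entry at path from a nested map (minimal sketch, empty parent maps are kept)"
  [m path]
  (if (next path)
    (update-in m (vec (butlast path)) dissoc (last path))
    (dissoc m (first path))))

(def tree {:face0 {:q0 {:level 1} :q1 {:level 1}}})

;; add a tile, as quadtree-add does for each pair of path and tile
(def tree-added (assoc-in tree [:face0 :q2] {:level 1}))

;; extract tiles, as quadtree-extract does
(mapv (partial get-in tree-added) [[:face0 :q0] [:face0 :q2]])

;; drop a tile, as quadtree-drop does
(def tree-dropped (reduce dissoc-in tree-added [[:face0 :q1]]))

;; update a tile with an extra argument, as quadtree-update does
(def tree-updated (update-in tree-dropped [:face0 :q0] assoc :texture 42))
```

Because the tree is an immutable value, the future can compute an updated tree in the background while the main thread keeps rendering with the old one.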

Other topics

Solar system

The astronomy code for getting the position and orientation of planets was modelled on the Skyfield Python library, which in turn is based on NASA JPL’s SPICE toolkit. JPL provides binary files containing sequences of Chebyshev polynomials to interpolate the positions of the Moon and the planets as well as the orientation of the Moon. Reference coordinate systems and orientations of other bodies are provided in text files which consist of human and machine readable sections. The binary files were parsed using Gloss (see Wiki for some examples) and the text files using Instaparse.

Jolt bindings

The required Jolt functions for wheeled vehicle dynamics and collisions with meshes were wrapped in C functions and compiled into a shared library. The Coffi Clojure library (which is a wrapper for Java’s new Foreign Function & Memory API) was used to make the C functions and data types usable in Clojure.

For example the following code implements a call to the C function add_force:

(defcfn add-force
  "Apply a force in the next physics update"
  add_force [::mem/int ::vec3] ::mem/void)

Here ::vec3 refers to a custom composite type defined using basic types. The memory layout, serialisation, and deserialisation for ::vec3 are defined as follows:

(def vec3-struct
  [::mem/struct
   [[:x ::mem/double]
    [:y ::mem/double]
    [:z ::mem/double]]])


(defmethod mem/c-layout ::vec3
  [_vec3]
  (mem/c-layout vec3-struct))


(defmethod mem/serialize-into ::vec3
  [obj _vec3 segment arena]
  (mem/serialize-into {:x (obj 0) :y (obj 1) :z (obj 2)} vec3-struct segment arena))


(defmethod mem/deserialize-from ::vec3
  [segment _vec3]
  (let [result (mem/deserialize-from segment vec3-struct)]
    (vec3 (:x result) (:y result) (:z result))))

Performance

The clj-async-profiler was used to create flame graphs visualising the performance of the game. In order to get reflection warnings for Java calls without sufficient type declarations, *warn-on-reflection* was set to true.

(set! *warn-on-reflection* true)

Furthermore, to discover missing declarations of numerical types, *unchecked-math* was set to :warn-on-boxed.

(set! *unchecked-math* :warn-on-boxed)
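The warnings are then resolved by adding type hints. The following is a small sketch with made-up functions (none of them are from the project) showing the kind of change involved:

```clojure
;; Without a type hint the compiler cannot resolve .length at compile time.
;; With *warn-on-reflection* enabled this definition produces a reflection warning.
(defn string-length-reflective [s] (.length s))

;; The ^String hint lets the compiler emit a direct method call.
(defn string-length [^String s] (.length s))

;; ^double hints on the arguments and the return value avoid boxed arithmetic,
;; which is what :warn-on-boxed reports.
(defn kinetic-energy ^double [^double mass ^double speed]
  (* 0.5 mass speed speed))
```

Both versions of the string function return the same result. The hinted one just avoids the runtime method lookup, which matters in code that runs every frame.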

To reduce garbage collector pauses, the ZGC low-latency garbage collector for the JVM was used. The following section in deps.edn ensures that the ZGC garbage collector is used when running the project with clj -M:run:

{:deps {; ...
        }
 :aliases {:run {:jvm-opts ["-Xms2g" "-Xmx4g" "--enable-native-access=ALL-UNNAMED" "-XX:+UseZGC"
                            "--sun-misc-unsafe-memory-access=allow"]
                 :main-opts ["-m" "sfsim.core"]}}}

The option to use ZGC is also specified in the Packr JSON file used to deploy the application.

Building the project

In order to build the map tiles, atmospheric lookup tables, and other data files using tools.build, the project source code was made available in the build.clj file using a :local/root dependency:

{:deps {; ...
        }
 :aliases {; ...
           :build {:deps {io.github.clojure/tools.build {:mvn/version "0.10.10"}
                          sfsim/sfsim {:local/root "."}}
                   :ns-default build
                   :exec-fn all
                   :jvm-opts ["-Xms2g" "-Xmx4g" "--sun-misc-unsafe-memory-access=allow"]}}}

Various targets were defined to build the different components of the project. For example the atmospheric lookup tables can be built by specifying clj -T:build atmosphere-lut on the command line.

The following section in the build.clj file was added to allow creating an “Uberjar” JAR file with all dependencies by specifying clj -T:build uber on the command line.

(defn uber [_]
  (b/copy-dir {:src-dirs ["src/clj"]
               :target-dir class-dir})
  (b/compile-clj {:basis basis
                  :src-dirs ["src/clj"]
                  :class-dir class-dir})
  (b/uber {:class-dir class-dir
           :uber-file "target/sfsim.jar"
           :basis basis
           :main 'sfsim.core}))

To create a Linux executable with Packr, one can then run java -jar packr-all-4.0.0.jar scripts/packr-config-linux.json where the JSON file has the following content:

{
  "platform": "linux64",
  "jdk": "/usr/lib/jvm/jdk-24.0.2-oracle-x64",
  "executable": "sfsim",
  "classpath": ["target/sfsim.jar"],
  "mainclass": "sfsim.core",
  "resources": ["LICENSE", "libjolt.so", "venturestar.glb", "resources"],
  "vmargs": ["Xms2g", "Xmx4g", "XX:+UseZGC"],
  "output": "out-linux"
}

In order to distribute the game on Steam, three depots were created:

  • a data depot with the operating system independent data files
  • a Linux depot with the Linux executable and Uberjar including LWJGL’s Linux native bindings
  • and a Windows depot with the Windows executable and an Uberjar including LWJGL’s Windows native bindings

When updating a depot, the Steam ContentBuilder command line tool creates and uploads a patch in order to preserve storage space and bandwidth.

Future work

Although the hard parts are mostly done, there are still several things to do:

  • control surfaces and thruster graphics
  • launchpad and runway graphics
  • sound effects
  • a 3D cockpit
  • the Moon
  • a space station

It would also be interesting to make the game moddable in a safe way (maybe evaluating Clojure files in a sandboxed environment?).

Conclusion

You can find the source code on GitHub. Currently there is only a playtest build, but if you want to get notified when the game gets released, you can wishlist it here.

Anyway, let me know any comments and suggestions.

Enjoy!

Updates

  • Submitted for discussion to Reddit here
  • See HackerNews discussion of this project here

Keyestudio Smart Home

Keyestudio Smarthome

A few months ago I bought a Keyestudio Smart Home, assembled it and tried to program it using the Arduino IDE. However, I kept getting the following error when trying to upload a sketch to the board.

 avrdude: stk500_getsync() attempt 1 of 10: not in sync: resp=0x2e
 avrdude: stk500_getsync() attempt 2 of 10: not in sync: resp=0x2e
 avrdude: stk500_getsync() attempt 3 of 10: not in sync: resp=0x2e

Initially I thought it was an issue with the QinHeng Electronics CH340 serial converter driver software. However, after I exchanged a few emails with Keyestudio support, they pointed out that the board type of my smart home version was not “Arduino Uno”. The box of the control board says “Keyestudio Control Board for ESP-32” and I had to install version 3.1.3 of the esp32 board software to be able to program the board. In other words, the Keyestudio IoT Smart Home Kit for ESP32 is not to be confused with the Keyestudio Smart Home Kit for Arduino.

The documentation for the Keyestudio smart home using ESP-32 is here. Also, the correct version of the smart home sketches is here. Finally, you can find many sample projects in the Keyestudio blog. Note that in some cases you have to adapt the I/O pin numbers using the smart home documentation.

Many thanks to Keyestudio support for helping me to get it working.

Orbits with Jolt Physics

I want to simulate an orbiting spacecraft using the Jolt Physics engine (see sfsim homepage for details). The Jolt Physics engine solves difficult problems such as gyroscopic forces, collision detection with linear casting, and special solutions for wheeled vehicles with suspension.

The integration method of the Jolt Physics engine is the semi-implicit Euler method. The following formula shows how speed v and position x are integrated for each time step:

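$$\mathbf{v}_{n+1} = \mathbf{v}_n + \mathbf{a}_n\,dt, \qquad \mathbf{x}_{n+1} = \mathbf{x}_n + \mathbf{v}_{n+1}\,dt$$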

The gravitational acceleration by a planet is given by:

$$\mathbf{a} = -\frac{G\,M}{|\mathbf{x}|^3}\,\mathbf{x}$$

To test orbiting, one can set the initial conditions of the spacecraft to a perfect circular orbit:

$$|\mathbf{x}_0| = R, \qquad |\mathbf{v}_0| = \sqrt{\frac{G\,M}{R}}, \qquad \mathbf{v}_0 \perp \mathbf{x}_0$$

The orbital radius R was set to the Earth radius of 6378 km plus 408 km (the height of the ISS). The Earth mass was assumed to be 5.9722e+24 kg. For increased accuracy, the Jolt Physics library was compiled with the option -DDOUBLE_PRECISION=ON.

A full orbit was simulated using different values for the time step. The following plot shows the height deviation from the initial orbital height over time.

Orbits with symplectic Euler

When examining the data one can see that the integration method returns close to the initial height after one orbit. The orbital error of the Euler integration method looks like a sine wave. Even for a small timestep of dt = 0.031 s, the maximum orbit deviation is 123.8 m. The following plot shows that for increasing time steps, the maximum error grows linearly.

Euler orbit deviation as a function of time step

For time lapse simulation with a time step of 16 seconds, the errors will exceed 50 km.
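The experiment can be approximated without Jolt in a few lines of Clojure. The following is a minimal sketch of a semi-implicit Euler integration in the orbital plane using double precision throughout, so it will not reproduce the exact numbers obtained with Jolt. The value of the gravitational constant G is an assumption (it is not given above), while the Earth mass and the orbital radius are taken from the text. The function names are made up for illustration.

```clojure
(def G 6.67430e-11)                   ; gravitational constant in m^3 kg^-1 s^-2 (assumed value)
(def M 5.9722e24)                     ; Earth mass in kg
(def R (* 1000.0 (+ 6378.0 408.0)))   ; orbital radius in m

(defn euler-step
  "One semi-implicit Euler step: update the speed first, then the position using the new speed"
  [[x y vx vy] dt]
  (let [r   (Math/sqrt (+ (* x x) (* y y)))
        k   (/ (- (* G M)) (* r r r))
        vx' (+ vx (* k x dt))
        vy' (+ vy (* k y dt))]
    [(+ x (* vx' dt)) (+ y (* vy' dt)) vx' vy']))

(defn max-height-deviation
  "Largest absolute deviation from the initial orbital radius during one orbit"
  [dt]
  (let [speed  (Math/sqrt (/ (* G M) R))
        period (/ (* 2.0 Math/PI R) speed)
        steps  (long (/ period dt))]
    (->> (iterate #(euler-step % dt) [R 0.0 0.0 speed])
         (take (inc steps))
         (map (fn [[x y]] (Math/abs (double (- (Math/sqrt (+ (* x x) (* y y))) R)))))
         (reduce max))))

;; maximum deviation in meters for a time step of one second
(max-height-deviation 1.0)
```

Calling max-height-deviation with different time steps gives a quick way to check how the deviation grows with the step size.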

A possible solution is to use Runge Kutta 4th order integration instead of symplectic Euler. The 4th order Runge Kutta method can be implemented using a state vector consisting of position and speed:

$$\mathbf{y} = \begin{pmatrix} \mathbf{x} \\ \mathbf{v} \end{pmatrix}$$

The derivative of the state vector consists of speed and gravitational acceleration:

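$$f(\mathbf{y}, t) = \begin{pmatrix} \mathbf{v} \\ \mathbf{a} \end{pmatrix} = \begin{pmatrix} \mathbf{v} \\ -\frac{G\,M}{|\mathbf{x}|^3}\,\mathbf{x} \end{pmatrix}$$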

The Runge Kutta 4th order integration method is as follows:

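$$\begin{aligned}
\mathbf{k}_1 &= f(\mathbf{y}_n, 0) \\
\mathbf{k}_2 &= f(\mathbf{y}_n + \tfrac{dt}{2}\,\mathbf{k}_1, \tfrac{dt}{2}) \\
\mathbf{k}_3 &= f(\mathbf{y}_n + \tfrac{dt}{2}\,\mathbf{k}_2, \tfrac{dt}{2}) \\
\mathbf{k}_4 &= f(\mathbf{y}_n + dt\,\mathbf{k}_3, dt) \\
\mathbf{y}_{n+1} &= \mathbf{y}_n + \tfrac{dt}{6}\,(\mathbf{k}_1 + 2\,\mathbf{k}_2 + 2\,\mathbf{k}_3 + \mathbf{k}_4)
\end{aligned}$$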

The Runge Kutta method can be implemented in Clojure as follows:

(defn runge-kutta
  "Runge-Kutta integration method"
  {:malli/schema [:=> [:cat :some :double [:=> [:cat :some :double] :some] add-schema scale-schema] :some]}
  [y0 dt dy + *]
  (let [dt2 (/ ^double dt 2.0)
        k1  (dy y0                0.0)
        k2  (dy (+ y0 (* dt2 k1)) dt2)
        k3  (dy (+ y0 (* dt2 k2)) dt2)
        k4  (dy (+ y0 (* dt  k3)) dt)]
    (+ y0 (* (/ ^double dt 6.0) (reduce + [k1 (* 2.0 k2) (* 2.0 k3) k4])))))

The following code can be used to test the implementation:

(def add (fn [x y] (+ x y)))
(def scale (fn [s x] (* s x)))

(facts "Runge-Kutta integration method"
       (runge-kutta 42.0 1.0 (fn [_y _dt] 0.0) add scale) => 42.0
       (runge-kutta 42.0 1.0 (fn [_y _dt] 5.0) add scale) => 47.0
       (runge-kutta 42.0 2.0 (fn [_y _dt] 5.0) add scale) => 52.0
       (runge-kutta 42.0 1.0 (fn [_y dt] (* 2.0 dt)) add scale) => 43.0
       (runge-kutta 42.0 2.0 (fn [_y dt] (* 2.0 dt)) add scale) => 46.0
       (runge-kutta 42.0 1.0 (fn [_y dt] (* 3.0 dt dt)) add scale) => 43.0
       (runge-kutta 1.0 1.0 (fn [y _dt] y) add scale) => (roughly (exp 1) 1e-2))

The Jolt Physics library allows applying impulses to the spacecraft. The idea is to use Runge Kutta 4th order integration to get an accurate estimate of the speed and position of the spacecraft after the next time step. One can apply an impulse before running an Euler step so that the position after the Euler step matches the Runge Kutta estimate. A second impulse then is used after the Euler time step to also make the speed match the Runge Kutta estimate. Given the initial state (x(n), v(n)) and the desired next state (x(n+1), v(n+1)) (obtained from Runge Kutta) the formulas for the two impulses are as follows:

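$$\Delta\mathbf{v}_0 = \frac{\mathbf{x}_{n+1} - \mathbf{x}_n}{dt} - \mathbf{v}_n, \qquad \Delta\mathbf{v}_1 = \mathbf{v}_{n+1} - \mathbf{v}_n - \Delta\mathbf{v}_0$$

(expressed as speed changes, i.e. the impulses divided by the mass of the spacecraft)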

The following code shows the implementation of the matching scheme using two speed changes in Clojure:

(defn matching-scheme
  "Use two custom acceleration values to make semi-implicit Euler result match a ground truth after the integration step"
  [y0 dt y1 scale subtract]
  (let [delta-speed0 (scale (/ 1.0 ^double dt) (subtract (subtract (:position y1) (:position y0)) (scale dt (:speed y0))))
        delta-speed1 (subtract (subtract (:speed y1) (:speed y0)) delta-speed0)]
    [delta-speed0 delta-speed1]))
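To see that the two speed changes have the intended effect, one can apply them around a plain Euler position update and compare with the target state. The following sketch does this in one dimension, where scale and subtract are simply * and -. The function is repeated so that the example is self-contained. The helper euler-step-with-matching and the numbers are made up, and the sketch assumes that no other acceleration (such as the gravity of the physics engine) acts during the Euler step.

```clojure
(defn matching-scheme
  "Copy of the function above, repeated here to keep the example self-contained"
  [y0 dt y1 scale subtract]
  (let [delta-speed0 (scale (/ 1.0 ^double dt) (subtract (subtract (:position y1) (:position y0)) (scale dt (:speed y0))))
        delta-speed1 (subtract (subtract (:speed y1) (:speed y0)) delta-speed0)]
    [delta-speed0 delta-speed1]))

(defn euler-step-with-matching
  "Apply the first speed change, update the position with an Euler step, then apply the second speed change"
  [y0 dt [delta-speed0 delta-speed1]]
  (let [speed    (+ (:speed y0) delta-speed0)
        position (+ (:position y0) (* dt speed))]
    {:position position :speed (+ speed delta-speed1)}))

(def y0 {:position 0.0 :speed 1.0})
(def y1 {:position 2.5 :speed 2.0})  ; stand-in for a Runge Kutta estimate

(euler-step-with-matching y0 2.0 (matching-scheme y0 2.0 y1 * -))
;; => {:position 2.5, :speed 2.0}
```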

The following plot shows the height deviations observed when using Runge Kutta integration.

Orbits with Runge Kutta 4th order

The following plot of maximum deviation shows that the errors are much smaller.

RK orbit deviation as a function of time step

Although the accuracy of the Runge Kutta matching scheme is higher, a loss of 40 m of height per orbit is undesirable. Inspecting the Jolt Physics source code reveals that the double-precision setting affects position vectors but is not applied to speed and impulse vectors. To test whether double precision speed and impulse vectors would increase the accuracy, a test implementation of the semi-implicit Euler method with Runge Kutta matching scheme was used. The following plot shows that the orbit deviations are now much smaller.

Orbits with Runge Kutta 4th order and double precision

The updated plot of maximum deviation shows that using double precision the error for one orbit is below 1 meter for time steps up to 40 seconds.

RK with double precision orbit deviation as a function of time step
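The effect of single precision speed values can be illustrated with a short experiment: at orbital speed, a sufficiently small speed change is rounded away completely when the value is stored as a 32-bit float. The speed of 7660 m/s below is an assumed round number for a low Earth orbit and the function names are made up.

```clojure
(defn survives-single-precision?
  "Check whether adding change to value is still visible after rounding to a 32-bit float"
  [value change]
  (not= (float (+ value change)) (float value)))

(defn survives-double-precision?
  "Check whether adding change to value is visible in 64-bit floating point"
  [value change]
  (not= (+ value change) value))

(survives-single-precision? 7660.0 1.0e-4)  ; => false
(survives-double-precision? 7660.0 1.0e-4)  ; => true
```

Between 4096 and 8192 neighbouring 32-bit floats are about 0.0005 apart, so every speed update at orbital speed is rounded to that grid, and these rounding errors can accumulate over the many steps of an orbit.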

I am currently looking into building a modified Jolt Physics version which uses double precision for speed and impulse vectors. I hope that I will get the Runge Kutta 4th order matching scheme to work so that I get an integrated solution for numerically accurate orbits as well as collision and vehicle simulation.

Update:

Jorrit Rouwé has informed me that he currently does not want to add support for double precision speed values. He also has more detailed information about using Jolt Physics for space simulation on his website.

I have managed to get a prototype working using the moving coordinate system approach. One can perform the Runge Kutta integration using double precision coordinates and speed vectors with the Earth at the centre of the coordinate system. The Jolt Physics integration then happens in a coordinate system which is at the initial position and moving with the initial speed of the spaceship. The first impulse of the matching scheme is applied and then the semi-implicit Euler integration step is performed using Jolt Physics with single precision speed vectors and impulses. Then the second impulse is applied. Finally the position and speed of the double precision moving coordinate system are incremented using the position and speed value of the Jolt Physics body. The position and speed of the Jolt Physics body are then reset to zero and the next iteration begins.

The following plot shows the height deviations observed using this approach:

Orbits using moving coordinate system

The maximum errors for different time steps are shown in the following plot:

Maximum errors with moving coordinate system as a function of time step

Akima splines

Recently I was looking for spline interpolation for creating curves from a set of samples. I knew about cubic splines, which are piecewise cubic polynomials fitted such that they are continuous up to the second derivative. I almost went ahead and implemented cubic splines using a matrix solver but then I found that the fastmath Clojure library already provides splines. The fastmath spline interpolation module is based on the interpolation module of the Java Smile library. I saved the interpolated samples to a text file and plotted them with Gnuplot.

(require '[fastmath.interpolation :as interpolation])
(use '[clojure.java.shell :only [sh]])

(def px [0 1 3 4 5 8 9])
(def py [0 0 1 7 3 4 6])
(spit "/tmp/points.dat" (apply str (map (fn [x y] (str x " " y "\n")) px py)))

(def cspline (interpolation/cubic-spline px py))
(def x (range 0 9.0 0.01))
(spit "/tmp/cspline.dat" (apply str (map (fn [x y] (str x " " y "\n")) x (map cspline x))))
(sh "gnuplot" "-c" "plot.gp" "/tmp/cspline.png" "/tmp/cspline.dat")
(sh "display" "/tmp/cspline.png")

I used the following Gnuplot script plot.gp for plotting:

set terminal pngcairo size 640,480
set output ARG1
set xlabel "x"
set ylabel "y"
plot ARG2 using 1:2 with lines title "spline", "/tmp/points.dat" using 1:2 with points title "points"

I used a lightweight configuration of the fastmath library without MKL and OpenBLAS. See the following deps.edn:

{:deps {org.clojure/clojure {:mvn/version "1.12.1"}
        generateme/fastmath {:mvn/version "2.4.0" :exclusions [com.github.haifengl/smile-mkl org.bytedeco/openblas]}}}

The result is shown in the following figure. One can see that the spline is smooth and passes through all points, however it shows a high degree of oscillation:

cubic spline

However I found another spline algorithm in the fastmath wrappers: The Akima spline. The Akima spline needs at least 5 points and it first computes the gradient of the lines connecting the points. Then for each point it uses a weighted average of the previous and next slope value. The slope values are weighted using the absolute difference of the previous two slopes and the next two slopes, i.e. the curvature. The first and last two points use a special formula: The first and last point use the next or previous slope and the second and second last point use an average of the neighbouring slopes.

(require '[fastmath.interpolation :as interpolation])
(use '[clojure.java.shell :only [sh]])

(def px [0 1 3 4 5 8 9])
(def py [0 0 1 7 3 4 6])
(spit "/tmp/points.dat" (apply str (map (fn [x y] (str x " " y "\n")) px py)))

(def aspline (interpolation/akima-spline px py))
(def x (range 0 9.0 0.01))
(spit "/tmp/aspline.dat" (apply str (map (fn [x y] (str x " " y "\n")) x (map aspline x))))
(sh "gnuplot" "-c" "plot.gp" "/tmp/aspline.png" "/tmp/aspline.dat")
(sh "display" "/tmp/aspline.png")

Akima spline
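The slope rule described above for the interior points can be sketched in plain Clojure. This follows the classic Akima weighting and is only an illustration. It does not cover the special treatment of the first and last two points, and the library implementation may differ in details. The function names are made up.

```clojure
(defn segment-slopes
  "Slopes of the straight lines connecting neighbouring points"
  [xs ys]
  (mapv (fn [x0 x1 y0 y1] (/ (double (- y1 y0)) (double (- x1 x0))))
        xs (rest xs) ys (rest ys)))

(defn akima-interior-slope
  "Akima slope at interior point i, where m is the vector of segment slopes and 2 <= i <= n-3"
  [m i]
  (let [m-2    (m (- i 2))
        m-1    (m (- i 1))                        ; slope of the previous segment
        m0     (m i)                              ; slope of the next segment
        m1     (m (+ i 1))
        w-prev (Math/abs (double (- m1 m0)))      ; weight for the previous slope
        w-next (Math/abs (double (- m-1 m-2)))]   ; weight for the next slope
    (if (zero? (+ w-prev w-next))
      (* 0.5 (+ m-1 m0))                          ; fall back to the plain average
      (/ (+ (* w-prev m-1) (* w-next m0)) (+ w-prev w-next)))))

(def px [0 1 3 4 5 8 9])
(def py [0 0 1 7 3 4 6])

;; slopes at the interior points with index 2, 3, and 4
(mapv (partial akima-interior-slope (segment-slopes px py)) [2 3 4])
```

The weights are crossed on purpose: the previous slope gets a large weight when the slopes after the point change a lot. This makes the curve follow the calmer side of the point and is what suppresses the oscillation.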

So if you have a data set which causes cubic splines to oscillate, give Akima splines a try!

Enjoy!

Update: u/joinr showed how you can use Clojupyter to quickly test a lot of splines.