Mechanical Rigging using Ryan King Art's tutorial

After finishing the Blender Guru donut tutorial I wanted to look into articulated objects. Initially I looked at Blender skeletons but then I found Ryan King Art’s tutorial on mechanical rigging.

Basically for each rigid object the following steps are executed:

  • The 3D cursor is set to a point on the center of the joint axis
  • The origin of the rigid object is set to the 3D cursor
  • The parent of the rigid object is set to the other object it is connected to via the joint (while maintaining the transformation)
  • All forbidden translation axes and rotation axes are locked
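
To illustrate why parenting while maintaining the transformation works, here is a minimal Python sketch of the underlying matrix arithmetic in 2D. It is not Blender code: the helper names (`transform`, `rigid_inverse`, `parent_keep_transform`) and the two-part arm are made up for illustration. The assumption is that the child stores a local matrix relative to its parent, chosen so that its world position does not change at the moment of parenting.

```python
import math

def transform(angle, tx, ty):
    """3x3 homogeneous matrix for a 2D rotation followed by a translation."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, tx], [s, c, ty], [0.0, 0.0, 1.0]]

def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rigid_inverse(m):
    """Inverse of a rigid 2D transform: transposed rotation, back-rotated translation."""
    c, s, tx, ty = m[0][0], m[1][0], m[0][2], m[1][2]
    return [[c, s, -(c * tx + s * ty)], [-s, c, -(-s * tx + c * ty)], [0.0, 0.0, 1.0]]

def parent_keep_transform(child_world, parent_world):
    """Local matrix of the child such that parent_world * local == child_world."""
    return matmul(rigid_inverse(parent_world), child_world)

# Hypothetical two-part arm: the upper arm pivots at (0, 0), the forearm at (2, 0).
upper_world = transform(0.0, 0.0, 0.0)
forearm_world = transform(0.0, 2.0, 0.0)
forearm_local = parent_keep_transform(forearm_world, upper_world)

# Rotate the upper arm about its only unlocked axis by 90 degrees.
upper_rotated = transform(math.pi / 2, 0.0, 0.0)
forearm_new = matmul(upper_rotated, forearm_local)
forearm_origin = (forearm_new[0][2], forearm_new[1][2])
```

Because the origin of each part sits on its joint axis, rotating the parent carries the child's joint along a circle around the parent's joint, here from (2, 0) to approximately (0, 2).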

The resulting object can then be animated by defining keyframes for the joint angles, as shown in part 11 of the Blender Guru donut tutorial. Rendering the video using the Cycles engine on my 8-core CPU took 3.5 hours. If you have a GPU supported by Blender, this would take much less time.

See below for the resulting video:

Doing the Blender Guru donut tutorial

I finally did the Blender Guru donut tutorial for beginners. Blender is a powerful 3D editor to create models and animate them. Here is a series of images made while progressing through the tutorial.

Initially one creates a torus and distorts it. Furthermore the upper half of the torus is copied, solidified, and sculpted to create the icing of the donut.


One can set the material properties of the donut to make the icing more reflective and add subsurface scattering.


Here is another example using white icing.


One can use texture painting to add a bright ring at the middle height of the torus.


Using geometry nodes one can distribute cylindrical sprinkles on the surface and rotate them randomly around the surface normal. By doing weight painting one can ensure that the sprinkles are less frequent on the sides of the donut.
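
The scattering idea can be sketched outside of Blender in a few lines of Python. This is only an illustration and not how geometry nodes work internally: the torus dimensions, the function names, and the use of the normal's z-component as a stand-in for the painted weights are all assumptions. Note also that sampling the angles uniformly is not exactly uniform in surface area.

```python
import math
import random

def torus_point(u, v, major=1.0, minor=0.4):
    """Position and unit normal on a torus lying in the xy-plane."""
    normal = (math.cos(v) * math.cos(u), math.cos(v) * math.sin(u), math.sin(v))
    center = (major * math.cos(u), major * math.sin(u), 0.0)
    position = tuple(c + minor * n for c, n in zip(center, normal))
    return position, normal

def scatter_sprinkles(count, rng):
    """Rejection-sample sprinkle placements on the upper half of the torus."""
    sprinkles = []
    while len(sprinkles) < count:
        u = rng.uniform(0.0, 2.0 * math.pi)
        v = rng.uniform(0.0, math.pi)  # upper half only, where the icing is
        position, normal = torus_point(u, v)
        weight = normal[2]  # stand-in for weight painting: 1 on top, 0 at the sides
        if rng.random() < weight:
            spin = rng.uniform(0.0, 2.0 * math.pi)  # random rotation about the normal
            sprinkles.append((position, normal, spin))
    return sprinkles

sprinkles = scatter_sprinkles(200, random.Random(42))
```

Each sprinkle is described by a position, the surface normal to align with, and a spin angle around that normal.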


Furthermore one can select from a set of differently shaped sprinkles and color them randomly using a color map with a few distinct colors.


Using key, fill, and rim lighting, one can illuminate the donut.


Compositing is used to add a background with a blurred white circle creating a color gradient. Also one can add lens distortion and color dispersion to make the image look more like a photo.


See below for the resulting video:

I really enjoyed the tutorial and I hope I will remember the right tool at the right moment when creating models in the future.

People are constantly finishing the tutorial and posting their results on r/BlenderDoughnuts.

OpenGL example in Clojure using LWJGL version 3

Some time ago I published a minimal OpenGL example in Clojure using LWJGL version 2. I used LWJGL 2 because it is available as a Debian package unlike version 3. However one can get LWJGL 3.3.2 with native bindings from the Maven Central Repository.

Note that LWJGL version 3 consists of several packages. Furthermore most packages have native bindings for different platforms. See the LWJGL customization page for more information. There is a Leiningen LWJGL project example by Roger Allen which shows how to install all LWJGL libraries using Leiningen. Here, however, we are going to use the more modern approach with a deps.edn file, and we are going to use more efficient OpenGL indirect rendering.

The deps.edn file uses symbols of the form <groupId>/<artifactId>$<classifier> to specify libraries. For LWJGL version 3 the classifier is used to install native extensions. The deps.edn file for installing LWJGL with OpenGL and GLFW under GNU/Linux is as follows:

There are more native packages for lwjgl-opengl such as natives-windows and natives-macos.

Displaying a window with GLFW instead of LWJGL 2 is a bit different. The updated code to use lwjgl-glfw and lwjgl-opengl to display a triangle with a small texture is shown below:

If you download the two files, you can run the code as follows:

clj -M raw-opengl-lwjgl3.clj

It should show a triangle like this:


Any feedback, comments, and suggestions are welcome.


Clojure/Java Matrix Library Performance Comparison

This is a quick performance comparison of the Clojure core.matrix library and the Efficient Java Matrix Library. Because core.matrix uses the VectorZ Java library as a backend, direct calls to VectorZ were also included in the comparison. Finally I added fastmath to the comparison after it was pointed out to me by the developer. The criterium 0.4.6 benchmark library was used to measure the performance of common matrix expressions. The Clojure version was 1.11.1 and OpenJDK runtime version was 17.0.6. Here are the results running it on an AMD Ryzen 7 4700U with a turbo speed of 4.1 GHz:

op                                | core.matrix 0.63.0 | ejml-all 0.43 | vectorz-clj 0.48.0 | fastmath 2.2.1
make 4x4 matrix                   | 675 ns             | 135 ns        | 50.5 ns            | 13.1 ns
make 4D vector                    | 299 ns             | 47.6 ns       | 9.27 ns            | 3.67 ns
add 4D vectors                    | 13.5 ns            | 18.2 ns       | 9.02 ns            | 4.29 ns
inverse matrix                    | 439 ns             | 81.4 ns       | 440 ns             | 43.6 ns
elementwise matrix multiplication | 64.9 ns            | 29.0 ns       | 29.1 ns            | 13.7 ns
matrix multiplication             | 102 ns             | 74.7 ns       | 100 ns             | 22.4 ns
matrix-vector multiplication      | 20.9 ns            | 31.2 ns       | 19.1 ns            | 6.46 ns
vector dot product                | 6.56 ns            | 6.90 ns       | 4.46 ns            | 6.36 ns
vector norm                       | 10.1 ns            | 11.4 ns       | no support?        | 3.74 ns
matrix determinant                | 170 ns             | 7.35 ns       | 166 ns             | 7.67 ns
matrix element access             | 4.14 ns            | 3.35 ns       | 3.26 ns            | 3.53 ns (1)
get raw data array                | 12.0 ns            | 3.00 ns       | 11.9 ns            | 13.2 ns (1)

(1) requires fastmath 2.2.2-SNAPSHOT or later

See matperf.clj for source code of benchmark script.

Comparing EJML with a mix of core.matrix and direct calls to vectorz:

  • EJML has support for both single and double precision floating point numbers
  • it uses single column matrices to represent vectors leading to slower matrix-vector multiplication
  • it has a fast 4x4 matrix inverse
  • it does not come with a Clojure wrapper
  • it offers fast access to raw data
  • it does not support multi-dimensional arrays

Comparing EJML with fastmath:

  • EJML has support for matrices larger than 4x4
  • EJML gives you access to the matrix as a flat floating point array (fastmath will add support in the future)
  • EJML is mostly slower

The implementations of the libraries are all quite impressive with custom optimisations for small matrices and vectors. Note that I didn’t include Neanderthal in the comparison because it is more suitable for large matrices.

I hope you find this comparison useful.


The large performance difference for matrix inversion is probably because EJML has custom 4x4 matrix classes while VectorZ stops at 3x3. Here is a performance comparison of matrix inverse for 3x3, 4x4, and 5x5 matrices:

op                 | core.matrix 0.63.0 | ejml-all 0.43 | vectorz-clj 0.48.0 | fastmath 2.2.1
3x3 matrix inverse | 13.0 ns            | 48.3 ns       | 12.2 ns            | 10.8 ns
4x4 matrix inverse | 471 ns             | 98.3 ns       | 465 ns             | 50.3 ns
5x5 matrix inverse | 669 ns             | 172 ns        | 666 ns             | not supported


Procedural Volumetric Clouds

Volumetric clouds use 3D density functions to represent clouds in a realistic way. Ray marching is used to render them photorealistically. With modern graphics cards it is possible to do this in realtime.

Sebastian Lague’s video on cloud rendering shows how to generate Worley noise which can be used to generate realistic looking clouds. Worley noise basically is a function which for each location returns the distance to the nearest point of a random set of points. Usually the space is divided into cells with each cell containing one random point. This improves the performance of determining the distance to the nearest point. The following image shows a slice through inverted 3D Worley noise.

Worley slice
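
Here is a minimal Python sketch of cell-based Worley noise as described above. It is not taken from the video or from my implementation. The function names are made up, and the periodic wrapping (so that the noise can be tiled) is an extra assumption. Searching only the 27 neighbouring cells is a common shortcut, but in rare cases the true nearest point lies in a cell further away.

```python
import math
import random

def make_points(divisions, seed=0):
    """One random feature point per cell of a divisions^3 grid over the unit cube."""
    rng = random.Random(seed)
    return {(i, j, k): ((i + rng.random()) / divisions,
                        (j + rng.random()) / divisions,
                        (k + rng.random()) / divisions)
            for i in range(divisions)
            for j in range(divisions)
            for k in range(divisions)}

def worley(p, points, divisions):
    """Distance from p to the nearest feature point among the 27 neighbouring cells."""
    cell = [int(math.floor(c * divisions)) for c in p]
    best = float("inf")
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            for dk in (-1, 0, 1):
                idx = (cell[0] + di, cell[1] + dj, cell[2] + dk)
                wrapped = tuple(n % divisions for n in idx)
                # Shift the wrapped point back next to p so that the noise is periodic.
                q = [points[wrapped][a] + (idx[a] - wrapped[a]) / divisions
                     for a in range(3)]
                best = min(best, math.dist(p, q))
    return best

points = make_points(4)
value = worley((0.1, 0.2, 0.3), points, 4)
```

The inverted noise shown in the image corresponds to subtracting a scaled version of this distance from one.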

Ray marching works by starting a view ray for each render pixel and sampling the cloud volume, which is a cube in this example. This can be implemented in OpenGL by rendering a dummy background quad so that the fragment shader is run once for each pixel.


The transmittance for a small segment of the cloud is the exponential of the negative product of density and step size:

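vec3 cloud_scatter = vec3(0, 0, 0);
float transparency = 1.0;
for (int i=0; i<cloud_samples; i++) {
  // Sample the density at the middle of the i-th ray segment
  vec3 c = origin + (i + 0.5) * stepsize * direction;
  float density = cloud_density(c);
  // Fraction of light passing through this segment
  float transmittance_cloud = exp(-density * stepsize);
  float scatter_amount = 1.0;
  // Light scattered in this segment, attenuated by the cloud in front of it
  cloud_scatter += transparency * (1 - transmittance_cloud) * scatter_amount;
  transparency = transparency * transmittance_cloud;
}
// Attenuate the background light and add the light scattered by the cloud
incoming = incoming * transparency + cloud_scatter;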
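
As a sanity check, the accumulation can be reproduced in Python with scalar light values. This is a sketch and not the shader itself: `march` and `density_at` are hypothetical names, and the sample is taken at the middle of each segment. For a uniform density the remaining transparency should equal the exponential of negative density times path length, regardless of the number of steps.

```python
import math

def march(density_at, origin, direction, stepsize, samples,
          incoming=0.0, scatter_amount=1.0):
    """Accumulate scattered light and remaining transparency along one view ray."""
    cloud_scatter = 0.0
    transparency = 1.0
    for i in range(samples):
        t = (i + 0.5) * stepsize  # middle of the i-th segment
        point = [o + t * d for o, d in zip(origin, direction)]
        transmittance = math.exp(-density_at(point) * stepsize)
        cloud_scatter += transparency * (1.0 - transmittance) * scatter_amount
        transparency *= transmittance
    return incoming * transparency + cloud_scatter, transparency

# Uniform density 0.5 along a path of length 20 * 0.1 = 2.
light, transparency = march(lambda p: 0.5, (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), 0.1, 20)
```

With a scatter amount of one, the scattered light and the transparency add up to one, i.e. all light that is not transmitted is scattered towards the viewer.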

The resulting sampled cube of Worley noise looks like this:

Worley cube

The amount of scattered light can be changed by using a mix of isotropic scattering and a phase function for approximating Mie scattering. I.e. the amount of scattered light is computed as follows:

  float scatter_amount = anisotropic * phase(0.76, dot(direction, light_direction)) + 1 - anisotropic;

I used the Cornette and Shanks phase function shown below (formula (4) in Bruneton’s paper):

const float M_PI = 3.14159265358;
// g: asymmetry parameter, mu: cosine of the angle between view ray and light direction
float phase(float g, float mu)
{
  return 3 * (1 - g * g) * (1 + mu * mu) / (8 * M_PI * (2 + g * g) * pow(1 + g * g - 2 * g * mu, 1.5));
}
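
A quick way to check a phase function is to integrate it over all directions, which should yield one. The following Python sketch ports the formula above and integrates it numerically. The helper names `integrate_over_sphere` and `scatter_amount` are made up for this check.

```python
import math

def phase(g, mu):
    """Cornette-Shanks phase function for asymmetry g and scattering angle cosine mu."""
    return (3 * (1 - g * g) * (1 + mu * mu)
            / (8 * math.pi * (2 + g * g) * (1 + g * g - 2 * g * mu) ** 1.5))

def integrate_over_sphere(g, steps=20000):
    """Midpoint rule over mu in [-1, 1]; the azimuth contributes a factor of 2 pi."""
    dmu = 2.0 / steps
    total = sum(phase(g, -1.0 + (i + 0.5) * dmu) for i in range(steps))
    return total * dmu * 2 * math.pi

def scatter_amount(anisotropic, mu, g=0.76):
    """Mix of the phase function and a constant term, as in the shader line above."""
    return anisotropic * phase(g, mu) + 1 - anisotropic

total = integrate_over_sphere(0.76)
```

For g = 0.76 the function is strongly peaked in the forward direction (mu close to 1), which is what produces the halo around the sun.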

The resulting rendering of the Worley noise now shows a bright halo around the sun:

Anisotropic scattering

The rendering does not yet include self-shadowing. Shadows are usually computed by sampling light rays towards the light source for each sample of the view ray. However a more efficient way is to use deep opacity maps (also see Pixar’s work on deep shadow maps). In a similar fashion to shadow maps, a depth map of the start of the cloud is computed as seen from the light source. While rendering the depth map, several samples of the opacity (or transmittance) behind the depth map are taken with a constant stepsize. I.e. the opacity map consists of a depth (or offset) image and a 3D array of opacity (or transmittance) images.

Deep opacity map

As with shadow mapping, one can perform lookups in the opacity map to determine the amount of shading at each sample in the cloud.

Clouds with self-shading
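
To make the data structure more concrete, here is a Python sketch of a single texel of a deep opacity map, i.e. one light ray. It is a simplified illustration under my own assumptions (hypothetical function names, a 1D density function along the ray, linear interpolation between layers) and not the shader code.

```python
import math

def build_opacity_column(density_at, max_depth, march_step, layer_step, layers):
    """One texel: depth where the cloud starts plus transmittance layers behind it."""
    # Find the depth at which the light ray first hits non-zero density.
    offset = 0.0
    while offset < max_depth and density_at(offset) <= 0.0:
        offset += march_step
    # Sample the transmittance at constant intervals behind the offset.
    transmittances = [1.0]
    for layer in range(1, layers):
        mid = offset + (layer - 0.5) * layer_step
        step = math.exp(-density_at(mid) * layer_step)
        transmittances.append(transmittances[-1] * step)
    return offset, transmittances

def lookup(column, depth, layer_step):
    """Linearly interpolated transmittance at a given depth along the light ray."""
    offset, transmittances = column
    position = (depth - offset) / layer_step
    position = min(max(position, 0.0), len(transmittances) - 1.0)
    lower = min(int(position), len(transmittances) - 2)
    fraction = position - lower
    return transmittances[lower] * (1.0 - fraction) + transmittances[lower + 1] * fraction

# Hypothetical cloud slab between depth 2 and 4 with density 1, stored in 7 layers.
slab = lambda depth: 1.0 if 2.0 <= depth < 4.0 else 0.0
column = build_opacity_column(slab, 10.0, 0.25, 0.5, 7)
shade_inside = lookup(column, 3.0, 0.5)
```

Storing the offset separately means that the limited number of layers is spent on the region where the transmittance actually changes.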

To make the cloud look more realistic, one can add multiple octaves of Worley noise with decreasing amplitude. This is also sometimes called fractal Brownian motion.

Octaves of Worley noise
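
Summing octaves can be sketched in Python as follows. Doubling the frequency for each octave and normalising the amplitudes so that they sum to one are common choices which I assume here. The function name `octaves` and the noise callback are hypothetical.

```python
def octaves(noise, point, amplitudes):
    """Weighted sum of noise sampled at doubling frequencies."""
    total = sum(amplitudes)
    result = 0.0
    frequency = 1.0
    for amplitude in amplitudes:
        scaled = tuple(c * frequency for c in point)
        # Normalise the weights so that the output range matches the input range.
        result += amplitude / total * noise(scaled)
        frequency *= 2.0
    return result

# Example with a trivial "noise" which returns the x-coordinate.
example = octaves(lambda p: p[0], (1.0, 0.0, 0.0), [2.0, 1.0])
```

In the example the first octave contributes 2/3 * 1 and the second octave 1/3 * 2, so the result is 4/3.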

To reduce sampling artifacts without loss of performance, one can use blue noise offsets for the sample positions when computing shadows as well as when creating the final rendering.

Blue noise sampling offsets

In a previous article I demonstrated how to generate global cloud cover using curl noise. One can add octaves of mixed Perlin and Worley noise to the global cloud cover and subtract a threshold. Clamping the resulting value creates 2D cloud patterns on a spherical surface.

Scattered global clouds

By restricting the clouds to be between a bottom and top height, one obtains prism-like objects as shown below:

Cloud blocks

Note that at this point it is recommended to use cascaded deep opacity maps instead of a single opacity map. Like cascaded shadow maps, cascaded deep opacity maps are a series of cuboids covering different splits of the view frustum.

One can additionally multiply the clouds with a vertical density profile.

Vertical density profile
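
The height restriction and the vertical profile can be combined in one lookup. The following Python sketch assumes a small table of profile values which is linearly interpolated between the bottom and the top of the cloud layer. The function signature and the profile values are made up for illustration.

```python
def cloud_profile(height, bottom, top, profile):
    """Linearly interpolated profile value; zero outside the cloud layer."""
    if height < bottom or height > top:
        return 0.0
    position = (height - bottom) / (top - bottom) * (len(profile) - 1)
    lower = min(int(position), len(profile) - 2)
    fraction = position - lower
    return profile[lower] * (1.0 - fraction) + profile[lower + 1] * fraction

# Hypothetical profile with 10 values: dense in the lower middle, fading out at the top.
profile = [0.0, 0.4, 0.8, 1.0, 1.0, 1.0, 0.9, 0.7, 0.4, 0.0]
inside = cloud_profile(3.5, 0.0, 9.0, profile)
below = cloud_profile(-1.0, 0.0, 9.0, profile)
```

Multiplying the 2D cloud pattern with this value turns the prism-like blocks into shapes with a soft bottom and top.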

Guerrilla Games uses a remapping function to introduce high frequency noise on the surfaces of the clouds. The high frequency noise value is remapped using a range defined by the low frequency noise value.

float density = clamp(remap(noise, 1 - base, 1.0, 0.0, cap), 0.0, cap);

The remapping function is defined as follows:

float remap(float value, float original_min, float original_max, float new_min, float new_max)
{
  return new_min + (value - original_min) / (original_max - original_min) * (new_max - new_min);
}
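
A few numbers help to see what the remapping does. Here is a Python port of the two expressions above. The name `detail_density` is hypothetical, and the sketch assumes a base value between zero (exclusive) and one.

```python
def remap(value, original_min, original_max, new_min, new_max):
    """Map value linearly from the original range to the new range."""
    return (new_min + (value - original_min) / (original_max - original_min)
            * (new_max - new_min))

def clamp(value, lower, upper):
    """Restrict value to the interval [lower, upper]."""
    return max(lower, min(upper, value))

def detail_density(noise, base, cap):
    """High frequency noise only survives where it exceeds 1 - base."""
    return clamp(remap(noise, 1 - base, 1.0, 0.0, cap), 0.0, cap)

dense = detail_density(0.9, 0.4, 2.0)   # (0.9 - 0.6) / 0.4 * 2.0, about 1.5
empty = detail_density(0.5, 0.4, 2.0)   # noise below 1 - base is clamped to zero
```

A small base value therefore leaves only the peaks of the high frequency noise, while a base value close to one lets almost all of the noise through.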

The function composing all those noise values is shown here:

uniform samplerCube cover;
float cloud_density(vec3 point, float lod)
{
  // Low frequency noise for the horizontal shapes of the clouds
  float clouds = perlin_octaves(normalize(point) * radius / cloud_scale);
  // Vertical density profile
  float profile = cloud_profile(point);
  // Global cloud cover combined with the low frequency noise
  float cover_sample = texture(cover, point).r * gradient + clouds * multiplier - threshold;
  float base = cover_sample * profile;
  // High frequency detail noise remapped using the base value
  float noise = cloud_octaves(point / detail_scale, lod);
  float density = clamp(remap(noise, 1 - base, 1.0, 0.0, cap), 0.0, cap);
  return density;
}

See cloudsonly.clj for source code.

An example obtained using these techniques is shown below:

Remapping of noise

The example was rendered at 28.5 frames per second with the following setup:

  • an AMD Ryzen 7 4700U with passmark 2034 was used for rendering
  • the resolution of the render window was 640x480
  • 3 deep opacity maps with one 512x512 offset layer and a 7x512x512 transmittance array were rendered for each frame
  • 5 octaves of 64x64x64 Worley noise were used for the high frequency detail of the clouds
  • a vertical profile texture with 10 values was used
  • 3 octaves of 64x64x64 Perlin-Worley noise were used for the horizontal 2D shapes of the clouds
  • a 6x512x512 cubemap was used for the global cloud cover

Please let me know any suggestions and improvements!


Update: I removed the clamping operation for the cover sample noise.

Update: Added video below.

Future work

  • add atmospheric scattering with cloud shadows
  • add planet surface and shadow map
  • sampling with adaptive step size
  • powder sugar effect
  • problems with shadow map at large distances
  • problem with long shadows at dawn