Terminal Renderer Mk. II - Rendering to Text with Compute
-
October 2, 2025
-
This week I brought my terminal renderer to the next level by performing text rendering on the GPU.
-
The Stanford Dragon, outlined and rendered as Braille characters in a terminal emulator. (Full video)
-
Context
-
Unicode Braille
-
I first messed around with rendering images to the terminal with Braille characters in like 2022, I think? I wrote a simple CLI tool that applied a threshold to an input image and output it as Braille characters in the terminal. Here's a recording I took back when I did it.
-
0 3
1 4
2 5
6 7
-
The corresponding bit position for each braille dot.
-
This effect is pretty cool, and it was pretty easy to implement as well. The trick lies in how the Unicode Braille block is laid out. There are exactly 256 possible 8-dot Braille combinations, the perfect amount to fit in the range between 0x2800 (⠀) and 0x28FF (⣿). In other words, every character within the block can be represented by changing the value of a single byte.
-
-
The lowest 6 bits of the pattern map onto a 6-dot Braille pattern. However, for historical reasons the two extra dots of the 8-dot patterns were tacked on after the fact, which adds a slightly annoying wrinkle to the conversion process. Either way, it doesn't get much simpler than reading a pixel value, checking its brightness, and using a bitwise operation to set or clear a dot.
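Here's a rough sketch of that conversion in Rust (my renderer is built on the Bevy engine, so Rust is the natural choice for examples). The 2x4 block of dots is assumed to already be thresholded, and the function name is just illustrative:

    /// Pack a 2x4 block of thresholded pixels into a Braille character.
    /// Bits 0-5 cover the top three rows column by column; bits 6 and 7
    /// were added later for the bottom row, hence the special case.
    fn braille_from_dots(dots: [[bool; 4]; 2]) -> char {
        let mut pattern: u32 = 0;
        for x in 0..2 {
            for y in 0..4 {
                if dots[x][y] {
                    let bit = if y < 3 { x * 3 + y } else { 6 + x };
                    pattern |= 1 << bit;
                }
            }
        }
        // The block starts at U+2800, so the dot pattern is just an offset.
        char::from_u32(0x2800 + pattern).unwrap()
    }

With every dot set this yields 0x28FF (⣿), and with none set it yields 0x2800 (⠀).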
-
-
Ordered Dithering
-
Comparing the brightness of a pixel against a constant threshold is a fine way to display black and white images, but it's far from ideal and often results in the loss of a lot of detail from the original image.
-
From left to right: original image, threshold, and ordered dither. (Wikipedia)
-
-
By using ordered dithering, we can preserve much more of the subtleties of the original image. While not the "truest" version of dithering possible, ordered dithering (and Bayer dithering in particular) provides a few advantages that make it very well suited to realtime computer graphics:
- Each pixel is dithered independently of every other pixel in the image, making it extremely parallelizable and a good fit for shaders.
- It's visually stable: changes to one part of the image won't disturb other areas.
- It's dead simple.
-
Feel free to read up on the specifics of threshold maps and stuff, but for the purposes of this little explanation it's enough to know that a threshold map is basically just a matrix of 𝓃⨉𝓃 values between 0 and 1; to determine whether a pixel (𝓍,𝓎) is white or black, you check its brightness against the threshold value at (𝓍%𝓃,𝓎%𝓃) in the map.
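As a concrete example, here's roughly what that lookup looks like with the classic 4x4 Bayer matrix. This is just a CPU-side sketch of the idea, not the actual shader code:

    // The classic 4x4 Bayer matrix; dividing by 16 puts each entry in 0..1.
    const BAYER_4X4: [[f32; 4]; 4] = [
        [ 0.0,  8.0,  2.0, 10.0],
        [12.0,  4.0, 14.0,  6.0],
        [ 3.0, 11.0,  1.0,  9.0],
        [15.0,  7.0, 13.0,  5.0],
    ];

    /// Whether the pixel at (x, y) ends up "on" after ordered dithering.
    /// Brightness is expected in the 0..1 range.
    fn dither(x: usize, y: usize, brightness: f32) -> bool {
        brightness > BAYER_4X4[y % 4][x % 4] / 16.0
    }

The same couple of lines translate almost directly into a fragment or compute shader, which is a big part of why ordered dithering is so GPU-friendly.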
-
The old way™
-
My first attempt at realtime terminal graphics with ordered dithering (I put a video up at the time) ran entirely on the CPU. I pre-calculated the threshold map at the beginning of execution and ran each frame through a sequential function to dither it and convert it to Braille characters.
-
-
To be honest, I never noticed any significant performance issues doing this; as you can imagine, the image size required to fill a terminal screen is significantly smaller than that of a normal window. However, I knew I could easily perform the dithering on the GPU as a post-processing effect, so I eventually wrote a shader to do that. In combination with another effect I used to add outlines to objects, I was able to significantly improve the visual fidelity of the experience. A good example of where the renderer was at until about a week ago can be seen in this video.
-
-
Until now I hadn't really considered moving the text conversion to the GPU. I mean, GPU is for graphics, right? I just copied the entire framebuffer back onto the CPU after dithering and used the same sequential conversion algorithm. Then I had an idea that would drastically reduce the amount of copying necessary.
-
Compute post-processing
-
What if, instead of extracting and copying the framebuffer every single frame, we "rendered" the text on the GPU and read that back instead? Assuming each pixel in a texture is 32 bits (RGBA8), and knowing that each Braille character covers a block of 8 pixels, could we not theoretically shave off at least 7/8 of the bytes copied?
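Here's a quick back-of-the-envelope check of that claim, assuming an RGBA8 render target and a single 32-bit value per 2x4 character cell (the resolution below is just an example):

    /// Bytes copied per frame: the full framebuffer vs. one 32-bit value
    /// per Braille cell.
    fn readback_bytes(width: u64, height: u64) -> (u64, u64) {
        let framebuffer = width * height * 4;            // 4 bytes per RGBA8 pixel
        let characters = (width / 2) * (height / 4) * 4; // 4 bytes per character
        (framebuffer, characters)
    }

    // For a 320x160 target: 204,800 bytes vs. 25,600 bytes, exactly 1/8.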
-
-
As it turns out, it's remarkably easy to do. I'm using the Bevy engine, and hooking a compute node into my existing post-processing render pipeline worked right out of the box. I allocated a storage buffer large enough to hold the necessary number of characters, read it back each frame, and dumped the contents into the terminal.
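I won't reproduce the whole render-graph node here, but the buffer allocation itself is tiny. This is a rough sketch using the wgpu types that Bevy re-exports; the function and label names are mine, and the bind group and readback plumbing are omitted:

    use bevy::render::render_resource::{Buffer, BufferDescriptor, BufferUsages};
    use bevy::render::renderer::RenderDevice;

    /// Allocate a storage buffer with one 32-bit slot per character cell so
    /// the compute shader can write characters into it directly.
    fn create_char_buffer(device: &RenderDevice, cols: u64, rows: u64) -> Buffer {
        device.create_buffer(&BufferDescriptor {
            label: Some("braille_char_buffer"),
            size: cols * rows * 4, // 4 bytes per character cell
            usage: BufferUsages::STORAGE | BufferUsages::COPY_SRC,
            mapped_at_creation: false,
        })
    }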
-
-
I used UTF-32 encoding for the storage buffer because I knew I could easily convert a "wide string" into UTF-8 before printing it, and 32 bits gives each workgroup in the shader a consistent amount of space to fill, unlike a variable-length encoding like UTF-8. Here's a video of the new renderer working. Although now that I think about it, I could probably switch to UTF-16, since every Braille character fits in 2 bytes; that would be half the size of the UTF-32 text, which is half empty bytes anyway.
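For what it's worth, the wide-string conversion on the CPU side is about as simple as it gets. A minimal sketch, assuming the readback lands in a slice of u32 code points:

    /// Turn the UTF-32 buffer read back from the GPU into a UTF-8 String.
    /// Every value should already be a valid code point in the Braille
    /// block; anything else falls back to a space.
    fn utf32_to_string(buf: &[u32]) -> String {
        buf.iter()
            .map(|&cp| char::from_u32(cp).unwrap_or(' '))
            .collect()
    }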
-
-
Okay, so I went and tried switching to UTF-16, but then remembered that shaders only accept 32-bit primitive types, so it doesn't matter anyway. This little side quest has been part of my broader effort to revive a project I spent a lot of time on. I'm taking the opportunity to really dig in and rework some of the stuff I'm not totally happy with, so there might be quite a few posts like this in the near future. Stay tuned.
-