Akshay M

@akshayrt

Tinkerer. Gamedev / graphics / GPUs.

12 Followers · 35 Following · 34 Posts · Joined 13.08.2025

Latest posts by Akshay M @akshayrt

Finally, it makes me a lot less worried about the AI-doomer narrative. This is still tractable. The basics are not that different from what we knew about DL 8-10 years ago. The scale is different, of course! And as I am working on gamedev, I was reassured that I could deal with it, if need be!

03.03.2026 06:27 👍 2 🔁 0 💬 0 📌 0

But I'm glad I did this end-to-end. Learnt a lot about the theory of deep learning (and refreshed my 10-year-old knowledge of SGD, validation metrics etc.) and the practical difficulties of doing this at a larger scale with distributed compute.

03.03.2026 06:25 👍 1 🔁 0 💬 1 📌 0

...but CUDA-based workflows are just way shorter time-to-get-to-training. On CUDA, I don't have to debug my sharding of data across workers, think too deeply about my compute graph yada yada - all of which I do have to take care of on TPUs. Configuring for deployment becomes a pain!

03.03.2026 06:23 👍 0 🔁 0 💬 1 📌 0

4. I couldn't get NVIDIA GPUs on AWS or GCP at all (I don't have a long billing history with those services)! I wanted to push context lengths higher with more VRAM or more TPUs, but the higher-end configs have no availability (and you get quota-limited). Had to make do with v5 TPU pods, which work...

03.03.2026 06:21 👍 0 🔁 0 💬 1 📌 0

3. With an RTX 5090, it is eminently possible to train a model with 1024 embedding dimensions, 512 token context size, 32ish feedforward layers and 4 MTP depth on 17-20GB of tokenized English Wikipedia with a 131k vocabulary. Let it run for a couple of days. No latent attention.

03.03.2026 06:18 👍 0 🔁 0 💬 1 📌 0
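
A back-of-the-envelope parameter count for a configuration like the one in that post can be sketched as follows. This is only an estimate under stated assumptions that are not in the post: "32ish feedforward layers" is read as 32 Transformer blocks, the FFN expansion factor is 4, the 131k vocabulary is taken as 131,072 entries, input and output embeddings are tied, and biases, norms and MTP heads are ignored.

```python
# Rough parameter estimate for a decoder-only Transformer.
# Assumptions (not from the post): 4x FFN expansion, tied embeddings,
# no biases, norm weights, or multi-token-prediction heads counted.

def estimate_params(d_model: int, n_layers: int, vocab_size: int, ffn_mult: int = 4) -> dict:
    embedding = vocab_size * d_model          # token embedding table
    attention = 4 * d_model * d_model         # Q, K, V and output projections
    ffn = 2 * ffn_mult * d_model * d_model    # up and down projections
    blocks = n_layers * (attention + ffn)
    return {"embedding": embedding, "blocks": blocks, "total": embedding + blocks}


# 131_072 is an assumed exact value for the "131k" vocabulary.
est = estimate_params(d_model=1024, n_layers=32, vocab_size=131_072)
for name, count in est.items():
    print(f"{name}: {count / 1e6:.1f}M")
```

Under these assumptions the total comes out at roughly 537M parameters, about a quarter of which sits in the embedding table because of the large vocabulary.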

Some key takeaways are:
1. Clean datasets matter a LOT for training good models. More than model-size/architecture. A 40-50M parameter model might outperform a 10x bigger model if trained on a nicer dataset.
2. Pre-compile your layers! Use inductor (GPUs) or openxla (TPUs etc). 6-15x speedups!

03.03.2026 06:15 👍 0 🔁 0 💬 1 📌 0
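
For takeaway 2, a minimal sketch of choosing a `torch.compile` backend by device type. `pick_compile_backend` and `compile_model` are hypothetical helper names, and the device-type strings are assumptions; `"inductor"` is PyTorch's default backend, and `"openxla"` is registered when `torch_xla` is imported. Actual speedups depend on the model and hardware.

```python
# Hypothetical helpers: pick a torch.compile backend from the device type.

def pick_compile_backend(device_type: str) -> str:
    """Return "openxla" for XLA devices (e.g. TPUs), "inductor" otherwise."""
    return "openxla" if device_type == "xla" else "inductor"


def compile_model(model, device_type: str):
    """Wrap `model` with torch.compile using the backend chosen above."""
    import torch  # imported lazily so pick_compile_backend works without torch

    if device_type == "xla":
        import torch_xla  # noqa: F401  (importing registers the "openxla" backend)
    return torch.compile(model, backend=pick_compile_backend(device_type))


print(pick_compile_backend("cuda"), pick_compile_backend("xla"))
```

Compilation happens lazily on the first forward pass, so the first few steps are slower than steady state.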

Spent the last 20 days obsessively learning about Transformers and LLMs. The bulk of that went into learning PyTorch, Lightning and OpenXLA, and into parsing (and cleaning) Wikipedia as a training dataset. Was finally able to train models of 10M-500M parameters and 32-1024 token context on a GPU and TPUs.

03.03.2026 06:13 👍 0 🔁 0 💬 1 📌 0
We Are The Art | Brandon Sanderson’s Keynote Speech (YouTube video by Brandon Sanderson)

This is such a powerful message on AI and art from Brandon Sanderson: www.youtube.com/watch?v=mb3u...

09.02.2026 07:37 👍 2 🔁 0 💬 0 📌 0
Hikari Physics Spheres (YouTube video by GameDevArcana)

More progress made on the Hikari engine. After implementing per-frame BVH-updates, I now have collision-detection (and some minor physics) in place. 1000 balls bouncing around a terrain with 5 animated meshes. Pretty decent realtime perf. #gamedev #gameengine #gamephysics youtu.be/UGjnM480xtI

26.01.2026 08:04 👍 6 🔁 1 💬 0 📌 0

Switching to Release mode, the thing now refuses to go below 60FPS even with 80 such meshes, even when I switch to battery mode on my laptop! Good lesson as I am now going to optimize things in Debug mode at "low power" profiles so that the real thing has large safety factors built-in. #gamedev

14.01.2026 17:45 👍 3 🔁 1 💬 0 📌 0

I have a stress test for the game-engine I'm writing in which I have 20 animated meshes with 1M vertices with per-frame BVH calcs 3-levels deep on the CPU. Was at 1-2 FPS and a couple of months of optimizations led to ~20FPS. Then I realized I was compiling in Debug mode. 🤦‍♂️ #gamedev

14.01.2026 17:45 👍 3 🔁 1 💬 1 📌 0

...into them motivated by human reasoning. It is not clear whether the training of these models would have led to CoT reasoning being "discovered" spontaneously by these models.

12.01.2026 07:16 👍 0 🔁 0 💬 0 📌 0

Repeat after me: Transformers are just glorified statistical distribution learners and still represent memory rather than logic. What they have demonstrated is that a lot of what we do with text and images might be similar to this. But even the "reasoning" models have gating criteria built...

12.01.2026 07:16 👍 0 🔁 0 💬 1 📌 0
Post image

Sometimes I wonder why people keep making the claims about LLM intelligence that they do (Gemini 3 Pro used here). This is well within the reach of an 8-9 year old (with some patience!) whose energy budget, even while calculating this, is far below that of the hardware that runs these AI models.

12.01.2026 07:10 👍 0 🔁 0 💬 1 📌 0

Also, finished the Golden Kamuy manga (the final season of the anime is airing). Was pretty good till the middle but then stretched the story for far too long, imho. Nevertheless, very good overall and a very unique setting which educated me a lot about the history of Hokkaido and nearby regions.

07.01.2026 22:00 👍 1 🔁 0 💬 0 📌 0

Back from the holidays and my codebase feels completely alien to me.

07.01.2026 18:50 👍 0 🔁 0 💬 0 📌 0
Post image

I believe that, in 2025, gaming on Linux has taken an irreversible leap. I am not an early adopter by any means (I like stable platforms as a dev), but this is my Steam replay for 2025. #linuxgaming #steamonlinux

18.12.2025 08:08 👍 1 🔁 0 💬 0 📌 0

It dawned upon me last evening that cities tend to outlast countries, kingdoms and even civilisations. And they are able to preserve part of their existing identity while amalgamating new ones. There is something very resilient about cities as a construct.

28.11.2025 18:36 👍 0 🔁 0 💬 0 📌 0

Finished the remaster of Myst (2021), my first time ever playing the franchise - I admit I had to use walkthroughs on more than one occasion! But what a game and quite a reminder of how minimalism can still lead to a thoroughly engrossing game! #myst #riven

17.10.2025 21:10 👍 0 🔁 0 💬 0 📌 0

Perhaps there will be drastically different stuff down the line, but the more I use AIs, the more I become convinced that the current crop will not replace us. The capabilities of the SotA LLMs seem to be saturating very quickly and they are becoming (very useful) tools rather than replacements.

04.10.2025 00:02 👍 0 🔁 0 💬 0 📌 0
Video thumbnail

Implemented real-time collision-detection in my engine using per-frame BVH calculation for animated meshes. The colliding objects have their BVH drawn in orange with the intersecting nodes highlighted in cyan. (Also some basic heightmaps and fog). Next step: actual physics! #gamedev #gameengine

22.09.2025 16:23 👍 9 🔁 3 💬 0 📌 0
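
Finding the intersecting nodes between two BVHs is typically done with a simultaneous descent of both trees. The sketch below is one minimal way to do that, not the engine's actual code: the `Node` type, its field names and the example boxes are all hypothetical, and every internal node is assumed to have exactly two children.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Node:
    """Hypothetical BVH node: an axis-aligned box plus optional children."""
    name: str
    lo: Vec3
    hi: Vec3
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    @property
    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def boxes_overlap(a: Node, b: Node) -> bool:
    # Two AABBs overlap only if their intervals overlap on every axis.
    return all(a.lo[i] <= b.hi[i] and b.lo[i] <= a.hi[i] for i in range(3))


def overlapping_leaves(a: Node, b: Node) -> List[Tuple[str, str]]:
    """Return the names of all leaf pairs whose boxes overlap."""
    if not boxes_overlap(a, b):
        return []  # prune: nothing below these two nodes can touch
    if a.is_leaf and b.is_leaf:
        return [(a.name, b.name)]
    if not a.is_leaf:  # descend `a` first, then `b`
        return overlapping_leaves(a.left, b) + overlapping_leaves(a.right, b)
    return overlapping_leaves(a, b.left) + overlapping_leaves(a, b.right)


mesh_a = Node("A", (0, 0, 0), (4, 1, 1),
              Node("A0", (0, 0, 0), (2, 1, 1)), Node("A1", (2, 0, 0), (4, 1, 1)))
mesh_b = Node("B", (3, 0, 0), (6, 1, 1),
              Node("B0", (3, 0, 0), (4.5, 1, 1)), Node("B1", (4.5, 0, 0), (6, 1, 1)))
pairs = overlapping_leaves(mesh_a, mesh_b)
print(pairs)
```

The returned leaf pairs are the candidates that would be highlighted or passed on to an exact triangle-level test.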

Finished playing #Exit8 - very innovative and very inspiring if you are an indie gamedev.

15.09.2025 07:30 👍 3 🔁 0 💬 0 📌 0
Video thumbnail

Finally, glTF morph and skeletal animations implemented within the engine. And now a dynamic bounding volume hierarchy for both static and animated meshes being calculated in realtime. #gamedev #gameengine

09.09.2025 13:18 👍 15 🔁 3 💬 0 📌 0
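
One common way to keep a BVH valid for an animated mesh is to refit it each frame: keep the tree topology and recompute the boxes bottom-up from the new vertex positions. The post does not say whether the engine refits or rebuilds, so the sketch below is only an illustration; the `Node` layout and `refit` function are hypothetical, and leaves are assumed to reference at least one vertex.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Node:
    """Hypothetical BVH node; leaves list the vertex indices they cover."""
    vertex_ids: List[int] = field(default_factory=list)
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    lo: Vec3 = (0.0, 0.0, 0.0)
    hi: Vec3 = (0.0, 0.0, 0.0)


def refit(node: Node, positions: Sequence[Vec3]) -> None:
    """Recompute bounds bottom-up without changing the tree structure."""
    children = [c for c in (node.left, node.right) if c is not None]
    if not children:
        # Leaf: bound the current positions of its own vertices.
        pts = [positions[i] for i in node.vertex_ids]
        node.lo = tuple(min(p[k] for p in pts) for k in range(3))
        node.hi = tuple(max(p[k] for p in pts) for k in range(3))
        return
    for child in children:
        refit(child, positions)
    # Internal node: union of the freshly refitted child boxes.
    node.lo = tuple(min(c.lo[k] for c in children) for k in range(3))
    node.hi = tuple(max(c.hi[k] for c in children) for k in range(3))


positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
root = Node(left=Node(vertex_ids=[0, 1]), right=Node(vertex_ids=[2, 3]))
refit(root, positions)

positions[3] = (3.0, 5.0, 0.0)  # the animation moves one vertex
refit(root, positions)
print(root.lo, root.hi)
```

Refitting is linear in the number of nodes, but tree quality degrades if the mesh deforms a lot, which is one reason an engine might rebuild instead.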
Post image

Mt. Fuji while landing into Tokyo Haneda airport today. #tokyo #mtfuji

02.09.2025 11:48 👍 0 🔁 0 💬 0 📌 0

My hacky solution for this has been to generate single-pixel dummy textures even for meshes that do not need them and slot them in manually (e.g., an object may not need an emissive or occlusion texture but the shader has those samplers declared). Not a huge performance/memory cost but ugly.

25.08.2025 21:19 👍 1 🔁 0 💬 0 📌 0
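
The dummy-texture workaround can be sketched on the CPU side as below. No graphics API calls are shown; `Texture`, `NEUTRAL_PIXELS` and `fill_missing_slots` are hypothetical names, and the neutral pixel values (black for emissive, white for occlusion and base color) are assumptions chosen so the dummy has no visual effect.

```python
from dataclasses import dataclass

# Assumed neutral 1x1 RGBA8 pixels for each sampler the shader declares.
NEUTRAL_PIXELS = {
    "base_color": bytes([255, 255, 255, 255]),  # white leaves the color factor unchanged
    "emissive": bytes([0, 0, 0, 255]),          # black adds no emission
    "occlusion": bytes([255, 255, 255, 255]),   # white means fully unoccluded
}


@dataclass
class Texture:
    width: int
    height: int
    rgba: bytes
    is_dummy: bool = False


def fill_missing_slots(material: dict) -> dict:
    """Return a copy of `material` with every declared sampler slot populated."""
    filled = dict(material)
    for slot, pixel in NEUTRAL_PIXELS.items():
        if filled.get(slot) is None:
            filled[slot] = Texture(1, 1, pixel, is_dummy=True)
    return filled


# A material that only ships a 2x2 base color texture.
material = {"base_color": Texture(2, 2, bytes(16))}
bound = fill_missing_slots(material)
dummy_slots = sorted(slot for slot, tex in bound.items() if tex.is_dummy)
print(dummy_slots)
```

Each filled slot would then be uploaded and bound like any real texture, so every declared sampler has something attached.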

And, as is the case for most shader bugs, zero feedback from the GPU, especially my NVIDIA one. Surprisingly, it was on an Apple M4 Pro GPU that I got some feedback from the driver alerting me to the root cause (i.e., the shader expecting a slotted texture because a sampler uniform was declared).

25.08.2025 21:15 👍 0 🔁 0 💬 1 📌 0

Had a lot of trouble figuring out a weird bug where, if you declare a sampler2D texture-sampler uniform var in GLSL and there are no textures slotted in, the shader goes kaput even if the sampler itself is never used (gated by conditionals). And that was my main PBR shader for most meshes!

25.08.2025 21:12 👍 0 🔁 0 💬 1 📌 0

Added Apple Neural Engine support as well through Python bindings to whisper.cpp (via pywhispercpp). Feels like it is easier on battery life than using the MPS backend in PyTorch. #pytorch #whispercpp

19.08.2025 06:47 👍 0 🔁 0 💬 0 📌 0
Video thumbnail

Got real-time bounding-volume hierarchy creation in place in my engine. The left mesh with the cans and drums has 300k triangles. #gamedev

16.08.2025 19:22 👍 4 🔁 2 💬 0 📌 0
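
For readers unfamiliar with BVH construction, here is a minimal top-down build over triangles using a median split on the longest axis. It is a generic textbook approach and not necessarily what the engine does; all names (`BVHNode`, `build_bvh`, `leaf_size`) are hypothetical, and a real-time build over 300k triangles would need a much faster scheme than sorting at every level.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Triangle = Tuple[Vec3, Vec3, Vec3]


@dataclass
class BVHNode:
    lo: Vec3
    hi: Vec3
    triangles: List[Triangle]            # non-empty only on leaves
    left: Optional["BVHNode"] = None
    right: Optional["BVHNode"] = None


def bounds(tris: Sequence[Triangle]) -> Tuple[Vec3, Vec3]:
    pts = [p for tri in tris for p in tri]
    lo = tuple(min(p[i] for p in pts) for i in range(3))
    hi = tuple(max(p[i] for p in pts) for i in range(3))
    return lo, hi


def centroid(tri: Triangle, axis: int) -> float:
    return (tri[0][axis] + tri[1][axis] + tri[2][axis]) / 3.0


def build_bvh(tris: Sequence[Triangle], leaf_size: int = 2) -> BVHNode:
    """Recursively split triangles at the median centroid of the longest axis."""
    lo, hi = bounds(tris)
    if len(tris) <= leaf_size:
        return BVHNode(lo, hi, list(tris))
    axis = max(range(3), key=lambda i: hi[i] - lo[i])
    ordered = sorted(tris, key=lambda t: centroid(t, axis))
    mid = len(ordered) // 2
    return BVHNode(lo, hi, [],
                   build_bvh(ordered[:mid], leaf_size),
                   build_bvh(ordered[mid:], leaf_size))


# Four triangles laid out along the x axis.
tris = [((float(i), 0.0, 0.0), (i + 1.0, 0.0, 0.0), (float(i), 1.0, 0.0)) for i in range(4)]
root = build_bvh(tris, leaf_size=1)
print(root.lo, root.hi)
```

With `leaf_size=1` the four triangles end up in four leaves two levels below the root.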

Just saw on HN that ffmpeg added whisper support! Darn it! But kudos to them!: code.ffmpeg.org/FFmpeg/FFmpe...

14.08.2025 02:51 👍 0 🔁 0 💬 1 📌 0