https://assets.learnk8s.io/linkedin-174.png
Just landed: Learn Kubernetes weekly 174! My top picks:
- Making and Scaling a Game Server with Agones
- Zero-Downtime PostgreSQL Migration
- From Chaos to 99.9% Uptime
- k8s-d2: Kubernetes visualization
Read it here: https://kube.today/issues/174
11.03.2026 12:11
What worked was turning the specification into code that checks the current state, compares it to what should exist, and blocks progress until earlier steps are done.
The specification isn't a document. It's a build system.
https://danielepolencic.com/specification-is-the-product
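The gist of that spec-as-build-system idea can be sketched in a few lines of Python. The `Step` shape and the step names below are my own invention for illustration, not from the article:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Step:
    name: str
    check: Callable[[], bool]   # does the desired state already exist?
    apply: Callable[[], None]   # action that should make check() true

def run(steps: list[Step]) -> list[str]:
    done = []
    for i, step in enumerate(steps):
        # Block progress if any earlier step's state has regressed.
        if not all(s.check() for s in steps[:i]):
            raise RuntimeError(f"earlier step regressed before {step.name!r}")
        if not step.check():
            step.apply()
        if not step.check():
            raise RuntimeError(f"{step.name!r} did not converge")
        done.append(step.name)
    return done
```

Like a build system, a step that is already satisfied is skipped, and nothing later runs until everything earlier checks out.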
10.03.2026 13:36
I tried several specification languages against a workflow I actually maintain.
The AI understood every step. It just skipped the ones it didn't feel like doing.
Improving the description changed nothing.
10.03.2026 13:36
AI coding tools are getting faster. Some people run them directly on production, skipping reviews and checks entirely.
Others build chains of requirements docs & architecture decisions.
One camp says code is the artifact. The other says specifications.
10.03.2026 13:36
Each one either got bypassed (the agent runs as me, same UID, same permissions) or locked the agent out so completely that it couldn't do its job.
I came to the conclusion that the credentials shouldn't be on the machine at all.
https://danielepolencic.com/hiding-secrets-from-ai-agents
06.03.2026 13:31
I tried five ways to stop this:
- Encrypted the files
- Moved secrets to Keychain
- Gated with Touch ID
- Built a compiled native addon
- Ran the agent in a sandbox
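A minimal sketch of why the file-level approaches fail, assuming the agent runs under your UID (the file and the token are made up):

```python
import os
import tempfile

def agent_can_read(path: str) -> bool:
    # The "agent" here is just this process: same UID, same permissions.
    return os.access(path, os.R_OK)

# A credentials file locked down to owner-only (mode 0600) is still
# readable, because owner-only permissions can't tell you apart from
# an agent running as you.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"refresh_token=...")   # hypothetical secret
    creds = f.name
os.chmod(creds, 0o600)
```

Encryption has the same problem one level up: if the process can reach the decryption key, so can the agent.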
06.03.2026 13:31
I built a few CLI tools for myself: one searches Gmail, another reads GDocs.
My AI agent needed to download an email attachment. My CLI didn't have that command. So it found the credentials on disk, called the API directly, and leaked my refresh token.
06.03.2026 13:31
https://assets.learnk8s.io/linkedin-173.png
Just landed: Learn Kubernetes weekly 173! My top picks:
- Integration Testing with Kubernetes
- Vault OIDC Authentication
- Admission & Runtime Guardrails
- Kogaro Config Hygiene Agent
Read it here: https://kube.today/issues/173
04.03.2026 12:11
I'm presenting a live session this Thursday with vCluster:
GPU Multi-Tenancy: When to Share, When to Separate
Register here: ku.bz/multitenant26
03.03.2026 14:06
https://res.cloudinary.com/learnk8s/image/upload/v1772544389/gpu-sharing-problems-2026/slide-9.png
The worst case: two process contexts each believe there's enough memory. Hidden reservations and runtime overhead keep shrinking real headroom.
When another workload arrives, both crash with out-of-memory errors.
03.03.2026 14:06
https://res.cloudinary.com/learnk8s/image/upload/v1772544388/gpu-sharing-problems-2026/slide-8.png
Even after memory is freed, allocation patterns can leave fragmented gaps. You may have free VRAM in total and still fail the next allocation.
That's a failure mode that doesn't show up in your dashboard until it happens.
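A toy first-fit allocator makes the failure concrete. The gap sizes are illustrative, not from a real driver:

```python
def first_fit(free_gaps: list[int], size: int) -> bool:
    """True if any single contiguous gap can hold the request."""
    return any(gap >= size for gap in free_gaps)

# After churn, freed VRAM is scattered across small gaps (in GB):
gaps_gb = [2, 1, 2, 1]
total_free = sum(gaps_gb)      # 6 GB free in total...
ok = first_fit(gaps_gb, 3)     # ...but no 3 GB contiguous gap exists
```

Total free memory says yes; the allocator, which needs one contiguous region, says no.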
03.03.2026 14:06
https://res.cloudinary.com/learnk8s/image/upload/v1772544387/gpu-sharing-problems-2026/slide-7.png
nvidia-smi is useful, but the driver uses a pooling allocator. Reserved memory, active model memory, and temporary workspace don't cleanly add up to one number.
You get useful signals. Not a precise per-workload memory bill.
03.03.2026 14:06
https://res.cloudinary.com/learnk8s/image/upload/v1772544386/gpu-sharing-problems-2026/slide-6.png
So if a batch job launches long kernels, latency-sensitive requests queue behind it.
Average utilization can still look fine. P95 and P99 latencies tell a different story.
03.03.2026 14:06
https://res.cloudinary.com/learnk8s/image/upload/v1772544385/gpu-sharing-problems-2026/slide-5.png
CPUs preempt tasks constantly. They pause one task and rotate work across cores. That creates fair turn-taking.
GPU kernels typically run to completion. The next workload waits at kernel boundaries instead of getting a fair slice of time.
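A toy timeline shows the difference. All numbers are illustrative: a 100 ms batch kernel is running when a 2 ms request arrives at t=1 ms:

```python
def latency_run_to_completion(batch_ms: float, arrival_ms: float,
                              req_ms: float) -> float:
    # GPU-style: the resident kernel finishes first; the request waits
    # at the kernel boundary before it can start.
    start = max(batch_ms, arrival_ms)
    return start + req_ms - arrival_ms

def latency_preemptive(req_ms: float) -> float:
    # CPU-style: the scheduler preempts the long task and runs the
    # short one almost immediately.
    return req_ms

gpu_ms = latency_run_to_completion(100, 1, 2)   # waits behind the kernel
cpu_ms = latency_preemptive(2)                  # gets its time slice
```

Same hardware budget, wildly different tail latency for the short request.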
03.03.2026 14:06
https://res.cloudinary.com/learnk8s/image/upload/v1772544384/gpu-sharing-problems-2026/slide-4.png
GPUs work differently. The driver is in charge of what the kernel would normally handle: memory allocation, execution sequencing, and runtime coordination.
Your real sharing boundaries are defined by driver behavior, not kernel primitives.
03.03.2026 14:06
https://res.cloudinary.com/learnk8s/image/upload/v1772544383/gpu-sharing-problems-2026/slide-3.png
For CPU and memory, Kubernetes uses cgroups. A container asks for a fraction of CPU or a fixed memory limit, and the Linux kernel enforces it.
That gives predictable limits and fair sharing between workloads.
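A rough sketch of how those requests become enforceable numbers, assuming a small subset of the Kubernetes quantity syntax and the cgroup v2 `cpu.max` quota/period format:

```python
def cpu_millicores(q: str) -> int:
    # "500m" -> 500 millicores; "2" -> 2000 millicores.
    return int(q[:-1]) if q.endswith("m") else int(float(q) * 1000)

def memory_bytes(q: str) -> int:
    # Binary suffixes as Kubernetes uses them: Ki, Mi, Gi.
    units = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}
    for suffix, mult in units.items():
        if q.endswith(suffix):
            return int(q[:-len(suffix)]) * mult
    return int(q)

# cgroup v2 cpu.max is "<quota> <period>" in microseconds:
# 500m of CPU over a 100 ms period becomes a 50 ms quota.
period_us = 100_000
quota_us = cpu_millicores("500m") * period_us // 1000
```

The kernel then throttles the cgroup once it burns its quota within each period, which is where the predictability comes from.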
03.03.2026 14:06
https://res.cloudinary.com/learnk8s/image/upload/v1772544380/gpu-sharing-problems-2026/slide-1.png
You want to share GPUs: one team runs inference, another trains models, and both need the same expensive cards.
The problem is that GPUs don't behave like CPU and RAM under contention.
(I will cover this on Thursday: ku.bz/multitenant26 )
(thread)
03.03.2026 14:06
https://res.cloudinary.com/learnk8s/image/upload/v1772004215/kubex-book-2026/slide-10.png
The book covers measurement, architecture decisions, and full-stack right-sizing across 4 chapters.
Free download: ku.bz/KL4jRvsL4
02.03.2026 12:41
https://res.cloudinary.com/learnk8s/image/upload/v1772004212/kubex-book-2026/slide-9.png
When you rent a GPU node, you also pay for the CPU and memory that come with it. It's a bundle.
If the GPU is fully reserved but your workloads barely touch the CPU and memory, most of what you're paying for sits idle.
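Back-of-the-envelope version, with a made-up hourly price split:

```python
def idle_spend(price_per_hour: float, shares: dict[str, float],
               utilization: dict[str, float]) -> float:
    """Money per hour spent on capacity that sits idle."""
    return sum(price_per_hour * share * (1 - utilization[part])
               for part, share in shares.items())

# Say the GPU accounts for 70% of a $10/hour node and is fully used,
# while the bundled CPU+memory (30% of the price) run at 10% utilization.
shares = {"gpu": 0.7, "cpu_mem": 0.3}
util = {"gpu": 1.0, "cpu_mem": 0.1}
wasted = idle_spend(10.0, shares, util)   # dollars/hour buying idle capacity
```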
02.03.2026 12:41
https://res.cloudinary.com/learnk8s/image/upload/v1772004210/kubex-book-2026/slide-8.png
You can also split a GPU into separate sections with hard boundaries.
Each gets its own compute and memory. No interference between workloads.
A training job on one section reaches 89% efficiency, almost the same as having the whole GPU.
02.03.2026 12:41
https://res.cloudinary.com/learnk8s/image/upload/v1772004207/kubex-book-2026/slide-7.png
You can share a GPU by giving pods turns on the same hardware. Sounds efficient.
But GPUs don't multitask like CPUs. Each job runs to completion before the next starts.
A training job at 92% efficiency alone drops to 47% when sharing.
02.03.2026 12:41
https://res.cloudinary.com/learnk8s/image/upload/v1772004204/kubex-book-2026/slide-6.png
A pod using almost no CPU and RAM can still lock an entire GPU node.
GPUs are the scheduling bottleneck. The remaining CPU and memory sit idle, and you're paying for all of it.
02.03.2026 12:41
https://res.cloudinary.com/learnk8s/image/upload/v1772004202/kubex-book-2026/slide-5.png
The metrics that actually matter:
SM Active: are the compute cores busy or waiting?
DRAM Active: is memory bandwidth the bottleneck?
Tensor pipeline: is mixed-precision hitting the fast path?
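One way to read the first two signals together. The thresholds below are my own rough assumptions, not DCGM guidance:

```python
def classify(sm_active: float, dram_active: float) -> str:
    """Rough bottleneck triage from two fractions in [0, 1]."""
    if sm_active < 0.3 and dram_active < 0.3:
        return "underutilized"    # cores and bandwidth both mostly idle
    if dram_active > sm_active:
        return "memory-bound"     # bandwidth saturates before compute
    return "compute-bound"        # cores are the limiting factor
```

Useful as a first cut before profiling a workload properly.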
02.03.2026 12:41
https://res.cloudinary.com/learnk8s/image/upload/v1772004199/kubex-book-2026/slide-4.png
nvidia-smi's GPU-Util is a time-based "busy" signal: the percentage of time any kernel was running.
A pod doing nothing useful can show 54%.
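The arithmetic behind that number, with toy values:

```python
def gpu_util(busy_ms: float, window_ms: float) -> float:
    """Fraction of the sample window with *any* kernel resident, as a %."""
    return 100 * busy_ms / window_ms

# A tiny kernel keeping a single SM busy for 54 ms of every 100 ms
# window reports 54% GPU-Util, even if most of the chip did nothing.
util = gpu_util(54, 100)
```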
02.03.2026 12:41
https://res.cloudinary.com/learnk8s/image/upload/v1772004196/kubex-book-2026/slide-3.png
Three layers, three different answers.
Kubernetes: "we're full."
nvidia-smi: "2% utilization."
The app: "6.67 requests per second."
All correct. None tells you if the GPU is efficient.
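One way to reconcile the three layers explicitly. The field names are hypothetical, but the point is that "efficient" only exists once you relate work done to capacity reserved:

```python
def gpu_report(allocated: int, total: int,
               util_pct: float, rps: float) -> dict:
    return {
        "scheduler_full": allocated == total,  # Kubernetes' view: reserved
        "chip_busy_pct": util_pct,             # nvidia-smi's view: time busy
        "work_done_rps": rps,                  # the app's view: throughput
        # The number none of the layers report on their own:
        "rps_per_gpu": rps / allocated if allocated else 0.0,
    }

report = gpu_report(allocated=4, total=4, util_pct=2.0, rps=6.67)
```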
02.03.2026 12:41
https://res.cloudinary.com/learnk8s/image/upload/v1772004194/kubex-book-2026/slide-2.png
Your dashboards show 4/4 GPUs allocated. Everyone assumes they're being used.
But allocation just means "reserved." It says nothing about whether the GPU is actually doing work.
02.03.2026 12:41
https://res.cloudinary.com/learnk8s/image/upload/v1772004192/kubex-book-2026/slide-1.png
Gulcan and I wrote a free book on right-sizing GPUs in Kubernetes.
Here's the short version (thread)
02.03.2026 12:41
https://assets.learnk8s.io/linkedin-172.png
Just landed: Learn Kubernetes weekly 172! My top picks:
- Data Streaming: Kafka + Flink Baggage Tracker
- Raspberry Pi Home Kubernetes Cluster
- AI Document Processing with Ray
- Wozz: Kubernetes Cost Tool
Read it here: https://kube.today/issues/172
25.02.2026 12:11
learnkube.com/etcd-breaks-at-scale
24.02.2026 13:21
https://res.cloudinary.com/learnk8s/image/upload/v1771937075/etcd-breaks-at-scale-2026/slide-8.png
K3s ships Kine, a shim that speaks the etcd API but stores data in SQLite or PostgreSQL.
AWS replaced Raft with a journal service. Google swapped in Spanner. All kept the etcd API.
24.02.2026 13:21