
Matteo Collina

@nodeland.dev

Platformatic.dev Co-Founder & CTO, Node.js TSC member, Lead maintainer Fastify, Board OpenJS, Conference Speaker, Ph.D. Views are my own.

4,362
Followers
356
Following
1,301
Posts
17.03.2023
Joined

Latest posts by Matteo Collina @nodeland.dev

Skew Protection is available now in ICC as an experimental feature.

If your team wants to try it in a real enterprise setup, let me know; my DMs are open.

Full deep-dive blog post: blog.platformatic.dev/skew-protect...

06.03.2026 16:59 ๐Ÿ‘ 0 ๐Ÿ” 0 ๐Ÿ’ฌ 0 ๐Ÿ“Œ 0
Post image

The deployment lifecycle is a clean state machine.

ICC monitors traffic on draining versions. When there's zero traffic (or the grace period elapses), it removes routing rules, scales to zero, and optionally deletes the old Deployment.

06.03.2026 16:59 ๐Ÿ‘ 0 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0
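The lifecycle in the post above can be sketched as a tiny transition function. A hedged sketch only: the state names and the options shape are my assumptions, not ICC internals.

```typescript
// Hedged sketch of the deployment lifecycle as a state machine.
// State names and the `opts` shape are assumptions, not ICC's actual code.
type DeploymentState = "active" | "draining" | "removed";

function nextState(
  state: DeploymentState,
  opts: { activeRequests: number; graceExpired: boolean },
): DeploymentState {
  switch (state) {
    case "active":
      // A newer version was deployed: stop receiving new sessions.
      return "draining";
    case "draining":
      // Zero traffic, or the grace period elapsed: remove routing rules,
      // scale to zero.
      return opts.activeRequests === 0 || opts.graceExpired
        ? "removed"
        : "draining";
    case "removed":
      // Terminal: the old Deployment is optionally deleted.
      return "removed";
  }
}
```

Calling `nextState` on each reconcile tick with the current traffic count reproduces the active → draining → removed flow.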
Post image

How it works:

- Each app version runs as a separate, immutable K8s Deployment
- ICC detects new versions via label-based discovery
- A __plt_dpl cookie pins users to their deployment version
- Old versions drain gracefully, then get cleaned up automatically

06.03.2026 16:59 ๐Ÿ‘ 0 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0
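The cookie pinning in the steps above can be sketched like this. A minimal illustration, not ICC's actual code: the cookie name `__plt_dpl` comes from the post, while the function names and version labels are assumptions.

```typescript
// Minimal sketch of cookie-based version pinning — not ICC's actual code.
// Only the cookie name __plt_dpl is from the post.
const CURRENT_VERSION = "v42"; // assumed label of the newest Deployment

function parseCookies(header: string | undefined): Record<string, string> {
  const out: Record<string, string> = {};
  for (const part of (header ?? "").split(";")) {
    const [k, ...rest] = part.trim().split("=");
    if (k) out[k] = rest.join("=");
  }
  return out;
}

// Decide which Deployment version serves this request. New users (or users
// pinned to an already-drained version) get the current version plus a cookie.
function pickVersion(cookieHeader: string | undefined, liveVersions: Set<string>) {
  const pinned = parseCookies(cookieHeader)["__plt_dpl"];
  if (pinned && liveVersions.has(pinned)) {
    return { version: pinned, setCookie: undefined };
  }
  return {
    version: CURRENT_VERSION,
    setCookie: `__plt_dpl=${CURRENT_VERSION}; Path=/; HttpOnly`,
  };
}
```

In a real gateway this decision would feed the routing layer (e.g. a Gateway API HTTPRoute match), but the pinning logic itself is this small.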
Post image

Our solution: ICC pins each user session to the version they started with.

User starts on version N? All their requests go to version N, even after you deploy version N+1.

We use the Kubernetes Gateway API for version-aware routing, with ICC as the control plane.

06.03.2026 16:59 ๐Ÿ‘ 0 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0

This is a distributed systems problem that slows teams down.

Fear of breaking changes leads to larger, less-frequent deployments that carry MORE risk.

In a world where AI lets you write code faster, the bottleneck lies in the gap between code and production.

06.03.2026 16:59 ๐Ÿ‘ 0 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0

The problem: version skew.

When you deploy a new version, users still on the old frontend send requests to the new backend. APIs change, shared TypeScript types break, React Server Components hydration fails.

The result? Broken UI, data corruption, and support tickets piling up.

06.03.2026 16:59 ๐Ÿ‘ 0 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0
Post image

We just shipped something big: Skew Protection for your Kubernetes apps, built right into the @platformatic Intelligent Command Center (ICC).

Think @Vercel-style deployment safety, but running in your own K8s cluster. No migration needed.

Here's why it matters. ๐Ÿงต

06.03.2026 16:59 ๐Ÿ‘ 6 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0


Read the full blog post at blog.platformatic.dev/auditable-ai...

04.03.2026 16:59 ๐Ÿ‘ 0 ๐Ÿ” 0 ๐Ÿ’ฌ 0 ๐Ÿ“Œ 0
Post image

Run it locally!

๐Ÿ”— github.com/platformatic/ai-gateway-auditable

04.03.2026 16:59 ๐Ÿ‘ 0 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0

Why this matters: graceful degradation vs data loss.

Temporary failures = brief audit lag.
Lost audit data = regulatory risk, fines.

We trade insight delays for certainty that no evidence is lost.

04.03.2026 16:59 ๐Ÿ‘ 0 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0

Streaming preserved: the proxy pipes SSE chunks to the client in real time, buffers them, then emits an audit record with streamed: true.

Users get low-latency streaming. Operators get complete records. Win-win.

04.03.2026 16:59 ๐Ÿ‘ 0 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0
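The pipe-while-buffering pattern above can be sketched as follows. Assumed shapes, not the gateway's actual code; only `streamed: true` comes from the post.

```typescript
// Sketch of pipe-while-buffering (assumed shapes, not the gateway's code).
// Chunks go to the client immediately, while a copy accumulates for the
// audit record emitted once the stream ends.
async function pipeWithAudit(
  upstream: AsyncIterable<string>,
  sendToClient: (chunk: string) => void,
  emitAudit: (record: { body: string; streamed: boolean }) => void,
): Promise<void> {
  const buffered: string[] = [];
  for await (const chunk of upstream) {
    sendToClient(chunk); // low-latency path: the user sees tokens in real time
    buffered.push(chunk); // audit path: keep the full response
  }
  emitAudit({ body: buffered.join(""), streamed: true });
}
```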

The audit pipeline:
1. Proxy enqueues payload
2. audit-worker writes JSONL batches (100 records or 5s)
3. Upload to S3 with SigV4
4. Delete local only after success

Hour-partitioned keys work great with Athena.

04.03.2026 16:59 ๐Ÿ‘ 0 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0
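A minimal sketch of the batching rule in step 2: the 100-record / 5-second thresholds are from the post, the function names are assumptions.

```typescript
// Flush rule from step 2 of the pipeline. Thresholds (100 records, 5 s)
// come from the post; function names are assumptions.
const MAX_RECORDS = 100;
const MAX_AGE_MS = 5_000;

function shouldFlush(pending: number, oldestEnqueuedAt: number, now: number): boolean {
  if (pending === 0) return false; // nothing to write
  return pending >= MAX_RECORDS || now - oldestEnqueuedAt >= MAX_AGE_MS;
}

// Step 2's JSONL encoding: one JSON document per line.
function toJsonl(records: object[]): string {
  return records.map((r) => JSON.stringify(r)).join("\n") + "\n";
}
```

Steps 3 and 4 (SigV4 upload, delete-after-success) then operate on whole batch files, which is what makes the pipeline crash-tolerant.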

Why filesystem-based storage? No Redis needed = simpler local dev. Still crash-tolerant: queue survives process restarts.

โš ๏ธ Trade-off: moving audit to main response cycle introduces latency.

04.03.2026 16:59 ๐Ÿ‘ 0 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0
Post image

Architecture at a glance:
โ€ข proxy โ€” low-latency request/response
โ€ข audit-worker โ€” durable queue consumption + batch shipping to S3

Keeps user-facing traffic fast while audit pipeline catches up safely.

04.03.2026 16:59 ๐Ÿ‘ 0 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0

That's why we built ai-gateway-auditable.

OpenAI-compatible gateway with @Platformatic Watt:
โœ… Provider routing with fallback
โœ… Durable audit logging to S3
โœ… Production-ready

๐Ÿ”— github.com/platformatic/ai-gateway-auditable

04.03.2026 16:59 ๐Ÿ‘ 0 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0

The problem: direct integrations = audit gaps. Finance needs clean attribution. Security needs auditable traces. Our early adopter saw up to 15% of request logs missed during peak volume, and latency spiked 2x when providers slowed.

04.03.2026 16:59 ๐Ÿ‘ 0 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0
Post image

Every engineering team hitting scale with AI faces the same headache: what started as a simple provider integration becomes an operational nightmare. Usage tracking, cost containment, and audit trails slip out of reach fast.

๐Ÿงต Here's how we solved it.

04.03.2026 16:59 ๐Ÿ‘ 0 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0

The clanker can certainly do that quickly enough. Use node:sqlite.

04.03.2026 12:40 ๐Ÿ‘ 1 ๐Ÿ” 0 ๐Ÿ’ฌ 0 ๐Ÿ“Œ 0

Full story at blog.platformatic.dev/job-queue-re...

03.03.2026 16:59 ๐Ÿ‘ 1 ๐Ÿ” 0 ๐Ÿ’ฌ 0 ๐Ÿ“Œ 0

We've been testing this at @platformatic, and it's been solid.

But we want YOUR edge cases. Try it, break it, open issues.

๐Ÿ“ฆ npm install @platformatic/job-queue
๐Ÿ”— github.com/platformatic...

03.03.2026 16:59 ๐Ÿ‘ 1 ๐Ÿ” 1 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0
Post image

TypeScript-native. Typed payloads and results.

03.03.2026 16:59 ๐Ÿ‘ 2 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0
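A hypothetical illustration of what typed payloads and results buy you — this is not the actual @platformatic/job-queue API, just the general pattern:

```typescript
// Hypothetical shape only — NOT the @platformatic/job-queue API.
// The point: payload and result types flow from the handler definition
// to every call site, so a wrong payload fails at compile time.
interface JobDef<Payload, Result> {
  name: string;
  handler: (payload: Payload) => Promise<Result>;
}

function defineJob<P, R>(
  name: string,
  handler: (payload: P) => Promise<R>,
): JobDef<P, R> {
  return { name, handler };
}

// Example job: payload and result types are inferred from the handler.
const resizeImage = defineJob(
  "resize-image",
  async (payload: { url: string; width: number }) => {
    // Real image work would happen here; we just derive a URL.
    return { resizedUrl: `${payload.url}?w=${payload.width}` };
  },
);

// Runner sees the same generics, so the result type is inferred too.
async function run<P, R>(job: JobDef<P, R>, payload: P): Promise<R> {
  return job.handler(payload);
}
```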

Production reliability features:

โšก Reaper detects stalled jobs and requeues them
โšก Leader election so multiple Reaper instances run safely
โšก Graceful shutdown waits for in-flight jobs
โšก Failed jobs persist errors for inspection

Ship deploys without losing work.

03.03.2026 16:59 ๐Ÿ‘ 1 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0

Three storage backends for different stages:

๐Ÿ‘ป MemoryStorage - dev & testing
๐Ÿ’พ FileStorage - simple single-node deploys
โ˜๏ธ RedisStorage - production, horizontal scaling, leader election

Start local, go distributed. No code changes.

03.03.2026 16:59 ๐Ÿ‘ 1 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0
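"No code changes" typically falls out of a storage interface. A hedged sketch of that pattern — the backend names echo the post, but the code is illustrative, not the library's actual implementation:

```typescript
// Illustrative sketch (not @platformatic/job-queue's actual code): the queue
// depends only on an interface, so backends swap without touching call sites.
interface QueueStorage {
  push(job: string): Promise<void>;
  pop(): Promise<string | undefined>;
}

// In-memory backend for dev & testing; file- or Redis-backed classes would
// implement the same interface.
class MemoryStorage implements QueueStorage {
  private items: string[] = [];
  async push(job: string): Promise<void> {
    this.items.push(job);
  }
  async pop(): Promise<string | undefined> {
    return this.items.shift();
  }
}

// The queue never names a concrete backend.
class Queue {
  constructor(private storage: QueueStorage) {}
  enqueue(job: string): Promise<void> {
    return this.storage.push(job);
  }
  dequeue(): Promise<string | undefined> {
    return this.storage.pop();
  }
}
```

Going "local → distributed" is then just `new Queue(new RedisStorage(...))` instead of `new Queue(new MemoryStorage())`.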