Listers is low-key loaded with good software advice 🦆🦆🦢🦆
Turns out it was a bad GIF. Whether it was the title or the file itself I'm not sure.
Regardless, better error descriptions and handling in general lead to a better world :)
How S3 works
🧵 19/19 To recap, S3 is the world's hard drive.
It's cheap, fast enough, and extremely reliable for most scenarios.
Smart system design lets AWS cut compute and storage costs while maintaining high availability and acceptable latency.
Now you know!
/rant
S3 is alive!? https://www.allthingsdistributed.com/2023/07/building-and-operating-a-pretty-big-storage-system.html
🧵 18/19 Crazier still, AWS doesn't just pile on more and more features; they proactively rework their codebase.
In 2021 they rewrote a fundamental service, ShardStore, in Rust. It's ~40k LOC and is frequently updated without interruptions to service.
S3 Metadata for better data access and usability
S3 Tables for better performance, access control, and structure
🧵 17/19 S3 Tables deals with Parquet files and can optimize how they're squished together and stashed away.
Meanwhile, S3 Metadata makes it much easier to search and organize all that data. It makes the data that is already present more useful.
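The compaction half is easy to sketch. A toy version (this is just the greedy bin-packing idea; S3 Tables' real maintenance rewrites Parquet files via Iceberg metadata and is much smarter):

```python
# Toy sketch of small-file compaction, the kind of maintenance S3 Tables
# automates. Greedily batch small files so each batch merges into
# roughly one TARGET-sized output file.
TARGET = 128 * 1024 * 1024  # aim for ~128 MB output files (made-up target)

def plan_compaction(file_sizes):
    """Group small files into batches that each merge into ~one output file."""
    groups, current, size = [], [], 0
    for f in sorted(file_sizes):
        if current and size + f > TARGET:
            groups.append(current)
            current, size = [], 0
        current.append(f)
        size += f
    if current:
        groups.append(current)
    return groups

small_files = [8 * 1024 * 1024] * 40       # forty 8 MB Parquet files
print(len(plan_compaction(small_files)))   # 40 inputs collapse into 3 outputs
```

Fewer, bigger files means far fewer GETs and less metadata to scan per query.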
Big Data has seen big growth over the last decade and is not slowing down
🧵 16/19 One of the newer, bigger user groups on S3 is data analytics. Think endless stuff in data lakes.
It's one thing to get data, but how to use it?
AWS is adapting its infra to this shift, and it has now automated key workflows that users previously had to implement themselves.
One example of SSDs reducing overall costs and compute for a customer
🧵 15/19 The most recent storage class even uses SSDs!
This makes sense because using expensive SSDs for the right data means savings, in both time and money, for AWS as well as customers.
It acts as S3's RAM--serving objects with very low latency so compute elsewhere is minimized.
S3 storage classes vary greatly in retrieval time and use cases https://cloudiamo.com/2024/12/13/s3-lifecycle-or-intelligent-tiering-object-size-always-matters/
🧵 14/19 Similarly, S3 offers many different tiers of storage to optimize the spread of hot and cold data throughout the system.
You can make rules yourself with S3 Lifecycle or automate the process with Intelligent-Tiering.
Retrieval times vary more than 10^6x, from single-digit ms to 12 hours
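If you go the Lifecycle route, a rule is just a small config. A sketch (bucket name and prefix are made up; the dict shape follows boto3's put_bucket_lifecycle_configuration):

```python
# Hypothetical Lifecycle rule: logs cool down to Infrequent Access after
# 30 days, Glacier after 90, and get deleted after a year.
lifecycle = {
    "Rules": [
        {
            "ID": "cool-down-logs",
            "Filter": {"Prefix": "logs/"},  # made-up prefix
            "Status": "Enabled",
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 365},
        }
    ]
}

# Applying it would look something like:
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="my-bucket", LifecycleConfiguration=lifecycle)
print(lifecycle["Rules"][0]["ID"])
```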
Balancing a growing operation by moving cold data on to new racks first for even distribution as hot data arrives
🧵 13/19 That's where balancing data comes in.
AWS engineers designed the system to preload colder data onto new storage racks to maintain an even distribution as newer, hotter data arrives.
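A toy version of that pre-balancing idea (all numbers made up; S3's real placement policy is far richer):

```python
# Empty racks join the fleet, so cold bytes are shed from the full racks
# until utilization is even, leaving headroom everywhere for hot writes.
old_used = [80.0, 80.0, 80.0, 80.0]   # TB used on each existing rack
old_cold = [50.0, 50.0, 50.0, 50.0]   # ...of which is cold data
new_used = [0.0, 0.0]                 # freshly installed racks

target = sum(old_used) / (len(old_used) + len(new_used))

pool = 0.0  # cold TB migrated off the old racks
for i in range(len(old_used)):
    shed = min(old_cold[i], max(0.0, old_used[i] - target))
    old_used[i] -= shed
    old_cold[i] -= shed
    pool += shed

new_used = [u + pool / len(new_used) for u in new_used]
print([round(x, 1) for x in old_used + new_used])  # roughly even now
```

Only cold data moves, so the migration itself doesn't compete with hot traffic.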
Data access over time in S3 cools down
🧵 12/19 When data moves into S3 it starts hot and gradually cools, i.e. it's used more often when it's young and is accessed less frequently as it ages.
This fact of life could mess with operations if not dealt with properly.
🧵 11/19 Another cool feature of S3 is that it becomes more predictable and resilient as it scales.
Since data is spread across tons of hard drives and you can't really forecast any one customer's reads, operating at huge scale smooths out aggregate demand.
As the service grows, it becomes less spiky.
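You can see the smoothing in a quick simulation (toy numbers; each "customer" is just a bursty random demand):

```python
import random
import statistics

random.seed(1)  # reproducible toy run

def demand():
    """One customer's requests this second (bursty, made-up model)."""
    return random.expovariate(1 / 100)

def rel_spread(n_customers, trials=2_000):
    """Std/mean of total demand: how spiky the aggregate looks."""
    totals = [sum(demand() for _ in range(n_customers)) for _ in range(trials)]
    return statistics.stdev(totals) / statistics.mean(totals)

print("spikiness, 1 customer:   ", round(rel_spread(1), 2))
print("spikiness, 100 customers:", round(rel_spread(100), 2))
```

Relative spikiness falls roughly like 1/sqrt(n) as independent workloads pool, which is exactly why a bigger S3 is a calmer S3.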
Shuffle sharding and erasure coding can eliminate long tails when it comes to latency. Long requests are retried after exceeding p95 times https://youtu.be/NXehLy7IiPM?si=sUv9AY6Xs7RCHgs_&t=1948
🧵 10/19 Erasure coding also helps with testing code in production. Since everything is super redundant, it's fine if things break in prod.
S3 even eliminates long tails by retrying requests that go over its p95: the request is resent to a different server, and thus a different shard. And it works!
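A rough simulation of that trick (the latency model is made up, but the mechanism matches: past the p95 deadline, fire a second request and take whichever finishes first):

```python
import random

random.seed(42)  # reproducible toy run

def request_latency():
    # Made-up latency model: usually fast, occasionally stuck behind a slow disk.
    base = random.uniform(5, 20)  # ms
    slow = random.expovariate(1 / 200) if random.random() < 0.05 else 0
    return base + slow

samples = sorted(request_latency() for _ in range(100_000))
p95 = samples[int(0.95 * len(samples))]

def hedged_latency():
    first = request_latency()
    if first <= p95:
        return first
    # Past the p95 deadline: resend to another server (a different shard)
    # and take whichever copy finishes first.
    return p95 + min(request_latency(), first - p95)

hedged = sorted(hedged_latency() for _ in range(100_000))

def pct(xs, q):
    return xs[int(q * len(xs))]

print("p99.9 without hedging:", round(pct(samples, 0.999)), "ms")
print("p99.9 with hedging:   ", round(pct(hedged, 0.999)), "ms")
```

Because the second copy almost always lands on a healthy shard, the extreme tail collapses while median latency is untouched.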
Erasure coding is safe and efficient but compute heavy
🧵 9/19 S3 achieves such high durability through erasure coding: splitting objects into chunks, computing extra parity chunks from that data, and storing those too.
The advantage of this approach is that instead of needing 3x storage from straight replication, data can be kept safe at ~1.8x.
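The overhead math, with hypothetical parameters (AWS hasn't published S3's exact scheme; k=5 data + m=4 parity just happens to hit the 1.8x figure):

```python
# Storage overhead of a k-of-n erasure code vs. plain replication.
def overhead(data_chunks: int, parity_chunks: int) -> float:
    """Bytes stored per byte of user data."""
    return (data_chunks + parity_chunks) / data_chunks

print(overhead(5, 4))  # 1.8 -> object survives losing any 4 of its 9 chunks
print(overhead(1, 2))  # 3.0 -> 3-way replication survives only 2 losses
```

More durability for far less disk, at the cost of some CPU to compute parity and reconstruct on reads.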
That's basically never
🧵 8/19 And not only is it big, it's reliable.
S3 is designed for 99.999999999% data durability. Famously, that's 11 nines.
If you stored 10,000 objects in S3, the math says you'd lose one object every 10,000,000 years.
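The arithmetic behind that, assuming 11 nines means an expected annual loss probability of 1e-11 per object:

```python
durability = 0.99999999999            # 11 nines
annual_loss_rate = 1 - durability     # ~1e-11 per object per year
objects = 10_000

expected_losses_per_year = objects * annual_loss_rate   # ~1e-7
years_per_lost_object = 1 / expected_losses_per_year

print(f"~1 object lost every {years_per_lost_object:,.0f} years")
```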
S3 by the numbers
🧵 7/19 To say S3 is a large service is an understatement. Look at these numbers
The evolution of HDDs is impressive https://highscalability.com/behind-aws-s3s-massive-scale/
🧵 6/19 For the past 30 years HDDs have been stuck at 120 IOPS. And they might be forever.
Progress elsewhere, however, isn't slowing yet, and there are already solid sketches of 200TB drives within the next 10 years.
So, design around the constraint! Shard, and shard hard.
HDDs are beyond accurate https://www.allthingsdistributed.com/2023/07/building-and-operating-a-pretty-big-storage-system.html
🧵 5/19 Slight detour--hard drives are wonderful and illustrate the insane progress in hardware over the last 75 years.
The catch is that they are constrained for I/O 😭
The power of two random choices for better storage capacity across the fleet https://youtu.be/NXehLy7IiPM?si=bbT8qAqM80-AEze_&t=2117
🧵 4/19 The way data is added to the system is called shuffle sharding. It's totally random, but not just regular random.
Before committing to a drive, S3 actually looks at 2 random drives, then picks the least used one.
This small change has outsized impact in organizing and spreading out data.
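A quick simulation shows why peeking at two drives beats one (toy scale: 1,000 drives, 100,000 objects; the numbers are mine, the technique is the classic "power of two random choices"):

```python
import random

random.seed(7)  # reproducible toy run

DRIVES, OBJECTS = 1_000, 100_000

def place_random():
    """Put each object on one uniformly random drive."""
    loads = [0] * DRIVES
    for _ in range(OBJECTS):
        loads[random.randrange(DRIVES)] += 1
    return loads

def place_two_choices():
    """Peek at two random drives, then use the emptier one."""
    loads = [0] * DRIVES
    for _ in range(OBJECTS):
        a, b = random.randrange(DRIVES), random.randrange(DRIVES)
        loads[a if loads[a] <= loads[b] else b] += 1
    return loads

print("fullest drive, one choice: ", max(place_random()))
print("fullest drive, two choices:", max(place_two_choices()))
```

With one choice the fullest drive overshoots the average noticeably; with two choices the worst drive sits barely above it, so no single disk becomes a hotspot.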
🧵 3/19 Basically S3 operates by spreading out simple GET and PUT HTTP requests across many servers and stores sharded data on insanely cheap--and slow--hard disks.
Since S3 leverages massive parallelism, customers hardly notice any lag. Some customers have data stored on over a million hard drives!
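Rough arithmetic on that parallelism, using two figures from this thread (~120 IOPS per HDD, a customer spread across over a million drives):

```python
# Each disk is slow, but a million slow disks in parallel are not.
iops_per_drive = 120
drives = 1_000_000

aggregate_iops = iops_per_drive * drives
print(f"{aggregate_iops:,} aggregate IOPS")  # 120,000,000
```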
Netflix <3 S3 https://www.cloudzero.com/blog/aws-biggest-customers/
🧵 2/19 Amazon's Simple Storage Service (S3) came onto the scene in 2006 as a backup utility and place to keep media.
It has grown and evolved a lot in the past two decades!
Its biggest customer today, Netflix, wasn't even streaming video in 2006!
Still, S3's core concepts remain unchanged.
The Professor says "Let's crack in" to learning about S3!
🧵 1/19 While prepping Brussels sprouts for Thanksgiving I took the time to dive deeper into S3's architecture, and it's pretty sweet and genius.
Thread incoming.
No prob! Think I identified the form issue here:
github.com/overcommitte...
Buffer says it's up
Bluesky says it's not
Uh oh, I guess trying to post a 19-part thread via Buffer was not a good idea. Who knew?
Well, at the very least I learned a bunch about S3 the other day. Here's hoping I can retrieve it 🤞
Just started digging into some re:Invent videos on YouTube and it's nice to be able to learn so much about AWS's infra + design philosophies!
Looking forward to new material
Might be out of date, but there's this:
github.com/mainmatter/s...
Kids coding Mission 12
LEGO robotics!
Yesterday we made some good progress as we enter the final push before our first comp.
The kids tweaked the robot design ever so slightly and increased the consistency of one of their combo moves.
I thought they should move on, but they were just so happy to see it repeat itself for 15 minutes.
I think the slop part is key--it's something that underdelivers no matter the context and is probably gratuitous or erroneous to an extent.
Bait and switch
https://www.x402.org/x402.pdf
Reading up on x402 and I feel lied to with this 'one-pager' being 2 pages
Guess they meant front and back.
In Zen, Firefox-based
In Arc, Chrome-based
@overcommitted.dev Was trying to submit some feedback on links in show pages but it appears the contact form isn't working?
Tried on multiple browsers with ad blockers off and it always 404/405s.
🧵 11/11 This is a thorny situation.
I'm not anti-efficiency or economies of scale. I'm not against complicated clean energy on the Columbia if it's already in place. I'm not saying Big Data shouldn't operate.
What I am saying is that this is absurd and took way too long to start getting sorted.