David Jayatillake (@jayatillake)

How big should your data team be? Founders and CEOs are wondering if their data function is bloated and if they should replace everyone with AI agents. Data Leaders are scrambling to defend why they need a 15-person data team in a 200...

How big should your data team be?

Data teams are often oversized. A company of 200 people rarely needs 15+ data staff; usually 5% of org size is enough.

dataactionmentor.com/knowledge-ba...

28.09.2025 06:42 👍 2 🔁 1 💬 0 📌 0

Try to find a non-traditional role that is better suited to a future where engineering is very cheap. If you have an idea, try to build it yourself. The experience of trying to found a company is more valuable than employee experience now, and will be even more so in the coming years.

28.09.2025 08:53 👍 2 🔁 0 💬 0 📌 0

The amount you love someone is proportional to how often you Ghiblify their pictures.

29.07.2025 16:59 👍 1 🔁 0 💬 0 📌 0

This week I look at agents.

I think this is a new way to build where we don’t intentionally build code-based software.

open.substack.com/pub/davidsj/...

01.07.2025 16:34 👍 1 🔁 0 💬 0 📌 0

China's biggest public AI drop since DeepSeek, Baidu's open source Ernie, is about to hit the market. Chinese internet search giant Baidu will open source its Ernie gen AI large language model as soon as this week, with uncertain consequences for the market.

BERT and ERNIE! 😂

tracking.tldrnewsletter.com/CL0/https:%2...

30.06.2025 20:37 👍 2 🔁 0 💬 0 📌 0

I don't usually share photos of my family on social media for good reason, but I'm happy to share these ones!

20.06.2025 16:00 👍 5 🔁 0 💬 0 📌 0

My AI Skeptic Friends Are All Nuts: My smartest friends have bananas arguments about LLM coding.

This post encapsulates how I feel about the current state of LLMs and doomers etc. Really great read:

fly.io/blog/youre-a...

06.06.2025 17:07 👍 0 🔁 0 💬 0 📌 0

So when I've attended Snowflake Summit before, I've usually written a blog post talking about the new features released, etc. Is someone going to do that this year, given I didn't go? 😊

#datasky #databs

06.06.2025 08:24 👍 1 🔁 0 💬 0 📌 0

It is possible to build machine learning systems which punch up instead of punching down.

06.06.2025 01:52 👍 691 🔁 128 💬 9 📌 3

Got a cool story about something in the data engineering space? You should 💯 submit it as a talk to Current 2025 in New Orleans 😁

Do it! Now! CfP is open until 15th June.

sessionize.com/current-2025...

(Pro-tip: you only need an abstract at this point; writing the talk can be later 😅)

#dataBS

05.06.2025 08:59 👍 4 🔁 1 💬 1 📌 1

This is genuinely one thing you can rely on AI for.

23.05.2025 17:16 👍 2 🔁 0 💬 1 📌 0

It was actually very impressive. Lots of stuff I want to try.

21.05.2025 20:11 👍 1 🔁 0 💬 0 📌 0

At the London Data Practitioners Meetup with @pedramnavid.com @jayatillake.bsky.social @rittmananalytics.bsky.social and the London Dagster community

14.05.2025 17:15 👍 2 🔁 1 💬 0 📌 0

I also think people don’t use the tags because we have already found each other. I almost exclusively use the popular with friends feed.

14.05.2025 07:06 👍 2 🔁 0 💬 0 📌 0

It’s not, but you don’t have to keep declaring CTEs. You may be able to have partial queries too.

14.05.2025 06:54 👍 3 🔁 0 💬 0 📌 0

They’re still here, just quieter than at the start. More of them, though.

14.05.2025 06:51 👍 2 🔁 0 💬 1 📌 0

Doctor’s orders 🫡

27.04.2025 12:54 👍 4 🔁 0 💬 1 📌 0

Siri’s new boss is already making big internal changes, per report - 9to5Mac. Siri’s new boss at Apple, Mike Rockwell, has reportedly wasted no time making big changes internally to the people building its assistant.

I still think this is the biggest prize in AI. If Siri could actually do most things you do on a phone manually...

9to5mac.com/2025/04/22/s...

24.04.2025 19:00 👍 1 🔁 0 💬 0 📌 0

Haha yes but he fits the bill.

23.04.2025 07:10 👍 0 🔁 0 💬 0 📌 0

@petefein.bsky.social

22.04.2025 22:41 👍 0 🔁 0 💬 1 📌 0

I wonder what the difference in limits between CSV and Parquet would be under real conditions, where most queries only need a tiny subset of a large dataset. You could probably handle >petabyte datasets on that EC2 machine with well-partitioned Parquet or with Iceberg.

22.04.2025 22:37 👍 3 🔁 0 💬 0 📌 0

Well, if it works, the real engineers can tidy it up or, more likely, do nothing and talk about code standards.

22.04.2025 12:09 👍 1 🔁 0 💬 1 📌 0

Has anyone tried Llama 4 Maverick yet? How big a machine does it need to run locally?

@simonwillison.net

07.04.2025 15:36 👍 0 🔁 0 💬 0 📌 0

Looks like Nintendo became the best at console FPS.

02.04.2025 13:30 👍 0 🔁 0 💬 0 📌 0

Oh no! I’ve been enjoying Bluesky for the data stuff, but I can imagine that it’s swung very radically left on other topics.

01.04.2025 08:41 👍 0 🔁 0 💬 0 📌 0

@windsurfai.bsky.social

24.03.2025 16:07 👍 1 🔁 0 💬 0 📌 0

I've seen many blog posts and social posts by these supposed true artisans saying that they tried this method, and the output was subpar.

Well, maybe it would have taken just as long if you had just written the code, but for the rest of us, we now have an option to build without you.

24.03.2025 16:06 👍 0 🔁 0 💬 2 📌 0

Vibe coder: Free like a puppy

Once again, we've devised a derogatory name for something many of us are doing: "Vibe coding".

Just like "Citizen Data Scientist", "Excel Data Analyst", and many other terms made to belittle by the supposed true artisans that came before.

open.substack.com/pub/davidsj/...

24.03.2025 16:06 👍 0 🔁 0 💬 2 📌 0

yeah but was there coffee down there, and if so was it any good?

17.03.2025 23:22 👍 4 🔁 0 💬 1 📌 0
