Erald David

@eraldds

Analytics @BukuWarung. Avid reader. Random book quotes, one at a time.

17 Followers
43 Following
27 Posts
Joined 22.11.2024

Latest posts by Erald David @eraldds

Flipping lower order bit

10.01.2025 15:56 👍 0 🔁 0 💬 0 📌 0

Attended a seminar last year where most of the small talk we had was "You have a Platform A certification? You need to learn those".

It's so over for us folks

08.01.2025 17:21 👍 0 🔁 0 💬 0 📌 0

It would have saved me so much time if, during my first year in analytics, someone had said "Look kid, both the tools and the business should be your focus". I was lucky that I came from a business major, so I could focus only on the tools. But man.

08.01.2025 17:21 👍 0 🔁 0 💬 1 📌 0

I love how every article about analytics keeps screaming "Raise your technical skills. Learn platform!" but the OG book from Kimball says (in the first 20 pages of the book): "Yeah no, you're a hybrid DBA and MBA"

08.01.2025 17:17 👍 0 🔁 0 💬 1 📌 0

Ohh, love data modeling. Could you share any link for this?

05.01.2025 13:08 👍 0 🔁 0 💬 1 📌 0

Hannes mentioned this in Joe Reis's podcast and I still don't understand..

“And it's, by the way, it is, again, I won't name names, but it is completely common for analytical data management systems to ignore things like primary key constraints.”

Would love to have more elaboration on this

18.12.2024 15:51 👍 2 🔁 0 💬 1 📌 0


But go spend a couple of hours searching for this problem in any data publication and you will find virtually nothing

18.12.2024 04:10 👍 0 🔁 0 💬 0 📌 0

However, all these manual processes still need to be reported (as clean data) and integrated into existing systems (as structured data) by someone or a dedicated team. Quite often, this falls to the Data team.

18.12.2024 04:10 👍 0 🔁 0 💬 1 📌 0

Unexplored Topic: Most startups (at least in Indonesia) currently take a Business-Market Fit-first approach, keeping all processes manual until they generate revenue, before committing precious resources (like engineering bandwidth) to properly productize their solutions.

18.12.2024 04:10 👍 0 🔁 0 💬 1 📌 0

S3 Is the New SFTP: Customers want their data. A customer data lake is a great way to give it to them.

Last post of the year! S3, Parquet, Iceberg, and @duckdb.org are a great way to get customers their data.

16.12.2024 16:10 👍 33 🔁 2 💬 7 📌 4

I didn't notice any comment praising the book that you just picked, so I will be the first person to say it: Data and Reality is great. Perfect book!

I went from "Huh, quite weird to choose an old book for this" to "God, if I had read this book early in my career, I could have saved so much time"

17.12.2024 09:19 👍 1 🔁 0 💬 1 📌 0

Book Club is happening - find all the details here: jennajordan.me/book-club/

#databs #datasky

15.12.2024 01:05 👍 34 🔁 6 💬 6 📌 2

The first book would have to be Data & Reality by Bill Kent (2nd ed of course).

After that there are many potential options but I’ve wanted to have a book club for Data & Reality for so long that it has to be the first one. We could call it the “Data Philosophizing” book club or something.

05.11.2024 17:59 👍 45 🔁 2 💬 12 📌 2

#25 - BigQuery + Snowflake + Redshift + Databricks + DuckDB (CMU Intro to Database Systems): YouTube video by CMU Database Group

For my last class this semester, I tried to cram our Advanced Database course into one lecture. We cover the following database systems in 60min: youtu.be/fr5lIchF6pw
• Google Dremel / BigQuery
• Snowflake
• Amazon Redshift
• Yellowbrick
• Databricks Photon
• @duckdb.org
• TabDB

05.12.2024 22:39 👍 121 🔁 18 💬 6 📌 0

Fantastic opportunity for ppl in #mlsky #databs #stats etc to write their own perspectives on how machine learning and engineering have changed over the course of your careers (see original link in the thread), especially last 5 years. Would love to read all of these posts!

06.12.2024 13:55 👍 29 🔁 3 💬 4 📌 0

ALT: a cartoon character sitting on a couch with a make gifs at gifsoup.com logo

"Who tf starts a conversation like this? I just sat down"

(yeah I know, wrong picture, but the GIF doesn't exist)

01.12.2024 12:58 👍 0 🔁 0 💬 0 📌 0

I just completed "Historian Hysteria" - Day 1 - Advent of Code 2024 #AdventOfCode adventofcode.com/2024/day/1

01.12.2024 11:01 👍 0 🔁 0 💬 0 📌 0

I'm filing this case under the "I know the concept already but am still surprised by the instantiation" folder

No wonder Einstein loved compounding

29.11.2024 20:13 👍 0 🔁 0 💬 0 📌 0

This also hammers home the point of how quickly things can change when you can "seize" a small win and compound it.

"Yeah you know that city where grandpa used to go to check the price of corn? Pack up your suit man, we gonna bet against Argentina's Peso"

29.11.2024 20:11 👍 0 🔁 0 💬 1 📌 0

It's crazy to think that Chicago became the center of modern finance because of its geographic advantage and... cows?

What analogy would fit this case?

Off the top of my head: it's like learning that your local IKEA will become a 3-Michelin-star restaurant in the future... because they can cook meatballs

29.11.2024 20:08 👍 1 🔁 0 💬 1 📌 0

FYI, here's the entire code to create a dataset of every single bsky message in real time:

```
from atproto import *
def f(m): print(m.header, parse_subscribe_repos_message(m))
FirehoseSubscribeReposClient().start(f)
```

28.11.2024 09:56 👍 441 🔁 62 💬 19 📌 10

Woahh idk about this. Very useful

29.11.2024 14:14 👍 0 🔁 0 💬 1 📌 0
A high-level summary diagram taken from the slides linked below. It shows the interplay of two main components: a probabilistic model and decision maker or planner.

Probabilistic predictions of an underfitting polynomial classifier on a noisy XOR task and the corresponding under-confident calibration curve.

Probabilistic predictions of an overfitting polynomial classifier and the resulting overconfident calibration curve on the same noisy XOR problem.

Simulation study to show the relative lack of stability of hyperparameter tuning when using hard metrics such as Accuracy or soft yet not probabilistic metrics such as ROC AUC compared to a strictly proper scoring rule such as the log-loss.

I recently shared some of my reflections on how to use probabilistic classifiers for optimal decision-making under uncertainty at @pydataparis.bsky.social 2024.

Here is the recording of the presentation:

www.youtube.com/watch?v=-gYn...

27.11.2024 14:17 👍 49 🔁 19 💬 1 📌 1
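
The contrast between hard metrics and proper scoring rules mentioned in the simulation-study caption can be illustrated with a tiny sketch. This is not taken from the talk: the labels and the two sets of predicted probabilities below are made up, and `accuracy` and `log_loss` are hand-written helper functions rather than a library API. The idea is only that two models can tie on accuracy while the log-loss still tells them apart.

```python
import math

def accuracy(y_true, p_pred, threshold=0.5):
    """Fraction of correct hard predictions after thresholding the probabilities."""
    hits = sum((p >= threshold) == bool(y) for y, p in zip(y_true, p_pred))
    return hits / len(y_true)

def log_loss(y_true, p_pred, eps=1e-15):
    """Mean negative log-likelihood of the true labels (a strictly proper scoring rule)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= math.log(p) if y == 1 else math.log(1 - p)
    return total / len(y_true)

# Hypothetical data: both models rank and threshold every example correctly.
y_true = [1, 1, 0, 0]
confident = [0.9, 0.8, 0.1, 0.2]    # assumed model A: probabilities far from 0.5
hesitant = [0.6, 0.55, 0.45, 0.4]   # assumed model B: probabilities close to 0.5

print("accuracy:", accuracy(y_true, confident), accuracy(y_true, hesitant))
print("log-loss:", log_loss(y_true, confident), log_loss(y_true, hesitant))
```

Both models get the same accuracy here, so accuracy alone cannot guide a choice between them, while the log-loss is lower for the model whose probabilities sit closer to the true labels.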
The rise and fall of peer review: Why the greatest scientific experiment in history failed, and why that's a great thing

Interesting critique of the problems with peer review. Unfortunately it fails to offer a proposal for what might be better. I'm reminded of the old adage “democracy is the worst form of government, except for all the others.” Likewise for peer review. www.experimental-history.com/p/the-rise-a...

26.11.2024 11:11 👍 107 🔁 16 💬 9 📌 2

Finished reading the book, last two chapters on the philosophical aspects of technology worship are great, and the material provides vivid conceptual clarity on bubble dynamics. However, overall, the book could have been much shorter.

26.11.2024 17:36 👍 2 🔁 1 💬 0 📌 0

Let's compare notes!

Let's flood Bluesky with optimism and vision

27.11.2024 05:03 👍 1 🔁 0 💬 1 📌 0

I'm glad @hf.co is doing this. It brings down the barriers to allow more people to benefit from AI, rather than keeping it exclusively in the realm of deep pocketed giant companies.

AI can help open the gates, to allow regular people to do things they couldn't do before. (Which can be threatening!)

27.11.2024 00:53 👍 168 🔁 13 💬 9 📌 3

"Process-first" vs "Data-first" people is a really useful dichotomy.

This also explains numerous conversations I've had with one of my SE friends. I kept thinking "How could you miss this important data point? This doesn't make sense"... until he answered "Yes, but what process generates that data?"

25.11.2024 17:15 👍 1 🔁 0 💬 0 📌 0

For my first 3 days on Bluesky, I kept saying to myself "Oh wow, I didn't know this person was already here"

This is the way

25.11.2024 04:25 👍 2 🔁 0 💬 0 📌 0

"The other kind of bubble is a filter bubble.

Participants in filter bubbles wall themselves off from opinions they disagree with and become increasingly convinced that their viewpoints reflect the one true way to understand the world."

24.11.2024 08:25 👍 3 🔁 0 💬 0 📌 0