Flipping lower order bit
Visited a seminar last year where most of the small talk we had was "You have Platform A certification? You need to learn those".
It's so over for us folks
It would have saved me so much time if, during my first year in analytics, someone had said "Look kid, both the tools and the business should be your focus". I was lucky that I came from a business major, so I could focus only on the tools. But man.
I love how every article about analytics keeps screaming "Raise your technical skills. Learn platform!" but the OG book from Kimball says (in the first 20 pages of the book): "Yeah no, you're a hybrid DBA and MBA"
Ohh, love data modeling. Could you share any link for this?
Hannes mentioned this in Joe Reis's podcast and I still don't understand..
“And it's, by the way, it is, again, I won't name names, but it is completely common for analytical data management systems to ignore things like primary key constraints.”
Would love to have more elaboration on this
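One possible reading of the quote, as a minimal sketch rather than anything Hannes said on the podcast: if an engine accepts a primary key declaration but does not enforce it, duplicate keys can land silently and you have to check for them yourself. SQLite is only a stand-in here. The `enforced` and `unenforced` tables and the `find_duplicate_keys` helper are hypothetical names, and the table without a constraint merely imitates a system that ignores the declaration.

```python
import sqlite3

def find_duplicate_keys(conn, table, key):
    """Return (key_value, count) rows for key values that appear more than once."""
    query = (
        f"SELECT {key}, COUNT(*) FROM {table} "
        f"GROUP BY {key} HAVING COUNT(*) > 1 ORDER BY {key}"
    )
    return conn.execute(query).fetchall()

conn = sqlite3.connect(":memory:")

# Case 1: the key is declared and SQLite enforces it, so the second insert fails.
conn.execute("CREATE TABLE enforced (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO enforced VALUES (1, 'a')")
try:
    conn.execute("INSERT INTO enforced VALUES (1, 'b')")
    enforced_rejected = False
except sqlite3.IntegrityError:
    enforced_rejected = True

# Case 2: no constraint at all (assumption: this imitates an engine that
# ignores the declaration), so the duplicate id is stored without complaint.
conn.execute("CREATE TABLE unenforced (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO unenforced VALUES (?, ?)",
    [(1, "a"), (1, "b"), (2, "c")],
)
duplicates = find_duplicate_keys(conn, "unenforced", "id")

print(enforced_rejected, duplicates)
```

In the second case the uniqueness check becomes a query you run after loading, which is presumably the kind of burden the quote is pointing at.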
But go spend a couple of hours searching for this problem in any data publication and you will find virtually nothing
However, all these manual processes still need to be reported (as clean data) and integrated into existing systems (as structured data) by someone or a dedicated team. Quite often, this falls to the Data team.
Unexplored Topic: Most startups (at least in Indonesia) currently take a Business-Market Fit-first approach, keeping all processes manual until they generate revenue, before committing precious resources (like engineering bandwidth) to properly productize their solutions.
Last post of the year! S3, Parquet, Iceberg, and @duckdb.org are a great way to get customers their data.
I didn't notice any comment praising the book you just picked, so I will be the first person to say it: Data and Reality is great. Perfect book!
I went from "Huh, quite weird to choose an old book for this" to "God, if I had read this book early in my career, I could have saved so much time"
Book Club is happening - find all the details here: jennajordan.me/book-club/
#databs #datasky
The first book would have to be Data & Reality by Bill Kent (2nd ed of course).
After that there are many potential options but I’ve wanted to have a book club for Data & Reality for so long that it has to be the first one. We could call it the “Data Philosophizing” book club or something.
For my last class this semester, I tried to cram our Advanced Database course into one lecture. We cover the following database systems in 60min: youtu.be/fr5lIchF6pw
• Google Dremel / BigQuery
• Snowflake
• Amazon Redshift
• Yellowbrick
• Databricks Photon
• @duckdb.org
• TabDB
Fantastic opportunity for ppl in #mlsky #databs #stats etc to write their own perspectives on how machine learning and engineering have changed over the course of your careers (see original link in the thread), especially last 5 years. Would love to read all of these posts!
"Who tf start conversation like this? I just sat down"
(yeah I know wrong picture but the GIF doesn't exist)
I just completed "Historian Hysteria" - Day 1 - Advent of Code 2024 #AdventOfCode adventofcode.com/2024/day/1
I'm filing this case under the "I know the concept already but am still surprised by the instantiation" folder
No wonder Einstein loved compounding
This also hammers home the point that things can change quickly when you can "seize" a small win and compound it.
"Yeah you know that city where grandpa used to go to check the price of corn? Pack up your suit man, we gonna bet against Argentina's Peso"
It's crazy to think that Chicago became the center of modern finance because of its geographic advantage and .....cows?
What analogy fits this case?
Off the top of my head: it's like learning your local IKEA will become a 3 Michelin star restaurant in the future... because they can cook meatballs
FYI, here's the entire code to create a dataset of every single bsky message in real time:
```
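from atproto import FirehoseSubscribeReposClient, parse_subscribe_repos_message
def f(m):
    # Print each firehose frame's header and its decoded repo event
    print(m.header, parse_subscribe_repos_message(m))
FirehoseSubscribeReposClient().start(f)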
```
Woahh idk about this. Very useful
A high-level summary diagram taken from the slides linked below. It shows the interplay of two main components: a probabilistic model and a decision maker or planner.
Probabilistic predictions of an underfitting polynomial classifier on a noisy XOR task and the corresponding under-confident calibration curve.
Probabilistic predictions of an overfitting polynomial classifier and the resulting overconfident calibration curve on the same noisy XOR problem.
Simulation study to show the relative lack of stability of hyperparameter tuning when using hard metrics such as Accuracy or soft yet not probabilistic metrics such as ROC AUC compared to a strictly proper scoring rule such as the log-loss.
I recently shared some of my reflections on how to use probabilistic classifiers for optimal decision-making under uncertainty at @pydataparis.bsky.social 2024.
Here is the recording of the presentation:
www.youtube.com/watch?v=-gYn...
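The contrast between a hard metric and a strictly proper scoring rule in the image descriptions above can be illustrated with a small sketch. This is not taken from the talk or its simulation study: the labels and the two sets of predicted probabilities are made-up toy data, and `accuracy` and `log_loss` are hand-rolled helpers written only for this example.

```python
import math

def accuracy(y_true, p_pred, threshold=0.5):
    """Fraction of labels matched after thresholding the probabilities."""
    hits = sum((p >= threshold) == bool(y) for y, p in zip(y_true, p_pred))
    return hits / len(y_true)

def log_loss(y_true, p_pred, eps=1e-15):
    """Mean negative log-likelihood of binary labels under the predictions."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# Toy data (assumption): both models make the same 6 correct and 2 wrong
# calls, but one is far more confident about every call than the other.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
moderate = [0.7, 0.7, 0.7, 0.3, 0.3, 0.3, 0.3, 0.7]
overconfident = [0.99, 0.99, 0.99, 0.01, 0.01, 0.01, 0.01, 0.99]

for name, preds in [("moderate", moderate), ("overconfident", overconfident)]:
    print(name, accuracy(y_true, preds), round(log_loss(y_true, preds), 3))
```

Accuracy is 0.75 for both sets of predictions, so it cannot tell them apart, while the log-loss is higher for the overconfident one because its two confident mistakes are penalized heavily.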
Interesting critique of the problems with peer review. Unfortunately it fails to offer a proposal for what might be better. I'm reminded of the old adage “democracy is the worst form of government, except for all the others.” Likewise for peer review. www.experimental-history.com/p/the-rise-a...
Finished reading the book, last two chapters on the philosophical aspects of technology worship are great, and the material provides vivid conceptual clarity on bubble dynamics. However, overall, the book could have been much shorter.
Let's compare notes!
Let's flood Bluesky with optimism and vision
I'm glad @hf.co is doing this. It brings down the barriers to allow more people to benefit from AI, rather than keeping it exclusively in the realm of deep pocketed giant companies.
AI can help open the gates, to allow regular people to do things they couldn't do before. (Which can be threatening!)
"Process-first" vs "Data-first" people is a really useful dichotomy.
This also explains numerous conversations I've had with one of my SE friends. I kept thinking "How could you miss this important data point, this doesn't make sense".. until he answered "Yes, but what process generates that data?"
For my first 3 days on Bluesky, I kept telling myself "Oh wow, I didn't know this person was already here"
This is the way
"The other kind of bubble is a filter bubble.
Participants in filter bubbles wall themselves off from opinions they disagree with and become increasingly convinced that their viewpoints reflect the one true way to understand the world."