As we work through integration details we’ll share more about how this will work. But if you use dbt today, you’ll be able to use this new tech.
While SDF won't be included as part of the Apache 2.0 code base, we plan to make meaningful parts of SDF’s capabilities available to all dbt users—whether you’re using dbt Core or dbt Cloud.
So what does this mean for dbt users? The first goal is to get SDF’s SQL parsing capabilities integrated into dbt.
Local Execution: Instead of having to hit your data platform in development, you can take that logical plan and execute it in a local environment.
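The idea in the post above can be sketched with plain Python: execute a model's SQL against a local in-memory database seeded with sample rows instead of the remote warehouse. This is an illustrative assumption, not SDF's actual engine; table and column names are made up.

```python
import sqlite3

# Local development sketch: an in-memory database stands in for the warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (2, 5.5)])

# A "model" query runs entirely locally, with no network round-trip.
model_sql = "SELECT COUNT(*) AS n, SUM(amount) AS total FROM orders"
print(conn.execute(model_sql).fetchone())  # (2, 15.5)
```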
Lineage: SDF has both the highest-fidelity and highest-performing SQL parsing on the market. And lineage and metadata are, of course, at the heart of the entire data control plane.
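As a toy illustration of column-level lineage, here is a naive sketch that maps the output columns of a simple SELECT back to their source table. A real compiler like SDF resolves this from a full parse of the dialect, not a regex; the query and names here are invented.

```python
import re

def lineage(sql: str) -> dict:
    """Map each selected column to its source table column (toy version)."""
    m = re.search(r"select\s+(.+?)\s+from\s+(\w+)", sql, re.I)
    columns, table = m.group(1), m.group(2)
    return {c.strip(): f"{table}.{c.strip()}" for c in columns.split(",")}

print(lineage("SELECT id, amount FROM orders"))
# {'id': 'orders.id', 'amount': 'orders.amount'}
```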
Because SDF understands your SQL, it can detect errors without connecting to the remote database. Troubleshooting suddenly becomes far faster: errors get caught as you type, not when you do a dbt run.
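A minimal sketch of that idea, assuming you validate queries against a local, empty copy of the schema rather than the warehouse. This uses SQLite's EXPLAIN to compile a statement without executing it; the schema is hypothetical and this is not how SDF itself works.

```python
import sqlite3

# Local, empty mirror of the production schema (illustrative names).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")

def check_query(sql):
    """Return an error message if the query is invalid, else None."""
    try:
        # EXPLAIN compiles the statement without touching real data.
        conn.execute("EXPLAIN " + sql)
        return None
    except sqlite3.OperationalError as e:
        return str(e)

print(check_query("SELECT amount FROM orders"))  # None: query is valid
print(check_query("SELECT amont FROM orders"))   # column typo caught locally
```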
SDF’s ability to understand SQL means that it can power IntelliSense in your IDE of choice. With every keystroke, SDF understands what you are typing and can automatically suggest what comes next, including suggesting table and column names.
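Schema-aware completion like the post above describes can be sketched as prefix matching over a catalog of tables and columns. The catalog and names below are made up for illustration; real IDE completion would also rank by context.

```python
# Hypothetical catalog: table name -> column names.
CATALOG = {
    "orders": ["order_id", "customer_id", "amount"],
    "customers": ["customer_id", "name", "region"],
}

def suggest(prefix: str) -> list:
    """Return catalog identifiers (tables and columns) starting with prefix."""
    names = set(CATALOG) | {c for cols in CATALOG.values() for c in cols}
    return sorted(n for n in names if n.startswith(prefix))

print(suggest("cust"))  # ['customer_id', 'customers']
```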
Developer experience: There are many things that will eventually go into this bucket, but here are two great examples:
SDF parses and compiles dbt projects really, really fast: Because it’s built in Rust, it simply runs faster than Python. As a result, SDF compiles the same dbt project multiple orders of magnitude faster than dbt Core. If you’re working in a large dbt project, this will dramatically impact your productivity.
Benefits for developers: faster compilation, a better developer experience, lineage, and local execution
Integration is easy. SDF has adopted dbt’s syntax, configuration, libraries, and Jinja natively, as part of the SDF runtime. As a result, for most dbt projects there will be no code changes required to take full advantage of SDF’s capabilities!
Unlike dbt historically (which has treated SQL as strings), SDF sees objects and types and syntax and semantics. In the same way that Virtual Machines (VMs) emulate physical hardware, SDF emulates the SQL compilers native to the data platforms you use.
The toolchain is powered by a state-of-the-art development in SQL understanding. SDF represents each SQL dialect (Snowflake, Redshift, BigQuery, etc.) as a complete ANTLR grammar with definitions for all datatypes, coercion rules, functions, scoping intricacies, and more.
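To make the "datatypes, coercion rules, functions" idea concrete, here is a toy type checker: a dialect modeled as function signatures plus implicit coercion rules, so a call can be validated statically. These rules are invented for illustration and are not SDF's actual dialect definitions.

```python
# Toy dialect model: function name -> expected argument types.
SIGNATURES = {"concat": ("string", "string"), "round": ("number", "number")}
# Allowed implicit coercions: (actual, expected).
COERCIONS = {("number", "string")}  # a number may coerce to a string

def check_call(func, arg_types):
    """True if the call type-checks under the signatures and coercion rules."""
    expected = SIGNATURES[func]
    if len(expected) != len(arg_types):
        return False
    return all(a == e or (a, e) in COERCIONS
               for a, e in zip(arg_types, expected))

print(check_call("concat", ("number", "string")))  # True: number coerces to string
print(check_call("round", ("string", "number")))   # False: string is not a number
```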
It is written in Rust, highly parallelized, and designed for scale.
What is SDF? SDF is a high-performance toolchain for SQL development packaged into one CLI: a multi-dialect SQL compiler, type system, transformation framework, linter, and language server.
Today, we’re announcing that dbt Labs has acquired SDF Labs. This is a monumental day for dbt Labs and the entire dbt Community. Thread on what that means 🧵
www.getdbt.com/blog/dbt-lab...
An excellent overview of the evolution of EDA and data viz: from Tukey to BI tools, Python/R/JS tools, in-process databases e.g. DuckDB, WASM...and the exciting future changes to how data is done. Thanks @jthandy.bsky.social @hamilton.bsky.social
roundup.getdbt.com/p/the-inters...
jealous. i never got this upgrade. i just rely on my 'doesn't need that much sleep' superpower :P
Yep, agree. Unsurprisingly I see a lot of this as an ecosystem problem and think SWE is ahead because of the persistent compounding effects of OSS over the course of multiple decades. I think data people are constantly forced to make this shit choice when they should be able to have both.
sorry, legit not meant as a snipe, i found your post provocative.
why can't we have both?
consistent underlying platform, different development experiences... right?
C developers still fight about vim / emacs, they're still united by language.
i think we often make things in data very hard that should be very easy.
What is next? Do you go into prod also in Duck? or..?
Real question, curious about the workflow you're cooking on.
Don’t know what statistical process control is? This podcast where @jthandy.bsky.social interviews @cedricchin.bsky.social will get you started
overcast.fm/+AAw94Uafve8
where do YOU keep your last 7 years of tax returns!?
well now i'm one post in and zero toxicity. good start! lol
Dipping my toe in here. Deeply skeptical of social at this point in my life but at the same time in search of that Twitter golden age. Trying to be open. Hello world.