Woah what?!
I just started Google AI Pro with Antigravity for $20 a month and it includes access to Gemini and Claude models. I've hit the limits on Claude a few times but not bad.
Markdown for sure. You can read it as plain text or render it, it is widely accepted across the development community, and AI speaks it very well.
Agreed, hybrid is the worst.
Timely link because I'm currently struggling with this. The article is interesting and not at all what I expected. I've always heard Silver is supposed to be the integration layer using 3NF, not star schema. So many more questions! www.databricks.com/glossary/med...
I just want everyone to know that if we are connected on here, I'm rooting for you. For your career, your creative endeavor, your happiness, whatever you are chasing. I hope it happens for you!
While AI companies are allowed to slurp up everything they want, Quad9 warns that legal fees are drowning DNS resolvers, which are now being targeted by copyright owners to enforce blocks on piracy sites.
quad9.net/news/blog/wh...
Perhaps using unnest?
select unnest(value, recursive := true) from read_json('~/Data/example.json')
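For anyone curious what `recursive := true` actually does to nested JSON, here's a pure-Python sketch of the flattening idea (illustrative only; the field names and sample record are made up, and real DuckDB also unnests lists into rows):

```python
import json

def unnest_recursive(obj, prefix=""):
    """Flatten nested dicts into dotted keys, roughly what DuckDB's
    unnest(value, recursive := true) does to a struct column."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested structs, carrying the parent key as a prefix
            flat.update(unnest_recursive(value, name))
        else:
            flat[name] = value
    return flat

record = json.loads('{"value": {"id": 1, "meta": {"source": "api"}}}')
print(unnest_recursive(record["value"]))
# {'id': 1, 'meta.source': 'api'}
```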
Any meetup.com organizers that have successfully moved their community to another platform? I would be interested in hearing your experience
My blogging motivation has declined a lot over the years... then the endless Hugo breaking changes pretty much killed it for good. Do you know how much work it would be to convert a Hugo blog over to Zola?
Conspiracy theory: Databricks is deliberately making your clusters start up super slowly so that you want to pay more to use serverless. #dataBS
I've been working on visualizing JOINs for some beginner SQL workshops. Here is LEFT JOIN. Thoughts? #databs youtu.be/ZSxtZAulogo?...
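For workshop folks who want to run the LEFT JOIN themselves, here's a tiny self-contained demo with stdlib sqlite3 (toy tables and names, not from the video): every row from the left table survives, padded with NULLs where the right table has no match.

```python
import sqlite3

# In-memory database with two toy tables: customers (left) and orders (right).
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE customers (id INTEGER, name TEXT);
    CREATE TABLE orders (customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO orders VALUES (1, 9.99), (1, 5.00), (3, 12.50);
""")

# Ben has no orders, so his row comes back with NULL (None) for total.
rows = con.execute("""
    SELECT c.name, o.total
    FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id
    ORDER BY c.id, o.total
""").fetchall()
print(rows)
# [('Ana', 5.0), ('Ana', 9.99), ('Ben', None), ('Cy', 12.5)]
```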
Moving from Azure Data Studio to VSCode but ugh... the VSCode SQL Server extensions are so frustrating to use.
I met my wife on xanga! β₯οΈ
A couple of big announcements from @cloudflare.social today for folks in #dataBS:
* Acquisition of Arroyo, launch of Pipelines for streaming ingestion: blog.cloudflare.com/cloudflare-a...
* Launch of R2 Data Catalog, a managed Apache Iceberg catalog for R2: blog.cloudflare.com/r2-data-cata...
Chispa has good diffs for PySpark dataframes
github.com/MrPowers/chi...
Databricks recently changed the default notebook format from "source" (.py, .sql, .scala) to IPYNB which seems to indicate they will be getting rid of the source format. IMO, the ipynb format brings a few issues like difficult diffs and the potential to leak data learn.microsoft.com/en-us/azure/...
I had an HDMI KVM but it was still annoying to switch back and forth. Plus I wanted to use the full resolution at 144Hz on my gaming PC. Now I just have a big desk with separate keyboards/mice/monitors.
Yes, I'm working on this right now and talking about how we can potentially "upgrade" some of the dimensions without breaking everything. π
1988: Data Warehouse architecture is introduced
1994: Data Warehouse is too difficult to build -> Data Marts
2008: Data Warehouse is too small for big data -> Hadoop
2010: Data Warehouse is too structured -> Data Lake
2012: Data Warehouse is too difficult to scale -> Cloud Data Warehouse
2016: Data Warehouse is too difficult to manage -> Data Fabric
2019: Data Warehouse is too centralized -> Data Mesh
2020: Data Warehouse is too limiting -> Data Lakehouse
2023: Data Warehouse is too monolithic -> Composable Data Platform
I found this while looking through some scratch notes. I no longer recall what the context was, but it's an interesting thought on the evolution of the data warehouse. (Though there is an equivocation embedded in this history.) #databs
Thanks for the input and I agree... Right now I'm battling a mono-repo used by a big team with limited git knowledge and no tooling. Choosing a tool like dbt/flyway/liquibase could help force some standardization.
Do you ever feel like it is difficult to keep them in sync with what is deployed to the database? Or with many people working in the same repo?
Is keeping the table definition valuable for only certain databases? For example in Databricks you can easily get the definition and there aren't any indexes to store. Compared to SQL Server (or similar) where it is difficult to figure out what was deployed.
You did this only for certain breaking changes right? For example - meaning of the data in a column changed or columns removed. How did you maintain two separate versions of the schema?
Do you version your data assets? Or is there only the current version of a database table? What about the table definition? #dataBS
An agile ceremony / rite of passage nobody mentions: arguing about story points and what they mean.
1. Impact. How much revenue does my work protect or generate?
2. Quality. Does my work meet or exceed customer expectations?
3. Efficiency. Reward making the right buy versus build decision.
4. Reusability. How do others leverage my work?
5. Supportability. How much work do I create for others?
I'm out walking and had some thoughts about data and fun stuff and mental health that I wanted to share.
#dataBS
Gahhh it's time! @data-dragoness.bsky.social devUp call for speakers! Let's take over with the #PowerPlatform and #MicrosoftFabric topics!
For anyone who's in the Midwest, let's do this!
sessionize.com/dev-up-2025
In my experience I'm seeing companies using Spark switch from Scala to Python for two reasons: Python has an easier learning curve and Scala devs are much harder to find.