Burak

@buremba

Data Engineer - Cooking https://github.com/buremba/universql

89 Followers
159 Following
87 Posts
Joined 27.11.2024

Latest posts by Burak @buremba

Introducing DuckLake (YouTube video by DuckDB)

Soon to be, it looks like: youtu.be/zeonmOO9jm4?...
Otherwise, there is no point in using Parquet instead of DuckDB’s native format. I’m glad they didn’t ignore the “industry standards”.

27.05.2025 15:31 👍 1 🔁 0 💬 0 📌 0

Is there any plan to support compacting data into the data lake when data inlining is used?

27.05.2025 15:30 👍 0 🔁 0 💬 1 📌 0

I was worried about Iceberg being ignored in favor of DuckLake, but it looks like you fixed Iceberg’s biggest problems and still kept the compatibility. Super exciting!

27.05.2025 15:20 👍 3 🔁 0 💬 1 📌 0

Turns out the implementation wasn’t based on the WAL; instead, they had a new Iceberg-compatible data lake extension. I like the direction they are going!

27.05.2025 15:15 👍 0 🔁 0 💬 1 📌 0
Implement WALReader by adsharma · Pull Request #17247 · duckdb/duckdb
This could be useful to external replication tools to read WAL records similar to how wal2json (Postgres) and binlog (MySQL) work. Translation to externally consumable format is not included.

I have this one, but they might have a soon-to-be-public extension that uses the WAL to keep the data in sync with the data lake: github.com/duckdb/duckd...

21.05.2025 15:42 👍 1 🔁 0 💬 1 📌 0

Is @duckdb.org cooking a native data lake integration with streaming support through the WAL? That could enable DuckDB to have a multi-user mode.

21.05.2025 01:36 👍 3 🔁 0 💬 1 📌 0

After not using Facebook for years, I wanted to try out Marketplace. Apparently you can send messages to people on the website, but you can only see the messages sent to you in their Messenger app. I guess this is their definition of “connecting people”.

11.05.2025 16:51 👍 0 🔁 0 💬 0 📌 0

That’s a good analogy, might steal it. :) However, when the destination path is not clear (which is usually the case, as you need to experiment and iterate anyway), smashing can help accelerate finding the destination as you learn where not to go.

05.05.2025 16:14 👍 0 🔁 0 💬 0 📌 0

Ironically, the number of stale documents in our company has increased dramatically thanks to LLMs.

26.04.2025 16:21 👍 1 🔁 0 💬 0 📌 0

Oh, I lost count of how much time I’ve wasted trying to infer the column names from random CSV files without a header. This is very handy!

17.03.2025 18:49 👍 2 🔁 0 💬 0 📌 0

Just found out that Databricks hired Snowflake’s Polaris (Iceberg) lead PM. It’s crazy how aggressive these guys are with the competition!

22.02.2025 23:15 👍 4 🔁 0 💬 0 📌 0

Great to see Amazon implementing an Iceberg REST Catalog layer for Glue! It enables read/write support on S3 Tables from any Iceberg client, so now everybody has a free Iceberg catalog via AWS Glue. aws.amazon.com/blogs/storag...

18.02.2025 00:45 👍 1 🔁 0 💬 0 📌 0

Exactly! I think Flight will get more popular over time as it's the most efficient implementation, but this approach can help existing RESTful apps adopt SQL integrations before switching over to gRPC.

15.02.2025 18:54 👍 0 🔁 0 💬 1 📌 0
GitHub - PostgREST/postgrest: REST API for any Postgres database

The main inspirations are github.com/PostgREST/po... and @qxip.bsky.social's DuckDB webmacro extension: duckdb.org/community_ex...

15.02.2025 18:41 👍 2 🔁 1 💬 0 📌 0

Released an experimental @fastapi.tiangolo.com integration with @duckdb.org today, which enables REST APIs to have bidirectional read/write support in SQL. github.com/buremba/duck...

15.02.2025 18:37 👍 9 🔁 1 💬 2 📌 0

Pretty common, but if one of these languages is the “main” one, it might be more desirable to generate JSONSchema from Pydantic/TS and generate the models for the other languages from JSONSchema. It’s more about where you want the source of truth to be.

10.02.2025 20:52 👍 2 🔁 0 💬 1 📌 0

I had the exact same thought..

06.02.2025 18:46 👍 2 🔁 0 💬 0 📌 0

"think twice before you speak."

05.02.2025 13:37 👍 0 🔁 0 💬 0 📌 0

Thanks. I'm also a fan of your creative extensions! Quackpipe was one of the inspirations. :)

29.01.2025 02:04 👍 1 🔁 0 💬 0 📌 0
DuckCon #6 in Amsterdam DuckDB is an in-process SQL database management system focused on analytical query processing. It is designed to be easy to install and easy to use. DuckDB has no external dependencies. DuckDB has bin...

Today I had to explain to my partner what @duckdb.org is because “I will fly to Amsterdam for a day to meet ducks” didn’t make any sense to her. Excited to meet the contributors! duckdb.org/events/2025/...

28.01.2025 18:31 👍 13 🔁 0 💬 1 📌 0

One here!

28.01.2025 18:17 👍 1 🔁 0 💬 0 📌 0

It's interesting to see many seed-stage, well-funded startups trying to "rewrite X in Rust" as a business model.

WarpStream, ScyllaDB, and Redpanda are successful because they're either 10x more efficient or make maintenance much easier than their alternatives, not because they're written in C++.

24.01.2025 13:50 👍 6 🔁 0 💬 1 📌 1
update_table_metadata_location - Boto3 1.35.99 documentation

I couldn't figure out how to insert data into an S3 Table without Spark. I tried to use the API, but it requires me to create the files and update the metadata myself. PyIceberg can't write to S3 Tables through its S3 integration yet, so I had to stick to Spark. boto3.amazonaws.com/v1/documenta...

14.01.2025 22:41 👍 2 🔁 0 💬 2 📌 0

If AWS is serious about S3 Tables, they should support Iceberg REST Catalog in it. Right now we can only create tables with Spark.

14.01.2025 20:27 👍 2 🔁 0 💬 1 📌 0

Qlik's Upsolver acquisition shows the importance of adopting new technologies as a potential acquisition target for bigger companies. It's a 10-year-old company, and they raised a ton, so I'm not sure how good the deal was for the co-founders.

14.01.2025 17:43 👍 0 🔁 0 💬 0 📌 0

dbt acquiring SDF Labs shows how important it is to have a good relationship with your competitors. SQLMesh might be more ambitious, but I'm sure it was a good exit for SDF founders in only 2 years!

14.01.2025 17:36 👍 1 🔁 0 💬 0 📌 0

It's a good day to be acquired in the data space.

14.01.2025 16:09 👍 1 🔁 0 💬 2 📌 0

For the record, I checked whether MotherDuck notebooks have it, but that doesn’t seem to be the case, at least not yet.

03.01.2025 22:32 👍 0 🔁 0 💬 1 📌 0

Looks great! I would love to try it out. Where is this going to be available?

03.01.2025 22:31 👍 0 🔁 0 💬 1 📌 0

People say LLMs are killing low-code platforms such as Retool and Bubble, but they seem to hire more people + raise even more funding. Maybe they're better positioned to leverage LLMs.
AI tools like bolt.new and v0.dev work best with the Next + shadcn combination after all, so I wouldn't be surprised.

03.01.2025 17:09 👍 0 🔁 0 💬 0 📌 0