AntoinePrv (@antoineprv)

Very nice post! I've long been thinking we should get models and datasets from package managers.
On the conda side, I've been wanting to do a CEP or server to automatically package arbitrary artifacts while avoiding duplicating the storage.

27.02.2026 16:39 👍 1 🔁 0 💬 0 📌 0

GH-48277: [C++][Parquet] unpack with shuffle algorithm by AntoinePrv · Pull Request #47994 · apache/arrow Rationale for this change The current bit-unpacking algorithm (which is implemented as a C++ code generator script in Python) does not fully leverage SIMD operations: all loads and some bitshifts u...

At @quantstack.bsky.social we designed novel bit-unpacking SIMD optimizations for @arrow.apache.org and #ApacheParquet, and implemented them entirely using C++ metaprogramming instead of Python-based code generation.

We'll publish a deep dive blog post soon.

github.com/apache/arrow...

26.02.2026 17:34 👍 9 🔁 6 💬 1 📌 0

Everybody is talking about secure Python sandboxes for LLM code execution, but what about using the browser sandbox?

Quick demo using JupyterLite and the Pyodide kernel 💡

11.02.2026 09:58 👍 6 🔁 2 💬 0 📌 0

A viewer for Parquet, SQLite, and Avro files in JupyterLab.
Check out our new JupyterLab extension: Arbalister. 🏹
Built upon Apache DataFusion, @jupyter.org, and @arrow.apache.org, it lazily fetches rows so that you can view files larger than memory!

blog.jupyter.org/instantly-vi...

29.01.2026 16:38 👍 15 🔁 7 💬 0 📌 1

Notebook.link

We are thrilled to introduce notebook.link, a platform that lets you create, share, and run Jupyter notebooks instantly in your browser.

Powered by JupyterLite and WebAssembly, it supports Python, R, C++, and a full in-browser terminal experience.

📖 Read the full story: medium.com/@QuantStack/...

22.01.2026 16:56 👍 24 🔁 18 💬 0 📌 3

SIMD coding is hard: platforms, inconsistencies, lane constraints... but xsimd abstracts a lot away. With my first contributions, I improved byte shuffling, now available in the latest 14.0 release.
#C++ #SIMD #xsimd #openSource

02.12.2025 21:01 👍 6 🔁 2 💬 0 📌 0

I'm getting some of it as well. You can mark *all* notifications as read with
`gh api -X PUT notifications -F "last_read_at=$(date -u +'%Y-%m-%dT%H:%M:%SZ')"`

24.09.2025 08:24 👍 0 🔁 0 💬 1 📌 0

Apache Arrow Summit, Thu, Oct 2, 2025, 9:30 AM | Meetup The day after the PyData Paris conference, we’re excited to host the first-ever Apache Arrow Summit - a gathering dedicated to fostering collaboration and innovation within

Register for the Apache Arrow Summit Paris 25 (October 2nd) at: www.meetup.com/pydata-paris...
The event is hosted by @pydataparis.bsky.social
We are looking forward to seeing you there and talking about all things Arrow.

28.08.2025 07:50 👍 12 🔁 6 💬 0 📌 2

I'll be attending the event, looking forward to meeting Python folks!

28.08.2025 13:43 👍 2 🔁 0 💬 0 📌 0

As for constraints, you can model all decision variables as binaries and express conjunctive normal form as linear constraints.

In both cases (SAT and ILP), I think you'd need to write dedicated heuristics to get something reasonable for a package manager, so it boils down to which codebase is the most adaptable.
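
As a rough illustration of the CNF-to-linear-constraints idea, here is a minimal sketch. It assumes a clause is given as a list of (variable, is_positive) pairs; that representation, the function names, and the package names `a`, `b1`, `b2` are all hypothetical and not taken from any solver. A positive literal x contributes x, a negated literal contributes (1 - x), and the clause holds when the sum is at least 1.

```python
from itertools import product

# Hypothetical representation: a literal is (variable, is_positive),
# and a clause is a list of literals joined by OR.
def clause_to_linear(clause):
    """Encode one CNF clause as sum(coeffs[v] * x[v]) >= rhs over 0/1 variables."""
    coeffs, rhs = {}, 1
    for var, is_positive in clause:
        if is_positive:
            coeffs[var] = coeffs.get(var, 0) + 1  # literal x contributes x
        else:
            coeffs[var] = coeffs.get(var, 0) - 1  # literal (not x) contributes 1 - x
            rhs -= 1                              # move the constant 1 to the right-hand side
    return coeffs, rhs

def clause_holds(clause, assignment):
    """Evaluate the clause directly as a boolean OR of its literals."""
    return any(assignment[var] == is_positive for var, is_positive in clause)

def linear_holds(coeffs, rhs, assignment):
    """Evaluate the linear inequality with True/False mapped to 1/0."""
    return sum(c * int(assignment[var]) for var, c in coeffs.items()) >= rhs

# "If a is installed, then b1 or b2 must be installed": (not a) OR b1 OR b2
clause = [("a", False), ("b1", True), ("b2", True)]
coeffs, rhs = clause_to_linear(clause)  # encodes -a + b1 + b2 >= 0

# Brute-force check that both formulations agree on every assignment.
for values in product([False, True], repeat=3):
    assignment = dict(zip(["a", "b1", "b2"], values))
    assert clause_holds(clause, assignment) == linear_holds(coeffs, rhs, assignment)
```

One inequality per clause is enough here; an ILP objective would then be layered on top of these constraints.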

02.12.2024 11:38 👍 0 🔁 0 💬 1 📌 0

What would be your optimization objective? Probably some heuristic of what it means to be "up to date", with open questions such as "Is it better to have an indirect dependency very outdated rather than a direct one slightly outdated?", but you'd also need to define "very" and "slightly".

02.12.2024 11:38 👍 0 🔁 0 💬 1 📌 0
