Screenshot of 3x50 bar chart sparklines that look a lot like a complex barcode
Apparently my next (and hopefully not final) form is a barcode scanner
You could say I'm on the segfault treadmill, but I'm also literally working at a treadmill desk and I don't want that treadmill to segfault
And even after you've stopped triggering segfaults, you're probably only sort of done, because who knows what other as-yet-untriggered segfaults are still hiding deep within the stacks.
The problem with segfaulting code is that you literally can't tell you're done until you're done. Fix one, bam, here's another that you weren't able to trigger because of the first one!
ℵ₀ segfaults exist in my code
fix one, compile
now there's ℵ₀ segfaults that exist in my code
The extra shitty part is it forces everyone into an arms race: choosing to drive a non-SUV on a road full of SUVs comes out of your safety budget.
This is a classic prisoner's dilemma, and it needs a strong external (i.e. regulatory) intervention to force the system out of that zero-sum state.
"How I, a non-developer, read the tutorial you, a developer, wrote for me, a beginner" by Annie Mueller π
anniemueller.com/posts/how-i-...
For the first time: @honeycomb.io is hiring for open roles in Australia!!! We have this senior role open as well as a mid-level role. job-boards.greenhouse.io/honeycomb/jo...
Once we fill these, we will have a thriving APAC team of 5 people: Field CTO, account exec, customer architect, and 2 support.
The legacy observability vendors' obsession with "cardinality control" is so backwards.
Why *control* cardinality instead of *embracing* it? High-cardinality data isn't a bug; it's the entire point. Your complex systems generate complex data.
Stop building tools that fight reality.
#SemanticSearch #DataInfrastructure #SearchArchitecture
Our latest post, www.graphiumlabs.com/blog/end-of-..., discusses how current search systems and tools are falling apart as they're required to handle an ever-growing mass of data and an increasing level of nuance and complexity.
I think they might be coming back. Have you seen www.graphiumlabs.com?
This post breaks down why understanding precision and recall is essential when building search and information retrieval systems for high stakes decision making:
www.graphiumlabs.com/blog/precisi...
In high-stakes environments, like medical diagnostics, legal research, and threat detection, the trade-off between high recall and high precision isn't just a theoretical optimization problem. The choice has real-world consequences.
Ideally, users want both at 100%: all (good) signal, and zero noise. But the way search works under the hood often forces a trade-off: higher recall requires looser filters to bring in more results, and consequently more irrelevant results or noise, which brings down precision.
Precision means: "Of the results that were returned, how many were relevant (correct)?"
And recall asks: "Of all the correct results, how many were returned?"
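Those two definitions boil down to a couple of set ratios. Here's a minimal sketch in Python; the document IDs are made up for illustration:

```python
# Toy precision/recall computation for one query's result set.
# `returned` = what the system gave back; `relevant` = what actually answers the query.

def precision_recall(returned: set, relevant: set) -> tuple[float, float]:
    """Precision: fraction of returned results that are relevant.
    Recall: fraction of relevant results that were returned."""
    hits = returned & relevant
    precision = len(hits) / len(returned) if returned else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

returned = {"d1", "d2", "d3", "d4"}  # hypothetical result IDs
relevant = {"d1", "d2", "d5"}        # hypothetical ground truth
p, r = precision_recall(returned, relevant)
print(p, r)  # 0.5 (2 of 4 returned are relevant), ~0.67 (2 of 3 relevant were returned)
```

Note how loosening the filter (returning more documents) can only raise recall, while precision usually drops, which is exactly the trade-off above.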
In search and information retrieval systems, precision and recall are more than just evaluation metrics: they reflect how well a system aligns with the user's needs and expectations of relevance and completeness.
New Graphium Labs blog post!
www.graphiumlabs.com/blog/precisi...
#precision #recall #relevance #search #informationretrieval #searchengineering #searchsystems #searchquality #mlmetrics
So when nuance is important, semantic search built on vector similarity tends to miss the mark by a really, really wide margin.
I'll start: vector embeddings don't encode semantics, they encode substitutability. It _looks right_ if you squint at it, or if the use case is pretty trivial (e.g. "brown" vs. "chocolate" when describing a sofa).
But opposites also have high substitutability (good/bad, dark/light, rich/poor, etc.)
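A toy sketch of the point: cosine similarity only measures how aligned two vectors are. The 4-dimensional "embeddings" below are entirely made up, but they mimic the real phenomenon, where antonyms share almost all of their distributional context and differ on only a narrow polarity axis:

```python
import math

def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity: dot product over the product of the norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Hypothetical toy vectors (NOT real model output):
# antonyms share context features and differ only on the last "polarity" axis.
good = [0.9, 0.8, 0.1, 0.3]
bad  = [0.9, 0.8, 0.1, -0.3]
sofa = [0.1, -0.2, 0.9, 0.0]   # different context entirely

print(cosine(good, bad))   # high (~0.88) despite opposite meaning
print(cosine(good, sofa))  # near zero: different distributional context
```

With any vectors shaped like this, "good" and "bad" come out far more similar to each other than either is to an unrelated word, which is substitutability, not semantics.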
Still frame from the movie The Princess Bride of Inigo Montoya saying "I do not think that word means what you think it means" to Vizzini, overlaid with the text "I do not think these words mean what you think they mean" near Inigo Montoya's head, and the text "Semantic search is just vector embeddings cosine similarity" near Vizzini's head.
It's been bothering me for years how "Semantic" in "Semantic search", the way it's built these days, is semantically wrong.
So on this quite lovely Canada day, let's argue semantics about "Semantic".
I once failed the "check the checkbox" test by checking it... Wrong? I guess?
Super excited to share this! I've known Saem for many years, and once we started talking about what we're building at Graphium Labs, having him join us as CEO felt inevitable.
I found the most incredible graph on the other site
This was a really fun talk to give. Thanks Kir Shatrov and Cameron Morgan for organizing, and @tavis.damnsimple.com for recording!
Video: m.youtube.com/watch?v=D4ZL...
Slides: www.graphiumlabs.com/vancouver-sy...
I love this paper!
New Change, Technically episode is out: WHO'S AFRAID OF MATH?
We tackle *math anxiety*, @analog-ashley.bsky.social teaches me about vulnerable circuits in the brain and being vulnerable about teaching, and I read a HECK of a lot of science to bring you this episode
Why "geometric" is bad:
Geometric refers to a geometric sequence in math, of the form a, ar, ar^2, ar^3, ..., ar^n.
If r > 1 and the scale of something grows by that ratio at every step, you lose control FAST. Nuclear meltdown fast. 99.9999% of the increase occurs in the last microsecond.
Fine -> BAD happens fast
Hey Tim let's talk.
This was a really fun talk to write and present!
Co-founder @nisanharamati.bsky.social gave a talk at last night's Vancouver.systems, "The Limits of Scaling and the Physical Properties of Data", going over how to predict the size limit where distributed systems stop scaling and start losing throughput.
slides: www.graphiumlabs.com/vancouver-sy...