Sounds neat. Hopefully not a lot of horses, haha. I’ve barely looked into media streaming stuff in Elixir, will def. check ex_nvr.
Curious about the inference server, is the yolo model running on its own non elixir engine or using Bumblebee things?
Sounds neat. Hopefully not a lot of horses, haha. I’ve barely looked into media streaming stuff in Elixir, will def. check ex_nvr.
Curious about the inference server, is the yolo model running on its own non elixir engine or using Bumblebee things?
Looks neat, curious about the end use case, or is it a capability demo? Inference with Yolo models, or something else? Either ways, good luck and hope it runs all smooth.
we're clearly living in another HCF epoch right now.
Ah nvm, now I get that you meant the webshop didn’t update pricing for the custom config.! 😅
Why not put one together yourself, esp. if a PC/Linux build? Much better value.
I built one in 2019(probably after 18 years), took a bit of research, but with sites like pcpartpicker, it’s really a breeze.
Recently learnt about Minisforum etc., and I’d def go the mini route if I didn’t need GPUs.
A bit embarrassing to admit, but another thing I've been rather late at using/understanding is WebAuthn(FIDO2) compared to U2F(FIDO). Not having used a FIDO2 compatible hardware key, I had sort of mentally bucketed them together.
But, FIDO2 is a pretty solid improvement/extension over the FIDO. TIL.
Is it only available via the Oreilly subscription? attempted to buy it, but not functional on my end.
Brilliant! I'm definitely going to try it out with the key.
Would you suggest backing up the private key for future/new yubikey transfers (similar to gpg certify key bkp to extend subkey expiry), or just generate one on key and "never expose it"? Guess it depends, but curious about your workflow?
I've also been considering using Age for encryption to try out modern tooling, (though I'm comfortable with GPG+Yubikey) and only recently learnt that there might be a pathway to use Age keys on the PIV slots.
@filippo.abyssdomain.expert Curious if this is the best way to use Age+Yubikeys?
Also, @ubiquiti.bsky.social hardware+software is really well done. A lot of things that I used to have sidecars & other solutions for is now just covered with UCG & U6+, and the Unifi software. What a treat really, and I'm pretty sure I haven't even started using all of it.
Super late to the "Gigabit at home" party, but recently updated to it, also moved from Synology to Ubiquiti Router+AP setup, Ethernet where possible.
Finally I can max out on downloading these bulky models from HF & Ollama store. 😅
Congrats Benjamin!
I've been using the same #GPG keys (master, sign., enc. & auth.) on a @yubico.com Yubikey 4 since 2017, (following the drduh guide) extending the expiry every X years.
I'm now considering creating new keys (esp. for RSA/4096 & ed25519) for a new #Yubikey 5.
How is everybody else going about this?
Right, that’s a fairly recent book as well. Thanks for reminding! 🙌🏽
Right after the Huawei book, I finally picked a copy of “Chip War”.
I think this book honestly is the most readable compressed history of the chip industry all the way from vacuum tubes to modern day custom accelerator chips incl. the geopolitics.
Not to drop the momentum, what should I read next?
I still do like using Yubikeys for that (GPG+SSH) since it’s a portable secure “key”, but pretty neat idea to use the Secure Enclave as well. Any hiccups/gotchas with it?
Brilliant, I've been on a search for something along the lines after finishing it. I'll queue this up in my local library if available. Thanks back for the tip. 🙌
TCP flow & congestion control was literally designed with this in mind. I think you might enjoy a Claude session on TCP flow control and congestion control in regards to streaming architectures. 🙌🏽
Nope it’s not. Unless it’s a very sophisticated inner loop. All the tokens should be produced at the accelerator’s own max capacity.
The tokens are just waiting in-between various queues on the TCP(“network”) layer as packets waiting to be sent (or eventually dropped).
That’s somewhat expected TCP flow control (TCP backpressure if you will) on streaming systems.
The tokens pile up in queue somewhere between the server and the browser, and if the TCP congestion clears up all the tokens will seem to arrive in “one big chunk”, like that flaky audio call. 😅
Just finished reading “House of Huawei”. What a solid read, would absolutely recommend to anyone trying to understand and forward connect the dots to where we are with technology wars right now.
Omg! I remember buying a self assembly kit for a friend’s birthday a long time ago.
The packaging didn’t really specify what it was. Fun little useless surprise at the end of putting it together.
Agree but perhaps not fully(it depends)!
Diverse tools def. increase the problem complexity space, but as an LLM provider you probably want to solve that to satisfy more downstream customers.
Of course, I agree that most LLM consumers(businesses etc.) individually don't need a diverse set of tools.
Ah wow! Is the Lisp codebase still strong, or mostly converted to Python by now?
Pretty interesting idea.
Immediate thought: one might need a diverse set of tools in the “training run”, so as not to overfit to the same set.
Possibly more interesting if you can run the inner loop during training, and “transfer” that learning to run outer loops during inference.
Solid 10/10 post by Thorsten, on letting the results of your LLM(http) calls decide what function to execute next, in a loop, in a loop, ....
Finally got around to reading this. Love the simplicity, I've been using a similar barebones(low dependencies) approach in current product(in Elixir). No SDK, no problems either.
It's all http calls, with some workflow branches and some letting the http response decide what function to dispatch on.
Rewatched the Game Theory video by @veritasium.bsky.social again today considering the turn of recent geopolitical events.
A good reminder that being nice, forgiving, clear while still retaliatory/provocable is a pretty good strategy in the majority of cases.
www.youtube.com/watch?v=mScp...
Man I just feel terrible not being able to join. :/
Been meaning to use the latest Gemini(2.x series) models for a while now. Perfect timing by @strickvl.bsky.social with this set of practical examples in this standalone site.
The only thing that could have made this even better for me would be direct curl examples, but maybe I'm asking too much. 😅