At NeurIPS this week? DM if you want to meet!
Congrats Eugene!
Happy to announce that I've joined Periodic Labs as a member of technical staff. We're a mission-driven startup aimed at accelerating scientific discovery using AI, with a strong focus on materials science (discovery of new materials such as superconductors). We're hiring: periodic.com
Follow me for more CS cuisine!
The fact that LLM libraries don't all share the same data format is about as surprising as the fact that there's more than one sign language dialect
Ray is an excellent way of testing whether all your `__repr__` methods are coded properly (but it shouldn't be)
Just stumbled upon RouteRL: a multiagent RL framework to facilitate the testing and development of efficient route choice strategies
coexistence-project.github.io/RouteRL/
Looks pretty cool!
What is GGUF, Safetensors, PyTorch, ONNX?
In this blog post, let's discover common formats for storing an AI model.
huggingface.co/blog/ngxson/...
MLGym makes it super easy to set up complex tasks to be solved by LLMs. Honestly one of the most intuitive APIs I have ever seen in that space!
After that, your LLM reads these instructions and outputs commands along with some thoughts. The commands are executed in the docker container's bash, and the result is returned to the agent.
Today we're opensourcing MLGym, an API for AI research agents.
MLGym relies on a gym environment that wraps a docker image. Each env has a task specified in a YAML file that states, in plain English, what you want your LLM to achieve
Good old cProfile with snakeviz is pretty cool too jiffyclub.github.io/snakeviz/
Again, not for cuda ops, and not as fine-grained as line-profiler but quite useful for macro-tracking of compute time
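A minimal sketch of how I'd produce a profile file that snakeviz can open (function and filename here are just placeholders):

```python
import cProfile
import pstats

def work():
    # Stand-in for the function you actually want to profile
    return sum(i * i for i in range(100_000))

pr = cProfile.Profile()
pr.enable()
work()
pr.disable()

# Open with: snakeviz out.prof
pr.dump_stats("out.prof")

# pstats can print the same data in the terminal
pstats.Stats("out.prof").sort_stats("cumulative").print_stats(3)
```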
torch.utils.benchmark.Timer is amazing to assess the runtime of a whole isolated piece of code, but be mindful that the way it plays with global variables isn't always obvious and may differ from time.time() on occasions
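A minimal Timer sketch (sizes arbitrary) showing the globals gotcha:

```python
import torch
from torch.utils.benchmark import Timer

x = torch.randn(512, 512)

# The stmt runs in its own namespace: pass everything it needs
# explicitly via `globals`, or it won't see your variables.
t = Timer(stmt="x @ x", globals={"x": x})
print(t.timeit(100))  # summary stats over 100 runs
```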
I use line_profiler to check the code line-by-line (careful: cuda ops are async, do not trust it for these!) - very useful to check cpu-overhead pypi.org/project/line...
The profilers I use: PyTorch profiler to view the time spent doing the various ops of my code. It can reliably show you what's going on for a single iteration of your function. pytorch.org/tutorials/re...
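A quick sketch of profiling a single iteration this way (the function is a toy stand-in; add ProfilerActivity.CUDA if you run on GPU):

```python
import torch
from torch.profiler import profile, ProfilerActivity

def step(x):
    return (x @ x).relu().sum()

x = torch.randn(256, 256)

# Profile one iteration and print the most expensive ops
with profile(activities=[ProfilerActivity.CPU]) as prof:
    step(x)

print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=5))
```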
In general, in-place operations are not preferable to regular ones (you won't gain much memory improvement or speed-up). Don't load your code with ReLU(inplace=True), mul_, add_ unless absolutely necessary.
Using hydra or similar fancy config objects: avoid calling cfg.attribute often in the code. Instead, cache the argument values in your script as global workspace variables.
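A sketch of the caching pattern, with a stand-in class instead of a real hydra config (attribute access on DictConfig-like objects goes through `__getattr__`, which is slower than a plain name lookup):

```python
class Cfg:
    """Stand-in for a hydra/omegaconf config: attribute access goes
    through __getattr__ instead of a plain variable lookup."""
    def __init__(self, **kwargs):
        self._store = dict(kwargs)
    def __getattr__(self, name):
        return self._store[name]

cfg = Cfg(lr=1e-3, num_steps=1000)

# Instead of reading cfg.lr inside the hot loop,
# cache the values once at script level:
lr = cfg.lr
num_steps = cfg.num_steps

total = sum(lr for _ in range(num_steps))
```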
If you have a tiny model (robotics, RL) cpu-overhead bound, avoid frequent calls to eval() or train() in eager mode, or model.parameters() or anything that goes through your model. Prefer cached versions of these calls.
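One example of what I mean by caching (model and usage are illustrative):

```python
import torch

model = torch.nn.Sequential(torch.nn.Linear(4, 8), torch.nn.Linear(8, 2))

# model.parameters() walks the whole module tree on every call;
# for a tiny model called thousands of times, cache the list once:
params = list(model.parameters())

# e.g. gradient clipping each step now skips the traversal
torch.nn.utils.clip_grad_norm_(params, max_norm=1.0)
```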
Avoid calling tensor.item() in between cuda operations. This triggers a cuda synchronization and blocks your code. Do the logging after all the work (forward / backward / optim) has completed. See the PyTorch docs for how to find sync points.
Avoid pinning memory in your code unless you have thoroughly tested that it accelerates runtime (see this tutorial for more info). As an aside, pin_memory is also less safe! pytorch.org/tutorials/in...
Don't send tensors to device using to(device) if you can instantiate them directly there. For instance, prefer randn((), device=device) to randn(()).to(device)
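Spelled out with a bigger tensor (sizes arbitrary):

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# One allocation, directly on the target device
x = torch.randn(1024, 1024, device=device)

# Allocates on CPU first, then copies: extra allocation + transfer
y = torch.randn(1024, 1024).to(device)

assert x.device == y.device  # same result, different cost
```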
A few tips I share when I talk about perf with PyTorch in eager mode (with focus on small models):
I guess my point was that a proper name + definition is necessary to write good code. When I see "policy", "critic", "replay buffer", "env" I know exactly what does and doesn't belong to them. With "agent" it's systematically a "hm yeah why not" - then you end up with ill-defined monster classes
If your agent is a policy call it policy, if it's a trainer call it trainer! If it's just a big undefined collection of methods, consider refactoring it...
Every time I meet with people and someone talks about agent, there's at least one person who asks "what do you mean by agent?" or "you should not call that an agent".
I stand by my point that the word "agent" should be avoided at all costs.
At least in RL, anytime I see an "Agent" class it's meant to be a "whatever doesn't fit in any other bucket in my codebase".
hard to tell, let's try :D
Everyone's like "hey I just coded and trained a SOTA LLM in my garage last week, also wrote a blogpost about it and opensourced the repo" and the only thing I did in the meantime was fix a CI and configure a remote interpreter on a server...
Side note: we saw some nice adoption from DeepSeek-R1 reprod repos, which is humbling, if not thrilling!
github.com/Jiayi-Pan/Ti...