Check out Tianwei's latest work on using an unlikelihood objective to distill search traces back into the base model to boost the reasoning capabilities of LLMs!
@allenanie
Stanford CS PhD working on RL and LLMs with Emma Brunskill and Chris Piech. Co-creator of Trace. Prev @GoogleDeepMind @MicrosoftResearch
Specifically:
- Offline RL
- In-context RL
- Causality
https://anie.me/about
Unverified hot takes go to this account
For all the RL PhDs and people interested in Planning and MDPs, there's a summer internship opportunity at AWS Science that specializes in LLM post-training, RLHF, LLM agents, and benchmarks like WebArena. Interested students can send their CV to fakoor@amazon.com
For education and psychometrics people, this dataset is very useful!
I credit Omar @lateinteraction.bsky.social for this beautiful summary of the difference 🤣
Hi Tim, Trace can optimize the control flow, whereas DSPy optimizes the modules in a fixed control flow (for now). I would use DSPy for a supervised learning setup and Trace for an RL-like task (when there's a clear definition of reward and feedback).
Trace performs inference-time optimization; it does not directly update the weights of the underlying neural network. It updates the agentic workflow (Python functions, prompts to LLMs, etc.).
People say Ching-an and I are indistinguishable… is that true 🤣
Come check us out near the Tesla Booth in West Exhibition Hall A 3-5pm! Come and claim your mug 🤣 We have an identity crisis: people keep thinking we are from IBM for some reason…
We are happy to give a talk or have a 1:1 chat if you are interested in learning what Trace is and/or how to use it! Trace has already been presented at the UW Robotics Colloquium and ServiceNow. #foundermode for Open-Source Software! Time to build and ship!
This open-source project is a joint effort with @chinganc_rl and Adith, the MSR RL group. We are presenting Trace at the NeurIPS Expo Demo this afternoon 3pm-5pm PT. We have MUGs, T-SHIRTs, and STICKERs!
microsoft.github.io/Trace/
👨‍💻 github.com/microsoft/Tr...
Once you build an agent with Trace, you can use ANY LLM optimizer you want. With the release of Trace 0.1.3, we introduce TextGrad (github.com/microsoft/Tr...) as an optimizer for the RL agent, along with OPRO and OptoPrime.
What enables Trace to be an RL-style agentic library? We use **Generative Optimization** techniques (LLM as an optimizer) to derive an analog to RL's policy gradient algorithm. The agent makes a move, receives feedback/reward, and updates its parameters.
In Trace, you define an Agent with declarative Python functions using Trace primitives. Trace provides flexible ways to mark what you want to change -- for example, we mark two prompts and two functions below as trainable.
True RL agents learn online -- continuously changing themselves to improve upon the feedback (reward) from a user or an environment. Why haven't people done this in the LLM "Agentic" libraries? We wondered the same and developed Trace -- a true *RL-style* agentic framework.
Unveiling Trace v0.1.3 at NeurIPS 2024, a library for building an RL-style AI Agent that learns from the environment and human feedback. Today's LLM Agent libraries are not RL agents. They specify a workflow, and it remains unchanged regardless of user feedback. #NotRL vimeo.com/1036224270
An honor to have you here!! Welcome
arxiv.org/abs/2411.17668 Our postdoc Zihan slays another COLT open problem! proceedings.mlr.press/v247/kornows...
For people who like RL theory, this is a must follow!
Can I get added? Not NLP but still working with LLMs on the RL side.
Hello...world?
Trying to reconstruct my academic networks over here :) Follow me if we know each other or if you're interested in machine learning for healthcare/social equity! Please retweet, or resky, or whatever they call it over here.
Totally, it's a great list
Here is a list of ML OSS & Open Source / Science enthusiasts I found on Bluesky 🦋
go.bsky.app/8MFcfXd
Let me know if you find such people here!
I'm still new here and the list probably misses many must-add people, so let's build it together 💪
Hi, I'm one of the main maintainers of Trace: github.com/microsoft/Tr... and will use this platform to promote it and engage with the OSS community 🫡
This is kinda cool honestly
I see… well… hope they'll include it soon
How to save/bookmark posts on 🦋?
Filled up so fast, but I saw some friends who made it onto the list, so happy for them instead 🥳
I wanted to contribute to "Starter Pack Season" with one for Stanford NLP+HCI: go.bsky.app/VZBhuJ5
Here are some other great starter packs:
- CSS: go.bsky.app/GoEyD7d + go.bsky.app/CYmRvcK
- NLP: go.bsky.app/SngwGeS + go.bsky.app/JgneRQk
- HCI: go.bsky.app/p3TLwt
- Women in AI: go.bsky.app/LaGDpqg