
Sasha Abramowitz

@sash-4

Research engineer at InstaDeep working on multi-agent RL

1,398
Followers
263
Following
12
Posts
18.11.2024
Joined

Latest posts by Sasha Abramowitz @sash-4

πŸ§‘β€πŸ”¬ Oumayma Mahjoub and Wiem Khilfi will be presenting Sable at #ICML2025.

πŸ—“οΈWednesday, 16 July, 4:30 PM PDT
πŸ“West Exhibition Hall B2-B3, Poster Number W-820

#MARL #AI #ICML2025

14.07.2025 15:46 πŸ‘ 2 πŸ” 1 πŸ’¬ 0 πŸ“Œ 0
Sable: a Performant, Efficient and Scalable Sequence Model for MARL As multi-agent reinforcement learning (MARL) progresses towards solving larger and more complex problems, it becomes increasingly important that algorithms exhibit the key properties of (1) strong per...

If you are interested, have a look at the full paper and code:
πŸ“œPaper: arxiv.org/abs/2410.01706
πŸ§‘β€πŸ’»Code: bit.ly/4eMUXhn
🌐Website/Data: sites.google.com/view/sable-m...

(7/N)

14.07.2025 15:46 πŸ‘ 1 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

πŸŽ‰ A massive thank you to my incredible co-authors Oumayma Mahjoub, Ruan De Kock, Wiem Khlifi, Simon Du Toit, Jemma Daniel, Louay Ben Nessir, Louise Beyers, Claude Formanek & Arnu Pretorius

(6/N)

14.07.2025 15:44 πŸ‘ 1 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

⚑Despite its power, Sable is remarkably efficient. It scales to over 1000 agents with only a linear increase in memory, and boasts 7x better GPU memory efficiency and up to 6.5x higher throughput compared to MAT (the previous SOTA).

(5/N)

14.07.2025 15:42 πŸ‘ 1 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

πŸ”¬In a benchmark across 45 diverse tasks (the largest in the literature), Sable substantially outperformed existing methods, ranking best 11 times more often than previous SOTA methods.

(4/N)

14.07.2025 15:42 πŸ‘ 1 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

πŸ’ͺ Our solution? Sable adapts the retention mechanism from Retentive Networks (RetNets), achieving the advantages of centralised learning without the associated drawbacks. This allows for efficient long-term memory and impressive scalability.

(3/N)

14.07.2025 15:41 πŸ‘ 1 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0
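The key property of retention that the post above alludes to is that it can be computed recurrently: each step folds the new key-value pair into a fixed-size state with an exponential decay gamma, so memory stays constant in sequence length. A toy single-head sketch in JAX (illustrative only, not the paper's code; `recurrent_retention` and its shapes are my own naming):

```python
import jax
import jax.numpy as jnp

def recurrent_retention(q, k, v, gamma=0.9):
    """Recurrent form of retention for one head.

    q, k, v: (seq_len, d) arrays. The carried state is a (d, d) matrix
    S_n = gamma * S_{n-1} + k_n^T v_n, and the output is o_n = q_n @ S_n,
    so memory is O(d^2) regardless of sequence length.
    """
    d = q.shape[-1]

    def step(state, qkv):
        q_n, k_n, v_n = qkv
        state = gamma * state + jnp.outer(k_n, v_n)
        return state, q_n @ state

    _, out = jax.lax.scan(step, jnp.zeros((d, d)), (q, k, v))
    return out
```

This recurrent form is mathematically equivalent to the parallel form `(Q K^T * D) V`, where `D` is a lower-triangular decay mask with entries `gamma**(n - m)`, which is what allows training in parallel but rolling out with constant memory.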

πŸ€” The challenge? Centralised training in MARL performs well but cannot scale, limiting its use to scenarios with only a few agents. This creates a trade-off between performance and agent scalability.

(2/N)

14.07.2025 15:41 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0
Sable graphed against the Multi-Agent Transformer (MAT), showing Sable outperforming MAT in performance, throughput, and GPU memory usage

🚨 Thrilled to share our #ICML2025 paper: "Sable: a Performant, Efficient and Scalable Sequence Model for MARL"!

We introduce a new SOTA cooperative Multi-Agent Reinforcement Learning algorithm that delivers the advantages of centralised learning without its drawbacks.

(1/N)

14.07.2025 15:40 πŸ‘ 2 πŸ” 1 πŸ’¬ 1 πŸ“Œ 0

Please add me πŸ™

26.11.2024 11:50 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

Totally agree with you on the filtering point, but we're all pretty bad at predicting which papers will be useful in the future, e.g. PPO was rejected.

So maybe reviewing only for soundness is a good thing?

20.11.2024 19:24 πŸ‘ 2 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

Can you add me πŸ™

19.11.2024 06:11 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

End-to-end compiling RL algorithms and envs, and running everything across multiple TPU cores/GPUs, so that you never have to communicate anything with the CPU. This gives ridiculous speed-ups, on the order of 100x depending on the environment. I don't think torch is there yet.

19.11.2024 06:08 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0
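The end-to-end compilation idea in the post above can be sketched in a few lines of JAX. This is a toy example (the env dynamics, `policy`, and `rollout` are all made up for illustration, not InstaDeep's actual stack): because the environment step, action selection, and the scan over timesteps are all traced into one `jit`-compiled XLA program, the whole rollout runs on the accelerator with no CPU round-trips between steps.

```python
from functools import partial

import jax
import jax.numpy as jnp

def env_step(state, action):
    # Toy dynamics: the state drifts with the action; reward penalises
    # distance from zero. A real env would live here as pure JAX ops.
    new_state = state + 0.1 * action
    reward = -jnp.abs(new_state)
    return new_state, reward

def policy(params, state):
    # Toy one-parameter policy.
    return jnp.tanh(params * state)

@partial(jax.jit, static_argnames="num_steps")
def rollout(params, init_state, num_steps=100):
    # env + policy + the loop over timesteps compile into a single
    # XLA program; nothing returns to the CPU until the rollout ends.
    def step(state, _):
        action = policy(params, state)
        next_state, reward = env_step(state, action)
        return next_state, reward

    final_state, rewards = jax.lax.scan(step, init_state, None, length=num_steps)
    return final_state, rewards.sum()
```

The speed-up comes from removing the per-step host/device handoff that a Python-loop training setup pays, and from XLA fusing the env and agent computation; `num_steps` is marked static so the scan length is known at compile time.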