
Dylan Cope

@dylancope

Researching multi-agent RL, emergent communication, and evolutionary computation. Postdoc at FLAIR Oxford. PhD from Safe and Trusted AI CDT @ KCL/Imperial. Previously visiting researcher at CHAI U.C. Berkeley. dylancope.com he/him London 🇬🇧

1,468
Followers
662
Following
21
Posts
09.02.2024
Joined

Latest posts by Dylan Cope @dylancope

@ordinarythings.bsky.social has a better understanding of the social impacts of AI than many of the people in the industry, and is doing a great job clearly explaining these issues in an entertaining way. This is the kind of public outreach the world needs more of.

27.06.2025 09:09 👍 0 🔁 0 💬 0 📌 0
Will AI Slop Kill the Internet? | SlopWorld (YouTube video by Ordinary Things)

"People want more friends, sure. But if your solution to that is to build a product that makes it easier and more pleasurable to talk to no one, then fuck you. You are a misery merchant no better than a drug dealer"

youtu.be/NuIMZBseAOM?...

27.06.2025 09:05 👍 0 🔁 0 💬 1 📌 0

The way cars race up to the zebra crossing in SF is wild. I see the painted line a couple metres back from the crossing, but that seems to be a mere suggestion.

01.06.2025 03:48 👍 0 🔁 0 💬 0 📌 0
Post image

I'm blushing

10.01.2025 14:55 👍 5 🔁 0 💬 0 📌 0

Mmh I don't know if I would say they're Sagans of our time. I think it's people like Vsauce, Hank Green, 3blue1brown, Physicsgirl, smartereveryday, Simone Giertz, Veritasium, MinutePhysics, etc.

05.12.2024 21:45 👍 2 🔁 0 💬 1 📌 0

I think some people are annoyed and the baby bird response is a form of condescension. I don't like it.

I think it's good to be considerate and express gratitude if a reviewer has put in time. But you also have to make actual arguments.

04.12.2024 00:22 👍 2 🔁 0 💬 1 📌 0

I never knew a photo of someone holding a hedgehog could feel so inspirational. This looks like it should be on a political poster or something!

27.11.2024 13:30 👍 2 🔁 0 💬 1 📌 0
Post image

When people are interested in learning about how to train agents to communicate (emergent communication), I always recommend this paper as a first read: dl.acm.org/doi/10.5555/...

Attached meme summarises the main pitfall to be wary of!

25.11.2024 13:43 👍 2 🔁 0 💬 0 📌 0

Chiming into the conversation on peer-review. I think this is a good point that we need to take seriously. Science denialism has gotten a huge boost recently and many grifters benefit from well-meaning debates that they can twist into anti-intellectual narratives.

24.11.2024 19:07 👍 2 🔁 0 💬 0 📌 0

๐Ÿ‘‹๐Ÿป

24.11.2024 14:56 👍 1 🔁 0 💬 0 📌 0
We introduce the effective horizon, a property of MDPs that controls how difficult RL is. Our analysis is motivated by Greedy Over Random Policy (GORP), a simple Monte Carlo planning algorithm (left) that exhaustively explores action sequences of length k and then uses m random rollouts to evaluate each leaf node. The effective horizon combines both k and m into a single measure. We prove sample complexity bounds based on the effective horizon that correlate closely with the real performance of PPO, a deep RL algorithm, on our BRIDGE dataset of 155 deterministic MDPs (right).

Kind of a broken record here but proceedings.neurips.cc/paper_files/...
is totally fascinating in that it postulates two underlying, measurable structures that you can use to assess if RL will be easy or hard in an environment

23.11.2024 18:18 👍 151 🔁 28 💬 8 📌 2
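For readers who want to see the idea in the quoted figure caption concretely, here is a minimal sketch of a GORP-style planner in plain Python. It follows only what the caption states (exhaustively try every action sequence of length k, score each with m random rollouts, act greedily) and is not the paper's reference implementation. The `step(state, action) -> (next_state, reward, done)` interface, the function names, and the toy MDP are all assumptions made for illustration.

```python
import itertools
import random


def rollout(step, state, seq, steps_left, actions, rng):
    """Follow `seq`, then act uniformly at random; return the summed reward."""
    ret = 0.0
    for i in range(steps_left):
        action = seq[i] if i < len(seq) else rng.choice(actions)
        state, reward, done = step(state, action)
        ret += reward
        if done:
            break
    return ret


def gorp(step, init_state, actions, horizon, k, m, rng):
    """GORP-style planning sketch for a deterministic MDP.

    At each timestep, every action sequence of length k is scored by the
    mean return of m rollouts that finish with random actions; the first
    action of the best-scoring sequence is committed to the plan.
    """
    plan = []
    state = init_state
    for t in range(horizon):
        depth = min(k, horizon - t)  # do not look past the end of the episode
        best_seq, best_value = None, float("-inf")
        for seq in itertools.product(actions, repeat=depth):
            total = sum(
                rollout(step, state, seq, horizon - t, actions, rng)
                for _ in range(m)
            )
            value = total / m
            if value > best_value:
                best_seq, best_value = seq, value
        action = best_seq[0]
        plan.append(action)
        state, _, done = step(state, action)
        if done:
            break
    return plan


def toy_step(state, action):
    """Hypothetical toy MDP: action 1 always pays reward 1, action 0 pays 0."""
    return state + 1, float(action), False


plan = gorp(toy_step, 0, [0, 1], horizon=5, k=1, m=200, rng=random.Random(0))
print(plan)
```

In this sketch the cost per timestep grows as (number of actions)^k times m rollouts, which is the trade-off between k and m that the caption says the effective horizon folds into one number.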

I think the LLMs would generally write JAX that isn't compatible with jit - lots of non-concrete shape issues. But if you know a couple of patterns for doing branchless conditionals in SIMD settings it's not too hard to fix.

Or you could try aggressively prompting the LLMs 😂

23.11.2024 15:39 👍 0 🔁 0 💬 0 📌 0
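To illustrate the kind of pattern the post above refers to, here is a minimal sketch of two common jit-friendly rewrites, assuming a recent JAX install. The function names and the wall-at-1.0 toy dynamics are hypothetical and are not taken from the author's code.

```python
import jax
import jax.numpy as jnp


@jax.jit
def clipped_step(pos, vel):
    """Move by `vel`, stopping at a wall located at 1.0.

    A Python `if pos + vel > 1.0:` would fail under jit because the
    comparison is a traced value, not a concrete bool. Instead, compute
    both outcomes and select elementwise with jnp.where.
    """
    new_pos = pos + vel
    hit_wall = new_pos > 1.0
    new_pos = jnp.where(hit_wall, 1.0, new_pos)
    new_vel = jnp.where(hit_wall, 0.0, vel)
    return new_pos, new_vel


@jax.jit
def sum_positive(x):
    """Sum the positive entries of `x` with a fixed-shape mask.

    Boolean indexing such as `x[x > 0]` has a data-dependent output
    shape, which jit cannot trace; masking keeps the shape static.
    """
    return jnp.where(x > 0, x, 0.0).sum()


pos, vel = clipped_step(jnp.array([0.5, 0.9]), jnp.array([0.2, 0.3]))
total = sum_positive(jnp.array([-1.0, 2.0, 3.0]))
```

Both branches of `jnp.where` are always evaluated, so this suits cheap branches; for expensive ones, `jax.lax.cond` is the usual alternative.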

For my domains it is night and day! Easily 10x speed-ups. I've been using JAX for the last 8 months, and I was using RLlib before which was very slow for my purposes.

Writing custom environments in JAX can be a bit of a pain though.

22.11.2024 17:12 👍 2 🔁 0 💬 1 📌 0

Currently I'm using:

- Custom gymnax env
- PureJAXRL
- PPO
- GRU RNNs
- wandb
- praying that my choice of hyperparameters is fine

22.11.2024 13:15 👍 2 🔁 0 💬 1 📌 0

My hopeful interpretation is that tweet is getting less engagement because we're all over here now, and not looking at Twitter!

But it also wouldn't remotely surprise me if Musk is suppressing mentions of Bluesky over there.

22.11.2024 11:48 👍 2 🔁 0 💬 0 📌 0

I really hope it lasts! Feels very refreshing to see so many interesting things on the feed.

22.11.2024 01:17 👍 2 🔁 0 💬 0 📌 0

Post a non-religious photo you think of as holy.

21.11.2024 17:20 👍 0 🔁 0 💬 0 📌 0

Put differently - LLM pre-training is imitation learning, and so maybe they will imitate our ability to adapt OOD?

Imo the problem is that IL is notoriously bad OOD. Not yet convinced "just scale" fixes the fundamental issue of biased demo data/compounding errors.

20.11.2024 19:52 👍 1 🔁 0 💬 0 📌 0
Video thumbnail

Managed to stump it with a drop that relies on correcting your balance with the wall. Wasn't too hard for me to get it but the agents don't get it!

20.11.2024 12:37 👍 0 🔁 0 💬 0 📌 0

Could you add me! :)

20.11.2024 09:51 👍 0 🔁 0 💬 0 📌 0

One of my first posts on Twitter was "fuck twitter". I'd just like to reiterate that sentiment today, as I join Bluesky.

19.11.2024 00:37 👍 58 🔁 1 💬 2 📌 1

Please get the others from Novara on too 😅

14.11.2024 13:47 👍 1 🔁 0 💬 0 📌 0

It works better than Twitter!

14.11.2024 13:41 👍 0 🔁 0 💬 0 📌 0