Jonathan Hayase

@jon.jon.ke

5th year PhD student at UW CSE, working on Security and Privacy for ML

Followers: 191
Following: 39
Posts: 1
Joined: 23.11.2024

Latest posts by Jonathan Hayase @jon.jon.ke

Tokenizers govern the allocation of computation. It's a waste to spend a whole token of compute predicting the "way" in "By the way". SuperBPE redirects that compute to predict more difficult tokens, leading to wins on downstream tasks!

21.03.2025 18:31 👍 4 🔁 1 💬 0 📌 0
poster for paper

excited to be at #NeurIPS2024! I'll be presenting our data mixture inference attack 🗓️ Thu 4:30pm w/ @jon.jon.ke - stop by to learn what trained tokenizers reveal about LLM development (‼️) and chat about all things tokenizers.

🔗 arxiv.org/abs/2407.16607

11.12.2024 22:08 👍 13 🔁 4 💬 0 📌 0