play this
www.puzzlescript.net/play.html?p=...
play this
www.puzzlescript.net/play.html?p=...
in the trailer, the mentions use the full names for "Mirror Isle" and "Skipping Stones to Lonely Homes", and "Heroes" for Heroes of Sokoban, so I'd be surprised if the intention is to hide anything.
for the record i don't think language is "solved". the parts i cared about solving, though, are to a large extent "solved", to the extent that the remaining "non-solved" parts are imo not linguistic
why?
what's the difference in your view?
i discuss this in the gist text. this is the more correct way to frame it imo (env provides observations, which agent interprets as rewards based on its goals), and it also opens up possible variations in how to think about learning from the env.
I complain a lot about RL lately, and here we go again.
The CS view of RL is wrong in how it thinks about rewards, already at the setup level. Briefly, the reward computation should be part of the agent, not part of the environment.
More at length here:
gist.github.com/yoavg/3eb3e7...
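a minimal sketch of the framing above, not the code from the gist: the environment only returns observations, and the agent turns them into rewards based on its own goal. all names here (`CounterEnv`, `Agent`, `run`) and the distance-to-goal reward are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


class CounterEnv:
    """Hypothetical toy environment: it returns observations and no reward."""

    def __init__(self) -> None:
        self.state = 0

    def step(self, action: int) -> int:
        self.state += action
        return self.state  # the observation is all the agent gets


@dataclass
class Agent:
    goal: int  # the goal lives inside the agent, not the environment

    def reward(self, observation: int) -> float:
        # The agent interprets the observation relative to its own goal.
        return float(-abs(self.goal - observation))

    def act(self, observation: int) -> int:
        if observation < self.goal:
            return 1
        if observation > self.goal:
            return -1
        return 0


def run(agent: Agent, env: CounterEnv, steps: int) -> list[float]:
    obs = env.state
    rewards = []
    for _ in range(steps):
        obs = env.step(agent.act(obs))
        rewards.append(agent.reward(obs))
    return rewards
```

two agents with different goals can run in the same kind of environment and read different rewards off the observations they get, e.g. `run(Agent(goal=3), CounterEnv(), 4)` versus `run(Agent(goal=-2), CounterEnv(), 4)`.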
yes it sucks to be the ICLR organizers today, totally agree
given that the data is already out and a large jsonl file is rumored to be floating around (which seems very plausible to me), i think the moral thing to do now would be to make the breached data publicly available for all rather than trying to hide it.
RL is ok. but the jump from
A) people can be thought of as agents who observe an environment, act, observe the outcome and update their beliefs
to:
B) let's model all things as a POMDP with a numeric reward function!
is just way too big for me
the fascinating (to me) quality of hard-core RL researchers (e.g. Sutton) is the ability to have an all-encompassing view of RL as the basis of intelligence, while at the same time working on super low-level stuff like tabular TD algorithms, and yet strongly believing these are actually the same thing
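totally agree
and this email (which i assume is for the purpose of official photos for some purpose of some organization) is perceived as censorship? (i'm just surprised, because i wouldn't have thought of that)
as someone who sees himself as likely to write something like this by mistake and doesn't understand what the issue is, i'd be glad if you could explain what is so triggering here?
i didn't understand the reference to a banknote, and i also didn't see the green specifically as a reference to the IDF.. but as i said, maybe that's because i'm really not a designer so i don't think in these terms
do you believe in the principle? because from other tweets it sounds like not really, and then, why do you actually care how accurate it is? i personally do believe in it, and indeed would be happy if this improves going forward, and i believe it will, because it's just a bullet point that was shortened in an unclear way on a poster.
not the ideal slogan but also really not that far-fetched. explicit recognition of Israel's right to exist as a Zionist entity, alongside the Arab states including the Palestinian one.
as a non-designer, it looks a bit amateurish to me but overall totally fine. what's the problem? what does "wrong green" mean?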
what's the latest-and-greatest attempt to reverse-engineer and document the inner workings of claude-code?
(hmm i guess we can amend to "increase in the proportion of knowledge we believe to be true")
i think memory is never "free", in the sense that the real bottleneck is not storage, but the ability to retrieve the right thing, while not retrieving a wrong (out of date) thing by mistake.
but assuming we do delete facts, is deleting considered learning in your definition?
is "increase" necessary? or is "change" enough? (although i guess that in an ideal form, you don't "forget" a wrong fact but add the fact that it is wrong, so you may consider it as increasing...)
yes, following instructions in the prompt is not learning. but if a wrapping system stores items to inject in future prompts, then you can consider the system as learning.
it will be in-context-induction, and the storing and retention from external memory would be learning.
the storage, if it happens, is the learning part. the inference process is not learning.
or as i wrote two years ago:
gist.github.com/yoavg/59d174...
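a rough sketch of the store-vs-inference distinction above, under my own assumptions: `MemoryWrapper` and its methods are hypothetical names, and `model` is any callable from a prompt string to a reply string, standing in for a real LLM call. the model itself never changes; only the stored items do.

```python
from typing import Callable


class MemoryWrapper:
    """Hypothetical wrapper: a frozen model plus an external memory."""

    def __init__(self, model: Callable[[str], str]) -> None:
        self.model = model  # never updated: inference only
        self.memory: list[str] = []  # the only thing that changes over time

    def remember(self, item: str) -> None:
        # Storing is the "learning" step in this view.
        self.memory.append(item)

    def build_prompt(self, user_input: str) -> str:
        # Stored items are injected into every future prompt.
        notes = "\n".join(f"- {item}" for item in self.memory)
        return f"notes:\n{notes}\n\nuser: {user_input}" if notes else f"user: {user_input}"

    def ask(self, user_input: str) -> str:
        # Pure inference: in-context use of whatever was stored earlier.
        return self.model(self.build_prompt(user_input))
```

with an echo model (`lambda prompt: prompt`), the same question gives a different result before and after `remember(...)`, even though the model is identical in both calls.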
i don't think it is a very useful view. at a very minimum we see extremely elaborate neighbor-matching and interpolation mechanisms, so the "glorified" part should be elaborated on and studied.
i agree, where is "storing" in the above case?
ah, cool!
indeed kNN is also not learning. it's just a classification method. if you want to consider kNN as a learning method, then the learning part is just "store these pairs as is".
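to make that concrete, here is a minimal kNN sketch (plain python, euclidean distance, majority vote; the class name and the `fit`/`predict` method names are my own choice). `fit` does nothing but store the pairs, and all the actual work happens at prediction time.

```python
import math
from collections import Counter


class KNN:
    def __init__(self, k: int = 3) -> None:
        self.k = k
        self.pairs: list[tuple[tuple[float, ...], str]] = []

    def fit(self, xs, ys) -> None:
        # The entire "learning" step: store these pairs as is.
        self.pairs = list(zip(xs, ys))

    def predict(self, x) -> str:
        # Classification: find the k nearest stored points and take a majority vote.
        nearest = sorted(self.pairs, key=lambda pair: math.dist(pair[0], x))[: self.k]
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]
```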