
Erik Arakelyan

@kirekara

Researcher @Nvidia | PhD from @CopeNLU | Formerly doing magic at @Amazon Alexa AI and @ARM. ML MSc graduate from @UCL. Research is the name of the game. ᓚᘏᗢ http://osoblanco.github.io

629
Followers
121
Following
14
Posts
08.11.2024
Joined

Latest posts by Erik Arakelyan @kirekara


Back after a successful #EMNLP2025 conference in Suzhou, China -- some impressions ⤵️

Our papers: www.copenlu.com/news/8-paper...

@apepa.bsky.social @rnv.bsky.social @siddesh.bsky.social @kirekara.bsky.social @shoejoe.bsky.social @zainmujahid.me @lucasresck.bsky.social @copenlu.bsky.social
#NLProc

12.11.2025 07:53 👍 23 🔁 4 💬 0 📌 0

Attending EMNLP 2025 this week? So is CopeNLU -- come find us there! ⤵️

www.copenlu.com/news/8-paper...

@apepa.bsky.social @rnv.bsky.social @kirekara.bsky.social @shoejoe.bsky.social @dustinbwright.com @zainmujahid.me @lucasresck.bsky.social @iaugenstein.bsky.social

#NLProc #AI #EMNLP2025

04.11.2025 13:21 👍 4 🔁 3 💬 0 📌 0

The last round of applause goes to the @copenlu.bsky.social lab, @ucph.bsky.social and my amazing colleagues and friends there for the heartwarming, inspiring and fun times we had ♥️ to everyone involved in this journey goes my deepest gratitude ♥️♥️

03.04.2025 13:00 👍 2 🔁 0 💬 0 📌 0

I also want to thank the fantastic PhD committee, @barbaraplank.bsky.social, Ivan Titov and @delliott.bsky.social, for their deep, thought-provoking and insightful questions and analysis.

03.04.2025 12:58 👍 1 🔁 0 💬 0 📌 0

I defended my PhD at the University of Copenhagen ☺️ What a journey! I want to give massive thanks to my amazing supervisors, @iaugenstein.bsky.social and @neuralnoise.com who were there with me throughout the whole process.

Thesis on: osoblanco.github.io/thesis/
The Arxiv version is coming soon!

03.04.2025 12:54 👍 7 🔁 1 💬 3 📌 0

@dfdazac.bsky.social was an honor to work with someone as amazing as you.

The line made me teary 🥹🥹♥️♥️

30.11.2024 12:04 👍 11 🔁 2 💬 1 📌 0

Hello bluesky!
I'm using this first post to share that my PhD thesis is now available online at research.vu.nl/en/publicati...
Thanks to all my collaborators who joined me in this journey!

29.11.2024 16:42 👍 20 🔁 2 💬 1 📌 2

I think, given the current weird/awful state of how reviewing is handled at major ML venues, we would need to explicitly rank reviewers, even if they are anonymous. This can help (S)ACs at least internally filter out malicious and unqualified ones.

Will work on smth like this closer to ~ICML.

28.11.2024 17:59 👍 5 🔁 0 💬 1 📌 0

What I secretly desire is even stricter than grounding with RAG. Maybe have a big Knowledge Graph for grounding and use a good neural link predictor to confirm whether the facts are correct. This covers factuality; we would also like deductive and analytic reasoning, similar to a theorem prover.

19.11.2024 10:01 👍 3 🔁 0 💬 1 📌 0
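A minimal sketch of the grounding idea from the post above: facts stated by an LLM are first looked up in a knowledge graph, and unseen facts fall back to a link predictor that scores their plausibility. All names here are illustrative, and the predictor is a stand-in stub, not a trained model such as ComplEx or TransE.

```python
# Hypothetical grounding pipeline: KG lookup first, link predictor fallback.
KG = {
    ("copenhagen", "capital_of", "denmark"),
    ("denmark", "member_of", "eu"),
}

def link_predictor_score(head, rel, tail):
    """Stand-in stub for a trained neural link predictor: gives a weak
    prior to triples whose entities already appear in the KG."""
    entities = {e for (h, _, t) in KG for e in (h, t)}
    return 0.6 if head in entities and tail in entities else 0.1

def verify_fact(head, rel, tail, threshold=0.5):
    """Return (is_plausible, score) for a candidate fact."""
    if (head, rel, tail) in KG:
        return True, 1.0  # directly grounded in the graph
    score = link_predictor_score(head, rel, tail)
    return score >= threshold, score

print(verify_fact("copenhagen", "capital_of", "denmark"))  # (True, 1.0)
print(verify_fact("oslo", "capital_of", "denmark"))        # (False, 0.1)
```

A real system would replace the stub with a predictor trained on the KG's triples and calibrate the threshold on held-out facts.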

The main question about current LLM “reasoning” research is what to do next. Most go into synthetic generation and training on it, maybe with self-refinement, in hopes the model becomes better. I think we are missing controlled task formalization, step-by-step reasoning and strict step verification.

19.11.2024 05:34 👍 24 🔁 3 💬 5 📌 1

My amazing collaborators will be presenting three papers next week at EMNLP 2024! I wrote a blog post about our EMNLP papers and some of the other projects we're brewing 🚀🙂 neuralnoise.com/2024/nov-res...

09.11.2024 23:01 👍 11 🔁 5 💬 0 📌 0

The results consistently show that, for each model, traces leading to correct answers had a higher percentage of unique emergent facts and more overlap between the relations used in the code and in the search, while the portion of underutilized relations was lower.🤔🤔

08.11.2024 14:20 👍 2 🔁 0 💬 0 📌 0

By comparing relations in code with those in search traces, we measure emergent hallucinations and unused relations, highlighting areas of sub-optimal reasoning. We also assess the uniqueness of emergent facts per inference hop, indicating the extent of problem-space exploration.

08.11.2024 14:19 👍 2 🔁 0 💬 0 📌 0
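The relation comparison described above can be sketched with plain set operations. The relation names below are made up for illustration; the paper's actual extraction from Prolog code and search traces is more involved.

```python
# Hedged sketch: compare relations declared in the generated Prolog code
# with relations that actually appear in the simulated search trace.
code_relations   = {"parent", "sibling", "ancestor"}   # defined in the code
search_relations = {"parent", "ancestor", "married"}   # seen in the trace

emergent = search_relations - code_relations  # hallucinated during search
unused   = code_relations - search_relations  # declared but never explored
shared   = code_relations & search_relations  # overlap between code and search

print(sorted(emergent))  # ['married']
print(sorted(unused))    # ['sibling']
print(sorted(shared))    # ['ancestor', 'parent']
```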

We found a strong correlation between the faithfulness of the search to the code and model performance, across all of the models.

08.11.2024 14:18 👍 2 🔁 0 💬 0 📌 0

Using FLARE also allows evaluating the faithfulness of the completed search w.r.t. the defined facts, relations, and search logic (taken from Prolog). We simply compare (ROUGE-Lsum) the simulated search with the actual code execution when available.

08.11.2024 14:17 👍 2 🔁 0 💬 0 📌 0
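To make the trace comparison concrete, here is a simplified LCS-based ROUGE-L F1 over whitespace tokens, used as a stand-in for the ROUGE-Lsum the post mentions (the real metric splits by sentences and is usually computed with Google's rouge-score package). The example traces are invented for illustration.

```python
def lcs_len(a, b):
    """Length of the longest common subsequence of two token lists."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[len(a)][len(b)]

def rouge_l_f1(reference, candidate):
    """Simplified ROUGE-L F1 over whitespace tokens."""
    ref, cand = reference.split(), candidate.split()
    lcs = lcs_len(ref, cand)
    if lcs == 0:
        return 0.0
    p, r = lcs / len(cand), lcs / len(ref)
    return 2 * p * r / (p + r)

# Compare the model's simulated search trace with the real execution trace.
simulated = "call parent(alice, Y) bind Y=bob call parent(bob, Z) bind Z=carol"
executed  = "call parent(alice, Y) bind Y=bob call parent(bob, Z) bind Z=carol"
print(rouge_l_f1(executed, simulated))  # 1.0 for a perfectly faithful trace
```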

The method boosts the performance of LLMs at different scales (8B -> 100B+) compared to CoT and Faithful CoT on various Mathematical, Multi-Hop, and Relation Inference tasks.

08.11.2024 14:16 👍 2 🔁 0 💬 0 📌 0

The LLM formalizes the task into facts, relations, and search logic using Prolog, and simulates exhaustive search by iteratively exploring the problem space with backtracking.

08.11.2024 14:15 👍 2 🔁 0 💬 0 📌 0
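To illustrate the kind of exhaustive search with backtracking being simulated, here is a toy Prolog-style solver in Python: facts are triples, variables are `?`-prefixed strings, and conjunctive goals are explored depth-first, backtracking on conflicting bindings. This is a sketch of the general idea, not FLARE's actual pipeline (which prompts the LLM to write and simulate the Prolog itself).

```python
# Toy backtracking solver over Prolog-like facts.
FACTS = {
    ("alice", "parent", "bob"),
    ("bob", "parent", "carol"),
}

def solve(goals, binding=None):
    """Depth-first search over conjunctive goals; yields variable bindings.
    Variables are strings starting with '?'."""
    binding = binding or {}
    if not goals:
        yield binding
        return
    head, *rest = goals
    for fact in FACTS:
        new = dict(binding)
        ok = True
        for g, f in zip(head, fact):
            if g.startswith("?"):
                if g in new and new[g] != f:
                    ok = False  # conflicting binding -> backtrack
                    break
                new[g] = f
            elif g != f:
                ok = False  # constant mismatch -> try next fact
                break
        if ok:
            yield from solve(rest, new)

# grandparent(?x, ?z) :- parent(?x, ?y), parent(?y, ?z).
query = [("?x", "parent", "?y"), ("?y", "parent", "?z")]
for b in solve(query):
    print(b["?x"], "grandparent-of", b["?z"])  # alice grandparent-of carol
```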

👋Psst! Want more faithful, verifiable and robust #LLM reasoning than with CoT, but using external solvers is meh? Our FLARE💫uses Logic Prog with Exhaustive Simulated Search to achieve this.🧵
@pminervini.bsky.social, Patrick Lewis, Pat Verga and @iaugenstein.bsky.social

arxiv.org/abs/2410.11900

08.11.2024 14:13 👍 10 🔁 5 💬 6 📌 0

At #EMNLP2024 we will present our paper on LLM values and opinions!

We introduce tropes: repeated, consistent phrases that LLMs generate when arguing for political stances.

Read the paper to learn more! arxiv.org/abs/2406.19238
Work done at the University of Copenhagen + Pioneer Center for AI

07.11.2024 14:57 👍 21 🔁 5 💬 1 📌 2
Analysing The Impact of Sequence Composition on Language Model Pre-Training

Hey! 🙂 we analysed what happens during pre-training, and for causal LMs, intra-document causal masking helps quite a bit both in terms of pre-training dynamics and downstream task performance: arxiv.org/abs/2402.13991

08.11.2024 09:05 👍 7 🔁 2 💬 1 📌 0
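A small sketch of what intra-document causal masking means for packed sequences: each token attends causally only within its own document, so documents concatenated into one training sequence cannot attend across the boundary. The shape of the mask is illustrative; real implementations build it as a tensor inside the attention kernel.

```python
# Sketch of intra-document causal masking for packed sequences.
def intra_document_causal_mask(doc_ids):
    """doc_ids[i] = document index of token i in the packed sequence.
    Returns mask[i][j] = True iff token i may attend to token j."""
    n = len(doc_ids)
    return [[j <= i and doc_ids[i] == doc_ids[j] for j in range(n)]
            for i in range(n)]

# Two documents packed into one 5-token sequence: [A A A | B B]
mask = intra_document_causal_mask([0, 0, 0, 1, 1])
for row in mask:
    print("".join("x" if m else "." for m in row))
# x....
# xx...
# xxx..
# ...x.
# ...xx
```

With plain causal masking the last two rows would also contain the first three columns, letting document B condition on document A.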