Belated Happy New Year 2025 & Happy Lunar New Year!
We're kicking off the year with SheffieldNLP's latest research, accepted at NAACL, ICLR & ECIR!
Stay tuned for a thread summarizing our published work!
Joking aside, DeepSeek is really impressive, showing that scaling is **not** all you need
[Prompt:] Write an epic rap battle between Donald Trump and Xi Jinping on East versus West. [Prompt:] Good job. Add 8 Mile vibes. [Prompt:] Cool. Now make Donald rap in the style of Snoop Dogg and Xi in the style of Eminem.
ChatGPT - DeepSeek
Help! Looking for two emergency reviewers for an ARR December 2024 paper on topic modelling. Please msg me if you can provide a review by tomorrow
Synthetic calibration data (for pruning and quantization) generated by the LLM itself is a better approximation of the pre-training data distribution than "external" data.
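A toy sketch of that intuition, using smoothed unigram counts and KL divergence as a stand-in for the real distributional comparison (all strings, the smoothing scheme, and the unigram proxy are invented here for illustration, not from the paper):

```python
from collections import Counter
import math

def kl_divergence(p_counts, q_counts):
    """KL(P || Q) over the shared vocabulary, with add-one smoothing."""
    vocab = set(p_counts) | set(q_counts)
    p_total = sum(p_counts.values()) + len(vocab)
    q_total = sum(q_counts.values()) + len(vocab)
    kl = 0.0
    for w in vocab:
        p = (p_counts.get(w, 0) + 1) / p_total
        q = (q_counts.get(w, 0) + 1) / q_total
        kl += p * math.log(p / q)
    return kl

# Toy "pre-training" distribution and two candidate calibration sets.
pretrain = Counter("the model learns language from large text corpora".split())
self_generated = Counter("the model generates text about language and corpora".split())
external = Counter("completely unrelated shopping list milk eggs bread".split())

# Self-generated data sits closer (lower KL) to the pre-training counts.
print(kl_divergence(pretrain, self_generated) < kl_divergence(pretrain, external))
```

The point of the sketch: text the model generates itself shares the model's own lexical statistics, so it diverges less from the pre-training distribution than arbitrary external text.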
Really cool work by Miles (@mileswil.bsky.social) and George (@soon1otis.bsky.social) to be presented at #NAACL2025
arxiv.org/abs/2410.17170
We're hiring! Looking for a Lecturer (~Assistant Prof.) at the intersection of #NLProc and computational social media analysis (i.e. computational social science)
Info and how to apply:
www.jobs.ac.uk/job/DLI664/l...
Wrote up my first piece of PhD work last week!
Summarization via LMs is great at extracting info from documents, but how does summarization look in sensitive settings where privacy-preservation is essential?
Short answer: LMs are poor privacy preservers.
arXiv: arxiv.org/abs/2412.12040
Findings:
- Privacy preservation at inference time is really underexplored!
- LMs struggle to prevent PII leakage in their summaries.
- Human evaluations reveal privacy risks that metrics may overlook.
Paper w/ @naletras.bsky.social and Ning Ma
Cc. @sltcdt.bsky.social
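A minimal sketch of the kind of PII screen such an evaluation might run over generated summaries (the pattern set, category names, and the example summary are all invented for illustration, not taken from the paper):

```python
import re

# Hypothetical regex screens for two common PII categories.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "phone": re.compile(r"\b(?:\+?\d[\d\s-]{7,}\d)\b"),
}

def leaked_pii(summary: str) -> dict:
    """Return the PII spans a summary leaks, keyed by category."""
    return {name: pat.findall(summary)
            for name, pat in PII_PATTERNS.items()
            if pat.findall(summary)}

summary = "Patient J. Doe (jdoe@example.com) was discharged; call 0114 222 0000."
print(leaked_pii(summary))
```

Regex screens like this are exactly the kind of automatic metric the thread warns about: they catch surface forms (emails, numbers) but miss the contextual leaks that human evaluation surfaces.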
We invite nominations to join the ACL 2025 PC as a reviewer or area chair (AC). The review process runs through the ARR February cycle. Tentative timeline: reviews 1-20 Mar 2025, rebuttal 26-31 Mar 2025. ACs must be available throughout the Feb cycle. Nominations by 20 Dec 2024:
shorturl.at/TaUh9 #NLProc #ACL2025NLP
Participated in an #EU "Survey on Simplifying Applications in EU Grants". I clicked to receive my response by email, which obviously required a CAPTCHA. After a few failed attempts, I managed to receive it. Glad that it didn't ask me to provide detailed KTPs, TRLs and a 27B-6.
Great to have Joe Stacey today for his talk on atomic inference for interpretable NLI! Breaking tasks into atoms for transparency + outperforming baselines was inspiring. Loved his insights on robustness, and a fun trip to the Christmas Market after!
Cass and I are looking for a #PhD student to work on multimodal LLMs @sheffieldnlp.bsky.social.
This is a fully-funded scholarship (including stipend), open to home and international candidates.
Deadline: 29/1/2025
Please spread the word!
#nlproc
www.findaphd.com/phds/project...
Having a large number of short ARR cycles a year doesn't make sense to me. Since there is no arXiv anonymity period anymore, we can move to less rushed cycles, more engagement during the discussion period and better review/metareview quality given the extra time
The author response period of @ReviewAcl is way too short (this time during a weekend). We can definitely do better by extending it for more meaningful discussions.
This perhaps would mean a smaller number (e.g. 3 or 4) of longer ARR cycles, but it's still worth it #nlproc #naacl2025
Fun fact: if I remember correctly, we got desk-rejected by PLOS ONE because they couldn't find reviewers with appropriate expertise. I don't think we even tried *ACL because we didn't have a "novel" END-TO-END model. PeerJ CS got some extremely high-quality reviewers though!
The paper was published eight years ago:
peerj.com/articles/cs-...
and apart from inspiring further research in NLP and #legaltech, it also resulted in the creation of the NLLP Workshop and its amazing community 5/5
The initial idea was conceived at a coffee shop in Sheffield (the Couch, still operating) in 2014, trying to explain NLP and text classification to Dimitris 4/n
Looking back, it still amazes me that this work was just a side project, not specifically funded by a grant and published in a less prestigious outlet (although a UCL press release helped enormously) 3/n
For the first time, we showed that it is possible to use just the fact descriptions of legal cases to train classifiers - SVMs (ehm, what?!) acting as judges - for predicting judicial decisions. This sparked huge interest (and debates) in the use of AI in the legal domain 2/n
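A minimal sketch of that kind of setup, assuming scikit-learn; the toy fact descriptions and labels below are invented, standing in for the case texts and n-gram features the original work used:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Invented toy fact descriptions and outcomes (1 = violation, 0 = none).
facts = [
    "the applicant was detained without judicial review for months",
    "the applicant's trial was concluded within a reasonable time",
    "police held the applicant incommunicado and denied access to a lawyer",
    "the domestic courts examined the complaint promptly and fairly",
]
labels = [1, 0, 1, 0]

# N-gram features + linear SVM: facts in, predicted outcome out.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(facts, labels)
print(clf.predict(["the applicant was held without access to a court"]))
```

The design choice is the point of the post: no end-to-end neural model, just lexical features of the fact descriptions feeding a linear classifier.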
Our paper (w/ Bill, Dimitris and Daniel) crossed the 1000 citations mark. While citation count as a metric only partially captures the true impact of a paper, it still indicates how influential this work was at the intersection of law and NLP 1/n
A huge thank you to Xiting Wang (Renmin University of China) for an insightful talk!
Her talk on explaining large & small language models and uncovering safety risks through Concept Activation Vectors was very interesting and truly inspiring.
scholar.google.com/citations?us...
Hello #nlproc world