Additionally, extractive models were considered, including TextRank (Mihalcea and Tarau, 2004), LexRank (Erkan and Radev, 2004), LSA (Steinberger and Ježek, 2004), KLSum (Haghighi and Vanderwende, 2009), and SumBasic (Nenkova and Vanderwende, 2005).
The following models were evaluated:
BART (Lewis et al., 2019), Gemma (Gemma Team et al., 2024), Sabiá (Pires et al., 2023), Llama (Team, 2024a), TeenyTinyLlama (Corrêa et al., 2024), Hermes (Teknium et al., 2024), Qwen (Team, 2024b), and Tucano (Corrêa et al., 2025).
I experimentally evaluated the use of small language models in the task of text summarization in the context of auditing in Brazilian public health using news data.
My article, titled "Small language models applied in text summarization task of health-related news to improve public health audit: an experimental case study" has just been published in Frontiers in Artificial Intelligence.
doi.org/10.3389/frai...
#nlp #nlproc #llm #slm
Embodiment is the concept that the function of the brain is inexorably shaped by the body, a lens that is often neglected when neuroscientists study specific brain subsystems, write @bingbrunton.bsky.social and @tuthill.bsky.social.
#neuroskyence #neuroai
www.thetransmitter.org/neuroai/brea...
nice
My deep learning course at the University of Geneva is available online. 1000+ slides, ~20h of screencasts. Full of examples in PyTorch.
fleuret.org/dlc/
And my "Little Book of Deep Learning" is available as a phone-formatted pdf (nearing 700k downloads!)
fleuret.org/lbdl/
Hi, this is Luigi Mangione, I need a Pix transfer to pay my lawyer
Waking up early without an alarm clock
Pre-training as we know it will end - Dr. Ilya Sutskever at NeurIPS 2024
No. Words not seen by the model are still "understood", yes. At least, their numerical representation will be approximated, simply because the models are trained contextually. If you give a model a task whose sentences contain some invented words, it will still be able to solve the task.
Reading ~450 abstracts of studies on text summarization applied to health, medicine, and biomedicine to carry out a systematic mapping of the literature. I thought there would be many more studies on text summarization in this area.
#EMNLP has a nice set of tokenization/subword modeling papers this year.
It's a good mix of tokenization algorithms, tokenization evaluation, tokenization-free methods, and subword embedding probing. Lmk if I missed some!
Here is a list with links + presentation time (in chronological order).
The data not seen by the model follows the structure of the language of the data it has seen. So, in a way, models can indeed "reason" (with many quotation marks) about unseen data.
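One reason invented words still get usable representations: subword tokenizers decompose any string into known pieces, so the model never receives a truly out-of-vocabulary token. Below is a minimal illustrative sketch of that idea, using a made-up vocabulary and a greedy longest-match split, not a real BPE implementation:

```python
# Illustrative subword vocabulary (hypothetical, not from any real model).
VOCAB = {"flib", "ber", "gib", "ness", "un", "der", "stand", "ing",
         "a", "b", "c", "d", "e", "f", "g", "i", "l", "n", "o", "r",
         "s", "t", "u"}

def subword_tokenize(word, vocab=VOCAB):
    """Greedy longest-match split; falls back to single characters."""
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):  # try the longest piece first
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            pieces.append(word[i])  # unknown character, keep as-is
            i += 1
    return pieces

# An invented word is still mapped onto known subwords, each of which
# has a learned embedding the model can compose in context.
print(subword_tokenize("flibberness"))  # ['flib', 'ber', 'ness']
```

Real tokenizers (BPE, WordPiece, SentencePiece) learn their merges from data, but the consequence is the same: every input string ends up as a sequence of vocabulary pieces with trained embeddings.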
Working on my dissertation qualification. I'm already receiving several criticisms and suggested improvements from my advisor, but I've also ended up having some new insights.
A great book TBR
A starter pack of starter packs:
Robotics and AI go.bsky.app/DfAoaJ1
Computer Vision go.bsky.app/PkAKJu5
Computer Graphics Research go.bsky.app/ckQ1u9
Grumpy Machine Learners go.bsky.app/6ddpivr
Reinforcement Learning go.bsky.app/3WPHcHg
Wow that is an impressive image of neurons and their beautiful connections
From:
Super-resolution imaging of fast morphological dynamics of neurons in behaving animals
www.nature.com/articles/s41...
🤖 ML/AI Mega Starter Pack
1. Open-source LLMs
go.bsky.app/FELkyDr
🧵
Webinar reminder: on 26 November 2024 from 11:00 to 12:00 CET the @sheffieldnlp.bsky.social team (led by K Bontcheva) will showcase veraAI work on text mining and analysis, and how this can support #factchecking. Register here for access (it's organized by @ebu.bsky.social).
tech.ebu.ch/events/2024/...
Talk on NLP, vector spaces, and text classification at RDSE
youtu.be/R1m7T59R-T0?...
cc @samsantosb.bsky.social #bolhadev #datascience #datasciencebr
🚨🚨The LLM Effect: Are Humans Truly Using LLMs, or Are They Being Influenced By Them Instead?🧐🤔
This is a massive question that is both important and timely.
https://aclanthology.org/2024.emnlp-main.1230/
w/ Sabrina Akter, JP Singh, and @antonisa.bsky.social
Accepted to #EMNLP2024 Main,
1/3
Stop oversampling! Changing the cutoff in probabilistic classifiers is enough for imbalanced data.
In our new paper, Gabriel O. Assunção, Marcos O. Prates, and I explore this in depth. jds-online.org/journal/JDS/...
#DataScience #MachineLearning #ImbalancedData #AI #Oversampling
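The core idea can be shown in a few lines: rather than resampling the training data, keep the classifier's probability scores and move the decision cutoff. The scores and labels below are illustrative toy values (not from the paper), chosen to mimic a calibrated model on a low-base-rate class:

```python
# Sketch: shifting the decision cutoff instead of oversampling.

def classify(probs, cutoff):
    """Turn predicted positive-class probabilities into 0/1 labels."""
    return [1 if p >= cutoff else 0 for p in probs]

def recall(y_true, y_pred):
    """Recall on the positive (minority) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if tp + fn else 0.0

# Imbalanced toy data: 2 positives among 10 examples. A well-calibrated
# model may score true positives below 0.5 simply because the base rate
# is low, so the default 0.5 cutoff misses them.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
probs  = [0.05, 0.10, 0.08, 0.12, 0.20, 0.15, 0.25, 0.30, 0.40, 0.45]

print(recall(y_true, classify(probs, 0.5)))   # 0.0: both positives missed
print(recall(y_true, classify(probs, 0.35)))  # 1.0: lowered cutoff recovers them
```

In practice the cutoff would be tuned on a validation set against whatever metric matters (recall, F1, cost-weighted error); the point is that this leaves the model and its probability estimates untouched, unlike oversampling.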
Thanks for sharing
🧠🤖I just put together a starter pack for CogSci & human-centered AI researchers. Looking to add more folks here, let me know!
go.bsky.app/NTjjUwG
#cogsci #ai #hci
New here? Interested in AI/ML? Check out these great starter packs!
AI: go.bsky.app/SipA7it
RL: go.bsky.app/3WPHcHg
Women in AI: go.bsky.app/LaGDpqg
NLP: go.bsky.app/SngwGeS
AI and news: go.bsky.app/5sFqVNS
You can also search all starter packs here: blueskydirectory.com/starter-pack...
Thanks for sharing
I've seen some interest emerging in neurosymbolic AI, which would be great for explainability and for reducing the subjectivity that surrounds LLM vector spaces.