Excited to be at @neuripsconf.bsky.social next week co-presenting our tutorial: "Model Merging: Theory, Practice, and Applications" 🔥
Proud to do this with my PhD advisor, Colin Raffel, our postdoc research fellow @mcicc.bsky.social, and an incredible panel of speakers 💙
#NeurIPS2025 #ModelMerging
SyMerge: single-layer adaptation for synergistic model merging
SyMerge introduces a single-layer adaptation technique for synergistic model merging. Read more: getnews.me/symerge-single-layer-ada... #symerge #modelmerging #ai
Optimizer Noise Shapes Model Merging Success in Neural Networks
Effective noise scale—combining learning rate, weight decay, batch size and augmentation—predicts model‑merging success, with a non‑monotonic optimum. Read more: getnews.me/optimizer-noise-shapes-m... #modelmerging #effectivenoisescale #optimizers
Chain of Merges: New Layer‑wise Method Improves Model Fusion
Chain of Merges (CoM) merges weights layer‑by‑layer, updates activation stats to curb covariate shift, and hits state‑of‑the‑art results on several benchmarks. Read more: getnews.me/chain-of-merges-new-laye... #modelmerging #chainofmerges
Task Vectors Match Gradient Descent: Theory Improves Model Merging
Study shows one epoch of fine‑tuning yields a task vector equal to –η∇L, linking task arithmetic to gradient descent. Vision benchmarks confirm the first‑epoch gradient dominates, enabling effective merging. Read more: getnews.me/task-vectors-match-gradi... #taskarithmetic #modelmerging
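The claim can be pictured with a toy sketch (the quadratic loss and all constants below are illustrative assumptions, not from the paper): after a single small gradient step from the base weights, the resulting task vector is exactly –η∇L.

```python
import numpy as np

# Toy illustration of the task-vector/gradient link. Assumption: a simple
# quadratic loss L(w) = 0.5 * ||w - target||^2, so grad(L) = w - target.
rng = np.random.default_rng(0)
theta_base = rng.normal(size=4)   # "pretrained" weights (toy)
target = rng.normal(size=4)
lr = 0.1                          # eta, the learning rate

grad = theta_base - target            # gradient of L at the base weights
theta_ft = theta_base - lr * grad     # one gradient-descent step ("fine-tune")
task_vector = theta_ft - theta_base   # task arithmetic's building block

# The task vector is exactly -eta * grad(L), as the study's theory states
# for the one-step / first-epoch regime.
assert np.allclose(task_vector, -lr * grad)
```

Under this view, adding a scaled task vector back to a base model is just applying a scaled gradient step, which is what makes merging via task arithmetic tractable to analyze.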
New Scaling Laws Reveal Predictable Gains from Model Merging in LLMs
Researchers unveiled a compact power‑law scaling rule linking base‑model size and the number of merged experts (k), showing that gains diminish roughly as 1/k. Paper submitted September 2025. Read more: getnews.me/new-scaling-laws-reveal-... #modelmerging #scalinglaws #llms
Model Merging Increases Vulnerability to Adversarial Transfer Attacks
A recent study finds model merging offers little protection against transfer attacks, with relative attack success rates above 95% across eight merging methods. Read more: getnews.me/model-merging-increases-... #modelmerging #adversarial
Model Merging Enables Tunable Reasoning Performance in LLMs
Model merging blends two pretrained LLMs via weight arithmetic; the study found Pareto improvements where the merged model beat a parent on both accuracy and token usage. Read more: getnews.me/model-merging-enables-tu... #modelmerging #llms
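The "tunable" aspect can be sketched minimally (linear interpolation of whole state dicts is an assumption here; the study's exact merging arithmetic may differ): sweeping a coefficient alpha between the two parents traces out the accuracy/token-usage trade-off.

```python
import numpy as np

def interpolate(theta_a, theta_b, alpha):
    """Linear weight interpolation: alpha=0 returns model A, alpha=1 model B.
    Both models must share the same architecture and parameter names."""
    return {k: (1 - alpha) * theta_a[k] + alpha * theta_b[k] for k in theta_a}

# Toy per-layer weights standing in for two hypothetical parent LLMs.
parent_a = {"layer.w": np.array([0.0, 2.0])}
parent_b = {"layer.w": np.array([4.0, 6.0])}

# alpha acts as the tuning knob between the parents' behaviors.
merged = interpolate(parent_a, parent_b, alpha=0.25)
assert np.allclose(merged["layer.w"], [1.0, 3.0])
```

Evaluating merged checkpoints across a grid of alpha values is how one would look for the Pareto improvements the post describes.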
Model Merging Boosts Domain‑Specific Ad‑hoc Retrieval Effectiveness
Linear weight merging of a general retrieval model with a domain‑specific one improves retrieval on medical and Japanese‑language text, matching LoRA fine‑tuning even with limited data. Read more: getnews.me/model-merging-boosts-dom... #modelmerging #adhocretrieval
Evaluation Pipeline Connects Model Merging Behavior and Internals
Researchers merged Qwen2.5 models, then evaluated them on the MMLU benchmark and probed morphology and syntax, finding stronger linguistic knowledge despite middling benchmark scores. Read more: getnews.me/evaluation-pipeline-conn... #modelmerging #mmlu #probing
Privacy-Preserving Evolutionary Merging Improves Language Models
PriME merges language models with evolutionary algorithms, boosting task performance by up to 45% on the LaMP benchmark while cutting membership‑inference risk. Read more: getnews.me/privacy-preserving-evolu... #privacy #modelmerging
New Method Superposes Task‑Specific Features for Model Merging
A model‑merging technique superposes task‑specific features with transformation matrices, avoiding fine‑tuning. Benchmarks in NLP and vision showed accuracy exceeding heuristic averaging. Read more: getnews.me/new-method-superposes-ta... #modelmerging #ml
Text Shot: Model merging is a technique for integrating the knowledge of multiple specialized AI models into a single, more capable model. Instead of fine-tuning, which refines a single pre-trained model using new data, merging combines the parameters of several models simultaneously. This process can consolidate a wealth of knowledge into one asset without requiring expensive, gradient-based training or access to the original training data. For enterprise teams, this offers several practical advantages over traditional fine-tuning. In comments to VentureBeat, the paper’s authors said model merging is a gradient-free process that only requires forward passes, making it computationally cheaper than fine-tuning, which involves costly gradient updates. Merging also sidesteps the need for carefully balanced training data and mitigates the risk of “catastrophic forgetting,” where a model loses its original capabilities after learning a new task. The technique is especially powerful when…
How Sakana AI’s new evolutionary algorithm builds powerful AI models without expensive retraining venturebeat.com/ai/how-sakana-ais-new-ev... #AI #ModelMerging
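The gradient-free, forward-pass-only property described in the excerpt above can be sketched with the simplest possible baseline, uniform parameter averaging (a hypothetical illustration, not Sakana AI's evolutionary method, which instead searches over merge recipes):

```python
import numpy as np

def average_merge(state_dicts):
    """Uniform parameter average of models sharing one architecture.
    No gradients and no training data are needed: only the weights."""
    return {key: np.mean([sd[key] for sd in state_dicts], axis=0)
            for key in state_dicts[0]}

# Toy state dicts standing in for two specialized checkpoints.
model_a = {"w": np.array([1.0, 3.0]), "b": np.array([0.0])}
model_b = {"w": np.array([3.0, 5.0]), "b": np.array([2.0])}

merged = average_merge([model_a, model_b])
assert np.allclose(merged["w"], [2.0, 4.0])
assert np.allclose(merged["b"], [1.0])
```

Evolutionary approaches replace the fixed uniform weights with searched, per-layer mixing coefficients, scored using only forward passes on a validation set, which is why merging stays cheaper than gradient-based fine-tuning.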
🔥🤖📈Combining Large Models Unlocks New Levels Of Performance In AI Research www.azoai.com/news/2024101... #AIresearch #modelmerging #scalability #generalization #largemodels #expertmodels #zeroshot #instructiontuning #multitasktraining
Collective AI Intelligence – Evolutionary Algorithms Can Advance Language Models
#KünstlicheIntelligenz #artificialintelligence #KI #AI #FoundationModels #Evolution #ModelMerging #Sprachmodelle #EvolutionäreAlgorithmen
kinews24.de/kollektive-k...