26.02.2026 23:05
Excited to share that our paper "Global-Aware Edge Prioritization for Pose Graph Initialization" has been accepted to CVPR 2026! #CVPR2026 See you soon in Denver! Code is coming soon.
How would you do accurate and efficient pose graph initialization in a global manner? arxiv.org/abs/2602.21963
26.02.2026 15:54
Global-Aware Edge Prioritization for Pose Graph Initialization
@weitong8591.bsky.social, @gtolias.bsky.social, Jiri Matas, @danielbarath.bsky.social
tl;dr: rank pose graph edges -> global consistency -> improve SfM
arxiv.org/abs/2602.21963
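The rank-then-initialize pipeline in the tl;dr can be illustrated with a generic sketch (not the paper's actual scoring or initialization): given hypothetical per-edge reliability scores, a Kruskal-style greedy pass builds a spanning tree from the highest-priority relative poses, which would then seed the pose initialization.

```python
def spanning_tree_by_priority(n_nodes, edges):
    """Greedy Kruskal-style spanning tree over pose-graph edges,
    visiting edges in descending priority score so the initial
    poses are chained through the most reliable relative poses.

    edges: list of (i, j, score) tuples.
    """
    parent = list(range(n_nodes))

    def find(x):
        # Union-find with path compression.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for i, j, score in sorted(edges, key=lambda e: -e[2]):
        ri, rj = find(i), find(j)
        if ri != rj:  # keep the edge only if it joins two components
            parent[ri] = rj
            tree.append((i, j, score))
    return tree
```

Low-scoring edges are skipped whenever a higher-priority path already connects their endpoints.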
26.02.2026 13:24
Sleeping while waiting on an "anywhere in the world" paper decision release. #CVPR2026
20.02.2026 21:21
Attention, Please! Revisiting Attentive Probing Through the Lens of Efficiency
As fine-tuning becomes impractical at scale, probing is emerging as the preferred evaluation protocol. However, standard linear probing can understate the capability of models whose pre-training optim...
8/8 Resources
Paper: arxiv.org/abs/2506.10178
Code: github.com/billpsomas/e...
Joint work with: Dionysis Christopoulos, @eirinibaltzi.bsky.social, @ikakogeorgiou.bsky.social, @tim-arav.bsky.social, Nikos Komodakis, Konstantinos Karantzalos, Yannis Avrithis, @gtolias.bsky.social.
See you @ ICLR 2026!
20.02.2026 15:03
7/n Take-home messages
EP:
- Plug-and-play.
- Compatible with all pre-training families.
- Unlocks the potential of encoders optimized for local representations.
- Complementary with PEFT.
- Better to have it than not to have it.
20.02.2026 15:03
6/n EP + PEFT
- EP captures information that LoRA alone does not, and vice versa.
- LoRA+EP improves over both pure EP and pure LoRA.
Example: a LoRA+EP configuration with 250K params reaches 72%, 4.3% above linear probing (67.7%), while using over 3× fewer parameters.
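For readers unfamiliar with the PEFT side of this combination, a minimal LoRA adapter over a frozen linear layer looks roughly like this (`LoRALinear`, the rank, and the init scale are illustrative assumptions, not from the EP codebase); an EP head on top would then contribute the attentive pooling that LoRA alone does not.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank (LoRA) update."""

    def __init__(self, base: nn.Linear, rank: int = 4):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pre-trained weights
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))

    def forward(self, x):
        # B @ A starts at zero, so training begins from the frozen behavior.
        return self.base(x) + x @ self.A.T @ self.B.T
```

With `rank=4` on a 64-dim layer this adds only 512 trainable parameters, which is how such configurations stay in the sub-million-parameter regime.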
20.02.2026 15:03
5/n Interpretability
- EP queries specialize in distinct spatial regions.
- Attention maps are complementary.
- Semantic correspondences emerge (e.g. tails, feet).
- Verified quantitatively too.
20.02.2026 15:03
4/n Designed for local representations
Across ImageNet-1K:
- Consistent gains over k-NN and Linear Probing (LP).
- Particularly strong improvements for MIM, VL, and generative models.
- Minimal overhead.
20.02.2026 15:03
3/n Core observation
Prior attentive probing uses redundant projections.
Introducing Efficient Probing (EP):
- Multi-query cross-attention.
- Plug-and-play on top of frozen encoders.
- Lightweight and parameter-efficient.
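A minimal sketch of the multi-query cross-attention idea, assuming frozen patch tokens of shape (B, N, D). This uses PyTorch's stock attention module, whereas EP itself removes redundant projections, so treat the layer layout and sizes as illustrative, not as the paper's implementation.

```python
import torch
import torch.nn as nn

class EfficientProbe(nn.Module):
    """Multi-query cross-attention pooling over frozen patch tokens,
    followed by a linear classifier on the concatenated query outputs."""

    def __init__(self, dim, n_queries=4, n_classes=1000):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(n_queries, dim) * 0.02)
        self.attn = nn.MultiheadAttention(dim, num_heads=1, batch_first=True)
        self.head = nn.Linear(dim * n_queries, n_classes)

    def forward(self, tokens):  # tokens: (B, N, D), from a frozen encoder
        q = self.queries.unsqueeze(0).expand(tokens.size(0), -1, -1)
        pooled, _ = self.attn(q, tokens, tokens)  # (B, n_queries, D)
        return self.head(pooled.flatten(1))       # concat queries -> logits
```

Only the probe's parameters train; the encoder producing `tokens` stays frozen.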
20.02.2026 15:03
2/n Why revisit probing?
- Linear probing underestimates encoders optimized for local representations.
- Full fine-tuning is costly at scale.
- Attentive probing helps, yet existing methods are over-parametrized and not well studied.
Can we get the benefits of attention without that much overhead?
20.02.2026 15:03
1/n Attention, Please!
Our work "Revisiting Attentive Probing Through the Lens of Efficiency" has been accepted at #ICLR2026.
We introduce Efficient Probing (EP): a lightweight, multi-query attentive probing method for frozen encoders.
Paper + code at the end.
20.02.2026 15:03
Would love to try
13.01.2026 18:33
Best promo anyone could make for this position. And, amazingly, everything said is true.
09.01.2026 05:36
Postdoctoral research position in Instance-level visual generation
Czech Technical University in Prague (CTU) offers a fellowship program, the CTU Global Postdoc Fellowship. This new and attractive two-year fellowship-program offers excellent researchers who have rec...
I have an opening for a two-year post-doc position on instance-level (personalized) visual generation. Eligibility: (i) <=7 years from Ph.D.; (ii) studies or 1 year outside of Czechia; (iii) >=3 papers in journals with IF or at CORE A*/A conferences. Deadline: 15 Feb.
Details: www.euraxess.cz/jobs/399390
08.01.2026 11:11
New task: Instance-level Image+Text -> Image Retrieval
Given a query image + an edit ("during night"), retrieve the same specific instance after the change, not just any similar object.
New dataset on HF: i-CIR huggingface.co/datasets/bil...
Download, run, and share results!
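A toy scoring loop for the composed-retrieval setup, assuming precomputed image and text embeddings in a shared space; the simple additive fusion here is a placeholder for whatever composition a real method would learn, not a recommended baseline.

```python
import numpy as np

def retrieve(query_img_emb, edit_text_emb, gallery_embs, k=5):
    """Rank gallery images for composed (query image + edit text) retrieval.
    The additive fusion is a placeholder for a learned composition."""
    q = query_img_emb + edit_text_emb
    q = q / np.linalg.norm(q)  # unit-normalize the composed query
    g = gallery_embs / np.linalg.norm(gallery_embs, axis=1, keepdims=True)
    return np.argsort(-(g @ q))[:k]  # indices of top-k cosine matches
```

The instance-level twist is in the data, not this loop: the correct gallery item is the same object instance after the edit, so embeddings must separate instances, not just categories.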
06.01.2026 20:00
12/12 Joint work with Giorgos Petsangourakis, Christos Sgouropoulos, Theodoros Giannakopoulos, Giorgos Sfikas, @ikakogeorgiou.bsky.social.
27.12.2025 10:32
REGLUE Your Latents with Global and Local Semantics for Entangled Diffusion
Latent diffusion models (LDMs) achieve state-of-the-art image synthesis, yet their reconstruction-style denoising objective provides only indirect semantic supervision: high-level semantics emerge slo...
11/n Summary
REGLUE shows that the way we leverage VFM semantics matters for diffusion. Combining compact local semantics with global context yields faster convergence and state-of-the-art image generation.
arXiv: arxiv.org/abs/2512.16636
Project: reglueyourlatents.github.io
27.12.2025 10:30
10/n Faster convergence
REGLUE (SiT-B/2) achieves 12.9 and 28.7 FID at 400K iterations in conditional and unconditional generation, respectively, outperforming REPA, ReDi, and REG. REGLUE (SiT-XL/2) matches 1M-step SOTA performance in just 700k iterations (~30% fewer steps).
27.12.2025 10:30
9/n Alignment effects
External alignment complements joint modeling, but its benefits depend on the signal. Local alignment yields consistent gains, whereas global-only alignment can degrade performance. Spatial joint modeling remains the primary driver.
27.12.2025 10:29
8/n Local > Global Semantics
Our analysis shows that jointly modeling with patch-level semantics drives most gains. The global [CLS] helps, but fine-grained spatial features deliver a substantially larger FID improvement, highlighting the importance of local structure for diffusion.
27.12.2025 10:29
7/n Semantic preservation under compression
Do compressed patch features retain VFM semantics?
Points show frozen compressed DINOv2 semantics (x: ImageNet top-1 / Cityscapes mIoU) vs SiT-B generation quality (y: ImageNet FID) when trained on VAE latents + compressed features.
27.12.2025 10:29
6/n Non-linear compression matters
Linear PCA can limit patch-level semantics (e.g., ReDi). We introduce a lightweight non-linear semantic compressor that aggregates multi-layer VFM features into a compact, semantics-preserving space, boosting quality (21.4 -> 13.3 FID).
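What such a non-linear compressor could look like, as a hedged sketch: a small MLP over concatenated multi-layer patch features. The hidden width, activation, and class name are assumptions; the real module's layout is described in the paper.

```python
import torch
import torch.nn as nn

class SemanticCompressor(nn.Module):
    """Non-linearly compress multi-layer VFM patch features into a
    compact, semantics-preserving space (illustrative architecture)."""

    def __init__(self, n_layers, dim, out_dim):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(n_layers * dim, 4 * out_dim),  # mix layers per patch
            nn.GELU(),                               # the non-linearity PCA lacks
            nn.Linear(4 * out_dim, out_dim),
        )

    def forward(self, feats):  # feats: (B, N patches, n_layers, D)
        B, N = feats.shape[:2]
        return self.mlp(feats.reshape(B, N, -1))  # (B, N, out_dim)
```

Unlike a per-layer PCA, the shared MLP can trade information across layers and channels when choosing what to keep.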
27.12.2025 10:28
5/n Our method
REGLUE puts these into one unified model and jointly models:
1. VAE latents (pixels)
2. local semantics (compressed patch features)
3. global [CLS] (concept)
plus an alignment loss as a complementary auxiliary boost.
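The three jointly modeled streams suggest a combined objective along these lines. This is a minimal sketch under stated assumptions: the stream weights and plain MSE terms are illustrative, and the auxiliary alignment term is omitted for brevity.

```python
import torch
import torch.nn.functional as F

def reglue_style_loss(pred, target, w_local=1.0, w_cls=0.5):
    """Joint objective over the three entangled streams
    (weights are illustrative, not the paper's values)."""
    l_latent = F.mse_loss(pred["latent"], target["latent"])  # pixels (VAE)
    l_local = F.mse_loss(pred["local"], target["local"])     # patch semantics
    l_cls = F.mse_loss(pred["cls"], target["cls"])           # global concept
    return l_latent + w_local * l_local + w_cls * l_cls
```

The point of the joint form is that all three streams are denoised by one model, rather than the semantics entering only through an external alignment penalty.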
27.12.2025 10:28
4/n Main insight
Jointly modeling compressed patch-level semantics together with VAE latents provides spatial guidance and yields larger gains than alignment-only (REPA) or global-only (REG).
The alignment loss and the global [CLS] token remain complementary, orthogonal signals.
27.12.2025 10:27
3/n Key design choice: Compact spatial semantics matter!
To leverage VFMs effectively, diffusion should jointly model VAE latents with multi-layer VFM spatial (patch-level) semantics, via a compact, non-linearly compressed representation.
27.12.2025 10:27
2/n More semantics are needed!
Existing joint modeling and external alignment approaches (e.g., REPA, REG) inject only a "narrow slice" of VFM features into diffusion. We argue richer semantics are needed to unlock their full potential.
27.12.2025 10:26
1/n REGLUE Your Latents!
We introduce REGLUE: a unified framework that entangles VAE latents with global and local semantics for faster, higher-fidelity image generation.
Links (paper + code) at the end.
27.12.2025 10:26