Zhenjun Zhao

@ericzzj

ericzzj1989.github.io Postdoc@UniZar | PhD@CUHK | 3D vision, SLAM, Image matching (http://github.com/ericzzj1989/Awesome-Global-Solvers-for-3D-Vision)

1,329
Followers
511
Following
1,322
Posts
16.11.2024
Joined

Latest posts by Zhenjun Zhao @ericzzj

Dark3R: Learning Structure from Motion in the Dark

Andrew Y Guo, Anagh Malik, SaiKiran Tedla, Yutong Dai, Yiqian Qin, Zach Salehe, Benjamin Attal, Sotiris Nousias, Kyros Kutulakos, David B. Lindell

tl;dr: MASt3R+distillation with noisy–clean raw image pairs

arxiv.org/abs/2603.05330

06.03.2026 21:10 👍 1 🔁 0 💬 0 📌 0

SURE: Semi-dense Uncertainty-REfined Feature Matching

Sicheng Li, Zaiwang Gu, Jie Zhang, Qing Guo, Xudong Jiang, Jun Cheng

tl;dr: uncertainty-aware LoFTR
no eval. on IMC

arxiv.org/abs/2603.04869

06.03.2026 21:09 👍 2 🔁 0 💬 0 📌 0

MoRe: Motion-aware Feed-forward 4D Reconstruction Transformer

Juntong Fang, Zequn Chen, Weiqi Zhang, Donglin Di, Xuancheng Zhang, Chengmin Yang, Yu-Shen Liu

tl;dr: attention-forcing->dynamic motion/static scene structure; grouped causal attention+BA-like token aggregation

arxiv.org/abs/2603.05078

06.03.2026 21:08 👍 1 🔁 0 💬 0 📌 0

AIM-SLAM: Dense Monocular SLAM via Adaptive and Informative Multi-View Keyframe Prioritization with Foundation Model

Jinwoo Jeon, Dong-Uk Seo, Eungchang Mason Lee, Hyun Myung

tl;dr: good keyframe selection+VGGT+MASt3R-SLAM

arxiv.org/abs/2603.05097

06.03.2026 21:08 👍 1 🔁 0 💬 0 📌 0

Omni-Manip: Beyond-FOV Large-Workspace Humanoid Manipulation with Omnidirectional 3D Perception

Pei Qu, Zheng Li, Yufei Jia, Ziyun Liu, Liang Zhu, Haoang Li, Jinni Zhou, Jun Ma

tl;dr: end-to-end LiDAR-driven visuomotor policy in large workspaces

arxiv.org/abs/2603.05355

06.03.2026 21:08 👍 0 🔁 0 💬 0 📌 0

GloSplat: Joint Pose-Appearance Optimization for Faster and More Accurate 3D Reconstruction

Tianyu Xiong, Rui Li, Linjie Li, Jiaqi Yang

tl;dr: use 3D feature tracks

arxiv.org/abs/2603.04847

06.03.2026 21:07 👍 1 🔁 0 💬 0 📌 0

DAGE: Dual-Stream Architecture for Efficient and Fine-Grained Geometry Estimation

Tuan Duc Ngo, Jiahui Huang, Seoung Wug Oh, Kevin Blackburn-Matzen, Evangelos Kalogerakis, Chuang Gan, Joon-Young Lee

tl;dr: low/high-resolution stream->fusion

arxiv.org/abs/2603.03744

05.03.2026 17:07 👍 1 🔁 0 💬 0 📌 0

NOVA3R: Non-pixel-aligned Visual Transformer for Amodal 3D Reconstruction

Weirong Chen, @chuanxiaz.bsky.social, @ganlinzhang.xyz, Andrea Vedaldi, @dcremers.bsky.social

tl;dr: TripoSG+VGGT

layout issue with tables?
arxiv.org/abs/2603.04179

05.03.2026 17:06 👍 1 🔁 1 💬 0 📌 0

Similar idea to:
bsky.app/profile/eric...

05.03.2026 17:05 👍 0 🔁 0 💬 0 📌 0

ZipMap: Linear-Time Stateful 3D Reconstruction with Test-Time Training

@haian-jin.bsky.social, Rundi Wu, Tianyuan Zhang, Ruiqi Gao, @jonbarron.bsky.social, @snavely.bsky.social, @holynski.bsky.social

tl;dr: another(?) TTT+VGGT

arxiv.org/abs/2603.04385

05.03.2026 17:05 👍 4 🔁 1 💬 1 📌 0

Overlapping Domain Decomposition for Distributed Pose Graph Optimization

Aneesa Sonawalla, Yulun Tian, Jonathan P. How

tl;dr: RBCD+overlapping domain decomposition

arxiv.org/abs/2603.03499

05.03.2026 17:00 👍 0 🔁 0 💬 0 📌 0

MERG3R: A Divide-and-Conquer Approach to Large-Scale Neural Visual Geometry

Leo Kaixuan Cheng, Abdus Shaikh, Ruofan Liang, Zhijie Wu, Yushi Guan, Nandita Vijaykumar

tl;dr: another scalable VGGT

arxiv.org/abs/2603.02351

04.03.2026 17:22 👍 0 🔁 0 💬 0 📌 0

LoGeR: Long-Context Geometric Reconstruction with Hybrid Memory

@junyi42.bsky.social, Charles Herrmann, Junhwa Hur, Chen Sun, Ming-Hsuan Yang, Forrester Cole, Trevor Darrell, Deqing Sun

tl;dr: sliding window attention+Test-Time Training

arxiv.org/abs/2603.03269

04.03.2026 17:21 👍 1 🔁 0 💬 0 📌 0

TokenSplat: Token-aligned 3D Gaussian Splatting for Feed-forward Pose-free Reconstruction

Yihui Li, Chengxin Lv, Zichen Tang, Hongyu Yang, Di Huang

tl;dr: align semantically corresponding information across viewpoints directly in the feature space

arxiv.org/abs/2603.00697

03.03.2026 14:10 👍 0 🔁 0 💬 0 📌 0

OnlineX: Unified Online 3D Reconstruction and Understanding with Active-to-Stable State Evolution

Chong Xia, Fangfu Liu, Yule Wang, Yize Pang, Yueqi Duan

tl;dr: memory state->active local state & stable global state; active state->global state

arxiv.org/abs/2603.02134

03.03.2026 14:10 👍 0 🔁 0 💬 0 📌 0

riMESA: Consensus ADMM for Real-World Collaborative SLAM

Daniel McGann, Michael Kaess

tl;dr: MESA improvement+M-Estimation

arxiv.org/abs/2603.01178

03.03.2026 14:09 👍 1 🔁 0 💬 0 📌 0

tl;dr: Shor’s relaxation+Burer-Monteiro factorization->lifted factor graph (BM SDP relaxation)->Riemannian Staircase->LM+optimality certifier
extend SE-Sync/Shonan Rotation Averaging to factor graph

03.03.2026 14:08 👍 0 🔁 0 💬 0 📌 0

Certifiable Estimation with Factor Graphs

Zhexin Xu, Nikolas R. Sanderson, Hanna Jiamei Zhang, David M. Rosen

arxiv.org/abs/2603.01267

03.03.2026 14:08 👍 0 🔁 0 💬 1 📌 0

tl;dr: Gaussian noise->low-opacity regions; distill->auto-regressive causal model

03.03.2026 14:08 👍 0 🔁 0 💬 0 📌 0

ArtiFixer: Enhancing and Extending 3D Reconstruction with Auto-Regressive Diffusion Models

Riccardo de Lutio, @tobiasfshr.bsky.social, Yen-Yu Chang, Yuxuan Zhang, Jay Zhangjie Wu, Xuanchi Ren, Tianchang Shen, Katarina Tothova, Zan Gojcic, Haithem Turki

arxiv.org/abs/2603.00492

03.03.2026 14:08 👍 1 🔁 0 💬 1 📌 0

You still have 2 weeks to submit your paper to the Image Matching Workshop at #CVPR2026

Deadline: March 16.
Topics: anything related to image matching and 3D reconstruction.
cmt3.research.microsoft.com/IMW2026
@cvprconference.bsky.social

02.03.2026 13:42 👍 7 🔁 5 💬 0 📌 0

Multiprojective Geometry of Compatible Triples of Fundamental and Essential Matrices

Timothy Duff, Viktor Korotynskiy, Anton Leykin, Tomas Pajdla

tl;dr: multidegree & multihomogeneous vanishing ideal->compatible triples for F/E matrices
read carefully later

arxiv.org/abs/2602.23450

02.03.2026 14:57 👍 4 🔁 0 💬 0 📌 0

No Calibration, No Depth, No Problem: Cross-Sensor View Synthesis with 3D Consistency

Cho-Ying Wu, Zixun Huang, Xinyu Huang, Liu Ren

tl;dr: cross-modality matching->semi-dense X-images->densification & fusion->dense X-images->self-matching->RGB-X 3DGS

arxiv.org/abs/2602.23559

02.03.2026 14:56 👍 1 🔁 0 💬 0 📌 0

UFO-4D: Unposed Feedforward 4D Reconstruction from Two Images

Junhwa Hur, Charles Herrmann, @songyoupeng.bsky.social, Philipp Henzler, Zeyu Ma, Todd Zickler, Deqing Sun

tl;dr: dynamic NoPoSplat

arxiv.org/abs/2602.24290

02.03.2026 14:56 👍 2 🔁 0 💬 0 📌 0

Enhancing Vision-Language Navigation with Multimodal Event Knowledge from Real-World Indoor Tour Videos

Haoxuan Xu, Tianfu Li, Wenbo Chen, Yi Liu, Xingxing Zuo, Yaoxian Song, Haoang Li

tl;dr: event-centric knowledge enhancement in VLN

arxiv.org/abs/2602.23937

02.03.2026 14:55 👍 1 🔁 0 💬 0 📌 0

FLIGHT: Fibonacci Lattice-based Inference for Geometric Heading in real-Time

David Dirnfeld, Fabien Delattre, @pedro-miraldo.bsky.social, Erik Learned-Miller

tl;dr: Hough transform is back -- now for camera translation direction.
arxiv.org/abs/2602.23115

27.02.2026 16:44 👍 7 🔁 1 💬 1 📌 0

VGG-T^3: Offline Feed-Forward 3D Reconstruction at Scale

Sven Elflein, Ruilong Li, Sérgio Agostinho, Zan Gojcic, @lealtaixe.bsky.social, Qunjie Zhou, Aljosa Osep

tl;dr: map the KV space via weights of a fixed-size MLP; optimize the MLP at test time in token space

arxiv.org/abs/2602.23361

27.02.2026 14:13 👍 3 🔁 0 💬 1 📌 0

GIFSplat: Generative Prior-Guided Iterative Feed-Forward 3D Gaussian Splatting from Sparse Views

Tianyu Chen, Wei Xiang, Kang Han, Yu Lu, Di Wu, Gaowen Liu, Ramana Rao Kompella

tl;dr: init.->multi-step forward-only residual updates; DIFIX3D+->prior

arxiv.org/abs/2602.22571

27.02.2026 14:11 👍 1 🔁 0 💬 0 📌 0

Wenxuan Song, Jiayi Chen, Xiaoquan Sun, Huashuo Lei, Yikai Qin, Wei Zhao, Pengxiang Ding, Han Zhao, Tongxin Wang, Pengxu Hou, Zhide Zhong, Haodong Yan, Donglin Wang, Jun Ma, Haoang Li

27.02.2026 14:11 👍 0 🔁 0 💬 0 📌 0

Rethinking the Practicality of Vision-language-action Model: A Comprehensive Benchmark and An Improved Baseline

tl;dr: in title

arxiv.org/abs/2602.22663

27.02.2026 14:11 👍 1 🔁 0 💬 1 📌 0