Alex Iacob
University of Cambridge · Flower Labs
ML Research Scientist · PhD Candidate
Machine learning research from core methods to scalable systems.
I work on optimization and large-scale model training across geographically distributed infrastructure. My current projects emphasize bandwidth- and memory-efficient training and robust performance in realistic, resource-constrained environments. For current work and writing, start with the papers and blog pages.
Selected Publications
DEPT: Decoupled Embeddings for Pre-training Language Models
ICLR 2025 Oral (Top 1.8%)
Decoupled embeddings for heterogeneous multilingual pre-training.
Alex Iacob, Lorenzo Sani, Meghdad Kurmanji, William F. Shen, Xinchi Qiu, Dongqi Cai, Yan Gao, Nicholas D. Lane. ICLR 2025.
OpenReview · arXiv · Flower blog post
Decouples embeddings from the transformer body to pre-train on multilingual and multi-domain corpora with lower memory and communication overhead.
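For a rough sense of the idea, here is a minimal PyTorch sketch: per-source embedding tables stay local, and only the transformer body is shared across workers. Module names and sizes are my own placeholders, not the DEPT implementation.

```python
# Minimal sketch of decoupled embeddings (illustrative only; names and
# sizes are placeholders, not the DEPT reference implementation).
import torch
import torch.nn as nn

class DecoupledLM(nn.Module):
    def __init__(self, vocab_sizes: dict[str, int], d_model: int = 256):
        super().__init__()
        # One embedding table per data source; these remain local and are
        # never communicated between workers.
        self.embeddings = nn.ModuleDict(
            {src: nn.Embedding(v, d_model) for src, v in vocab_sizes.items()}
        )
        # The transformer body is the only part that is synchronized.
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.body = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, token_ids: torch.Tensor, source: str) -> torch.Tensor:
        return self.body(self.embeddings[source](token_ids))

model = DecoupledLM({"en": 32_000, "de": 32_000})
hidden = model(torch.randint(0, 32_000, (2, 16)), source="en")
print(hidden.shape)  # torch.Size([2, 16, 256])
```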
MT-DAO: Multi-Timescale Distributed Adaptive Optimizers with Local Updates
ICLR 2026 (Top 3%)
Multi-timescale local adaptive optimization under bandwidth limits.
Alex Iacob, Andrej Jovanovic, Mher Safaryan, Meghdad Kurmanji, Lorenzo Sani, Samuel Horváth, William F. Shen, Xinchi Qiu, Nicholas D. Lane. ICLR 2026.
Uses multi-timescale momentum tracking to match DDP quality in local-update pre-training while reducing wall-clock time in low-communication settings.
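A toy sketch of the general pattern of tracking momentum at two timescales in local-update training: a fast momentum updated every local step, and a slow momentum updated only at synchronization over the averaged worker displacement. This illustrates the two-timescale structure only, not the MT-DAO update rules.

```python
# Toy sketch of momentum at two timescales in local-update training
# (illustrates the structure only; not the MT-DAO algorithm).
import torch

def local_step(params, grads, fast_m, lr=1e-2, beta_fast=0.9):
    # Fast momentum: updated on every local step of each worker.
    for p, g, m in zip(params, grads, fast_m):
        m.mul_(beta_fast).add_(g)
        p.add_(m, alpha=-lr)

def sync_step(worker_params, global_params, slow_m, outer_lr=1.0, beta_slow=0.9):
    # Slow momentum: updated only at synchronization, over the averaged
    # displacement of the workers from the last global model.
    for i, gp in enumerate(global_params):
        avg = torch.stack([w[i] for w in worker_params]).mean(0)
        slow_m[i].mul_(beta_slow).add_(gp - avg)  # averaged update direction
        gp.sub_(slow_m[i], alpha=outer_lr)        # outer step on the global model
        for w in worker_params:                   # restart local replicas
            w[i].copy_(gp)
```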
DES-LOC: Desynced Low Communication Adaptive Optimizers for Foundation Models
ICLR 2026 (Top 5%)
Desynced optimizer-state synchronization with convergence guarantees.
Alex Iacob, Lorenzo Sani, Mher Safaryan, Paris Giampouras, Samuel Horváth, Meghdad Kurmanji, Andrej Jovanovic, Preslav Aleksandrov, William F. Shen, Xinchi Qiu, Nicholas D. Lane. ICLR 2026.
Desynchronizes the synchronization of parameters and optimizer moments, yielding provably convergent, low-communication adaptive optimization for large-scale pre-training.
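Schematically, each piece of state gets its own synchronization period. The sketch below shows the shape of such a schedule for Adam-style state; the periods and plain averaging are placeholders, not the DES-LOC schedule or merging rules.

```python
# Schematic sketch of desynchronized sync periods per state component
# (periods and averaging here are placeholders, not the DES-LOC ones).
import torch.distributed as dist

P_PARAM, P_M1, P_M2 = 16, 64, 256  # hypothetical periods: params, 1st, 2nd moments

def maybe_sync(step, optimizer, params):
    world = dist.get_world_size()

    def average(tensors):
        for t in tensors:
            dist.all_reduce(t)
            t.div_(world)

    if step % P_PARAM == 0:
        average([p.data for p in params])
    state = [optimizer.state[p] for p in params if p in optimizer.state]
    if step % P_M1 == 0:  # first moments synced less often than params
        average([s["exp_avg"] for s in state])
    if step % P_M2 == 0:  # second moments synced least often
        average([s["exp_avg_sq"] for s in state])
```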
Worldwide Federated Training of Language Models
Best Paper (NeurIPS FL@FM 2024)
A hierarchical mixture-of-experts training approach.
Alex Iacob, Lorenzo Sani, Bill Marino, Preslav Aleksandrov, William F. Shen, Nicholas D. Lane. CoRR 2024.
Introduces WorldLM, a hierarchical mixture-of-experts approach to federated language-model training on naturally heterogeneous data.
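The hierarchical structure can be pictured as two-level aggregation: clients average within a region, and regions average into a world model. The sketch below uses plain averaging at both levels and omits WorldLM's informed merging; it shows the topology only.

```python
# Minimal sketch of two-level hierarchical aggregation (plain averaging at
# each level; WorldLM's informed merging is omitted).
import torch

def average(models):
    # Elementwise mean over a list of per-model parameter lists.
    return [torch.stack(ts).mean(0) for ts in zip(*models)]

def hierarchical_round(regions):
    # regions: list of regions, each a list of per-client parameter lists.
    regional = [average(clients) for clients in regions]  # within-region
    return average(regional)                              # across regions

clients = [[torch.randn(4, 4)] for _ in range(3)]
world = hierarchical_round([clients[:2], clients[2:]])
print(world[0].shape)  # torch.Size([4, 4])
```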