CV
Research CV with downloadable artifacts and concise role timelines.
Education
Oct 22--Present
PhD in Computer Science · University of Cambridge
- Third-year PhD Candidate in Machine Learning Systems. Advisor: Dr. Nicholas Lane.
- Specialization: Distributed Optimization for training foundation models across geo-distributed datacenters under strict bandwidth/VRAM constraints.
Oct 21--Jul 22
MPhil Advanced Computer Science · University of Cambridge
- Distinction (Rank 5/36, 84%). Advisor: Dr. Nicholas Lane.
- Dissertation: The Local-Global Trade-off in Federated Learning. Published at EuroMLSys.
Oct 18--Jul 21
BSc Computer Science · King's College London
- First-Class Honours (85%). Recipient of the Undergraduate Research Fellowship Award.
Experience
May 24--Present
Research Scientist · Flower Labs
- Model Training: Pre-trained 1B--13B-parameter models on hundreds of billions of tokens, outperforming baselines such as SmolLM2 and OLMo2 on downstream tasks at matched compute.
- Engineering: Developed the aggregation layer for Flower Photon, unifying 32 H100 GPUs across four geo-distributed datacenters (US/EU) via Local SGD-based methods.
- Research: Published DEPT, which enables arbitrary vocabulary scaling under limited memory/bandwidth. Created two local-update adaptive optimizers with convergence guarantees for heterogeneous loss functions, establishing a new state of the art.
- Optimization: Migrated the pre-training codebase to torchtitan and torchft, achieving a 4x wall-clock speedup over the baseline Python implementation.
Jan 23--Present
Teaching Assistant · University of Cambridge
- Authored the primary lab codebase for the university's first Federated Learning course.
Jun 20--Oct 22
Undergraduate Research Fellow · King's College London
- Applied computational social choice techniques to multi-agent system decision-making.
Skills
Tools
PyTorch, torchtitan, torchft, Docker, WandB, Slurm, Hydra
Distributed ML
4D Parallelism (FSDP/TP/PP/SP), Multi-Node Training, Local SGD
Research Areas
Optimizer Design, Mixture-of-Experts, Distributed LLM Pre-training, Unlearning
Selected Publications
- Oral, Top 1.8%, ICLR-25: Alex Iacob et al. "DEPT: Decoupled Embeddings for Pre-training Language Models".
- Top 3%, ICLR-26: Alex Iacob et al. "MT-DAO: Multi-Timescale Distributed Adaptive Optimizers with Local Updates".
- Top 5%, ICLR-26: Alex Iacob et al. "DES-LOC: Desynced Low Communication Adaptive Optimizers for Training Foundation Models".
- Best Paper, FL@FM: Alex Iacob et al. "Worldwide Federated Training of Language Models". Published at the Federated Learning in the Age of Foundation Models (FL@FM) workshop at NeurIPS 2024.