Babak Ehteshami Bejnordi

Research Scientist @ Qualcomm AI Research

I am a Senior Staff Research Scientist and Manager at Qualcomm AI Research, Amsterdam, where I lead a team focused on efficient LLM architectures. My primary research focus is efficient deep learning for Large Language Models (LLMs) and computer vision. My recent work spans efficient LLM deployment, efficient (latent) reasoning, Mixture of Experts, multi-task learning, and continual learning. From 2019 to 2023, I organized the Qualcomm Innovation Fellowship Program in Europe.

I obtained my PhD at the Diagnostic Image Analysis Group, Radboud University, the Netherlands, where I worked on the development of ML algorithms for breast cancer diagnostics. During my PhD, I also organized the CAMELYON16 challenge.

From June to November 2016, I was a visiting researcher at Harvard University, where I worked on applying deep learning to computational pathology, with a focus on tumor-associated stroma as a prognostic biomarker in breast cancer, in collaboration with researchers from Harvard, the NIH, and the Mayo Clinic.

Latest Research

Reasoning on the Edge (Qualcomm Tech Report '26)
Reasoning in small LLMs using LoRA adapters, combined with supervised fine-tuning and RL-based budget forcing.

Latent Reasoning (ICLR '26)
Distilling knowledge from a compressed KV-cache of a teacher into a latent-reasoning student.

Cache-MoE (TMLR '25)
Efficient Mixture-of-Experts for mobile devices with limited DRAM.

Refactor LLM into MoE (NeurIPS '24)
Refactoring LLMs as router-decoupled mixtures of experts with system co-design.

LLM-to-SLM (ICML '24)
Think Big, Generate Quick: LLM-to-SLM for fast autoregressive decoding.

Scalarization for MTL (NeurIPS '23)
Scalarization for Multi-Task and Multi-Domain Learning at scale.

InterroGate for MTL (BMVC '24)
Learning to share, specialize, and prune representations for Multi-Task Learning.