Portfolio

Cache-MoE

Expo Demo @NeurIPS'24

Efficient Mixture-of-Experts for mobile devices with limited DRAM.

MoE On-device Caching LLM Efficiency

Paper Demo

Refactor LLM into MoE

Published @NeurIPS'24

Refactorizing LLMs as router-decoupled mixture of experts with system co-design.

MoE Batched-inference Dynamic sparsity Decoupled routing LLM Efficiency

Paper Code

LLM-to-SLM

Published @ICML'24: ES-FoMo II

Think Big, Generate Quick: LLM-to-SLM for fast autoregressive decoding.

Hybrid LLM Fast decoding LLM Efficiency LLM to SLM

Paper Code

InterroGate for MTL

Published @BMVC'24

Learning to share, specialize, and prune representations for Multi-task Learning.

Multi-task Learning Inference efficiency Gated Networks Channel sparsity

Paper Presentation

Scalarization for MTL

Published @NeurIPS'23

Scalarization for Multi-Task and Multi-Domain Learning at scale.

Population-based Training Scalarization Multi-Task Learning Multi-Domain Learning

Paper Presentation

MSViT

Published @ICCV'23: NIVT

Dynamic mixed-scale tokenization for vision transformers.

Conditional compute Mixed-scale Efficient CV Tokenization

Paper Code

Salisa

Published @ECCV'22

Saliency-based input sampling for efficient video object detection.

Efficient Inference VOD Video Object Detection Spatial Transformer Network

Paper Appendix

Single-gated MoE

Published @BMVC'22

Single-gate Mixture of Experts (MoE) with early exiting for convolutional architectures.

MoE Anytime Inference On-device Early-exiting

Paper Presentation

FrameExit

Published @CVPR'21 (Oral)

Conditional Early Exiting for Efficient Video Recognition.

Early Exiting Video Recognition Gating Network Efficient Recognition

Paper Code

SkipConv

Published @CVPR'21

Skip-Convolutions for efficient video processing.

Residual Convolutions Efficient Video Processing Skip-Convolution

Paper Code

Channel Gating for Continual Learning

Published @CVPR'20 (Oral)

Conditional channel gated networks for task-aware continual learning.

Continual Learning Channel-Gating Task-aware Dynamic sparsity

Paper Presentation

Channel Gating with Batch-shaping

Published @ICLR'20

Batch-shaping for learning conditional channel gated networks.

Batch-shaping Channel Gating Dynamic sparsity

Paper Code

Babak Ehteshami Bejnordi
