About

MD. Faysal Islam Fahad

Deep Learning Engineer & AI researcher focused on vision transformers and multimodal learning.

Vision Transformers · Self-Supervised Learning · Multimodal Reasoning

I am a Deep Learning Engineer and AI researcher working at the intersection of computer vision, multimodal learning, and large-scale model training. My work focuses on pushing transformer architectures into resource-constrained, real-world settings, from medical imaging to autonomous perception. I enjoy turning recent papers into production systems and shipping models that survive contact with real data.

Highlights

  • Trained ViT-B/16 from scratch on a custom 5M image dataset
  • Published at top-tier vision conferences (CVPR / ICCV workshop track)
  • Led on-device CV inference on Jetson Orin (sub-30 ms latency)
  • Open-source contributor with 1.5k+ GitHub stars across repos
06 — Experience

A timeline of the work.

Roles, programs and research positions that shaped how I think about AI.

Senior AI Researcher
Frontier AI Lab
Jan 2024 — Present
Remote

Leading research on efficient multimodal models and edge deployment.

  • Architected the lab's ViT distillation pipeline
  • Mentored 3 junior researchers
PyTorch · Triton · CUDA · Weights & Biases

Deep Learning Engineer
Vision Robotics Inc.
Mar 2022 — Dec 2023
Berlin / Remote

Shipped real-time perception models for warehouse robotics.

  • Reduced detection latency 3.5×
  • Owned the on-device CV stack on Jetson Orin
TensorRT · ONNX · C++ · Python

M.Sc. in Artificial Intelligence
Tech University
Sep 2020 — Feb 2022
Munich

Master's degree with thesis on self-supervised learning for medical imaging.

  • Graduated with distinction
  • Published a workshop paper
PyTorch · Pandas · LaTeX

04 — Skills

The deep learning toolbox.

Frameworks, models, and ops I use daily — calibrated by years of training, debugging and shipping.

Deep Learning

3 skills
  • Vision Transformers: 92%
  • Self-Supervised Learning: 88%
  • Diffusion Models: 85%

Computer Vision

3 skills
  • CNN Architectures: 93%
  • Object Detection: 90%
  • Image Segmentation: 88%

NLP & Multimodal

2 skills
  • Multimodal Models: 82%
  • LLM Fine-tuning: 80%

ML Frameworks

4 skills
  • PyTorch: 95%
  • Hugging Face: 90%
  • TensorFlow: 85%
  • JAX: 75%

Languages

3 skills
  • Python: 96%
  • TypeScript: 82%
  • CUDA: 70%

Tools & Ops

2 skills
  • Weights & Biases: 88%
  • Docker: 86%

Cloud

1 skill
  • AWS: 80%

Web

2 skills
  • FastAPI: 85%
  • Next.js: 80%