Based in Berlin, Germany

Bridging theoretical Deep Learning and production systems.

I am a Senior AI Research Scientist specializing in Physical AI, Self-Supervised Learning, and robust vision encoders. Previously, I conducted research at UC Berkeley (BAIR) and deployed models at Microsoft Research that handle millions of daily inferences.

Dr. Marcel Simon
Citations (h-index 14)
Top 1% Venues (TPAMI, ICCV)
Millions of Daily Inferences (Bing)
Open Source Stars

Advancing Physical AI & World Models

Architecting next-generation vision models at Nota AI. Designing video-distilled single-image encoders using Diffusion Transformers (DiT) to understand real-world physics without brittle external trackers.

AAAI Workshop 2026 (Oral Presentation)

Next-Frame Prediction as a Reliability-Aware Training Paradigm for Robust Vision Encoders

Foundation models deployed in dynamic domains suffer from critical reliability failures. We propose a lightweight paradigm that distills temporal knowledge from video into a standard single-image encoder, setting a new SOTA for DINO-style video distillation.

ICML Workshop 2025

Video Self-Distillation for Single-Image Encoders: A Step Toward Physically Plausible Perception

Most SSL methods train on static images, missing temporal cues. We introduce a video-distilled single-image encoder trained to predict the next-frame representation, injecting 3D spatial and temporal priors without optical flow.
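
A minimal sketch of the next-frame objective described above, in plain Python for readability. It is an illustration under stated assumptions, not the paper's implementation: the cosine-distance loss, the helper names (`next_frame_loss`, `training_signal`), and the separate `predict_next` head are all hypothetical choices, and a real setup would use a tensor library such as PyTorch with gradients flowing only through the student branch.

```python
import math


def cosine_similarity(u, v, eps=1e-12):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # eps guards against division by zero for all-zero vectors.
    return dot / max(norm_u * norm_v, eps)


def next_frame_loss(predicted_next, target_next):
    """Assumed loss: 1 - cosine similarity (0 when the directions match)."""
    return 1.0 - cosine_similarity(predicted_next, target_next)


def training_signal(encode_student, predict_next, encode_target,
                    frame_t, frame_t_plus_1):
    """Loss for one (frame, next frame) pair.

    encode_student: single-image encoder being trained (sees frame t only).
    predict_next:   hypothetical head mapping the embedding of frame t
                    to a guess of the embedding of frame t+1.
    encode_target:  encoder producing the target embedding of frame t+1,
                    treated as a constant (no gradient) in this sketch.
    """
    predicted = predict_next(encode_student(frame_t))
    target = encode_target(frame_t_plus_1)
    return next_frame_loss(predicted, target)


# Toy usage: "frames" are already vectors and all components are identity maps.
identity = lambda x: x
loss = training_signal(identity, identity, identity, [1.0, 2.0], [2.0, 4.0])
```

Because only a single frame is passed to the student, the encoder stays a standard single-image model at inference time; the video is needed only to supply the next-frame target during training.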

Selected Foundational Publications

IEEE TPAMI 2020 (Impact Factor 24.3)

The Whole is More Than Its Parts? From Explicit to Implicit Pose Normalization

Consolidated PhD findings. Demonstrated that "implicit" pooling methods developed at Berkeley outperform traditional "explicit" part models. Introduced the "Activation Flow" visualization.

ICCV 2017 (Rank A*)

Generalized Orderless Pooling Performs Implicit Salient Matching

Invented "Generalized Orderless Pooling" (Alpha-Pooling) during my time as a Visiting Research Scholar at UC Berkeley (ICSI / BAIR) with Prof. Trevor Darrell.

ICCV 2015 (Rank A*), ~500 Citations

Neural Activation Constellations: Unsupervised Part Model Discovery

Pioneered learning part models in a completely unsupervised manner, without part annotations or bounding boxes, achieving SOTA on fine-grained recognition datasets.

Engineering Scale & Industry Impact

Translating academic background into production-ready AI models. Writing clean, modular code used globally.

Microsoft Bing Visual Search

Trained and integrated an image classification model for animal species recognition directly into the Bing search engine. Powered the "Species Recognition" feature, successfully handling millions of daily inferences in production.

Production AI, High-Throughput Inference

Open Source: PyTorch-Wildlife

Core contributor to Microsoft AI for Earth's `CameraTraps` and `SpeciesClassification` platforms. Developed tools allowing conservationists globally to run MegaDetector and DeepFaune models.

PyTorch, ONNX, 950+ Stars

Technical Arsenal

Self-Supervised Learning (SSL), Physical AI, Diffusion Transformers (DiT), Foundation Models, Fine-Grained Classification, Python (Expert), PyTorch, TensorFlow / Keras, C++, MATLAB, ONNX, Docker, Azure, SLURM / Multi-GPU