Prof. Dr. Jürgen Gall

TRA Mathematics, Modelling and Simulation of Complex Systems

Lamarr Institute for Machine Learning and Artificial Intelligence

Department of Information Systems and Artificial Intelligence

Office 2.037

Friedrich-Hirzebruch-Allee 8

53115 Bonn

Germany

Phone: +49 228 73 69600

Research Interests

Action recognition and video understanding

Anticipation and forecasting

Human pose estimation

Job Offers/Theses

PhD and Postdoc positions are available. For more details, please contact me.

Several bachelor and master projects are available. The topics include action recognition, video understanding, anticipation, human pose estimation, representation learning, and computer vision for plant science, earth science, and neuroscience. If you are interested, do not hesitate to contact me.

The University of Bonn offers an excellent master's program with a focus on Graphics, Vision, Audio. More information, including the application procedure, can be found at the Institute of Computer Science.

Publications

Zhong Z., Martin M., Wu C., Schneider D., Diederichs F., Gall J., and Beyerer J., FlowNar: Scalable Streaming Narration for Long-Form Videos (PDF, Code), International Conference on Machine Learning (ICML'26), To appear.

Collart P., Gall J., Schnepf A., Pagel H., and Doorenbos L., Constrained Hybrid Modelling to Predict Microbial Dynamics and Organic Matter Turnover in Soil Systems (PDF), International Conference on Machine Learning (ICML'26), To appear.

Pallotta E., Azar S. M., Doorenbos L., Serdar O., Iqbal U., and Gall J., EgoControl: Controllable Egocentric Video Generation via 3D Full-Body Poses (PDF, Supplementary Material, Code/Model), IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR'26), To appear.

Doorenbos L., Spurio F., and Gall J., Video Panels for Long Video Understanding (PDF, Supplementary Material, Code), IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR'26), To appear.

Suleman H., Wasim S. T., Naseer M., and Gall J., StableMamba: Distillation-free Scaling of Large SSMs for Images and Videos (PDF), International Journal of Computer Vision, Vol. 134, No. 232, 2026.

Suryanto N., Naseer M., Li P., Wasim S. T., Yi J., Gall J., Ceravolo P., and Damiani E., RedSage: A Cybersecurity Generalist LLM (PDF, Code/Data), International Conference on Learning Representations (ICLR'26), 2026.

Bahrami E., Zatsarynna O., Francesca G., and Gall J., Towards Generalizing Temporal Action Segmentation to Unseen Views (PDF), International Journal of Computer Vision, Springer, Vol. 134, No. 180, 2026.

Denninger L., Azar S. M., and Gall J., CamC2V: Context-aware Controllable Video Generation (PDF, Supplementary Material, Code), International Conference on 3D Vision, 2026.

Zhong Z., Martin M., Schneider D., Lerch D. J., Wu C., Diederichs F., Gall J., and Beyerer J., Scalable Video Action Anticipation with Cross Linear Attentive Memory (PDF, Supplementary Material, Code), IEEE/CVF Winter Conference on Applications of Computer Vision (WACV'26), 8113-8123, 2026.

Ding S., Schneider L., Cordts M., and Gall J., ADA-Track++: End-to-End Multi-Camera 3D Multi-Object Tracking with Alternating Detection and Association (PDF, Code), IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 48, No. 1, 482-499, 2026. ©IEEE

Shams Eddin M. H., Zhang Y., Kollet S., and Gall J., RiverMamba: A State Space Model for Global River Discharge and Flood Forecasting (PDF, Supplementary Material, Code, Data), Conference on Neural Information Processing Systems (NeurIPS'25), 2025.

Wasim S. T., Suleman H., Zatsarynna O., Naseer M., and Gall J., MixANT: Observation-dependent Memory Propagation for Stochastic Dense Action Anticipation (PDF, Supplementary Material), International Conference on Computer Vision (ICCV'25), 2025.

Gökay U., Spurio F., Bach D. R., and Gall J., Skeleton Motion Words for Unsupervised Skeleton-based Temporal Action Segmentation (PDF, Supplementary Material, Code), International Conference on Computer Vision (ICCV'25), 2025.

Li S., Cheng Z., Li R., Li S., Gall J., Xu X., and Yang X., Global-Aware Monocular Semantic Scene Completion with State Space Models (PDF, Supplementary Material, Code), International Conference on Computer Vision (ICCV'25), 2025.

Li S., Burke M., Ramamoorthy S., and Gall J., Learning a Neural Association Network for Self-supervised Multi-Object Tracking (PDF, Supplementary Material), British Machine Vision Conference (BMVC'25), 2025.

Herrmann P., Bieshaar M., Mack D., Herzog R., and Gall J., Self-Intersection-Aware 3D Human Motion Generation Using an Efficient Human Sphere Proxy (PDF, Supplementary Material, Code), British Machine Vision Conference (BMVC'25), 2025.

Yi J., Lopez G., Hadir S., Weyler J., Klingbeil L., Deichmann M., Gall J., and Seidel S., Non-invasive diagnosis of nutrient deficiencies in winter wheat and winter rye using UAV-based RGB images (PDF, Data), Computers and Electronics in Agriculture, Volume 239, Part A, Elsevier, 2025.

Thoduka S., Houben S., Gall J., and Plöger P., Enhancing Video-Based Robot Failure Detection Using Task Knowledge (PDF, Code/Data), European Conference on Mobile Robots (ECMR'25), 2025. ©IEEE

Veeramacheneni L., Wolter M., Kuehne H., and Gall J., Canonical Rank Adaptation: An Efficient Fine-Tuning Strategy for Vision Transformers (PDF, Code), International Conference on Machine Learning (ICML'25), 2025.

Yi J., Wasim S. T., Luo Y., Naseer M., and Gall J., Video-Panda: Parameter-efficient Alignment for Encoder-free Video-Language Models (PDF, Supplementary Material, Code), IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR'25), 24119-24128, 2025. ©IEEE

Zatsarynna O., Bahrami E., Abu Farha Y., Francesca G., and Gall J., MANTA: Diffusion Mamba for Efficient and Effective Stochastic Long-Term Dense Anticipation (PDF, Supplementary Material, Code), IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR'25), 3438-3448, 2025. ©IEEE

Pallotta E., Azar S. M., Li S., Zatsarynna O., and Gall J., SyncVP: Joint Diffusion for Synchronous Multi-Modal Video Prediction (PDF, Supplementary Material, Code), IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR'25), 13787-13797, 2025. ©IEEE

Shaker A., Wasim S. T., Khan S., Gall J., and Khan F. S., GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model (PDF, Supplementary Material, Code), IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR'25), 14912-14922, 2025. ©IEEE

Velayudhan D., Ahmed A., Alansari M., Gour N., Behouch A., Hassan T., Wasim S. T., Maalej N., Naseer M., Gall J., Bennamoun M., Damiani E., and Werghi N., STING-BEE: Towards Vision-Language Model for Real-World X-ray Baggage Security Inspection (PDF, Supplementary Material, Code, Data), IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR'25), 20767-20777, 2025. ©IEEE

Leonhardt L., Gall J., and Roscher R., ClimSat - A diffusion autoencoder model for climate-conditional satellite image editing (PDF, Code), Science of Remote Sensing, Vol 11, 2025.

Patnala A., Schultz M. G., and Gall J., BERT Bi-modal self-supervised learning for crop classification using Sentinel-2 and Planetscope (PDF), Frontiers in Remote Sensing, Vol 6, 2025.

Veeramacheneni L., Wolter M., Kuehne H., and Gall J., Fréchet Wavelet Distance: A Domain-Agnostic Metric for Image Generation (PDF, Supplementary Material, Code), International Conference on Learning Representations (ICLR'25), 2025.

Romeo L., Marani R., Perri A.G., and Gall J., Multi-modal temporal action segmentation for manufacturing scenarios (PDF, Data), Engineering Applications of Artificial Intelligence, Vol 148, 2025.

Spurio F., Bahrami E., Francesca G., and Gall J., Hierarchical Vector Quantization for Unsupervised Action Segmentation (PDF, Code), AAAI Conference on Artificial Intelligence (AAAI'25), 6996-7005, 2025. ©AAAI Press

Li S., Zanjani F.G., Ben Yahia H., Asano Y.M., Gall J., and Habibian A., VaLID: Variable-Length Input Diffusion for Novel View Synthesis (PDF), IEEE/CVF Winter Conference on Applications of Computer Vision (WACV'25), 2240-2249, 2025. ©IEEE

Shams Eddin M. H. and Gall J., Identifying Spatio-Temporal Drivers of Extreme Events (PDF, Video, Code, Data), Conference on Neural Information Processing Systems (NeurIPS'24), 2024.

Yi J., Luo Y., Deichmann M., Schaaf G., and Gall J., MV-Match: Multi-View Matching for Domain-Adaptive Identification of Plant Nutrient Deficiencies (PDF, Supplementary Material, Code), British Machine Vision Conference (BMVC'24), 2024.

Patnala A., Stadtler S., Schultz M. G., and Gall J., Bi-modal contrastive learning for crop classification using Sentinel-2 and Planetscope (PDF), Frontiers in Remote Sensing, Vol 5, 2024.

Zatsarynna O., Bahrami E., Abu Farha Y., Francesca G., and Gall J., Gated Temporal Diffusion for Stochastic Long-Term Dense Anticipation (PDF, Code), European Conference on Computer Vision (ECCV'24), 2024. ©Springer-Verlag

Mueller F., Tanke J., and Gall J., Massively Multi-Person 3D Human Motion Forecasting with Scene Context (PDF, Code), Workshop and Competition on Affective Behavior Analysis in-the-wild, LNCS 15637, 130-147, 2024. ©Springer-Verlag

Luo Y., Yi J., Abu Farha Y., Wolter M., and Gall J., Rethinking temporal self-similarity for repetitive action counting (PDF), IEEE International Conference on Image Processing (ICIP'24), 2024. ©IEEE

Ding S., Schneider L., Cordts M., and Gall J., ADA-Track: End-to-End Multi-Camera 3D Multi-Object Tracking with Alternating Detection and Association (PDF, Supplementary Material, Code), IEEE Conference on Computer Vision and Pattern Recognition (CVPR'24), 2024. ©IEEE

Li R., Li S., Chen X., Ma T., Gall J., and Liang J., TFNet: Exploiting Temporal Cues for Fast and Accurate LiDAR Semantic Segmentation (PDF), Workshop on Autonomous Driving, 2024. ©IEEE

Thoduka S., Hochgeschwender N., Gall J., and Plöger P., A Multimodal Handover Failure Detection Dataset and Baselines (PDF, Code/Data), IEEE International Conference on Robotics and Automation (ICRA), 2024. ©IEEE

Shams Eddin M. H. and Gall J., Focal-TSMP: deep learning for vegetation health prediction and agricultural drought assessment from a regional climate simulation (PDF, Video, Code, Data), Geoscientific Model Development, Vol 17, 2987-3023, 2024.

Storm H., Seidel S.J., Klingbeil L., Ewert F., Vereecken H., Amelung W., Behnke S., Bennewitz M., Börner J., Döring T., Gall J., Mahlein A.-K., McCool C., Rascher U., Wrobel S., Schnepf A., Stachniss C., and Kuhlmann H., Research priorities to leverage smart digital technologies for sustainable crop production (PDF), European Journal of Agronomy, Vol 156, Elsevier, 2024.

Sushko V., Zhang D., Gall J., and Khoreva A., Generating novel scene compositions from single images and videos (PDF, Code), Computer Vision and Image Understanding, Vol 239, Elsevier, 2024. ©Elsevier

Wang L., Gall J., Chin T.-J., Sato I., and Chellappa R. (Eds.), Special Issue on ACCV 2022 (Issue), International Journal of Computer Vision, Springer, 2024.

Leonhardt L., Gall J., and Roscher R., Leveraging Bioclimatic Context for Supervised and Self-Supervised Land Cover Classification (PDF, Code), DAGM German Conference on Pattern Recognition (DAGM GCPR'23), Springer, LNCS 14264, 227-242, 2024. ©Springer-Verlag

Tanke J., Kwon O.-H., Mueller F., Doering A., and Gall J., Humans in Kitchens: A Dataset for Multi-Person Human Motion Forecasting with Scene Context (PDF, Supplementary Material, Data), Neural Information Processing Systems Datasets and Benchmarks Track, 2023.

Bahrami E., Francesca G., and Gall J., How Much Temporal Long-Term Context is Needed for Action Segmentation? (PDF, Supplementary Material, Code), International Conference on Computer Vision (ICCV'23), 10317-10327, 2023. ©IEEE

Tanke J., Zhang L., Zhao A., Tang C., Cai Y., Wang L., Wu P.-C., Gall J., and Keskin C., Social Diffusion: Long-term Multiple Human Motion Anticipation (PDF, Supplementary Material, Code), International Conference on Computer Vision (ICCV'23), 9567-9577, 2023. ©IEEE

Sushko V., Wang R., and Gall J., Smoothness Similarity Regularization for Few-Shot GAN Adaptation (PDF, Supplementary Material), International Conference on Computer Vision (ICCV'23), 7050-7059, 2023. ©IEEE

Ding S., Rehder E., Schneider L., Cordts M., and Gall J., 3DMOTFormer: Graph Transformer for Online 3D Multi-Object Tracking (PDF, Supplementary Material, Code), International Conference on Computer Vision (ICCV'23), 9750-9760, 2023. ©IEEE

Doering A. and Gall J., A Gated Attention Transformer for Multi-Person Pose Tracking (PDF, Supplementary Material, Video, Video), International Workshop on Analysis and Modeling of Faces and Gestures, 3181-3190, 2023. ©IEEE

Li S., Li R., and Gall J., Semantic RGB-D Image Synthesis (PDF), International Workshop on Representation Learning with Very Limited Images, 944-952, 2023. ©IEEE

Zatsarynna O. and Gall J., Action Anticipation with Goal Consistency (PDF, Code), IEEE International Conference on Image Processing (ICIP'23), 1630-1634, 2023. ©IEEE

Li P., Ding S., Chen X., Hanselmann N., Cordts M., and Gall J., PowerBEV: A Powerful Yet Lightweight Framework for Instance Prediction in Bird's-Eye View (PDF, Supplementary Material, Code), International Joint Conference on Artificial Intelligence (IJCAI'23), 1080-1088, 2023. ©IJCAI

Shams Eddin M. H., Roscher R., and Gall J., Location-Aware Adaptive Normalization: A Deep Learning Approach for Wildfire Danger Forecasting (PDF, Code), IEEE Transactions on Geoscience and Remote Sensing, 61, 1-18, 2023. ©IEEE

Patnala A., Stadtler S., Schultz M. G., and Gall J., Generating Views Using Atmospheric Correction for Contrastive Self-Supervised Learning of Multispectral Images (PDF), IEEE Geoscience and Remote Sensing Letters, 20, 1-5, 2023. ©IEEE

Shang L., Wang J., Schäfer D., Heckelei T., Gall J., Appel F., and Storm H., Surrogate Modelling of a Detailed Farm-level Model using Deep Learning (PDF, Data/Code), Journal of Agricultural Economics, 1-26, 2023.

Wang L., Gall J., Chin T.-J., Sato I., and Chellappa R. (Eds.), Computer Vision - 16th Asian Conference on Computer Vision (ACCV 2022) (Part I, Part II, Part III, Part IV, Part V, Part VI, Part VII), Springer, LNCS 13841-13847, 2023.

Bauckhage C., Förstner W., Gall J., Möller M., and Schwing A. (Eds.), Special Issue on Pattern Recognition (DAGM GCPR 2021) (Issue), International Journal of Computer Vision, Springer, 2023.

Sushko V., Zhang D., Gall J., and Khoreva A., One-Shot Synthesis of Images and Segmentation Masks (PDF, Supplementary Material, Code), Winter Conference on Applications of Computer Vision (WACV'23), 6274-6283, 2023. ©IEEE

Grund C., Tanke J., and Gall J., ElliPose: Stereoscopic 3D Human Pose Estimation by Fitting Ellipsoids (PDF, Code), Winter Conference on Applications of Computer Vision (WACV'23), 2870-2880, 2023. ©IEEE

Li S., Abu Farha Y., Liu Y., Cheng M.-M., and Gall J., MS-TCN++: Multi-Stage Temporal Convolutional Network for Action Segmentation (PDF, MS-TCN Code, MS-TCN++ Code), IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 45, No. 6, 6647-6658, 2023. ©IEEE

Souri Y., Fayyaz M., Minciullo L., Francesca G., and Gall J., Fast Weakly Supervised Action Segmentation Using Mutual Consistency (PDF, Code), IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 44, No. 10, 6196-6208, 2022. ©IEEE

Sushko V., Schönfeld E., Zhang D., Gall J., Schiele B., and Khoreva A., OASIS: Only Adversarial Supervision for Semantic Image Synthesis (PDF, Code), International Journal of Computer Vision, Springer, Vol. 130, No. 12, 2903-2923, 2022.

Souri Y., Abu Farha Y., Bahrami E., Francesca G., and Gall J., Robust Action Segmentation from Timestamp Supervision (PDF, Supplementary Material, Code), British Machine Vision Conference (BMVC'22), 2022.

Li S., Cheng M.-M., and Gall J., Dual Pyramid Generative Adversarial Networks for Semantic Image Synthesis (PDF, Supplementary Material, Code), British Machine Vision Conference (BMVC'22), 2022.

Pourheydari S., Bahrami E., Fayyaz M., Francesca G., Noroozi M., and Gall J., TaylorSwiftNet: Taylor Driven Temporal Modeling for Swift Future Frame Prediction (PDF, Supplementary Material), British Machine Vision Conference (BMVC'22), 2022.

Fayyaz M., Koohpayegani S. A., Rezaei-Jafari F., Sengupta S., Vaezi-Joze H.-R., Sommerlade E., Pirsiavash H., and Gall J., Adaptive Token Sampling for Efficient Vision Transformers (PDF, Supplementary Material, Code), European Conference on Computer Vision (ECCV'22), Springer, LNCS 13671, 396-414, 2022. ©Springer-Verlag

Behrmann N., Golestaneh S. A., Kolter Z., Gall J., and Noroozi M., Unified Fully and Timestamp Supervised Temporal Action Segmentation via Sequence to Sequence Translation (PDF, Supplementary Material, Code), European Conference on Computer Vision (ECCV'22), Springer, LNCS 13695, 52-68, 2022. ©Springer-Verlag

Li R., Tanke J., Vo M., Zollhofer M., Gall J., Kanazawa A., and Lassner C., TAVA: Template-free Animatable Volumetric Actors (PDF, Supplementary Material, Code), European Conference on Computer Vision (ECCV'22), Springer, LNCS 13692, 419-436, 2022. ©Springer-Verlag

Zatsarynna O., Abu Farha Y., and Gall J., Self-supervised Learning for Unintentional Action Prediction (PDF, Supplementary Material), DAGM German Conference on Pattern Recognition (DAGM GCPR'22), Springer, LNCS 13485, 429-444, 2022. ©Springer-Verlag

Ding S., Rehder E., Schneider L., Cordts M., and Gall J., End-to-End Single Shot Detector using Graph-based Learnable Duplicate Removal (PDF, Supplementary Material), DAGM German Conference on Pattern Recognition (DAGM GCPR'22), Springer, LNCS 13485, 375-389, 2022. ©Springer-Verlag

Doering A., Chen D., Zhang S., Schiele B., and Gall J., PoseTrack21: A Dataset for Person Search, Multi-Object Tracking and Multi-Person Pose Tracking (PDF, Supplementary Material, Data), IEEE Conference on Computer Vision and Pattern Recognition (CVPR'22), 20931-20940, 2022. ©IEEE

Chen D., Doering A., Zhang S., Yang J., Gall J., and Schiele B., Keypoint Message Passing for Video-based Person Re-Identification (PDF), AAAI Conference on Artificial Intelligence (AAAI), 239-247, 2022. ©AAAI

Hoffmann D., Behrmann N., Gall J., Brox T., and Noroozi M., Ranking Info Noise Contrastive Estimation: Boosting Contrastive Learning via Ranked Positives (PDF, Code), AAAI Conference on Artificial Intelligence (AAAI), 897-905, 2022. ©AAAI

Li S., Liu Y., and Gall J., Rethinking 3D LiDAR Point Cloud Segmentation (PDF, Code), IEEE Transactions on Neural Networks and Learning Systems, 2022. ©IEEE

Li S., Chen X., Liu Y., Dai D., Stachniss C., and Gall J., Multi-scale Interaction for Real-time LiDAR Data Segmentation on an Embedded Platform (PDF, Code), IEEE Robotics and Automation Letters (RA-L), Vol. 7, No. 2, 738-745, 2022. ©IEEE

Tanke J., Zaveri C., and Gall J., Intention-based Long-Term Human Motion Anticipation (PDF, Supplementary Material), International Conference on 3D Vision (3DV'21), 596-605, 2021. ©IEEE

Bauckhage C., Gall J., and Schwing A. (Eds.), Pattern Recognition - 43rd DAGM German Conference on Pattern Recognition (DAGM GCPR 2021) (Book), Image Processing, Computer Vision, Pattern Recognition, and Graphics, Springer, LNCS 13024, 2021.

Souri Y., Abu Farha Y., Despinoy F., Francesca G., and Gall J., FIFA: Fast Inference Approximation for Action Segmentation (PDF, Supplementary Material, Video, Presentation), DAGM German Conference on Pattern Recognition (DAGM GCPR'21), Springer, LNCS 13024, 282-296, 2021. ©Springer-Verlag

Li S., Zhou Y., Yi J., and Gall J., Spatial-Temporal Consistency Network for Low-Latency Trajectory Forecasting (PDF, Supplementary Material), International Conference on Computer Vision (ICCV'21), 1920-1929, 2021. ©IEEE

Behrmann N., Fayyaz M., Gall J., and Noroozi M., Long Short View Feature Decomposition via Contrastive Video Representation Learning (PDF, Supplementary Material), International Conference on Computer Vision (ICCV'21), 9224-9233, 2021. ©IEEE

Biswas S. and Gall J., Multiple Instance Triplet Loss for Weakly Supervised Multi-Label Action Localisation of Interacting Persons (PDF), Understanding Social Behavior in Dyadic and Small Group Interactions Workshop, 2159-2167, 2021. ©IEEE

Thoduka S., Gall J., and Plöger P., Using Visual Anomaly Detection for Task Execution Monitoring (PDF, Code/Data), IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 4604-4610, 2021. ©IEEE

Chen X., Li S., Mersch B., Wiesmann L., Gall J., Behley J., and Stachniss C., Moving Object Segmentation in 3D LiDAR Data: A Learning-based Approach Exploiting Sequential Data (PDF, Code), IEEE Robotics and Automation Letters (RA-L), Vol. 6, No. 4, 6529-6536, 2021. ©IEEE

Behley J., Garbade M., Milioto A., Quenzel J., Behnke S., Gall J., and Stachniss C., Towards 3D LiDAR-based Semantic Scene Understanding of 3D Point Cloud Sequences -- The SemanticKITTI Dataset (Code/Data), The International Journal of Robotics Research, Vol. 40, No. 8-9, 959-967, 2021. ©SAGE Journals

Li Z., Abu Farha Y., and Gall J., Temporal Action Segmentation from Timestamp Supervision (PDF, Supplementary Material, Code), IEEE Conference on Computer Vision and Pattern Recognition (CVPR'21), 8361-8370, 2021. ©IEEE

Fayyaz M., Bahrami E., Diba A., Noroozi M., Adeli E., van Gool L., and Gall J., 3D CNNs with Adaptive Temporal Feature Resolutions (PDF, Supplementary Material, Code), IEEE Conference on Computer Vision and Pattern Recognition (CVPR'21), 4729-4738, 2021. ©IEEE

Zatsarynna O., Abu Farha Y., and Gall J., Multi-Modal Temporal Convolutional Network for Anticipating Actions in Egocentric Videos (PDF), IEEE Workshop on Precognition: Seeing through the Future, 2249-2258, 2021. ©IEEE

Li S., Yi J., Abu Farha Y., and Gall J., Pose Refinement Graph Convolutional Network for Skeleton-based Action Recognition (PDF, Code), IEEE Robotics and Automation Letters (RA-L), Vol. 6, No. 2, 1028-1035, 2021. ©IEEE

Sushko V., Schönfeld E., Zhang D., Gall J., Schiele B., and Khoreva A., You Only Need Adversarial Supervision for Semantic Image Synthesis (PDF, Code), International Conference on Learning Representations (ICLR'21), 2021.

Behrmann N., Gall J., and Noroozi M., Unsupervised Video Representation Learning by Bidirectional Feature Prediction (PDF), Winter Conference on Applications of Computer Vision (WACV'21), 1669-1678, 2021. ©IEEE

Richard A., Lea C., Ma S., Gall J., De la Torre F., and Sheikh Y., Audio- and Gaze-driven Facial Animation of Codec Avatars (PDF, Video), Winter Conference on Applications of Computer Vision (WACV'21), 41-50, 2021. ©IEEE

Biswas S. and Gall J., Discovering Multi-Label Actor-Action Association in a Weakly Supervised Setting (PDF, Supplementary Material, Code), Asian Conference on Computer Vision (ACCV'20), Springer, LNCS 12626, 547-561, 2021. ©Springer-Verlag

Kwon O.-H., Tanke J., and Gall J., Recursive Bayesian Filtering for Multiple Human Pose Tracking from Multiple Cameras (PDF), Asian Conference on Computer Vision (ACCV'20), Springer, LNCS 12623, 438-453, 2021. ©Springer-Verlag

Yi J., Krusenbaum L., Unger P., Hüging H., Seidel S.J., Schaaf G., and Gall J., Deep Learning for Non-Invasive Diagnosis of Nutrient Deficiencies in Sugar Beet Using RGB Images (PDF, Data), Sensors, 20, 5893, 2020.

Abu Farha Y., Ke Q., Schiele B., and Gall J., Long-Term Anticipation of Activities with Cycle Consistency (PDF, Supplementary Material), DAGM German Conference on Pattern Recognition (GCPR'20), Springer, LNCS 12544, 159-173, 2021. ©Springer-Verlag

Zhang Y., Briq R., Tanke J., and Gall J., Adversarial Synthesis of Human Pose From Text (PDF, Supplementary Material), DAGM German Conference on Pattern Recognition (GCPR'20), Springer, LNCS 12544, 145-158, 2021. ©Springer-Verlag

Zatsarynna O., Sawatzky J., and Gall J., Discovering Latent Classes for Semi-Supervised Semantic Segmentation (PDF, Supplementary Material), DAGM German Conference on Pattern Recognition (GCPR'20), Springer, LNCS 12544, 202-217, 2021. ©Springer-Verlag

Wolter M., Gall J., and Yao A., Sequence Prediction using Spectral RNNs (PDF, Code), International Conference on Artificial Neural Networks (ICANN'20), Springer, LNCS 12396, 825-837, 2020. ©Springer-Verlag

Rafi U., Doering A., Leibe B., and Gall J., Self-supervised Keypoint Correspondences for Multi-Person Pose Estimation and Tracking in Videos (PDF, Supplementary Material), European Conference on Computer Vision (ECCV'20), Springer, LNCS 12365, 36-52, 2020. ©Springer-Verlag

Diba A., Fayyaz M., Sharma V., Paluri M., Gall J., Stiefelhagen R., and van Gool L., Large Scale Holistic Video Understanding (PDF, Supplementary Material, Data), European Conference on Computer Vision (ECCV'20), Springer, LNCS 12350, 593-610, 2020. ©Springer-Verlag

Fayyaz M. and Gall J., SCT: Set Constrained Temporal Transformer for Set Supervised Action Segmentation (PDF, Code), IEEE Conference on Computer Vision and Pattern Recognition (CVPR'20), 498-507, 2020. ©IEEE

Kuehne H., Richard A., and Gall J., A Hybrid RNN-HMM Approach for Weakly Supervised Temporal Action Segmentation (PDF, Code), IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 42, No. 4, 765-779, 2020. ©IEEE

Panareda Busto P., Iqbal A., and Gall J., Open Set Domain Adaptation for Image and Action Recognition (PDF, Supplementary Material, Slides, Code), IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 42, No. 2, 413-429, 2020. ©IEEE

Bruckschen L., Amft S., Tanke J., Gall J., and Bennewitz M., Detection of Generic Human-Object Interactions in Video Streams (PDF, Video), International Conference on Social Robotics (ICSR'19), Springer, LNCS 11876, 108-118, 2019. ©Springer-Verlag

Behley J., Garbade M., Milioto A., Quenzel J., Behnke S., Stachniss C., and Gall J., SemanticKITTI: A Dataset for Semantic Scene Understanding of LiDAR Sequences (PDF, Supplementary Material, Code/Data), International Conference on Computer Vision (ICCV'19), 9296-9306, 2019. ©IEEE

Ruiz A. H., Gall J., and Moreno-Noguer F., Human Motion Prediction via Spatio-Temporal Inpainting (PDF), International Conference on Computer Vision (ICCV'19), 7133-7142, 2019. ©IEEE

Abu Farha Y. and Gall J., Uncertainty-Aware Anticipation of Activities (PDF), International Workshop on Human Behaviour Understanding, 1197-1204, 2019. ©IEEE

Richard A., Iqbal A., and Gall J., Enhancing Temporal Action Localization with Transfer Learning from Action Recognition (PDF), Workshop and Challenge on Comprehensive Video Understanding in the Wild, 1533-1540, 2019. ©IEEE

Sawatzky J., Banerjee D., and Gall J., Harvesting Information from Captions for Weakly Supervised Semantic Segmentation (PDF), Workshop on Cross-Modal Learning in Real World, 4481-4490, 2019. ©IEEE

Iqbal A. and Gall J., Level Selector Network for Optimizing Accuracy-Specificity Trade-offs (PDF), International Workshop on Large Scale Holistic Video Understanding, 1466-1473, 2019. ©IEEE

Panareda Busto P. and Gall J., Joint Viewpoint and Keypoint Estimation with Real and Synthetic Data (PDF, Code, Supplementary Material), German Conference on Pattern Recognition (GCPR'19), Springer, LNCS 11824, 107-121, 2019. ©Springer-Verlag

Tanke J. and Gall J., Iterative Greedy Matching for 3D Human Pose Tracking from Multiple Views (PDF, Code), German Conference on Pattern Recognition (GCPR'19), Springer, LNCS 11824, 537-550, 2019. ©Springer-Verlag

Biswas S., Souri Y., and Gall J., Hierarchical Graph-RNNs for Action Detection of Multiple Activities (PDF), IEEE International Conference on Image Processing (ICIP'19), 1-5, 2019. ©IEEE

Thoker F. and Gall J., Cross-modal Knowledge Distillation for Action Recognition (PDF), IEEE International Conference on Image Processing (ICIP'19), 6-10, 2019. ©IEEE

Chen Y.-T., Garbade M., and Gall J., 3D Semantic Scene Completion From a Single Depth Image using Adversarial Training (PDF), IEEE International Conference on Image Processing (ICIP'19), 1835-1839, 2019. ©IEEE

Abu Farha Y. and Gall J., MS-TCN: Multi-Stage Temporal Convolutional Network for Action Segmentation (PDF, Code), IEEE Conference on Computer Vision and Pattern Recognition (CVPR'19), 3570-3579, 2019. ©IEEE

Sawatzky J., Souri Y., Grund C., and Gall J., What Object Should I Use? - Task Driven Object Detection (PDF, Supplementary Material, Code/Data), IEEE Conference on Computer Vision and Pattern Recognition (CVPR'19), 7597-7606, 2019. ©IEEE

Kukleva A., Kuehne H., Sener F., and Gall J., Unsupervised Learning of Action Classes with Continuous Temporal Embedding (PDF, Supplementary Material, Code), IEEE Conference on Computer Vision and Pattern Recognition (CVPR'19), 12058-12066, 2019. ©IEEE

Garbade M., Chen Y.-T., Sawatzky J., and Gall J., Two Stream 3D Semantic Scene Completion (PDF), Multimodal Learning and Applications Workshop, 416-425, 2019. ©IEEE

Sabokrou M., Pourreza M., Fayyaz M., Entezari R., Fathy M., Gall J., and Adeli E., AVID: Adversarial Visual Irregularity Detection (PDF, Code), Asian Conference on Computer Vision (ACCV'18), Springer, LNCS 11269, 169-184, 2018. ©Springer-Verlag

Sawatzky J., Garbade M., and Gall J., Ex Paucis Plura: Learning Affordance Segmentation from Very Few Examples (PDF), German Conference on Pattern Recognition (GCPR'18), Springer, LNCS 11366, 488-505, 2018. ©Springer-Verlag

Briq R., Moeller M., and Gall J., Convolutional Simplex Projection Network for Weakly Supervised Semantic Segmentation (PDF, Code), British Machine Vision Conference (BMVC'18), 2018.

Doering A., Iqbal U., and Gall J., Joint Flow: Temporal Flow Fields for Multi Person Tracking (PDF), British Machine Vision Conference (BMVC'18), 2018.

Rafi U., Gall J., and Leibe B., Direct Shot Correspondence Matching (PDF), British Machine Vision Conference (BMVC'18), 2018.

Iqbal U., Molchanov P., Breuel T., Gall J., and Kautz J., Hand Pose Estimation via Latent 2.5D Heatmap Regression (PDF), European Conference on Computer Vision (ECCV'18), Springer, LNCS 11215, 125-143, 2018. ©Springer-Verlag

Diba A., Fayyaz M., Sharma V., Arzani M., Yousefzadeh R., Gall J., and van Gool L., Spatio-Temporal Channel Correlation Networks for Action Classification (PDF), European Conference on Computer Vision (ECCV'18), Springer, LNCS 11208, 299-315, 2018. ©Springer-Verlag

Iqbal U., Doering A., Yasin H., Krüger B., Weber A., and Gall J., A Dual-Source Approach for 3D Human Pose Estimation from Single Images (PDF, Code), Computer Vision and Image Understanding, Vol 172, 37-49, Elsevier, 2018. ©Elsevier

Moeller M., Loffeld O., Gall J., and Krahmer F., Are Good Local Minima Wide in Sparse Recovery? (PDF), International Workshop on Compressed Sensing applied to Radar, Multimodal Sensing, and Imaging (CoSeRa), 2018.

Richard A., Kuehne H., Iqbal A., and Gall J., NeuralNetwork-Viterbi: A Framework for Weakly Supervised Video Learning (PDF, Code), IEEE Conference on Computer Vision and Pattern Recognition (CVPR'18), 7386-7395, 2018. ©IEEE

Abu Farha Y., Richard A., and Gall J., When will you do what? - Anticipating Temporal Occurrences of Activities (PDF, Code, Video), IEEE Conference on Computer Vision and Pattern Recognition (CVPR'18), 5343-5352, 2018. ©IEEE

Richard A., Kuehne H., and Gall J., Action Sets: Weakly Supervised Action Segmentation without Ordering Constraints (PDF, Code), IEEE Conference on Computer Vision and Pattern Recognition (CVPR'18), 5987-5996, 2018. ©IEEE

Andriluka M., Iqbal U., Insafutdinov E., Pishchulin L., Milan A., Gall J., and Schiele B., PoseTrack: A Benchmark for Human Pose Estimation and Tracking (PDF, Data), IEEE Conference on Computer Vision and Pattern Recognition (CVPR'18), 5167-5176, 2018. ©IEEE

Biswas S. and Gall J., Structural Recurrent Neural Network (SRNN) for Group Activity Analysis (PDF), IEEE Winter Conference on Applications of Computer Vision (WACV'18), 1625-1632, 2018. ©IEEE

Panareda Busto P. and Gall J., Viewpoint Refinement and Estimation with Adapted Synthetic Data (PDF), Computer Vision and Image Understanding, Vol 169, 75-89, Elsevier, 2017. ©Elsevier

Kuehne H., Richard A., and Gall J., Weakly Supervised Learning of Actions from Transcripts (PDF), Computer Vision and Image Understanding, Special Issue on Language in Vision, Vol 163, 78-89, Elsevier, 2017. ©Elsevier

Sawatzky J. and Gall J., Adaptive Binarization for Weakly Supervised Affordance Segmentation (PDF), International Workshop on Assistive Computer Vision and Robotics, 1383-1391, 2017. ©IEEE

Panareda Busto P. and Gall J., Open Set Domain Adaptation (PDF, Supplementary Material, Slides, Code), International Conference on Computer Vision (ICCV'17), 754-763, 2017. ©IEEE (Marr Prize Honorable Mention)

Ji M., Gall J., Zheng H., Liu Y., and Fang L., SurfaceNet: An End-to-end 3D Neural Network for Multiview Stereopsis (PDF, Code), International Conference on Computer Vision (ICCV'17), 2326-2334, 2017. ©IEEE

Garbade M. and Gall J., Thinking Outside the Box: Spatial Anticipation of Semantic Categories (PDF, Images/Data/Code), British Machine Vision Conference (BMVC'17), 2017.

Iqbal A., Richard A., Kuehne H., and Gall J., Recurrent Residual Learning for Action Recognition (PDF), German Conference on Pattern Recognition (GCPR'17), Springer, LNCS 10496, 126-137, 2017. ©Springer-Verlag

Richard A., Kuehne H., and Gall J., Weakly Supervised Action Learning with RNN based Fine-to-Coarse Modeling (PDF, Code), IEEE Conference on Computer Vision and Pattern Recognition (CVPR'17), 1273-1282, 2017. ©IEEE

Iqbal U., Milan A., and Gall J., PoseTrack: Joint Multi-Person Pose Estimation and Tracking (PDF, Data/Code, PoseTrack Challenge), IEEE Conference on Computer Vision and Pattern Recognition (CVPR'17), 4654-4663, 2017. ©IEEE

Sawatzky J., Srikantha A., and Gall J., Weakly Supervised Affordance Detection (PDF, Data, Code), IEEE Conference on Computer Vision and Pattern Recognition (CVPR'17), 5197-5206, 2017. ©IEEE

Iqbal U., Garbade M., and Gall J., Pose for Action - Action for Pose (PDF, Code), IEEE International Conference on Automatic Face and Gesture Recognition (FG'17), 438-445, 2017. ©IEEE

Richard A. and Gall J., A Bag-of-Words Equivalent Recurrent Neural Network for Action Recognition (PDF, Code), Computer Vision and Image Understanding, Special Issue on Image and Video Understanding in Big Data, Vol 156, 79-91, Elsevier, 2017. ©Elsevier

Srikantha A. and Gall J., Weak Supervision for Detecting Object Classes from Activities (PDF, Images/Data), Computer Vision and Image Understanding, Special Issue on Image and Video Understanding in Big Data, Vol 156, 138-150, Elsevier, 2017. ©Elsevier

Kalliatakis G., Stamatiadis G., Ehsan S., Leonardis A., Gall J., Sticlaru A., and McDonald-Maier K., Evaluating Deep Convolutional Neural Networks for Material Classification (PDF), International Conference on Computer Vision Theory and Applications (VISAPP'17), 2017.

Wang L., Zhu C., Ye J., and Gall J. (Eds.), Special Issue on Deep Learning with Applications to Visual Representation and Analysis (Issue), Signal Processing: Image Communication, Vol. 47, 463-555, 2016.

Tzionas D. and Gall J., Reconstructing Articulated Rigged Models from RGB-D Video (PDF, Images/Videos/Data), International Workshop on Recovering 6D Object Pose, Springer, LNCS 9915, 620-633, 2016. ©Springer-Verlag

Iqbal U. and Gall J., Multi-Person Pose Estimation with Local Joint-to-Person Associations (PDF), International Workshop on Crowd Understanding, Springer, LNCS 9914, 627-642, 2016. ©Springer-Verlag

Sheikh R., Garbade M., and Gall J., Real-time Semantic Segmentation with Label Propagation (PDF), Workshop on Computer Vision for Road Scene Understanding and Autonomous Driving (CVRSUAD'16), Springer, LNCS 9914, 3-14, 2016. ©Springer-Verlag

Rafi U., Kostrikov I., Gall J., and Leibe B., An Efficient Convolutional Network for Human Pose Estimation (PDF, Code), British Machine Vision Conference (BMVC'16), 2016.

Tzionas D., Ballan L., Srikantha A., Aponte P., Pollefeys M., and Gall J., Capturing Hands in Action using Discriminative Salient Points and Physics Simulation (PDF, Images/Videos/Data), International Journal of Computer Vision, Special Issue on Human Activity Understanding from 2D and 3D data, Vol 118(2), 172-193, Springer, 2016. ©Springer-Verlag

Garbade M. and Gall J., Handcrafting vs Deep Learning: An Evaluation of NTraj+ Features for Pose Based Action Recognition (PDF), Workshop on New Challenges in Neural Computation and Machine Learning (NC²), 2016.

Araslanov N., Koo S., Gall J., and Behnke S., Efficient Single-view 3D Co-segmentation using Shape Similarity and Spatial Part Relations (PDF), German Conference on Pattern Recognition (GCPR'16), Springer, LNCS 9796, 297-308, 2016. ©Springer-Verlag

Richard A. and Gall J., Temporal Action Detection using a Statistical Language Model (PDF, Code), IEEE Conference on Computer Vision and Pattern Recognition (CVPR'16), 3131-3140, 2016. ©IEEE

Yasin H., Iqbal U., Krüger B., Weber A., and Gall J., A Dual-Source Approach for 3D Pose Estimation from a Single Image (PDF, Code), IEEE Conference on Computer Vision and Pattern Recognition (CVPR'16), 4948-4956, 2016. ©IEEE

Ristin M., Guillaumin M., Gall J., and van Gool L., Incremental Learning of Random Forests for Large-Scale Image Classification (PDF, Images/Code), IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 38, No. 3, 490-503, 2016. ©IEEE

Kuehne H., Gall J., and Serre T., An end-to-end generative framework for video segmentation and recognition (PDF), IEEE Winter Conference on Applications of Computer Vision (WACV'16), 2016. ©IEEE

Gall J., Gehler P., and Leibe B. (Eds.), Pattern Recognition - 37th German Conference on Pattern Recognition (GCPR'15) (Book), Image Processing, Computer Vision, Pattern Recognition, and Graphics, Springer, LNCS 9358, 2015.

Tzionas D. and Gall J., 3D Object Reconstruction from Hand-Object Interactions (PDF, Images/Data/Code), International Conference on Computer Vision (ICCV'15), 729-737, 2015. ©IEEE

Panareda Busto P., Liebelt L., and Gall J., Adaptation of Synthetic Data for Coarse-to-Fine Viewpoint Refinement (PDF), British Machine Vision Conference (BMVC'15), 2015.

Richard A. and Gall J., A BoW-equivalent Recurrent Neural Network for Action Recognition (PDF, Code), British Machine Vision Conference (BMVC'15), 2015.

Srikantha A. and Gall J., Human Pose as Context for Object Detection (PDF), British Machine Vision Conference (BMVC'15), 2015.

Rafi U., Gall J., and Leibe B., A Semantic Occlusion Model for Human Pose Estimation from a Single Depth Image (PDF, Data), ChaLearn Looking at People Workshop, 67-74, 2015. ©IEEE

Ristin M., Gall J., Guillaumin M., and van Gool L., From Categories to Subcategories: Large-scale Image Classification with Partial Class Label Refinement (PDF, Images/Data/Code), IEEE Conference on Computer Vision and Pattern Recognition (CVPR'15), 231-239, 2015. ©IEEE

Liu Y., Gall J., Loscos C., and Dai Q., Reconstruction of Human Motion, Digital Representations of the Real World: How to Capture, Model, and Render Visual Reality, Magnor M., Grau O., Sorkine-Hornung O., Theobalt C. (Eds.), CRC Press, 2015. ©Taylor & Francis Group

Dantone M., Gall J., Leistner C., and van Gool L., Body Parts Dependent Joint Regressors for Human Pose Estimation in Still Images (PDF, Supplemental material, Extended version, Images/Data), IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 36, No. 11, 2131-2143, 2014. ©IEEE

Eweiwi A., Cheema M., Bauckhage C., and Gall J., Efficient Pose-based Action Recognition (PDF, Code), Asian Conference on Computer Vision (ACCV'14), Springer, LNCS 9007, 428-443, 2015. ©Springer-Verlag

Tzionas D., Srikantha A., Aponte P., and Gall J., Capturing Hand Motion with an RGB-D Sensor, Fusing a Generative Model with Salient Points (PDF, Images/Videos/Data), German Conference on Pattern Recognition (GCPR'14), Springer, LNCS 8753, 277-289, 2014. ©Springer-Verlag

Kostrikov I. and Gall J., Depth Sweep Regression Forests for Estimating 3D Human Pose from Images (PDF, Images), British Machine Vision Conference (BMVC'14), 2014.

Srikantha A. and Gall J., Discovering Object Classes from Activities (PDF, Images/Data), European Conference on Computer Vision (ECCV'14), Springer, LNCS 8694, 415-430, 2014. ©Springer-Verlag

Weinmann M., Gall J., and Klein R., Material Classification based on Training Data Synthesized Using a BTF Database (PDF, Images/Data), European Conference on Computer Vision (ECCV'14), Springer, LNCS 8691, 156-171, 2014. ©Springer-Verlag

Srikantha A. and Gall J., Hough-based Object Detection with Grouped Features (PDF), IEEE International Conference on Image Processing (ICIP'14), 1653-1657, 2014. ©IEEE

Ristin M., Guillaumin M., Gall J., and van Gool L., Incremental Learning of NCM Forests for Large-Scale Image Classification (PDF, Images/Code), IEEE Conference on Computer Vision and Pattern Recognition (CVPR'14), 3654-3661, 2014. ©IEEE

Gall J., Simulated Annealing, Encyclopedia of Computer Vision, Ikeuchi K. (Ed.), Springer, 737-741, 2014. ©Springer-Verlag

Beetz M., Cremers D., Gall J., Li W., Liu Z., Pangercic D., Sturm J., and Tai Y.-W. (Eds.), Special Issue on Visual Understanding and Applications with RGB-D Cameras (Issue), Journal of Visual Communication and Image Representation, Vol. 25, No. 1, 1-238, 2014.

Jhuang H., Gall J., Zuffi S., Schmid C., and Black M., Towards Understanding Action Recognition (PDF, Images/Data), International Conference on Computer Vision (ICCV'13), 3192-3199, 2013. ©IEEE

Ye M., Zhang Q., Wang L., Zhu J., Yang R., and Gall J., A Survey on Human Motion Analysis from Depth Data (PDF, Tutorial), Time-of-Flight and Depth Imaging. Sensors, Algorithms, and Applications, Grzegorzek M., Theobalt C., Koch R., and Kolb A. (Eds.), Springer, LNCS 8200, 149-187, 2013. ©Springer-Verlag

Liu Y., Gall J., Stoll C., Dai Q., Seidel H.-P., and Theobalt C., Markerless Motion Capture of Multiple Characters Using Multi-view Image Segmentation (PDF, Images/Video/Data), IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 35, No. 11, 2720-2735, 2013. ©IEEE

Tzionas D. and Gall J., A Comparison of Directional Distances for Hand Pose Estimation (PDF, Images/Data), German Conference on Pattern Recognition (GCPR'13), Springer, LNCS 8142, 131-141, 2013. ©Springer-Verlag

Dantone M., Gall J., Leistner C., and van Gool L., Human Pose Estimation using Body Parts Dependent Joint Regressors (PDF, Images/Data), IEEE Conference on Computer Vision and Pattern Recognition (CVPR'13), 3041-3048, 2013. ©IEEE

Gall J. and Lempitsky V., Class-Specific Hough Forests for Object Detection (Images/Code), Decision Forests for Computer Vision and Medical Image Analysis, Criminisi A. and Shotton J. (Eds.), Springer, 145-160, 2013. ©Springer-Verlag

Fanelli G., Dantone M., Gall J., Fossati A., and van Gool L., Random Forests for Real Time 3D Face Analysis (PDF, Images/Videos/Data/Code), International Journal of Computer Vision, Special Issue on Human Computer Interaction, Vol 101(3), 437-458, Springer, 2013. ©Springer-Verlag

Ristin M., Gall J., and van Gool L., Local Context Priors for Object Proposal Generation (PDF, Images), Asian Conference on Computer Vision (ACCV'12), Springer, LNCS 7724, 57-70, 2013. ©Springer-Verlag

Fossati A., Gall J., Grabner H., Ren X., and Konolige K. (Eds.), Consumer Depth Cameras for Computer Vision - Research Topics and Applications (Book, Workshop), Advances in Computer Vision and Pattern Recognition, Springer, 2012.

Razavi N., Gall J., Kohli P., and van Gool L., Latent Hough Transform for Object Detection (PDF, Images), European Conference on Computer Vision (ECCV'12), Springer, LNCS 7574, 312-325, 2012. ©Springer-Verlag

Ballan L., Taneja A., Gall J., van Gool L., and Pollefeys M., Motion Capture of Hands in Action using Discriminative Salient Points (PDF, Images/Videos/Data), European Conference on Computer Vision (ECCV'12), Springer, LNCS 7577, 640-653, 2012. ©Springer-Verlag

Pellegrini S., Gall J., Sigal L., and van Gool L., Destination Flow for Crowd Simulation (PDF), Workshop on Analysis and Retrieval of Tracked Events and Motion in Imagery Streams (ARTEMIS'12), Springer, LNCS 7585, 162-171, 2012. ©Springer-Verlag

Razavi N., Alvar N., Gall J., and van Gool L., Sparsity Potentials for Detecting Objects with the Hough Transform (PDF, Images), British Machine Vision Conference (BMVC'12), 2012.

López-Méndez A., Gall J., Casas J., and van Gool L., Metric Learning from Poses for Temporal Clustering of Human Motion (PDF, Images/Videos), British Machine Vision Conference (BMVC'12), 2012.

Gall J., Razavi N., and van Gool L., An Introduction to Random Forests for Multi-class Object Detection (PDF, Images/Code), Theoretic Foundations of Computer Vision: Outdoor and Large-Scale Real-World Scene Analysis, Dellaert F., Frahm J.-M., Pollefeys M., Rosenhahn B., and Leal-Taixé L. (Eds.), Springer, LNCS 7474, 243-263, 2012. ©Springer-Verlag

Pons-Moll G., Leal-Taixé L., Gall J., and Rosenhahn B., Data-driven Manifolds for Outdoor Motion Capture (PDF, Images/Videos), Theoretic Foundations of Computer Vision: Outdoor and Large-Scale Real-World Scene Analysis, Dellaert F., Frahm J.-M., Pollefeys M., Rosenhahn B., and Leal-Taixé L. (Eds.), Springer, LNCS 7474, 305-328, 2012. ©Springer-Verlag

Yao A., Gall J., and van Gool L., Coupled Action Recognition and Pose Estimation from Multiple Views (PDF, Images/Videos/Code), International Journal of Computer Vision, Vol 100(1), 16-37, Springer, 2012. ©Springer-Verlag

Dantone M., Gall J., Fanelli G., and van Gool L., Real-time Facial Feature Detection using Conditional Regression Forests (PDF, Images/Videos/Data/Code), IEEE Conference on Computer Vision and Pattern Recognition (CVPR'12), 2578-2585, 2012. ©IEEE

Yao A., Gall J., Leistner C., and van Gool L., Interactive Object Detection (PDF, Images/Videos), IEEE Conference on Computer Vision and Pattern Recognition (CVPR'12), 3242-3249, 2012. ©IEEE

Fanelli G., Gall J., and van Gool L., Real Time 3D Head Pose Estimation: Recent Achievements and Future Challenges (PDF, Images/Videos/Data/Code), 5th International Symposium on Communications, Control and Signal Processing (ISCCSP'12), 2012. ©IEEE

Yao A., Gall J., van Gool L., and Urtasun R., Learning Probabilistic Non-Linear Latent Variable Models for Tracking Complex Activities (PDF, Images/Code), Neural Information Processing Systems (NeurIPS'11), 2011.

Pons-Moll G., Baak A., Gall J., Leal-Taixé L., Mueller M., Seidel H.-P., and Rosenhahn B., Outdoor Human Motion Capture using Inverse Kinematics and von Mises-Fisher Sampling (PDF, Images/Videos), International Conference on Computer Vision (ICCV'11), 1243-1250, 2011. ©IEEE

Stoll C., Hasler N., Gall J., Seidel H.-P., and Theobalt C., Fast Articulated Motion Tracking using a Sums of Gaussians Body Model (PDF, Images/Videos), International Conference on Computer Vision (ICCV'11), 951-958, 2011. ©IEEE

Uebersax D., Gall J., van den Bergh M., and van Gool L., Real-time Sign Language Letter and Word Recognition from Depth Data (PDF, Supplemental material, Video), IEEE Workshop on Human Computer Interaction: Real-Time Vision Aspects of Natural User Interfaces (HCI'11), 383-390, 2011. ©IEEE

Gall J., Yao A., Razavi N., van Gool L., and Lempitsky V., Hough Forests for Object Detection, Tracking, and Action Recognition (PDF, Images/Code), IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 33, No. 11, 2188-2202, 2011. ©IEEE

Yao A., Gall J., Fanelli G., and van Gool L., Does Human Action Recognition Benefit from Pose Estimation? (PDF, Images), British Machine Vision Conference (BMVC'11), 2011.

Fanelli G., Weise T., Gall J., and van Gool L., Real Time Head Pose Estimation from Consumer Depth Cameras (PDF, Images/Videos/Data/Code), 33rd Annual Symposium of the German Association for Pattern Recognition (DAGM'11), Springer, LNCS 6835, 101-110, 2011. ©Springer-Verlag

Gall J., Fossati A., and van Gool L., Functional Categorization of Objects using Real-time Markerless Motion Capture (PDF, Images/Video/Data), IEEE Conference on Computer Vision and Pattern Recognition (CVPR'11), 1969-1976, 2011. ©IEEE

Grabner H., Gall J., and van Gool L., What Makes a Chair a Chair? (PDF, Images/Data), IEEE Conference on Computer Vision and Pattern Recognition (CVPR'11), 1529-1536, 2011. ©IEEE

Razavi N., Gall J., and van Gool L., Scalable Multi-class Object Detection (PDF, Images/Code), IEEE Conference on Computer Vision and Pattern Recognition (CVPR'11), 1505-1512, 2011. ©IEEE

Fanelli G., Gall J., and van Gool L., Real Time Head Pose Estimation with Random Regression Forests (PDF, Images/Videos/Data/Code), IEEE Conference on Computer Vision and Pattern Recognition (CVPR'11), 617-624, 2011. ©IEEE

Liu Y., Stoll C., Gall J., Seidel H.-P., and Theobalt C., Markerless Motion Capture of Interacting Characters Using Multi-view Image Segmentation (PDF, Images/Video/Data), IEEE Conference on Computer Vision and Pattern Recognition (CVPR'11), 1249-1256, 2011. ©IEEE

Hamer H., Gall J., Urtasun R., and van Gool L., Data-Driven Animation of Hand-Object Interactions (PDF, Video with Audio), IEEE Conference on Automatic Face and Gesture Recognition, 360-367, 2011. ©IEEE

Stoll C., Gall J., de Aguiar E., Thrun S., and Theobalt C., Video-based Reconstruction of Animatable Human Characters (PDF, Images/Videos), ACM Transactions on Graphics (SIGGRAPH Asia 2010), Vol. 29, No. 6, 2010. ©ACM

Yao A., Uebersax D., Gall J., and van Gool L., Tracking People in Broadcast Sports (PDF), 32nd Annual Symposium of the German Association for Pattern Recognition (DAGM'10), Springer, LNCS 6376, 151-161, 2010. ©Springer-Verlag

Gall J., Yao A., and van Gool L., 2D Action Recognition Serves 3D Human Pose Estimation (PDF, Images/Videos/Code), European Conference on Computer Vision (ECCV'10), Springer, LNCS 6313, 425-438, 2010. ©Springer-Verlag

Razavi N., Gall J., and van Gool L., Backprojection Revisited: Scalable Multi-view Object Detection and Similarity Metrics for Detections (PDF, Images/Videos), European Conference on Computer Vision (ECCV'10), Springer, LNCS 6311, 620-633, 2010. ©Springer-Verlag

Fanelli G., Yao A., Noel P.-L., Gall J., and van Gool L., Hough Forest-based Facial Expression Recognition from Video Sequences, Workshop on Sign, Gesture and Activity (SGA'10), 2010. ©Springer-Verlag

Gall J., Razavi N., and van Gool L., On-line Adaption of Class-specific Codebooks for Instance Tracking (PDF, Images/Videos), British Machine Vision Conference (BMVC'10), 2010.

Fanelli G., Gall J., Romsdorfer H., Weise T., and van Gool L., A 3D Audio-Visual Corpus of Affective Communication (PDF, Data), IEEE Transactions on Multimedia, Special Issue on Multimodal Affective Interaction, Vol. 12, No. 6, 591-598, 2010. ©IEEE

Waltisberg D., Yao A., Gall J., and van Gool L., Variations of a Hough-Voting Action Recognition System (PDF, Images/Videos), Proceedings of the ICPR 2010 Contests, Springer, LNCS 6388, 306-312, 2010. ©Springer-Verlag

Yao A., Gall J., and van Gool L., A Hough Transform-Based Voting Framework for Action Recognition (PDF, Images/Videos), IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10), 2010. ©IEEE

Hamer H., Gall J., Weise T., and van Gool L., An Object-Dependent Hand Pose Prior from Sparse Training Data (PDF, Images/Videos), IEEE Conference on Computer Vision and Pattern Recognition (CVPR'10), 2010. ©IEEE

Fanelli G., Gall J., Romsdorfer H., Weise T., and van Gool L., 3D Vision Technology for Capturing Multimodal Corpora: Chances and Challenges (PDF), Multimodal Corpora: Advances in Capturing, Coding and Analyzing Multimodality, 2010.

Brox T., Rosenhahn B., Gall J., and Cremers D., Combined Region- and Motion-based 3D Tracking of Rigid and Articulated Objects (PDF), IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol 32(3), 402-415, 2010. ©IEEE

Gall J., Rosenhahn B., Brox T., and Seidel H.-P., Optimization and Filtering for Human Motion Capture - A Multi-layer Framework (PDF, Images/Videos), International Journal of Computer Vision, Special Issue on Evaluation of Articulated Human Motion and Pose Estimation, Vol 87(1), 75-92, Springer, 2010. ©Springer-Verlag

Shaheen M., Gall J., Strzodka R., van Gool L., and Seidel H.-P., A Comparison of 3D Model-based Tracking Approaches for Human Motion Capture in Uncontrolled Environments (PDF, Images/Videos), IEEE Workshop on Applications of Computer Vision (WACV'09), 2009. ©IEEE

Fanelli G., Gall J., and van Gool L., Hough Transform-based Mouth Localization for Audio-Visual Speech Recognition (PDF, Images/Videos), British Machine Vision Conference (BMVC'09), 2009.

Gall J., Stoll C., de Aguiar E., Theobalt C., Rosenhahn B., and Seidel H.-P., Motion Capture Using Joint Skeleton Tracking and Surface Estimation (PDF, Images/Videos/Data), IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09), 2009. ©IEEE

Gall J. and Lempitsky V., Class-Specific Hough Forests for Object Detection (PDF, Images/Code), IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09), 2009. ©IEEE

Hasler N., Rosenhahn B., Thormählen T., Wand M., Gall J., and Seidel H.-P., Markerless Motion Capture with Unsynchronized Moving Cameras (PDF, Images/Videos), IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09), 2009. ©IEEE

Gall J., Rosenhahn B., Gehrig S., and Seidel H.-P., Model-based Motion Capture for Crash Test Video Analysis (PDF, Images/Videos), 30th Annual Symposium of the German Association for Pattern Recognition (DAGM'08), Springer, LNCS 5096, 92-101, 2008. ©Springer-Verlag

Gall J., Rosenhahn B., and Seidel H.-P., Drift-free Tracking of Rigid and Articulated Objects (PDF, Images/Videos), IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08), 2008. ©IEEE

Gall J., Rosenhahn B., and Seidel H.-P., An Introduction to Interacting Simulated Annealing (PDF), Human Motion - Understanding, Modeling, Capture and Animation, Klette R., Metaxas D., and Rosenhahn B. (Eds.), Computational Imaging and Vision, Vol 36, 319-345, Springer, 2008. ©Springer-Verlag

Gehrig S., Badino H., and Gall J., Accurate and Model-Free Pose Estimation of Crash Test Dummies, Human Motion - Understanding, Modeling, Capture and Animation, Klette R., Metaxas D., and Rosenhahn B. (Eds.), Computational Imaging and Vision, Vol 36, 453-473, Springer, 2008. ©Springer-Verlag

Gall J., Rosenhahn B., and Seidel H.-P., Clustered Stochastic Optimization for Object Recognition and Pose Estimation (PDF, Images/Videos), 29th Annual Symposium of the German Association for Pattern Recognition (DAGM'07), Springer, LNCS 4713, 32-41, 2007. ©Springer-Verlag

Gall J., Potthoff J., Schnörr C., Rosenhahn B., and Seidel H.-P., Interacting and Annealing Particle Filters: Mathematics and a Recipe for Applications, Journal of Mathematical Imaging and Vision, Vol 28(1), 1-18, Springer, 2007. ©Springer-Verlag

Gall J., Rosenhahn B., and Seidel H.-P., Robust Pose Estimation with 3D Textured Models (PDF, Images/Videos), IEEE Pacific-Rim Symposium on Image and Video Technology (PSIVT'06), Springer, LNCS 4319, 84-95, 2006. ©Springer-Verlag

Gall J., Rosenhahn B., Brox T., and Seidel H.-P., Learning for Multi-View 3D Tracking in the Context of Particle Filters (PDF, Images/Videos), International Symposium on Visual Computing (ISVC'06), Springer, LNCS 4292, 59-69, 2006. ©Springer-Verlag

Software

FlowNar: Scalable Streaming Narration for Long-Form Videos

EgoControl: Controllable Egocentric Video Generation via 3D Full-Body Poses

Video Panels for Long Video Understanding

RedSage: A Cybersecurity Generalist LLM

RiverMamba: A State Space Model for Global River Discharge and Flood Forecasting

CamC2V: Context-aware Controllable Video Generation

Scalable Video Action Anticipation with Cross Linear Attentive

Skeleton Motion Words for Unsupervised Skeleton-based Temporal Action Segmentation

Global-Aware Monocular Semantic Scene Completion with State Space Models

Self-Intersection-Aware 3D Human Motion Generation Using an Efficient Human Sphere Proxy

Canonical Rank Adaptation: An Efficient Fine-Tuning Strategy for Vision Transformers

Enhancing Video-Based Robot Failure Detection Using Task Knowledge

MANTA: Diffusion Mamba for Efficient and Effective Stochastic Long-Term Dense Anticipation

Video-Panda: Parameter-efficient Alignment for Encoder-free Video-Language Models

SyncVP: Joint Diffusion for Synchronous Multi-Modal Video Prediction

GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model

STING-BEE: Towards Vision-Language Model for Real-World X-ray Baggage Security Inspection

ClimSat - A diffusion autoencoder model for climate-conditional satellite image editing

Fréchet Wavelet Distance: A Domain-Agnostic Metric for Image Generation

Hierarchical Vector Quantization for Unsupervised Action Segmentation

Identifying Spatio-Temporal Drivers of Extreme Events

MV-Match: Multi-View Matching for Domain-Adaptive Identification of Plant Nutrient Deficiencies

Gated Temporal Diffusion for Stochastic Long-Term Dense Anticipation

Massively Multi-Person 3D Human Motion Forecasting with Scene Context

ADA-Track: End-to-End Multi-Camera 3D Multi-Object Tracking with Alternating Detection and Association

A Multimodal Handover Failure Detection Dataset and Baselines

Focal-TSMP: deep learning for vegetation health prediction and agricultural drought assessment from a regional climate simulation

Generating novel scene compositions from single images and videos

Social Diffusion: Long-term Multiple Human Motion Anticipation

How Much Temporal Long-Term Context is Needed for Action Segmentation?

3DMOTFormer: Graph Transformer for Online 3D Multi-Object Tracking

Leveraging Bioclimatic Context for Supervised and Self-Supervised Land Cover Classification

Action Anticipation with Goal Consistency

PowerBEV: A Powerful Yet Lightweight Framework for Instance Prediction in Bird's-Eye View

Location-Aware Adaptive Normalization: A Deep Learning Approach for Wildfire Danger Forecasting

Surrogate Modelling of a Detailed Farm-level Model using Deep Learning

ElliPose: Stereoscopic 3D Human Pose Estimation by Fitting Ellipsoids

Robust Action Segmentation from Timestamp Supervision

Dual Pyramid Generative Adversarial Networks for Semantic Image Synthesis

Adaptive Token Sampling for Efficient Vision Transformers

OASIS: Only Adversarial Supervision for Semantic Image Synthesis

One-Shot Synthesis of Images and Segmentation Masks

Unified Fully and Timestamp Supervised Temporal Action Segmentation via Sequence to Sequence Translation

TAVA: Template-free Animatable Volumetric Actors

Ranking Info Noise Contrastive Estimation: Boosting Contrastive Learning via Ranked Positives

Rethinking 3D LiDAR Point Cloud Segmentation

Multi-scale Interaction for Real-time LiDAR Data Segmentation on an Embedded Platform

Fast Weakly Supervised Action Segmentation Using Mutual Consistency

Using Visual Anomaly Detection for Task Execution Monitoring

Moving Object Segmentation in 3D LiDAR Data: A Learning-based Approach Exploiting Sequential Data

Temporal Action Segmentation from Timestamp Supervision

3D CNNs with Adaptive Temporal Feature Resolutions

Pose Refinement Graph Convolutional Network for Skeleton-based Action Recognition

You Only Need Adversarial Supervision for Semantic Image Synthesis

Discovering Multi-Label Actor-Action Association in a Weakly Supervised Setting

SCT: Set Constrained Temporal Transformer for Set Supervised Action Segmentation

MS-TCN++: Multi-Stage Temporal Convolutional Network for Action Segmentation

MS-TCN: Multi-Stage Temporal Convolutional Network for Action Segmentation

Sequence Prediction using Spectral RNNs

SemanticKITTI: A Dataset for Semantic Scene Understanding of LiDAR Sequences

Joint Viewpoint and Keypoint Estimation with Real and Synthetic Data

Iterative Greedy Matching for 3D Human Pose Tracking from Multiple Views

What Object Should I Use? - Task Driven Object Detection

Unsupervised Learning of Action Classes with Continuous Temporal Embedding

AVID: Adversarial Visual Irregularity Detection

Convolutional Simplex Projection Network for Weakly Supervised Semantic Segmentation

NeuralNetwork-Viterbi: A Framework for Weakly Supervised Video Learning

When will you do what? - Anticipating Temporal Occurrences of Activities

Action Sets: Weakly Supervised Action Segmentation without Ordering Constraints

Open Set Domain Adaptation

SurfaceNet: An End-to-end 3D Neural Network for Multiview Stereopsis

Thinking Outside the Box: Spatial Anticipation of Semantic Categories

Weakly Supervised Action Learning with RNN based Fine-to-Coarse Modeling

PoseTrack: Joint Multi-Person Pose Estimation and Tracking

Weakly Supervised Affordance Detection

Pose for Action - Action for Pose

An Efficient Convolutional Network for Human Pose Estimation

Temporal Action Detection using a Statistical Language Model

A Dual-Source Approach for 3D Pose Estimation from a Single Image

A BoW-equivalent Recurrent Neural Network for Action Recognition

3D Object Reconstruction from Hand-Object Interactions

From Categories to Subcategories: Large-scale Image Classification with Partial Class Label Refinement

Efficient Pose-based Action Recognition

Incremental Learning of NCM Forests for Large-Scale Image Classification

Real-time Facial Feature Detection using Conditional Regression Forests

Learning Probabilistic Non-Linear Latent Variable Models for Tracking Complex Activities

Real Time Head Pose Estimation with Random Regression Forests

Scalable Multi-class Object Detection

2D Action Recognition Serves 3D Human Pose Estimation

A Hough Transform-Based Voting Framework for Action Recognition

Class-Specific Hough Forests for Object Detection

Data

RedSage: A Cybersecurity Generalist LLM

RiverMamba: A State Space Model for Global River Discharge and Flood Forecasting

Non-invasive diagnosis of nutrient deficiencies in winter wheat and winter rye using UAV-based RGB images

Enhancing Video-Based Robot Failure Detection Using Task Knowledge

STING-BEE: Towards Vision-Language Model for Real-World X-ray Baggage Security Inspection

Multi-modal temporal action segmentation for manufacturing scenarios

Identifying Spatio-Temporal Drivers of Extreme Events

A Multimodal Handover Failure Detection Dataset and Baselines

Focal-TSMP: deep learning for vegetation health prediction and agricultural drought assessment from a regional climate simulation

Humans in Kitchens: A Dataset for Multi-Person Human Motion Forecasting with Scene Context

Surrogate Modelling of a Detailed Farm-level Model using Deep Learning

PoseTrack21: A Dataset for Person Search, Multi-Object Tracking and Multi-Person Pose Tracking

Using Visual Anomaly Detection for Task Execution Monitoring

Deep Learning for Non-Invasive Diagnosis of Nutrient Deficiencies in Sugar Beet Using RGB Images

Large Scale Holistic Video Understanding

Bonn Activity Maps Dataset

SemanticKITTI: A Dataset for Semantic Scene Understanding of LiDAR Sequences

What Object Should I Use? - Task Driven Object Detection

PoseTrack Challenge

PoseTrack: Joint Multi-Person Pose Estimation and Tracking

Weakly Supervised Affordance Detection

Reconstructing Articulated Rigged Models from RGB-D Video

Capturing Hands in Action using Discriminative Salient Points and Physics Simulation

3D Object Reconstruction from Hand-Object Interactions

A Semantic Occlusion Model for Human Pose Estimation from a Single Depth Image

From Categories to Subcategories: Large-scale Image Classification with Partial Class Label Refinement

Material Classification based on Training Data Synthesized Using a BTF Database

Capturing Hand Motion with an RGB-D Sensor, Fusing a Generative Model with Salient Points

Discovering Object Classes from Activities

Towards Understanding Action Recognition

Human Pose Estimation using Body Parts Dependent Joint Regressors

Motion Capture of Hands in Action using Discriminative Salient Points

Real-time Facial Feature Detection using Conditional Regression Forests

Real Time Head Pose Estimation with Random Regression Forests

Functional Categorization of Objects using Real-time Markerless Motion Capture

What Makes a Chair a Chair?

Biwi 3D Audiovisual Corpus of Affective Communication - B3D(AC)^2

Hand-Object Interaction (HOI) Dataset

Motion Capture Using Joint Skeleton Tracking and Surface Estimation

Markerless Motion Capture with Unsynchronized Moving Cameras

Projects

Lamarr - Institute for Machine Learning and Artificial Intelligence, National Competence Center

iBehave, NRW Netzwerke 2021

PhenoRob, Cluster of Excellence

DETECT, DFG Collaborative Research Centre

Anticipating Human Behavior, DFG Research Unit FOR 2535

Analysis and Representation of Complex Activities in Videos (ARCA), ERC Starting Grant

Mapping on Demand (MoD), DFG Research Unit FOR 1505

Human Focused Visual Scene Understanding (HFVSA), DFG Independent Junior Research Group

Workshops/Conferences/Special Issues/Tutorials

Machine Learning for Earth System Modelling, August 25-27, 2025, Bonn, Germany. Organizers: Matthew Chantry (European Centre for Medium-Range Weather Forecasts), Dale Durran (University of Washington), Juergen Gall (University of Bonn), Christian Lessig (European Centre for Medium-Range Weather Forecasts), Martin Schultz (Jülich Supercomputing Center), and Imme Ebert Uphoff (Colorado State University).

Special Session: Towards Realistic 3D Deep Learning with Limited Supervision, June 30 - July 4, 2025, in conjunction with ICME'25. Organizers: Xun Xu (Institute for Infocomm Research, A*STAR), Shijie Li (Institute for Infocomm Research, A*STAR), Hao Su (University of California San Diego), Xiatian Zhu (University of Surrey), Juergen Gall (University of Bonn), and Xulei Yang (Institute for Infocomm Research, A*STAR).

Workshop on Large Scale Holistic Video Understanding, June 11-12, 2025, in conjunction with CVPR'25. Organizers: Mohsen Fayyaz (Microsoft), Vivek Sharma (Massachusetts Institute of Technology), Shyamal Buch (Stanford University), Ali Diba (KU Leuven), Luc Van Gool (KU Leuven, ETH Zurich), Juergen Gall (University of Bonn), Ehsan Adeli (Stanford University), David Ross (Google Research), João Carreira (Google DeepMind), and Manohar Paluri (Facebook).

Large-scale Deep Learning for the Earth System, Bonn, Germany, 29th-30th August 2024. Organizers: Matthew Chantry (European Centre for Medium-Range Weather Forecasts), Juergen Gall (University of Bonn), Christian Lessig (European Centre for Medium-Range Weather Forecasts), and Martin Schultz (Jülich Supercomputing Center).

Workshop on Large Scale Holistic Video Understanding, June 17, 2024, in conjunction with CVPR'24. Organizers: Mohsen Fayyaz (Microsoft), Vivek Sharma (Massachusetts Institute of Technology), Shyamal Buch (Stanford University), Ali Diba (KU Leuven), Luc Van Gool (KU Leuven, ETH Zurich), Juergen Gall (University of Bonn), Ehsan Adeli (Stanford University), David Ross (Google Research), João Carreira (Google DeepMind), and Manohar Paluri (Facebook).

Anticipating Human Behavior, Bonn, Germany, 8th September 2023. Organizers: Sven Behnke (University of Bonn), Maren Bennewitz (University of Bonn), Anne Driemel (University of Bonn), Juergen Gall (University of Bonn), and Reinhard Klein (University of Bonn).

Large-scale Deep Learning for the Earth System, Bonn, Germany, 4th-5th September 2023. Organizers: Matthew Chantry (European Centre for Medium-Range Weather Forecasts), Juergen Gall (University of Bonn), Christian Lessig (University of Magdeburg), and Martin Schultz (Jülich Supercomputing Center).

Deep Nutrient Deficiency Challenge, October 2, 2023, as part of the Workshop on Computer Vision in Plant Phenotyping and Agriculture and in conjunction with ICCV'23.

Workshop on Large Scale Holistic Video Understanding, June 18, 2023, in conjunction with CVPR'23. Organizers: Vivek Sharma (Massachusetts Institute of Technology), Shyamal Buch (Stanford University), Ali Diba (KU Leuven), Mohsen Fayyaz (University of Bonn), Luc Van Gool (KU Leuven, ETH Zurich), Juergen Gall (University of Bonn), Rainer Stiefelhagen (Karlsruhe Institute of Technology), David Ross (Google Research), Ehsan Adeli (Stanford University), and Manohar Paluri (Facebook).

Perceiving Humans Under Occlusions, IEEE Transactions on Multimedia. Guest Editors: Cairong Zhao (Tongji University), Roger Zimmermann (National University of Singapore), Wei-Shi Zheng (Sun Yat-sen University), Shanshan Zhang (Nanjing University of Science and Technology), and Juergen Gall (University of Bonn).

Asian Conference on Computer Vision (ACCV), Macau SAR, China, December 4 - 8, 2022. General chairs: Gerard Medioni (USC, Amazon), Shiguang Shan (Chinese Academy of Sciences), Bohyung Han (Seoul National University), and Hongdong Li (Australian National University). Program chairs: Rama Chellappa (Johns Hopkins University), Juergen Gall (University of Bonn), Imari Sato (National Institute of Informatics, Japan), Tat-Jun Chin (University of Adelaide), and Lei Wang (University of Wollongong).

Workshop on Large Scale Holistic Video Understanding, 20th June 2022, in conjunction with CVPR'22. Organizers: Vivek Sharma (Massachusetts Institute of Technology), Shyamal Buch (Stanford University), Ali Diba (KU Leuven), Mohsen Fayyaz (University of Bonn), Luc Van Gool (KU Leuven, ETH Zurich), Juergen Gall (University of Bonn), Rainer Stiefelhagen (Karlsruhe Institute of Technology), David Ross (Google Research), Ehsan Adeli (Stanford University), and Manohar Paluri (Facebook).

DAGM German Conference on Pattern Recognition (GCPR), Bonn, Germany, September 28 - October 1, 2021. General chair: Juergen Gall (University of Bonn). Honorary Chair: Wolfgang Förstner (University of Bonn). Program chairs: Christian Bauckhage (University of Bonn) and Alexander Schwing (University of Illinois in Urbana-Champaign). Workshop/Tutorial Chair: Michael Möller (University of Siegen).

Perceiving Humans, Frontiers in Computer Science. Guest Editors: Sebastiano Vascon (Ca' Foscari University of Venice), Laura Leal-Taixe (Technical University of Munich), Giovanni Maria Farinella (University of Catania), Hilde Kuehne (Goethe University Frankfurt), and Juergen Gall (University of Bonn).

Tutorial on Large Scale Holistic Video Understanding, 11th-17th October 2021, in conjunction with ICCV'21. Organizers: Mohsen Fayyaz (University of Bonn), Ali Diba (KU Leuven), Vivek Sharma (Massachusetts Institute of Technology), Luc Van Gool (KU Leuven, ETH Zurich), Ehsan Adeli (Stanford University), David Ross (Google Research), Juergen Gall (University of Bonn), and Manohar Paluri (Facebook).

GigaVision: When Gigapixel Videography meets Computer Vision, 11th-17th October 2021, in conjunction with ICCV'21. Organizers: Lu Fang (Tsinghua University), George Barbastathis (Massachusetts Institute of Technology), Juergen Gall (University of Bonn), David J. Brady (University of Arizona), Haifeng Wang (Baidu), and Feng Yang (Google Research).

Workshop on Large Scale Holistic Video Understanding, 19th-25th June 2021, in conjunction with CVPR'21. Organizers: Mohsen Fayyaz (University of Bonn), Vivek Sharma (Massachusetts Institute of Technology), Ali Diba (KU Leuven), Luc Van Gool (KU Leuven, ETH Zurich), Juergen Gall (University of Bonn), Ehsan Adeli (Stanford University), David Ross (Google Research), Rainer Stiefelhagen (Karlsruhe Institute of Technology), and Manohar Paluri (Facebook).

Multi-Modality Human Activity Understanding, Journal of Visual Communication and Image Representation (JVCI). Guest Editors: Zhigang Tu (Wuhan University), Wanqing Li (University of Wollongong), Jiaying Liu (Peking University), Juergen Gall (University of Bonn), and Junsong Yuan (University at Buffalo, State University of New York).

Tutorial on Large Scale Holistic Video Understanding, 19th June 2020, in conjunction with CVPR'20. Video. Organizers: Mohsen Fayyaz (University of Bonn), Ali Diba (KU Leuven), Vivek Sharma (Karlsruhe Institute of Technology), Luc Van Gool (KU Leuven, ETH Zurich), Juergen Gall (University of Bonn), Rainer Stiefelhagen (Karlsruhe Institute of Technology), and Manohar Paluri (Facebook).

Workshop on Large Scale Holistic Video Understanding, Seoul, Korea, 27th October 2019, in conjunction with ICCV'19. Organizers: Vivek Sharma (Karlsruhe Institute of Technology), Mohsen Fayyaz (University of Bonn), Ali Diba (KU Leuven), Luc Van Gool (KU Leuven, ETH Zurich), Juergen Gall (University of Bonn), Rainer Stiefelhagen (Karlsruhe Institute of Technology), and Manohar Paluri (Facebook).

BMVA Symposium on Video Understanding, London, United Kingdom, 25th September 2019. Organizers: Dima Damen (University of Bristol), Hildegard Kuehne (University of Bonn), Juergen Gall (University of Bonn), and Ivan Laptev (INRIA Paris).

Anticipating Human Behavior, Bonn, Germany, 2nd September 2019. Organizers: Sven Behnke (University of Bonn), Maren Bennewitz (University of Bonn), Juergen Gall (University of Bonn), Reinhard Klein (University of Bonn), Andreas Weber (University of Bonn), and Angela Yao (National University of Singapore).

Anticipating Human Behavior, Munich, Germany, 8th September 2018, in conjunction with ECCV'18. Organizers: Juergen Gall (University of Bonn), Jan van Gemert (Delft University of Technology), and Kris Kitani (Carnegie Mellon University).

PoseTrack Challenge: Human Pose Estimation and Tracking in the Wild, Munich, Germany, 8th September 2018, in conjunction with ECCV'18. Organizers: Mykhaylo Andriluka (Google), Umar Iqbal (University of Bonn), Eldar Insafutdinov (MPI for Informatics), Leonid Pishchulin (MPI for Informatics), Anton Milan (Amazon), Siyu Tang (MPI for Intelligent Systems), Christoph Lassner (Amazon), Juergen Gall (University of Bonn), and Bernt Schiele (MPI for Informatics).

Workshop on Interactive and Adaptive Learning in an Open World, Munich, Germany, 14th September 2018, in conjunction with ECCV'18. Organizers: Erik Rodner (Zeiss Corporate Research), Alexander Freytag (Zeiss Corporate Research), Vittorio Ferrari (Google), Mario Fritz (MPI for Informatics), Uwe Franke (Daimler AG), Terrence Boult (University of Colorado), Juergen Gall (University of Bonn), Walter Scheirer (University of Notre Dame), and Angela Yao (University of Bonn).

PoseTrack Challenge: Human Pose Estimation and Tracking in the Wild, Venice, Italy, October 2017, in conjunction with ICCV'17. Organizers: Mykhaylo Andriluka (MPI for Informatics), Umar Iqbal (University of Bonn), Eldar Insafutdinov (MPI for Informatics), Leonid Pishchulin (MPI for Informatics), Anton Milan (University of Adelaide), Juergen Gall (University of Bonn), and Bernt Schiele (MPI for Informatics).

Imaging Depth Sensors—Sensors, Algorithms and Applications, Sensors. Guest Editors: Andreas Kolb (University of Siegen), Juergen Gall (University of Bonn), Adrian A. Dorrington (Chronoptics), and Lee Streeter (University of Waikato).

Cross-Media Big Data Analytics, Journal of Visual Communication and Image Representation (JVCI). Guest Editors: An-An Liu (Tianjin University), Ke Gao (Chinese Academy of Sciences), Liqiang Nie (National University of Singapore), Juergen Gall (University of Bonn), and Yi Yang (University of Technology Sydney).

Deep Learning with Applications to Visual Representation and Analysis, Signal Processing: Image Communication. Guest Editors: Lei Wang (University of Wollongong), Ce Zhu (University of Electronic Science and Technology of China), Jieping Ye (University of Michigan), and Juergen Gall (University of Bonn).

37th German Conference on Pattern Recognition (GCPR), Aachen, Germany, 7-10 October 2015. Call for papers. General chair: Bastian Leibe (RWTH Aachen University). Program chairs: Juergen Gall (University of Bonn) and Peter Gehler (Max Planck Institute for Intelligent Systems).

Visual Perception of Affordances and Functional Visual Primitives for Scene Analysis, Zurich, Switzerland, 7 September 2014, in conjunction with ECCV'14. Organizers: Karthik Mahesh Varadarajan (Technical University of Vienna), Alireza Fathi (Stanford University), Juergen Gall (University of Bonn), and Markus Vincze (Technical University of Vienna).

Workshop on Consumer Depth Cameras for Computer Vision, Zurich, Switzerland, 6 September 2014, in conjunction with ECCV'14. Organizers: Andrea Fossati (ETH Zurich), Juergen Gall (University of Bonn), and Miles Hansard (Queen Mary University London).

Affordances in Vision for Cognitive Robotics, Berkeley, CA, USA, 13 July 2014, in conjunction with RSS'14. Organizers: Karthik Mahesh Varadarajan (Technical University of Vienna), Markus Vincze (Technical University of Vienna), Trevor Darrell (University of California, Berkeley), and Juergen Gall (University of Bonn).

Tutorial on Towards Solving Real-World Vision Problems with RGB-D Cameras, Columbus, OH, USA, 28 June 2014, in conjunction with CVPR'14. Organizers: Juergen Gall (University of Bonn), Xiaofeng Ren (Amazon), and Pushmeet Kohli (Microsoft Research Cambridge).

Workshop on Consumer Depth Cameras for Computer Vision, Sydney, Australia, 2 December 2013, in conjunction with ICCV'13. Organizers: Andrea Fossati (ETH Zurich), Juergen Gall (University of Bonn), Helmut Grabner (ETH Zurich), and Miles Hansard (Queen Mary University London).

Bonn Vision Workshop 2013, Bonn, Germany, 1 October 2013. Organizers: Simone Frintrop (University of Bonn), Juergen Gall (University of Bonn), and Armin B. Cremers (University of Bonn).

Tutorial on Towards Solving Real-World Vision Problems with RGB-D Cameras, Portland, OR, USA, 23 June 2013, in conjunction with CVPR'13. Organizers: Xiaofeng Ren (Amazon), Pushmeet Kohli (Microsoft Research Cambridge), and Juergen Gall (University of Bonn).

Workshop on Consumer Depth Cameras for Computer Vision, Firenze, Italy, 12 October 2012, in conjunction with ECCV'12. Organizers: Andrea Fossati (ETH Zurich), Juergen Gall (Max Planck Institute for Intelligent Systems), Helmut Grabner (ETH Zurich), Xiaofeng Ren (Intel Labs), Kurt Konolige (Willow Garage), Seungkyu Lee (Samsung Advanced Institute of Technology), and Miles Hansard (Queen Mary University London).

Visual Understanding and Applications with RGB-D Cameras, Journal of Visual Communication and Image Representation (JVCI). Guest Editors: Michael Beetz (TU Munich), Daniel Cremers (TU Munich), Juergen Gall (ETH Zurich), Wanqing Li (University of Wollongong), Zicheng Liu (Microsoft Research Redmond), Dejan Pangercic (TU Munich), Juergen Sturm (TU Munich), and Yu-Wing Tai (KAIST).

IEEE Workshop on Consumer Depth Cameras for Computer Vision, Barcelona, Spain, 12 November 2011, in conjunction with ICCV'11. Organizers: Andrea Fossati (ETH Zurich), Juergen Gall (ETH Zurich), Helmut Grabner (ETH Zurich), Xiaofeng Ren (Intel Labs Seattle), and Kurt Konolige (Willow Garage).

Editorial/Chair

Editor:

Computer Vision and Image Understanding (since 2023)

Journal of Visual Communication and Image Representation (since 2016)

IEEE Transactions on Pattern Analysis and Machine Intelligence (since 2023)

Journal of Mathematical Imaging and Vision (2017-2024)

Area Chair:

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR): 2020, 2022, 2023, 2025 (LAC), 2026 (SAC)

European Conference on Computer Vision (ECCV): 2016, 2018, 2022, 2024, 2026

International Conference on Computer Vision (ICCV): 2015, 2021, 2023, 2025

Conference on Neural Information Processing Systems (NeurIPS): 2023 (SAC), 2024 (SAC), 2025, 2026

International Conference on Machine Learning (ICML): 2024 (SAC), 2025 (SAC), 2026 (SAC)

Teaching

PhD Seminar - Graphics, Vision, Audio for Intelligent Systems, SS13, WS13/14, SS14, WS14/15, SS15, WS15/16, SS16, WS16/17, SS17, WS17/18, SS18, WS18/19, SS19, WS19/20

BA-INF 062 - Begleitseminar zur Bachelorarbeit: Computer Vision, SS14, WS14/15, SS15, WS15/16, SS16, WS16/17, SS17, WS17/18, SS18, WS18/19, SS19, WS19/20, SS20, WS20/21, SS21, WS21/22, SS22, WS22/23, SS23, WS23/24, SS24, WS24/25, SS25, WS25/26, SS26

MA-INF 2201/MA-MORO-M04 - Computer Vision, WS13/14, WS14/15, WS15/16, WS16/17, WS17/18, WS18/19, WS19/20, WS21/22, WS22/23, WS23/24, WS24/25, WS25/26

MA-INF 2213 - Advanced Computer Vision (Computer Vision II), SS14, SS15, SS16, SS17, SS18, SS19, SS20, SS21, SS22, SS23, SS24, SS25, SS26

MA-INF 2218 - Video Analytics, SS17, SS18, SS19, SS21, SS22, SS23

MA-INF 2206/MA-MORO-E10 - Seminar Vision: Selected Topics in Computer Vision, WS13/14, SS14, WS14/15, SS15, WS15/16, SS16, WS16/17, SS17, WS17/18, SS18, WS18/19, SS19, WS19/20, SS20, WS20/21, SS21, WS21/22, SS22, WS22/23, SS23, WS23/24, SS24, WS24/25, SS25, WS25/26, SS26

MA-INF 2307/MA-MORO-E07 - Lab Vision, WS13/14, SS14, WS14/15, SS15, WS15/16, SS16, WS16/17, SS17, WS17/18, SS18, WS18/19, SS19, WS19/20, SS20, WS20/21, SS21, WS21/22, SS22, WS22/23, SS23, WS23/24, SS24, WS24/25, SS25, WS25/26, SS26

MA-INF 0402 - Master Thesis, Accompanying Seminar, WS14/15, SS15, WS15/16, SS16, WS16/17, SS17, WS17/18, SS18, WS18/19, SS19, WS19/20, SS20, WS20/21, SS21, WS21/22, SS22, WS22/23, SS23, WS23/24, SS24, WS24/25, SS25, WS25/26, SS26

Talks

Action Anticipation and Controllable Video Generation, INRIA Sophia Antipolis, 2026

PhenoRob2 - Robotik und Phänotypisierung für nachhaltige Pflanzenproduktion, HEF AgriScience Symposium, Freising, 2025

Mamba for Action Recognition and Forecasting, Video AI Symposium, Paris, 2025

Recent Advances in Video Understanding and Forecasting, GöAID, 2025

AI for Earth Science and Sustainable Agriculture, DHL, Bonn, 2025

From Forecasting Human Behavior to Forecasting Extreme Weather and Climate Events (Video), International AI Doctoral Academy (AIDA), 2024

A Brief History of Generative AI for Image Synthesis, Deep Fakes - Between Nudging and Manipulating, NeurotechEU, Bonn, 2024

Analyzing and Anticipating Human Behavior, Karlsruhe Institute of Technology, 2024

AI for Earth Science and Sustainable Agriculture, Dr. Hans-Riegel-Akademie, 2024

From Forecasting Human Behavior to Agricultural Droughts, Forschungszentrum Jülich, 2024

Analyzing Behavior in Video Recordings, Otto von Guericke University Magdeburg, 2024

Generating Images in Low Data Regimes and Socially Plausible Human Motion, Meta, Paris, 2024

Anticipation: From Human Motion to Wildfires, KAIST, 2024

Measuring the Quality of Generative Neural Networks - An Unsolved Problem, ELLIS Multimodal Learning Systems Workshop on Multimodal Foundation Models, Oberwolfach, 2024

Anticipation: From Human Motion to Wildfires, NeurIPS Fest, Amsterdam, 2023

Using AI for Sustainable Agriculture and Forecasting, AI for Good, International Telecommunication Union, 2023

Dealing with Rare Data: Few-Shot Semantic Image Synthesis and its Application to Nutrient Deficiency Detection, Workshop on Scene Understanding for Autonomous Drone Delivery, Heidelberg, 2023

Social Diffusion: Multiple Human Motion Anticipation, Workshop on Anticipating Human Behavior, Bonn, 2023

Efficient CNNs and Transformers for Video Understanding and Image Synthesis, ACM International Conference on Multimedia Retrieval, Thessaloniki, Greece, 2023

Anticipating Human Behavior with Artificial Intelligence, Pint of Science, Bonn, 2023

Efficient CNNs and Transformers for Video Understanding, International Conference on Signal Processing and Integrated Networks, 2023

Deep Learning for Analyzing Temporal Visual Data and Forecasting, Center for Earth System Observation and Computational Analysis, 2023

Efficient CNNs and Transformers for Video Understanding and Image Synthesis, IMAGINE, Ecole des Ponts ParisTech, 2023

From Efficient 3D CNNs and Transformers to Applications in Agriculture and Wildfire Forecasting, ELLIS Unit Jena, Germany, 2023

Analyzing Behavior in Video Recordings, Institute of Cognitive Neuroscience, Biopsychology, Ruhr University Bochum, Germany, 2022

3D LiDAR-based Semantic Scene Understanding, Workshop on 3D Perception for Autonomous Driving, ECCV, Tel Aviv, Israel, 2022

Adaptive Token Sampling for Efficient Vision Transformers, European Conference on Computer Vision, Tel Aviv, Israel, 2022

Self-supervised Learning for Unintentional Action Prediction, DAGM German Conference on Pattern Recognition, Konstanz, Germany, 2022

Efficient 3D CNNs and Transformers for Video Understanding, Video Understanding Symposium, Amsterdam, Netherlands, 2022

Challenges and Opportunities of Representation Learning (Video), DIGICROP, 2022

Anticipating Human Behavior, Winter School - Computer Animation, Bonn, 2022

KI für eine nachhaltige Landwirtschaft, Künstliche Intelligenz in der Landwirtschaft, NRW.BANK, 2021

Understanding and Anticipating Human Behavior, Intel Labs, 2021

What are Good Representations for Video Understanding?, Structure and Learning, Schloss Dagstuhl, Germany, 2021

Holistic and Continuous Video Understanding, Huawei CRI Vision Forum, 2021

Temporal Convolutional Networks for Continuous Video Analysis, ChaLearn Looking at People Sign Language Recognition in the Wild Workshop, CVPR, 2021

Understanding and Anticipating Activities, Workshop on Multi-visual-Modality Human Activity Understanding, ACCV, 2020

Understanding and Anticipating Activities (Video), AIR Distinguished Speaker Series, Boston University, 2020

An Introduction to Temporal Action Segmentation - From Fully Supervised Learning to Weakly Supervised Learning (Slides, Video), Tutorial on Large Scale Holistic Video Understanding, 2020

Analyze the Past - Anticipate the Future, BMVA Symposium on Video Understanding, London, UK, 2019

Anticipating Human Motion, Activities, and Semantic Scene Geometry, Workshop on Anticipating Human Behavior, Bonn, Germany, 2019

Forecasting Activities, Object Interactions, and Semantic Scene Geometry, Computer Vision Laboratory, ETH Zurich, Switzerland, 2019

Analyzing and Anticipating Human Behavior, Workshop on Advanced Robotics and AI Technologies, Bonn, Germany, 2019

Opportunities and Challenges of Deep Learning for Crop Production, IBG-2 Plant Sciences, Forschungszentrum Jülich, Germany, 2019

Recognizing and Anticipating Human Activities, INRIA, Paris, France, 2018

Analyzing and Anticipating Human Behavior in Video Sequences, Huawei Video Intelligence Forum, Trinity College Dublin, Ireland, 2018

Analyzing and Anticipating Human Behavior in Videos, RWTH Aachen, Germany, 2018

Mit Künstlicher Intelligenz Menschliches Verhalten Dekodieren, Sommerfest, Bonn, Germany, 2018

Video Analysis for Studying the Behavior of Humans and Mice, BIGS Neuroscience Summer School, Bonn, Germany, 2018

Open Set Domain Adaptation, Workshop: Imaging and Vision from Theory to Applications, Siegen, Germany, 2018

Künstliche Intelligenz - Zwischen Realität und Science Fiction, Interdisziplinäres Forum Heidelberg, Germany, 2018

Weakly Supervised Learning of Actions, Interdisciplinary Center for Scientific Computing, Heidelberg, Germany, 2018

Künstliche Intelligenz - Zwischen Realität und Science Fiction, Dies Academicus, Bonn, Germany, 2017

Analyzing Human Behavior in Video Sequences, Bosch, Hildesheim, Germany, 2017

Analyzing Human Behavior in Video Sequences, University of Hamburg, Germany, 2017

Analyzing Human Behavior in Video Sequences, Hausdorff Forum for Interaction with Mathematical Sciences, Bonn, Germany, 2017

Weakly Supervised Learning of Actions, Chalearn Looking at People Workshop, Venice, Italy, 2017

Hands and Objects, International Workshop on Observing and Understanding Hands in Action, Venice, Italy, 2017

Analyzing Human Behavior in Video Sequences, Pattern Recognition and Computer Vision Colloquium, Prague, Czech Republic, 2017

Künstliche Intelligenz - Zwischen Realität und Science Fiction, MinD-Akademie, Cologne, Germany, 2017

Recurrent Neural Networks and Open Sets, Deep Learning for Computer Vision, IBFI Schloss Dagstuhl, Germany, 2017

Video Analysis for Studying the Behavior of Humans and Mice, BIGS Neuroscience Summer School, Bonn, Germany, 2017

Analyzing Human Behavior in Video Sequences, Amazon, Berlin, Germany, 2016

Reconstructing Articulated Rigged Models from RGB-D Videos, International Workshop on Recovering 6D Object Pose, Amsterdam, Netherlands, 2016

Estimating Human Pose and Activity, Centre for Applied Autonomous Sensor Systems, Örebro University, Sweden, 2016

Estimating Human Pose and Activity, University of Essex, Colchester, UK, 2016

Capturing Interacting Characters and Hands, Chalmers University of Technology, Gothenburg, Sweden, 2016

Modeling Humans, Objects and their Relations, Perspectives in Computer Vision and Pattern Recognition, Siegen, Germany, 2015

Estimating Pose and Activity, IEEE Workshop on Interaction of Automated Vehicles with other Traffic Participants, Las Palmas, Gran Canaria, Spain, 2015

Capturing Interacting Hands and Objects, IEEE Workshop on Observing and Understanding Hands in Action, Boston, Massachusetts, USA, 2015

A Semantic Occlusion Model for Human Pose Estimation from a Single Depth Image, ChaLearn Looking at People Workshop, Boston, Massachusetts, USA, 2015

Capturing Hands in Action, Computer Vision Laboratory, ETH Zurich, Switzerland, 2015

Human Pose: A Cue for Activities and Objects, Holistic Scene Understanding, IBFI Schloss Dagstuhl, Germany, 2015

Human Pose: A Cue for Activities and Objects, Digital Image Computing: Techniques and Applications (DICTA), Wollongong, Australia, 2014

Classification and Regression Forests - Theory and Applications, ZESS, Siegen, Germany, 2014

Random Forests and their Applications in Computer Vision, German Conference on Pattern Recognition (GCPR), Münster, Germany, 2014

Activities, Objects, and Poses, Computer Vision Laboratory, ETH Zurich, Switzerland, 2014

Action Recognition, Tutorial Towards Solving Real-World Vision Problems with RGB-D Cameras, Columbus, Ohio, USA, 2014

Face Analysis, Tutorial Towards Solving Real-World Vision Problems with RGB-D Cameras, Columbus, Ohio, USA, 2014

Vom Mensch zum Avatar, Bonner Wissenschaftsnacht, Bonn, Germany, 2014

Wie Computer lernen, Bilder zu verstehen, Dies Academicus, Bonn, Germany, 2014

Auf Menschen fokussiertes visuelles Erkennen und Verstehen von Szenen, GIBU 2014: GI-Beirat der Universitätsprofessoren, IBFI Schloss Dagstuhl, Germany, 2014

From Pose to Affordance, International Time of Flight Workshop, Ein-Gedi, Israel, 2014

Is Human Pose an Important Cue for Scene Understanding?, Institute for Computer Graphics and Vision, TU Graz, Austria, 2014

Classification and Regression Forests - Applications, Summer School on Explorative Analysis and Visualization of Large Information Spaces, Gaschurn, Austria, 2013

Classification and Regression Forests - Theory, Summer School on Explorative Analysis and Visualization of Large Information Spaces, Gaschurn, Austria, 2013

Random Forests and their Applications in Computer Vision, Bonn Vision Workshop, Bonn, Germany, 2013

Towards Scene Understanding, Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS, Sankt Augustin, Germany, 2013

Action Recognition, Tutorial Towards Solving Real-World Vision Problems with RGB-D Cameras, Portland, Oregon, USA, 2013

Face Analysis, Tutorial Towards Solving Real-World Vision Problems with RGB-D Cameras, Portland, Oregon, USA, 2013

Hands and Humans in Action, University of Washington, Seattle, WA, USA, 2013

Pose Estimation of Hands and Faces, Institute for Neuro- and Bioinformatics, Lübeck, Germany, 2013

Random Forests for Face Analysis and Body Pose Estimation, Workshop Kinect untangled: from basics to applications, Valencia, Spain, 2013

Hands and Humans in Action, Universitat Politècnica de Catalunya, Barcelona, Spain, 2012

Will Depth Cameras Have a Long-term Impact on Computer Vision Research?, Time-of-Flight Imaging: Algorithms, Sensors and Applications, IBFI Schloss Dagstuhl, Germany, 2012

Hough Forests and their Applications in Computer Vision, University of Bonn, Germany, 2012

Hough Forests for Object Detection, Tracking, and Action Recognition, Maersk Mc-Kinney Moller Institute, University of Southern Denmark, Odense, Denmark, 2011

Real-time Sign Language Letter and Word Recognition from Depth Data, IEEE Workshop on Human Computer Interaction: Real-Time Vision Aspects of Natural User Interfaces (HCI'11), Barcelona, Spain, 2011

Capturing and Modeling Human Interactions, Max Planck Institute for Intelligent Systems, Tübingen, Germany, 2011

Capturing and Modeling Human Interactions, Computer Graphics Group, University of Bonn, Germany, 2011

Objects are More Than Bounding Boxes, Outdoor and Large-Scale Real-World Scene Analysis. 15th Workshop "Theoretical Foundations of Computer Vision", IBFI Schloss Dagstuhl, Germany, 2011

Markerless Motion Capture of Interacting Characters Using Multi-view Image Segmentation (PPTX), IEEE Conference on Computer Vision and Pattern Recognition (CVPR'11), Colorado Springs, CO, USA, 2011

Hough Forests for Object Detection, Tracking, and Action Recognition, Max Planck Symposium on Intelligent Systems, Tübingen, Germany, 2011

Hough Forests for Object Detection, Tracking, and Action Recognition, MMCI, University of Saarland, Saarbrücken, Germany, 2011

Vision-based Human Motion Capture: State-of-the-art and Applications, Computer Vision Winter Workshop, Mitterberg, Austria, 2011

Marker-less Human Motion Capture for In-house Monitoring and Computer Games, Toshiba Cambridge Research Laboratory, Cambridge, UK, 2010

On-line Adaption of Class-specific Codebooks for Instance Tracking, British Machine Vision Conference (BMVC'10), Aberystwyth, UK, 2010

From 2D Object Detection to 3D Pose Estimation, Computer Vision Group, University of California, Berkeley, CA, USA, 2010

Evaluation of Face Motion in Video Sequences (PPT/ZIP, PDF), HERMES Final Review Meeting (PDF, Video), Barcelona, Spain, 2009

Class-Specific Hough Forests for Object Detection (PPT, PDF), IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09), Miami, FL, USA, 2009

Filtering and Optimization Strategies for Marker-less Human Motion Capture with Skeleton-based Shape Models, CVG, ETH Zurich, Switzerland, 2009

Filtering and Optimization Strategies for Marker-less Human Motion Capture with Skeleton-based Shape Models, PERCEPTION, INRIA Grenoble Rhone-Alpes, France, 2009

Filtering and Optimization Strategies for Marker-less Human Motion Capture with Skeleton-based Shape Models, Computer Vision Group, University of Bonn, Germany, 2008

Filtering and Optimization Strategies for Marker-less Human Motion Capture with Skeleton-based Shape Models, BIWI, ETH Zurich, Switzerland, 2008

Model-based Motion Capture for Crash Test Video Analysis, 30th Annual Symposium of the German Association for Pattern Recognition (DAGM'08), Munich, Germany, 2008

Clustered Stochastic Optimization for Object Recognition and Pose Estimation, 29th Annual Symposium of the German Association for Pattern Recognition (DAGM'07), Heidelberg, Germany, 2007

Rendering for Tracking - A Perspective from Computer Vision, Visual Computing - Convergence of Computer Graphics and Computer Vision, IBFI Schloss Dagstuhl, Germany, 2007

Pose Estimation with 3D Textured Models, IEEE Pacific-Rim Symposium on Image and Video Technology (PSIVT'06), Hsinchu, Taiwan, 2006

Learning for Multi-View 3D Tracking in the Context of Particle Filters, Second International Symposium on Visual Computing (ISVC'06), Lake Tahoe, NV, USA, 2006

Generalised Annealed Particle Filter - Impact on Tracking Articulated Objects, SensoMotoric Instruments, Teltow / Berlin, Germany, 2006

Learning a Dynamic Independent Pose Distribution within a Bayesian Framework, Human Motion - Understanding, Modeling, Capture and Animation. 13th Workshop "Theoretical Foundations of Computer Vision", IBFI Schloss Dagstuhl, Germany, 2006

Awards

Marr Prize Honorable Mention, Venice, 2017

German Pattern Recognition Award, Münster, 2014

Nomination for ITG Innovation Award, Zurich, 2012

DAGM Main Prize, Heidelberg, 2007

SMI-DAGM Graduation Award, Berlin, 2006

Senior Foulkes Prize, University of Wales Swansea, 2004

Media Coverage

Technology Review, La Repubblica, ACM TechNews, TechCrunch, Metro, The Register, International Business Times, General-Anzeiger Bonn, SWR Fernsehen, Rhein-Neckar-Zeitung, Einstein (SF1), NewScientist, Planetopia (SAT.1)

Education/Experience

June 2013 - present:
Professor at the University of Bonn, Germany

April 2012 - May 2013:
Senior Research Scientist at the Perceiving Systems Department, Max Planck Institute for Intelligent Systems, Germany

March 2009 - March 2012:
Postdoc at the Computer Vision Laboratory, ETH Zurich, Switzerland

November 2008 - February 2009:
Ph. D. student in Computer Science at the Universität des Saarlandes, Saarbrücken, Germany and the Max-Planck-Institut für Informatik
Title of Ph. D. Thesis: Filtering and Optimization Strategies for Marker-less Human Motion Capture with Skeleton-based Shape Models (supervisors: Prof. Dr. B. Rosenhahn, Prof. Dr. H.-P. Seidel)

August 2008 - October 2008:
Intern at the Machine Learning and Perception Group, Microsoft Research Cambridge, UK

January 2006 - July 2008:
Ph. D. student in Computer Science at the Universität des Saarlandes, Saarbrücken, Germany and the Max-Planck-Institut für Informatik

October 2004 - December 2005:
Studies in Mathematics at the University of Mannheim
Title of Master's Thesis (Diplomarbeit): Generalised Annealed Particle Filter - Mathematical Framework, Algorithms and Applications (supervisors: Prof. Dr. J. Potthoff, Prof. Dr. C. Schnörr) (PDF, PS)

September 2003 - July 2004:
Studies in Mathematics at the University of Wales Swansea, UK
Bachelor of Science in Mathematics with First Class Honours

October 2000 - August 2003:
Studies in Mathematics at the University of Mannheim

June 1999:
Abitur at the Gymnasium, Christian-von-Bomhard Schule Uffenheim, Germany

Researchers/Students/Visitors

Manager for Education and Teaching:
Benedikt Kolbe

Postdocs:
Marius Bock
Lars Doorenbos

Ph. D. students:
Alessio Pittiglio
Sassan Mokhtar
Anas Al-lahham
Enrico Pallotta
Serdar Ozsoy
Syed Talal Wasim
Fatemeh Jabbari
Yanan Luo
Sina Mokhtarzadeh
Hamid Suleman
Federico Spurio
Shuai Li
Emad Bahrami Rad
Olga Zatsarynna
Jinhui Yi

Associated Emmy Noether Group:
France Rose

Alumni:
Mohamad Hakam Shams Eddin (graduated 2026)
Santosh Thoduka (graduated 2026)
Andreas Doering (graduated 2025)
Elena Belén Bueno Benito (visitor 2025)
Sovan Biswas (graduated 2025)
Shi-Jie Li (graduated 2024)
Vadim Sushko (graduated 2024)
Julian Tanke (graduated 2024)
Mohsen Fayyaz (graduated 2024)
Rania Briq (researcher 2018-2023)
Nadine Behrmann (graduated 2023)
Jifeng Wang (researcher 2020-2022)
Laura Romeo (visitor 2022)
Yaser Souri (graduated 2023)
Mian Ahsan Iqbal (researcher 2017-2022)
Yazan Abu Farha (graduated 2022)
Fadime Sener (graduated 2021)
Johann Sawatzky (graduated 2020)
Umer Rafi (postdoc 2019-2020)
Pau Panareda-Busto (graduated 2020)
Martin Garbade (graduated 2019)
Alexander Richard (graduated 2019)
Hildegard Kühne (postdoc 2016-2018)
Umar Iqbal (graduated 2018)
Alejandro Hernandez (visitor 2018)
Mengqi Ji (visitor 2016/2017)
Abhilash Srikantha (graduated 2017)
Dimitrios Tzionas (graduated 2017)
Marko Ristin (graduated 2015)
Matthias Dantone (graduated 2014)
Nima Razavi (graduated 2012)
Uwe Hiemer (intern 2012)
Angela Yao (graduated 2012)
Gabriele Fanelli (graduated 2012)
Adolfo López Méndez (visitor 2012)
Henning Hamer (graduated 2011)