Publications
Please see Google Scholar for more recent works and arXiv papers.
2025
ICML
WOMD-Reasoning: A Large-Scale Language Dataset for Interaction and Driving Intentions Reasoning. International Conference on Machine Learning (ICML), 2025
ICML
Hyper: Hyperparameter Robust Efficient Exploration in Reinforcement Learning. International Conference on Machine Learning (ICML), 2025
2024
NeurIPS
Localization and recognition of human action in 3D using transformers. Communications Engineering, 2024
TPAMI
Spatial Steerability of GANs via Self-Supervision from Discriminator. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
Unsupervised Discovery of Steerable Factors When Graph Deep Generative Models Are Entangled. Transactions on Machine Learning Research, 2024
TPAMI
In-Domain GAN Inversion for Faithful Reconstruction and Editability. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
2023
ICLR Spotlight
Guarded Policy Optimization with Imperfect Online Demonstrations. International Conference on Learning Representations (ICLR Spotlight), 2023
ICRA
V2XP-ASG: Generating Adversarial Scenes for Vehicle-to-Everything Perception. IEEE International Conference on Robotics and Automation (ICRA), 2023
TPAMI
GH-Feat: Learning Versatile Generative Hierarchical Features From GANs. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
2022
NeurIPS
Optimistic Curiosity Exploration and Conservative Exploitation with Linear Reward Shaping. Neural Information Processing Systems (NeurIPS), 2022
CoRL
CoBEVT: Cooperative Bird’s Eye View Semantic Segmentation with Sparse Transformers. Conference on Robot Learning (CoRL), 2022
CVPR
Learning Hierarchical Cross-Modal Association for Co-Speech Gesture Generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
CVPR Oral
Cross-Model Pseudo-Labeling for Semi-Supervised Action Recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR Oral), 2022
AAAI
Visual Sound Localization in the Wild by Cross-Modal Interference Erasing. AAAI Conference on Artificial Intelligence (AAAI), 2022
AAAI
SimIPU: Simple 2D Image and 3D Point Cloud Unsupervised Pre-Training for Spatial-Aware Visual Representations. AAAI Conference on Artificial Intelligence (AAAI), 2022
IJCV
Disentangled inference for GANs with latently invertible autoencoder. International Journal of Computer Vision (IJCV), 2022
TPAMI
GAN inversion: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
2021
CVPR
Instance localization for self-supervised detection pretraining. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021
CVPR
Positional encoding as spatial inductive bias in GANs. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021
HDSR
DeepMiner: Discovering interpretable representations for mammogram classification and explanation. Harvard Data Science Review (HDSR), 2021
TIP
Texture Memory-Augmented Deep Patch-Based Image Inpainting. IEEE Transactions on Image Processing (TIP), 2021
Deep Learning for Scene Classification: A Survey. arXiv preprint arXiv:2101.10531, 2021
Unsupervised Image Transformation Learning via Generative Adversarial Networks. arXiv preprint arXiv:2103.07751, 2021
AAAI
HiABP: Hierarchical Initialized ABP for Unsupervised Representation Learning. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2021
2020
CoRL
Learning Driving Decisions by Imitating Drivers’ Control Behaviors. Conference on Robot Learning (CoRL), 2020
AAAI
Every frame counts: joint learning of video segmentation and optical flow. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2020
ECCV
A unified framework for shot type classification based on subject centric lens. In Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XI 16 (ECCV), 2020
Improving the Fairness of Deep Generative Models without Retraining. arXiv preprint arXiv:2012.04842, 2020
Neuro-symbolic program search for autonomous driving decision module design. Conference on Robot Learning, 2020
2019
Discovering place-informative scenes and objects using social media photos. Royal Society Open Science, 2019
Comparing the interpretability of deep networks via network dissection. In Explainable AI: Interpreting, Explaining and Visualizing Deep Learning, 2019
NeurIPS Spotlight
Policy Continuation with Hindsight Inverse Dynamics. In Advances in Neural Information Processing Systems (NeurIPS Spotlight), 2019
ICCV Oral
A graph-based framework to bridge movies and synopses. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV Oral), 2019
2018
Expert identification of visual primitives used by CNNs during mammogram classification. In Medical Imaging 2018: Computer-Aided Diagnosis, 2018
CVPR
Recurrent residual module for fast inference in videos. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018
Revisiting the importance of individual units in CNNs via ablation. arXiv preprint arXiv:1806.02891, 2018
ECCV
Factorizable net: an efficient subgraph-based framework for scene graph generation. In Proceedings of the European Conference on Computer Vision (ECCV), 2018
ECCV
Single image intrinsic decomposition without a single intrinsic image. In Proceedings of the European Conference on Computer Vision (ECCV), 2018
Measuring human perceptions of a large-scale urban region using machine learning. Landscape and Urban Planning, 2018
FaceFeat-GAN: a two-stage approach for identity-preserving face synthesis. arXiv preprint arXiv:1812.01288, 2018
2017
CVPR
Person search with natural language description. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017
IROS Oral
SegICP: Integrated deep semantic segmentation and pose estimation. In 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (IROS Oral), 2017
2016
AISTATS Oral
Optimization as estimation with Gaussian processes in bandit settings. In Artificial Intelligence and Statistics (AISTATS Oral), 2016
C-IMAGE: city cognitive mapping through geo-tagged photos. GeoJournal, 2016
2015
CVPR
ConceptLearner: Discovering visual concepts from weakly labeled image collections. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015
ICLR Oral
Object detectors emerge in deep scene CNNs. In International Conference on Learning Representations (ICLR Oral), 2015
2014
ECCV
Recognizing city identity via attribute analysis of geo-tagged images. In European Conference on Computer Vision (ECCV), 2014