Amir R. Zamir

About Me:

I'm a postdoctoral researcher at Stanford and UC Berkeley with Silvio Savarese and Jitendra Malik. I also work closely with Leonidas Guibas.

My research interests are broadly in computer vision, machine learning, and AI, with a focus on transfer/self-supervised/unsupervised learning and perception-for-robotics. The goal of my research is to go beyond narrow, offline perception methods and move toward general multi-task visual perception that operates as part of an active body in the real world. Here are samples of my research (Taskonomy, Mid-Level Vision, Gibson), community, and teaching work on this topic. I'm also interested in structured prediction, video understanding, and 3D vision.

Selected Honors and Awards:

• CVPR 2018 Best Paper Award, for Taskonomy. [ref]
• CVPR 2016 Best Student Paper Award, for structural-RNN. [ref]
• NVIDIA Pioneering Research Award 2018, for Gibson Environment. [ref]
• Stanford Inst. for Computational and Mathematical Engineering Seed Award. [ref]
• 2014 UCF Graduate Research Forum Award.
• UCF Research Excellence Fellowship.
• Winner of CVPR19 Habitat Embodied Agents Challenge. [ref]

Selected Projects:

• Taskonomy: Transfer Learning
• Mid-Level Vision for Robotics
• Gibson Environment
• Out-of-the-Box Visual Sim-to-Real
• Feedback Networks

Latest News:

Aug. 2019: Invited talk at the IJCAI main conference (August 10-16, 2019, Macao, China) on multi-task visual perception.

Jun. 2019: We won the CVPR19 Habitat Embodied Agents Challenge. Mid-Level Vision Team, RGB Track. [Link]

Jun. 2019: Three invited talks in CVPR 2019 workshops:
• Title: Learning to Navigate using Mid-level Visual Priors: Enhanced Generalization and Sample Efficiency. In "Deep Learning for Semantic Visual Navigation" workshop.
• Title: Transfer Learning for Multi-Task Perception and Robotics. In "Perception Beyond the Visible Spectrum" workshop.
• Title: Collecting Large-scale Densely-labeled 3D Data from the Real World Without a Single Click. In "Image Matching: Local Features and Beyond" workshop.

May 2019: Will be an Area Chair of CVPR 2020.

Feb. 2019: Co-organizing Computer Vision for Global Challenges in CVPR 2019.

Feb. 2019: Co-organizing ICML 2019 Workshop on Self-Supervised Learning, with Aaron van den Oord, Yusuf Aytar, Carl Doersch, Carl Vondrick, Alec Radford, Pierre Sermanet, and Pieter Abbeel.

Jun. 2018: Paper in CVPR18: Taskonomy: Disentangling Task Transfer Learning. [Best Paper Award]
A. Zamir, S. Sax*, W. Shen*, L. Guibas, J. Malik, S. Savarese.
[Transfer Learning API | Live Demo | Trained Models | Website | Paper]

Workshop/Conference Organization & Teaching:

CVPR 2020, Area Chair.
Computer Vision for Global Challenges workshop, in CVPR 2019, co-organizer.
Self-Supervised Learning workshop, in ICML 2019, co-organizer.
Visual Learning and Embodied Agents in Simulation Environments workshop, in ECCV 2018, co-organizer.
Beyond Supervised Learning workshop, in CVPR 2018, ICCV 2017, co-organizer.
Negative Results in Computer Vision workshop, in CVPR 2017, co-organizer.
Geo-Spatial Computer Vision workshop, in CVPR 2016, co-organizer.
THUMOS Action Recognition Challenge workshop, in ICCV 2013, ECCV 2014, CVPR 2015, co-organizer.
3DV 2016, Workshops and Tutorials Chair.
Teaching: Co-Instructing CS331B: Representation Learning in Computer Vision course at Stanford since Autumn 2016 (co-instructed with Silvio Savarese).

Selected Publications:

Taskonomy: Disentangling Task Transfer Learning,
Amir R. Zamir, Sasha Sax*, William Shen*, Leonidas Guibas, Jitendra Malik, Silvio Savarese,
In CVPR, 2018 [Best Paper Award]
In IJCAI, 2019 [Invited Paper, Sister Conference Best Papers Track]
[Transfer Learning API | Live Demo | Website | Paper]

Gibson Env: Real-World Perception for Embodied Agents,
Amir R. Zamir*, Fei Xia*, Jerry He*, Sasha Sax, Jitendra Malik, Silvio Savarese,
In CVPR, 2018 - [Spotlight Oral],[NVIDIA Pioneering Research Award]
[Gibson Environments | Github | Website | Paper]

Patent: Systems and Methods for Performing Three-Dimensional Semantic Parsing of Indoor Spaces,
Iro Armeni, Ozan Sener, Amir R. Zamir, Martin Fischer, Silvio Savarese,
US Patent App. 5/619,422, 2017.
[Link]

Feedback Networks,
Amir R. Zamir*, Te-Lin Wu*, Lin Sun, William B. Shen, Bertram Shi, Jitendra Malik, Silvio Savarese,
In CVPR, 2017.
[PDF | Project Page]

Generic 3D Representation via Pose Estimation and Matching,
Amir R. Zamir, Pulkit Agrawal, Tilman Wekel, Jitendra Malik, Silvio Savarese,
In ECCV, 2016.
[PDF | 3DRepresentation website | Dataset]

Structural-RNN: Deep Learning on Spatio-Temporal Graphs, Ashesh Jain, Amir R. Zamir, Silvio Savarese, Ashutosh Saxena,
In CVPR, 2016 [Best Student Paper Award]
[PDF | Project Page ]

3D Semantic Parsing of Large-Scale Indoor Spaces, Iro Armeni, Ozan Sener, Amir R. Zamir, Martin Fischer, Silvio Savarese,
In CVPR, 2016 - [Oral] (acceptance rate ~3%)
[PDF | 3D PC Parser website (Demo, Code, Data)]

Book: Large-Scale Visual Geo-Localization,
Amir R. Zamir, Asaad Hakeem, Luc Van Gool, Mubarak Shah, Richard Szeliski,
Springer, 2016 [Front Matter | Cover | Springer Page]

The THUMOS Challenge on Action Recognition for Videos "in the Wild", Haroon Idrees, Amir R. Zamir, Yu-Gang Jiang, Alex Gorban, Ivan Laptev, Rahul Sukthankar, Mubarak Shah,
In Computer Vision and Image Understanding (CVIU), 2016 [PDF | Project Page ]

Unsupervised Semantic Parsing of Video Collections, Ozan Sener, Amir R. Zamir, Silvio Savarese, Ashutosh Saxena,
In Proceedings of International Conference on Computer Vision (ICCV), 2015 [PDF | Project Page ]

Action Recognition by Hierarchical Mid-level Action Elements, Tian Lan, Yuke Zhu, Amir R. Zamir, Silvio Savarese,
In Proceedings of International Conference on Computer Vision (ICCV), 2015 [PDF | Project Page | 1 min Summary]

DaMN - Discriminative and Mutually Nearest: Exploiting Pairwise Category Proximity for Video Action Recognition, Rui Hou, Amir R. Zamir, Rahul Sukthankar, and Mubarak Shah,
In Proceedings of European Conference on Computer Vision (ECCV), 2014 [PDF | BibTeX | Project Page | 1 min Summary]
@inproceedings{DaMN_2014,
   Author = {Hou, R. and Roshan Zamir, A. and Sukthankar, R. and Shah, M.},
   Booktitle = {Proceedings of the European Conference on Computer Vision ({ECCV})},
   Title = {{DaMN -- Discriminative and Mutually Nearest}: Exploiting Pairwise Category Proximity for Video Action Recognition},
   Year = {2014}}

GIS-Assisted Object Detection and Geospatial Localization, Shervin Ardeshir, Amir R. Zamir and Mubarak Shah,
In Proceedings of European Conference on Computer Vision (ECCV), 2014 [PDF | BibTeX | Project Page | 1 min Summary]
@inproceedings{GIS_Assisted_ECCV14,
   Author = { Ardeshir, S. and Roshan Zamir, A. and Shah, M.},
   Booktitle = {Proceedings of the European Conference on Computer Vision ({ECCV})},
   Title = {{GIS}-Assisted Object Detection and Geospatial Localization},
   Year = {2014}}

GPS-Tag Refinement using Random Walks with an Adaptive Damping Factor, Amir R. Zamir, Shervin Ardeshir and Mubarak Shah,
In Proceedings of IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), June 2014 [PDF | 1 min Summary | 20 min Presentation | BibTeX | Project Page]
@inproceedings{ZamirCVPR14,
   Author = {Roshan Zamir, A. and Ardeshir, S. and Shah, M.},
   Booktitle = {27th IEEE International Conference on Computer Vision and Pattern Recognition (CVPR)},
   Title = {GPS-Tag Refinement using Random Walks with an Adaptive Damping Factor},
   Year = {2014}}

Video Classification using Semantic Concept Co-occurrences, Shayan Modiri, Amir R. Zamir and Mubarak Shah,
In Proceedings of IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), June 2014 [PDF | 1 min Summary | BibTeX | Project Page]
@inproceedings{GMCP_Classification,
   Author = {Modiri, S. and Roshan Zamir, A. and Shah, M.},
   Booktitle = {27th IEEE International Conference on Computer Vision and Pattern Recognition (CVPR)},
   Title = {Video Classification using Semantic Concept Co-occurrences},
   Year = {2014}}

Invited Book Chapter: "Action Recognition in Realistic Sports Videos", Khurram Soomro and Amir R. Zamir,
in Computer Vision in Sports, Springer, 2014. [PDF | BibTeX ]
@incollection{ActionRecognitionSports_2014Springer,
   title={Action Recognition in Realistic Sports Videos},
   author={Soomro, Khurram and Zamir, Amir},
   booktitle={Computer Vision in Sports},
   year={2014},
   publisher={Springer}}

Image Geo-localization Based on Multiple Nearest Neighbor Feature Matching using Generalized Graphs, Amir R. Zamir and Mubarak Shah,
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2014 [Preprint PDF | BibTeX | Web Page]
@article{6710175,
author={Zamir, A.R. and Shah, M.},
journal={Pattern Analysis and Machine Intelligence, IEEE Transactions on},
title={Image Geo-localization Based on Multiple Nearest Neighbor Feature Matching using Generalized Graphs},
year={2014},
volume={PP},
number={99},
pages={1-1},
keywords={Generalized Minimum Clique Problem (GMCP);Generalized Minimum Spanning Tree (GMST);Geo-location;feature correspondence;feature matching;generalized graphs;image localization;multiple nearest neighbor feature matching},
doi={10.1109/TPAMI.2014.2299799},
ISSN={0162-8828},}

Visual Business Recognition - A Multimodal Approach, Amir R. Zamir, Afshin Dehghan and Mubarak Shah,
In Proceedings of ACM International Conference on Multimedia (ACM MM), 2013 [PDF | Video | BibTeX | Project Page]
@inproceedings{ZamirACMMM13,
   Author = {Roshan Zamir, A. and Dehghan, A. and Shah, M.},
   Booktitle = {Proceedings of ACM International Conference on Multimedia ({ACM MM})},
   Title = {{Visual Business Recognition} - A Multimodal Approach},
   Year = {2013}}

GMCP-Tracker: Global Multi-object Tracking using Generalized Minimum Clique Graphs, Amir R. Zamir, Afshin Dehghan and Mubarak Shah,
In Proceedings of European Conference on Computer Vision (ECCV), 2012 [PDF | Project Page | 20 min Presentation | BibTeX ]
@inproceedings{ZamirECCV12,
   Author = {Roshan Zamir, A. and Dehghan, A. and Shah, M.},
   Booktitle = {Proceedings of the European Conference on Computer Vision ({ECCV})},
   Title = {{GMCP-Tracker}: Global Multi-object Tracking using Generalized Minimum Clique Graphs},
   Year = {2012}}

City Scale Geo-spatial Trajectory Estimation of a Moving Camera, Gonzalo Vaca, Amir R. Zamir and Mubarak Shah,
In Proceedings of IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), June 2012 [PDF | BibTeX | Project Page]
@inproceedings{VacaZamir12,
   Author = {Vaca, G. and Roshan Zamir, A. and Shah, M.},
   Booktitle = {25th IEEE International Conference on Computer Vision and Pattern Recognition (CVPR)},
   Title = {City Scale Geo-spatial Trajectory Estimation of a Moving Camera},
   Year = {2012}}

Accurate Image Localization Based on Google Maps Street View, Amir R. Zamir and Mubarak Shah,
In Proceedings of European Conference on Computer Vision (ECCV), 2010 [PDF | BibTeX | Project Page]
@inproceedings{Zamir10,
   Author = {Roshan Zamir, A. and Shah, M.},
   Booktitle = {Proceedings of the European Conference on Computer Vision ({ECCV})},
   Title = {Accurate Image Localization Based on Google Maps Street View},
   Year = {2010}}

UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild, Khurram Soomro, Amir R. Zamir and Mubarak Shah,
In arXiv preprint arXiv:1212.0402, November, 2012. [PDF | BibTeX | Project Page | PDF2]
@inproceedings{UCF101,
   Author = {Soomro, K. and R. Zamir, A. and Shah, M.},
   Booktitle = {arXiv preprint arXiv:1212.0402},
   Title = {{UCF101}: A Dataset of 101 Human Actions Classes From Videos in The Wild},
   Year = {2012}}

Automatic Detection and Tracking of Pedestrians in Videos with Various Crowd Densities, Afshin Dehghan, Haroon Idrees, Amir R. Zamir and Mubarak Shah,
In Proceedings of Pedestrian and Evacuation Dynamics (PED), June 2012 [PDF | BibTeX | Project Page]
@incollection{
year={2014},
isbn={978-3-319-02446-2},
booktitle={Pedestrian and Evacuation Dynamics 2012},
editor={Weidmann, Ulrich and Kirsch, Uwe and Schreckenberg, Michael},
doi={10.1007/978-3-319-02447-9_1},
title={Automatic Detection and Tracking of Pedestrians in Videos with Various Crowd Densities},
url={http://dx.doi.org/10.1007/978-3-319-02447-9_1},
publisher={Springer International Publishing},
keywords={Human detection; Tracking; Data association; Crowd density; Crowd analysis; Automatic surveillance},
author={Dehghan, Afshin and Idrees, Haroon and Zamir, AmirRoshan and Shah, Mubarak},
pages={3-19},
language={English}}

Street View Challenge: Identification of Commercial Entities in Street View Imagery, Amir R. Zamir, Alexander Darino, Ryan Patrick and Mubarak Shah,
In Proceedings of ICMLA, 2011

Contact

Email:
zamir AT cs.stanford.edu
zamir AT eecs.berkeley.edu

Address:
Gates Computer Science, #133
353 Serra Mall
Stanford, CA 94305




Press Coverage