Amir R. Zamir

About Me:

I'm a postdoctoral researcher at Stanford and UC Berkeley. My advisors are Silvio Savarese and Jitendra Malik, and I work closely with Leo Guibas.

My research interests are broadly in computer vision and machine learning, with a focus on transfer, self-supervised, and unsupervised learning and on perception for robotics. The goal of my research is to go beyond task-specific, offline perception methods and move toward compositional, general perception that operates as part of an active intelligent framework in the real world. Here are samples of my research (Taskonomy, Mid-Level Vision, Gibson), community, and teaching work on this topic. I'm also interested in structured prediction, video understanding, and 3D vision.

Selected Honors and Awards:

• CVPR 2018 Best Paper Award, for Taskonomy. [ref]
• CVPR 2016 Best Student Paper Award, for Structural-RNN. [ref]
• NVIDIA Pioneering Research Award 2018, for Gibson Environment. [ref]
• Stanford Inst. for Computational and Mathematical Engineering Seed Award. [ref]
• 2014 UCF Graduate Research Forum Award.
• National Geospatial-Intelligence Agency, 2013 NARP-SW Award.
• UCF Research Excellence Fellowship.

Selected Projects:

• Taskonomy: Transfer Learning
• Mid-Level Vision for Robotics
• Gibson Environment
• Out-of-the-Box Visual Sim-to-Real
• Feedback Networks

Latest News:

Feb. 2019: Co-organizing the Computer Vision for Global Challenges workshop at CVPR 2019.

Feb. 2019: Co-organizing ICML 2019 Workshop on Self-Supervised Learning, with Aaron van den Oord, Yusuf Aytar, Carl Doersch, Carl Vondrick, Alec Radford, Pierre Sermanet, and Pieter Abbeel.

Jun. 2018: Paper in CVPR18: Taskonomy: Disentangling Task Transfer Learning. [Best Paper Award].
A. Zamir, S. Sax*, W. Shen*, L. Guibas, J. Malik, S. Savarese.
[ Transfer Learning API | Live Demo | Trained Models | Website | Paper].

Feb. 2018: Paper in CVPR18: Gibson Env: Real-World Perception for Embodied Agents. [Spotlight Oral],[NVIDIA Pioneering Research Award].
A. R. Zamir*, F. Xia*, J. He*, S. Sax, J. Malik, S. Savarese.
Real-world learning environments and agents for perceptual robotics.
[ Gibson Environments | Github | Paper].

Mar. 2017: With J. Malik, A. Efros, A. Gupta, and S. Savarese, we started the Beyond Supervised Learning workshop series, to be inaugurated at ICCV 2017.

Mar. 2017: Paper in CVPR17: Feedback Networks, A. Zamir*, T. Wu*, L. Sun, WB. Shen, B. Shi, J. Malik, S. Savarese.
[ Project Website | PDF ].

Jun. 2016: Received CVPR Best Student Paper Award for Structural-RNN.

Jun. 2016: Co-organizing the First Workshop on Negative Results in Computer Vision at CVPR 2017, along with A. Torralba, W. Freeman, D. Forsyth, A. Efros, R. Sukthankar, A. Oliva, O. Sener, and J. Malik.

Jun. 2016: Paper in ECCV16: Generic 3D Representation via Pose Estimation and Matching, Zamir et al.
[ 3D Representation Website | PDF | Dataset Repository ]

Jun. 2016: I will be teaching the CS331B: Representation Learning in Computer Vision course at Stanford starting autumn 2016 (with Silvio Savarese).
[see Course Webpage | Stanford ExploreCourses ]

Jun. 2016: With A. Hakeem, L. Van Gool, M. Shah, and R. Szeliski, we published the book Large-Scale Visual Geo-Localization, with Springer.
[Front Matter | Cover | Springer Page]

Jun. 2016: Selected as a recipient of the Institute for Computational and Mathematical Engineering Seed Award for our work on Generic Representation Learning. Thank you, Stanford ICME and NVIDIA GPU Center of Excellence!

Mar. 2016: Two papers accepted to CVPR16:
1) Structural-RNN: Deep Learning on Spatio-Temporal Graphs [Best Student Paper Award]
2) 3D Semantic Parsing of Large-Scale Indoor Spaces [Oral]
[see Building Parser website]

Selected Publications:

Taskonomy: Disentangling Task Transfer Learning,
Amir R. Zamir, Sasha Sax*, William Shen*, Leonidas Guibas, Jitendra Malik, Silvio Savarese,
In CVPR, 2018 [Best Paper Award]
In IJCAI, 2019 [Invited Paper, Sister Conference Best Papers Track]
[Transfer Learning API | Live Demo | Website | Paper]

Gibson Env: Real-World Perception for Embodied Agents,
Amir R. Zamir*, Fei Xia*, Jerry He*, Sasha Sax, Jitendra Malik, Silvio Savarese,
In CVPR, 2018 - [Spotlight Oral],[NVIDIA Pioneering Research Award]
[Gibson Environments | Github | Website | Paper]

Patent: Systems and Methods for Performing Three-Dimensional Semantic Parsing of Indoor Spaces,
Iro Armeni, Ozan Sener, Amir R. Zamir, Martin Fischer, Silvio Savarese,
US Patent App. 5/619,422, 2017.
[Link]

Feedback Networks,
Amir R. Zamir*, Te-Lin Wu*, Lin Sun, William B. Shen, Bertram Shi, Jitendra Malik, Silvio Savarese,
In CVPR, 2017.
[PDF | Project Page]

Generic 3D Representation via Pose Estimation and Matching,
Amir R. Zamir, Pulkit Agrawal, Tilman Wekel, Jitendra Malik, Silvio Savarese,
In ECCV, 2016.
[PDF | 3DRepresentation website | Dataset]

Structural-RNN: Deep Learning on Spatio-Temporal Graphs, Ashesh Jain, Amir R. Zamir, Silvio Savarese, Ashutosh Saxena,
In CVPR, 2016 [Best Student Paper Award]
[PDF | Project Page ]

3D Semantic Parsing of Large-Scale Indoor Spaces, Iro Armeni, Ozan Sener, Amir R. Zamir, Martin Fischer, Silvio Savarese,
In CVPR, 2016 - [Oral] (acceptance rate ~3%)
[PDF | 3D PC Parser website (Demo, Code, Data)]

Book: Large-Scale Visual Geo-Localization,
Amir R. Zamir, Asaad Hakeem, Luc Van Gool, Mubarak Shah, Richard Szeliski,
Springer, 2016 [Front Matter | Cover | Springer Page]

The THUMOS Challenge on Action Recognition for Videos "in the Wild", Haroon Idrees, Amir R. Zamir, Yu-Gang Jiang, Alex Gorban, Ivan Laptev, Rahul Sukthankar, Mubarak Shah,
In Computer Vision and Image Understanding (CVIU), 2016 [PDF | Project Page ]

Unsupervised Semantic Parsing of Video Collections, Ozan Sener, Amir R. Zamir, Silvio Savarese, Ashutosh Saxena,
In Proceedings of International Conference on Computer Vision (ICCV), 2015 [PDF | Project Page ]

Action Recognition by Hierarchical Mid-level Action Elements, Tian Lan, Yuke Zhu, Amir R. Zamir, Silvio Savarese,
In Proceedings of International Conference on Computer Vision (ICCV), 2015 [PDF | Project Page | 1 min Summary]

DaMN - Discriminative and Mutually Nearest: Exploiting Pairwise Category Proximity for Video Action Recognition, Rui Hou, Amir R. Zamir, Rahul Sukthankar, and Mubarak Shah,
In Proceedings of European Conference on Computer Vision (ECCV), 2014 [PDF | BibTeX | Project Page | 1 min Summary]
@inproceedings{DaMN_2014,
   Author = {Hou, R. and Roshan Zamir, A. and Sukthankar, R. and Shah, M.},
   Booktitle = {Proceedings of the European Conference on Computer Vision ({ECCV})},
   Title = {{DaMN -- Discriminative and Mutually Nearest}: Exploiting Pairwise Category Proximity for Video Action Recognition},
   Year = {2014}}

GIS-Assisted Object Detection and Geospatial Localization, Shervin Ardeshir, Amir R. Zamir and Mubarak Shah,
In Proceedings of European Conference on Computer Vision (ECCV), 2014 [PDF | BibTeX | Project Page | 1 min Summary]
@inproceedings{GIS_Assisted_ECCV14,
   Author = { Ardeshir, S. and Roshan Zamir, A. and Shah, M.},
   Booktitle = {Proceedings of the European Conference on Computer Vision ({ECCV})},
   Title = {{GIS}-Assisted Object Detection and Geospatial Localization},
   Year = {2014}}

GPS-Tag Refinement using Random Walks with an Adaptive Damping Factor, Amir R. Zamir, Shervin Ardeshir and Mubarak Shah,
In Proceedings of IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), June 2014. [PDF | 1 min Summary | 20 min Presentation | BibTeX | Project Page]
@inproceedings{ZamirCVPR14,
   Author = {Roshan Zamir, A. and Ardeshir, S. and Shah, M.},
   Booktitle = {27th IEEE International Conference on Computer Vision and Pattern Recognition (CVPR)},
   Title = {GPS-Tag Refinement using Random Walks with an Adaptive Damping Factor},
   Year = {2014}}

Video Classification using Semantic Concept Co-occurrences, Shayan Modiri, Amir R. Zamir and Mubarak Shah,
In Proceedings of IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), June 2014. [PDF | 1 min Summary | BibTeX | Project Page]
@inproceedings{GMCP_Classification,
   Author = {Modiri, S. and Roshan Zamir, A. and Shah, M.},
   Booktitle = {27th IEEE International Conference on Computer Vision and Pattern Recognition (CVPR)},
   Title = {Video Classification using Semantic Concept Co-occurrences},
   Year = {2014}}

Invited Book Chapter: "Action Recognition in Realistic Sports Videos", Khurram Soomro and Amir R. Zamir,
In Computer Vision in Sports, Springer, 2014. [PDF | BibTeX]
@incollection{ActionRecognitionSports_2014Springer,
   title={Action Recognition in Realistic Sports Videos},
   author={Soomro, Khurram and Zamir, Amir},
   booktitle={Computer Vision in Sports},
   year={2014},
   publisher={Springer}}

Image Geo-localization Based on Multiple Nearest Neighbor Feature Matching using Generalized Graphs, Amir R. Zamir and Mubarak Shah,
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2014 [Preprint PDF | BibTeX | Web Page]
@article{6710175,
author={Zamir, A.R. and Shah, M.},
journal={Pattern Analysis and Machine Intelligence, IEEE Transactions on},
title={Image Geo-localization Based on Multiple Nearest Neighbor Feature Matching using Generalized Graphs},
year={2014},
volume={PP},
number={99},
pages={1-1},
keywords={Generalized Minimum Clique Problem (GMCP);Generalized Minimum Spanning Tree (GMST);Geo-location;feature correspondence;feature matching;generalized graphs;image localization;multiple nearest neighbor feature matching},
doi={10.1109/TPAMI.2014.2299799},
ISSN={0162-8828},}

Visual Business Recognition - A Multimodal Approach, Amir R. Zamir, Afshin Dehghan and Mubarak Shah,
In Proceedings of ACM International Conference on Multimedia (ACM MM), 2013 [PDF | Video | BibTeX | Project Page]
@inproceedings{ZamirACMMM13,
   Author = {Roshan Zamir, A. and Dehghan, A. and Shah, M.},
   Booktitle = {Proceedings of the ACM International Conference on Multimedia ({ACM MM})},
   Title = {{Visual Business Recognition} - A Multimodal Approach},
   Year = {2013}}

GMCP-Tracker: Global Multi-object Tracking using Generalized Minimum Clique Graphs, Amir R. Zamir, Afshin Dehghan and Mubarak Shah,
In Proceedings of European Conference on Computer Vision (ECCV), 2012 [PDF | Project Page | 20 min Presentation | BibTeX ]
@inproceedings{ZamirECCV12,
   Author = {Roshan Zamir, A. and Dehghan, A. and Shah, M.},
   Booktitle = {Proceedings of the European Conference on Computer Vision ({ECCV})},
   Title = {{GMCP-Tracker}: Global Multi-object Tracking using Generalized Minimum Clique Graphs},
   Year = {2012}}

City Scale Geo-spatial Trajectory Estimation of a Moving Camera, Gonzalo Vaca, Amir R. Zamir and Mubarak Shah,
In Proceedings of IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), June 2012 [PDF | BibTeX | Project Page]
@inproceedings{VacaZamir12,
   Author = {Vaca, G. and Roshan Zamir, A. and Shah, M.},
   Booktitle = {25th IEEE International Conference on Computer Vision and Pattern Recognition (CVPR)},
   Title = {City Scale Geo-spatial Trajectory Estimation of a Moving Camera},
   Year = {2012}}

Accurate Image Localization Based on Google Maps Street View, Amir R. Zamir and Mubarak Shah,
In Proceedings of European Conference on Computer Vision (ECCV), 2010 [PDF | BibTeX | Project Page]
@inproceedings{Zamir10,
   Author = {Roshan Zamir, A. and Shah, M.},
   Booktitle = {Proceedings of the European Conference on Computer Vision ({ECCV})},
   Title = {Accurate Image Localization Based on Google Maps Street View},
   Year = {2010}}

Recognition of 101 human actions from videos in the wild, Khurram Soomro, Amir R. Zamir and Mubarak Shah,
In arXiv preprint arXiv:1212.0402, November, 2012. [PDF | BibTeX | Project Page | PDF2]
@inproceedings{UCF101,
   Author = {Soomro, K. and R. Zamir, A. and Shah, M.},
   Booktitle = {arXiv preprint arXiv:1212.0402},
   Title = {{UCF101}: A Dataset of 101 Human Actions Classes From Videos in The Wild},
   Year = {2012}}

Automatic Detection and Tracking of Pedestrians in Videos with Various Crowd Densities, Afshin Dehghan, Haroon Idrees, Amir R. Zamir and Mubarak Shah,
In Proceedings of PED, June 2012 [PDF | BibTeX | Project Page]
@incollection{Dehghan2014PED,
year={2014},
isbn={978-3-319-02446-2},
booktitle={Pedestrian and Evacuation Dynamics 2012},
editor={Weidmann, Ulrich and Kirsch, Uwe and Schreckenberg, Michael},
doi={10.1007/978-3-319-02447-9_1},
title={Automatic Detection and Tracking of Pedestrians in Videos with Various Crowd Densities},
url={http://dx.doi.org/10.1007/978-3-319-02447-9_1},
publisher={Springer International Publishing},
keywords={Human detection; Tracking; Data association; Crowd density; Crowd analysis; Automatic surveillance},
author={Dehghan, Afshin and Idrees, Haroon and Zamir, AmirRoshan and Shah, Mubarak},
pages={3-19},
language={English}}

Street View Challenge: Identification of Commercial Entities in Street View Imagery, Amir R. Zamir, Alexander Darino, Ryan Patrick and Mubarak Shah,
In Proceedings of ICMLA, 2011

Contact

Email:
zamir AT cs.stanford.edu
zamir AT eecs.berkeley.edu

Address:
Gates Computer Science, #133
353 Serra Mall
Stanford, CA 94305




Press Coverage