Shyamal Buch

I'm a PhD student in Computer Science at Stanford University, with the Stanford Vision and Learning Lab. My research focuses on methods for efficiently understanding events and activities in videos, images, and natural language. I am grateful to be supported by an NDSEG Fellowship.

Contact: shyamal (at) cs (dot) stanford (dot) edu

Other: [Google Scholar] [GitHub]

Publications (Please see Google Scholar for the full list)

Revisiting the "Video" in Video-Language Understanding
Shyamal Buch, Cristóbal Eyzaguirre, Adrien Gaidon, Jiajun Wu, Li Fei-Fei, Juan Carlos Niebles
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022
(Oral Presentation)
website / paper / bibtex

Neural Event Semantics for Grounded Language Understanding
Shyamal Buch, Li Fei-Fei, Noah D. Goodman
Transactions of the Association for Computational Linguistics (TACL), 2021 (Journal)
(ACL 2021, Oral Presentation)
website / paper / bibtex

BEHAVIOR: Benchmark for Everyday Household Activities in Virtual, Interactive, and Ecological Environments
{Sanjana Srivastava*, Chengshu Li*, Michael Lingelbach*, Roberto Martin-Martin*}, Fei Xia, Kent Vainio, Zheng Lian, Cem Gokmen, Shyamal Buch, C. Karen Liu, Silvio Savarese, Hyowon Gweon, Jiajun Wu, Li Fei-Fei
{*} = equal contribution, lead author
Conference on Robot Learning (CoRL), 2021
website / paper / bibtex

On the Opportunities and Risks of Foundation Models
Rishi Bommasani*, ... (full list of authors) ..., Percy Liang*
§ 2.2 (Vision): Shyamal Buch, Drew A. Hudson, Frieda Rong, Alex Tamkin, Xikun Zhang, Bohan Wu, Ehsan Adeli, Stefano Ermon, Ranjay Krishna, Juan Carlos Niebles, Jiajun Wu, Li Fei-Fei
{*} = equal contribution author
Stanford CRFM Report, 2021
website / paper / bibtex / commentaries / reflections

iGibson, a Simulation Environment for Interactive Tasks in Large Realistic Scenes
{Bokui Shen*, Fei Xia*, Chengshu Li*, Roberto Martin-Martin*}, Linxi Fan, Guanzhi Wang, Claudia D'Arpino, Shyamal Buch, Sanjana Srivastava, Lyne P. Tchapmi, Micael E. Tchapmi, Kent Vainio, Silvio Savarese, Li Fei-Fei
{*} = equal contribution, lead author
International Conference on Intelligent Robots and Systems (IROS), 2021
website / paper / bibtex

RubiksNet: Learnable 3D-Shift for Efficient Video Action Recognition
{Linxi Fan*, Shyamal Buch*}, Guanzhi Wang, Ryan Cao, Yuke Zhu, Juan Carlos Niebles, Li Fei-Fei
{*} = equal contribution, lead author
European Conference on Computer Vision (ECCV), 2020
website / paper / bibtex

End-to-End Joint Semantic Segmentation of Actors and Actions in Video
Jingwei Ji, Shyamal Buch, Alvaro Soto, Juan Carlos Niebles
European Conference on Computer Vision (ECCV), 2018
(Oral Presentation)
website / paper / bibtex

Finding "It": Weakly-Supervised Reference Aware Visual Grounding in Instructional Video
{De-An Huang*, Shyamal Buch*}, Lucio Dery, Animesh Garg, Li Fei-Fei, Juan Carlos Niebles
{*} = equal contribution, lead author
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018
(Oral Presentation)
website / paper / bibtex

End-to-End, Single-Stream Temporal Action Detection in Untrimmed Videos
Shyamal Buch, Victor Escorcia, Bernard Ghanem, Li Fei-Fei, Juan Carlos Niebles
British Machine Vision Conference (BMVC), 2017
(Oral Presentation)
website / paper / bibtex

SST: Single-Stream Temporal Action Proposals
Shyamal Buch, Victor Escorcia, Chuanqi Shen, Bernard Ghanem, Juan Carlos Niebles
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017
website / paper / bibtex

Patents

System and Method for Leveraging End-to-End Driving Models for Improving Driving Task Modules
Shyamal Buch, Adrien Gaidon (assigned to Toyota Research Institute)
U.S. Patent No. 10,866,588 | Issued: 2020
patent / bibtex
