Drew A. Hudson

dorarad [at] cs.stanford.edu [scholar] [github] [linkedin] [twitter]

Hi! My name is Drew. I am a Research Scientist at DeepMind and recently completed my PhD in Computer Science at Stanford University. I was fortunate to work with my advisor Prof. Christopher D. Manning and collaborate with Dr. Larry Zitnick from FAIR, Meta AI and with Prof. James L. McClelland. I was a member of the Stanford AI Lab and the Stanford NLP group. My research focuses on reasoning, compositionality, and representation learning, at the intersection of vision and language.

I explore structural principles and inductive biases for making neural networks more interpretable, robust and data-efficient, and for allowing them to generalize effectively and systematically from only a few samples. I believe in the importance of multi-disciplinary research, both within the AI field and across domains, and draw high-level inspiration from the feats of the human mind, including its structural properties as well as cognitive capabilities.

I believe that compositionality is a key ingredient that, if incorporated successfully into neural models, may help bridge the gap between machine intelligence and natural intelligence. I explore ways to achieve compositionality both in terms of computation and representation.

Towards the former, I introduced, together with my advisor, models such as MAC and the Neural State Machine that perform transparent step-by-step reasoning, as well as the GQA dataset for real-world visual question answering.
Towards the latter, I began more recently to explore ways to learn compositional scene representations, and along with Larry, presented the Generative Adversarial Transformers, for fast, data-efficient and high-resolution image synthesis. I am actively researching this subject further and hope to present new findings on this exciting direction in the near future!

Papers

Compositional Transformers for Scene Generation

Drew A. Hudson, C. Lawrence Zitnick

NeurIPS 2021 [Abstract] [Paper] [Talk]

We propose a new model for sequential image generation that explicitly accounts for different objects for enhanced controllability, disentanglement and interpretability.

On the Opportunities and Risks of Foundation Models

Rishi Bommasani, Drew A. Hudson, et al. (the Center for Research on Foundation Models)

Preprint 2021, in submission to a journal [Abstract] [Paper]

We thoroughly discuss the emergent paradigm shift of scalable self-supervision, and explore its potential benefits, technical innovations and societal impact.

Generative Adversarial Transformers

Drew A. Hudson, C. Lawrence Zitnick

ICML 2021 [Abstract] [Paper] [Code] [Talk]

Spotlight presentation

We introduce the Generative Adversarial Transformer model, a linearly efficient bipartite transformer, and combine it with the GAN framework for high-resolution scene generation.

SLM: Learning a Discourse Language Representation with Sentence Unshuffling

Haejun Lee, Drew A. Hudson, Kangwook Lee, Christopher D. Manning

EMNLP 2020 [Abstract] [Paper] [Talk]

We introduce a hierarchical transformer that is aware of semantics at both the word and the sentence levels, allowing it to acquire a better understanding of global properties and discourse relations.

Learning by Abstraction: The Neural State Machine

Drew A. Hudson, Christopher D. Manning

NeurIPS 2019 [Abstract] [Paper] [Talk]

Spotlight presentation, top 3%

We introduce a Neuro-Symbolic model that represents semantic knowledge in the form of a scene graph to support iterative reasoning for the task of compositional visual question answering.

GQA: A new Dataset for Real-World Visual Reasoning and Compositional Question Answering

Drew A. Hudson, Christopher D. Manning

CVPR 2019 [Abstract] [Paper] [Website] [Dataset] [Talk]

Oral Presentation, top 5%

We introduce GQA, a large-scale dataset for real-world visual reasoning and compositional question answering that focuses on bias reduction and full grounding of each object and entity in a provided scene graph.

Compositional Attention Networks for Machine Reasoning

Drew A. Hudson, Christopher D. Manning

ICLR 2018 [Abstract] [Paper] [Code] [Blog] [Website] [Talk]

We present the MAC network, a fully differentiable neural network for compositional reasoning, that achieved state-of-the-art 98.9% accuracy on the CLEVR dataset.

Tighter Bounds for Makespan Minimization on Unrelated Machines

Dor Arad, Yael Mordechai, Hadas Shachnai

Arxiv [Abstract] [Paper]

We obtain tight bounds for the problem of scheduling n jobs to minimize the makespan on m unrelated machines.

Selected Talks

Generative Adversarial Transformers, Stanford, April 2021
Compositional Generative Networks for Scene Representation, Stanford, August 2020
From Machine Learning to Machine Reasoning, Evolution AI London, October 2020
Learning by Abstraction: The Neural State Machine, Microsoft Redmond, September 2019
Compositional & Relational Visual Reasoning, ICLR Representation Learning on Graphs and Manifolds Workshop, May 2019
Minimizing Rosenthal Potential in Multicast Games, Technion, March 2014
Exact and Approximate Bandwidth, Technion, Feb 2014

Activities, Associations & Community

Internships, Work and Awards

I received the Google Anita Borg Scholarship (2013, EMEA) for leading women in Computer Science.
I am an alumna of the Chais Scholars Program for Excellence, which gave me a wonderful opportunity to explore research for the first time in the early stages of my academic experience and to connect with an amazing group of student peers.
I interned at Facebook AI Research, Menlo Park in Summer 2019.
I worked at Google as part of the LiveResults team in 2012-2013.
Recipient of the Stanford SoE fellowship for 1st year graduate students.
Valedictorian of the class of 2014 at the Technion, Israel Institute of Technology (GPA: 97.4 / 100, Ranked 1st / 224).
Finalist of the 2020 Facebook Fellowship Awards and the Open Phil AI Fellowship.
Technion President's list of honors (top 3%): Fall 2009/10 – Fall 2013/14.
Received the Excellent CS Students Program (SAMBA) award for academic excellence, 2013.

Workshops and Conferences

Co-lead organizer of the CtrlGen workshop (NeurIPS 2021) for Controllable Generative Modeling in Language and Vision.
Co-organizer of the ViGIL workshop (NeurIPS 2019 and NAACL 2021) for multimodal grounding and interaction, the ALVR workshop at NAACL 2021 for connections between vision and language, and the VQA workshop at CVPR 2019 and 2020.
Organized the GQA challenge at CVPR 2019 for compositional reasoning over real-world images, which attracted more than 50 participating groups.
Reviewer for NeurIPS (2019-2022), ICML (2020-2022) and ICLR (2021-2022).

Teaching, Seminars and Mentorship

Over recent years, I have mentored student teams in the CS224N class on Deep Learning and NLP (Win 2019, Win 2021), at Stanford ACM, and in the independent study class.
Teaching Assistant for CS229: Machine Learning (Spr 2020) and CS230: Deep Learning (Win 2021, Spr 2021), responsible in particular for the class projects.
Organizer of the Stanford NLP group meetings (Spring-Summer 2021) and the weekly seminar for external speakers (Summer-Fall 2021).
Organizer of the Job Talk Practice Session series at Stanford CS.
Participated in a mentoring program where I tutored freshmen and sophomores in STEM classes, 2012-2014.

Hobbies and Fun Facts

I began studying towards a B.Sc. degree in Computer Science as a full-time student when I was 14 years old.
I studied piano for 8 years at the Dunie Weizman Conservatory of Music.
