About Me
This is my (now-old) academic homepage from my time as a Ph.D. student
at Stanford (2010 -- 2016), where I was
advised by Chris Manning
in the
natural language processing group.
Prior to that, I graduated from UC Berkeley in 2010 with a B.S.
in Electrical Engineering / Computer Science.
I'm now the co-founder and CTO of Eloquent Labs.
My interests are in
natural language understanding.
Recently I've been working on open-domain natural language inference
-- particularly common sense reasoning --
along with some work on relation extraction.
In the past, I did some work on interpreting temporal expressions using
semantic parsing.
Otherwise, in my free time I enjoy the outdoors (hiking, camping, and backpacking),
board games, and movies.
My Publications
2019
Mimic and Rephrase: Reflective listening in open-ended dialogue
Computational Natural Language Learning (CoNLL). 2019
Reflective listening -- demonstrating that you have heard your conversational partner -- is key to effective communication. Expert human communicators often mimic and rephrase their conversational partner, e.g., when responding to sentimental stories or to questions they don’t know the answer to. We introduce a new task and an associated dataset wherein dialogue agents similarly mimic and rephrase a user’s request to communicate sympathy (I’m sorry to hear that) or lack of knowledge (I do not know that). We study what makes a rephrasal response good against a set of qualitative metrics. We then evaluate three models for generating responses: a syntax-aware rule-based system, a seq2seq LSTM neural model with attention (S2SA), and the same neural model augmented with a copy mechanism (S2SA+C). In a human evaluation, we find that S2SA+C and the rule-based system are comparable and approach human-generated response quality. In addition, experiences with a live deployment of S2SA+C in a customer support setting suggest that this generation task is a practical contribution to real world conversational agents.
@inproceedings{dieter2019mimic,
author = {Dieter, Justin and Wang, Tian and Angeli, Gabor and Chang, Angel X. and Chaganty, Arun},
booktitle = {Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL 2019)},
title = {Mimic and Rephrase: Reflective listening in open-ended dialogue},
url = {https://www.aclweb.org/anthology/K19-1037.pdf},
year = {2019}
}
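As a toy illustration of the rule-based side of this task (the paper's system is syntax-aware; the pronoun table and both rules below are made up):

```python
# Toy mimic-and-rephrase responder. The real rule-based system in the paper
# is syntax-aware; this pronoun-flip table and the two rules are made up.
FLIPS = {"i": "you", "my": "your", "me": "you", "am": "are",
         "you": "I", "your": "my"}

def flip_person(utterance: str) -> str:
    """Swap first/second person so the response mirrors the user's wording."""
    words = utterance.lower().rstrip(".?!").split()
    return " ".join(FLIPS.get(w, w) for w in words)

def respond(utterance: str) -> str:
    """Mimic and rephrase: sympathy for statements, admission for questions."""
    if utterance.rstrip().endswith("?"):
        return "I don't know " + flip_person(utterance) + "."
    return "I'm sorry to hear that " + flip_person(utterance) + "."

print(respond("I lost my luggage."))  # I'm sorry to hear that you lost your luggage.
```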
2017
Position-aware Attention and Supervised Data Improve Slot Filling
Empirical Methods in Natural Language Processing (EMNLP). 2017
Organized relational knowledge in the form of “knowledge graphs” is important for many applications. However, the ability to populate knowledge bases with facts automatically extracted from documents has improved frustratingly slowly. This paper simultaneously addresses two issues that have held back prior work. We first propose an effective new model, which combines an LSTM sequence model with a form of entity position-aware attention that is better suited to relation extraction. Then we build TACRED, a large (119,474 examples) supervised relation extraction dataset, obtained via crowdsourcing and targeted towards TAC KBP relations. The combination of better supervised data and a more appropriate high-capacity model enables much better relation extraction performance. When the model trained on this new dataset replaces the previous relation extraction component of the best TAC KBP 2015 slot filling system, its F1 score increases markedly from 22.2% to 26.7%.
@inproceedings{zhang2017tacred,
author = {Zhang, Yuhao and Zhong, Victor and Chen, Danqi and Angeli, Gabor and Manning, Christopher D.},
booktitle = {Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP 2017)},
title = {Position-aware Attention and Supervised Data Improve Slot Filling},
url = {https://nlp.stanford.edu/pubs/zhang2017tacred.pdf},
year = {2017}
}
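A minimal sketch of the position-aware attention idea, with scalar "hidden states" and a hand-picked distance discount standing in for the learned parameters:

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def position_aware_attention(hidden, subj_dist, obj_dist, scale=0.5):
    """Attend more to tokens close to the subject and object entities, then
    pool the (scalar) hidden states by the attention weights."""
    scores = [h - scale * (abs(s) + abs(o))
              for h, s, o in zip(hidden, subj_dist, obj_dist)]
    attn = softmax(scores)
    pooled = sum(a * h for a, h in zip(attn, hidden))
    return pooled, attn

# Token offsets are distances from the subject and object spans.
pooled, attn = position_aware_attention(
    hidden=[1.0, 2.0, 1.5, 0.5],
    subj_dist=[0, 1, 2, 3],
    obj_dist=[3, 2, 1, 0])
```

In the real model the hidden states are LSTM vectors and the position embeddings are learned, but the shape of the computation is the same.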
2016
Learning Open Domain Knowledge From Text
Stanford University. 2016
The increasing availability of large text corpora holds the promise of acquiring an unprecedented amount
of knowledge from this text. However, current techniques are either specialized to particular
domains or do not scale to large corpora. This dissertation develops a new technique for learning
open-domain knowledge from unstructured web-scale text corpora.
A first application aims to capture common sense facts: given a candidate statement about the world and a
large corpus of known facts, is the statement likely to be true? We appeal to a probabilistic relaxation of
natural logic – a logic which uses the syntax of natural language as its logical formalism –
to define a search problem from the query statement to its appropriate support in the knowledge
base over valid (or approximately valid) logical inference steps. We show a 4x improvement in recall over
lemmatized lookup for querying common sense facts, while maintaining above 90% precision.
This approach is extended to handle longer, more complex premises by segmenting these utterances
into a set of atomic statements entailed through natural logic. We evaluate this system in isolation
by using it as the main component in an Open Information Extraction system, and show that it achieves
a 3% absolute improvement in F1 compared to prior work on a competitive knowledge base population task.
A remaining challenge is elegantly handling cases where we could not find a supporting premise for our query.
To address this, we create an analogue of an evaluation function in gameplaying search: a shallow lexical
classifier is folded into the search program to serve as a heuristic function to assess how likely we would
have been to find a premise. Results on answering 4th-grade science questions show that this method improves
over both the classifier in isolation and a strong IR baseline, and outperforms prior work on the task.
@phdthesis{angeli-thesis,
author = {Gabor Angeli},
title = {Learning Open Domain Knowledge From Text},
school = {Stanford University},
year = 2016,
month = 6
}
Combining Natural Logic and Shallow Reasoning for Question Answering
Association for Computational Linguistics (ACL). 2016
Broad domain question answering is often
difficult in the absence of structured
knowledge bases, and can benefit from
shallow lexical methods (broad coverage)
and logical reasoning (high precision).
We propose an approach for incorporating
both of these signals in a unified framework
based on natural logic. We extend
the breadth of inferences afforded by natural
logic to include relational entailment
(e.g., buy → own) and meronymy (e.g.,
a person born in a city is born in the city’s
country). Furthermore, we train an evaluation
function – akin to gameplaying –
to evaluate the expected truth of candidate
premises on the fly. We evaluate our approach
on answering multiple choice science
questions, achieving strong results on
the dataset.
@inproceedings{2016angeli-naturalli,
author = {Gabor Angeli and Neha Nayak and Christopher D. Manning},
booktitle = {Association for Computational Linguistics (ACL)},
title = {Combining Natural Logic and Shallow Reasoning for Question Answering},
year = {2016}
}
Evaluating Word Embeddings Using a Representative Suite of Practical Tasks
First Workshop on Evaluating Vector Space Representations for NLP (RepEval). 2016
Word embeddings are now widely used
in natural language understanding tasks
requiring sophisticated semantic information.
However, the quality of new
embedding methods is usually evaluated
based on relatively simple word similarity
benchmarks. We propose evaluating word
embeddings by using them as features in
simple models for a suite of popular downstream
tasks. This gives a realistic view
of the utility of the embeddings in real-world
settings. The selection of a diverse
set of tasks, including both semantic and
syntactic tasks, facilitates qualitative analysis
of the strengths and weaknesses of the
embeddings. The use of simple models allows
us to format this evaluation as a standardized
script that can be made available
publicly, and which can be run in a few
hours.
@inproceedings{2016nayak-veceval,
author = {Neha Nayak and Gabor Angeli and Christopher D. Manning},
booktitle = {RepEval Workshop},
title = {Evaluating Word Embeddings Using a Representative Suite of Practical Tasks},
year = {2016}
}
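The evaluation recipe is easy to sketch: plug pretrained vectors into a deliberately simple downstream model. The tiny vectors, data, and nearest-centroid classifier below are made-up stand-ins:

```python
# Made-up 2-d "pretrained" word vectors; real embeddings are high-dimensional.
EMB = {
    "good": (0.9, 0.1), "great": (0.8, 0.2),
    "bad": (0.1, 0.9), "awful": (0.2, 0.8),
}

def featurize(sentence):
    """Average the word vectors -- the simple-model input the paper proposes."""
    vecs = [EMB[w] for w in sentence.split() if w in EMB]
    n = max(len(vecs), 1)
    return tuple(sum(v[i] for v in vecs) / n for i in range(2))

def nearest_centroid_train(data):
    """A minimal downstream classifier: one centroid per label."""
    cents = {}
    for label in {y for _, y in data}:
        feats = [featurize(x) for x, y in data if y == label]
        cents[label] = tuple(sum(f[i] for f in feats) / len(feats) for i in range(2))
    return cents

def predict(cents, sentence):
    f = featurize(sentence)
    return min(cents, key=lambda lab: sum((f[i] - cents[lab][i]) ** 2 for i in range(2)))

cents = nearest_centroid_train([("good great", "pos"), ("bad awful", "neg")])
print(predict(cents, "great"))  # pos
```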
2015
A large annotated corpus for learning natural language inference
Empirical Methods in Natural Language Processing (EMNLP). 2015
Understanding entailment and contradiction
is fundamental to understanding natural language,
and inference about entailment and contradiction is a valuable testing
ground for the development of semantic representations. However, machine
learning research in this area has been dramatically limited by the lack of
large-scale
resources. To address this, we introduce
the Stanford Natural Language Inference
corpus, a new, freely available collection
of labeled sentence pairs, written by humans doing a novel grounded task based
on image captioning. At 570K pairs, it
is two orders of magnitude larger than
all other resources of its type. This increase in scale allows lexicalized
classifiers to outperform some sophisticated existing entailment models,
and it allows a
neural network-based model to perform
competitively on natural language inference benchmarks for the first time.
@inproceedings{2015bowman-snli,
title = {A large annotated corpus for learning natural language inference},
author = {Samuel R. Bowman and Gabor Angeli and Christopher Potts and Christopher D. Manning},
booktitle = {EMNLP},
year = {2015}
}
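To give a flavor of the lexicalized classifiers that a corpus of this size unlocks, here is a sketch of word-overlap and cross-unigram features (the feature names are made up):

```python
# Sketch of a lexicalized-feature baseline for entailment: word overlap plus
# cross-unigram indicator features. Feature names here are illustrative.
def lexicalized_features(premise: str, hypothesis: str) -> dict:
    p = premise.lower().split()
    h = hypothesis.lower().split()
    feats = {"overlap": len(set(p) & set(h)) / len(set(h))}
    for hw in set(h):
        for pw in set(p):
            feats[f"cross={pw}|{hw}"] = 1.0  # unigram pair indicator
    return feats

feats = lexicalized_features("A man is sleeping", "A man is awake")
print(feats["overlap"])  # 0.75
```

With only a few thousand training pairs such sparse features overfit badly; at 570K pairs they become competitive, which is the point of the corpus.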
Leveraging Linguistic Structure For Open Domain Information Extraction
Association for Computational Linguistics (ACL). 2015
Relation triples produced by open domain
information extraction (open IE) systems
are useful for question answering, inference,
and other IE tasks. Traditionally
these are extracted using a large set of patterns;
however, this approach is brittle on
out-of-domain text and long-range dependencies,
and gives no insight into the substructure
of the arguments. We replace this
large pattern set with a few patterns for
canonically structured sentences, and shift
the focus to a classifier which learns to
extract self-contained clauses from longer
sentences. We then run natural logic inference
over these short clauses to determine
the maximally specific arguments for each
candidate triple. We show that our approach
outperforms a state-of-the-art open
IE system on the end-to-end TAC-KBP
2013 Slot Filling task.
@inproceedings{2015angeli-openie,
title = {Leveraging Linguistic Structure For Open Domain Information Extraction},
author = {Gabor Angeli and Melvin Johnson Premkumar and Christopher D. Manning},
booktitle = {ACL},
year = {2015}
}
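The pipeline shape is easy to caricature: split a sentence into short clauses, then apply a canonical pattern per clause. The paper's clause splitter is a trained classifier over dependency structure, not the regex below, and the verb lexicon is made up:

```python
import re

VERBS = {"founded", "acquired", "born"}  # stand-in verb lexicon

def clauses(sentence: str):
    """Crude clause segmentation on conjunctions and relative pronouns."""
    return [c.strip() for c in re.split(r",|\band\b|\bwho\b", sentence) if c.strip()]

def triples(sentence: str):
    """One canonical subject-verb-object pattern applied per short clause."""
    out = []
    for clause in clauses(sentence):
        words = clause.rstrip(".").split()
        for i, w in enumerate(words):
            if w in VERBS and 0 < i < len(words) - 1:
                out.append((" ".join(words[:i]), w, " ".join(words[i + 1:])))
    return out

print(triples("Jobs founded Apple and Gates founded Microsoft."))
```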
Robust Subgraph Generation Improves Abstract Meaning Representation Parsing
Association for Computational Linguistics (ACL). 2015
The Abstract Meaning Representation
(AMR) is a representation for open-domain
rich semantics, with potential use
in fields like event extraction and machine
translation. Node generation, typically
done using a simple dictionary lookup, is
currently an important limiting factor in
AMR parsing. We propose a small set
of actions that derive AMR subgraphs by
transformations on spans of text, which
allows for more robust learning of this
stage. Our set of construction actions
generalize better than the previous approach,
and can be learned with a simple classifier.
We improve on the previous
state-of-the-art result for AMR parsing,
boosting end-to-end performance by
3 F1 on both the LDC2013E117 and
LDC2014T12 datasets.
@inproceedings{2015werling-amr,
title = {Robust Subgraph Generation Improves Abstract Meaning Representation Parsing},
author = {Keenon Werling and Gabor Angeli and Christopher D. Manning},
booktitle = {ACL},
year = {2015}
}
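As a toy illustration of replacing dictionary lookup with a small action set, here is a sketch that derives node labels from text spans (the action set and naming convention are made up):

```python
# Toy sketch of deriving AMR node fragments by a small set of actions applied
# to text spans, rather than a pure dictionary lookup. The action set and
# naming convention here are made up for illustration.
ACTIONS = {
    "IDENTITY":  lambda span: span,
    "LEMMATIZE": lambda span: span.rstrip("s"),   # crude lemmatizer stand-in
    "VERBALIZE": lambda span: span + "-01",       # PropBank-style sense tag
}

def derive_node(span: str, action: str) -> str:
    """Apply one generation action to a text span to produce a node label."""
    return ACTIONS[action](span)

print(derive_node("run", "VERBALIZE"))  # run-01
```

Because the actions generalize across spans, a classifier choosing among them can cover words a fixed dictionary would miss.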
Bootstrapped Self Training for Knowledge Base Population
Text Analysis Conference Proceedings. 2016
A central challenge in relation extraction is the lack of supervised training data. Pattern-based relation extractors suffer from low recall, whereas distant supervision yields noisy data which hurts precision. We propose bootstrapped self-training to capture the benefits of both systems: the precision of patterns and the generalizability of trained models. We show that training on the output of patterns drastically improves performance over the patterns. We propose self-training for further improvement: recall can be improved by incorporating the predictions from previous iterations; precision by filtering the assumed negatives based on previous predictions. We show that even our pattern-based model achieves good performance on the task, and the self-trained models rank among the top systems.
@inproceedings{2015angeli-kbp,
title = {Bootstrapped Self Training for Knowledge Base Population},
author = {Gabor Angeli and Victor Zhong and Danqi Chen and Arun Chaganty and Jason Bolton and Melvin Johnson Premkumar and Panupong Pasupat and Sonal Gupta and Christopher D. Manning},
booktitle = {TAC-KBP},
year = {2016}
}
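The self-training loop itself is simple to sketch. The token-overlap scorer below is a stand-in for retraining a relation extractor each round, and the data is made up:

```python
from collections import Counter

def score(train, x):
    """Toy model: label x by the majority label of training examples sharing
    a token with it; confidence is the fraction of those that agree."""
    votes = Counter(lab for ex, lab in train if set(ex.split()) & set(x.split()))
    if not votes:
        return None, 0.0
    lab, n = votes.most_common(1)[0]
    return lab, n / sum(votes.values())

def self_train(labeled, unlabeled, rounds=3, threshold=0.8):
    """Each round, add confident predictions on unlabeled examples to the
    training set, then rescore with the enlarged set."""
    train = list(labeled)
    for _ in range(rounds):
        confident = []
        for x in unlabeled:
            lab, conf = score(train, x)
            if conf >= threshold:
                confident.append((x, lab))
        train = list(labeled) + confident
    return train

final = self_train([("born in city", "birthplace"), ("works at firm", "employer")],
                   ["born near city", "the city born"])
print(final)
```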
2014
NaturalLI: Natural Logic Inference for Common Sense Reasoning
Empirical Methods in Natural Language Processing (EMNLP). 2014
Common-sense reasoning is important for
AI applications, both in NLP and many
vision and robotics tasks. We propose
NaturalLI: a Natural Logic inference system
for inferring common sense facts – for
instance, that cats have tails or tomatoes
are round – from a very large database
of known facts. In addition to being able
to provide strictly valid derivations, the
system is also able to produce derivations
which are only likely valid, accompanied
by an associated confidence. We both
show that our system is able to capture
strict Natural Logic inferences on the FraCaS
test suite, and demonstrate its ability
to predict common sense facts with 49%
recall and 91% precision.
@inproceedings{2014angeli-naturalli,
title = {NaturalLI: Natural Logic Inference for Common Sense Reasoning},
author = {Gabor Angeli and Christopher D. Manning},
booktitle = {EMNLP},
year = {2014}
}
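The core search can be sketched as breadth-first search that mutates the query fact along lexical edges until it hits a known fact. The edge set and facts below are made up; the real system searches over natural-logic mutations with learned costs and confidences:

```python
from collections import deque

EDGES = {"cat": ["feline", "animal"], "feline": ["animal"], "have": ["possess"]}
KNOWN = {("animal", "have", "tail")}

def entailed(fact, max_depth=4):
    """Breadth-first search from the query toward a supporting known fact."""
    frontier = deque([(fact, 0)])
    seen = {fact}
    while frontier:
        cur, depth = frontier.popleft()
        if cur in KNOWN:
            return True
        if depth == max_depth:
            continue
        for i, word in enumerate(cur):
            for nxt in EDGES.get(word, []):
                mutated = cur[:i] + (nxt,) + cur[i + 1:]
                if mutated not in seen:
                    seen.add(mutated)
                    frontier.append((mutated, depth + 1))
    return False

print(entailed(("cat", "have", "tail")))  # True, via cat -> animal
```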
Combining Distant and Partial Supervision for Relation Extraction
Empirical Methods in Natural Language Processing (EMNLP). 2014
Broad-coverage relation extraction either
requires expensive supervised training
data, or suffers from noise introduced by
distantly supervised methods. We present
an approach for providing partial supervision
to a distantly supervised relation
extractor using a small number of carefully
selected examples. We compare
against established active learning criteria
and propose a novel criterion to sample
examples which are both uncertain and
representative. In this way, we combine
the benefits of fine-grained supervision for
difficult examples with the coverage of a
large distantly supervised corpus. Our approach
gives a substantial increase of 3.9%
end-to-end F1 on the 2013 KBP Slot Filling
evaluation, yielding a net F1 of 37.7%.
@inproceedings{2014angeli-active,
title = {Combining Distant and Partial Supervision for Relation Extraction},
author = {Gabor Angeli and Julie Tibshirani and Jean Y. Wu and Christopher D. Manning},
booktitle = {EMNLP},
year = {2014}
}
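The selection criterion can be sketched as scoring each pool example by uncertainty (predictive entropy) times representativeness (average similarity to the rest of the pool). The product combination and toy similarities are illustrative, not the paper's exact formulation:

```python
import math

def entropy(probs):
    return -sum(p * math.log(p) for p in probs if p > 0)

def representativeness(i, sims):
    """Average similarity of example i to the rest of the pool."""
    return sum(sims[i]) / len(sims[i])

def select(prob_dists, sims, k=1):
    """Rank pool examples by uncertainty x representativeness; take top k."""
    ranked = sorted(range(len(prob_dists)),
                    key=lambda i: entropy(prob_dists[i]) * representativeness(i, sims),
                    reverse=True)
    return ranked[:k]

# Example 2 is both uncertain (50/50) and similar to example 1, so it beats
# the equally uncertain but isolated example 0.
probs = [[0.5, 0.5], [0.9, 0.1], [0.5, 0.5]]
sims = [[1.0, 0.1, 0.1], [0.1, 1.0, 0.9], [0.1, 0.9, 1.0]]
print(select(probs, sims))  # [2]
```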
Stanford's Distantly Supervised Slot Filling Systems for KBP 2014
Text Analysis Conference Proceedings. 2015
We describe Stanford’s entry in the TAC-KBP 2014 Slot Filling challenge.
We submitted two broad approaches to Slot Filling, both strongly based on the ideas of
distant supervision: one built on the DeepDive framework (Niu et al., 2012), and
another based on the multi-instance multilabel relation extractor of Surdeanu et al. (2012).
In addition, we evaluate the impact of learned and hard-coded patterns on
performance for slot filling, and the impact of the partial annotations
described in Angeli et al. (2014).
@inproceedings{2014angeli-kbp,
title = {Stanford's Distantly Supervised Slot Filling Systems for KBP 2014},
author = {Gabor Angeli and Sonal Gupta and Melvin Johnson Premkumar and Christopher D. Manning and Christopher R{\'e} and Julie Tibshirani and Jean Y. Wu and Sen Wu and Ce Zhang},
booktitle = {TAC-KBP},
year = {2015}
}
A Dictionary of Nonsubsective Adjectives
Stanford CS Technical Report. 2014
Computational approaches to inference and information extraction often assume that adjective-noun compounds maintain
all the relevant properties of the unmodified noun. A significant portion of nonsubsective adjectives violate this assumption.
We present preliminary work towards a classifier for these adjectives. We also compile a comprehensive list of 60 nonsubsective
adjectives including those used for training and those found by the classifiers.
@techreport{2014nayak-adjectives,
title = {A Dictionary of Nonsubsective Adjectives},
author = {Neha Nayak and Mark Kowarsky and Gabor Angeli and Christopher D. Manning},
number = {CSTR 2014-04},
institution = {Department of Computer Science, Stanford University},
month = {October},
year = {2014}
}
2013
Philosophers are Mortal: Inferring the Truth of Unseen Facts
Computational Natural Language Learning (CoNLL). 2013
Large databases of facts are prevalent in
many applications. Such databases are
accurate, but as they broaden their scope
they become increasingly incomplete. In
contrast to extending such a database, we
present a system to query whether it contains
an arbitrary fact. This work can be
thought of as re-casting open domain
information extraction: rather than growing
a database of known facts, we smooth this
data into a database in which any possible
fact has membership with some confidence.
We evaluate our system predicting
held out facts, achieving 74.2% accuracy
and outperforming multiple baselines. We
also evaluate the system as a common-sense
filter for the ReVerb Open IE system, and as
a method for answer validation in a
Question Answering task.
@inproceedings{2013angeli-truth,
title = {Philosophers are Mortal: Inferring the Truth of Unseen Facts},
author = {Gabor Angeli and Christopher Manning},
booktitle = {CoNLL},
year = {2013}
}
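A toy version of the smoothing idea: an unseen fact gets a membership confidence from its overlap with known facts. The facts and the scoring rule below are made up; the real system uses much richer similarity backoffs:

```python
KNOWN_FACTS = {("socrates", "is", "mortal"), ("plato", "is", "mortal")}

def confidence(fact):
    """1.0 for facts in the database; otherwise the best per-slot overlap
    with any known fact, as a crude membership confidence."""
    if fact in KNOWN_FACTS:
        return 1.0
    return max(sum(a == b for a, b in zip(fact, known)) / 3
               for known in KNOWN_FACTS)

print(confidence(("philosophers", "is", "mortal")))  # about 0.67
```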
Language-Independent Discriminative Parsing of Temporal Expressions
Association for Computational Linguistics (ACL). 2013
Temporal resolution systems are traditionally
tuned to a particular language, requiring
significant human effort to translate them to
new languages. We present a language independent
semantic parser for learning the interpretation
of temporal phrases given only a corpus of
utterances and the times they reference. We
make use of a latent parse that encodes
a language-flexible representation of time,
and extract rich features over both the
parse and associated temporal semantics.
The parameters of the model are learned
using a weakly supervised bootstrapping
approach, without the need for manually
tuned parameters or any other language
expertise. We achieve state-of-the-art
accuracy on all languages in the TempEval-2
temporal normalization task, reporting
a 4% improvement in both English and
Spanish accuracy, and to our knowledge
the first results for four other languages.
@inproceedings{2013angeli-temporal,
title = {Language-Independent Discriminative Parsing of Temporal Expressions},
author = {Gabor Angeli and Jakob Uszkoreit},
booktitle = {ACL},
year = {2013}
}
Stanford's 2013 KBP System
Text Analysis Conference Proceedings. 2014
We describe Stanford’s entry in the TAC-KBP 2013 Slot Filling challenge.
Our system makes use of a distantly supervised approach,
implementing the multi-instance multi-label system of Surdeanu et
al. (2012). In addition, Stanford’s system significantly improved
the information retrieval component of the system, as well
as the consistency and inference procedure applied after
candidate relations have been extracted. Stanford’s 2013 KBP entry
achieved an F1 of 31.36 on the 2013 evaluation data,
performing above the median entry (15.32 F1).
@inproceedings{2013angeli-kbp,
title = {Stanford's 2013 {KBP} System},
author = {Gabor Angeli and Arun Chaganty and Angel Chang and Kevin Reschke and Julie Tibshirani and Jean Y. Wu and Osbert Bastani and Keith Siilats and Christopher D. Manning},
booktitle = {TAC-KBP},
year = {2014}
}
2012
Parsing Time: Learning to Interpret Time Expressions
North American Chapter of the Association for Computational Linguistics (NAACL). 2012
We present a probabilistic approach for learning
to interpret temporal phrases given only a
corpus of utterances and the times they reference.
While most approaches to the task
have used regular expressions and similar linear
pattern interpretation rules, the possibility
of phrasal embedding and modification in
time expressions motivates our use of a compositional
grammar of time expressions. This
grammar is used to construct a latent parse
which evaluates to the time the phrase would
represent, as a logical parse might evaluate to
a concrete entity. In this way, we can employ
a loosely supervised EM-style bootstrapping
approach to learn these latent parses while
capturing both syntactic uncertainty and pragmatic
ambiguity in a probabilistic framework.
We achieve an accuracy of 72% on an adapted
TempEval-2 task -- comparable to state-of-the-art
systems.
@inproceedings{2012angeli-temporal,
title = {Parsing Time: Learning to Interpret Time Expressions},
author = {Gabor Angeli and Christopher D. Manning and Daniel Jurafsky},
booktitle = {NAACL-HLT},
year = {2012}
}
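The compositional-grammar idea can be sketched concretely: each word parses to a function, and the parse evaluates, relative to a reference date, to a concrete date. The model learns latent parses; the little lexicon below is hand-written for illustration:

```python
from datetime import date, timedelta

LEXICON = {
    "today":     lambda ref: ref,
    "yesterday": lambda ref: ref - timedelta(days=1),
    "friday":    lambda ref: ref - timedelta(days=(ref.weekday() - 4) % 7),
    # "last" is a modifier: shift the reference back one week, then apply.
    "last":      lambda f: (lambda ref: f(ref - timedelta(days=7))),
}

def interpret(phrase: str, ref: date) -> date:
    """Compose right to left: modifiers apply to the unit on their right."""
    words = phrase.split()
    fn = LEXICON[words[-1]]
    for w in reversed(words[:-1]):
        fn = LEXICON[w](fn)
    return fn(ref)

print(interpret("last friday", date(2012, 6, 6)))  # 2012-05-25
```

Because phrases evaluate to functions rather than fixed strings, embedding and modification ("last friday") compose naturally, which is what regular-expression approaches struggle with.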
2010
A Simple Domain-Independent Probabilistic Approach to Generation
Empirical Methods in Natural Language Processing (EMNLP). 2010
We present a simple, robust generation system which performs content
selection and surface realization in a unified, domain-independent
framework. In our approach, we break up the end-to-end generation process
into a sequence of local decisions, arranged hierarchically and each
trained discriminatively. We deployed our system in three different
domains -- Robocup sportscasting, technical weather forecasts, and common
weather forecasts, obtaining results comparable to state-of-the-art
domain-specific systems both in terms of BLEU scores and human evaluation.
@inproceedings{2010angeli-generation,
title = {A Simple Domain-Independent Probabilistic Approach to Generation},
author = {Gabor Angeli and Percy Liang and Dan Klein},
booktitle = {Empirical Methods in Natural Language Processing (EMNLP)},
year = {2010}
}
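The "sequence of local decisions" can be sketched as a tiny cascade: select a record, select its fields, realize a template. The records and templates below are made up, and the paper learns each decision discriminatively rather than by rule:

```python
RECORDS = {
    "wind": {"speed": "10", "dir": "NE"},
    "temp": {"min": "5", "max": "12"},
}
TEMPLATES = {
    "wind": "wind {dir} at {speed} mph",
    "temp": "temperatures between {min} and {max} C",
}

def generate(record_name: str) -> str:
    record = RECORDS[record_name]                   # decision 1: content selection
    fields = dict(record)                           # decision 2: field selection (keep all)
    return TEMPLATES[record_name].format(**fields)  # decision 3: surface realization

print(generate("wind"))  # wind NE at 10 mph
```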
I've experiments to run; there is research to be done.