Narrative Chains Dev/Test Set
Nate Chambers
Stanford University

---
Files

chains-test/
    69 randomly selected news articles from the 2001 portion of the NYT
    subset of the Gigaword Corpus. Articles were required to have a
    protagonist with a minimum of 5 events. 100 were randomly selected
    initially, but 31 were removed because they did not satisfy the
    5-event minimum. Each file is a separate article with a list of
    events, the grammatical argument position of the protagonist, and a
    before/after ordering between the events.

chains-dev/
    17 randomly selected news articles, also from the 2001 portion of the
    NYT subset of Gigaword. Same format as above, but without the
    temporal orderings encoded.

---
Format

The first lines of each file list the events in the document, one per
line. There are no empty lines between events. The first blank line
separates the events from the orderings of the events. The orderings list
pairs of events in "before" order, one pair per line, headed by "B" at the
start of the line.

Event format:
    id string lemma grammatical-function

Event example:
    3 obtained obtain subj

The grammatical function is the argument that the protagonist (the main
actor) filled in the document. Other arguments are not specified.

Verbs with particles are indicated by an underscore (e.g. pays_back).
Only the string contains the particle; the lemma is the verb only.
    e.g. pays_back -> pay

---
Example file format

1 worked work pp
2 had have pp
3 hired hire subj
4 bringing bring subj
5 has have subj
6 hired hire subj
7 acquired acquire subj
8 taken take subj
9 coming come pp
10 has have subj

B 3 5
B 4 5
B 1 5
B 2 5
B 3 6
B 3 7 (3 in 2000)
B 1 6
B 1 7

--
Example Explanation

The above example contains 10 events. The protagonist fills the subject
position (subj) in all but three events (1, 2 and 9). It is in a
prepositional phrase (pp) in these three. There are 8 pairwise orderings
between the events. The parenthetical (3 in 2000) is just debugging
information and should be ignored.
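
The file format described above can be read with a short script. The
following is a minimal sketch in Python and is not part of the original
distribution. The names Event, parse_chain and SAMPLE are hypothetical. It
assumes that fields are whitespace-separated, that "B x y" lists the
earlier event first, and that any parenthetical on an ordering line is
debugging information to be dropped.

```python
import re
from typing import List, NamedTuple, Tuple


class Event(NamedTuple):
    id: int
    string: str   # surface form, may contain a particle (e.g. pays_back)
    lemma: str    # verb lemma only, no particle
    role: str     # grammatical function of the protagonist (subj, pp, ...)


def parse_chain(text: str) -> Tuple[List[Event], List[Tuple[int, int]]]:
    """Parse one chain file into (events, before-orderings)."""
    events: List[Event] = []
    orderings: List[Tuple[int, int]] = []
    in_orderings = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # The first blank line after the event list starts the orderings.
            if events:
                in_orderings = True
            continue
        if in_orderings:
            # Drop parenthetical debugging info such as "(3 in 2000)".
            parts = re.sub(r"\(.*?\)", "", line).split()
            if len(parts) == 3 and parts[0] == "B":
                # Assumption: the first id is the earlier event.
                orderings.append((int(parts[1]), int(parts[2])))
        else:
            ident, string, lemma, role = line.split()
            events.append(Event(int(ident), string, lemma, role))
    return events, orderings


# The example file from this README.
SAMPLE = """1 worked work pp
2 had have pp
3 hired hire subj
4 bringing bring subj
5 has have subj
6 hired hire subj
7 acquired acquire subj
8 taken take subj
9 coming come pp
10 has have subj

B 3 5
B 4 5
B 1 5
B 2 5
B 3 6
B 3 7 (3 in 2000)
B 1 6
B 1 7
"""

events, orderings = parse_chain(SAMPLE)
print(len(events), len(orderings))
```

Files in chains-dev/ carry no orderings, so for those the second returned
list is simply empty.
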
---
Methodology

In determining which events would be included in the narrative chain,
several constraints were followed:

1. ignore "be" properties like "X is Y"
2. ignore "do" actions like "X did Y"
3. include negated events as positives
4. future or hypothetical events should be treated as occurring in the future
5. ignore events in quotes
6. dominated clauses: take the second verb, e.g. "hurried to leave", "tried to express"
7. ignore headlines
8. don't add duplicated events