UCM/MIT Indications, Referring Expressions, and Coreference Corpus (UMIREC corpus) v1.1

dc.contributor: Patrick Winston [en_US]
dc.contributor: Genesis [en_US]
dc.date.accessioned: 2010-08-19T18:15:22Z
dc.date.accessioned: 2018-11-26T22:26:22Z
dc.date.available: 2010-08-19T18:15:22Z
dc.date.available: 2018-11-26T22:26:22Z
dc.date.issued: 2010-05-12
dc.identifier.citation: Finlayson, M.A. & Hervás, R. (2010) UCM/MIT Indications, Referring Expressions, and Co-Reference Corpus v1.1 (UMIREC corpus). MIT CSAIL Work Product. [en_US]
dc.identifier.uri: http://hdl.handle.net/1721.1/57507
dc.identifier.uri: http://repository.aust.edu.ng/xmlui/handle/1721.1/57507
dc.description.abstract: The corpus comprises 62 files in "Story Workbench" annotation format: 30 folktales in English from a variety of sources, and 32 Wall Street Journal articles selected to coincide with articles found in the Penn Treebank. The files are annotated with the location of referring expressions, coreference relations between the referring expressions, and so-called "indication structures", which split referring expressions into constituents (nuclei and modifiers) and mark each constituent as either 'distinctive' or 'descriptive', indicating whether or not the constituent contains information required for uniquely identifying the referent. The files distributed in this corpus archive are the gold-standard files, which were constructed by merging annotations done by two trained annotators. The contents of this corpus, the annotation procedure, and the indication structures are described in more detail in a paper titled "The Prevalence of Descriptive Referring Expressions in News and Narrative" published in the proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, held in July 2010 in Uppsala, Sweden (ACL-2010). A near-final version of the paper is included in the doc/ directory of the compressed corpus archive file. This is version 1.1 of the UMIREC corpus, in which the coreference annotations have been fixed relative to version 1.0. UMIREC v1.0 suffered from a bug in the export script that corrupted the coreference data. [en_US]
dc.format.extent: 877 ko [en_US]
dc.relation.isreferencedby: http://hdl.handle.net/1721.1/54765
dc.relation.replaces: http://hdl.handle.net/1721.1/54766
dc.rights: Creative Commons Attribution 3.0 Unported [en]
dc.rights.uri: http://creativecommons.org/licenses/by/3.0/
dc.title: UCM/MIT Indications, Referring Expressions, and Coreference Corpus (UMIREC corpus) v1.1 [en_US]


Files in this item

File: umirec_corpus_1.1.zip
Size: 877.4Kb
Format: application/octet-stream

Except where otherwise noted, this item's license is described as Creative Commons Attribution 3.0 Unported