Identifying Expression Fingerprints using Linguistic Information

dc.date.accessioned: 2005-12-22T02:41:36Z
dc.date.accessioned: 2018-11-24T10:24:41Z
dc.date.available: 2005-12-22T02:41:36Z
dc.date.available: 2018-11-24T10:24:41Z
dc.date.issued: 2005-11-18
dc.identifier.uri: http://hdl.handle.net/1721.1/30587
dc.identifier.uri: http://repository.aust.edu.ng/xmlui/handle/1721.1/30587
dc.description.abstract: This thesis presents a technology to complement taxation-based policy proposals aimed at addressing the digital copyright problem. The approach presented facilitates identification of intellectual property using expression fingerprints. Copyright law protects expression of content. Recognizing literary works for copyright protection requires identification of the expression of their content. The expression fingerprints described in this thesis use a novel set of linguistic features that capture both the content presented in documents and the manner of expression used in conveying this content. These fingerprints consist of both syntactic and semantic elements of language. Examples of the syntactic elements of expression include structures of embedding and embedded verb phrases. The semantic elements of expression consist of high-level, broad semantic categories. Syntactic and semantic elements of expression enable generation of models that correctly identify books and their paraphrases 82% of the time, providing a significant (approximately 18%) improvement over models that use tfidf-weighted keywords. The performance of models built with these features is also better than models created with standard features used in stylometry (e.g., function words), which yield an accuracy of 62%. In the non-digital world, copyright holders collect revenues by controlling distribution of their works. Current approaches to the digital copyright problem attempt to provide copyright holders with the same kind of control over distribution by employing Digital Rights Management (DRM) systems. However, DRM systems also enable copyright holders to control and limit fair use, to inhibit others' speech, and to collect private information about individual users of digital works. Digital tracking technologies enable alternate solutions to the digital copyright problem; some of these solutions can protect creative incentives of copyright holders in the absence of control over distribution of works. Expression fingerprints facilitate digital tracking even when literary works are DRM- and watermark-free, and even when they are paraphrased. As such, they enable metering popularity of works and make practicable solutions that encourage large-scale dissemination and unrestricted use of digital works and that protect the revenues of copyright holders, for example through taxation-based revenue collection and distribution systems, without imposing limits on distribution.
dc.format.extent: 216 p.
dc.format.extent: 179019584 bytes
dc.format.extent: 5410679 bytes
dc.language.iso: en_US
dc.subject: AI
dc.subject: natural language processing
dc.subject: syntactic information
dc.subject: content
dc.subject: expression
dc.title: Identifying Expression Fingerprints using Linguistic Information


Files in this item

File                         Size       Format
MIT-CSAIL-TR-2005-077.pdf    5.410 MB   application/pdf
MIT-CSAIL-TR-2005-077.ps     179.0 MB   application/postscript
