
Permutation Tests for Classification

dc.date.accessioned: 2005-12-19T23:02:49Z
dc.date.accessioned: 2018-11-24T10:23:50Z
dc.date.available: 2005-12-19T23:02:49Z
dc.date.available: 2018-11-24T10:23:50Z
dc.date.issued: 2003-08-28
dc.identifier.uri: http://hdl.handle.net/1721.1/30408
dc.identifier.uri: http://repository.aust.edu.ng/xmlui/handle/1721.1/30408
dc.description.abstract: We introduce and explore an approach to estimating statistical significance of classification accuracy, which is particularly useful in scientific applications of machine learning where high dimensionality of the data and the small number of training examples render most standard convergence bounds too loose to yield a meaningful guarantee of the generalization ability of the classifier. Instead, we estimate statistical significance of the observed classification accuracy, or the likelihood of observing such accuracy by chance due to spurious correlations of the high-dimensional data patterns with the class labels in the given training set. We adopt permutation testing, a non-parametric technique previously developed in classical statistics for hypothesis testing in the generative setting (i.e., comparing two probability distributions). We demonstrate the method on real examples from neuroimaging studies and DNA microarray analysis and suggest a theoretical analysis of the procedure that relates the asymptotic behavior of the test to the existing convergence bounds.
dc.format.extent: 22 p.
dc.format.extent: 22876548 bytes
dc.format.extent: 882217 bytes
dc.language.iso: en_US
dc.subject: AI
dc.subject: Classification
dc.subject: Permutation testing
dc.subject: Statistical significance.
dc.title: Permutation Tests for Classification
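
The procedure summarized in the abstract can be sketched roughly as follows. This is a minimal illustration, not the report's implementation: the nearest-centroid classifier, the leave-one-out accuracy estimate, the add-one p-value correction, the synthetic data, and all function names are assumptions chosen to keep the example short and dependency-free.

```python
import random


def make_toy_data(n_per_class=10, n_features=5, shift=2.0, seed=1):
    """Hypothetical synthetic data: two Gaussian classes whose means differ by `shift`."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            X.append([rng.gauss(label * shift, 1.0) for _ in range(n_features)])
            y.append(label)
    return X, y


def loo_accuracy(X, y):
    """Leave-one-out accuracy of a nearest-centroid classifier (an assumed stand-in)."""
    n = len(X)
    correct = 0
    for i in range(n):
        # Accumulate per-class feature sums from every sample except the held-out one.
        sums, counts = {}, {}
        for j in range(n):
            if j == i:
                continue
            label = y[j]
            if label not in sums:
                sums[label] = [0.0] * len(X[j])
                counts[label] = 0
            sums[label] = [s + v for s, v in zip(sums[label], X[j])]
            counts[label] += 1

        def squared_distance(label):
            # Squared Euclidean distance from the held-out sample to a class centroid.
            return sum((v - s / counts[label]) ** 2
                       for v, s in zip(X[i], sums[label]))

        predicted = min(sorted(sums), key=squared_distance)
        correct += predicted == y[i]
    return correct / n


def permutation_p_value(X, y, n_permutations=200, seed=0):
    """Estimate how often shuffled labels reach the observed accuracy."""
    rng = random.Random(seed)
    observed = loo_accuracy(X, y)
    shuffled = list(y)
    at_least_as_good = 0
    for _ in range(n_permutations):
        # Shuffling labels breaks any real link between features and classes.
        rng.shuffle(shuffled)
        if loo_accuracy(X, shuffled) >= observed:
            at_least_as_good += 1
    # Add-one correction counts the observed labeling as one permutation.
    p_value = (at_least_as_good + 1) / (n_permutations + 1)
    return observed, p_value


if __name__ == "__main__":
    X, y = make_toy_data()
    accuracy, p = permutation_p_value(X, y)
    print(f"observed accuracy = {accuracy:.2f}, permutation p-value = {p:.4f}")
```

A small p-value here means that few label shufflings produced an accuracy as high as the one observed, which is the sense of "statistical significance of the observed classification accuracy" used in the abstract. The smallest attainable value is 1 / (n_permutations + 1), so the number of permutations limits the resolution of the estimate.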


Files in this item

Files | Size | Format | View
MIT-CSAIL-TR-2003-016.pdf | 882.2 KB | application/pdf | View/Open
MIT-CSAIL-TR-2003-016.ps | 22.87 MB | application/postscript | View/Open

