Text categorization based on regularized linear classification methods


Author(s) : Frank J. Oles, Tong Zhang
Publisher : N/A
Publication Date : 2001
ISSN : N/A
Abstract : A number of linear classification methods such as the linear least squares fit (LLSF), logistic regression, and support vector machines (SVM's) have been applied to text categorization problems. These methods share the similarity by finding hyperplanes that approximately separate a class of document vectors from its complement. However, support vector machines are so far considered special in that they have been demonstrated to achieve the state of the art performance. It is therefore worthwhile to understand whether such good performance is unique to the SVM design, or if it can also be achieved by other linear classification methods. In this paper, we compare a number of known linear classification methods as well as some variants in the framework of regularized linear systems. We will discuss the statistical and numerical properties of these algorithms, with a focus on text categorization. We will also provide some numerical experiments to illustrate these algorithms on a number of datasets.