Improving the precision of Bayesian classifier for text documents

Information and Signal Processing

The problem of automatic categorization of text documents is considered. It is shown that a direct implementation of the Bayesian classifier yields low categorization precision because the elements of the document feature vector are strongly statistically dependent. A solution is proposed that transforms the document feature vector into a vector of higher dimension whose elements are more weakly dependent.
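
For context, a direct Bayesian text classifier is commonly realized as a naive Bayes model, which scores each category as if the feature vector elements were conditionally independent given the category. The following is a minimal sketch of that baseline only, assuming a multinomial word-count model with Laplace smoothing; the abstract specifies neither the exact model nor the proposed transformation, and the names train_nb and classify_nb are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs):
    """Collect counts from docs, a list of (tokens, label) pairs.

    Assumption: a multinomial word-count model; the abstract does not
    state which Bayesian model is actually used.
    """
    class_counts = Counter()            # number of documents per category
    word_counts = defaultdict(Counter)  # word frequencies per category
    vocab = set()
    for tokens, label in docs:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def classify_nb(model, tokens):
    """Return the category with the highest log-posterior score."""
    class_counts, word_counts, vocab = model
    n_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_label in class_counts.items():
        score = math.log(n_label / n_docs)  # log prior
        total = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore words never seen in training
            # Independence assumption: each word contributes its own
            # factor, regardless of which other words co-occur with it.
            p = (word_counts[label][tok] + 1) / (total + len(vocab))
            score += math.log(p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data, invented purely for illustration.
docs = [
    (["goal", "match", "team"], "sport"),
    (["election", "vote", "party"], "politics"),
    (["team", "win", "goal"], "sport"),
]
model = train_nb(docs)
print(classify_nb(model, ["goal", "team"]))
```

Because every word contributes a separate factor, words that tend to occur together are counted as if they were independent pieces of evidence. This is the kind of statistical dependence between feature vector elements that the abstract identifies as the cause of low precision.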