Computing, Telecommunication and Control

Информатика, телекоммуникации и управление

2687-0517

Improving the precision of Bayesian classifier for text documents

Повышение точности байесовского классификатора текстовых документов

0000-0001-6960-0942

22956402700

K-3059-2012

Peter

petert@dcn.icc.spbstu.ru

Peter the Great St.Petersburg Polytechnic University

10 02 2010

12 18

The problem of automatic categorization of text documents is considered. It is shown that a straightforward implementation of the Bayesian classifier yields rather low categorization precision because the elements of the feature vector are strongly statistically dependent. A solution is proposed that transforms the document feature vector into one of higher dimension whose elements are less statistically dependent.
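
To make the independence assumption concrete, the following is a minimal sketch of a straightforward (naive) Bayesian text classifier. It is not the author's implementation: the function names, the toy corpus, the word-count features, and the use of add-one (Laplace) smoothing are all assumptions made for illustration. The proposed feature-vector transformation is not detailed in the abstract and is not reproduced here.

```python
import math
from collections import Counter, defaultdict

def train(docs):
    """Collect per-class document and word counts.

    docs: list of (tokens, label) pairs (hypothetical input format).
    """
    class_docs = Counter()              # number of documents per class
    word_counts = defaultdict(Counter)  # word frequencies per class
    vocab = set()
    for tokens, label in docs:
        class_docs[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, word_counts, vocab

def classify(model, tokens):
    """Return the class with the highest log-posterior score.

    Each word contributes an independent factor; this is exactly the
    assumption that breaks down when features are strongly dependent.
    """
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_docs.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in tokens:
            if token not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing (an assumption of this sketch)
            p = (word_counts[label][token] + 1) / (total_words + len(vocab))
            score += math.log(p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy corpus, invented purely for illustration
TOY_DOCS = [
    ("signal noise filter".split(), "dsp"),
    ("filter signal spectrum".split(), "dsp"),
    ("router packet network".split(), "net"),
    ("network packet switch".split(), "net"),
]
model = train(TOY_DOCS)
predicted = classify(model, "packet router".split())
```

Because every word's probability is multiplied in separately, correlated words are effectively counted several times as evidence, which is one way strong dependence between feature vector elements can distort the class scores.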

automatic categorization; Bayesian classifier