Development and research of models of multi-class classifiers for a recommended system for preparing applications on the e-procurement

Simulations of Computer, Telecommunications, Control and Social Systems
Authors:
Abstract:

The analysis establishes the relevance of services that support the preparation of tender documentation, specifically by determining the OKPD 2 code for a generated application. To automate the classification of applications according to OKPD 2, an algorithm for a system of comparative analysis of classifier models was developed. The collected information was then preprocessed and written to a database in JSON format. Data labeling and preparation for training the classifier models were carried out in the PolyAnalyst environment. Three multiclass classifiers from the Scikit-Learn library were selected: a naive Bayes classifier, an SVM classifier, and a random forest classifier. TF-IDF and word-hashing models were chosen as vectorizers. The ruBert-base neural network model was chosen as the fourth classifier. The classifiers were trained and their quality was assessed. According to the validation and test results, two models performed best: ruBert-base and the naive Bayes classifier with a word-hashing vectorizer. Based on these results, a test classification of applications was performed.
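
To make the combination of a word-hashing vectorizer and a naive Bayes classifier more concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation, which used Scikit-Learn. The multinomial variant of naive Bayes, the CRC32 hash function, the bucket count, the smoothing constant, the toy texts, and the placeholder labels `CODE_A` and `CODE_B` (standing in for OKPD 2 codes) are all assumptions made for illustration.

```python
import math
import re
import zlib
from collections import defaultdict

N_BUCKETS = 2 ** 18  # assumed size of the hash space, chosen for illustration


def hash_vectorize(text, n_buckets=N_BUCKETS):
    """Map a text to sparse {bucket: count} features via the hashing trick."""
    counts = defaultdict(int)
    for token in re.findall(r"\w+", text.lower()):
        # crc32 is stable across runs, unlike the built-in hash() for strings.
        counts[zlib.crc32(token.encode("utf-8")) % n_buckets] += 1
    return dict(counts)


class HashedNaiveBayes:
    """Multinomial naive Bayes over hashed word counts with additive smoothing."""

    def __init__(self, n_buckets=N_BUCKETS, alpha=1.0):
        self.n_buckets = n_buckets
        self.alpha = alpha  # assumed Laplace smoothing constant

    def fit(self, texts, labels):
        self.class_docs = defaultdict(int)        # documents per class
        self.bucket_counts = defaultdict(lambda: defaultdict(int))
        self.total_counts = defaultdict(int)      # token count per class
        for text, label in zip(texts, labels):
            self.class_docs[label] += 1
            for bucket, count in hash_vectorize(text, self.n_buckets).items():
                self.bucket_counts[label][bucket] += count
                self.total_counts[label] += count
        self.n_docs = len(labels)
        return self

    def predict(self, text):
        features = hash_vectorize(text, self.n_buckets)
        best_label, best_score = None, -math.inf
        for label, n_class in self.class_docs.items():
            # log prior plus the sum of smoothed log likelihoods
            score = math.log(n_class / self.n_docs)
            denom = self.total_counts[label] + self.alpha * self.n_buckets
            for bucket, count in features.items():
                num = self.bucket_counts[label].get(bucket, 0) + self.alpha
                score += count * math.log(num / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data; CODE_A and CODE_B are placeholders, not real OKPD 2 codes.
train_texts = [
    "supply of laptop computers and monitors",
    "purchase of desktop computers",
    "delivery of office paper A4",
    "purchase of paper for printers",
]
train_labels = ["CODE_A", "CODE_A", "CODE_B", "CODE_B"]

model = HashedNaiveBayes().fit(train_texts, train_labels)
print(model.predict("desktop computers and monitors"))
```

Hashing fixes the feature dimension in advance and needs no stored vocabulary, which can be convenient when new application texts contain unseen words. The trade-off is that distinct words may occasionally collide in the same bucket.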