<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.3 20210610//EN" "https://jats.nlm.nih.gov/publishing/1.3/JATS-journalpublishing1-3.dtd">
<article article-type="research-article" dtd-version="1.3" xml:lang="ru">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Computing, Telecommunication and Control</journal-title>
        <trans-title-group xml:lang="ru">
          <trans-title>Информатика, телекоммуникации и управление</trans-title>
        </trans-title-group>
      </journal-title-group>
      <issn pub-type="epub">2687-0517</issn>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="publisher-id">2</article-id>
      <article-id pub-id-type="doi">10.18721/JCSTCS.18302</article-id>
      <title-group>
        <article-title>Application of machine learning algorithms and neural networks for analyzing the influence of data type in hate speech detection</article-title>
        <trans-title-group xml:lang="ru">
          <trans-title>Применение алгоритмов машинного обучения и нейронных сетей для анализа влияния типа данных при выявлении ненавистнических высказываний</trans-title>
        </trans-title-group>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <name>
            <surname>Mbele Ossiyi</surname>
            <given-names>L.P.</given-names>
          </name>
          <xref ref-type="aff" rid="aff1"/>
          <email>lucprucell@gmail.com</email>
        </contrib>
        <contrib contrib-type="author">
          <contrib-id contrib-id-type="orcid">0000-0003-1116-7765</contrib-id>
          <contrib-id contrib-id-type="scopus">56049610600</contrib-id>
          <name>
            <surname>Drobintsev</surname>
            <given-names>Pavel</given-names>
          </name>
          <xref ref-type="aff" rid="aff1"/>
          <email>drobintsev_pd@spbstu.ru</email>
        </contrib>
        <contrib contrib-type="author">
          <contrib-id contrib-id-type="scopus">6603839750</contrib-id>
          <name>
            <surname>Ustinov</surname>
            <given-names>Sergey M.</given-names>
          </name>
          <xref ref-type="aff" rid="aff2"/>
          <email>usm50@yandex.ru</email>
        </contrib>
      </contrib-group>
      <aff id="aff1">Peter the Great St. Petersburg Polytechnic University</aff>
      <aff id="aff2">Peter the Great St. Petersburg Polytechnic University</aff>
      <pub-date publication-format="electronic" date-type="pub" iso-8601-date="2025-09-30">
        <day>30</day>
        <month>09</month>
        <year>2025</year>
      </pub-date>
      <volume>18</volume>
      <issue>3</issue>
      <fpage>23</fpage>
      <lpage>35</lpage>
      <abstract xml:lang="en">
        <p>Online social platforms have overcome geographical and linguistic barriers and raised communication to an unprecedented level of activity. However, the shift to online communication has been accompanied by the spread of hate speech, which degrades the social environment of these platforms. Research in natural language processing seeks to develop models for detecting and classifying hate speech in order to improve the safety and quality of the online environment. Many of these studies, however, rely on commonly used datasets that are unbalanced and poorly adapted to the new grammatical features of hate speech. This article presents a comparative study of the effectiveness of machine learning and deep learning algorithms in detecting hate speech using a synthetic dataset. Three separate experiments were conducted using original and synthetically perturbed data. The findings indicate that a synthetic dataset improves the representation of extremely negative or rarely encountered communication scenarios, which contributes to their more effective detection. Deep learning algorithms demonstrated superior performance in all experiments. The top-performing models in the first and second experiments, both using zero-shot learning, achieved accuracies of 52.04% and 62.13%, respectively. In the third experiment, the BiGRU + fastText architecture outperformed the other models, achieving an accuracy of 72.68%.</p>
      </abstract>
      <kwd-group xml:lang="en">
        <kwd>sentiment analysis</kwd>
        <kwd>emotion recognition in text</kwd>
        <kwd>attention mechanism</kwd>
        <kwd>embedding</kwd>
        <kwd>CNN</kwd>
        <kwd>LSTM</kwd>
        <kwd>GRU</kwd>
      </kwd-group>
    </article-meta>
  </front>
</article>
