<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.3 20210610//EN" "https://jats.nlm.nih.gov/publishing/1.3/JATS-journalpublishing1-3.dtd">
<article article-type="research-article" dtd-version="1.3" xml:lang="en">
  <front xmlns:xlink="http://www.w3.org/1999/xlink">
    <journal-meta>
      <journal-title-group>
        <journal-title>Computing, Telecommunication and Control</journal-title>
        <trans-title-group xml:lang="ru">
          <trans-title>Информатика, телекоммуникации и управление</trans-title>
        </trans-title-group>
      </journal-title-group>
      <issn pub-type="epub">2687-0517</issn>
    </journal-meta>
    <article-meta xmlns:xlink="http://www.w3.org/1999/xlink">
      <article-id pub-id-type="publisher-id">9</article-id>
      <article-id pub-id-type="doi">10.18721/JCSTCS.17309</article-id>
      <title-group>
        <article-title>Development of the system of automatic generation of database model on the basis of the task text in natural language</article-title>
        <trans-title-group xml:lang="ru">
          <trans-title>Разработка системы автоматической генерации модели базы данных на основе текста задания на естественном языке</trans-title>
        </trans-title-group>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <name>
            <surname>Lapin</surname>
            <given-names>Igor</given-names>
          </name>
          <xref ref-type="aff" rid="aff1"/>
          <email>lapin_ia@spbstu.ru</email>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Sabinin</surname>
            <given-names>Oleg</given-names>
          </name>
          <xref ref-type="aff" rid="aff2"/>
          <email>olegsabinin@mail.ru</email>
        </contrib>
      </contrib-group>
      <aff id="aff1">Peter the Great St. Petersburg Polytechnic University</aff>
      <aff id="aff2">Peter the Great St. Petersburg Polytechnic University</aff>
      <pub-date publication-format="electronic" date-type="pub" iso-8601-date="2024-09-30">
        <day>30</day>
        <month>09</month>
        <year>2024</year>
      </pub-date>
      <volume>17</volume>
      <issue>3</issue>
      <fpage>93</fpage>
      <lpage>102</lpage>
      <self-uri xmlns:xlink="http://www.w3.org/1999/xlink" content-type="pdf" xlink:href="https://infocom.spbstu.ru/userfiles/files/articles/2024/3/93-102.pdf"/>
      <abstract xml:lang="en">
        <p>This paper describes an approach to implementing a system that automatically generates a database model from a natural language description supplied by the user. Several machine learning techniques, including transformers, named entity recognition and relation extraction, are considered and applied. The neural network model is built on the spaCy framework, which provides a generic training pipeline. Some pipeline components are off-the-shelf spaCy implementations, while the rest are custom. We also describe how raw data were gathered and prepared for training the neural network model, and how a proper corpus was generated from them. For this purpose we use Doccano, a freely available specialized annotation tool that satisfies all requirements. Finally, the paper presents the model parameters used in training and the performance metrics obtained. The named entity recognition component achieves strong results, while the performance metrics of the relation extraction component can still be improved. The paper concludes with possible directions for further work on the described system, including improvements to the relation extraction component and the implementation of new features.</p>
      </abstract>
      <kwd-group xml:lang="en">
        <kwd>natural language processing</kwd>
        <kwd>named entity recognition</kwd>
        <kwd>relation extraction</kwd>
        <kwd>text analysis</kwd>
        <kwd>classification</kwd>
        <kwd>relational databases</kwd>
        <kwd>model building</kwd>
      </kwd-group>
    </article-meta>
  </front>
</article>
