<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.3 20210610//EN" "https://jats.nlm.nih.gov/publishing/1.3/JATS-journalpublishing1-3.dtd">
<article article-type="research-article" dtd-version="1.3" xml:lang="en">
  <front xmlns:xlink="http://www.w3.org/1999/xlink">
    <journal-meta>
      <journal-title-group>
        <journal-title>Computing, Telecommunication and Control</journal-title>
        <trans-title-group xml:lang="ru">
          <trans-title>Информатика, телекоммуникации и управление</trans-title>
        </trans-title-group>
      </journal-title-group>
      <issn pub-type="epub">2687-0517</issn>
    </journal-meta>
    <article-meta xmlns:xlink="http://www.w3.org/1999/xlink">
      <article-id pub-id-type="publisher-id">5</article-id>
      <article-id pub-id-type="doi">10.18721/JCSTCS.18305</article-id>
      <title-group>
        <article-title>Method for automated enrichment of a knowledge base on glass compositions and properties based on data from scientific publications</article-title>
        <trans-title-group xml:lang="ru">
          <trans-title>Метод автоматизированного пополнения базы знаний о составах и свойствах стекол на основе данных из научных публикаций</trans-title>
        </trans-title-group>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <contrib-id contrib-id-type="orcid">0000-0002-7437-6153</contrib-id>
          <name>
            <surname>Pavlov</surname>
            <given-names>Evgeniy</given-names>
          </name>
          <xref ref-type="aff" rid="aff1"/>
          <email>pavlov_ea@spbstu.ru</email>
        </contrib>
        <contrib contrib-type="author">
          <contrib-id contrib-id-type="orcid">0000-0003-1116-7765</contrib-id>
          <contrib-id contrib-id-type="scopus">56049610600</contrib-id>
          <name>
            <surname>Drobintsev</surname>
            <given-names>Pavel</given-names>
          </name>
          <xref ref-type="aff" rid="aff2"/>
          <email>drobintsev_pd@spbstu.ru</email>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Klinkov</surname>
            <given-names>Victor</given-names>
          </name>
          <xref ref-type="aff" rid="aff2"/>
          <email>klinkovvictor@yandex.ru</email>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Semencha</surname>
            <given-names>Alexander</given-names>
          </name>
          <xref ref-type="aff" rid="aff2"/>
          <email>asemencha@spbstu.ru</email>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Igor</surname>
            <given-names>G.</given-names>
          </name>
          <xref ref-type="aff" rid="aff3"/>
          <email>igcher@spbstu.ru</email>
        </contrib>
      </contrib-group>
      <aff id="aff1">Peter the Great St. Petersburg Polytechnic University</aff>
      <aff id="aff2">Peter the Great St. Petersburg Polytechnic University</aff>
      <aff id="aff3">Peter the Great St. Petersburg Polytechnic University</aff>
      <pub-date publication-format="electronic" date-type="pub" iso-8601-date="2025-09-30">
        <day>30</day>
        <month>09</month>
        <year>2025</year>
      </pub-date>
      <volume>18</volume>
      <issue>3</issue>
      <fpage>58</fpage>
      <lpage>67</lpage>
      <self-uri xmlns:xlink="http://www.w3.org/1999/xlink" content-type="pdf" xlink:href="https://infocom.spbstu.ru/userfiles/files/articles/2025/3/58-67.pdf"/>
      <abstract xml:lang="en">
        <p>Automating the extraction of glass composition and property data from scientific literature is critically important for accelerating the development of new materials. This work presents a method that integrates: 1) collection of full-text articles using the Elsevier Research Products APIs, 2) text preprocessing, 3) context-dependent extraction of structured data using a large language model (LLM) and a domain-specific prompt, and 4) enrichment of a knowledge base on glasses. The key achievement is a prompt that, on a sample of 50 articles, yields an F1-score of 0.99 for extracting chemical compositions and their properties and for correctly establishing the relationships between them. The proposed method significantly simplifies the automatic creation and continuous updating of knowledge bases on glasses, thereby eliminating the traditional reliance on manually curated, potentially outdated resources and providing a robust, data-driven foundation for the efficient design of glasses with target properties using machine learning.</p>
      </abstract>
      <kwd-group xml:lang="en">
        <kwd>data extraction</kwd>
        <kwd>natural language processing</kwd>
        <kwd>LLM</kwd>
        <kwd>prompt engineering</kwd>
        <kwd>knowledge base</kwd>
        <kwd>glass</kwd>
        <kwd>glass properties</kwd>
      </kwd-group>
    </article-meta>
  </front>
</article>
