Using and analysis of formal methods for evaluating the relevance of automatically generated summaries of informational texts

Authors: Кузнєцов, Олексій; Кисельов, Геннадій
Dates: 2026-03-13; 2026-03-13; 2024-12-20
Citation: Кузнєцов, О., & Кисельов, Г. (2024). Застосування та аналіз формальних методів оцінювання релевантності автоматично створених рефератів інформаційних текстів. Сучасні інформаційні технології, (1), 32–48. https://doi.org/10.17721/AIT.2024.1.04
UDC: 519.816
DOI: 10.17721/AIT.2024.1.04
URI: https://ir.library.knu.ua/handle/15071834/12506

Background. The article reviews existing approaches to evaluating the quality of automatically generated summaries of informational texts. It provides an overview of automatic summarization methods, including classical approaches and modern models based on artificial intelligence. The review covers extractive summarization methods such as TF-IDF and PageRank, as well as graph-based methods, specifically TextRank. Special attention is given to abstractive approaches, including Generative Pretrained Transformer (GPT) and Bidirectional and Auto-Regressive Transformers (BART) models. The quality of generated summaries is evaluated using quantitative metrics of summary relevance, particularly ROUGE and BLEU.
Methods. The article analyzes several approaches to automatic text summarization. Classical extractive methods, such as TF-IDF, calculate the importance of terms based on their frequency within a document and across a collection of documents. PageRank and TextRank use graph models to determine the significance of sentences based on the connections between them. Abstractive methods, such as GPT and BART, generate new sentences that succinctly convey the content of the original text. The effectiveness of each approach is assessed using ROUGE and BLEU metrics, which measure the overlap between automatically generated summaries and reference texts. Particular attention is given to analyzing their accuracy, flexibility, resource requirements, and ease of implementation.
Results.
The results of the study show that ROUGE metrics measure n-gram overlap (sequences of n words) with good accuracy, while BLEU is effective in machine translation tasks but may not account for certain syntactic features of the text. Evaluating automatic summarization methods with these metrics showed that extractive methods, such as TF-IDF, are effective for simple texts but may lose important context in complex ones. PageRank and TextRank take the connections between sentences into account but may produce less relevant results for texts with weak structural connections. Abstractive models such as GPT and BART offer a more flexible approach to summarization, creating new sentences that better convey the meaning, though they require significant computational resources and are complex to implement.
Conclusions. Combining classical and modern methods of automatic text summarization yields higher-quality results. It is important to consider the specifics of the text and the requirements for the final result, adapting the chosen approaches and metrics to the task.

Language: uk
Keywords: automatic summarization; extractive methods; abstractive methods; GPT; BART; ROUGE; BLEU; TextRank; PageRank; TF-IDF
Title (Ukrainian): Застосування та аналіз формальних методів оцінювання релевантності автоматично створених рефератів інформаційних текстів
Type: Article
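
The abstract describes TF-IDF as weighting terms by their frequency within a document and across a collection. The following is a minimal sketch of how such weights could drive extractive summarization. It is not the authors' implementation: the tokenizer, the choice to treat each sentence as its own "document", and the function names `tfidf_sentence_scores` and `summarize` are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens (deliberately naive)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_sentence_scores(sentences):
    """Score each sentence by the mean TF-IDF weight of its distinct terms.

    Assumption: every sentence is treated as a separate document, so the
    inverse document frequency is computed over sentences.
    """
    docs = [tokenize(s) for s in sentences]
    n_docs = len(docs)
    # Document frequency: number of sentences that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        if not doc:
            scores.append(0.0)
            continue
        tf = Counter(doc)
        total = sum(
            (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        )
        scores.append(total / len(tf))
    return scores


def summarize(sentences, k=1):
    """Return the k highest-scoring sentences in their original order."""
    scores = tfidf_sentence_scores(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

A term that occurs in every sentence gets an IDF of zero here, which illustrates why purely frequency-based scoring can lose context that is spread across common words.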
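
The abstract notes that PageRank and TextRank rank sentences through the connections between them. A rough sketch of that idea follows, under stated assumptions: sentence similarity is a plain Jaccard overlap of token sets (the original TextRank formulation uses a different, length-normalized overlap), and `damping` and `iterations` are illustrative parameters rather than values taken from the article.

```python
import re


def tokenize(text):
    """Return the set of lowercase alphanumeric tokens in a sentence."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def similarity(a, b):
    """Jaccard overlap of two token sets (an assumed similarity measure)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def textrank(sentences, damping=0.85, iterations=50):
    """Rank sentences with PageRank-style power iteration on a weighted graph."""
    n = len(sentences)
    if n == 0:
        return []
    tokens = [tokenize(s) for s in sentences]
    # Symmetric edge weights between sentences; no self-loops.
    weights = [
        [similarity(tokens[i], tokens[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    ranks = [1.0 / n] * n
    for _ in range(iterations):
        new_ranks = []
        for i in range(n):
            # Isolated sentences (no edges) pass on no rank in this sketch,
            # so the scores need not sum to 1.
            incoming = sum(
                weights[j][i] * ranks[j] / out_sum[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new_ranks.append((1 - damping) / n + damping * incoming)
        ranks = new_ranks
    return ranks
```

A sentence that shares no vocabulary with the rest of the text keeps only the baseline score, which matches the abstract's remark that graph methods may give less relevant results when structural connections are weak.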
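
ROUGE is characterized in the abstract as measuring n-gram overlap between a generated summary and a reference. The sketch below computes ROUGE-N precision, recall, and F1 for a single reference. It assumes a naive tokenizer and no stemming or stopword handling; the function name `rouge_n` is illustrative and does not refer to any particular library.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens (deliberately naive)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return ROUGE-N precision, recall, and F1 against one reference."""
    cand = ngrams(tokenize(candidate), n)
    ref = ngrams(tokenize(reference), n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}
```

For example, `rouge_n("the cat sat on the mat", "the cat lay on the mat", n=2)` matches three of the five reference bigrams, giving a recall of 0.6.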
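
BLEU, which the abstract mentions alongside ROUGE, combines clipped n-gram precisions with a brevity penalty. The following is a simplified single-reference, sentence-level sketch without smoothing, so any n-gram order with no match drives the score to zero. The `max_n` parameter and the tokenizer are assumptions for illustration; production evaluations would normally rely on an established implementation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens (deliberately naive)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Unsmoothed sentence-level BLEU against a single reference."""
    cand = tokenize(candidate)
    ref = tokenize(reference)
    if not cand:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum((cand_counts & ref_counts).values())
        if total == 0 or clipped == 0:
            return 0.0  # no smoothing: one empty order zeroes the score
        log_precision_sum += math.log(clipped / total)
    # Brevity penalty: penalize candidates shorter than the reference.
    if len(cand) > len(ref):
        brevity = 1.0
    else:
        brevity = math.exp(1 - len(ref) / len(cand))
    return brevity * math.exp(log_precision_sum / max_n)
```

Because the score depends only on surface n-gram matches, a paraphrase with different wording can score low even when its meaning is preserved, which is relevant when these metrics are applied to abstractive summaries.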