Development of a cross-platform Model of the Unified Algorithmic Environment for the recognition of explosive objects

Kunichik, Oleksandr
Терещенко, Василь Миколайович
2026-01-28
2025
Kunichik O. Development of a cross-platform model of the unified algorithmic environment for the recognition of explosive objects : PhD thesis : 121 Software Engineering. Kyiv, 2025. 256 p.
UDC 004.93
https://ir.library.knu.ua/handle/15071834/9962

Oleksandr Kunichik. Development of a cross-platform Model of the Unified Algorithmic Environment for the recognition of explosive objects. Qualifying scientific work as a manuscript. Thesis for the Doctor of Philosophy Degree in Specialty 121 "Software Engineering" (12, Information Technology). Taras Shevchenko National University of Kyiv, Kyiv, 2025.

Abstract

The persistent threat of landmines, explosive ordnance, and other explosive objects (EO) continues to claim lives, inflict devastating injuries, and impede socio-economic development in conflict-affected regions worldwide. EO comprise a variety of munitions containing explosives, including bombs, warheads, rockets, artillery shells, mines, torpedoes, as well as improvised explosive devices and other dangerous objects that can explode under certain conditions [12].

The Russian aggression against Ukraine has resulted in large-scale contamination of the territory with mines and unexploded ordnance (UXO). Demining and overcoming the consequences of the war are complicated by the large number of EO types, the significant areas of contaminated territory, and the variety of mining methods and tactics used. The widespread presence of minefields makes it impossible to safely access land for agriculture, infrastructure development, and housing reconstruction, which significantly hinders the recovery of deoccupied areas. Mine action in Ukraine began in 2014, when Russia illegally annexed Crimea and launched the war in the Donbas.
After the full-scale invasion in February 2022, Ukraine faced unprecedented levels of mine contamination and, according to preliminary estimates, became the most mined country in the world [13]. Traditional, mostly manual, demining methods remain essential, but are associated with high risk, significant time consumption, and low efficiency, especially in large areas. The search for and removal of EO is further complicated by the wide variety of their types, their unpredictable and often concealed locations, and the diversity of landscapes and weather conditions encountered in contaminated areas. For example, dense vegetation, uneven terrain, and varying weather conditions can significantly hinder the detection process. In this context, innovative technologies, such as deep learning and computer vision methods, are opening up new opportunities to improve the effectiveness and safety of mine action.

To address the urgent global challenge of EO contamination, this thesis develops and evaluates a unified algorithmic environment for real-time EO detection, leveraging 3D printing, advanced data augmentation, deep learning, and a user-friendly cross-platform application. The research encompasses the creation of a comprehensive dataset using 3D-printed replicas of prevalent landmine types, the application of a novel two-stage data augmentation strategy, the training and optimization of a YOLO object detection model (versions 5, 8, and 11 were tested), and the development of a cross-platform application, along with a messenger bot interface, for efficient and accessible EO detection.

The aim of this dissertation is to develop and evaluate a unified algorithmic environment for real-time EO detection that utilizes a cross-platform application, a messenger bot interface, and cloud-based processing.
This research endeavors to provide an efficient and accessible method for EO detection by leveraging the scalability of cloud computing, the accessibility of mobile devices and messaging platforms, and the power of deep learning.

To accomplish this goal, the following objectives were established:
1. Develop methodologies to address the lack of data by using augmentation methods, particularly a two-stage augmentation approach.
2. Create a comprehensive dataset for training computer vision models that includes images of 3D-printed replicas of the most common anti-personnel landmines in Ukraine, obtained under different weather conditions (clear, cloudy, rain, snow, etc.).
3. Train and optimize YOLOv8 and YOLOv11 computer vision models on the created dataset by leveraging the increased number and diversity of images from the previous steps, applying the developed two-stage augmentation methodology, and fine-tuning model hyperparameters.
4. Evaluate the effectiveness of trained models on real landmine images from a distinct dataset obtained from professional deminers.
5. Design, implement, and evaluate a user-friendly cross-platform application capable of both online (utilizing the Google Cloud Platform API) and offline (using an optimized on-device model) EO detection.
6. Develop and integrate a messenger bot interface that interacts with the same Google Cloud Platform API for EO detection, providing an alternative access point to the system.
7. Implement the ability within the cross-platform application to correct recognition results by marking objects and selecting the correct EO type, contributing to ongoing model refinement.
8. Establish a secure data transmission mechanism within the application for sending operational data, including user feedback and corrections, to a central server to facilitate continuous improvement of the machine learning models.
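
Objectives 7 and 8 describe users correcting detections so that the corrections can feed later training. As a minimal sketch of how a corrected bounding box might be turned into a training label, the Python below converts pixel-corner boxes into the plain-text label format commonly used for YOLO training (class index followed by normalized center x, center y, width, height). The `CorrectedBox` type, the function names, and the assumption that the application stores corrections as pixel corners are all hypothetical; the abstract does not describe the actual data format.

```python
from dataclasses import dataclass


@dataclass
class CorrectedBox:
    """Hypothetical record of one user-corrected object (pixel coordinates)."""
    class_id: int   # index of the EO type chosen by the user
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def to_yolo_line(box: CorrectedBox, img_w: int, img_h: int) -> str:
    """Return 'class cx cy w h' with all coordinates normalized to [0, 1]."""
    cx = (box.x_min + box.x_max) / 2 / img_w
    cy = (box.y_min + box.y_max) / 2 / img_h
    w = (box.x_max - box.x_min) / img_w
    h = (box.y_max - box.y_min) / img_h
    return f"{box.class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


def corrections_to_label_file(boxes, img_w: int, img_h: int) -> str:
    """Join one label line per corrected box, as stored in a .txt label file."""
    return "\n".join(to_yolo_line(b, img_w, img_h) for b in boxes)
```

For instance, a box spanning pixels (100, 50) to (300, 150) in a 400 x 200 image is centered in the frame and covers half of each dimension, so every normalized value is 0.5.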
The object of the present study is the Model of the Unified Algorithmic Environment (MUAE) for unifying the process of recognizing explosive objects and its practical implementation in the form of a cross-platform application and messenger bot. The MUAE is a conceptual model for a universal system employing a single algorithmic framework, shared data structures, and a common toolkit to solve a complex set of interrelated applied tasks.

The subject of research includes deep learning models for EO recognition, data augmentation methods, dataset creation technologies, data collection methods, as well as the architecture of a multi-platform EO detection system focused on online and offline operation.

Methodology

The research methodology employed in this thesis encompasses a multifaceted approach, integrating techniques from data collection and curation to model development and system implementation. The key components of the methodology include:
- Dataset Collection and Preparation: This involves the acquisition of EO images through various methods, including the creation of a novel dataset based on 3D-printed replicas (Chapter 4), the collection of real landmine images from demining professionals, and the utilization of an expert data contribution platform integrated within the developed application (Chapter 6).
- Data Augmentation: To enhance the size and diversity of the training dataset, a two-stage data augmentation strategy is employed, as detailed in Chapter 3. This includes both basic transformations (e.g., rotation, scaling, flipping) and advanced techniques (e.g., MixUp, CutMix, mosaic).
- Deep Learning Model Development and Adaptation: The research focuses on the development, optimization, and adaptation of deep learning models for landmine detection. Initially, the YOLOv8 architecture was employed.
The process included hyperparameter tuning, model training, and validation using the augmented dataset of 3D-printed replicas. Subsequently, in Chapter 7, the methodology was extended to incorporate a new, previously untrained EO type using the YOLOv11 model, demonstrating the system's adaptability to evolving real-world scenarios.
- Cross-Platform Application Development: A cross-platform application is developed using the Qt framework, QML, and C++ (Chapter 6). The application incorporates the trained YOLO models for both online (cloud-based) and offline (on-device) detection, and provides features for user interaction, data annotation, and feedback.
- Cloud Infrastructure and Distributed Systems: The system leverages the Google Cloud Platform (GCP) for scalable and reliable operation. This includes the use of Google Cloud Functions for serverless computing, Google Cloud Storage for data storage, and Google Firestore for database management.
- Object-Oriented Programming: The software components of the system, including the cross-platform application and some parts of the cloud infrastructure, are implemented using object-oriented programming principles in C++ and Python to ensure modularity, maintainability, and code reusability.
- Evaluation and Validation: The performance of the developed models and the application is evaluated using standard metrics such as precision, recall, and mAP (Section 2.9). Experiments are conducted on both synthetic datasets (3D-printed replicas) and real landmine images to assess the system's effectiveness and generalizability.

At the outset of this research, a comprehensive review of existing studies in the field of EO detection and recognition was conducted. This review identified the primary challenges faced by researchers in developing state-of-the-art EO recognition methods based on computer vision.
The analysis revealed that while deep learning models offer the potential for high accuracy and speed in the recognition process, their effective training is critically hindered by the severe scarcity of high-quality training data in this domain. To address the issue of limited data, this study employed 3D printing to create models replicating the visual characteristics of anti-personnel landmines commonly found in Ukraine.

Furthermore, the role of data augmentation techniques in enhancing the performance of deep learning models was investigated. A wide range of augmentation methods was considered, from basic spatial transformations and pixel-level adjustments to more advanced techniques such as MixUp and CutMix. This research proposes a two-stage augmentation strategy (see Section 3), which increased the recall of the YOLOv8 model from 89.2% (without augmentation) to 92.6% (with two-stage augmentation) on the test dataset. The methodology was initially developed for the YOLOv5 model and subsequently adapted to the newer YOLOv8 and YOLOv11 versions. A significant advantage of the two-stage augmentation method was demonstrated when transitioning from the YOLOv8 model to YOLOv11. Specifically, applying mosaic augmentation in both the first and second stages improved precision from 95% to 98% and recall from 92% to 97% (see Table 7.1). These results underscore the effectiveness of the YOLO architecture, particularly when combined with the developed two-stage augmentation strategy.

However, the study also highlights the challenges and limitations associated with data augmentation in EO detection, such as the risk of creating irrelevant or harmful augmentations, over-reliance on augmentation, and the need to ensure a balanced representation of all classes in the training dataset.
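
The abstract names the ingredients of the two-stage strategy (basic transformations and advanced techniques such as MixUp) but not how the stages are arranged. The sketch below is therefore only an illustration under stated assumptions: stage 1 applies a basic transform (a horizontal flip) per image, and stage 2 blends each result with a randomly chosen partner using MixUp. Plain nested lists stand in for image arrays, boxes are `(class, cx, cy, w, h)` tuples in normalized coordinates, and the function names, `flip_prob`, and `alpha` are hypothetical.

```python
import random


def hflip(image, boxes):
    """Stage 1 (basic): mirror the image left-right and mirror box centers."""
    flipped = [row[::-1] for row in image]
    new_boxes = [(c, 1.0 - cx, cy, w, h) for c, cx, cy, w, h in boxes]
    return flipped, new_boxes


def mixup(img_a, boxes_a, img_b, boxes_b, lam):
    """Stage 2 (advanced): blend two same-sized images pixel by pixel.

    For detection, the labels of both images are kept (concatenated).
    """
    mixed = [[lam * a + (1.0 - lam) * b for a, b in zip(row_a, row_b)]
             for row_a, row_b in zip(img_a, img_b)]
    return mixed, boxes_a + boxes_b


def two_stage(samples, rng, flip_prob=0.5, alpha=0.4):
    """Run stage 1 on every sample, then mix each with a random partner."""
    stage1 = [hflip(img, bx) if rng.random() < flip_prob else (img, bx)
              for img, bx in samples]
    result = []
    for img, bx in stage1:
        partner_img, partner_bx = rng.choice(stage1)
        lam = rng.betavariate(alpha, alpha)  # MixUp blending weight in (0, 1)
        result.append(mixup(img, bx, partner_img, partner_bx, lam))
    return result
```

Calling `two_stage(samples, random.Random(0))` returns one augmented sample per input; a real pipeline would operate on image arrays and would likely include the mosaic augmentation that the abstract reports using in both stages.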
The results of this stage of the research will be used in the following sections, which focus on creating a dataset based on 3D-printed synthetic images of landmines and developing a cross-platform system that utilizes a mobile app for both EO detection (recognizing objects in images) and data collection (allowing in-app annotation and upload) to improve the model.

In accordance with the formulation of the strategy for applying augmentation methods (see Section 3), an investigation was conducted into existing methods for creating datasets of EO images. Among the approaches considered for dataset creation, the use of 3D printing to create replicas of common EO types and the photographing of these replicas was selected. The resulting 3D-printed landmine replicas were then employed to create a dataset, with manual annotation of the data including the marking of object boundaries (bounding boxes). Subsequently, experiments were conducted involving the training of deep learning models, specifically YOLOv8, where the previously developed methodology for applying augmentation methods was also employed. The trained models were then tested on images of real landmines obtained from professional deminers. The evaluation on this dataset revealed that while the models exhibited the capacity to recognize EO, further enhancement was necessary, particularly through the augmentation of the training dataset with additional images obtained from professional deminers.

The efficacy of the synthetic dataset, comprising 1,438 photographs of 3D-printed replicas of five prevalent EO types, in training the YOLOv8 model was substantiated, attaining a precision of 98.0% and a recall of 98.2% on the test set. The research findings underscore the potential of 3D printing in creating diverse and representative training datasets for computer vision models, offering a safe, ethical, and cost-effective approach for both research and practical training.
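
Precision and recall figures such as those quoted above are usually obtained by matching predicted boxes to annotated boxes by overlap. The Python sketch below shows one common way to do this, assuming a greedy match in descending confidence order and an intersection-over-union (IoU) threshold of 0.5; the threshold, the tuple layouts, and the function names are illustrative assumptions rather than the evaluation protocol used in the thesis.

```python
def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def precision_recall(predictions, ground_truth, iou_threshold=0.5):
    """Compute (precision, recall) for one image or a pooled set of boxes.

    predictions:  list of (class_id, confidence, box)
    ground_truth: list of (class_id, box)
    """
    matched = set()   # indices of ground-truth boxes already claimed
    true_pos = 0
    # Higher-confidence predictions get the first chance to match.
    for cls, _conf, box in sorted(predictions, key=lambda p: -p[1]):
        best_idx, best_iou = None, iou_threshold
        for idx, (gt_cls, gt_box) in enumerate(ground_truth):
            if idx in matched or gt_cls != cls:
                continue
            overlap = iou(box, gt_box)
            if overlap >= best_iou:
                best_idx, best_iou = idx, overlap
        if best_idx is not None:
            matched.add(best_idx)
            true_pos += 1
    false_pos = len(predictions) - true_pos
    false_neg = len(ground_truth) - true_pos
    precision = true_pos / (true_pos + false_pos) if predictions else 0.0
    recall = true_pos / (true_pos + false_neg) if ground_truth else 0.0
    return precision, recall
```

In this setting recall is the more safety-critical number, since a false negative corresponds to an explosive object that was present but not reported.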
Despite the high performance on synthetic data, a difference in effectiveness was observed when compared to the performance on real landmine images, underscoring the necessity for further advancements in the 3D printing process and data generation. The analysis of individual class performance reveals substantial variations in precision and recall across different types of landmines, underscoring the need for further investigation into the features utilized by the model for classification and methods to enhance the model's capacity to discern between EO and visually similar background objects and debris.

The next stage involved developing a cloud-based service with a messenger bot interface to provide practical access to the trained EO detection models. The service integrates with Telegram, a popular messenger platform that supports bot creation. This platform was chosen for its large user base, robust bot API, and end-to-end encryption capabilities. The system utilizes the trained YOLOv8 model and integrates it with GCP for efficient processing and scalability. The bot is designed to identify various EO types with high accuracy and provides users with supplementary information through integration with Google Gemini. This feature enhances user understanding of the detected threats and contributes to raising public awareness about the safety measures associated with explosive objects.

To improve the performance metrics of the models and expand the scope of their practical application, a multi-platform application has been developed that is capable of operating on a variety of operating systems. The application utilizes previously developed and trained models. The application has been tested with the help of volunteers, including professional deminers and military engineers. The developed cross-platform application for real-time EO detection is a valuable tool for demining teams, humanitarian organizations, and civilians in explosive-affected areas.
The application supports both online and offline modes, using deep learning models. On the test dataset, the application demonstrates a recall of 88% with an average processing time of 2.1 seconds per image. An important feature of the application is the ability for users to directly adjust the recognition results. This allows for the collection of feedback data to correct model errors and augment the training set. The application also provides for the transmission of detection results, including operational logs, local database copies, and images (both original and user-modified) to a remote server. The data obtained in this manner serves as a valuable source of information for the further improvement of the models. Future development initiatives include expanding recognition to new types of EO, improving the existing model based on collected data and user feedback, and creating a training module. Continuous development and improvement of this application can make a significant contribution to global demining efforts and enhance the safety of communities.

Building on the previous sections, the developed software components, including the cross-platform application and messenger bot, together form a unified algorithmic environment. This environment supports the centralized training, deployment, and ongoing refinement of deep learning models for a wide range of EO detection tasks, including user education, real-time detection, and the collection of user-annotated data for continuous model improvement.

The research encompasses several key stages. Firstly, a comprehensive review of existing methods for detecting EO was conducted, along with an in-depth analysis of the challenges associated with data limitations and potential solutions. Secondly, various deep learning model architectures, specifically YOLOv5, YOLOv8, and YOLOv11, were developed, trained, and comparatively analyzed.
Thirdly, a software package, including a cross-platform application and a messenger bot, was developed, and its effectiveness was evaluated under conditions closely approximating real-world scenarios. The results of this research are expected to make a significant contribution to solving the problem of mine hazards, thus contributing to restoring the safety and well-being of the population in Ukraine and other affected regions.

While the primary focus of the research is on computer vision-based detection methods using visible light images, the proposed approaches can be adapted for use with other types of sensors, such as infrared cameras, lidars, magnetometers, and ground-penetrating radar (GPR). The effectiveness of the developed methods is evaluated through training and testing of machine learning models on the collected and augmented datasets, using the metrics described in Section 2.9.

The scientific novelty of the study is as follows.

For the first time:
• A new algorithmic platform based on the Model of the Unified Algorithmic Environment (MUAE) for EO detection has been developed, built on common principles of data processing and unified methods of augmentation, dataset creation, annotation, and hyperparameter tuning of deep learning models. The MUAE platform integrates a cross-platform application with offline and semi-automatic annotation capabilities, a messenger bot, and a cloud API, providing data collection, training, deployment, and iterative model improvement.
• An offline EO detection capability integrated into a cross-platform application has been implemented, enabling the detection of EO on mobile devices without an internet connection.
• A semi-automatic annotation mechanism has been integrated into a cross-platform application, allowing users to correct recognition results (add, delete, or change the labels and boundaries of objects) and send annotated data for further model improvement.
• A cloud API based on GCP has been created for EO recognition tasks, which allows online recognition and is integrated with a cross-platform application and a messenger bot; in addition to recognition, the bot provides further information about the recognized objects using the Google Gemini large language model.

Improvements were made to:
• The methodology for forming training datasets using 3D-printed copies that reproduce the visual characteristics of EO common in Ukraine. The trained models were then tested on a separate set of images of real explosive objects provided by demining specialists and volunteers.
• The methodology for applying data augmentation, through the development and implementation of a two-stage augmentation optimized for the YOLO family of models (YOLOv5, YOLOv8, YOLOv11).

The theoretical significance of the obtained results is as follows:
• Extension of the methodology for creating training data: The efficacy of a methodology for generating training datasets for EO recognition based on 3D-printed replicas has been experimentally validated, paving the way for the generation of controlled and secure data for training deep learning models. This methodology can be adapted to other computer vision tasks where there is a shortage of real-world data.
• Improvement of the application of augmentation methods: A two-stage data augmentation strategy was developed and validated for the YOLO family of models. This strategy yielded a substantial improvement in recall, a critical factor for EO detection.
• Development of the concept of a single algorithmic environment: A Model of the Unified Algorithmic Environment (MUAE) was proposed and implemented to integrate different stages of recognition system development (data collection, training, deployment, feedback) into a single, manageable process.
This model has the potential to serve as a foundational framework for the development of analogous systems in other application domains.

The practical value of the results obtained is the development of a ready-to-use software package for the recognition of EO. This software package has the following advantages:
• Increased efficiency and safety of humanitarian demining: A cross-platform application with offline recognition capability allows deminers to quickly identify EO on mobile devices directly in the field, even in the absence of an internet connection. The average image recognition time is 2.1 seconds, which ensures near real-time operation.
• Simpler training and informing: The messenger bot with the integrated Google Gemini language model can be used to train deminers, military personnel, and civilians by providing information on the types of EO and rules of safe behavior.
• Fast data collection and updating: A semi-automatic annotation mechanism integrated into the application allows for the rapid collection of data on new types of EO, recognition errors, and peculiarities of real-world conditions, which contributes to the continuous improvement of models.
• Ability to integrate with other systems: The developed cloud API can be used both independently for online recognition and integrated into other systems, for example, to automate the demining process using unmanned aerial vehicles (UAVs) and robots.
• Potential for scaling and adaptation: The developed system can be adapted to recognize other types of threats (not only EO) and for use in other regions.

The results of this study have significant potential for practical application in the field of humanitarian demining, especially in Ukraine and other regions affected by armed conflicts.
The developed cross-platform software system, which integrates innovative methods of data collection and analysis, contributes to solving the urgent problem of mine risk and will positively impact the reconstruction and development of the affected areas.

The proposed data collection and augmentation approaches, including the innovative use of 3D printing to create realistic EO models, enable the creation of high-quality and representative training sets. This improves the accuracy and reliability of the machine learning models, which is critical for effective and accurate EO detection and localization.

The developed cross-platform application facilitates the involvement of a wide range of experts and volunteers in the data collection process, ensuring the data is up-to-date and relevant to real-world conditions. This enables a rapid response to changes in the mine situation and increases the efficiency of mine action.

The integration of the trained machine learning models into a multi-platform software system creates a powerful tool for automating the process of detecting and locating EO. This will significantly reduce risks to personnel involved in demining operations and accelerate the clearance of areas from mines and other unexploded objects. Furthermore, the software package can be used to monitor and control the effectiveness of demining operations and to plan further activities.

The use of the developed system will accelerate the recovery of the affected regions by enabling access to agricultural land, restoring infrastructure, and creating conditions for sustainable economic development. Reducing the number of casualties among civilians and soldiers by improving the accuracy and safety of EO detection is a crucial humanitarian outcome of this research.

The results and developed technologies have the potential for wide application not only in Ukraine but also in other countries facing the problem of mine hazards.
The proposed approaches can be adapted to different types of terrain and conditions, making them a versatile tool for humanitarian demining. Thus, this thesis makes a significant contribution to addressing the global problem of mine hazards and helps to restore the safety and well-being of the population in conflict-affected regions.

Personal contribution of the applicant: The dissertation is an independent scientific work that presents the author's ideas and developments, which made it possible to accomplish the stated tasks. The work contains theoretical and methodological provisions and conclusions formulated by the author personally. The ideas, provisions, or hypotheses of other authors used in the dissertation have appropriate references and are used only to support the applicant's ideas. The author conducted the research and experiments independently; the software product created is entirely the result of the applicant's work.

Kunichik O. V. Development of a cross-platform Model of the Unified Algorithmic Environment for the recognition of explosive objects. Qualifying scientific work as a manuscript. Thesis for the degree of Doctor of Philosophy in Specialty 121 "Software Engineering" (12, Information Technology). Taras Shevchenko National University of Kyiv, Kyiv, 2025.

Contamination of territories with landmines and explosive objects (EO) is one of the most serious problems facing the world today, threatening human lives and impeding the socio-economic development of conflict-affected regions. EO include a variety of munitions containing explosives, in particular bombs, warheads, rockets, artillery shells, mines, torpedoes, and depth charges, as well as improvised explosive devices and other dangerous objects capable of exploding under certain conditions [12]. The Russian aggression against Ukraine has resulted in large-scale contamination of the territory with mines and unexploded ordnance.
Demining and overcoming the consequences of the war are complicated by the large number of EO types, the significant areas of contaminated territory, and the variety of mining methods and tactics used. The widespread use of minefields makes it impossible to safely access land for agriculture, infrastructure development, and housing reconstruction, which significantly hinders the recovery of deoccupied territories. Mine action in Ukraine began in 2014, when Russia illegally annexed Crimea and started the war in the Donbas. After the full-scale invasion in February 2022, Ukraine faced an unprecedented level of mine contamination, becoming, according to preliminary estimates, the most mined country in the world [13]. Traditional, mostly manual, demining methods remain indispensable, but they are associated with high risk, significant time costs, and low efficiency, especially over large areas. The search for and disposal of EO is complicated by the wide range of their types, the unpredictability of their location, and the diversity of landscapes and weather conditions. In this context, innovative technologies, in particular deep learning and computer vision methods, open up new opportunities to improve the effectiveness and safety of mine action.

This dissertation addresses the urgent global problem of EO contamination by developing a new unified algorithmic environment for real-time EO detection. The research focuses on creating a comprehensive dataset using 3D-printed replicas of common EO types, applying advanced data augmentation methods, training and optimizing a YOLO object detection model (versions 5, 8, and 11 were tested), and developing a cross-platform application and a messenger bot for accessible and efficient EO detection.
The aim of this dissertation is to develop and evaluate a unified algorithmic environment for real-time EO detection that uses a cross-platform application, a messenger bot interface, and cloud-based data processing. The research seeks to offer an efficient and accessible method of EO detection by drawing on the scalability of cloud computing, the accessibility of mobile devices and messaging platforms, and the power of deep learning.

To achieve this aim, the following objectives were set:
1. Develop methodologies for addressing the lack of data by using augmentation methods, in particular a two-stage approach to augmentation.
2. Create a comprehensive dataset for training computer vision models that includes images of 3D-printed replicas of the anti-personnel mines most common in Ukraine, obtained under different weather conditions (bright sun, cloud cover, rain, snow, etc.).
3. Train and optimize the YOLOv8 and YOLOv11 computer vision models on the created dataset, using the increased number and diversity of images obtained in the previous stages, applying the developed two-stage augmentation methodology, and fine-tuning the model hyperparameters.
4. Evaluate the effectiveness of the trained models on real EO images from a separate dataset obtained from professional deminers.
5. Design, implement, and evaluate a user-friendly cross-platform application capable of both online (using the Google Cloud Platform API) and offline (using an optimized on-device model) EO detection.
6. Develop and integrate a messenger bot interface that interacts with the same Google Cloud Platform API for EO detection, providing an alternative access point to the system.
7. Implement within the cross-platform application the ability to correct recognition results by marking objects and selecting the correct EO type, which will contribute to continuous model refinement.
8. Create a secure data transmission mechanism in the application for sending operational data, including user feedback and corrections, to a central server to support the continuous improvement of the machine learning models.

The object of the study is the process of recognizing explosive objects (EO), which includes building, training, optimizing, and deploying the corresponding deep learning models, as well as creating the Model of the Unified Algorithmic Environment (MUAE) for integrating these models into a cross-platform application and a messenger bot.

The Model of the Unified Algorithmic Environment (MUAE) is an algorithmic architecture that makes it possible to create universal systems for solving a set of interrelated tasks on the basis of a single algorithmic platform or framework. The MUAE rests on the use of shared data structures and a unified set of algorithmic tools and procedures, which provides a single approach to solving diverse problems within one system. Such an architecture can be implemented in different ways, for example through a recursive-parallel algorithm or as a multi-algorithmic platform, but the key element remains the shared environment, which enables the efficient solution of a wide class of applied tasks through the unification and reuse of computational components.

The subject of the study comprises deep learning models for EO recognition, data augmentation methods, dataset creation technologies, data collection methods, and the architecture of a multi-platform EO detection system designed to operate in online and offline modes.

Methodology

The research methodology applied in this dissertation takes a multifaceted approach that combines methods of data collection and processing, model development, and system implementation.
The main components of the methodology are:
- Data collection and preparation: acquiring EO images by several methods, in particular creating an original dataset based on 3D-printed replicas (Chapter 4), collecting images of real mines from professional deminers, and, potentially, using a crowdsourcing platform integrated into the developed application (Chapter 6).
- Data augmentation: a two-stage augmentation strategy, described in detail in Chapter 3, is applied to increase the size and diversity of the training dataset. It includes both basic transformations (e.g., rotation, scaling, flipping) and advanced methods (e.g., MixUp, CutMix, mosaic).
- Development and adaptation of deep learning models: the research focuses on developing, optimizing, and adapting deep learning models for EO detection. The YOLOv8 architecture was used first; the process included hyperparameter tuning, model training, and validation on the augmented dataset of 3D-printed replicas. Later, in Chapter 7, the methodology was extended to cover a new, previously untrained EO type using the YOLOv11 model, demonstrating the system's adaptability to changing real-world scenarios.
- Development of the cross-platform application: the application is built with the Qt framework, QML, and C++ (Chapter 6). It incorporates the trained YOLO models for both online (cloud-based) and offline (on-device) EO detection and provides user interaction, data annotation, and feedback features.
- Cloud infrastructure and distributed systems: the system uses Google Cloud Platform (GCP) for scalability and reliability, including Google Cloud Functions for cloud computation, Google Cloud Storage for data storage, and Google Firestore for database management.
- Object-oriented programming: the software components of the system, including the cross-platform application and some parts of the cloud infrastructure, are implemented following object-oriented principles in C++ and Python to ensure modularity, maintainability, and code reuse.
- Evaluation and validation: the performance of the developed models and application is evaluated with standard metrics such as precision, recall, and mAP, among others (Section 2.9). Experiments are conducted on both synthetic datasets (3D-printed replicas) and real EO images to assess the system's effectiveness and generalization ability.
The research began with a review of existing work on the detection and recognition of explosive objects (EO), which made it possible to identify the main problems and challenges researchers face when developing modern computer-vision-based EO recognition methods. The analysis showed that deep learning models can deliver high accuracy and speed of recognition, but that an acute shortage of high-quality training data in this domain is a critical obstacle to training them effectively. To overcome the limited-data problem, 3D-printed models reproducing the visual characteristics of anti-personnel mines common in Ukraine were used. The role of augmentation methods (artificial data expansion) in improving the performance of deep learning models was also studied. A wide range of augmentation methods was considered, from basic spatial and pixel-level transformations to advanced methods such as MixUp and CutMix. A two-stage augmentation strategy was proposed (see Chapter 3), which was shown to raise the recall of the YOLOv8 model from 89.2% (without augmentation) to 92.6% (with two-stage augmentation) on the test dataset.
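The precision and recall figures quoted here follow the standard object detection definitions. The following is a minimal Python sketch of how such values can be computed for one image. It assumes boxes given as (x1, y1, x2, y2) pixel corners, greedy one-to-one matching, and an IoU threshold of 0.5. The threshold, the helper names, and the sample boxes are illustrative assumptions and are not taken from the thesis. Full mAP evaluation additionally ranks predictions by confidence, which this sketch omits.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed pixel corners


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(preds: List[Box], truths: List[Box], thr: float = 0.5):
    """Greedily match each prediction to at most one unmatched ground-truth box."""
    unmatched = list(truths)
    tp = 0
    for p in preds:
        best = max(unmatched, key=lambda t: iou(p, t), default=None)
        if best is not None and iou(p, best) >= thr:
            unmatched.remove(best)  # each ground-truth box can be matched once
            tp += 1
    fp = len(preds) - tp   # predictions with no matching ground truth
    fn = len(unmatched)    # ground-truth boxes that were missed
    precision = tp / (tp + fp) if preds else 0.0
    recall = tp / (tp + fn) if truths else 0.0
    return precision, recall


# Hypothetical sample: two real objects, three predictions, one of them correct.
truths = [(10, 10, 50, 50), (100, 100, 140, 140)]
preds = [(12, 12, 50, 50), (200, 200, 240, 240), (300, 300, 320, 320)]
precision, recall = precision_recall(preds, truths)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this sample one of three predictions is correct and one of two objects is found, so precision is lower than recall. For EO detection a missed object (a false negative) is the costlier error, which is why the abstract reports recall alongside precision.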
The methodology was initially developed for the YOLOv5 model and was later successfully adapted to the newer YOLOv8 and YOLOv11 versions. The transition from YOLOv8 to YOLOv11 demonstrated a significant advantage of the two-stage augmentation method. In particular, applying mosaic augmentation at both the first and the second stage improved precision from 95% to 98% and recall from 92% to 97% (see Table 7.1). These results underline the effectiveness of the YOLO architecture, especially in combination with the developed two-stage augmentation strategy. The research also highlights challenges and limitations of data augmentation in EO detection, such as the risk of producing irrelevant or harmful augmentations, over-reliance on augmentation, and the need to ensure balanced representation of all classes in the training dataset. The results of this stage are used in the subsequent chapters, which cover the creation of a dataset of synthetic EO images based on 3D printing and the development of a cross-platform system that uses a mobile application both for EO detection (recognizing objects in images) and for data collection (with in-app editing and upload to a server) to improve the model.
After the augmentation strategy was formulated (Chapter 3), existing methods of creating EO image datasets were examined. Of the methods considered, 3D printing of replicas of common EO types, followed by photographing those replicas, was chosen. The resulting 3D-printed replicas are used to build the dataset, and the data are annotated manually with object bounding boxes. Experiments are then run on training deep learning models, in particular YOLOv8, again using the previously developed augmentation methodology. The resulting models are tested on images of real EO obtained from professional deminers.
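Augmentation of annotated detection data has to transform the bounding boxes together with the pixels. The sketch below illustrates one basic transformation (horizontal flip) and one advanced transformation (MixUp) on YOLO-style normalized labels. It is only an illustration of the two families of methods named in the abstract: how the thesis allocates transformations between its two stages is described in Chapter 3 and is not reproduced here. The tiny grayscale images, the fixed blending weight, and the function names are assumptions made for the example.

```python
from typing import List, Tuple

Image = List[List[float]]                        # grayscale rows, values in [0, 1]
Label = Tuple[int, float, float, float, float]   # (class_id, x_center, y_center, w, h), normalized


def hflip(image: Image, labels: List[Label]) -> Tuple[Image, List[Label]]:
    """Basic transform: mirror the image and its boxes left to right."""
    flipped = [row[::-1] for row in image]
    # Only the x coordinate of each box center changes under a horizontal flip.
    new_labels = [(c, 1.0 - xc, yc, w, h) for c, xc, yc, w, h in labels]
    return flipped, new_labels


def mixup(img_a: Image, labels_a: List[Label],
          img_b: Image, labels_b: List[Label],
          lam: float = 0.5) -> Tuple[Image, List[Label]]:
    """Advanced transform: blend two same-sized images and keep both label sets."""
    blended = [[lam * pa + (1.0 - lam) * pb for pa, pb in zip(ra, rb)]
               for ra, rb in zip(img_a, img_b)]
    return blended, labels_a + labels_b


# Hypothetical 2x4 images with one annotated object each.
img_a = [[0.0, 0.2, 0.4, 0.6], [0.0, 0.2, 0.4, 0.6]]
img_b = [[1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0]]
labels_a = [(0, 0.25, 0.5, 0.2, 0.4)]
labels_b = [(1, 0.50, 0.5, 0.3, 0.3)]

flipped_img, flipped_labels = hflip(img_a, labels_a)
mixed_img, mixed_labels = mixup(flipped_img, flipped_labels, img_b, labels_b, lam=0.5)
print(flipped_labels)
print(mixed_labels)
```

In practice the MixUp weight is usually sampled at random rather than fixed, and training frameworks for YOLO models expose such transformations as configurable options, so a hand-written version like this is mainly useful for understanding what happens to the labels.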
From the test results it is concluded that, although the resulting models demonstrate that EO recognition is feasible, they need further improvement, in particular by extending the training dataset with additional images obtained from professional deminers. The synthetic dataset, consisting of 1438 photographs of 3D-printed replicas of five common EO types, proved effective for training the YOLOv8 model, which reached 98.0% precision and 98.2% recall on the test dataset. The results also underline the potential of 3D printing for creating diverse and representative training datasets for computer vision models, offering a safe, ethical, and cost-effective approach for both research and practical training. Despite the high scores on synthetic data, a gap in performance was found when compared with results on real EO images, which points to the need to further improve the 3D printing and data generation process. Per-class analysis also shows considerable differences in precision and recall between EO types, which calls for further study of the features the model uses for classification and of methods for improving the model's ability to distinguish ground-level EO from visually similar background objects and debris.
The next stage was the development of a cloud service with an integrated messenger bot interface that gives practical access to the trained EO detection models. The service is integrated with Telegram, a popular messaging platform that supports bot creation. The platform was chosen for its wide adoption, its reliable bot API, and its end-to-end encryption capabilities. The system uses the trained YOLOv8 model and is integrated with GCP for efficient data processing and scalability.
The bot is designed to identify different EO types with high accuracy and gives users additional information through integration with Google Gemini. This feature improves the user's understanding of detected threats and helps raise mine safety awareness.
To improve model performance and broaden practical use, a multi-platform application capable of running on different operating systems was developed. The application uses the previously developed and trained models. It was tested with the help of volunteers, including professional deminers and military engineers. The resulting cross-platform application for real-time EO detection is a valuable tool for demining teams, humanitarian organizations, and civilians in EO-affected areas. The application works in both online and offline modes, using deep learning models trained on 3D-printed EO replicas. On the test dataset, the application shows a recall of 88% with an average processing time of 2.1 seconds per image. An important feature of the application is that users can correct recognition results directly. This makes it possible to collect feedback data for correcting model errors and extending the training set. The application also transmits detection results, including operation logs, local database copies, and images (both original and user-modified), to a remote server. The data obtained are a valuable source of information for further model improvement. Future development will include increasing the number of recognized EO types and refining the existing model using the collected data and user feedback. Continued development and improvement of this software can make a significant contribution to global demining efforts and improve community safety.
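The correction and upload workflow described above implies some record that pairs the model's prediction with the user's corrected annotation. A minimal Python sketch of such a record follows. The field names, the placeholder class names, and the JSON layout are hypothetical and do not reflect the actual application (which is written in C++/Qt) or its server protocol. Transport and encryption are deliberately left out.

```python
import json
from dataclasses import dataclass, asdict, field
from typing import List


@dataclass
class Box:
    label: str        # EO type name; the values used below are placeholders
    x_center: float   # all coordinates normalized to [0, 1]
    y_center: float
    width: float
    height: float


@dataclass
class CorrectionRecord:
    image_id: str                                   # hypothetical identifier
    predicted: List[Box] = field(default_factory=list)
    corrected: List[Box] = field(default_factory=list)

    def changed(self) -> bool:
        """True when the user edited, added, or removed at least one box."""
        return self.predicted != self.corrected

    def to_json(self) -> str:
        """Serialize the record for upload; sending it is out of scope here."""
        return json.dumps(asdict(self), sort_keys=True)


# The user relabels the detected object and slightly widens its box.
record = CorrectionRecord(
    image_id="img-0001",
    predicted=[Box("type_A", 0.50, 0.50, 0.20, 0.20)],
    corrected=[Box("type_B", 0.50, 0.50, 0.22, 0.20)],
)
payload = record.to_json()
print(record.changed(), len(json.loads(payload)["corrected"]))
```

Keeping both the predicted and the corrected boxes in one record lets a server distinguish confirmed detections from corrected ones when assembling new training data.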
Building on the preceding chapters, the developed software components, including the cross-platform application and the messenger bot, together form a unified algorithmic environment. This environment supports centralized training, deployment, and continuous improvement of deep learning models for a wide range of EO detection tasks, including user training, real-time recognition, and the collection of user-annotated data for continuous model improvement.
The research comprises several key stages. First, a comprehensive review of existing EO detection methods was carried out, together with an in-depth analysis of the problems caused by limited data and of potential ways to solve them. Second, different deep learning model architectures, in particular YOLOv5, YOLOv8, and YOLOv11, were developed and compared. Third, a software suite comprising the cross-platform application and the messenger bot was developed, and its effectiveness was evaluated under near-real conditions. The results of this research are expected to make a substantial contribution to addressing the mine threat and to help restore the safety and well-being of the population in Ukraine and other affected regions.
The research focuses on computer-vision-based detection methods that use visible-light images. The proposed approaches can, however, be adapted for use with other sensor types, such as infrared cameras, lidars, magnetometers, and ground-penetrating radars. The effectiveness of the developed methods is evaluated by training and testing machine learning models on the collected datasets, extended with the augmentation methods described in Chapter 3, using the metrics described in Section 2.9.
The scientific novelty of the results is as follows:
1. For the first time, a new algorithmic platform based on the Model of the Unified Algorithmic Environment (MUAE) has been developed for the recognition of explosive objects (EO). It builds on common data processing principles and unified methods of augmentation, dataset creation, annotation, and hyperparameter tuning of deep learning models. MUAE integrates the cross-platform application (with offline operation and semi-automatic annotation), the messenger bot, and the cloud API, providing data collection, training, deployment, and iterative improvement of models.
2. For the first time, offline EO recognition has been implemented and integrated into the cross-platform application, making it possible to detect explosive objects on mobile devices without an Internet connection.
3. For the first time, a semi-automatic annotation mechanism has been implemented and integrated into the cross-platform application, allowing users to correct recognition results (add, delete, and change labels and object boundaries) and submit the annotated data for further model improvement.
4. For the first time for EO recognition tasks, a cloud API based on GCP has been created that provides online recognition and is integrated with the cross-platform application and the messenger bot; in addition to recognition, the bot provides further information about detected objects by means of the Google Gemini large language model.
5. A method is proposed for building training datasets from 3D-printed replicas that reproduce the visual characteristics of EO common in Ukraine, followed by testing of the trained models on a separate set of real EO images provided by demining specialists and volunteers.
6. A method of applying data augmentation is proposed through the development and implementation of a two-stage augmentation optimized for the YOLO model family (YOLOv5, YOLOv8, YOLOv11).
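One way to picture how a shared environment can expose online and offline recognition behind a single interface is sketched below in Python. Everything in it is an assumption made for illustration: the class names, the result shape, the stubbed backends (no network call and no model inference take place), and in particular the automatic fallback from cloud to on-device detection, since the abstract does not state whether the application switches modes automatically or leaves the choice to the user.

```python
from abc import ABC, abstractmethod
from typing import List, Tuple

Detection = Tuple[str, float]  # (label, confidence); placeholder result shape


class Detector(ABC):
    """Common interface shared by every recognition backend."""

    @abstractmethod
    def detect(self, image: bytes) -> List[Detection]:
        ...


class CloudDetector(Detector):
    """Stand-in for a call to the cloud API; no real request is made."""

    def __init__(self, reachable: bool):
        self.reachable = reachable

    def detect(self, image: bytes) -> List[Detection]:
        if not self.reachable:
            raise ConnectionError("cloud API unreachable")
        return [("type_A", 0.97)]  # hard-coded placeholder result


class OnDeviceDetector(Detector):
    """Stand-in for the optimized on-device model."""

    def detect(self, image: bytes) -> List[Detection]:
        return [("type_A", 0.91)]  # hard-coded placeholder result


class UnifiedDetector(Detector):
    """Try the cloud backend first and fall back to the local model."""

    def __init__(self, online: Detector, offline: Detector):
        self.online, self.offline = online, offline

    def detect(self, image: bytes) -> List[Detection]:
        try:
            return self.online.detect(image)
        except ConnectionError:
            return self.offline.detect(image)


with_network = UnifiedDetector(CloudDetector(True), OnDeviceDetector()).detect(b"image-bytes")
without_network = UnifiedDetector(CloudDetector(False), OnDeviceDetector()).detect(b"image-bytes")
print(with_network, without_network)
```

Because both backends return the same result shape, client code such as an application screen or a bot handler does not need to know which one produced the detection.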
The theoretical significance of the results is as follows:
• Extending the methodology for creating training data: a methodology for creating training datasets for EO recognition based on 3D-printed replicas is proposed and experimentally validated; it makes it possible to generate controlled and safe data for training deep learning models. The methodology can be adapted to other computer vision tasks where real data are scarce.
• Improving augmentation methods: a two-stage data augmentation strategy optimized for the YOLO model family is developed and validated. The strategy showed a considerable improvement in recall, which is critical for EO detection.
• Advancing the concept of a unified algorithmic environment: the Model of the Unified Algorithmic Environment (MUAE) is proposed and implemented; it integrates the different stages of developing recognition systems (data collection, training, deployment, feedback) into a single managed process. The model can serve as a basis for similar systems in other application areas.
The practical significance of the results lies in a deployment-ready software suite for recognizing explosive objects (EO), with the following advantages:
• Greater efficiency and safety of humanitarian demining: the cross-platform application with offline recognition lets deminers identify EO quickly on mobile devices directly in the field, even without an internet connection. The average recognition time per image is 2.1 seconds, which gives near-real-time operation.
• Simpler training and awareness-raising: the messenger bot with the integrated Google Gemini language model can be used to train deminers, military personnel, and civilians, providing information on EO types and rules of safe behavior.
• Rapid data collection and updating: the semi-automatic annotation mechanism integrated into the application makes it possible to quickly collect data on new EO types, recognition errors, and the specifics of real conditions, which supports continuous model improvement.
• Integration with other systems: the developed cloud API can be used on its own for online recognition or integrated into other systems, for example to automate demining with UAVs and robots.
• Potential for scaling and adaptation: the developed system can be adapted to recognize other types of threats (not only EO) and for use in other regions.
The results of this research have considerable potential for practical application in humanitarian demining, especially in Ukraine and other regions affected by armed conflict. The developed cross-platform software suite, which integrates innovative methods of data collection and analysis, helps address the urgent problem of the mine threat and will have a positive effect on the reconstruction and development of affected territories.
The approaches to data collection and augmentation proposed in this work, including the innovative use of 3D printing to create realistic EO models, make it possible to build high-quality and representative training sets. This improves the accuracy and reliability of machine learning models, which is critical for effective and precise detection and localization of EO.
The developed cross-platform application helps involve a wide range of specialists and volunteers in collecting information about EO, keeping that information current and consistent with real conditions. This makes it possible to respond quickly to changes in the mine situation and to increase the effectiveness of mine action.
Integrating the trained machine learning models into a multi-platform software suite creates a powerful tool for automating the detection and localization of EO.
This will substantially reduce the risks to personnel involved in demining operations and considerably speed up the clearance of mines and other explosive objects from contaminated territories. In addition, the software suite can be used to monitor and control the effectiveness of demining operations and to plan further measures.
Use of the developed system will help accelerate the reconstruction of affected regions by opening access to agricultural land, restoring infrastructure, and creating conditions for sustainable economic development. Reducing casualties among civilians and military personnel through more accurate and safer EO detection is the most important humanitarian aspect of the research.
The results and the developed technologies have potential for wide application not only in Ukraine but also in other countries facing the mine threat. The proposed approaches can be adapted to different types of terrain and conditions, which makes them a universal tool for humanitarian demining. This dissertation thus makes a substantial contribution to addressing the global mine threat and helps restore the safety and well-being of the population in war-affected regions.
Personal contribution of the applicant: The dissertation is an independent scientific work that presents the author's own ideas and developments, which made it possible to accomplish the stated tasks. The work contains theoretical and methodological propositions and conclusions formulated by the dissertant personally. Ideas, propositions, or hypotheses of other authors used in the dissertation are properly referenced and are used only to support the applicant's ideas.
The author carried out the research and experiments independently; the resulting software product is entirely the dissertant's own work.
Language: en
Keywords: Machine Learning; Deep Learning; Artificial Intelligence; Computer Vision; Humanitarian Demining; Explosive Objects (EO) Recognition; Data Augmentation; Two-step Augmentation; YOLO; Landmine Recognition; Cross-platform application; Cloud Computing; Model of Unified Algorithmic Environment (MUAE); Messenger Bot; Mobile application
Title: Development of a cross-platform Model of the Unified Algorithmic Environment for the recognition of explosive objects
Title (Ukrainian): Розробка кросплатформної Моделі Єдиного Алгоритмічного Середовища для розпізнавання вибухонебезпечних предметів
Type: Dissertation