IDENTIFICATION PARAMETERS OF DYNAMIC OBJECTS USING TRANSFORMER WITH OPTICAL FLOW AND ENSEMBLE METHODS

DOI:

https://doi.org/10.20998/2079-0023.2025.01.16

Keywords:

Remote identification of dynamic objects, object detection, computer vision, ensemble methods, deep learning, convolutional neural networks, system architecture, software, machine learning, artificial intelligence

Abstract

The article presents an approach to identifying the parameters of dynamic objects in a video stream using a transformer-based architecture, the GeoNet model, and two ensemble machine learning methods, bagging and boosting. Identifying parameters such as position, velocity, direction of movement, and depth is important for a wide range of applications, including autonomous driving, robotics, and video surveillance. The paper describes a system that captures the spatiotemporal characteristics of a video stream by computing optical flow and depth maps with GeoNet, analyzes them with a transformer, and improves accuracy with ensemble methods. GeoNet, a deep convolutional neural network, combines depth estimation and optical flow estimation within a single architecture, enabling accurate 3D scene reconstruction. The transformer models global dependencies across video frames and improves the accuracy of object classification and detection. Bagging reduces variance by averaging the results of several models trained on different subsets of the data, while boosting focuses on difficult examples to improve prediction accuracy. The proposed system achieves high accuracy under dynamic backgrounds, lighting changes, occlusions, and noise, making it adaptable for real-time use in complex scenes. Each system component is described in detail: the GeoNet architecture, the transformer modules, the implementation of bagging and boosting, and the result fusion algorithm. The expected results are intended to demonstrate the effectiveness of integrating deep learning methods with classical ensemble approaches for high-precision dynamic object identification. The proposed methodology opens new prospects for the development of next-generation intelligent computer vision systems.
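The two ensemble ideas named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the synthetic track, the function names (`fit_line`, `bagged_velocity`, `fit_stump`, `boosted_fit`, `mse`), and all parameter values are assumptions made for illustration. Bagging is shown as averaging velocity estimates from lines fitted to bootstrap resamples of noisy positions. Boosting is shown in a residual-fitting (gradient-boosting-style) form with decision stumps, where each round concentrates on the frames the current ensemble still predicts poorly; the paper may use a different boosting variant.

```python
import random
from statistics import mean


def fit_line(ts, xs):
    """Least-squares fit x = v*t + x0; returns (v, x0)."""
    mt, mx = mean(ts), mean(xs)
    stt = sum((t - mt) ** 2 for t in ts)
    if stt == 0:  # degenerate resample: every sampled frame is the same
        return 0.0, mx
    v = sum((t - mt) * (x - mx) for t, x in zip(ts, xs)) / stt
    return v, mx - v * mt


def bagged_velocity(ts, xs, n_models=25, seed=0):
    """Bagging: fit one line per bootstrap resample, then average the slopes."""
    rng = random.Random(seed)
    n = len(ts)
    slopes = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        v, _x0 = fit_line([ts[i] for i in idx], [xs[i] for i in idx])
        slopes.append(v)
    return mean(slopes)


def fit_stump(ts, rs):
    """Best single-threshold regressor on residuals; returns (thr, left, right)."""
    best = None
    for thr in sorted(set(ts))[:-1]:  # skip the max so both sides are non-empty
        left = [r for t, r in zip(ts, rs) if t <= thr]
        right = [r for t, r in zip(ts, rs) if t > thr]
        lm, rm = mean(left), mean(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, thr, lm, rm)
    if best is None:  # only one distinct t: fall back to a constant
        m = mean(rs)
        return ts[0], m, m
    return best[1:]


def boosted_fit(ts, xs, n_rounds=50, lr=0.5):
    """Boosting: each round fits a stump to the current residuals."""
    base = mean(xs)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        residuals = [x - p for x, p in zip(xs, preds)]
        thr, lm, rm = fit_stump(ts, residuals)
        stumps.append((thr, lm, rm))
        preds = [p + lr * (lm if t <= thr else rm) for t, p in zip(ts, preds)]
    return base, stumps, preds


def mse(xs, preds):
    """Mean squared error between observations and predictions."""
    return mean((x - p) ** 2 for x, p in zip(xs, preds))


# Hypothetical synthetic track: true velocity 2.0 units/frame, start 1.0, Gaussian noise.
rng = random.Random(42)
frames = list(range(30))
positions = [2.0 * t + 1.0 + rng.gauss(0.0, 0.5) for t in frames]

velocity = bagged_velocity(frames, positions)
_, _, boosted = boosted_fit(frames, positions)
print(f"bagged velocity estimate: {velocity:.3f}")
print(f"baseline MSE: {mse(positions, [mean(positions)] * len(positions)):.3f}")
print(f"boosted MSE:  {mse(positions, boosted):.3f}")
```

In this sketch the bagged estimate should land close to the true velocity of 2.0, and the boosted predictor should reduce the training error well below that of the constant baseline. In a full system, the per-frame positions would come from the detection and optical flow stages rather than from a synthetic generator.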

Author Biographies

Oleksii Kondratov, National Technical University «Kharkiv Polytechnic Institute»

Postgraduate student and Senior Lecturer, Department of Information Systems and Technologies, National Technical University «Kharkiv Polytechnic Institute», Kharkiv, Ukraine; Senior Lecturer, Department of IT, Analysis and Project Decisions, Technical University "METINVEST POLYTECHNICS", LLC, Zaporizhzhia, Ukraine

Olena Nikulina, National Technical University «Kharkiv Polytechnic Institute»

Doctor of Technical Sciences, Professor, Head of the Department of Information Systems and Technologies, National Technical University «Kharkiv Polytechnic Institute», Kharkiv, Ukraine; Professor, Department of IT, Analysis and Project Decisions, Technical University "METINVEST POLYTECHNICS", LLC, Zaporizhzhia, Ukraine

References

Wang Z., Turko R., Shaikh O., Park H., Das N., Hohman F., Kahng M., Chau D. CNN Explainer: Learning Convolutional Neural Networks with Interactive Visualization. URL: https://arxiv.org/abs/2004.15004 (accessed: 05.05.2025).

Vaswani A., Shazeer N., Parmar N., Uszkoreit J., Jones L., Gomez A., Kaiser L., Polosukhin I. Attention Is All You Need. URL: https://arxiv.org/abs/1706.03762 (accessed: 05.05.2025).

Carion N., Massa F., Synnaeve G., Usunier N., Kirillov A., Zagoruyko S. End-to-End Object Detection with Transformers. URL: https://arxiv.org/abs/2005.12872v3 (accessed: 05.05.2025).

Zou Z., Chen K., Shi Z., Shi Z., Guo Y., Ye J. Object Detection in 20 Years: A Survey. URL: https://arxiv.org/pdf/1905.05055.pdf?fbclid=IwAR0ILGAWTwU-9-iH6lZyPFXYXA5JRWarM_XoSJ78QEhmnn-txvr_iGEzCio (accessed: 05.05.2025).

Ammar A., Chebbah A., Fredj H., Souani C. Comparative Study of latest CNN based Optical Flow Estimation. URL: https://ieeexplore.ieee.org/document/9806070/references#references (accessed: 05.05.2025).

Yin Z., Shi J. GeoNet: Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose. URL: https://arxiv.org/abs/1803.02276v2 (accessed: 05.05.2025).

Girshick R., Donahue J., Darrell T., Malik J. Region-based convolutional networks for accurate object detection and segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2016, vol. 38, no. 1, pp. 142–158.

Nikulina O. M., Severin V. P., Kondratov O. M., Olhovoy O. M. Models of remote identification of parameters of dynamic objects using detection transformers and optical flow. Vestnik Nats. tekhn. un-ta "KhPI": sb. nauch. tr. Temat. vyp.: Sistemnyy analiz, upravlenie i informatsionnye tekhnologii [Bulletin of the National Technical University "KhPI": a collection of scientific papers. Thematic issue: System analysis, management and information technology]. Kharkiv, NTU "KhPI" Publ., no. 1 (11), pp. 52–57.

Kondratov O. M., Nikulina O. M. Software implementation using transformer with optical flow and GeoNet for identifying parameters of dynamic objects. Vestnik Nats. tekhn. un-ta "KhPI": sb. nauch. tr. Temat. vyp.: Sistemnyy analiz, upravlenie i informatsionnye tekhnologii [Bulletin of the National Technical University "KhPI": a collection of scientific papers. Thematic issue: System analysis, management and information technology]. Kharkiv, NTU "KhPI" Publ., no. 2 (12), pp. 86–91.

Nikulina O. M., Severin V. P., Kondratov O. M., Rekova N. Y. Analysis of information technologies for remote identification of dynamic objects. Vestnik Nats. tekhn. un-ta "KhPI": sb. nauch. tr. Temat. vyp.: Sistemnyy analiz, upravlenie i informatsionnye tekhnologii [Bulletin of the National Technical University "KhPI": a collection of scientific papers. Thematic issue: System analysis, management and information technology]. Kharkiv, NTU "KhPI" Publ., no. 1 (9), pp. 110–115.

Gracyk A., Chen X. GeONet: a neural operator for learning the Wasserstein geodesic. URL: https://arxiv.org/abs/2209.14440 (accessed: 05.05.2025).

Inomata T., Kimura K., Hagiwara M. Object Tracking and Classification System Using Agent Search. URL: https://www.jstage.jst.go.jp/article/ieejeiss/129/11/129_11_2065/_pdf/-char/ja (accessed: 05.05.2025).

Gavrylenko S., Chelak V., Hornostal O. Construction Method Of Fuzzy Decision Trees For Identification The Computer System State. 2022 XXXII International Scientific Symposium Metrology and Metrology Assurance (MMA). 2022, pp. 1–5.

Published

2025-07-11

How to Cite

Kondratov, O., & Nikulina, O. (2025). IDENTIFICATION PARAMETERS OF DYNAMIC OBJECTS USING TRANSFORMER WITH OPTICAL FLOW AND ENSEMBLE METHODS. Bulletin of National Technical University "KhPI". Series: System Analysis, Control and Information Technologies, (1 (13)), 106–111. https://doi.org/10.20998/2079-0023.2025.01.16

Section

INFORMATION TECHNOLOGY