IDENTIFICATION PARAMETERS OF DYNAMIC OBJECTS USING TRANSFORMER WITH OPTICAL FLOW AND ENSEMBLE METHODS
DOI:
https://doi.org/10.20998/2079-0023.2025.01.16
Keywords:
Remote identification of dynamic objects, object detection, computer vision, ensemble methods, deep learning, convolutional neural networks, system architecture, software, machine learning, artificial intelligence
Abstract
The article presents an approach to identifying the parameters of dynamic objects in a video stream using a transformer-based architecture, the GeoNet model, and ensemble machine learning methods, namely bagging and boosting. The identification of parameters such as position, velocity, direction of movement, and depth is of significant importance for a wide range of applications, including autonomous driving, robotics, and video surveillance systems. The paper describes a comprehensive system that integrates the spatiotemporal characteristics of a video stream by computing optical flow and depth maps using GeoNet, further analyzing them through a transformer, and enhancing accuracy via ensemble methods. GeoNet, as a deep convolutional neural network, combines the tasks of depth estimation and optical flow within a single architecture, enabling accurate 3D scene reconstruction. The use of a transformer allows modeling global dependencies across video frames and improves the accuracy of object classification and detection. At the same time, bagging reduces variance by averaging the results of several models trained on different subsets, while boosting focuses on difficult examples to improve prediction accuracy. The proposed system achieves high accuracy under conditions of dynamic background, lighting changes, occlusions, and noise, making it adaptable for real-time use in complex scenes. A detailed description of each system component is provided: the GeoNet architecture, transformer modules, implementation of bagging and boosting, and the result fusion algorithm. The expected results are intended to demonstrate the effectiveness of integrating deep learning methods with classical ensemble approaches for high-precision dynamic object identification tasks. The proposed methodology opens new prospects for the development of next-generation intelligent computer vision systems.
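The bagging step mentioned in the abstract (averaging several models trained on different subsets) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the base learners or the features, so the least-squares speed estimator, the synthetic displacement data, and the names `fit_slope`, `bagged_speed`, and `make_samples` are all assumptions made for illustration.

```python
import random
from statistics import mean


def fit_slope(samples):
    """Base model (assumed): least-squares slope through the origin,
    i.e. displacement = speed * elapsed_time."""
    num = sum(dt * dx for dt, dx in samples)
    den = sum(dt * dt for dt, _ in samples)
    return num / den


def bagged_speed(samples, n_models=25, seed=0):
    """Bagging: fit the base model on bootstrap resamples and average."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_models):
        # Bootstrap resample: draw len(samples) items with replacement.
        boot = [rng.choice(samples) for _ in samples]
        estimates.append(fit_slope(boot))
    return mean(estimates), estimates


def make_samples(true_speed=2.0, n=40, noise=0.3, seed=1):
    """Synthetic (elapsed_time, displacement) pairs with Gaussian noise.
    Hypothetical stand-in for per-frame measurements of a moving object."""
    rng = random.Random(seed)
    samples = []
    for i in range(1, n + 1):
        dt = i * 0.04  # assumed frame spacing of 0.04 s (25 fps)
        dx = true_speed * dt + rng.gauss(0.0, noise)
        samples.append((dt, dx))
    return samples


if __name__ == "__main__":
    data = make_samples()
    speed, members = bagged_speed(data)
    print(f"bagged speed estimate: {speed:.3f}")
    print(f"spread of ensemble members: {min(members):.3f} .. {max(members):.3f}")
```

Each ensemble member sees a slightly different resample of the data, so individual estimates scatter around the true value, and their mean is less sensitive to any single noisy measurement. Boosting, by contrast, would train members sequentially and reweight the examples that earlier members predicted poorly; it is not shown here.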
License

This work is licensed under a Creative Commons Attribution 4.0 International License.