OPTIMIZATION OF THE ANNOTATION PROCESS FOR BIOLOGICAL OBJECT IMAGES USING COMPUTER VISION METHODS

Authors

Anton Kovalenko

DOI:

https://doi.org/10.20998/2079-0023.2025.01.11

Keywords:

computer vision, optimization, edge detection, contour detection, object segmentation, machine learning, blood cell classification

Abstract

This study presents an approach to the automated creation of annotated datasets of biological object images, in particular images of cells. The methodology follows a modified CRISP-DM framework adapted to computer vision tasks and defines a sequence of stages and steps for detecting and localizing biological objects in microscopic images. Images are first preprocessed by binarization, filtering, brightness and contrast adjustment, and correction of illumination artifacts. These operations improve the quality of the input images and the accuracy of the subsequent detection steps. Objects are then localized automatically by morphological analysis and grouped with the k-means algorithm using features such as object size and mean color value, which makes it possible to distinguish different types of cells or structures by their visual characteristics. Bounding boxes are generated automatically for the localized objects, and their coordinates are stored in a structured tabular format (.csv). The resulting dataset can be used to train or test deep learning models for tasks such as object localization, classification, or segmentation. The approach was validated on blood smear images containing various types of cells. All computations were carried out in Python using the Pandas, NumPy, OpenCV, and Matplotlib libraries. The analysis of detection and classification accuracy gave satisfactory results, confirming that the developed pipeline is feasible for the automated generation of annotated biological image datasets.
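The pipeline described in the abstract (binarization, object localization, k-means grouping by size and mean color, bounding boxes written to .csv) can be illustrated with a minimal sketch. This is not the author's implementation: the abstract does not say which library calls were used, so the sketch uses only the Python standard library as a dependency-free stand-in, where a real pipeline would more likely rely on OpenCV functions such as cv2.threshold, cv2.connectedComponentsWithStats, and cv2.kmeans. All function names, the threshold of 128, k = 2, the assumption that cells are darker than the background, and the synthetic test image are hypothetical. Filtering, contrast adjustment, and illumination correction are omitted.

```python
import csv
import io
from collections import deque


def binarize(image, threshold):
    """Return a 0/1 mask; assumes objects are darker than the background."""
    return [[1 if px < threshold else 0 for px in row] for row in image]


def find_objects(image, mask):
    """Label 4-connected foreground regions; return bounding box and features."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    objects = []
    for y0 in range(height):
        for x0 in range(width):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            queue, pixels = deque([(y0, x0)]), []
            seen[y0][x0] = True
            while queue:  # breadth-first flood fill of one region
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            ys, xs = [p[0] for p in pixels], [p[1] for p in pixels]
            objects.append({
                "x_min": min(xs), "y_min": min(ys),
                "x_max": max(xs), "y_max": max(ys),
                "area": len(pixels),
                "mean_intensity": sum(image[y][x] for y, x in pixels) / len(pixels),
            })
    return objects


def normalise(values):
    """Min-max scale to [0, 1] so that no single feature dominates the distance."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]


def kmeans(points, k, iterations=20):
    """Lloyd's k-means; initial centroids are the first k points (deterministic)."""
    k = min(k, len(points))
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def annotate(image, threshold=128, k=2):
    """Binarize, localize objects, and attach a cluster label to each one."""
    objects = find_objects(image, binarize(image, threshold))
    if not objects:
        return []
    areas = normalise([o["area"] for o in objects])
    means = normalise([o["mean_intensity"] for o in objects])
    for obj, label in zip(objects, kmeans(list(zip(areas, means)), k)):
        obj["cluster"] = label
    return objects


def to_csv(objects):
    """Serialize the annotations as CSV text, one row per bounding box."""
    fields = ["x_min", "y_min", "x_max", "y_max", "area", "mean_intensity", "cluster"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields)
    writer.writeheader()
    writer.writerows(objects)
    return buffer.getvalue()


def demo_image():
    """Synthetic 8x12 grayscale image: two small dark blobs, one large paler blob."""
    image = [[255] * 12 for _ in range(8)]
    for y0, x0, size, value in ((1, 1, 2, 40), (1, 5, 5, 110), (5, 1, 2, 50)):
        for y in range(y0, y0 + size):
            for x in range(x0, x0 + size):
                image[y][x] = value
    return image


if __name__ == "__main__":
    print(to_csv(annotate(demo_image())))
```

On the synthetic image the two small dark blobs fall into one cluster and the large paler blob into the other. Real micrographs would need the preprocessing steps listed in the abstract before such a simple threshold gives usable masks.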

Author Biography

Anton Kovalenko, National Technical University "Kharkiv Polytechnic Institute"

PhD Student, National Technical University "Kharkiv Polytechnic Institute", Kharkiv, Ukraine

Published

2025-07-11

How to Cite

Kovalenko, A. (2025). OPTIMIZATION OF THE ANNOTATION PROCESS FOR BIOLOGICAL OBJECT IMAGES USING COMPUTER VISION METHODS. Bulletin of National Technical University "KhPI". Series: System Analysis, Control and Information Technologies, (1 (13)), 77–82. https://doi.org/10.20998/2079-0023.2025.01.11

Section

INFORMATION TECHNOLOGY