INFORMATION TECHNOLOGIES FOR THE INTEGRATION OF CUSTOMER AND CONSUMER DATA
DOI:
https://doi.org/10.20998/2079-0023.2025.02.09
Keywords:
database, data warehouse, data integration, star schema, dimension/fact tables, ETL
Abstract
Using the example of a book enterprise that combines the functions of a publisher, distributor, and retailer, this paper shows how multi-channel operations lead to the accumulation of large volumes of data in databases that are fragmented, incomplete, and unstructured and that contain duplicates. This makes it impossible to analyze customer behavior effectively or to calculate key performance indicators accurately. The work is relevant because it narrows the gap between the volume of accumulated data and the business's ability to base management decisions on it. The purpose of the work is to develop a methodological approach to building a data warehouse based on the star schema and to implement an adaptive ETL pipeline with built-in quality control rules. Modern data warehouse design methods were analyzed, including the transition from the entity-relationship model to the star schema. Based on the structure of the transactional database and the business requirements for data analysis, an analytical warehouse using the star schema was designed, and the key facts and dimensions needed to support comprehensive customer analytics were identified. To transfer data from the transactional system to the warehouse, an extract, transform, and load (ETL) process was developed and its logic described: extracting data from the sources, cleaning and transforming it in a staging area, and loading it into the target warehouse tables. The effectiveness of the developed processes was evaluated using event log data. The results confirm the reliability and high performance of the proposed solution. The proposed approach provides automated, reliable, and efficient updating of the data warehouse, creating a single source of truth for business analytics.
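The ETL logic summarized in the abstract (extraction, cleaning and transformation in a staging step, loading into star schema tables, and an event log) can be illustrated with a minimal sketch. It is not the authors' implementation: the use of SQLite, the table and column names (`dim_customer`, `dim_channel`, `fact_sales`, `etl_log`), the sample rows, and the three quality rules (normalize text, reject incomplete rows, drop duplicate orders) are all assumptions chosen for illustration.

```python
import sqlite3

# Hypothetical source rows: (order_id, email, name, channel, amount).
SAMPLE_ROWS = [
    (1, "Ann@Example.com ", "Ann", "web", 20.0),
    (1, "ann@example.com", "Ann", "web", 20.0),    # duplicate order
    (2, "bob@example.com", "Bob", "Store", 15.5),
    (3, None, "Unknown", "web", 9.0),              # missing email
    (4, "ann@example.com", "Ann", "store", 30.0),
]


def build_warehouse(conn):
    """Create an assumed star schema: two dimensions, one fact table, one log."""
    conn.executescript("""
        CREATE TABLE dim_customer (
            customer_key INTEGER PRIMARY KEY,
            email TEXT UNIQUE NOT NULL,
            name TEXT
        );
        CREATE TABLE dim_channel (
            channel_key INTEGER PRIMARY KEY,
            channel TEXT UNIQUE NOT NULL
        );
        CREATE TABLE fact_sales (
            order_id INTEGER PRIMARY KEY,
            customer_key INTEGER NOT NULL REFERENCES dim_customer(customer_key),
            channel_key INTEGER NOT NULL REFERENCES dim_channel(channel_key),
            amount REAL NOT NULL
        );
        CREATE TABLE etl_log (step TEXT, row_count INTEGER);
    """)


def clean(rows):
    """Staging step: normalize text, reject incomplete rows, drop duplicates."""
    seen, accepted, rejected = set(), [], 0
    for order_id, email, name, channel, amount in rows:
        email = (email or "").strip().lower()
        channel = (channel or "").strip().lower()
        incomplete = not email or not channel or amount is None or amount <= 0
        if incomplete or order_id in seen:
            rejected += 1
            continue
        seen.add(order_id)
        accepted.append((order_id, email, (name or "").strip(), channel, float(amount)))
    return accepted, rejected


def load(conn, rows):
    """Insert dimension members if new, then the fact row with surrogate keys."""
    for order_id, email, name, channel, amount in rows:
        conn.execute(
            "INSERT OR IGNORE INTO dim_customer (email, name) VALUES (?, ?)",
            (email, name),
        )
        conn.execute("INSERT OR IGNORE INTO dim_channel (channel) VALUES (?)", (channel,))
        customer_key = conn.execute(
            "SELECT customer_key FROM dim_customer WHERE email = ?", (email,)
        ).fetchone()[0]
        channel_key = conn.execute(
            "SELECT channel_key FROM dim_channel WHERE channel = ?", (channel,)
        ).fetchone()[0]
        conn.execute(
            "INSERT OR REPLACE INTO fact_sales "
            "(order_id, customer_key, channel_key, amount) VALUES (?, ?, ?, ?)",
            (order_id, customer_key, channel_key, amount),
        )


def run_etl(conn, source_rows):
    """Run extract, clean, and load, writing one event log row per step."""
    extracted = list(source_rows)
    accepted, rejected = clean(extracted)
    load(conn, accepted)
    stats = {"extracted": len(extracted), "rejected": rejected, "loaded": len(accepted)}
    conn.executemany("INSERT INTO etl_log VALUES (?, ?)", list(stats.items()))
    conn.commit()
    return stats


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    build_warehouse(connection)
    print(run_etl(connection, SAMPLE_ROWS))
```

Keeping the quality rules in a separate `clean` function means rejected rows are counted before anything touches the warehouse tables, and the `etl_log` counts give a simple basis for the kind of event log evaluation the abstract mentions.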
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
