EVALUATING THE GENERALIZATION ABILITY OF AI-GENERATED TEXT DETECTORS TO UNSEEN GENERATORS

Taras Petryshak; Viktoriia Vysotska

doi:10.20998/2079-0023.2026.01.16

Authors

Taras Petryshak Lviv Polytechnic National University, Ukraine https://orcid.org/0009-0006-6296-3867
Viktoriia Vysotska Lviv Polytechnic National University, Ukraine https://orcid.org/0000-0001-6417-3689

DOI:

https://doi.org/10.20998/2079-0023.2026.01.16

Keywords:

AI-generated text detection, generalization ability, unseen generator, stylometric features, transformer-based models, text classification

Abstract

Many AI-generated text detectors demonstrate high performance on datasets constructed within typical evaluation protocols. In particular, classical models based on stylometric features, such as text length, punctuation patterns, and aggregated formality indicators, can effectively capture statistical regularities of machine generation. However, their performance decreases substantially when texts produced by previously unseen generators are encountered. Under such conditions, feature distributions shift, which leads to a decline in classification quality, primarily due to an increase in false negative errors. This paper investigates the generalization ability of detection models under conditions involving an unseen generator. The study compares classical stylometric models and transformer-based approaches using the LOGO (Leave-One-Generator-Out) evaluation protocol. The task is formulated as binary text classification across two domains, Reddit and Wikipedia, and involves three generators, namely ChatGPT, Davinci, and Dolly. The classical models include Random Forest and Gradient Boosting, whereas the transformer-based approaches are represented by DistilBERT and RoBERTa. Model performance is evaluated using Accuracy, Precision, Recall, F1, and Macro-F1, with the final results averaged across multiple random initializations. The results show that transformer-based models demonstrate a higher ability to generalize to texts produced by unseen generators. In contrast, stylometric approaches exhibit a substantial degradation in performance, particularly depending on the domain and text length. Error analysis indicates that the main factor behind this decline is the increase in false negative errors. An additional analysis of feature importance shows that classical models rely heavily on surface-level textual characteristics, which do not ensure stable generalization across different generators. 
Therefore, the findings highlight the importance of evaluating AI-generated text detectors under the LOGO protocol to ensure robust performance in the presence of new language models.
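The LOGO protocol and the Macro-F1 metric named in the abstract can be illustrated with a minimal, standard-library sketch. It is not the authors' implementation: the record layout (a `generator` field on every sample, with human texts assumed to be grouped with the generator subset they are paired with) and the helper names `logo_splits` and `macro_f1` are assumptions made for illustration only.

```python
def logo_splits(records):
    """Yield (held_out, train, test) once per generator.

    Assumption: every record is a dict with a 'generator' key naming the
    data subset it belongs to. The held-out generator's records never
    appear in the training fold.
    """
    generators = sorted({r["generator"] for r in records})
    for held_out in generators:
        train = [r for r in records if r["generator"] != held_out]
        test = [r for r in records if r["generator"] == held_out]
        yield held_out, train, test


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for cls in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: label 1 = machine-generated, 0 = human-written
records = [
    {"text": "a", "label": 1, "generator": "ChatGPT"},
    {"text": "b", "label": 0, "generator": "ChatGPT"},
    {"text": "c", "label": 1, "generator": "Davinci"},
    {"text": "d", "label": 0, "generator": "Davinci"},
    {"text": "e", "label": 1, "generator": "Dolly"},
    {"text": "f", "label": 0, "generator": "Dolly"},
]

for held_out, train, test in logo_splits(records):
    print(held_out, len(train), len(test))
```

In this sketch a false negative (a machine text predicted as human) lowers recall for the machine class and therefore its F1, which is how the error pattern described in the abstract would show up in Macro-F1.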

References

Wang Y., Mansurov J., Ivanov P., et al. M4: Multi-Generator, Multi-Domain, and Multi-Lingual Black-Box Machine-Generated Text Detection. EACL 2024. 2024. Available at: https://aclanthology.org/2024.eacl-long.83.pdf (accessed 05.03.2026).

Schuster T., Schuster R., Shah D. J., Barzilay R. The Limitations of Stylometry for Detecting Machine-Generated Fake News. Computational Linguistics. 2020, vol. 46, no. 2, pp. 499–510. DOI: 10.1162/coli_a_00380.

Pu X., Zhang J., Han X., Tsvetkov Y., He T. On the Zero-Shot Generalization of Machine-Generated Text Detectors. Findings of the Association for Computational Linguistics: EMNLP 2023. 2023, pp. 4799–4808. DOI: 10.18653/v1/2023.findings-emnlp.318.

Databricks. Databricks Dolly (Dolly-v2): official repository/documentation. Available at: https://github.com/databrickslabs/dolly (accessed 05.03.2026).

scikit-learn developers. GroupShuffleSplit: scikit-learn documentation. Available at: https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GroupShuffleSplit.html (accessed 05.03.2026).

imbalanced-learn developers. Under-sampling: imbalanced-learn user guide (v0.14.1). Available at: https://imbalanced-learn.org/stable/under_sampling.html (accessed 05.03.2026).

Sanh V., Debut L., Chaumond J., Wolf T. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv, Computer Science. 2019. DOI: 10.48550/arXiv.1910.01108. Available at: https://arxiv.org/abs/1910.01108 (accessed 05.03.2026).

Liu Y., Ott M., Goyal N., et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv, Computer Science. 2019. DOI: 10.48550/arXiv.1907.11692. Available at: https://arxiv.org/abs/1907.11692 (accessed 05.03.2026).

Breiman L. Random Forests. Machine Learning. 2001, vol. 45, pp. 5–32. DOI: 10.1023/A:1010933404324. Available at: https://link.springer.com/article/10.1023/A:1010933404324 (accessed 05.03.2026).

Friedman J. H. Greedy function approximation: A gradient boosting machine. Annals of Statistics. 2001, vol. 29, no. 5, pp. 1189–1232. DOI: 10.1214/aos/1013203451. Available at: https://www.cse.cuhk.edu.hk/irwin.king/_media/presentations/2001_greedy_function_approximation_a_gradient_boosting_machine.pdf (accessed 05.03.2026).

Zheng A., Casari A. Feature Engineering for Machine Learning: Principles and Techniques for Data Scientists. Sebastopol, O’Reilly Publ., 2018. Available at: https://dl.acm.org/doi/abs/10.5555/3239815 (accessed 05.03.2026).

Zellers R., Holtzman A., Rashkin H., et al. Defending Against Neural Fake News. arXiv, Computer Science. 2019. DOI: 10.48550/arXiv.1905.12616. Available at: https://arxiv.org/abs/1905.12616 (accessed 05.03.2026).

Mitchell E., Lee Y., Khazatsky A., Manning C. D., Finn C. DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature. Proceedings of Machine Learning Research. 2023, vol. 202. Available at: https://proceedings.mlr.press/v202/mitchell23a.html (accessed 05.03.2026).
