DOI: 10.14489/vkit.2025.11.pp.027-032
Madiyarov K. G. EVALUATION OF PERFORMANCE AND ACCURACY OF MACHINE LEARNING MODELS FOR PREDICTING THE IMPACT OF SULFUR DIOXIDE AND OZONE ON WHEAT (pp. 27-32)
Abstract (translated from the Russian). The paper evaluates the accuracy and performance of machine learning algorithms used to forecast the impact of sulfur dioxide and ozone on wheat yield. The source data are Air Quality Index (AQI) readings collected by the Air Quality System (AQS) of the US Environmental Protection Agency (EPA) for 2003–2023, covering pollutant concentrations, meteorological characteristics, and monitoring station details. Forecasting uses tree-based ensemble methods (CatBoost, RandomForest, HistGradientBoosting), and convolutional neural networks (EfficientNet-B0, MobileNetV3-Large, ResNet50) are used to analyze images of damaged plants. The models were compared on accuracy metrics (Accuracy, Precision, Recall, F1-score) and performance metrics (training cost, prediction speed, resource consumption). The results show that CatBoost has the highest accuracy in numeric forecasting, while EfficientNet-B0 achieves high accuracy in classifying visual symptoms of plant damage. The identified patterns can be used to develop intelligent systems for agricultural insurance that automate risk assessment and forecast crop losses.
Abstract. This study benchmarks machine learning models for forecasting the impact of ground-level sulfur dioxide (SO₂) and ozone (O₃) on wheat. A 2003–2023 dataset merges EPA Air Quality System records with meteorological features and regional crop yields. For tabular predictions, we apply CatBoost, Random Forest, and HistGradientBoosting using engineered features that capture pollutant exposure and phenological stages. Image-based classification employs EfficientNet-B0, MobileNetV3-Large, and ResNet50 to identify leaf damage caused by SO₂ and O₃. The models are evaluated using accuracy, F1-score, MAE, log-loss, and prediction speed. CatBoost achieves the best numeric forecast (MAE 10.3 %), while EfficientNet-B0 demonstrates strong classification performance (accuracy 88.6 %, log-loss 0.312). Together, the dual-model system raises R² from 0.56 to 0.81 and reduces insurance claim variance by 23 %. Scenario modeling reveals that a 15 ppb increase in O₃ concentration may reduce yields by 4.7 %, yet timely intervention could save up to 52 USD per hectare in insurance costs. The results highlight the potential of ensemble learning and sensor fusion in agricultural risk analytics. Future work includes expanding to other crops and integrating hyperspectral data with attention-based model architectures.
The paper evaluates how machine learning methods predict wheat losses driven by ground-level sulfur dioxide (SO₂) and ozone (O₃) in Kazakhstan. A 2003–2023 dataset combining US EPA AQS pollution readings, regional weather logs, and field-plot yield records was curated. Two complementary pipelines were implemented. Tabular features (SO₂, O₃, temperature, humidity, rainfall, soil moisture) were modelled with CatBoost, HistGradientBoosting, RandomForest, and TPOT AutoML. CatBoost achieved the highest predictive accuracy for numeric yield impacts (96 %) and the lowest MAE (≈ 10 %). Image-based diagnosis employed EfficientNet-B0, ResNet50, and MobileNetV3-Large, trained on labelled leaf photographs showing pollutant injury. ResNet50 attained the top classification accuracy (91.3 %), while EfficientNet-B0 offered a favourable balance of accuracy (87.7 %) and log-loss (0.312). Combining tabular risk scores with image validation enhanced overall reliability. The study confirms gradient boosting as optimal for rapid scalar predictions and lightweight CNNs as suitable for in-field diagnostics on constrained hardware. The findings help insurers price pollution endorsements and guide farmers toward targeted mitigation, illustrating how multi-source data and machine learning can reinforce climate-smart agronomy.
Keywords: Machine learning; Forecasting; Air pollution; Sulfur dioxide; Ozone; Wheat; Image analysis; Plant diseases; Agricultural insurance.
K. G. Madiyarov (Novosibirsk State University of Economics and Management (NINH), Novosibirsk, Russia)