
Journal of Desert Research (中国沙漠), 2024, Vol. 44, Issue 6: 70-78. DOI: 10.7522/j.issn.1000-694X.2024.00049


An ensemble-learning-based model for automatic identification of aeolian saltation trajectories

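Hongji Zhou1, Fanmin Mei1, Mengji Pu1, Chuan Lin1, Jin Su2, Jinguang Chen3

  1. School of Environmental and Chemical Engineering, Xi'an Polytechnic University, Xi'an 710600, Shaanxi, China
  2. School of Science, Xi'an Polytechnic University, Xi'an 710600, Shaanxi, China
  3. School of Computer Science, Xi'an Polytechnic University, Xi'an 710600, Shaanxi, China
  • Received: 2024-03-11 Revised: 2024-05-12 Publication date: 2024-11-20 Release date: 2024-12-06
  • Corresponding author: Fanmin Mei
  • About the authors: Fanmin Mei (E-mail: meifanmin@xpu.edu.cn)
    Hongji Zhou (born 1997), male, from Xinzhou, Shanxi; master's student, mainly engaged in research on tracking aeolian saltating particles. E-mail: hosico@126.com
  • Funding:
    National Natural Science Foundation of China (41340043); Natural Science Foundation of Shaanxi Province (2021JM-448); Xi'an Polytechnic University First-Class Course (Environmental Ecology) Project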

Ensemble models for identifying automatically aeolian saltating tracks driven by datasets

Hongji Zhou1, Fanmin Mei1, Mengji Pu1, Chuan Lin1, Jin Su2, Jinguang Chen3

  1. School of Environmental and Chemical Engineering, Xi'an Polytechnic University, Xi'an 710600, China
  2. School of Science, Xi'an Polytechnic University, Xi'an 710600, China
  3. School of Computer Science, Xi'an Polytechnic University, Xi'an 710600, China
  • Received: 2024-03-11 Revised: 2024-05-12 Online: 2024-11-20 Published: 2024-12-06
  • Contact: Fanmin Mei

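Abstract (Chinese version):

Developing algorithms for tracking aeolian saltation trajectories requires an automatic method for identifying saltation trajectories to replace manual identification. Using a self-built saltation trajectory dataset, this study developed four optimized ensemble learning models (Extremely Randomized Trees, Random Forest, Gradient Boosting Decision Tree, and XGBoost) to identify saltation trajectories automatically. The results show that all of these models classify saltation trajectories well, reflecting the particular advantage of ensemble learning models in solving this kind of nonlinear problem. Among the models studied, the Extremely Randomized Trees model has the highest accuracy (0.9035), precision (0.9030), recall (0.9035), F1 score (0.8995), MCC (0.7378), and AUC score (0.9179), as well as the highest time cost; the XGBoost model has good prediction scores and a lower time cost. The former is suited to automatic offline identification of saltation trajectories, while the latter has potential for online tracking of sand particles. Parameterization schemes such as adding the variances of the instantaneous horizontal and vertical velocities not only optimize the dataset but also further improve the predictive performance of the Extremely Randomized Trees model.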

Keywords (Chinese version): aeolian saltation, Extremely Randomized Trees, XGBoost, Random Forest, Gradient Boosting Decision Tree
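The velocity-spread features mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how tracks are stored or how velocities are computed, so everything here is an assumption: the function name `velocity_spread_features` is hypothetical, a track is taken to be a list of (x, z) positions in consecutive frames, the frame interval `dt` is taken to be constant, and velocities are estimated by forward differences. The Chinese abstract refers to variance and the English abstract to standard deviation, so the sketch returns both.

```python
from statistics import stdev, variance


def velocity_spread_features(track, dt):
    """Spread of instantaneous velocities along one candidate track.

    track: sequence of (x, z) positions in consecutive frames (assumed layout).
    dt:    time between frames in seconds (assumed constant).
    """
    if dt <= 0:
        raise ValueError("dt must be positive")
    if len(track) < 3:
        raise ValueError("need at least 3 positions to get 2 velocity samples")

    # Forward differences: one velocity sample per pair of consecutive frames.
    u = [(x1 - x0) / dt for (x0, _), (x1, _) in zip(track, track[1:])]
    w = [(z1 - z0) / dt for (_, z0), (_, z1) in zip(track, track[1:])]

    # statistics.variance / stdev are the sample (n - 1) estimators.
    return {
        "var_u": variance(u),
        "var_w": variance(w),
        "std_u": stdev(u),
        "std_w": stdev(w),
    }


# Toy track: steady horizontal motion, rising then falling vertically.
features = velocity_spread_features([(0, 0), (1, 1), (2, 1), (3, 0)], dt=1.0)
```

In a pipeline of this kind, the returned values would be appended to each track's feature vector before training a classifier; which other features the authors use is not stated in the abstract.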

Abstract:

Automatic identification of saltating tracks is essential for tracking sand particles. We therefore propose four ensemble models (Extremely Randomized Trees, Random Forest, XGBoost, and Gradient Boosting Decision Tree), trained on datasets we constructed, to identify saltating tracks. First, all four models perform well even though the dataset is not highly discriminable, suggesting that these models have an advantage in handling nonlinear relationships. Second, the Extremely Randomized Trees model achieves the highest accuracy (0.9035), precision (0.9030), recall (0.9035), F1 score (0.8995), MCC (0.7378), and AUC score (0.9179), but also the highest time cost, whereas the XGBoost model offers the best balance between high scores and low time cost. This implies that the former is best suited to identifying saltating tracks offline and that the latter is promising for tracking sand particles online. Finally, the improved datasets, which incorporate the standard deviation of instantaneous horizontal and vertical velocities, significantly enhance the predictive performance of the Extremely Randomized Trees model. This study effectively reduces the time cost of manual trajectory verification and broadens the application of machine learning to saltation research.

Key words: aeolian saltating, extremely randomized trees, XGBoost, random forest, gradient boosting decision tree
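The scores quoted in the abstract can be made concrete with a small sketch of how such metrics are computed from true and predicted labels. The abstract does not state the averaging scheme or the class definitions, so the following are assumptions: support-weighted averaging across classes, binary labels with 1 as the positive class, and the hypothetical function names `weighted_scores` and `binary_mcc`. Under support-weighted averaging, recall always equals accuracy, which is consistent with the identical reported values (0.9035) but does not prove that this scheme was used.

```python
from collections import Counter
from math import sqrt


def weighted_scores(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall and F1."""
    n = len(y_true)
    pairs = list(zip(y_true, y_pred))
    support = Counter(y_true)  # number of true samples per class
    scores = {"accuracy": sum(t == p for t, p in pairs) / n,
              "precision": 0.0, "recall": 0.0, "f1": 0.0}
    for k in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == k and p == k for t, p in pairs)
        fp = sum(t != k and p == k for t, p in pairs)
        fn = sum(t == k and p != k for t, p in pairs)
        p_k = tp / (tp + fp) if tp + fp else 0.0
        r_k = tp / (tp + fn) if tp + fn else 0.0
        f_k = 2 * p_k * r_k / (p_k + r_k) if p_k + r_k else 0.0
        weight = support[k] / n  # class share of the true labels
        scores["precision"] += weight * p_k
        scores["recall"] += weight * r_k
        scores["f1"] += weight * f_k
    return scores


def binary_mcc(y_true, y_pred, positive=1):
    """Matthews correlation coefficient for two classes."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy labels (assumed meaning: 1 = saltating track, 0 = not a saltating track).
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 0, 0, 0, 0, 1]
scores = weighted_scores(y_true, y_pred)
mcc = binary_mcc(y_true, y_pred)
```

MCC uses all four cells of the confusion matrix, which is why it can sit well below accuracy on imbalanced data, as in the reported 0.7378 versus 0.9035.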

CLC number: