Building an accurate drought prediction model requires both physically meaningful predictors and an efficient prediction method. Machine learning algorithms, which are often more efficient and reliable than traditional prediction methods, are increasingly used in climate prediction. This study uses monthly meteorological data from 70 national meteorological stations in Hubei Province for 1960 to 2022, together with atmospheric circulation and sea temperature indices provided by the National Climate Center and the National Oceanic and Atmospheric Administration (NOAA). The standardized precipitation evapotranspiration index (SPEI) was used to determine drought occurrence, which served as the target variable, and 11 indices were selected as input variables using feature selection methods. On this basis, two machine learning algorithms, classification and regression tree (CART) and random forest (RF), were used to construct summer drought prediction models for Hubei Province. Data from 47 randomly selected years formed the training set, and the remaining 16 years formed the test set used to evaluate prediction performance. The results show that the drought prediction accuracy of the CART and RF models was 88% and 81%, respectively. Both algorithms identified the Asian zonal circulation index as the most important variable, indicating that this index is crucial for predicting summer droughts in Hubei Province. These two machine learning models provide an objective and effective new approach to summer drought prediction in Hubei Province and can supply scientific information for drought prevention and mitigation in the region.
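The random 47/16 split of the 63 years and the accuracy score described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's code: the seed, the function names `split_years` and `accuracy`, and the example labels are hypothetical. With 16 test years, reported accuracies of 88% and 81% would be consistent with 14 and 13 correctly classified years, although the abstract does not state these counts.

```python
import random

def split_years(first=1960, last=2022, n_train=47, seed=0):
    """Randomly split the years first..last into a training and a test set."""
    years = list(range(first, last + 1))
    rng = random.Random(seed)          # hypothetical seed, for reproducibility
    train = sorted(rng.sample(years, n_train))
    test = sorted(set(years) - set(train))
    return train, test

def accuracy(observed, predicted):
    """Fraction of years whose drought class was predicted correctly."""
    hits = sum(o == p for o, p in zip(observed, predicted))
    return hits / len(observed)

train_years, test_years = split_years()
print(len(train_years), len(test_years))   # 47 16

# Made-up labels (1 = drought, 0 = no drought) for 16 test years,
# with two deliberate misses to illustrate the score.
observed = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0]
predicted = list(observed)
predicted[3] = 0
predicted[10] = 1
print(f"{accuracy(observed, predicted):.1%}")   # 87.5%
```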
Comparative analysis of drought events in Guizhou Province during the flood season (June to September) is important for improving short-term climate prediction techniques. Based on precipitation data from 84 meteorological stations in Guizhou Province, the spatial and temporal evolution of two severe drought events during 1981 to 2023 was characterized statistically, the causes of the two events were diagnosed using reanalysis data, and their differences were compared. In addition, 130 climate indices from the National Climate Center were combined with machine learning methods to model drought in Guizhou Province. The results show that flood-season precipitation in Guizhou Province exhibited significant inter-decadal variability and was lowest in 2011 and 2022, both under a La Niña background. In the 2011 flood season, poor water vapor conditions, caused jointly by the eastward displacement of the Western Pacific Subtropical High and low-level cyclonic circulation anomalies over the South China Sea, led to widespread drought in Guizhou Province. In 2022, under the influence of the negative phase of the Tropical Indian Ocean Dipole (TIOD), the Western Pacific Subtropical High was abnormally large, strong and displaced westward, the South Asian High was strong and displaced eastward, and a low-level anticyclonic circulation anomaly prevailed over southern China. Poor water vapor conditions combined with persistent, abnormally high temperatures caused the drought in Guizhou Province to persist. Twenty-six machine learning algorithms were used to build drought prediction models for Guizhou Province, among which the Linear SVC model performed best. Testing and evaluation show that this model predicted the 2011 and 2022 droughts in Guizhou Province well.
Flood forecasting accuracy in small and medium-sized basins in arid and semi-arid regions needs to be improved, primarily because critical factors such as topographic features, critical rainfall, time of concentration, and recurrence intervals are poorly understood. In this paper, Xiahe County in the semi-arid region and Yongchang County in the arid region, both in Gansu Province, were selected as study areas. Field investigations of flood-related factors were carried out across 34 small basins in 86 riverine villages of Xiahe County (2015) and 240 cross-sections in 395 riverine villages of Yongchang County (2014). The instantaneous unit hydrograph method, the regional empirical formula method, and the method proposed by the China Railway First Survey and Design Institute Group Co., Ltd. were used to calculate the time of concentration, design storms, and design floods for the study areas, and flood warning thresholds were estimated. Based on these results, machine learning methods (linear regression, random forest and neural network) were used to establish flood warning models for arid and semi-arid areas, and each model was evaluated. The results show that the prepared transfer rainfall is linearly correlated with factors such as rainfall during storms, the warning rainfall threshold, and main channel slope, and that the linear regression model can accurately calculate the warning rainfall. To further verify its applicability, the regression model was used to back-calculate the prepared transfer rainfall of 34 surveyed basins in Hezuo City, Gansu Province; the mean absolute error was only 0.56 mm.
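The mean absolute error used to verify the regression model is the average magnitude of the differences between modelled and reference values. A minimal sketch follows; the function name and the rainfall values are made up for illustration and are not the Hezuo City survey data.

```python
def mean_absolute_error(reference, modelled):
    """Average absolute difference between paired values (same unit as input)."""
    if len(reference) != len(modelled) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(r - m) for r, m in zip(reference, modelled)) / len(reference)

# Hypothetical prepared transfer rainfall values in mm for four basins.
reference_mm = [20.0, 25.0, 18.0, 30.0]
modelled_mm = [20.5, 24.0, 18.5, 30.0]
mae = mean_absolute_error(reference_mm, modelled_mm)
print(f"MAE = {mae:.2f} mm")   # MAE = 0.50 mm
```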
To find a more accurate photovoltaic power prediction method suitable for arid areas, three short-term photovoltaic power forecasting models were established using the prototype prediction method, the short-term error correction method and the stepwise regression method, based on observed data and numerical forecast information from a photovoltaic power station in Dunhuang, Gansu Province, in 2022. A grasshopper optimization algorithm was then used to optimize the three single models and form a combined prediction method. The forecasting performance of the four methods was tested and evaluated. The results show that the root-mean-square error and relative root-mean-square error of the stepwise regression method are smaller than those of the prototype prediction method and the short-term error correction method; among the three single models, the stepwise regression method has the highest prediction accuracy, the smallest fluctuation range and the most stable performance. Compared with the three single models, the combined prediction model formed by the grasshopper optimization algorithm improves the prediction, reducing the average root-mean-square error by 145.21, 153.48 and 70.91 kW, respectively. Under different weather conditions, the combined forecasting model outperforms the single forecasting models, and it performs best under sunny conditions.
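The idea of combining three single-model forecasts with optimized weights can be sketched as follows. The abstract does not describe how the grasshopper optimization algorithm was configured, so this sketch swaps in a plain grid search over non-negative weights that sum to 1; the power values, the step size and all function names are hypothetical.

```python
import math

def rmse(observed, forecast):
    """Root-mean-square error between two equally long sequences."""
    return math.sqrt(sum((o - f) ** 2 for o, f in zip(observed, forecast)) / len(observed))

def combine(forecasts, weights):
    """Weighted sum of several forecasts, time step by time step."""
    return [sum(w * f[t] for w, f in zip(weights, forecasts))
            for t in range(len(forecasts[0]))]

def best_weights(observed, forecasts, steps=20):
    """Grid search (a stand-in for the grasshopper optimizer) over
    three non-negative weights that sum to 1."""
    best = None
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            w = (i / steps, j / steps, (steps - i - j) / steps)
            score = rmse(observed, combine(forecasts, w))
            if best is None or score < best[0]:
                best = (score, w)
    return best

# Made-up power values in kW, not data from the Dunhuang station.
observed = [100.0, 400.0, 800.0, 650.0, 200.0]
model_a = [150.0, 350.0, 700.0, 700.0, 260.0]
model_b = [60.0, 450.0, 880.0, 600.0, 150.0]
model_c = [110.0, 420.0, 760.0, 640.0, 230.0]
forecasts = [model_a, model_b, model_c]

combined_rmse, weights = best_weights(observed, forecasts)
single_rmses = [rmse(observed, f) for f in forecasts]
print("single-model RMSE (kW):", [round(r, 2) for r in single_rmses])
print("combined RMSE (kW):", round(combined_rmse, 2), "weights:", weights)
```

Because the grid contains the three single-model cases, the combined RMSE found on the fitting data can never exceed the best single-model RMSE there; whether that advantage carries over to new data has to be checked on a separate test set.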
Accurate forecasting of dense fog (visibility less than or equal to 500 meters) is of great significance for ensuring public safety and reducing economic losses. Based on ground observation data from 31 national meteorological stations and environmental monitoring station data in southern Henan, together with ERA5 reanalysis data from the European Centre for Medium-Range Weather Forecasts (ECMWF), for 2019 to 2021, the spatial distribution and physical characteristics of dense fog in this area were analyzed, and 30 dense fog forecast factors were selected. A visibility classification forecast (VCF) model was trained with the LightGBM (Light Gradient Boosting Machine) machine learning method. Taking the ECMWF model forecast field data at 08:00 each day and the PM2.5 concentration monitored at 08:00 as input, the model produces 3-hour graded visibility forecast products for the national stations in southern Henan. A forecast test on 17 dense fog days in southern Henan from January to March 2022 shows that the scores of the VCF model were generally better than those of the visibility forecast output directly by the ECMWF model. The dense fog forecast product generated by the VCF model for the 20:00 to 20:00 period in southern Henan can provide an important reference for forecasting.
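A graded visibility forecast can be verified by converting observed and forecast visibility into dense-fog events with the 500 m threshold defined above and counting hits, misses and false alarms. The abstract does not name the scores it used, so the threat score, probability of detection and false alarm ratio below are assumptions, and the visibility values are made up.

```python
DENSE_FOG_MAX_VISIBILITY_M = 500  # threshold stated in the abstract

def is_dense_fog(visibility_m):
    """True when visibility is at or below the dense-fog threshold."""
    return visibility_m <= DENSE_FOG_MAX_VISIBILITY_M

def fog_scores(observed_m, forecast_m):
    """Return (threat score, probability of detection, false alarm ratio)."""
    hits = misses = false_alarms = 0
    for obs, fc in zip(observed_m, forecast_m):
        o, f = is_dense_fog(obs), is_dense_fog(fc)
        if o and f:
            hits += 1
        elif o:
            misses += 1
        elif f:
            false_alarms += 1
    nan = float("nan")  # returned when a score is undefined
    ts = hits / (hits + misses + false_alarms) if hits + misses + false_alarms else nan
    pod = hits / (hits + misses) if hits + misses else nan
    far = false_alarms / (hits + false_alarms) if hits + false_alarms else nan
    return ts, pod, far

# Hypothetical visibility values in metres for six forecast times.
observed = [300, 450, 1200, 800, 500, 3000]
forecast = [400, 900, 450, 1500, 200, 2500]
ts, pod, far = fog_scores(observed, forecast)
print(f"TS={ts:.2f} POD={pod:.2f} FAR={far:.2f}")   # TS=0.50 POD=0.67 FAR=0.33
```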
Accurate wind power prediction is of great significance for enabling the dispatching department to adjust power generation plans in a timely manner. Machine learning is currently one of the main methods of wind power prediction. However, selecting reliable and effective single models from the many available machine learning algorithms, and fitting and combining different models, are the key difficulties of multi-model combination prediction. Based on wind power and wind tower wind speed data from the Jiuquan Gandong Wind Power Electric Station of China General Nuclear Power Corporation from January 1 to December 31, 2020, and on the characteristics of several typical machine learning algorithms and the performance of the single models on the test set, combination methods using K-Nearest Neighbor (KNN), Bootstrap Aggregating (BA) and Convolutional Neural Network (CNN) were studied, and a wind power prediction model that integrates KNN, BA and CNN was established. The results show that all single models overestimate some low values, and that the Multi-Layer Perceptron (MLP) and CNN neural networks also underestimate high values. Among the single models, the BA model has the highest prediction accuracy, with a Root Mean Square Error (RMSE) of 13.08 MW on the test set. The combined models improve on the prediction accuracy of the single models to a certain extent. The RMSE of the CNN combined model on the test set is 12.21 MW, about 6.7% lower than that of the BA model, the best single model. The CNN combined model substantially reduces the underestimation of high values and the overestimation of low values seen in the single CNN model, as well as the overestimation of low values seen in the BA model. The prediction model established in this article can be extended to practical wind power prediction.
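The reported improvement follows directly from the two RMSE values: the relative reduction is the difference divided by the RMSE of the best single model. A short check, using only the numbers given above (the variable names are illustrative):

```python
rmse_ba_mw = 13.08            # best single model (BA), test set
rmse_cnn_combined_mw = 12.21  # CNN combined model, test set

# Relative reduction with respect to the best single model, in percent.
reduction_pct = (rmse_ba_mw - rmse_cnn_combined_mw) / rmse_ba_mw * 100
print(f"RMSE reduction: {reduction_pct:.1f}%")   # RMSE reduction: 6.7%
```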
With the advancement of computer technology and big data, convolutional neural networks, a deep learning method, have become the mainstream technology for processing large-scale grid-structured data, especially in the field of computer vision. Convolutional neural networks have also been gradually applied in atmospheric science to process multi-angle and multi-scale meteorological data. This paper reviews the progress of convolutional neural networks and their applications in atmospheric science. The main conclusions are as follows. Through the optimization of network depth and width and through magnitude compression, the accuracy and efficiency of convolutional neural networks have been significantly improved, and they have become the mainstream technology for computer vision tasks. Convolutional neural networks can process meteorological data efficiently and have been applied to meteorological target recognition, extreme event detection, numerical model improvement and the prediction of drought weather events, among other tasks, showing good application prospects. Their application in atmospheric science is still at an exploratory stage and faces challenges such as the complexity of meteorological data, model structures that need improvement, and poor interpretability, so further research is needed to promote its development.
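The core operation that lets a convolutional neural network work on gridded fields is a small kernel sliding over the grid. The sketch below is a plain-Python illustration of that single operation (the cross-correlation form used by most deep learning libraries, without padding or stride); the grid values and the kernel are made up and do not come from any reviewed study.

```python
def conv2d_valid(grid, kernel):
    """Slide `kernel` over `grid` and return the map of weighted sums."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(grid) - kh + 1
    out_w = len(grid[0]) - kw + 1
    return [[sum(grid[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

# A toy 4x4 field with a sharp west-east contrast.
field = [
    [1, 1, 5, 5],
    [1, 1, 5, 5],
    [1, 1, 5, 5],
    [1, 1, 5, 5],
]
# A 2x2 kernel that responds to horizontal differences.
kernel = [
    [-1, 1],
    [-1, 1],
]
for row in conv2d_valid(field, kernel):
    print(row)
# [0, 8, 0]
# [0, 8, 0]
# [0, 8, 0]
```

The output is large only where the field changes from west to east, which is the sense in which a learned kernel acts as a local feature detector on a meteorological grid.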