Improved Weather Radar Echo Extrapolation Through Wind Speed Data Fusion Using a New Spatiotemporal Neural Network Model

Funding:

National Natural Science Foundation of China 42375145

The Open Grants of China Meteorological Administration Radar Meteorology Key Laboratory 2023LRM-A02


doi: 10.3724/j.1006-8775.2023.036

Abstract: Weather radar echo extrapolation plays a crucial role in weather forecasting. However, traditional weather radar echo extrapolation methods are not very accurate and do not make full use of historical data. Deep learning algorithms based on Recurrent Neural Networks also suffer from error accumulation. Moreover, it is difficult to obtain higher accuracy by relying on a single historical radar echo observation. Therefore, in this study, we constructed the Fusion GRU module, which leverages a cascade structure to effectively combine radar echo data and mean wind data. We also designed the Top Connection so that the model can capture the global spatial relationship and construct constraints on the predictions. We compared several models on a Jiangsu Province dataset. The results show that our proposed model, the Cascade Fusion Spatiotemporal Network (CFSN), improved the critical success index (CSI) by 10.7% over the baseline at the 30 dBZ threshold. Ablation experiments further validated the effectiveness of our model: the CSI of the complete CFSN was 0.004 higher than that of the suboptimal variant without the cross-attention module at the 30 dBZ threshold.
  • [1] CRANE R K. Automatic cell detection and tracking[J]. IEEE Transactions on Geoscience Electronics, 1979, 17(4): 250–262, https://doi.org/10.1109/TGE.1979.294654
    [2] QIAO C G, ZHENG S L, YANG L Z, et al. Principle and application of centroid method of radar echo extrapolation[J]. Meteorology Journal of Henan, 2006, 29(3): 29–30, https://doi.org/10.16765/j.cnki.1673-7148.2006.03.019, in Chinese with English abstract
    [3] DIXON M, WIENER G. TITAN: thunderstorm identification, tracking, analysis, and nowcasting—a radar-based methodology[J]. Journal of Atmospheric and Oceanic Technology, 1993, 10(6): 785–797, https://doi.org/10.1175/1520-0426(1993)010<0785:TTITAA>2.0.CO;2
    [4] JOHNSON J T, MACKEEN P L, WITT A, et al. The storm cell identification and tracking algorithm: an enhanced WSR-88D algorithm[J]. Weather and Forecasting, 1998, 13(2): 263–276, https://doi.org/10.1175/1520-0434(1998)013<0263:TSCIAT>2.0.CO;2
    [5] HAN L, FU S, ZHAO L, et al. 3D convective storm identification, tracking, and forecasting-an enhanced TITAN algorithm[J]. Journal of Atmospheric and Oceanic Technology, 2009, 26(4): 719–732, https://doi.org/10.1175/2008JTECHA1084.1
    [6] DELL'ACQUA F, GAMBA P. Rain pattern tracking by means of COTREC and modal matching[J]. Optical Engineering, 2002, 41(2): 278, https://doi.org/10.1117/1.1432668
    [7] LI L, SCHMID W, JOSS J. Nowcasting of motion and growth of precipitation with radar over a complex orography[J]. Journal of Applied Meteorology, 1995, 34(6): 1286–1300, https://doi.org/10.1175/1520-0450(1995)034<1286:NOMAGO>2.0.CO;2
    [8] CHEN L, DAI J H, TAO L. Application of an improved TREC algorithm (COTREC) for precipitation nowcast[J]. Journal of Tropical Meteorology, 2009, 25(1): 117–122, https://doi.org/10.16032/j.issn.1004-4965.2009.01.014, in Chinese with English abstract
    [9] RINEHART R E, GARVEY E T. Three-dimensional storm motion detection by conventional weather radar[J]. Nature, 1978, 273(5660): 287–289, https://doi.org/10.1038/273287a0
    [10] NOVÁK P, BŘEZKOVÁ L, FROLÍK P. Quantitative precipitation forecast using radar echo extrapolation[J]. Atmospheric Research, 2009, 93(1-3): 328–334, https://doi.org/10.1016/j.atmosres.2008.10.014
    [11] HORN B K P, SCHUNCK B G. Determining optical flow[J]. Artificial Intelligence, 1981, 17(1-3): 185–203, https://doi.org/10.1016/0004-3702(81)90024-2
    [12] FARNEBACK G. Very high accuracy velocity estimation using orientation tensors, parametric motion, and simultaneous segmentation of the motion field[C]//Proceedings Eighth IEEE International Conference on Computer Vision ICCV 2001, IEEE. 2001, 1: 171–177, https://doi.org/10.1109/ICCV.2001.937514
    [13] WEINZAEPFEL P, REVAUD J, HARCHAOUI Z, et al. DeepFlow: Large displacement optical flow with deep matching[C]//Proceedings of the IEEE International Conference on Computer Vision, IEEE. 2013: 1385–1392, https://doi.org/10.1109/ICCV.2013.175
    [14] WULFF J, BLACK M J. Efficient sparse-to-dense optical flow estimation using a learned basis and layers[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, IEEE. 2015: 120–130, https://doi.org/10.1109/CVPR.2015.7298607
    [15] FLEET D, WEISS Y. Optical flow estimation[M]//In PARAGIOS N, CHEN Y M, FAUGERAS O (eds). Handbook of Mathematical Models in Computer Vision. Boston, MA: Springer, 2006: 237–257, https://doi.org/10.1007/0-387-28831-7_15
    [16] ELSAYED N, MAIDA A S, BAYOUMI M. Reduced-gate convolutional LSTM architecture for next-frame video prediction using predictive coding[C]//2019 International Joint Conference on Neural Networks, IEEE. 2019: 1–9.
    [17] KALCHBRENNER N, OORD A, SIMONYAN K, et al. Video pixel networks[C]//International Conference on Machine Learning, PMLR. 2017: 1771–1779.
    [18] LOTTER W, KREIMAN G, COX D. Deep predictive coding networks for video prediction and unsupervised learning[EB/OL]. arXiv, 1605.08104, 2016, https://doi.org/10.48550/arXiv.1605.08104
    [19] OLIU M, SELVA J, ESCALERA S. Folded recurrent neural networks for future video prediction[C]//Proceedings of the European Conference on Computer Vision, ECCV. 2018: 716–731.
    [20] SHI X, CHEN Z, WANG H, et al. Convolutional LSTM network: a machine learning approach for precipitation nowcasting[C]//Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 1, NIPS. 2015: 802–810.
    [21] ELMAN J L. Finding structure in time[J]. Cognitive Science, 1990, 14(2): 179–211, https://doi.org/10.1016/0364-0213(90)90002-E
    [22] ZHANG F, WANG X, GUAN J. A novel multi-input multi-output recurrent neural network based on multimodal fusion and spatiotemporal prediction for 0–4 hour precipitation nowcasting[J]. Atmosphere, 2021, 12(12): 1596, https://doi.org/10.3390/atmos12121596
    [23] SMYTHE G R, ZRNIC D S. Correlation analysis of Doppler radar data and retrieval of the horizontal wind[J]. Journal of Climate and Applied Meteorology, 1983, 22(2): 297–311, https://doi.org/10.1175/1520-0450(1983)022<0297:CAODRD>2.0.CO;2
    [24] ZENG Y, ZHOU H, ROARTY H, et al. Wind speed inversion in high frequency radar based on neural network[J]. International Journal of Antennas and Propagation, 2016, 2016: 1–8, https://doi.org/10.1155/2016/2706521
    [25] WANG Y, GAO Z, LONG M, et al. PredRNN++: Towards a resolution of the deep-in-time dilemma in spatiotemporal predictive learning[C]//International Conference on Machine Learning, PMLR. 2018: 5123–5132.
    [26] LIN Z, LI M, ZHENG Z, et al. Self-attention ConvLSTM for spatiotemporal prediction [J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2020, 34(7): 11531–11538, https://doi.org/10.1609/aaai.v34i07.6819
    [27] SCHAEFER J T. The critical success index as an indicator of warning skill[J]. Weather and Forecasting, 1990, 5(4): 570–575, https://doi.org/10.1175/1520-0434(1990)005<0570:TCSIAA>2.0.CO;2
    [28] HOGAN R J, FERRO C A T, JOLLIFFE I T, et al. Equitability revisited: why the "equitable threat score" is not equitable[J]. Weather and Forecasting, 2010, 25(2): 710–726, https://doi.org/10.1175/2009WAF2222350.1
    [29] HOCHREITER S, SCHMIDHUBER J. Long short-term memory[J]. Neural Computation, 1997, 9(8): 1735–1780, https://doi.org/10.1162/neco.1997.9.8.1735
    [30] SHI X, GAO Z, LAUSEN L, et al. Deep learning for precipitation nowcasting: A benchmark and a new model[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS. 2017: 5622–5632.
    [31] WANG Y, LONG M, WANG J, et al. PredRNN: Recurrent neural networks for predictive learning using spatiotemporal LSTMs[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS. 2017: 879–888.
    [32] WANG Y, ZHANG J, ZHU H, et al. Memory in memory: A predictive neural network for learning higher-order non-stationarity from spatiotemporal dynamics[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE. 2019: 9154–9162.
    [33] WANG Y, WU H, ZHANG J, et al. PredRNN: A recurrent neural network for spatiotemporal predictive learning[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(2): 2208–2225, https://doi.org/10.1109/TPAMI.2022.3165153
    [34] WANG Y, JIANG L, YANG M H, et al. Eidetic 3D LSTM: A model for video prediction and beyond[C]//International Conference on Learning Representations, ICLR. 2018.
    [35] TREBING K, STAŃCZYK T, MEHRKANOON S. SmaAt-UNet: Precipitation nowcasting using a small attention-UNet architecture[J]. Pattern Recognition Letters, 2021, 145: 178–186, https://doi.org/10.1016/j.patrec.2021.01.036
    [36] GENG H, GENG L. MCCS-LSTM: Extracting full-image contextual information and multi-scale spatiotemporal feature for radar echo extrapolation[J]. Atmosphere, 2022, 13(2): 192, https://doi.org/10.3390/atmos13020192
    [37] GENG Y, LI Q, LIN T, et al. Lightnet: A dual spatiotemporal encoder network model for lightning prediction[C]//Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, SIGKDD. 2019: 2439–2447.
    [38] ZHANG F, WANG X, GUAN J, et al. RN-Net: A deep learning approach to 0–2 hour rainfall nowcasting based on radar and automatic weather station data[J]. Sensors, 2021, 21(6): 1981, https://doi.org/10.3390/s21061981
    [39] CHUNG J, GULCEHRE C, CHO K H, et al. Empirical evaluation of gated recurrent neural networks on sequence modeling[EB/OL]. arXiv, 1412.3555, 2014, https://doi.org/10.48550/arXiv.1412.3555
    [40] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS. 2017: 6000–6010.
    [41] BABAK O, DEUTSCH C V. Statistical approach to inverse distance interpolation[J]. Stochastic Environmental Research and Risk Assessment, 2009, 23(5): 543–553, https://doi.org/10.1007/s00477-008-0226-6
    [42] MILLER H J. Tobler's first law and spatial analysis[J]. Annals of the Association of American Geographers, 2004, 94(2): 284–289, https://doi.org/10.1111/j.1467-8306.2004.09402005.x
    [43] BARNES L R, SCHULTZ D M, GRUNTFEST E C, et al. CORRIGENDUM: False alarm rate or false alarm ratio?[J]. Weather and Forecasting, 2009, 24(5): 1452–1454

Citation: GENG Huan-tong, XIE Bo-yang, GE Xiao-yan, et al. Improved Weather Radar Echo Extrapolation Through Wind Speed Data Fusion Using a New Spatiotemporal Neural Network Model[J]. Journal of Tropical Meteorology, 2023, 29(4): 482-492, https://doi.org/10.3724/j.1006-8775.2023.036

Manuscript History

Manuscript received: 29 January 2023
Manuscript revised: 15 August 2023
Manuscript accepted: 15 November 2023

  • Weather radar echo data contains a large amount of meteorological information. Predicting severe convective weather such as precipitation, hail, and thunderstorms from radar echo sequences can inform forecasters of upcoming weather phenomena accurately and in good time, helping them issue rapid warnings and protect people's lives and property. Improving the extrapolation of radar echo sequences therefore has important implications for forecasting changes in the intensity of heavy precipitation, the locations where thunderstorms are generated, and the movement paths of typhoons.

    Traditional weather radar echo extrapolation methods include centroid tracking (Crane [1]; Qiao et al. [2]; Dixon and Wiener [3]; Johnson et al. [4]; Han et al. [5]), cross-correlation (Dell'Acqua and Gamba [6]; Li et al. [7]; Chen et al. [8]; Rinehart and Garvey [9]; Novák et al. [10]), and optical flow (Horn and Schunck [11]; Farneback [12]; Weinzaepfel et al. [13]; Wulff and Black [14]; Fleet and Weiss [15]). However, centroid tracking is only suited to large, strong echo cells and performs poorly on fragmented or rapidly changing echoes. Cross-correlation is susceptible to noise interference, making it difficult to calculate accurate motion vectors for rapidly evolving echoes. Optical flow assumes constant pixel intensity, an assumption that actual radar echoes, whose intensity evolves over time, do not satisfy. In addition, none of the traditional algorithms can fully exploit multi-source historical meteorological observations.

    With the rapid development of deep learning in spatiotemporal prediction (Elsayed et al. [16]; Kalchbrenner et al. [17]; Lotter et al. [18]; Oliu et al. [19]), and since Shi et al. first applied a neural network model to the weather radar echo extrapolation task [20], more and more spatiotemporal neural network models have been trained on historical meteorological observations and now play an important role in weather forecasting. However, most current spatiotemporal neural network models adopt the structure of a Recurrent Neural Network (RNN) to produce predictions (Elman [21]), in which the output of the previous moment serves as the input of the next moment. On the one hand, this allows errors generated at the previous moment to carry over to the next moment. On the other hand, at each step the model actually perceives only the current image and obtains only a small amount of full-sequence information indirectly through its interaction with the memories. As a result, later predictions find it difficult to correct earlier errors and to limit error accumulation. To address this, and inspired by the structure of MFSP-Net (Zhang et al. [22]), we construct an Encoder-Decoder structure in which the Fusion GRU consists of cascade memories and an attention mechanism, and we design the Top Connection so that the model has a full receptive field over the input sequence, extracting global spatial features directly from the feature sequences and constructing constraints that serve as the decoder input to reduce the cumulative prediction error.

    In fact, atmospheric motion is complex, with a large number of influencing factors. Training the model only on past weather radar echo data is bound to limit the effectiveness of radar echo extrapolation. Incorporating other relevant meteorological observations into the input increases the flow of information in the model and enhances its ability to represent radar echo variability, which should lead to better prediction of relevant features, especially strong echoes. Therefore, how to incorporate other meteorological data effectively is a key issue in improving the accuracy of radar echo extrapolation. Smythe and Zrnic assumed that changes in the radar reflectivity factor are caused by local wind motion and retrieved winds through correlation analysis of the reflectivity factor [23]. Zeng et al. fitted the complex non-linear relationship between wind speed and echo power [24]. Accordingly, we add wind speed data to the model and construct the Fusion GRU based on concepts from PredRNN++ (Wang et al. [25]) and Self-Attention ConvLSTM (Lin et al. [26]); it uses a cascade structure to increase the depth of computation and strengthen the fitting of relationships between the two datasets, and a cross-attention module to enhance feature interactions and extract long-term spatial dependencies. The model improves the effectiveness of radar echo extrapolation and produces more accurate predictions than a single-input model. In summary, this study makes two contributions:

    1) We construct the Fusion GRU and Top Connection to form the Cascade Fusion Spatiotemporal Network (CFSN). CFSN enhances the performance of the prediction by effectively combining radar echo data and wind speed data and extracting the global spatial relationships of the input sequence. This improves the accuracy of radar echo extrapolation.

    2) We process the multi-source data separately, using convolutional layers to accommodate data with different spatial resolutions. We also adopt a late-fusion strategy, placing the Fusion GRU in the fourth layer of the model. Finally, we conducted experiments on a dataset from Jiangsu. The CFSN improves on the baseline by 10.5% in critical success index (CSI) (Schaefer [27]) and by 7.6% in Heidke skill score (HSS) (Hogan et al. [28]) at the 30 dBZ threshold.

  • The ConvLSTM was first proposed by Shi et al. [20]; it replaces the fully connected operations of the long short-term memory (LSTM) network (Hochreiter and Schmidhuber [29]) with convolutions, remedying the LSTM's lack of spatial information encoding. It extends the LSTM's ability to process sequence information and has stimulated the exploration of LSTM-based models for predicting image sequences. Shi et al. later proposed TrajGRU to overcome the location-invariant recurrent structure of the ConvLSTM so that it can adapt to the positional diversity of natural objects in motion [30]. PredRNN, PredRNN++, MIM, and PredRNNv2 were successively proposed by Wang et al. [25, 31-33]. These models enrich the information flow at the current moment by strengthening the links between spatiotemporal information before and after that moment, thereby improving prediction results. More recently, influenced by the development of attention mechanisms, spatiotemporal prediction models have started to incorporate attention modules to improve model capability. Wang et al. used 3D convolutions to store short-term dependent features and completed interactions with historical records through the self-attentive recall gate, thus capturing long-term relationships [34]. Lin et al. added an attention mechanism to the ConvLSTM unit to construct the SAM module, forming the Self-Attention ConvLSTM [26]; SAM successfully captures the long-term spatial dependencies of the data. Trebing et al. proposed the SmaAt-UNet and successfully applied it to spatiotemporal prediction by embedding CBAM modules [35]. Geng and Geng proposed the MCCS-LSTM, which integrates a cross-attention mechanism into the ConvLSTM to improve the model's use of context [36].

    In addition, multi-data fusion models for spatiotemporal prediction are being developed to make full use of the relationships between various types of data and improve predictive skill. For single-input models, related data can be stacked along the channel dimension as model input, allowing the model to learn from multiple information sources. The various types of data then form a single input and are processed in the same way, which requires consistent spatial and temporal resolution, limits applicability, and makes data processing cumbersome and error-prone. More importantly, the lack of differentiation between the data and the early fusion may cause the data to influence each other strongly during training, so that the final output leans towards the secondary input rather than the primary input that is the prediction target. Zhang et al. proposed MFSP-Net, which strengthens fusion by processing the different data in a differentiated way within the ConvLSTM unit and mitigates the output-tendency problem through the design of the loss function [22]. LightNet, proposed by Geng et al. [37], used a dual encoder to train on WRF simulation data and lightning observation data with separate ConvLSTMs, combined their hidden states by convolution, and fed the result to a decoder to output prediction results that assist lightning prediction. Zhang et al. also used a dual-encoder structure to process radar echo sequences and precipitation sequences separately; with TrajGRU as the model unit and layer-by-layer convolutional downsampling, they achieved large-area precipitation nowcasting at high spatial and temporal resolution [38].

  • We take weather radar echoes collected at regular intervals as the input sequence, and the weather radar echoes over a future period, at the same temporal resolution, as the output sequence to be predicted. The overall weather radar echo extrapolation task is thus treated as a spatiotemporal sequence prediction task, formulated by Shi et al. [20] as follows:

    $$ \widetilde{X}_{t+1}, \ldots, \widetilde{X}_{t+K}=\underset{X_{t+1}, \ldots, X_{t+K}}{\operatorname{argmax}} p\left(X_{t+1}, \ldots, X_{t+K} \mid \widehat{X}_{t-J+1}, \widehat{X}_{t-J+2}, \ldots, \widehat{X}_t\right) $$

    Each frame X contains the measurements of all grid points on a grid area of size M rows and N columns. The spatiotemporal sequence prediction task is to predict the most likely K future frames based on the first J frames obtained from the observations. Based on the above construction of the problem, we describe and compare the CFSN in detail in this section.

  • As shown in Fig. 1, most spatiotemporal neural network models are based on the RNN structure, and the usual multi-input method is to fuse the inputs on the channel dimension at the early stage of the model (Zhang et al. [22]), i.e., multiple input channels and a single output channel. Pre-processed radar echo data and wind speed data with the same spatial and temporal resolution are stacked along the channel dimension and input together (a minimal code sketch of this channel stacking is given after Fig. 1). The model extracts the spatiotemporal information of the first n frames through an encoder with l layers and stores it in the memory units for transmission to the decoder. Finally, the decoder outputs the radar echo maps predicted for the last m frames.

    Figure 1.  X and Y denote two types of input data, and P denotes the output prediction. The decoder takes all-zero input. The RNN layer refers to the recurrent unit of various spatiotemporal neural network models. The arrows indicate the main information flow between the units.
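
The channel-level early fusion described above can be illustrated with a minimal PyTorch-style sketch. The tensor shapes are illustrative assumptions (10 past frames on a 120 × 140 grid), not the authors' code.

```python
import torch

# Hypothetical shapes: 10 past frames of radar echo and of mean wind on a 120 x 140 grid.
radar = torch.randn(8, 10, 1, 120, 140)   # (batch, time, channel, height, width)
wind = torch.randn(8, 10, 1, 120, 140)

# Early fusion for single-input models: stack the two fields on the channel axis,
# so every recurrent layer processes them in exactly the same way.
fused = torch.cat([radar, wind], dim=2)    # (8, 10, 2, 120, 140)
```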

  • The front part of the Fusion GRU module adds a memory N to the Gated Recurrent Unit (GRU) (Chung et al. [39]) to store the spatiotemporal information of the two types of data separately, while a cascade computes iteratively around the memory H, as shown on the left of Fig. 2. We introduce the current secondary input, the wind speed data Yt, and combine it with the memories Ht–1 and Nt–1 from the previous moment to update the spatiotemporal information of the wind speed data stored in N. The radar echo data then serves as the current primary input Xt and, combined with the updated Nt, updates the memory H. Meanwhile, skip connections of the inputs Xt and Yt increase the depth available for constructing non-linear relationships, facilitating the fitting of the evolutionary relationships of the data. This enables the feature information of Xt and Yt to be described and memorized. The latter part of the Fusion GRU further captures the long-term spatial dependencies of the updated memories H and N, as shown on the right of Fig. 2. Through a cross-attention mechanism, the interactions between the data are enhanced, and the spatial distribution characteristics of the current data features are extracted and stored in the memory M.

    Figure 2.  The Fusion GRU consists of two parts. The left part accepts the input data Xt and Yt, and completes the computation of the cascade structure, while the right part uses the cross-attention module to further extract and store features.

    The Fusion GRU generates the gates rt′, zt′, and gt′ to control the flow of information in the memory N by performing convolution operations on Yt, Ht–1, and Nt–1. gt′ uses the reset gate rt′ to select the information stored in the memory N at the previous moment and combines it with the input data Yt to generate the information for the current moment. N then uses the update gate zt′ to weigh the information stored at the previous moment against the new information from gt′, completing the update. Subsequently, N influences the generation of the gate zt, and Xt, Yt, and Nt reappear in the gate gt through the skip connection, completing the update of the memory H. We compute cross attention between the updated Ht′ and the memory units Mt–1 and Nt, respectively, and combine the results by convolution. Finally, using the information generated by the cross-attention mechanism, the memory M is updated to capture the long-term spatial dependencies of the data, and the new Ht is generated. The update equations of the Fusion GRU are given below (a PyTorch-style sketch of these updates follows the equations), where σ is the sigmoid activation function, Attention is the attention operation of Vaswani et al. [40], and * and ⊙ denote the convolution operator and element-wise multiplication, respectively.

    $$ \begin{aligned} & r'_t=\sigma\left(W_1 *\left[Y_t, H_{t-1}, N_{t-1}\right]\right) \\ & z'_t=\sigma\left(W_2 *\left[Y_t, H_{t-1}, N_{t-1}\right]\right) \\ & g'_t=\tanh \left(r'_t \odot\left(W_3 * N_{t-1}\right)+W_4 * Y_t\right) \\ & N_t=\left(1-z'_t\right) \odot N_{t-1}+z'_t \odot g'_t \\ & r_t=\sigma\left(W_5 *\left[X_t, H_{t-1}\right]\right) \\ & z_t=\sigma\left(W_6 *\left[X_t, H_{t-1}, N_t\right]\right) \\ & g_t=\tanh \left(r_t \odot\left(W_7 * H_{t-1}\right)+X_t+Y_t+N_t\right) \\ & H'_t=\left(1-z_t\right) \odot H_{t-1}+z_t \odot g_t \\ & Z_n=\operatorname{Attention}\left(W_8 * H'_t, W_9 * N_t, W_{10} * N_t\right) \\ & Z_m=\operatorname{Attention}\left(W_{11} * H'_t, W_{12} * M_{t-1}, W_{13} * M_{t-1}\right) \\ & Z=W_{14} *\left[Z_n, Z_m\right] \\ & z''_t=\sigma\left(W_{15} *\left[Z, H'_t, M_{t-1}\right]\right) \\ & g''_t=\tanh \left(W_{16} *\left[Z, H'_t, M_{t-1}\right]\right) \\ & o_t=\sigma\left(W_{17} *\left[Z, H'_t, N_t, M_{t-1}\right]\right) \\ & M_t=\left(1-z''_t\right) \odot M_{t-1}+z''_t \odot g''_t \\ & H_t=o_t \odot \tanh \left(M_t\right) \end{aligned} $$
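
The equations above can be turned into a compact PyTorch sketch. This is a minimal illustration of the update rules, not the authors' implementation: layer names, the choice of 3 × 3 convolutions, and the single-head scaled dot-product attention are assumptions, and the primary input Xt and secondary input Yt are assumed to have already been projected to the same number of channels as the memories.

```python
import torch
import torch.nn as nn


class FusionGRUCell(nn.Module):
    """Sketch of the Fusion GRU update equations.

    x: primary input (radar features), y: secondary input (wind features),
    h / n / m: the three memories described in the text; all tensors are (B, C, H, W).
    """

    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        pad = kernel_size // 2
        conv = lambda cin: nn.Conv2d(cin, channels, kernel_size, padding=pad)
        # Wind-branch gates r', z', g' acting on memory N (weights W1..W4).
        self.w1, self.w2, self.w3, self.w4 = conv(3 * channels), conv(3 * channels), conv(channels), conv(channels)
        # Radar-branch gates r, z, g acting on memory H (weights W5..W7).
        self.w5, self.w6, self.w7 = conv(2 * channels), conv(3 * channels), conv(channels)
        # Cross-attention projections (W8..W13) and fusion of the two attention outputs (W14).
        self.w8, self.w9, self.w10 = conv(channels), conv(channels), conv(channels)
        self.w11, self.w12, self.w13 = conv(channels), conv(channels), conv(channels)
        self.w14 = conv(2 * channels)
        # Memory-M gates z'', g'' and the output gate o (weights W15..W17).
        self.w15, self.w16, self.w17 = conv(3 * channels), conv(3 * channels), conv(4 * channels)

    @staticmethod
    def _attention(q, k, v):
        # Scaled dot-product attention over spatial positions.
        b, c, hh, ww = q.shape
        q = q.flatten(2).transpose(1, 2)                 # (B, HW, C)
        k = k.flatten(2)                                 # (B, C, HW)
        v = v.flatten(2).transpose(1, 2)                 # (B, HW, C)
        attn = torch.softmax(q @ k / c ** 0.5, dim=-1)   # (B, HW, HW)
        return (attn @ v).transpose(1, 2).reshape(b, c, hh, ww)

    def forward(self, x, y, h, n, m):
        # Update memory N from the secondary (wind) input.
        r2 = torch.sigmoid(self.w1(torch.cat([y, h, n], dim=1)))
        z2 = torch.sigmoid(self.w2(torch.cat([y, h, n], dim=1)))
        g2 = torch.tanh(r2 * self.w3(n) + self.w4(y))
        n = (1 - z2) * n + z2 * g2
        # Update memory H from the primary (radar) input, with skip connections of x, y, n.
        r1 = torch.sigmoid(self.w5(torch.cat([x, h], dim=1)))
        z1 = torch.sigmoid(self.w6(torch.cat([x, h, n], dim=1)))
        g1 = torch.tanh(r1 * self.w7(h) + x + y + n)
        h_mid = (1 - z1) * h + z1 * g1
        # Cross attention between H' and the memories N and M, then convolutional fusion.
        z_n = self._attention(self.w8(h_mid), self.w9(n), self.w10(n))
        z_m = self._attention(self.w11(h_mid), self.w12(m), self.w13(m))
        z = self.w14(torch.cat([z_n, z_m], dim=1))
        # Update memory M and produce the new hidden state H_t.
        z3 = torch.sigmoid(self.w15(torch.cat([z, h_mid, m], dim=1)))
        g3 = torch.tanh(self.w16(torch.cat([z, h_mid, m], dim=1)))
        o = torch.sigmoid(self.w17(torch.cat([z, h_mid, n, m], dim=1)))
        m = (1 - z3) * m + z3 * g3
        return o * torch.tanh(m), n, m


# Minimal usage: one step on 64-channel feature maps of size 30 x 35.
cell = FusionGRUCell(channels=64)
x = y = h = n = m = torch.zeros(1, 64, 30, 35)
h, n, m = cell(x, y, h, n, m)
```
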
  • CFSN uses an encoder-decoder structure, as shown in Fig. 3. The encoder can be divided into three parts: the Secondary Block that extracts feature maps of wind speed data using CBAM to aid the prediction, the backbone network consisting of three layers of ST-LSTM (Wang et al. [31]) for radar echo extrapolation, and the fourth-layer module Fusion GRU, which fully combines the information from both. The decoder is the prediction network corresponding to the encoder. In addition, Top Connection extracts the global features from the encoder output on the Fusion GRU and generates the input data for the decoder accordingly.

    Figure 3.  (a) The architecture of the CFSN. (b) The flow chart of the Secondary Block. Convolution is adopted for downsampling and transposed convolution for upsampling. The blue line segments indicate radar information, the green wind speed information, and the purple a mixture of radar and wind speed information.

    We put the main input, the radar echo history data, into the encoder backbone network to construct the spatiotemporal information flow and output feature information for each moment. The secondary input, the wind speed maps, is resized by pooling operations, and CBAM fully extracts its spatial information and outputs feature maps. These features are input into the Fusion GRU separately to calculate the spatiotemporal relationships and capture long-term spatial dependencies. The radar echo state feature maps output by the Fusion GRU are stacked on the channel as a long sequence for further feature extraction. Top Connection, as shown in Fig. 4, uses this stacked sequence to obtain a global receptive field through the CBAM module and to strengthen the connection between the encoder and decoder (a code sketch follows Fig. 4). It generates an initialization distribution, giving the decoder input a priori knowledge rather than learning from an all-zero input. This ensures that the model bases its predictions on the global spatial characteristics of the input data, forming constraints on the spatial distribution of the data and improving the effect of radar echo extrapolation.

    Figure 4.  Top Connection, which relies on the CBAM module, uses Group Normalization and Max Pooling for downsampling and Transposed Convolution for upsampling.
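
A minimal sketch of the Top Connection idea is given below. It is not the authors' implementation: the exact CBAM configuration, kernel sizes, group count, and channel widths are assumptions; only the overall pattern (encoder states stacked on the channel axis, Group Normalization and Max Pooling for downsampling, a CBAM-style attention block, and a Transposed Convolution for upsampling) follows the description above.

```python
import torch
import torch.nn as nn


class ChannelSpatialAttention(nn.Module):
    """Compact stand-in for CBAM: channel attention followed by spatial attention."""

    def __init__(self, channels, reduction=8):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
        )
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        ca = torch.sigmoid(self.mlp(x.mean((2, 3), keepdim=True)) +
                           self.mlp(x.amax((2, 3), keepdim=True)))
        x = x * ca
        sa = torch.sigmoid(self.spatial(torch.cat(
            [x.mean(1, keepdim=True), x.amax(1, keepdim=True)], dim=1)))
        return x * sa


class TopConnection(nn.Module):
    """Encoder hidden states stacked on the channel axis are squeezed into a global
    representation that initialises the decoder input (in_channels must be divisible
    by `groups`, and the spatial size should be even so the down/up path matches)."""

    def __init__(self, in_channels, out_channels, groups=8):
        super().__init__()
        self.down = nn.Sequential(
            nn.GroupNorm(groups, in_channels),
            nn.MaxPool2d(2),
            nn.Conv2d(in_channels, out_channels, 3, padding=1),
        )
        self.attn = ChannelSpatialAttention(out_channels)
        self.up = nn.ConvTranspose2d(out_channels, out_channels, 2, stride=2)

    def forward(self, encoder_states):
        # encoder_states: list of (B, C, H, W) hidden states, one per input frame.
        x = torch.cat(encoder_states, dim=1)   # stack the whole sequence on the channel axis
        return self.up(self.attn(self.down(x)))


# Minimal usage: 10 encoder states with 64 channels each on a 32 x 32 grid.
states = [torch.randn(1, 64, 32, 32) for _ in range(10)]
top = TopConnection(in_channels=64 * 10, out_channels=64)
decoder_init = top(states)   # (1, 64, 32, 32)
```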

  • In this section, we measure the performance of the model to demonstrate its efficiency and reliability. We use the data from the Aliyun Tianchi 2022 Jiangsu Weather AI Algorithm Challenge as the dataset to complete the comparison experiments and ablation study of the model on the radar echo extrapolation task.

  • The dataset from the competition covers weather radar and automatic station observations in Jiangsu Province from April to September of 2019–2021, containing radar echo sequences, precipitation, and mean wind elements. All the data in the study have a horizontal resolution of 0.01°, a temporal resolution of 6 minutes, and a grid size of 480 × 560 pixels. The radar echo sequences contain the radar reflectivity at a height of 3 km, obtained after quality control and mosaicking of multiple S-band weather radars in Jiangsu, and cover the entire province. The data values range from 0 to 70 dBZ, indicating the intensity of the radar echoes. The mean wind dataset is generated by using Inverse Distance Weighting (IDW) interpolation (Babak and Deutsch [41]) to interpolate mean wind data collected from automatic meteorological stations in Jiangsu and its surrounding regions onto a uniform grid; its values range from 0 to 35 m s⁻¹. The IDW method is based on Tobler's First Law (Miller [42]): the value at a point to be interpolated is determined by weights proportional to the reciprocal of its distance to each sample point, so the farther a sample point is from the point to be interpolated, the less influence it has on the interpolated value (a sketch of this weighting follows). Although this introduces a certain degree of error and does not fully represent the actual observations, it makes full use of discrete observations to reflect the spatial distribution of meteorological elements objectively.
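
The IDW principle described above can be sketched in a few lines of NumPy. The gridded wind field in the dataset was produced by the data providers; this is only an illustration, and the distance exponent `power` (1 for plain reciprocal-distance weighting, 2 being another common choice) is a free parameter.

```python
import numpy as np


def idw_interpolate(station_xy, station_values, grid_xy, power=1.0, eps=1e-6):
    """Inverse Distance Weighting: each grid point is a weighted mean of the station
    observations, with weights proportional to 1 / distance**power."""
    # station_xy: (S, 2), station_values: (S,), grid_xy: (G, 2)
    d = np.linalg.norm(grid_xy[:, None, :] - station_xy[None, :, :], axis=-1)  # (G, S)
    w = 1.0 / (d + eps) ** power
    return (w * station_values[None, :]).sum(axis=1) / w.sum(axis=1)


# Minimal usage: three stations, one grid point.
stations = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
winds = np.array([2.0, 5.0, 3.0])
grid = np.array([[0.5, 0.5]])
print(idw_interpolate(stations, winds, grid))
```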

    We take the radar data and the mean wind data and reduce the original resolution of 480 × 560 pixels to 120 × 140 pixels by bilinear interpolation to facilitate the experiments. We use 28,158 sequences for the training set and 2987 sequences for the test and validation sets. We read the image sequences in chronological order using a 20-frame sliding window: the model takes the first 10 frames of radar echo and wind speed, and the radar echo sequence for the following 10 frames is predicted. In other words, we predict the radar echo sequence for the next 60 minutes based on the last 60 minutes of observations (a preprocessing sketch follows).
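
A sketch of this preprocessing, under the assumption that the raw fields are available as (T, H, W) arrays; function and variable names are illustrative.

```python
import torch
import torch.nn.functional as F


def make_samples(radar_seq, wind_seq, in_frames=10, out_frames=10, stride=1, size=(120, 140)):
    """Bilinearly downsample to 120 x 140 and cut 20-frame sliding-window samples.
    radar_seq, wind_seq: (T, H, W) arrays at the original 480 x 560 resolution."""
    radar = F.interpolate(torch.as_tensor(radar_seq).float().unsqueeze(1),
                          size=size, mode="bilinear", align_corners=False)
    wind = F.interpolate(torch.as_tensor(wind_seq).float().unsqueeze(1),
                         size=size, mode="bilinear", align_corners=False)
    window = in_frames + out_frames
    samples = []
    for t in range(0, radar.shape[0] - window + 1, stride):
        x_radar = radar[t:t + in_frames]             # past 60 min of radar echoes
        x_wind = wind[t:t + in_frames]               # past 60 min of mean wind
        y_radar = radar[t + in_frames:t + window]    # next 60 min to be predicted
        samples.append((x_radar, x_wind, y_radar))
    return samples
```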

  • We set the batch size to 8, the patch size to 4, and the learning rate to 10⁻³ during training. All models are trained for 50 epochs using Adam as the optimizer and MSE as the loss function. To prevent overfitting during training, we read training sequences with a sliding stride of 2 and revert to a stride of 1 during testing. Since the field of meteorology focuses on CSI, we select the model with the highest CSI score at the 30 dBZ threshold as the best-trained model for the test comparison.
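
The patch size of 4 presumably refers to the space-to-depth reshaping commonly used when training PredRNN-style models, in which each non-overlapping 4 × 4 block is folded into the channel dimension; this is an assumption, sketched below.

```python
import torch


def reshape_to_patches(x, patch=4):
    """Space-to-depth: fold each non-overlapping patch x patch block into channels,
    e.g. (B, T, 1, 120, 140) -> (B, T, 16, 30, 35)."""
    b, t, c, h, w = x.shape
    x = x.reshape(b, t, c, h // patch, patch, w // patch, patch)
    x = x.permute(0, 1, 2, 4, 6, 3, 5)
    return x.reshape(b, t, c * patch * patch, h // patch, w // patch)


frames = torch.randn(8, 10, 1, 120, 140)
print(reshape_to_patches(frames).shape)   # torch.Size([8, 10, 16, 30, 35])
```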

    We use four meteorological forecast scores as indicators of forecast skill: probability of detection (POD), false alarm ratio (FAR) (Barnes et al. [43]), CSI, and HSS. CSI reflects the proportion of events that are both predicted and observed, penalizing false alarms and misses, while HSS excludes the hits that a random forecast would achieve. For both the predicted and the true fields, a grid point is recorded as 1 if its radar echo value exceeds the specified threshold and as 0 otherwise. The corresponding counts are then accumulated into a confusion matrix, where TP denotes true positives (prediction = 1, truth = 1), TN true negatives (prediction = 0, truth = 0), FP false positives (prediction = 1, truth = 0), and FN false negatives (prediction = 0, truth = 1). We took thresholds of 20 dBZ and 30 dBZ in our experiments and calculated POD, FAR, CSI, and HSS as defined below (a code sketch of these scores follows the equations).

    $$ \begin{aligned} & \mathrm{CSI}=\frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FN}+\mathrm{FP}} \\ & \mathrm{POD}=\frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FN}} \\ & \mathrm{FAR}=\frac{\mathrm{FP}}{\mathrm{FP}+\mathrm{TP}} \\ & \mathrm{HSS}=\frac{2 \times(\mathrm{TP} \times \mathrm{TN}-\mathrm{FN} \times \mathrm{FP})}{(\mathrm{TP}+\mathrm{FN}) \times(\mathrm{FN}+\mathrm{TN})+(\mathrm{TP}+\mathrm{FP}) \times(\mathrm{FP}+\mathrm{TN})} \end{aligned} $$
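
The four scores can be computed directly from the thresholded fields; a minimal sketch (values at or above the threshold are counted as events here, and the divisions assume a non-degenerate confusion matrix):

```python
import numpy as np


def skill_scores(pred, truth, threshold=30.0):
    """CSI, POD, FAR and HSS from the binary confusion matrix at a dBZ threshold."""
    p = pred >= threshold
    t = truth >= threshold
    tp = np.sum(p & t)
    tn = np.sum(~p & ~t)
    fp = np.sum(p & ~t)
    fn = np.sum(~p & t)
    csi = tp / (tp + fn + fp)
    pod = tp / (tp + fn)
    far = fp / (fp + tp)
    hss = 2 * (tp * tn - fn * fp) / ((tp + fn) * (fn + tn) + (tp + fp) * (fp + tn))
    return csi, pod, far, hss


# Minimal usage on random reflectivity fields.
pred = np.random.uniform(0, 70, size=(10, 120, 140))
truth = np.random.uniform(0, 70, size=(10, 120, 140))
print(skill_scores(pred, truth, threshold=30.0))
```
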
  • Table 1 shows the four scores of each model on the competition dataset at the different thresholds. In addition to the CFSN proposed in this study, the single-input models ConvLSTM, PredRNN, PredRNN++, MIM, and SimVP, which fuse the two types of input data on the channel, are tested in the experiments. All models use the same training, validation, and test sets, and the same randomization mechanism to initialize the data order and model weights. Overall, CFSN performs best.

    | Model | CSI↑ (20 dBZ) | CSI↑ (30 dBZ) | POD↑ (20 dBZ) | POD↑ (30 dBZ) | FAR↓ (20 dBZ) | FAR↓ (30 dBZ) | HSS↑ (20 dBZ) | HSS↑ (30 dBZ) |
    |---|---|---|---|---|---|---|---|---|
    | ConvLSTM | 0.538 | 0.321 | **0.697** | **0.516** | 0.309 | 0.440 | 0.634 | 0.429 |
    | PredRNN | 0.548 | 0.323 | 0.670 | 0.431 | 0.265 | 0.348 | 0.644 | 0.430 |
    | PredRNN++ | 0.538 | 0.314 | 0.658 | 0.433 | 0.279 | 0.377 | 0.636 | 0.419 |
    | MIM | 0.541 | 0.321 | 0.639 | 0.410 | 0.250 | 0.342 | 0.639 | 0.426 |
    | SimVP | 0.481 | 0.258 | 0.596 | 0.353 | 0.282 | 0.364 | 0.575 | 0.350 |
    | CFSN | **0.565** | **0.357** | 0.669 | 0.467 | **0.237** | **0.326** | **0.661** | **0.463** |

    Table 1.  Comparison results of the four indices for each model on the dataset at thresholds of 20 and 30 dBZ. PredRNN is used as the baseline, and the best score in each column is shown in bold. ↑ indicates that the higher the score, the better; ↓ indicates that the lower the score, the better.

    Whether the threshold is 20 or 30 dBZ, CFSN achieves the best CSI, FAR, and HSS. At the 30 dBZ threshold, the CSI and HSS of CFSN are 10.5% and 7.6% higher, respectively, than those of the baseline PredRNN; at the 20 dBZ threshold, they are 3.1% and 2.6% higher, respectively. Although ConvLSTM has the highest POD, its FAR is also the highest. CFSN, which has the lowest FAR, is 5.2% and 4.6% lower than the next-best MIM at the 20 and 30 dBZ thresholds, respectively. In addition, the POD of CFSN at the 30 dBZ threshold is the second best. This suggests that CFSN adequately incorporates the data and forms effective constraints on the distribution, with a particular focus on strong echoes. To better illustrate the experimental results, we visualize some of the experimental data in Fig. 5.

    Figure 5.  Each model predicts the last 10 frames in the radar echo sequence on the dataset. These are images from frames 1, 4, 7 and 10.

    From the above, we can see that the model in this study retains strong echoes in the corresponding area of the weather radar image at the late stage of the forecast. In contrast, the echoes predicted by PredRNN and MIM decay faster and contain fewer strong echoes, while ConvLSTM and SimVP predict too many strong echoes. Fig. 6, which shows the scores of each model at each lead time, confirms this discussion. On the one hand, CFSN maintains a low FAR at every predicted frame regardless of the threshold. On the other hand, at both the 20 and 30 dBZ thresholds, CFSN outperforms the other models, with both CSI and HSS higher than those of the comparison models.

    Figure 6.  Four scores for each predictive frame for models at thresholds of 20 and 30 dBZ.

  • To further investigate the effectiveness of the Fusion GRU and the Top Connection in the CFSN, we conduct ablation experiments on the Jiangsu dataset. We remove the cross-attention module, the Fusion GRU, and the Top Connection from CFSN in turn and compare each variant with the full CFSN model. The experimental results are shown in Table 2.

    | Model | CSI↑ (20 dBZ) | CSI↑ (30 dBZ) | POD↑ (20 dBZ) | POD↑ (30 dBZ) | FAR↓ (20 dBZ) | FAR↓ (30 dBZ) | HSS↑ (20 dBZ) | HSS↑ (30 dBZ) |
    |---|---|---|---|---|---|---|---|---|
    | CFSN | 0.565 | **0.357** | 0.669 | 0.467 | 0.237 | 0.326 | 0.661 | **0.463** |
    | CFSN no Fusion GRU | 0.553 | 0.342 | 0.663 | 0.444 | 0.260 | 0.349 | 0.650 | 0.446 |
    | CFSN no Top Connection | 0.560 | 0.351 | **0.679** | **0.473** | 0.264 | 0.363 | 0.656 | 0.457 |
    | CFSN no cross attention | **0.567** | 0.353 | 0.673 | 0.462 | **0.236** | **0.318** | **0.663** | 0.457 |

    Table 2.  Comparative results of the four indices for the ablation experiments at thresholds of 20 and 30 dBZ. The best score in each column is shown in bold. ↑ indicates that the higher the score, the better; ↓ indicates that the lower the score, the better.

    Table 2 shows that removing either the Fusion GRU or the Top Connection prevents the model from reaching the CSI and HSS of the full CFSN. Both the Top Connection and the Fusion GRU therefore benefit the scores of multi-data fusion radar extrapolation. Without the Fusion GRU, the model is unable to fully fuse the data. Without the Top Connection, the model loses the constraints on the predicted data distribution, which results in a high POD but also a high FAR. Only when the two are combined can the best extrapolation results be achieved. In addition, we removed the cross-attention module from the Fusion GRU. Although this yields the lowest FAR, the model's focus on strong echoes decreases; as a result, the CSI and HSS at the 30 dBZ threshold decrease by 0.004 and 0.006, respectively. Because strong echo development is the main concern, we retain the cross-attention module for its higher CSI. Per-frame scores are shown in Fig. 7, which again confirms the discussion above: the complete CFSN generally outperforms the variants with a module removed, and the cross-attention mechanism makes the model focus on the stronger parts of the echo.

    Figure 7.  Four scores for each predictive frame for ablation study at thresholds of 20 and 30 dBZ.

  • In this study, we aim to enhance the accuracy of weather radar echo extrapolation by incorporating wind speed data, and we propose the Cascade Fusion Spatiotemporal Network, which we examine and validate comprehensively. The cascade structure of the Fusion GRU effectively explores and captures the relationship between wind speed and radar echo. The cross-attention mechanism enhances data interactions, with a particular emphasis on handling strong echoes. Additionally, the Top Connection extracts spatial characteristics from the encoder output sequence and generates initialization data with specific constraints, which then replaces the all-zero input of the decoder.

    Together, these components yield an overall improvement in weather radar echo extrapolation, achieving high scores on the Jiangsu dataset, with a CSI 10.7% higher than the baseline. In addition, we find that complex models without the provided constraints are prone to a higher FAR, and all models show reduced performance in the later period of prediction. How to further improve the prediction accuracy of the model and safeguard the extrapolation effect in the later period of prediction by adjusting the added constraints remains a focus of future research.

    In general, we take advantage of multi-source input and achieve relatively good prediction results. In the future, to obtain better extrapolation results, we will try the Transformer framework (Vaswani et al. [40]), taking advantage of the attention mechanism to better capture long-term dependencies and global information. Including more meteorological elements as prediction aids for radar echo extrapolation will contribute to a more comprehensive representation of weather conditions and effectively supplement the constraint information in the extrapolation process, achieving more accurate weather radar echo sequence extrapolation.
