Authors
Catalina Jerez (CIRES, Dept. of Civil, Env. and Arch. Eng.|Univ. of Colorado Boulder), Rajagopalan Balaji (CIRES, Dept. of Civil, Env. and Arch. Eng.|Univ. of Colorado Boulder), Emerson LaJoie (NOAA|NWS|NCEP|CPC), Matthew Rosencrans (NOAA|NWS|NCEP|CPC), Seth Shanahan (Colorado River Program Manager, Southern Nevada Water Authority), Sarah Baker (USBR), Paul Miller (NOAA|CBRFC), Edith Zagona (CADSWES|Dept. of Civil, Env. and Arch. Eng.|Univ. of Colorado Boulder)
Abstract
Increased climate variability and increased water use make effective water resource management in the Colorado River Basin (CRB) essential. Streamflow forecasts play a pivotal role in operations: the Colorado River Midterm Modeling System relies heavily on accurate predictions to inform water allocation decisions and reservoir operations. However, forecasting seasonal runoff remains a significant challenge due to the complex interplay of hydroclimatic variables across different timescales. To address this problem, we develop a multi-model ensemble approach for seasonal streamflow forecasting using two machine learning techniques, Random Forest (RF) and Gradient Boosting Machine (GBM), for runoff season flows (April 1 to July 31) at the Lees Ferry, Arizona, gauge in the CRB. The ensemble is trained over n iterations to improve robustness and capture model uncertainties. RF is implemented using three R packages (randomForest, ranger and randomForestSRC), while GBM uses two packages (gbm and xgboost). The model predictors include antecedent basin conditions (PRISM), Snow Water Equivalent (SWE), mean Ensemble Streamflow Prediction (ESP), North American Multi-Model Ensemble (NMME), and large-scale oceanic-atmospheric teleconnections (Sea Surface Temperature (SST) patterns, Pacific Decadal Oscillation (PDO), and Atlantic Multidecadal Oscillation (AMO)). The models are trained on 1983-2019 and validated on 2020-2024, separately for each lead-time from 0 to 24 months. Results indicate that at short lead-times (0 to 3 months), ESP, SWE and NMME provide the strongest predictability. At medium lead-times (4 to 9 months), NMME becomes significant, along with SST anomalies, which contribute most to predictability at longer lead-times (> 9 months). RF and GBM consistently outperform the climatological benchmark and ESP, particularly for lead-times beyond 6 months.
Relative to ESP, RF reduces the Root Mean Square Error (RMSE) of the ensemble median forecast by 10% ± 2.5% (~0.4 to 0.65 MAF). Although GBM performs very well at short lead-times, RF outperforms it at longer lead-times. The (Continuous) Ranked Probability Skill Scores confirm the increased reliability of these models, with GBM holding a slight edge at shorter lead-times. The inclusion of NMME data improves predictive performance largely at mid lead-times, where both RF and GBM integrate the climate signal effectively.
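The verification metrics named above can be illustrated with a minimal sketch. It is written in plain Python rather than the R packages used in the study, and it assumes the standard finite-ensemble estimator of the Continuous Ranked Probability Score (CRPS), which the abstract does not specify. All flow values (in MAF) and the names `model_ens`, `clim_ens`, and `obs` are hypothetical and serve only to show how the RMSE of the ensemble median and a skill score against a benchmark could be computed.

```python
import math
from statistics import median


def rmse(forecasts, observations):
    """Root mean square error between paired forecasts and observations."""
    n = len(observations)
    return math.sqrt(sum((f - o) ** 2 for f, o in zip(forecasts, observations)) / n)


def crps_ensemble(members, obs):
    """CRPS of one ensemble forecast, using the common finite-ensemble
    estimator: mean|x_i - y| - 0.5 * mean|x_i - x_j| (an assumption here)."""
    m = len(members)
    term_obs = sum(abs(x - obs) for x in members) / m
    term_spread = sum(abs(a - b) for a in members for b in members) / (2 * m * m)
    return term_obs - term_spread


def skill_score(score, reference_score):
    """Skill relative to a benchmark: 1 is perfect, 0 matches the benchmark."""
    return 1.0 - score / reference_score


# Hypothetical seasonal flows in MAF for three validation years.
obs = [6.1, 4.0, 7.5]
# Hypothetical model ensemble (one list of members per year).
model_ens = [[5.8, 6.0, 6.5], [4.2, 4.5, 3.9], [7.0, 7.8, 7.4]]
# Hypothetical climatological benchmark: the same ensemble every year.
clim_ens = [[4.0, 6.0, 8.0] for _ in obs]

# RMSE of the ensemble median, for the model and for the benchmark.
rmse_model = rmse([median(e) for e in model_ens], obs)
rmse_clim = rmse([median(e) for e in clim_ens], obs)

# Mean CRPS over years, then the skill score (CRPSS) against climatology.
crps_model = sum(crps_ensemble(e, y) for e, y in zip(model_ens, obs)) / len(obs)
crps_clim = sum(crps_ensemble(e, y) for e, y in zip(clim_ens, obs)) / len(obs)
crpss = skill_score(crps_model, crps_clim)

print(f"RMSE (model median): {rmse_model:.3f} MAF")
print(f"RMSE (climatology median): {rmse_clim:.3f} MAF")
print(f"CRPSS vs climatology: {crpss:.3f}")
```

A positive CRPSS indicates that the ensemble is more skillful than the benchmark. The same `skill_score` function could be applied with ESP as the reference instead of climatology.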