Abstract
|
The sodium adsorption ratio (SAR) is among the most crucial irrigation water quality indicators for diagnosing the suitability of agricultural water resources. For this reason, accurately forecasting SAR on a monthly scale from limited input sequences, when its own time series is unavailable, has recently been regarded as a challenging environmental problem. This research developed, for the first time, a dual eXplainable multivariate expert framework to forecast monthly SAR in the Zayanderud River, Iran. The framework (BS-GPR-E.TVF-EMD-VMD) consists of Boruta coupled with SHapley Additive exPlanations (Boruta-SHAP) feature selection; an ensemble of time-varying filter-based empirical mode decomposition (TVF-EMD) and variational mode decomposition (VMD), denoted E.TVF-EMD-VMD; and eXplainable Gaussian process regression (GPR). The main novelty of this framework is converting the "black-box" nature of the forecasting model into a dual interpretable "glass box" before and during the learning process. To this end, from nine hydrometric and water quality parameters recorded at two stations on the Zayanderud River (Regulating dam and Zaman Khan) over the period 1969 to 2016, the significant two-month antecedent (lagged) signals were extracted using Boruta-SHAP feature selection. Afterwards, the optimal lagged input signals for each station were decomposed into sub-components, to reduce the complexity and non-stationarity of the original signals, using three pre-processing techniques (E.TVF-EMD-VMD, TVF-EMD, and VMD). The decomposed predictors were then used as inputs to the multilayer perceptron neural network (MLP), random forest (RF), Elman recurrent neural network (ERNN), and eXplainable GPR approaches.
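As a rough illustration of what antecedent (lagged) inputs look like in practice, the sketch below builds a lag matrix from a single monthly series. It is not the authors' implementation: the helper name `make_lagged_inputs`, the default of two lags, and the synthetic values are assumptions made only for this example, and the feature selection and decomposition steps are not reproduced.

```python
import numpy as np

def make_lagged_inputs(series, n_lags=2):
    """Return (X, y), where row i of X holds the n_lags values preceding y[i].

    Hypothetical helper for illustration; n_lags=2 is an assumed setting.
    """
    series = np.asarray(series, dtype=float)
    if series.size <= n_lags:
        raise ValueError("series must be longer than n_lags")
    # Column k holds the value at lag k+1, i.e. (t-1), (t-2), ...
    X = np.column_stack(
        [series[n_lags - k - 1 : series.size - k - 1] for k in range(n_lags)]
    )
    y = series[n_lags:]
    return X, y

# Synthetic monthly values, for illustration only (not the study's data)
demo = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X, y = make_lagged_inputs(demo, n_lags=2)
```

Each row of `X` pairs the target month in `y` with its one-month and two-month antecedent values; in a multivariate setting the same construction would be repeated per predictor before any selection or decomposition.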
Statistical validation and infographic tools revealed that BS-GPR-E.TVF-EMD-VMD achieved the best performance at the Regulating dam (R=0.9817, RMSE=0.1431, and NSE=0.8866) and Zaman Khan (R=0.9632, RMSE=0.0610, and NSE=0.9233) stations, outperforming the other complementary and standalone counterpart frameworks, followed by BS-GPR-TVF-EMD and BS-ERNN-E.TVF-EMD-VMD, respectively. The SHAP explainer applied to the GPR model clearly interpreted the effect of the lagged sub-components of each predictor and showed the impact of each decomposition technique within E.TVF-EMD-VMD on the input signals when forecasting SAR in standalone and complementary frameworks.
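For readers unfamiliar with the three reported metrics, the following minimal sketch computes R (Pearson correlation coefficient), RMSE (root mean square error), and NSE (Nash-Sutcliffe efficiency) using their standard definitions. The function name `evaluate_forecast` and the synthetic arrays are assumptions for illustration; the study's own evaluation code and data are not shown here.

```python
import numpy as np

def evaluate_forecast(observed, predicted):
    """Compute R, RMSE, and NSE from paired observed and predicted values."""
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    # Pearson correlation between observations and forecasts
    r = np.corrcoef(obs, pred)[0, 1]
    # Root mean square error, in the units of the target variable
    rmse = np.sqrt(np.mean((obs - pred) ** 2))
    # Nash-Sutcliffe efficiency: 1 minus error variance over observed variance
    nse = 1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)
    return {"R": r, "RMSE": rmse, "NSE": nse}

# Synthetic values, for illustration only (not the study's data)
obs_demo = np.array([1.0, 2.0, 3.0, 4.0])
perfect = evaluate_forecast(obs_demo, obs_demo)
biased = evaluate_forecast(obs_demo, obs_demo + 1.0)
```

In this toy case the forecast with a constant offset keeps a perfect correlation while its NSE falls well below 1, which is one reason R and NSE are usually reported together.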
|