Document Type: Complete scientific research article
Authors
1
M.Sc. Student in Watershed Sciences and Engineering, Department of Rangeland and Watershed Management, Faculty of Agriculture and Natural Resources, University of Gonbad Kavous, Iran
2
Associate Professor in Engineering Hydrology, Department of Rangeland and Watershed Management, Faculty of Agriculture and Natural Resources, Gonbad Kavous University, Iran
3
Associate Professor in Environmental Hydrogeology, Department of Rangeland and Watershed Management, Faculty of Agriculture and Natural Resources, University of Gonbad Kavous, Iran
4
Associate Professor in Statistics, Department of Statistics, Faculty of Science, Golestan University, Iran
Abstract
Background and Objectives: The relationship between rainfall and runoff is a fundamental concept in hydrology, reflecting complex processes such as infiltration, evapotranspiration, and water exchange between surface and subsurface flows that ultimately lead to runoff generation. In arid and semi-arid regions, irregular and intense rainfall events, coupled with limited hydrometric data and the intricate structure of watersheds, pose significant challenges for water resources management. Rivers, as vital components of the hydrological cycle, play a crucial role in water supply, aquifer recharge, and ecosystem sustainability, and are highly sensitive to variations in precipitation. Advanced runoff modeling, particularly through data-driven approaches such as machine learning and deep learning, has enabled the identification of nonlinear patterns and the accurate prediction of hydrological events, providing valuable tools for informed decision-making in sustainable water resources management. Accordingly, the primary objective of this study is to evaluate and compare the performance of statistical and data-driven models, including Quantile Regression (QR), Multi-Layer Perceptron (MLP), Adaptive Neuro-Fuzzy Inference System (ANFIS), and Transfer Function (TF), in simulating the rainfall–runoff process in the semi-arid regions of Iran.
Materials and Methods: Monthly data from hydrometric and meteorological stations in three watersheds (Zoshk, Dehbar, and Kardeh) in Khorasan Razavi Province were used to model the rainfall–runoff process over 26 years (1997–2022). As an initial step, the Chow method was employed to assess the accuracy and homogeneity of the time series data. Because monthly rainfall and runoff are temporally dependent, the data were structured as time series for further analysis. To simulate and forecast monthly runoff for the next 12 months from rainfall data, four models were used: the multilayer perceptron (MLP) neural network, the adaptive neuro-fuzzy inference system (ANFIS), the transfer function (TF), and quantile regression (QR). Because more recent years better reflect current conditions, a forward selection approach was adopted to determine the effective number of years for modeling. Among the available years, all but one were used for model training, and the remaining year was reserved for validation. The performance of the calibrated models was evaluated using three standard metrics: mean absolute error (MAE), root mean square error (RMSE), and the coefficient of determination (R²). Statistical analyses and computations were performed using R (quantreg and frbs packages), SAS, MINITAB, and SPSS.
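The evaluation metrics and the selection of the number of training years described above can be illustrated with a minimal Python sketch. It is not the authors' implementation (the study used R, SAS, MINITAB, and SPSS). Several things here are assumptions made for illustration: a least-squares regression on lagged rainfall stands in for the four models, R² is computed as 1 − SS_res/SS_tot, the function names (`lagged_design`, `select_training_years`) are hypothetical, and the window is grown backward from the most recent training year as one possible reading of the forward selection step.

```python
import numpy as np

def mae(obs, sim):
    """Mean absolute error."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(np.mean(np.abs(obs - sim)))

def rmse(obs, sim):
    """Root mean square error."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(np.sqrt(np.mean((obs - sim) ** 2)))

def r2(obs, sim):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    ss_res = np.sum((obs - sim) ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def lagged_design(rain, max_lag):
    """Design matrix with columns: intercept, rain[t], rain[t-1], ..., rain[t-max_lag]."""
    rain = np.asarray(rain, float)
    n = len(rain)
    cols = [np.ones(n - max_lag)]
    for lag in range(max_lag + 1):
        cols.append(rain[max_lag - lag : n - lag])
    return np.column_stack(cols)

def select_training_years(rain, runoff, max_lag=1, months=12):
    """Hold out the last year for validation, then grow the training window
    backward one year at a time and keep the window with the lowest RMSE.

    The least-squares fit is a placeholder for MLP, ANFIS, TF, or QR.
    """
    rain, runoff = np.asarray(rain, float), np.asarray(runoff, float)
    n_years = len(rain) // months
    X = lagged_design(rain, max_lag)
    y = runoff[max_lag:]
    X_val, y_val = X[-months:], y[-months:]
    scores = {}
    for k in range(1, n_years):
        # k most recent training years immediately before the validation year
        start = max(len(y) - months - k * months, 0)
        X_tr, y_tr = X[start:-months], y[start:-months]
        coef, *_ = np.linalg.lstsq(X_tr, y_tr, rcond=None)
        scores[k] = rmse(y_val, X_val @ coef)
    best_k = min(scores, key=scores.get)
    return best_k, scores
```

Calling `select_training_years(rain, runoff)` on two aligned monthly arrays returns the number of recent training years with the lowest validation RMSE, together with the score for every window length tried.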
Results: Analysis of the monthly trends of the rainfall and runoff time series showed that the three studied watersheds exhibit distinct structural characteristics. Cross-correlation analysis indicated a lagged relationship between the rainfall and runoff time series in all three watersheds. Specifically, the direct effect of precipitation on runoff was observed with a maximum lag of one month in the Zoshk and Dehbar watersheds, whereas in the Kardeh watershed, owing to its mountainous conditions and the higher contribution of snow to the precipitation regime, a maximum lag of three months was identified. The effective number of years for modeling, determined through the forward selection process, was 17, 19, and 9 years for the Zoshk, Dehbar, and Kardeh watersheds, respectively. Validation of the models using the MAE, RMSE, and R² indices indicated that the MLP model provided the most accurate estimation of monthly runoff in the Zoshk, Dehbar, and Kardeh watersheds (RMSE = 0.0032, 0.0028, and 0.0123 m³/s, respectively). The ANFIS model ranked second (RMSE = 0.0146, 0.0044, and 0.0186 m³/s, respectively), and the validation results revealed considerable similarity between the simulation outputs of the MLP and ANFIS models. The quantile regression (QR) approach exhibited the next highest accuracy in the Zoshk, Dehbar, and Kardeh watersheds (RMSE = 0.0344, 0.0293, and 0.0444 m³/s, respectively). The transfer function (TF) model also provided relatively satisfactory results in identifying trends and evaluating prediction accuracy; however, it exhibited the weakest performance among the four models in the Zoshk, Dehbar, and Kardeh watersheds (RMSE = 0.0344, 0.0378, and 0.0510 m³/s, respectively).
Validation results also indicated notable similarities between the simulation outputs of the TF and QR models. The coefficients of determination (R²) indicate that the models explained a substantial proportion of the variance in runoff from the rainfall inputs. Therefore, all four models demonstrated acceptable levels of predictive accuracy.
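The lagged relationship found by the cross-correlation analysis can be illustrated with a short Python sketch. It is not the authors' procedure: the function names are hypothetical, and the approximate significance bound of 2/√n is an assumption, since the abstract does not state how the reported maximum lags were determined.

```python
import numpy as np

def lagged_correlation(rain, runoff, max_lag=6):
    """Pearson correlation between rain[t - lag] and runoff[t] for lag = 0..max_lag."""
    rain = np.asarray(rain, float)
    runoff = np.asarray(runoff, float)
    ccf = {}
    for lag in range(max_lag + 1):
        x = rain[: len(rain) - lag]   # rainfall leading by `lag` months
        y = runoff[lag:]              # runoff observed `lag` months later
        ccf[lag] = float(np.corrcoef(x, y)[0, 1])
    return ccf

def peak_lag(ccf):
    """Lag (in months) at which the correlation is largest."""
    return max(ccf, key=ccf.get)

def significant_lags(ccf, n):
    """Lags whose correlation exceeds the assumed bound 2 / sqrt(n)."""
    bound = 2.0 / np.sqrt(n)
    return [lag for lag, r in ccf.items() if abs(r) > bound]
```

Under these assumptions, the largest entry returned by `significant_lags` would correspond to the kind of maximum lag reported for each watershed (one month for Zoshk and Dehbar, three months for Kardeh).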
Conclusion: The results of this study indicated that the four models employed (MLP, ANFIS, QR, and TF), despite differences in accuracy, performed satisfactorily in identifying patterns and modeling variations in the output time series based on the input data. Accordingly, these models can be considered efficient and effective tools for predicting monthly runoff from precipitation in the study area. No evidence of persistent overfitting or underfitting, which could reduce the models’ accuracy and efficiency, was observed in any of the approaches. Nevertheless, a comparison of the evaluation metrics indicated that, in terms of predictive accuracy, the models ranked in the following order: MLP, ANFIS, QR, and TF. Given the high variability of precipitation in Khorasan Razavi Province, it was initially expected that the QR model, which is designed to handle extreme values differently from average values, would achieve superior accuracy. However, its performance was noticeably lower than that of the MLP and ANFIS models. Although the TF model was less accurate than the other models, it played a significant role in identifying time lags in the relationship between the input variable (rainfall) and the output variable (runoff). Moreover, the structure of the TF model provides a suitable framework for explaining hydrological processes and representing the influence of rainfall on runoff within a process-based modeling perspective.