Assessment of Kernel Functions Performance in River Flow Estimation using Support Vector Machine

Document Type: Complete scientific research article

Abstract

Background and objectives: Accurate prediction of river flow plays an important role in the optimal management of available water resources. In recent years, the support vector machine (SVM), one of the most important data-driven models, has received attention for this purpose. The SVM is a learning system based on constrained optimization theory; it applies the structural risk minimization principle and yields a globally optimal solution. Like other data mining models, the SVM can be used for runoff simulation when runoff is the only available data (autoregressive simulation). Three kernel functions are typically applied in SVM: radial basis, polynomial of degree d, and linear. Using each function with different parameters for river flow estimation may give different results, so it is necessary to evaluate the accuracy of each function and select the appropriate kernel for runoff simulation. Because the time series models AR, ARMA and ARIMA are the main models for autoregressive simulation of runoff, the relative accuracy of the kernel functions can be investigated by comparing their performance with these models. The main aim of this study is therefore to assess the accuracy of the kernel functions for monthly river flow simulation and to compare their performance with time series models.
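The three kernel functions named above can be sketched as follows. This is a minimal illustration in their common textbook form, not the implementation used in the study; the parameter names gamma, coef0 and degree are assumptions borrowed from widespread SVM software conventions, and the default values are placeholders.

```python
import math

def linear_kernel(x, z):
    """Linear kernel: the plain dot product of two input vectors."""
    return sum(a * b for a, b in zip(x, z))

def polynomial_kernel(x, z, degree=4, gamma=1.0, coef0=1.0):
    """Polynomial kernel of degree d: (gamma * <x, z> + coef0) ** d.
    gamma and coef0 are assumed names; their values are placeholders."""
    return (gamma * linear_kernel(x, z) + coef0) ** degree

def rbf_kernel(x, z, gamma=1.0):
    """Radial basis kernel: exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)
```

Each function maps a pair of input vectors (for example, vectors of lagged monthly flows) to a similarity value; the kernel parameters are the quantities that have to be tuned separately for each function.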
Materials and Methods: The Kherkherehchiy river basin was selected as the study area, and the monthly river flow observed at the Santeh gauging station was used to calibrate and validate the models. The first 75 percent of the monthly river flow data (1367-1384) was used for calibration and the remaining 25 percent (1385-1390) for validation. Next, the probability distribution of the monthly river flow data at the Santeh station was examined using the Kolmogorov-Smirnov and Shapiro-Wilk tests, and the data distribution was then normalized. After the parameters of each kernel function were optimized, monthly flow values were predicted at the Santeh station and the performance of the kernel functions was evaluated using the root mean square error (RMSE) and the correlation coefficient (CC).
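The chronological 75/25 split and the two evaluation measures can be sketched as below. This is an illustrative outline only: the function names are hypothetical, the use of lagged flows as model inputs is an assumption drawn from the autoregressive setting, and CC is taken here to be the Pearson correlation coefficient.

```python
import math

def chronological_split(series, train_fraction=0.75):
    """Split a time series in order: the first part for calibration,
    the remainder for validation (no shuffling)."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]

def make_lagged_samples(series, n_lags):
    """Build autoregressive samples: each target is predicted from the
    n_lags values before it. n_lags is an assumed, user-chosen setting."""
    inputs = [list(series[i - n_lags:i]) for i in range(n_lags, len(series))]
    targets = list(series[n_lags:])
    return inputs, targets

def rmse(observed, predicted):
    """Root mean square error between observed and predicted flows."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def correlation_coefficient(observed, predicted):
    """Pearson correlation coefficient between observed and predicted flows."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)
```

Assuming complete monthly records, a 24-year series holds 288 values, which this split divides into 216 for calibration and 72 for validation. Splitting in time order rather than at random keeps the validation period strictly after the calibration period.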
Results: Although the results of the three kernel functions did not differ significantly, the polynomial kernel function of degree 4, with CC and RMSE values of 0.86 and 5.88 m³/s respectively in the testing period, predicted monthly flow more accurately than the other kernel functions. Among the time series models, ARMA(6,2), with CC and RMSE values of 0.82 and 6.47 m³/s respectively in the testing period, performed better than the others in predicting the monthly flow of the Kherkherehchiy river.
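For reference, an ARMA(6,2) model has the general form sketched below. The symbols are assumptions chosen for illustration: x_t is the (normalized) monthly flow, c a constant, \phi_i the autoregressive coefficients, \theta_j the moving-average coefficients and \varepsilon_t the white-noise residual. Some texts write the moving-average terms with a minus sign.

```latex
x_t = c + \sum_{i=1}^{6} \phi_i \, x_{t-i} + \varepsilon_t + \sum_{j=1}^{2} \theta_j \, \varepsilon_{t-j}
```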
Conclusion: Finally, the monthly river flow predicted with the polynomial kernel function of degree 4 (representing the SVM model) was compared with the results of ARMA(6,2) (representing the time series models). The comparison showed that the SVM model performs better than the time series models in predicting the monthly river flow of the Kherkherehchiy basin.
