Comparison of data mining models in downscaling rainfall and temperature (Case Study: Bazoft-e-Samsami Watershed)

Document Type: Complete scientific research article


Background and Objectives: Temperature and rainfall are two important meteorological variables, especially in arid and semi-arid areas. Determining their values, tracking their changes and predicting them are therefore necessary for more precise planning in the agricultural, economic and social sectors. The temporal and spatial scales required by climate change impact models do not match those of GCM outputs, and trends in meteorological threshold variables need to be assessed at the regional scale; together, these needs have led to the development of various downscaling methods. The aim of this study is to compare data mining models for downscaling rainfall and temperature based on data from the NCEP general circulation model.
Materials and Methods: The study area is the Bazoft-e-Samsami watershed, one of the northern Karun sub-basins, located in the northwest of Chaharmahal and Bakhtiari province. The Marghak rain gauge and hydrometric stations are located at its outlet.
In this study, the performance of four methods, namely decision tree (M5), nearest neighbor (KNN), multilayer perceptron (MLP) and simple linear regression (SLR), was evaluated for modeling monthly rainfall and temperature at Marghak station using NCEP output parameters, with 1971-1990 as the training period and 1991-2000 as the test period.
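The chronological split and the simplest of the four methods (SLR) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the record layout `(year, month, predictor, target)`, the function names and the synthetic data are all assumptions, and only the 1971-1990 and 1991-2000 periods come from the text.

```python
from statistics import mean

def split_by_year(records, train_years=(1971, 1990), test_years=(1991, 2000)):
    """Split (year, month, predictor, target) records chronologically."""
    train = [r for r in records if train_years[0] <= r[0] <= train_years[1]]
    test = [r for r in records if test_years[0] <= r[0] <= test_years[1]]
    return train, test

def fit_slr(xs, ys):
    """Ordinary least squares for y = intercept + slope * x, one predictor."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Synthetic placeholder records, NOT observations from Marghak station.
# The predictor stands in for a single hypothetical NCEP output variable.
records = [(year, month, float(month), 2.0 * month + 1.0)
           for year in range(1971, 2001) for month in range(1, 13)]

train, test = split_by_year(records)
intercept, slope = fit_slr([r[2] for r in train], [r[3] for r in train])

# Predictions for the held-out test period use only the fitted coefficients.
test_predictions = [intercept + slope * r[2] for r in test]
```

Splitting by year rather than at random keeps the test period strictly after the training period, which matches the way the two periods are described above.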
Results: The monthly rainfall modeling results showed that the output of every model except KNN includes negative rainfall values. Rainfall predicted by the M5 model in January, March, April and December is lower than the observed values (P), and the other models show this to some extent as well. Given that the minimum rainfall is zero, this underprediction relative to the observations indicates that the models do not predict the upper range of rainfall well. In every month except May, the rainfall predicted by all models has a lower standard deviation than the observed values (P).
The monthly temperature predictions showed that only the MLP output includes negative temperature values, which may be due to extrapolation and generalization in the MLP method. The standard deviation obtained from all models in January, February, March, April, July, August, October, November and December is greater than the standard deviation of the observed temperature. The statistical analyses also showed that, according to RMSE, MBE and R2, M5 gives better estimates of monthly rainfall and temperature in the test stage than the other models, although the coefficient of determination (R2) in the test stage is weaker for monthly temperature than for monthly rainfall.
Conclusion: The evaluation of the four models (KNN, M5, SLR and MLP) for modeling monthly rainfall and temperature at Marghak meteorological station with NCEP output data showed that these models were weak in downscaling both variables. Therefore, despite the relative superiority of the M5 model over the others, these data mining models are not recommended for predicting rainfall and temperature at Marghak station.
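A possible explanation for the KNN result, which the abstract itself does not state, is that a KNN regressor with uniform weights returns an average of observed training values, so it cannot fall below the smallest observed rainfall, whereas a fitted line can extrapolate below zero. The toy sketch below illustrates this; the data, the predictor and the choice of k = 3 are assumptions made for illustration only.

```python
def knn_predict(train_x, train_y, query, k=3):
    """Average the targets of the k training points closest to the query."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: abs(p[0] - query))[:k]
    return sum(y for _, y in nearest) / k

def fit_line(xs, ys):
    """Least-squares line y = intercept + slope * x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - slope * x_bar, slope

# Toy data: rainfall (mm) decreases with a hypothetical predictor
# and is never negative in the training set.
train_x = [1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [40.0, 30.0, 20.0, 10.0, 0.0]

intercept, slope = fit_line(train_x, train_y)
query = 7.0  # outside the range of the training predictor

linear_pred = intercept + slope * query          # extrapolates below zero
knn_pred = knn_predict(train_x, train_y, query)  # mean of observed values
```

The same bound also limits KNN at the upper end: it can never predict more than the largest rainfall seen in training.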
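The three evaluation statistics named above can be computed as in the following sketch. The abstract does not give formulas, so the conventions here are assumptions: MBE is taken as the mean of predicted minus observed, and R2 as the squared Pearson correlation between observed and predicted values. The sample numbers are made up.

```python
from math import sqrt

def rmse(obs, pred):
    """Root mean square error."""
    return sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mbe(obs, pred):
    """Mean bias error; positive values mean overprediction (assumed sign)."""
    return sum(p - o for o, p in zip(obs, pred)) / len(obs)

def r_squared(obs, pred):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(obs)
    o_bar, p_bar = sum(obs) / n, sum(pred) / n
    cov = sum((o - o_bar) * (p - p_bar) for o, p in zip(obs, pred))
    var_o = sum((o - o_bar) ** 2 for o in obs)
    var_p = sum((p - p_bar) ** 2 for p in pred)
    return cov ** 2 / (var_o * var_p)

# Made-up values for illustration, not results from the study.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

scores = {
    "RMSE": rmse(observed, predicted),
    "MBE": mbe(observed, predicted),
    "R2": r_squared(observed, predicted),
}
```

In this example the errors cancel, so MBE is zero while RMSE is not, which is why the two are usually reported together.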
Keywords: Downscaling, Decision tree (M5), Nearest Neighbor (KNN), Multilayer Perceptron (MLP), Simple linear regression.

