Identification and Prediction of Consumption Behavior Using Decision Tree and Consumer Value Pyramid

Document Type: Research Paper


1 MSc., Dept. of Industrial Engineering, Faculty of Industrial Engineering, Iran University of Science and Technology, Tehran, Iran

2 Prof., Dept. of Information Technology, Faculty of Industrial Engineering, Iran University of Science and Technology, Tehran, Iran

3 MSc., Dept. of Industrial Engineering, Faculty of Industrial Engineering, Payame Noor University, Tehran, Iran


In the current situation, water consumption management policy needs to attend to the patterns of subscribers' water consumption behavior and to identify the subscribers and consumers who hold a higher position in the value pyramid. The value pyramid is a tool that identifies valuable customers in terms of consumption, so it is widely applicable in water consumption management for distinguishing high-consumption from low-consumption customers. In the present study, data mining techniques were therefore used to identify the consumption behavior patterns of Shiraz Water and Wastewater Company subscribers based on their consumption value pyramid and to predict which subscribers hold a higher position in that pyramid. In the proposed method, two consecutive years of consumption data were first extracted from the database of Shiraz Water and Wastewater Company for residential, commercial and industrial subscribers, public and administrative sites, religious and educational sites, and military and non-governmental subscribers. After the optimal number of clusters was determined using a self-organizing map and the Davies-Bouldin index, the subscribers were clustered with the K-Means algorithm. The clustering criteria, identified from expert opinion, were the type of use, location, consumption, history of unauthorized branching, number of disconnection notices and time of bill payment. Then the consumption value of the subscribers in each cluster was calculated, the subscriber consumption value pyramid was plotted, and the decision tree algorithm was used to predict and discover the subscribers' behavioral patterns. The results show that Shiraz Water and Wastewater Company subscribers fall into six clusters in terms of consumption behavior patterns.
When the consumption value pyramid was drawn, these six clusters were grouped into three classes: high-consumption, medium-consumption and low-consumption subscribers. The decision tree reached an accuracy of 78.92%, and according to its results the subscribers in these three classes follow 11 behavioral patterns that predict the type of consumption. Thus, based on these 11 behavioral patterns of Shiraz Water and Wastewater Company subscribers, the consumption of a new subscriber can be predicted and the subscriber's position in the value pyramid can be determined.
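
The cluster-count step described in the abstract can be illustrated with a small sketch that runs K-Means for several candidate values of k and keeps the one with the lowest Davies-Bouldin index. This is not the authors' implementation: the self-organizing map stage is omitted, a deterministic farthest-first seeding is swapped in for the K-Means initialization, the toy points stand in for real subscriber features, and all function names (dist, centroid, farthest_first, kmeans, davies_bouldin, choose_k) are hypothetical.

```python
import math

def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def farthest_first(points, k):
    """Deterministic seeding (an assumption, not from the paper):
    start at the first point, then repeatedly add the point that is
    farthest from its nearest already-chosen center."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    return centers

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: assign to nearest center, recompute means."""
    centers = farthest_first(points, k)
    labels = []
    for _ in range(max_iter):
        labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old center if a cluster happens to be empty.
            new_centers.append(centroid(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return labels

def davies_bouldin(points, labels):
    """Davies-Bouldin index: mean over clusters of the worst-case ratio
    (scatter_i + scatter_j) / distance(centroid_i, centroid_j).
    Lower values indicate more compact, better separated clusters."""
    ids = sorted(set(labels))
    groups = {j: [p for p, lab in zip(points, labels) if lab == j] for j in ids}
    cents = {j: centroid(groups[j]) for j in ids}
    scatter = {j: sum(dist(p, cents[j]) for p in groups[j]) / len(groups[j]) for j in ids}
    worst = [
        max((scatter[i] + scatter[j]) / dist(cents[i], cents[j]) for j in ids if j != i)
        for i in ids
    ]
    return sum(worst) / len(ids)

def choose_k(points, candidates):
    """Return the candidate k with the lowest index, plus all scores."""
    scores = {k: davies_bouldin(points, kmeans(points, k)) for k in candidates}
    return min(scores, key=scores.get), scores

# Synthetic 2-D points forming three tight groups (illustrative only).
demo_points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 0), (10, 1), (11, 0), (11, 1),
    (0, 10), (0, 11), (1, 10), (1, 11),
]

if __name__ == "__main__":
    best_k, scores = choose_k(demo_points, range(2, 6))
    print("Davies-Bouldin index per k:", scores)
    print("selected k:", best_k)
```

In practice the subscriber features listed in the abstract (type of use, location, consumption and so on) would need to be encoded numerically and scaled before distances between subscribers are meaningful.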

