Modeling of Unauthorized Water Consumption Detection (Case Study: Qom)

Document Type: Case study

Author

MSc in Statistics, Water and Wastewater Co., Qom Province, Qom, Iran

Abstract

Detecting tampering with water meters, one form of unauthorized water use, is a key step in improving service delivery and raising the productivity of water resources, and it requires special attention. The data used to identify unauthorized water use due to meter tampering cover residential water users in Qom: 671 subscribers with a record of meter tampering during 2017-2019 and a random sample of 3120 subscribers with no tampering record (clean). The analysis drew on subscribers' water consumption and invoice payment histories and applied supervised data mining techniques (decision tree, support vector machine, neural network, logistic regression, and K-nearest neighbor) as well as an unsupervised clustering method. Comparing the supervised methods on the tampered and non-tampered groups showed that their accuracies are close to one another, differing by only 1–3%. Judged by the percentage of correct responses, logistic regression was the best data mining model: on the testing data it correctly detected 85% of tampered and 91% of non-tampered cases, with 89% overall accuracy, so it can be used to identify tampered meters. For the unsupervised analysis, the subscribers were grouped into six clusters. Cluster 3 (n=160 subscribers) behaved distinctly from the other clusters: about 86% of the subscriptions in cluster 3 are tampered cases, and 18% of the tampered cases detected by logistic regression fall in this cluster. This study thus presents data mining techniques for identifying water meter tampering.
The findings indicate that both supervised methods (which use a response variable) and unsupervised methods such as clustering (which use no response variable) can be used to identify unauthorized water consumption. In this study, logistic regression was selected as the most appropriate model for detecting tampered meters because of its high accuracy.
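
The figures reported above (share of tampered cases detected, share of clean cases detected, overall accuracy, and share of tampered cases within a cluster) can be illustrated with a minimal Python sketch. It is not the study's code: the function names, the label encoding (1 = tampered, 0 = clean), and the toy data are assumptions made up for illustration, and the numbers it produces are unrelated to the Qom results.

```python
from collections import Counter


def detection_rates(y_true, y_pred):
    """Return (tampered_rate, clean_rate, overall_accuracy).

    Assumed label encoding: 1 = tampered meter, 0 = clean meter.
    """
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == 1 and p == 1)
    true_neg = sum(1 for t, p in pairs if t == 0 and p == 0)
    n_tampered = sum(1 for t in y_true if t == 1)
    n_clean = len(y_true) - n_tampered
    return (
        true_pos / n_tampered,                 # tampered cases correctly detected
        true_neg / n_clean,                    # clean cases correctly detected
        (true_pos + true_neg) / len(y_true),   # overall accuracy
    )


def tampered_share_by_cluster(cluster_ids, y_true):
    """Fraction of tampered subscribers inside each cluster."""
    totals = Counter(cluster_ids)
    tampered = Counter(c for c, t in zip(cluster_ids, y_true) if t == 1)
    return {c: tampered[c] / totals[c] for c in totals}


# Toy data, invented for illustration only (10 hypothetical subscribers).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]    # known tampering status
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]    # output of some classifier
clusters = [3, 3, 3, 1, 3, 1, 1, 2, 2, 2]  # cluster assigned to each subscriber

tampered_rate, clean_rate, accuracy = detection_rates(y_true, y_pred)
shares = tampered_share_by_cluster(clusters, y_true)
print(tampered_rate, clean_rate, accuracy)
print(shares)
```

Because the two groups differ in size, overall accuracy is a weighted average of the two per-class rates, which is why the abstract reports all three figures rather than accuracy alone.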

Keywords

