Detecting Abnormal Water Consumption Pattern of Enterprise Based on Isolation Forest Sampling
LIN Qingxuan1, GUO Qiang1, DENG Chunyan1, WANG Yajing1, LIU Jianguo2
1. Research Center for Complex Systems Science, University of Shanghai for Science & Technology, Shanghai 200093, China; 2. Institute of Accounting and Finance, Shanghai University of Finance and Economics, Shanghai 200433, China
Abstract To address the low-frequency, short-sequence data and the class imbalance encountered when detecting abnormal water consumption patterns of enterprises, this paper proposes a binary classification method based on Isolation Forest sampling. First, volatility and statistical features of water consumption are constructed. The Isolation Forest algorithm is then used to compute the degree of isolation of each sample in the majority class as a measure of its representativeness, and majority-class samples are drawn according to that representativeness. The drawn samples are merged with the minority class to form a balanced training dataset. Finally, an XGBoost classifier is trained on the balanced dataset and used to predict abnormal patterns. On a dataset of 13 months of water consumption for 7,604 enterprises in one city, the proposed method reaches an AUC of 0.927 and a recall of 0.891, compared with 0.855 and 0.733 for XGBoost with random undersampling, improvements of 4.7% and 21.6%, respectively.
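The sampling step described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the function names (`isolation_scores`, `balanced_sample`), the tree parameters, and above all the selection rule are assumptions. The abstract does not say how representativeness maps to selection, so the sketch simply keeps the least-isolated majority-class samples, as many as there are minority-class samples. The resulting balanced set would then be passed to a classifier such as XGBoost, which is omitted here.

```python
import math
import random


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # H(n-1) approximated via Euler's constant
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Recursively partition rows with random axis-aligned splits."""
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    feature = rng.randrange(len(rows[0]))
    lo = min(r[feature] for r in rows)
    hi = max(r[feature] for r in rows)
    if lo == hi:  # cannot split on a constant feature
        return {"size": len(rows)}
    split = rng.uniform(lo, hi)
    left = [r for r in rows if r[feature] < split]
    right = [r for r in rows if r[feature] >= split]
    return {
        "feature": feature,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(node, row):
    """Depth at which row lands, plus the expected extra depth for an unsplit leaf."""
    depth = 0
    while "feature" in node:
        side = "left" if row[node["feature"]] < node["split"] else "right"
        node = node[side]
        depth += 1
    return depth + c_factor(node["size"])


def isolation_scores(rows, n_trees=100, sample_size=256, seed=0):
    """Return one anomaly score in (0, 1] per row; higher means more isolated."""
    if len(rows) < 2:
        raise ValueError("need at least two rows")
    rng = random.Random(seed)
    psi = min(sample_size, len(rows))
    max_depth = math.ceil(math.log2(psi))
    trees = [build_tree(rng.sample(rows, psi), 0, max_depth, rng)
             for _ in range(n_trees)]
    norm = c_factor(psi)
    scores = []
    for row in rows:
        mean_depth = sum(path_length(t, row) for t in trees) / n_trees
        scores.append(2.0 ** (-mean_depth / norm))
    return scores


def balanced_sample(majority, minority, seed=0):
    """Keep the least-isolated majority rows (ASSUMED rule) and merge with the minority."""
    scores = isolation_scores(majority, seed=seed)
    k = min(len(minority), len(majority))
    ranked = sorted(range(len(majority)), key=lambda i: scores[i])
    features = [majority[i] for i in ranked[:k]] + list(minority)
    labels = [0] * k + [1] * len(minority)  # 1 marks the minority (abnormal) class
    return features, labels
```

Calling `balanced_sample(majority, minority)` with two lists of equal-length feature tuples returns a feature list and a matching 0/1 label list with equal class counts. A weighted random draw based on the scores would be an equally plausible reading of the abstract; the deterministic cut-off is used here only because it is simpler to reason about.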
Received: 20 January 2020
Published: 23 September 2020