Adaptive Multi-state Pipe Framework Based on Set Pair Analysis

Abstract—Human Activity Recognition (HAR) is one of the main research fields in pattern recognition. In recent years, machine learning and deep learning have played important roles in Artificial Intelligence (AI) and have proven very successful in HAR classification tasks. However, mainstream frameworks have two drawbacks: 1) all inputs are processed with the same parameters, which can cause the framework to assign an unrealistic label to an object; 2) these frameworks lack generality across application scenarios. In this paper, an adaptive multi-state pipe framework based on Set Pair Analysis (SPA) is presented, in which pipes are divided into three types: main pipe, sub-pipe and fusion pipe. In the main pipe, the input of the classification task is preprocessed by SPA to obtain the Membership Belief Matrix (MBM). Shunt processing across the sub-pipes is then performed according to the membership belief. Finally, the results are merged through the fusion pipe. To test the performance of the proposed framework, we search for the configuration that yields the best performance and evaluate the new approach on the popular benchmark dataset WISDM. Experimental results show that the proposed framework achieves a test error of 1.4%.


I. INTRODUCTION
Human Activity Recognition (HAR) is an important area in pattern recognition, and its research can be traced back to the 1990s [1]. There are two main ways to acquire human activities, i.e., using cameras and using wearable sensors [2]. With the rapid development of microelectronics and the Internet of Things (IoT), wearable sensors allow people to interact with devices. Many scholars have made great achievements in using various advanced algorithms to classify and process perceived activity information [3]-[5]. However, mainstream frameworks have two drawbacks: most of them do not apply a divide-and-conquer method to different activities, and they lack generality across application scenarios.
The authors are with the School of Information Science and Engineering, Shandong University, Qingdao, China (e-mail: sduslx@163.com, hongjixu@sdu.edu.cn).
In this paper, a novel framework is presented for HAR, which uses a divide-and-conquer method based on Set Pair Analysis (SPA). SPA is a theoretical tool for dealing with uncertainties proposed by K. Q. Zhao [6]. The main idea is that, in a specific problem, two connected sets A and B form the set pair H(A, B), which is used to analyze the certainty and uncertainty of the two sets [7]. In the proposed framework, the sample data is pre-classified by K-Nearest Neighbor (KNN), and the results are processed by SPA to obtain the Membership Belief Matrix (MBM). Sub-pipes are then selected by applying thresholds to the Membership Belief (MB). Finally, the results are output through the fusion pipe. During the feedback phase, candidate MB thresholds are traversed to find the optimal values and further improve activity recognition performance.
The rest of this paper is organized as follows. Related work and an outline of SPA are introduced in Section II. The details of the proposed framework are described in Section III. The experimental results are presented and analyzed in Section IV. Finally, the conclusions and some future work are described in Section V.

II. RELATED WORKS
This section briefly reviews existing HAR frameworks and the related concepts of SPA.

A. Framework of HAR
With the development of hardware processing ability and Artificial Intelligence (AI), the framework of HAR is constantly optimized. The works in [8]-[11] used various machine learning methods to identify human behaviors and analyzed the performance of these methods. I. Cleland et al. [12] and K. Altun et al. [13] compared the performance of various machine learning classification methods, and the results indicated that Bayesian Decision Making (BDM) generally achieves the highest correct classification rate with relatively small computational cost. The above frameworks are only suitable for low-complexity applications. J. B. Yang et al. [14] proposed a systematic feature learning method for HAR which used a deep Convolutional Neural Network (CNN) to automate feature learning from the raw inputs. F. J. Ordóñez et al. [15] proposed a general deep activity recognition framework based on convolutional and Long Short-term Memory (LSTM) recurrent units. To address a shortcoming of the current frameworks, i.e., their failure to take the important spatial hierarchies between features into account, the concept of the capsule was first proposed in 2011 by G. Hinton [16] and achieved state-of-the-art accuracy on the MNIST dataset in 2017 [17]. T. Turan et al. [18]  However, the above frameworks perform the same processing for all kinds of data, resulting in an inability to distinguish similar behaviors. H. Cho et al. [19] applied a divide-and-conquer approach and built a two-stage activity recognition process, where the abstract activities, i.e., dynamic and static activities, were first recognized using a 2-class classifier, and then the individual activities were recognized using two 3-class classifiers. This framework effectively compensates for the shortcomings of the previous frameworks, but still lacks generality.

B. Set Pair Analysis
SPA is a theoretical tool for dealing with uncertainty problems. The basic idea is that, in the context of a specific problem, two connected sets A and B form a set pair H(A, B) [20].
The two sets A and B are quantitatively described by identical, contradictory and discrepant analysis according to their characteristics. The set pair H(A, B) has M features in total, including the identical features, contradictory features and other features. For a specific problem, assume that the number of identical features is S, so the identity degree of the two sets is S/M; the number of contradictory features is P, so the contradictory degree is P/M; and the number of other features is F = M - S - P, so the discrepancy degree is F/M. The connection degree of sets A and B can be expressed as:
$$\mu(A, B) = \frac{S}{M} + \frac{F}{M} I + \frac{P}{M} J$$
where I is the coefficient of discrepancy degree and J is the coefficient of contradictory degree.

Differentiation of confusing activities is still a difficult problem in HAR. A novel adaptive multi-state pipe framework based on SPA is proposed to effectively identify confusing activities.
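As a small illustration of the SPA connection degree described in Section II-B, the following Python sketch computes the identity, discrepancy and contradictory degrees from the counts S, P and M, and then evaluates the connection degree for given coefficients. The function names are hypothetical, and the default J = -1 with I chosen in [-1, 1] is a convention common in the SPA literature that is assumed here rather than taken from this paper.

```python
from fractions import Fraction


def connection_degree(S, P, M):
    """Return the (identity, discrepancy, contradictory) degrees of a set pair.

    S: number of identical features, P: number of contradictory features,
    M: total number of features. The remaining F = M - S - P features are
    the discrepant ("other") ones.
    """
    if M <= 0 or S < 0 or P < 0 or S + P > M:
        raise ValueError("require M > 0, S >= 0, P >= 0 and S + P <= M")
    F = M - S - P
    return Fraction(S, M), Fraction(F, M), Fraction(P, M)


def connection_value(S, P, M, I=0.0, J=-1.0):
    """Evaluate S/M + (F/M) * I + (P/M) * J.

    Assumption: I in [-1, 1] and J = -1 are illustrative defaults,
    not values specified in the paper.
    """
    a, b, c = connection_degree(S, P, M)
    return float(a) + float(b) * I + float(c) * J


if __name__ == "__main__":
    # 10 features: 6 identical, 1 contradictory, so 3 are discrepant.
    print(connection_degree(6, 1, 10))
    print(connection_value(6, 1, 10, I=0.5))
```

The three degrees always sum to 1, so only the choice of I and J changes the final connection value.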

A. Proposed Relational Degree of Evidences
Fig. 1 shows the basic flowchart of the proposed framework, which is mainly composed of the main pipe, the sub-pipes and the fusion pipe. In the main pipe, the training information is processed by the KNN [21] algorithm to obtain the confusion matrix of predicted values against actual values, and then the SPA algorithm is used to obtain the Belief Matrix (BM). When the test information enters the main pipe, the MBM is obtained by multiplying the membership by the previously obtained BM, and then the corresponding sub-pipes are selected according to the upper and lower limits of the credibility to classify the information. Finally, the results of each sub-pipe are output through the fusion pipe.
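The main-pipe computation described above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the BM is taken to be the confusion matrix normalized by its column sums, the membership of a test sample is taken to be the vector of KNN vote fractions per class, and the MB of a sample is taken to be the membership of its predicted class multiplied by the matching diagonal BM element. All function names are hypothetical.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are real categories, columns are predicted categories."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def belief_matrix(cm):
    """Assumed BM: each element divided by the sum of its column.

    A column with no predictions is left as zeros to avoid division by zero.
    """
    n = len(cm)
    col_sums = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    return [
        [cm[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


def membership_belief(membership, bm):
    """Return (predicted class, MB) for one test sample.

    membership: per-class KNN vote fractions (assumed form of the membership).
    MB is the membership of the predicted class times the belief that a
    prediction of that class is correct (assumed interpretation).
    """
    pred = max(range(len(membership)), key=lambda j: membership[j])
    return pred, membership[pred] * bm[pred][pred]


if __name__ == "__main__":
    # Toy training-stage predictions for three activity classes.
    y_true = [0, 0, 0, 1, 1, 2]
    y_pred = [0, 0, 1, 1, 1, 2]
    bm = belief_matrix(confusion_matrix(y_true, y_pred, 3))
    print(membership_belief([0.1, 0.9, 0.0], bm))
```

Under these assumptions, a BM element of 1 on the diagonal means every prediction of that class was correct on the training data, so the MB reduces to the membership itself.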

B. Procedure of the Proposed Method
As shown in Fig. 2, the adaptive multi-state pipe framework based on SPA proceeds as follows:
• Step 1: Activity information is collected through intelligent hardware as input to the framework.
• Step 2: The framework pre-processes the data and uses a preset threshold to detect whether the information is uncertain. If uncertainty exists, the system performs uncertainty elimination followed by information standardization; otherwise it performs information standardization directly. There are three main types of data standardization: 1) category-type features are standardized using one-hot coding; 2) numerical features are standardized using normalization; 3) ordered features are standardized using ordered numerical coding. After uncertainty is eliminated from the original data, the activity information has higher credibility. Normalization scales the values to [0, 1], which is convenient for calculation and may help the system find the global optimal solution.
• Step 3: The above results are fed into the main pipe of the framework, and KNN is used for classification to obtain the confusion matrix. The columns of the confusion matrix represent the predicted categories, and the sum of each column is the number of instances predicted as that category. The rows represent the real categories, and the sum of each row is the number of instances of that category. The diagonal elements count the instances whose predicted category matches the actual category, that is, the correct predictions. The confusion matrix of the main pipe is shown in Table I, where n is the number of activity types and Number_ij is the number of instances of input activity i identified as output activity j.

[Fig. 2. Flowchart of the adaptive multi-state pipe framework based on SPA. Steps 1-7 include computing the confusion matrix, computing the confidence confusion matrix, computing the membership belief matrix, and checking for the optimum solution (Yes/No) with a feedback update.]
[...] is the sum of elements in its column i. Table III shows the BM of the main pipe; an element value of 1 indicates that the prediction credibility is 100%.
• Step 4: Define a new credibility measure, the MB, for sub-pipe selection as
$$MB_{ij} = p_i \cdot \mathrm{cos}_{ij}$$
Shunt processing across the sub-pipes is performed according to the upper limit (MBh) and the lower limit (MBl) of the threshold, where the membership degree corresponding to MBh is initialized to 0.8 (the membership degree of the test data is 0.8) and the one corresponding to MBl is initialized to 0.6 (the membership degree of the test data is 0.6). With these initial values, which are determined by simulations, MBh and MBl are equal to 0.776 and 0.498, respectively.
• Step 5: The sub-pipes are divided into a high-speed sub-pipe, a CNN sub-pipe and a capsule sub-pipe. If the classification result of the main pipe is accurate, the data passes through the high-speed sub-pipe. Otherwise, the CNN or capsule sub-pipe is used for secondary classification.
• Step 6: The processed data is merged sequentially through the fusion pipe.
• Step 7: Traverse the sub-pipe thresholds between MBh and MBl in steps of 0.1 to find the best solution. The results are fed back to adjust Step 4. The algorithm for finding the optimal MBh and MBl is shown in Table IV.
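The sub-pipe selection and fusion of Steps 4 to 6 can be sketched as below. The text states only that accurate main-pipe results take the high-speed sub-pipe and that the remaining samples go to the CNN or capsule sub-pipe; the specific rule used here (MB at or above MBh goes to the high-speed sub-pipe, MB between MBl and MBh goes to CNN, MB below MBl goes to capsule) is an assumption made for illustration. The names `select_sub_pipe` and `run_pipes` and the callables in `sub_pipes` are hypothetical stand-ins for the real classifiers.

```python
def select_sub_pipe(mb, mb_high=0.776, mb_low=0.498):
    """Pick a sub-pipe name from the membership belief of one sample.

    Assumed routing rule (not confirmed by the paper):
      mb >= mb_high           -> "high_speed" (keep the main-pipe label)
      mb_low <= mb < mb_high  -> "cnn"
      mb < mb_low             -> "capsule"
    """
    if mb_low > mb_high:
        raise ValueError("mb_low must not exceed mb_high")
    if mb >= mb_high:
        return "high_speed"
    if mb >= mb_low:
        return "cnn"
    return "capsule"


def run_pipes(inputs, main_labels, mb_values, sub_pipes,
              mb_high=0.776, mb_low=0.498):
    """Route every sample and merge the results in the original input order.

    sub_pipes maps "cnn" and "capsule" to callables that reclassify a sample;
    these are placeholders for trained secondary classifiers.
    """
    fused = []
    for x, label, mb in zip(inputs, main_labels, mb_values):
        pipe = select_sub_pipe(mb, mb_high, mb_low)
        if pipe == "high_speed":
            fused.append(label)          # trust the main-pipe (KNN) result
        else:
            fused.append(sub_pipes[pipe](x))  # secondary classification
    return fused


if __name__ == "__main__":
    stand_ins = {"cnn": lambda x: "upstairs", "capsule": lambda x: "downstairs"}
    print(run_pipes(["s1", "s2", "s3"], ["walking", "walking", "walking"],
                    [0.90, 0.60, 0.30], stand_ins))
```

Keeping the routing rule in one small function makes it easy to swap in a different assignment of the middle and low MB ranges if the actual framework uses one.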
In order to demonstrate the efficiency of the proposed framework, experiments are performed on the public dataset WISDM. The dataset contains the activities of 36 people collected through smartphone sensors and covers six activities: walking, standing, going upstairs, going downstairs, sitting and jogging. The data is divided into 75% training data and 25% testing data. We analyze how MBh and MBl affect the classification results and confirm the optimal configuration of parameters. Finally, we compare the performance of the proposed framework with that of other state-of-the-art frameworks, including CNN and a two-stage learning model of multiple 1D CNNs.
Fig. 3 compares the accuracy of the adaptive multi-state pipe framework based on SPA under different thresholds. The abscissa represents MBl, the legend represents MBh, and the ordinate represents the accuracy of the proposed framework. After several simulations, we find that the recognition rate of the framework is optimal when the probability coefficient corresponding to MBh is 0.8 and the one corresponding to MBl is 0.6. In different application scenarios, the optimal parameters may differ.
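The threshold analysis described above amounts to a search over pairs of upper and lower coefficients. A minimal sketch of such a search is given below, assuming a grid over [0, 1] in steps of 0.1 and a user-supplied `evaluate` callable that returns the test accuracy for a given pair; both the grid range and the callable are assumptions for illustration, and this is not the algorithm of Table IV.

```python
def search_thresholds(evaluate, step=0.1, low=0.0, high=1.0):
    """Exhaustively search (upper, lower) threshold pairs on a regular grid.

    evaluate(upper, lower) -> accuracy is a hypothetical callback that runs
    the framework with the given thresholds. Pairs with upper < lower are
    skipped. Returns (best_accuracy, best_upper, best_lower); on ties the
    first pair found is kept.
    """
    n = int(round((high - low) / step))
    # Round grid points to suppress floating-point drift such as 0.30000000000000004.
    grid = [round(low + k * step, 10) for k in range(n + 1)]
    best = None
    for lower in grid:
        for upper in grid:
            if upper < lower:
                continue
            acc = evaluate(upper, lower)
            if best is None or acc > best[0]:
                best = (acc, upper, lower)
    return best


if __name__ == "__main__":
    # Toy accuracy surface that peaks at upper = 0.8 and lower = 0.6.
    def toy_accuracy(upper, lower):
        return 1.0 - abs(upper - 0.8) - abs(lower - 0.6)

    print(search_thresholds(toy_accuracy))
```

With only 66 valid pairs on this grid, exhaustive search is cheap compared with retraining the sub-pipes, which is why no smarter optimizer is used in the sketch.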

B. Comparison with the State-of-the-Art Frameworks
In terms of performance, we compare the proposed framework with other state-of-the-art HAR frameworks. Fig. 4 shows the accuracy and loss of CNN, 1D CNN and the proposed framework. When the loss converges, the recognition accuracy of the proposed framework is 98.6% (i.e., the test error is 1.4%). During the training process, the recognition accuracy of the proposed framework is always higher than that of CNN or 1D CNN. Table V compares the accuracy of the three frameworks; the testing recognition rate of the proposed framework is the highest, about 7.9% higher than that of CNN and 2.2% higher than that of 1D CNN. Fig. 5 shows the accuracy of each activity with the proposed framework, which is relatively high for every activity. Fig. 6 shows the accuracy of each activity for CNN, 1D CNN and the proposed framework, respectively. The recognition accuracy of jogging, walking and standing is very high. However, like most other classifiers, CNN and 1D CNN fail to distinguish between very similar activities such as going upstairs and going downstairs. The proposed adaptive multi-state pipe framework based on SPA handles these activities more effectively and achieves the best performance.

V. CONCLUSION
In this paper, an adaptive multi-state pipe framework based on SPA is proposed, which provides an effective and feasible method for the differentiation of confusing activities.
In the experiments, the optimal configuration of parameters is confirmed by analyzing the impact of several key parameters on the proposed framework. To demonstrate the strength of the proposed framework, it is compared with two state-of-the-art frameworks on the raw time series of the WISDM dataset. The experimental results demonstrate that the recognition accuracy of the proposed framework is 98.6% on the WISDM dataset, which is about 7.9% higher than that of CNN and 2.2% higher than that of 1D CNN. Therefore, the framework can be used in both highly confusing and general situations, and its fused results are more effective than those of some other typical frameworks. Future work will consider combining capsule and LSTM to identify the characteristics of human activity.