Smartphone Sensor Accelerometer Data for Human Activity Recognition Using Spiking Neural Network

Recognizing human actions is a challenging task that is actively researched in the computer vision community. Human activity recognition has been widely used in applications such as monitoring people in hospitals or public spaces. This work applies an open dataset of smartphone accelerometer data covering various types of activities. The analogue input data is encoded into spike trains using a rate-based method. A spiking neural network is a simplified form of a dynamic artificial neural network; this network is therefore expected to model and generate action potentials from the leaky integrate-and-fire spike response model. The learning rule is adaptive and efficiently represents how synapses excite and inhibit firing neurons. The results show that the proposed model achieves state-of-the-art performance at a low computational cost.


I. INTRODUCTION
Human activity recognition (HAR) is an important area of research in ubiquitous computing, human behaviour analysis and human-computer interaction. Research in these areas employs different machine learning algorithms to recognize simple and complex activities such as walking, running and cooking. In particular, recognizing daily activities is essential for maintaining a healthy lifestyle, for patient rehabilitation, and for tracking activity shifts among elderly citizens, which can help to detect and diagnose serious illnesses. A human activity recognition framework therefore provides a mechanism to detect postural and ambulatory activities, body movements and user actions from the multimodal data generated by a variety of sensors. Accelerometer sensors were already used in mobile devices to recognize human activities for elderly monitoring, healthcare and rehabilitation systems before smartphones became available on the market. Nowadays smartphones have become a must-have gadget that we carry everywhere and use at any time. Many smartphones are equipped with various sensors such as accelerometers, GPS, light sensors, gyroscopes and temperature sensors. These sensors have become a rich data source for developing real-time Android applications that measure various aspects of daily life. In terms of battery efficiency, the battery can last more than 1 day. A smartphone can also track what we do and connect with other services, such as mobile cloud computing, for other purposes.
Therefore, this research uses data from the sensors built into smartphones instead of collecting data from a single accelerometer sensor in a conventional system [1]. The advantages of smartphone data are low installation cost, ease of use and relatively unobtrusive data collection. We chose the open WISDM dataset for training and testing. This dataset contains typical activities such as walking and sitting.
In the state of the art, the accelerometer is the main sensor used for activity recognition. For example, the accelerometer has considerable potential in rehabilitation and healthcare, particularly for clinical applications of fall detection during rehabilitation treatment [2]. In the classification phase of an activity recognition system, the role of this sensor in overall recognition has been evaluated using a K-nearest neighbor classifier [3]. The author in [4] used a multi-class Support Vector Machine (SVM) to classify accelerometer signal frequency patterns extracted with the Fast Fourier Transform. To study the behavior of users, [5] implemented real-time activity recognition using a hierarchical hidden Markov model. This paper uses accelerometer sensor data collected with smartphones to recognize various types of human activities. It proposes a spiking neural network (SNN) model to analyze the spikes, which are highly informative for action recognition. The paper is organized as follows: Section II briefly reviews related work in HAR using smartphone data. Section III formulates the problem of feature extraction using the SNN model. Section IV presents the experimental results and discusses the proposed approach. Finally, conclusions are drawn in Section V.

II. RELATED WORKS
Research in HAR using smartphone accelerometer sensors has attracted more practical applications, especially in real-time classification of user activity. The challenge arising from existing work is how to select an acceptable time segment size within the typical interval duration of 2-10 s. The experiments were designed and controlled for certain types of dataset. The results in [6] show fairly good performance with smaller time intervals.
In addition, the choice of feature extraction method plays an important role in the performance of HAR. Existing research tends towards adapting deep learning or convolutional neural networks (CNN) [7] for HAR. Normally, the raw inputs for this kind of machine learning are images. The majority of research is concerned with representation learning, which focuses on how best to represent the raw signal data using deep learning or a CNN. For example, EEG signal data [8] is normalized to zero mean and scaled by its standard deviation before being fed into the CNN for training and testing. The CNN is used because it has both shift and translational invariance. In [9] and [10], the sensor data is supplied to the CNN as an image, similar to how RGB data is processed by a CNN. In [11], the Discrete Fourier Transform is applied to the sensor image to extract features for further classification. The author in [12] also applied a CNN to human activity recognition, detecting abnormalities by estimating scale and velocity for different types of activities using the Histogram of Oriented Optical Flow (HOOF) technique.

Nor Surayahani Suriani and Fadilla 'Atyka Nor Rashid
International Journal of Machine Learning and Computing, Vol. 11, No. 4, July 2021
The study of SNNs as a powerful neural network approach has motivated researchers to focus on bio-inspired methods for pattern recognition [13], [14]. SNNs can also learn to convert speech signals into spike train signatures that are distinguishable from those of other speech signals, so that they represent different words [15]. In [16], an SNN model is used to model a balance brain interface using EEG pattern data. The author in [17] applied an SNN model to recognize different types of walking pattern. The SNN model has also been shown to classify motor imagery EEG signals [18], owing to its robustness in pattern recognition compared to other classifiers such as Linear Discriminant, Feed Forward Neural Network and Radial Basis Function. A further optimization technique is applied to the weight and delay settings. The acceleration data for each of the x, y and z axes are calculated by the integration of velocities. A fuzzy spiking neuron model is applied to observe and understand human behaviors in the development of a humanoid robot [19]. The person is extracted from the data measured by the sensor network.
Inspired by biological human perception, accelerometer sensor data is encoded and modeled to match the spiking activity of cells. The nonlinear frequency information extracted from the spike trains is then classified using an SVM for real-time surface texture classification in neurorobotic research [20]. Similarly, in wildlife behavior monitoring research, an SNN model is implemented to analyze data obtained from a GPS sensor to track animal positions and behavior [21].
Over the last century, neural networks have progressed from the first-generation perceptron to today's deep learning networks trained using gradient descent. SNNs, regarded as the third generation of neural networks, are believed to be the next generation of machine learning, computing with biologically realistic models of the human brain.
In this research, an SNN model is employed to encode the motion patterns of a rehabilitation monitoring system, using a 3D Gabor spatio-temporal filter in the V1 and MT layers to form an active motion map according to the direction, speed and orientation of object movement based on a Gaussian distribution [22]. SNN models are inspired by the brain and by the communication of neurons, which transform information via action potentials (spikes) through adaptive synapses. A spike is generated when the average change in membrane potential resulting from stimulation exceeds a spiking threshold. The rate of spike generation and the temporal pattern of the spike train carry significant information about the stimuli. The following section explains the details of the SNN architecture.

III. SPIKING NEURON NETWORK MODEL
The SNN model incorporates spatio-temporal information in communication and computation, like real neurons. Neuronal signals consist of short electrical pulses known as action potentials or spikes. Incoming spikes alter the voltage of the neuron, and when this voltage rises above a threshold value the neuron sends out an action potential itself. The chain of action potentials emitted by a single neuron is called a spike train. These individual spikes are sparse in time, so each spike has high information content. This representation of knowledge in time and space makes SNNs uniquely suited to performing brain-like computations and to understanding brain data and activity as spatio-temporal patterns.
Consider an accelerometer dataset arranged as an M × N × T matrix, where M is the total number of data samples, N indexes the x, y and z axes of the accelerometer sensor and T is the total number of trials. Fig. 1 shows the general stages of classification using the SNN model. The accelerometer data acts as the raw input, and the preprocessed signal is transformed into a spike train input dataset. The spike train data serves as the initial input of the SNN model. Weight parameter setting and optimization of the SNN are then carried out within the network. Finally, the similarity score of the network output determines the output class.
For firing rate and spike generation, there are many different schemes for using spike timing information in neural computation. The most widely used and best-known model of threshold-fire neurons, and of spiking neurons in general, is the Integrate-and-Fire neuron. It is the simplest spiking neuron model and is easy to analyze and simulate. The neuron receives all the weighted spikes from the presynaptic neurons via several synapses, and generates spike trains according to a conductance-driven Integrate-and-Fire equation in which $k_c$ is an amplification factor. When a cell fires a spike, an inhibitory conductance is generated in neighboring cells that have a similar direction, orientation angle and speed. The inhibitory conductance $G_i^{inh}(t)$ depends on $d$, the Euclidean distance between each cell and its neighboring cell, and on $R_{inh}$, the size of the receptive field.
Input patterns that belong to the same class of activity are expected to drive the cells to almost the same firing rate, whereas different classes generate different firing rates, with widely separated average spiking rates. This separation is enough to discriminate among the different types of activity in HAR.
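The integrate-and-fire behaviour described above can be illustrated with a minimal sketch. It assumes a simpler current-driven leaky integrate-and-fire neuron rather than the conductance-driven form with lateral inhibition used in this work, and every name and parameter value in it (`simulate_lif`, `tau`, `v_thresh` and so on) is a hypothetical choice for illustration only.

```python
import numpy as np

def simulate_lif(input_current, dt=1.0, tau=20.0,
                 v_rest=0.0, v_thresh=1.0, v_reset=0.0):
    """Simulate a leaky integrate-and-fire neuron with forward Euler steps.

    input_current: 1-D array with one drive value per time step (arbitrary units).
    Returns (voltages, spike_times); spike times are in the units of dt.
    All default parameter values are illustrative assumptions.
    """
    v = v_rest
    voltages = np.empty(len(input_current))
    spike_times = []
    for step, current in enumerate(input_current):
        # Leak toward the resting potential and integrate the input drive.
        v += (dt / tau) * (-(v - v_rest) + current)
        if v >= v_thresh:
            # Threshold crossed: record a spike and reset the membrane.
            spike_times.append(step * dt)
            v = v_reset
        voltages[step] = v
    return voltages, np.array(spike_times)

# A drive below threshold never fires; a stronger drive fires regularly.
_, weak_spikes = simulate_lif(np.full(1000, 0.5))
_, strong_spikes = simulate_lif(np.full(1000, 2.0))
print(len(weak_spikes), len(strong_spikes))
```

With a constant drive, the stronger input produces a higher spike count over the same interval, which is the property a rate-based scheme relies on to separate input patterns.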
The firing rate is calculated by summing the spikes fired by neuron $i$. The general learning rule for the synapses is a function of $\Delta t = t_{pre} - t_{post}$. Here the goal is to change the weights so that the next postsynaptic spike (after this training iteration) occurs closer to the presynaptic spike. The weights are then changed according to a relaxation rule in which $\eta$ is the learning rate. For excitatory synapses $w_{min} = 0$ and $w_{max} = 1$, whereas for inhibitory synapses $w_{min} = -1$ and $w_{max} = 0$.
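As a rough illustration of these quantities, the sketch below computes a mean firing rate from a list of spike times and applies a soft-bounded weight update that respects the stated limits on the weights. The exponential timing window in `stdp_dw` and the exact form of `relax_weight` are assumptions borrowed from common spike-timing-dependent plasticity formulations, not the equations of this paper, and all function names and default values are hypothetical.

```python
import numpy as np

def firing_rate_hz(spike_times_ms, window_ms):
    """Mean firing rate: spike count divided by the window length in seconds."""
    return len(spike_times_ms) / (window_ms / 1000.0)

def stdp_dw(delta_t_ms, a_plus=1.0, a_minus=1.0, tau_ms=20.0):
    """Assumed exponential timing window, with delta_t = t_pre - t_post.

    Pre-before-post (delta_t <= 0) gives a positive change, post-before-pre
    a negative one. This window shape is an illustrative assumption.
    """
    if delta_t_ms <= 0:
        return a_plus * np.exp(delta_t_ms / tau_ms)
    return -a_minus * np.exp(-delta_t_ms / tau_ms)

def relax_weight(w, dw, eta=0.1, w_min=0.0, w_max=1.0):
    """Soft-bounded update: relax toward w_max when dw > 0, toward w_min otherwise."""
    if dw > 0:
        w_new = w + eta * dw * (w_max - w)
    else:
        w_new = w + eta * dw * (w - w_min)
    # Clipping guards against overshoot when eta * |dw| exceeds 1.
    return float(np.clip(w_new, w_min, w_max))

rate = firing_rate_hz([10.0, 20.0, 30.0], window_ms=1000.0)
w_exc = relax_weight(0.5, stdp_dw(-5.0))                         # excitatory synapse
w_inh = relax_weight(-0.5, stdp_dw(5.0), w_min=-1.0, w_max=0.0)  # inhibitory synapse
```

Because each step is scaled by the remaining distance to the bound, the weight approaches but does not cross the limits of its excitatory or inhibitory range.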
In the classification part, spike sequences are predefined for the different input patterns, and the learning neuron is trained to fire the desired sequence of spikes when the corresponding pattern is presented. A Euclidean distance measure of spike timing was adopted to measure the degree of similarity between the target and observed output spike trains. When the actual output spike train is the same as the desired output spike train, the measure C = 1, and C decreases towards 0 as the spike patterns become less similar.
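One possible way to obtain a score with these properties is sketched below. It assumes the spike trains are binned into fixed-width time bins and that the Euclidean distance $d$ between the binned trains is mapped to a similarity by $C = 1/(1 + d)$. Both the binning and this mapping are assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np

def bin_spikes(spike_times_ms, duration_ms, bin_ms=10.0):
    """Count spikes in fixed-width time bins covering [0, duration_ms]."""
    n_bins = int(np.ceil(duration_ms / bin_ms))
    counts, _ = np.histogram(spike_times_ms, bins=n_bins,
                             range=(0.0, n_bins * bin_ms))
    return counts.astype(float)

def spike_similarity(target_ms, observed_ms, duration_ms, bin_ms=10.0):
    """Similarity in (0, 1]: 1 for identical binned trains, smaller as they diverge."""
    target = bin_spikes(target_ms, duration_ms, bin_ms)
    observed = bin_spikes(observed_ms, duration_ms, bin_ms)
    distance = np.linalg.norm(target - observed)  # Euclidean distance
    return 1.0 / (1.0 + distance)

target_train = [12.0, 48.0, 95.0]
c_same = spike_similarity(target_train, target_train, duration_ms=100.0)
c_shifted = spike_similarity(target_train, [22.0, 58.0, 95.0], duration_ms=100.0)
```

A classifier built on this score would assign an input to the class whose predefined target train yields the highest C.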

A. Dataset
We present a performance evaluation of the SNN model on the WISDM dataset [23]. The dataset consists of six different activities collected in a controlled laboratory environment: walking, jogging, going upstairs, going downstairs, sitting and standing. WISDM data is collected at 20 Hz from both an Android phone and a smartwatch. The dataset is partitioned into 30 frames per sample and preprocessed to remove unnecessary noise before being converted into spike trains.
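The partitioning and spike generation steps can be sketched as follows, assuming non-overlapping 30-sample windows and a simple Bernoulli rate code in which larger normalized amplitudes spike with higher probability. The recording here is synthetic, and the function names, the `max_prob` setting and the min-max normalization are all hypothetical choices rather than the preprocessing used in this work.

```python
import numpy as np

def segment_windows(signal, window=30):
    """Split an (n_samples, n_axes) recording into non-overlapping windows.

    Returns an array of shape (n_windows, window, n_axes); trailing samples
    that do not fill a whole window are dropped.
    """
    signal = np.asarray(signal, dtype=float)
    n_windows = signal.shape[0] // window
    return signal[: n_windows * window].reshape(n_windows, window, signal.shape[1])

def rate_encode(frame, n_steps=200, max_prob=0.5, seed=0):
    """Bernoulli rate code: each value spikes with probability proportional
    to its min-max normalized amplitude (an assumed encoding scheme)."""
    frame = np.asarray(frame, dtype=float)
    lo, hi = frame.min(), frame.max()
    span = hi - lo
    norm = (frame - lo) / span if span > 0 else np.zeros_like(frame)
    prob = (max_prob * norm).ravel()
    rng = np.random.default_rng(seed)
    # Boolean array of shape (n_steps, n_values); True marks a spike.
    return rng.random((n_steps, prob.size)) < prob

# Synthetic stand-in for a tri-axial accelerometer recording.
recording = np.random.default_rng(1).normal(size=(100, 3))
frames = segment_windows(recording)   # shape (3, 30, 3)
spikes = rate_encode(frames[0])       # shape (200, 90)
```

At the stated 20 Hz sampling rate, a 30-sample window of this kind would span 1.5 s of movement.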

B. Experimental Setting
After a few learning epochs, the neurons in the training set can generate the desired output spike trains for every class of activity input pattern. The WISDM dataset is evaluated with 10-fold cross validation. The performance of HAR was primarily evaluated by the percentage of activities recognized correctly. The accuracy for each iteration is computed by comparing the predicted activity class with the ground truth label for each feature vector, and is illustrated in a confusion matrix. Fig. 2 shows the general framework for the SNN model performance evaluation. The WISDM dataset is partitioned into 30 frames per sample before being converted into spike trains. The spike trains are then split into training and test data with a ratio of 80-20. Next, each spike train is taken as an input and converted into a gray scale image. The converted images are transformed into numeric matrices for training in the classification phase. Scores for each data item are then obtained from the network output. Finally, we illustrate the accuracy with a confusion matrix in the evaluation phase.
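The evaluation steps above can be sketched with plain NumPy. The fold generator, the class ordering and the toy labels below are illustrative assumptions, not the actual evaluation code or results of this work.

```python
import numpy as np

# Assumed ordering of the six WISDM activity classes.
ACTIVITIES = ["walking", "jogging", "upstairs", "downstairs", "sitting", "standing"]

def kfold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for shuffled k-fold cross validation."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        train = np.concatenate(folds[:i] + folds[i + 1:])
        yield train, folds[i]

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are ground truth classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm

def accuracy(cm):
    """Fraction of samples on the diagonal, i.e. correctly recognized."""
    return np.trace(cm) / cm.sum()

# Toy labels only: one 'walking' sample is mistaken for 'jogging'.
y_true = [0, 0, 1, 2]
y_pred = [0, 1, 1, 2]
cm = confusion_matrix(y_true, y_pred, n_classes=len(ACTIVITIES))
acc = accuracy(cm)
```

Off-diagonal entries of the matrix show which activities are confused with each other, which the overall accuracy alone does not reveal.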

C. SNN Analysis
We verify the variability of results for this dataset using the proposed SNN model. The results indicate that the information extracted from the input patterns is encoded by the neurons, which generate unique spikes for each class of activity. The distinct mean firing rate of the spikes for each type of activity justifies the relevance of the responding cells for the action recognition task.
The accuracy depends on the selected window length (size interval) and the sampling rate. The following classification results were obtained for the selected size interval of 30 samples per second. This size interval was selected because it is an accepted, typical size interval in previous work.
Based on the results, the accelerometer data produces reasonable performance with the proposed SNN model. These results can be verified on a more extensive dataset of activities and with different size intervals. Overall, the experiment was carried out on a standard CPU and the computational time is only 2 s per 30 samples. The prediction is updated every 3 s. This would be acceptable for real-time activity recognition.
The neuron parameters used here are as follows: 80 ms, 4 and $R_{inh} = 1$, while the parameter in the population encoding is 1.5. In this experiment, the SNN model has one hidden layer with 10 hidden neurons and one output layer with a single output neuron. No synapses are placed between the input and hidden neurons, but 5 synapses are added between the hidden and output neurons.
Fig. 3 and Fig. 4 show samples of the average firing rate for the 'walking' and 'sitting' activities. Along the x-axis, the activity can be classified based on the spike pattern generated over the 0° to 360° range of orientation directions. A generic walking pattern generates a unique spike train distributed along the directions. For the sitting activity, sudden changes in the spikes occur, which indicate the change of movement from a standing to a sitting position. The firing rates for the sitting activity are below 1000 Hz, whereas those for the walking activity fall within 1000-15000 Hz. This indicates that action potentials are triggered more slowly for the sitting activity than for normal walking. Fig. 5 and Fig. 6 differentiate the firing rates for going upstairs and going downstairs. A sudden drop occurs at the early stage, followed by a repeated pattern of firing up and down throughout the movement. This indicates that the motion patterns of individual activities are meaningful, and that firing rate data can carry useful information for detecting the spatio-temporal features of different activities.
The series of firing rates encodes the captured motion into spike trains. The spike trains capture the motion of all accelerations within the time interval of each frame. Fig. 7 and Fig. 8 show examples of spike trains for the walking and sitting activities. The figures show that the spike patterns vary from one activity to another. The spike patterns of the walking activity are somewhat scattered because the movement is non-static, whereas the patterns of the sitting activity are more organized and structured because sitting involves a static position.

D. Spiking Train
At this point, the number of synapses between neurons is selected carefully to avoid reducing the convergence rate and classification accuracy. If more synapses are added to the network, the learning efficiency is affected with no improvement in convergence or classification accuracy. In this paper, we employ the spike train as the input, which is transformed before the Euclidean distance measure is applied. Fig. 9 summarizes the accuracy of the SNN model performance evaluation. The average accuracy is close to 100%, which indicates that each activity has its own unique spike pattern. Some information might be lost for certain activities, which may lead to a lower recognition rate. On the other hand, the SNN model captures spatio-temporal features naturally from the spike train analysis and is able to classify the output class directly. No preprocessing is needed to extract temporal features from the raw accelerometer dataset.

V. CONCLUSION
In this paper, a new method for applying spiking neurons to a pattern recognition task was proposed. The method is based on the firing rates produced by an integrate-and-fire neuron when it is stimulated. The proposed architecture has a low computational cost and can therefore also run on mobile devices in real time. The evaluation will be extended to another smartphone dataset collected in an uncontrolled environment, for which further filtering at the preprocessing stage will be needed to validate the proposed framework. A more extensive dataset will be collected from different users with different accelerometer device calibrations.
Although the proposed SNN model for activity recognition is shown to be promising, a few parameters in the network are determined heuristically. Hence, the final output depends slightly on the dataset and on the parameter settings chosen within the network after a few trials. The proposed SNN framework therefore needs to be verified on more datasets. More powerful search and optimization methods can be applied to tune the parameters of the SNN model automatically and improve its performance. In conclusion, the initial results on the popular WISDM dataset meet our expectations. More comparison with state-of-the-art approaches is needed in future work to evaluate the robustness of the proposed model.

CONFLICT OF INTEREST
The authors declare no conflict of interest.

AUTHOR CONTRIBUTIONS
N. S. Suriani wrote and finalized the whole paper and conducted the research in bio-inspired visual cortex systems, of which this paper presents some experimental results. Fadilla Atyka N. R. helped analyze the data and wrote the results and analysis part of the paper. All authors approved the final version.