Agricultural Machinery Abnormal Trajectory Recognition

The supervision service system for agricultural machinery subsoiling operations collects a large amount of agricultural machinery movement track data. These trajectories include both farmland operation tracks and road driving tracks, which differ in their spatial distribution characteristics and attribute data. In this paper, we study the abnormal trajectory data in the data set and propose an abnormal trajectory recognition algorithm based on DBSCAN clustering. The trajectory is first divided into different types of motion trajectory according to its attribute data, and the spatial distribution of each type of track is then judged. If the attribute data of a track are inconsistent with its spatial distribution, the track is judged to be abnormal. The experimental results show that the accuracy and the recall rate of the algorithm are both 98.61%, which indicates that the algorithm can identify the abnormal tracks of agricultural machinery.


I. INTRODUCTION
Subsoiling breaks up the plough pan, which increases the air permeability of the soil and improves the growth conditions of crop roots without disturbing the structure of the soil layers. To encourage subsoiling operations, the Chinese government has implemented subsidy policies in several provinces. To meet the needs of subsoiling supervision, the National Agricultural Intelligent Equipment Engineering Technology Research Center has developed a supervision service system for agricultural machinery subsoiling operations [1]. The system includes monitoring terminals installed on the agricultural machinery (GPS positioning device, depth sensor, GPRS data transmission module, etc.) and a remote visual management platform for data storage, display and analysis. The supervision service system ensures the integrity, timeliness and validity of operation data, greatly reduces the workload of manual sampling and strengthens supervision of the operation process.
With the rapid development and wide application of wireless networks and global positioning equipment, the volume of trajectory data of moving objects is growing rapidly, for example traffic trajectory data, animal migration data, meteorological airflow data and people flow data. However, research on trajectory data mining at home and abroad mainly focuses on road network extraction and update [2], [3], user trip analysis [4], business district hot area directory [5], urban traffic congestion recognition [6], [7], etc., and less research has been conducted on agricultural machinery trajectory data. The supervision service system described above collects a large amount of agricultural machinery movement track data. Mining these track data in depth and analyzing the information behind them is of great significance for further improving the service level of the platform and for making full use of the data.
In this paper, a method of agricultural machinery abnormal trajectory recognition based on the DBSCAN clustering algorithm is proposed. The method combines the spatial distribution characteristics of the agricultural machinery trajectory with the attribute information of the trajectory points, and the validity and operational efficiency of the algorithm are tested. The purpose of this study is to detect abnormal state data in the movement track of agricultural machinery, to guard against possible equipment failure or cheating by farmers, to ensure that the supervision system calculates the working area correctly, and to provide effective technical support for the correct allocation of the state subsidy for subsoiling [8]-[17].

A. Spatial Distribution Characteristics of Track Data
The movement track of agricultural machinery is a sequence of position points ordered in time and space. Its spatial distribution characteristics objectively reflect the movement state of the agricultural machinery, which is mainly divided into the field operation state, the road driving state and the stop state (parking because of a machinery fault or for normal repair). The monitoring device for subsoiling operation collects the longitude and latitude, machine status, swath width, movement speed, heading, operation depth and other information of the agricultural machinery at a fixed sampling frequency, and transmits the data to the server remotely through the wireless network. As shown in Fig. 1, the historical track of a subsoiling operation reproduces the actual movement process of the agricultural machinery. When the agricultural machinery is working in the field, it usually adopts a straight reciprocating or detour operation mode and moves slowly, so the movement track presents a dense, clustered spatial distribution. When the agricultural machinery transfers between different jobsites, it moves relatively fast, and the movement track presents a sparse and regular linear spatial distribution. When the agricultural machinery is stopped, the track appears as a point cluster with a very small distribution range.

B. Track Data Attribute Fields
Please refer to Table I for the description of main attribute fields of agricultural machinery subsoiling operation track.
Generally, the speed of agricultural machinery during subsoiling operation in the field is not more than 10 km/h, and its road driving speed is not more than 30 km/h.

C. Data Cleaning
The transverse Mercator projection is used to convert the latitude and longitude coordinates of the agricultural machinery track into plane coordinates. To illustrate the data cleaning method, the following definition is given.
Spatiotemporal trajectory: the set of spatiotemporal points $T_p$ that move in a direction according to the time sequence, expressed as
$$T_p = \{\, p_k = (x_k, y_k, t_k, a_k^1, \cdots, a_k^m) \mid k = 1, \cdots, N \,\}$$
where $x_k$ and $y_k$ are the position coordinates of trajectory point $p_k$; $t_k$ is the time; $a_k^1, \cdots, a_k^m$ are the other $m$ attribute values of the track point; and $N$ is the number of points in the spatiotemporal track.
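The trajectory definition above can be mirrored by a small data structure. The following is a minimal sketch, not the authors' implementation; the class name `TrackPoint`, the attribute dictionary and the helper `make_trajectory` are hypothetical names chosen for illustration, and coordinates are assumed to be already projected to the plane.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TrackPoint:
    """One spatiotemporal point: plane position, time, and m further attributes."""
    x: float  # plane coordinate after projection (assumed unit: metres)
    y: float
    t: float  # timestamp (assumed unit: seconds)
    # Hypothetical attribute names, e.g. {"depth": 30.0, "speed": 5.2}
    attrs: Dict[str, Optional[float]] = field(default_factory=dict)


def make_trajectory(points: List[TrackPoint]) -> List[TrackPoint]:
    """Return the points ordered by time, i.e. as a spatiotemporal trajectory."""
    return sorted(points, key=lambda p: p.t)


trajectory = make_trajectory([
    TrackPoint(10.0, 0.0, 2.0, {"depth": 30.0}),
    TrackPoint(0.0, 0.0, 1.0, {"depth": 30.0}),
])
```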
Based on the definition of spatiotemporal trajectory, data preprocessing is carried out for the following aspects to achieve preliminary data cleaning.
Duplicate record (DR) processing: If the recording time of two adjacent track points is the same, only one track data will be kept, and other duplicate data will be deleted.
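Duplicate record processing can be sketched as a single pass over the time-ordered records. This is an illustrative sketch only; the dictionary key `"t"` and the choice to keep the first record of each duplicate run are assumptions, since the text does not say which duplicate is retained.

```python
def drop_duplicate_records(points):
    """Keep one record per run of adjacent points that share the same timestamp.

    `points` is a time-ordered list of dicts with a "t" key (assumed layout).
    The first record of each run is kept (an assumption).
    """
    cleaned = []
    for p in points:
        if cleaned and cleaned[-1]["t"] == p["t"]:
            continue  # same recording time as the previous kept point: duplicate
        cleaned.append(p)
    return cleaned
```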
Data missed (DATM) processing: In this study, the spatiotemporal trajectory is recorded in an isochronous manner with a time interval of $\Delta t$. When the time difference between any two adjacent trajectory points is greater than $\Delta t$, that is, $t_{k+1} - t_k > \Delta t$, data loss is considered to have occurred, where $p_k$ and $p_{k+1}$ are the records before and after the data loss and are used to mark its location. For each data loss, the corresponding location is marked and written to the abnormal data table.
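A minimal sketch of the data-loss check follows. The function name `find_gaps` and the optional `tolerance` parameter (to absorb small timing jitter) are assumptions added for illustration and are not part of the original method.

```python
def find_gaps(times, interval, tolerance=0.0):
    """Return index pairs (k, k+1) whose time difference exceeds the sampling interval.

    `times` is the time-ordered list of timestamps; `tolerance` is a hypothetical
    allowance for jitter (0 reproduces the strict comparison in the text).
    """
    return [
        (k, k + 1)
        for k in range(len(times) - 1)
        if times[k + 1] - times[k] > interval + tolerance
    ]
```

For example, with a 2 s sampling interval, `find_gaps([0, 2, 4, 10, 12], 2)` marks the pair of records on either side of the 6 s gap.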
Attribute data missed (ADATM) processing: Attribute data is considered missing when a key attribute field of a track point $p_k$ in the spatiotemporal track $T_p$, such as the operation depth, is empty:
$$\exists\, i \in \{1, \cdots, m\}: \ a_k^i = \varnothing$$
where $a_k^i$ is a key field to be detected for each record, and $m$ is the number of attributes contained in each record. When attribute data is missing, the corresponding records are removed and written to the abnormal data table.
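The missing-attribute check can be sketched as a filter that separates complete records from incomplete ones. The field names in the usage are hypothetical, and treating `None` as "empty" is an assumption about how missing values are stored.

```python
def split_missing(records, key_fields):
    """Split records into (kept, abnormal) by whether any key field is empty.

    A field counts as empty when it is absent or None (an assumption).
    """
    kept, abnormal = [], []
    for r in records:
        if any(r.get(f) is None for f in key_fields):
            abnormal.append(r)  # would be written to the abnormal data table
        else:
            kept.append(r)
    return kept, abnormal


kept, abnormal = split_missing(
    [{"depth": 30.0, "speed": 5.0}, {"depth": None, "speed": 4.0}, {"speed": 6.0}],
    key_fields=["depth", "speed"],
)
```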
Data drift (DL) processing: Because of signal reception problems in the positioning equipment, the position coordinates of track points may deviate from the real position, resulting in data drift. Because the real position of the track points is unknown, drift data is not easy to find. In this study, data drift is determined from two adjacent track points $p_k$ and $p_{k+1}$ by calculating the motion speed $v(p_k, p_{k+1})$ between them:
$$v(p_k, p_{k+1}) = \frac{d(p_k, p_{k+1})}{interval} \tag{5}$$
where $d(p_k, p_{k+1})$ is the distance between the two adjacent track points, and $interval$ is the time interval. The speed calculated by equation (5) is compared with the preset maximum speed threshold $v_{threshold}$. If
$$v(p_k, p_{k+1}) > v_{threshold}$$
the track point $p_{k+1}$ is considered to have drifted. The track point record is deleted and written to the abnormal data table.
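The drift check can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: points are `(x, y, t)` tuples in metres and seconds, each point is compared with the last point that was kept (the text does not say what happens after a deletion), and the elapsed time between the two timestamps is used as the interval.

```python
import math


def remove_drift(points, v_max):
    """Return (kept, drifted) after removing points that imply an impossible speed.

    points: time-ordered list of (x, y, t) tuples, duplicates already removed.
    v_max:  maximum plausible speed in the same units (e.g. m/s).
    """
    if not points:
        return [], []
    kept, drifted = [points[0]], []
    for p in points[1:]:
        q = kept[-1]  # compare with the last retained point (assumption)
        dt = p[2] - q[2]
        dist = math.hypot(p[0] - q[0], p[1] - q[1])
        speed = dist / dt if dt > 0 else float("inf")
        if speed > v_max:
            drifted.append(p)  # would be written to the abnormal data table
        else:
            kept.append(p)
    return kept, drifted
```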
Stop point (SP) processing: The subset $SP$ of the spatiotemporal trajectory $T_p$ whose points are distributed in a small space reflects the stopping characteristics of the agricultural machinery, and is expressed as:
$$SP = \{\, p_s, \cdots, p_e \mid d(p_s, p_e) < d_{threshold},\ t(p_s, p_e) > t_{threshold} \,\}$$
where $d(p_s, p_e)$ is the distance between the start point and the end point of the stop track segment; $t(p_s, p_e)$ is the duration of the stop track segment, that is, the time difference between its end point and its start point; $t_{threshold}$ is a preset time threshold; and $d_{threshold}$ is a preset distance threshold. In this study, the stop track segment is determined by the speed attribute of the track points and the average speed of multiple adjacent track points. The average velocity $\bar{v}$ of the $n$ adjacent track points is:
$$\bar{v} = \frac{1}{n} \sum_{i=1}^{n} v(p_i)$$
The velocity $v(p_k)$ at the track point and the average velocity $\bar{v}$ are each compared with the minimum velocity threshold $v_{threshold}$. If both are below the threshold, the points are determined to be a stop track segment, and the stop track segment is split off and eliminated.
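A minimal sketch of the speed-based stop check follows. The centred window of `window` points used for the average, and the rule that a point is a stop point only when both its own speed and the window average fall below the threshold, are assumptions made for illustration; the text does not specify the window.

```python
def remove_stop_points(speeds, v_min, window=3):
    """Return the indices of points kept after dropping stop points.

    speeds: per-point speed attribute, in time order.
    v_min:  minimum speed threshold.
    window: number of adjacent points averaged around each point (assumption).
    """
    n = len(speeds)
    half = window // 2
    kept = []
    for k in range(n):
        lo, hi = max(0, k - half), min(n, k + half + 1)
        v_avg = sum(speeds[lo:hi]) / (hi - lo)
        is_stop = speeds[k] < v_min and v_avg < v_min
        if not is_stop:
            kept.append(k)
    return kept
```

Requiring both conditions means that a single slow reading next to fast neighbours is not treated as a stop, which is one plausible reason for comparing the average as well as the point speed.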

A. What Is the Abnormal Track of Agricultural Machinery
After data preprocessing, the movement track of agricultural machinery can be divided into two state tracks: the field operation track and the road driving track. In the normal state, each has its own spatial distribution characteristics and attribute field values. The spatial distribution of the field operation track is a high-density point cluster; the sensor status and implement status fields are both 1, and the operation depth and speed fields are not 0. The spatial distribution of the road driving track is linear; the sensor status and implement status fields are 0, the operation depth is 0, and the speed field value is the travel speed of the agricultural machinery on the road.
Abnormal trajectories are defined relative to the normal-state trajectories above. When the spatial distribution of a track segment in the operation track or road track is inconsistent with its attribute field values, the segment is determined to be abnormal track data. There are many kinds of abnormal trajectories. For example, track points whose attribute fields indicate the road driving state may be mixed into an operation track point cluster, or track points that present the road driving spatial distribution may carry operation depth data. At present, grid clustering algorithms are often used to process track point data. Their clustering effect is good, but they are sensitive to the grid size and density threshold parameters, and the algorithms are complicated and require considerable time and storage. Based on the DBSCAN clustering algorithm, this paper proposes a method for identifying abnormal tracks of agricultural machinery. Firstly, the operation depth of the track data and the running state of the machine are used to segment the tracks of agricultural machinery into different types of tracks. Then, the clustering algorithm is used to distinguish the spatial distribution of the different types of tracks. If the operation depth attribute of the track data is inconsistent with the spatial distribution state, the tracks are judged to be abnormal.
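The attribute-based segmentation step can be sketched by grouping consecutive points that share the same attribute label. The field names `depth` and `implement_status` and the exact labelling rule are assumptions based on the normal-state description above, not the authors' code.

```python
from itertools import groupby


def attribute_label(record):
    """Label a point from its attributes: operating if the implement is engaged."""
    if record["depth"] > 0 and record["implement_status"] == 1:
        return "field"
    return "road"


def segment_by_attributes(records):
    """Split a cleaned, time-ordered track into runs of equally labelled points."""
    return [(label, list(run)) for label, run in groupby(records, key=attribute_label)]


records = [
    {"depth": 0, "implement_status": 0},
    {"depth": 0, "implement_status": 0},
    {"depth": 30, "implement_status": 1},
    {"depth": 30, "implement_status": 1},
    {"depth": 0, "implement_status": 0},
]
segments = segment_by_attributes(records)
```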

B. Recognition Algorithm of Abnormal Track of Agricultural Machinery
DBSCAN (density-based spatial clustering of applications with noise) is a density-based clustering algorithm that can identify clusters of arbitrary shape and is not sensitive to noise. The algorithm groups regions of high density into clusters: closely connected samples are assigned to the same cluster, and the final set of clusters is obtained in this way. Based on the DBSCAN clustering algorithm, the following parameters are defined to describe the compactness of the sample distribution in a neighborhood.
1) ε neighborhood: for the spatiotemporal trajectory $T_p$, the ε neighborhood of each trajectory point $p_i$ contains the trajectory points that satisfy the following formula:
$$N(p_i) = \left|\{\, p_j \in T_p \mid d(p_i, p_j) \le \varepsilon \,\}\right|$$
where $N(p_i)$ is the number of adjacent track points in the neighborhood of $p_i$, $d(p_i, p_j)$ is the distance between $p_i$ and the neighboring point $p_j$, and ε is the distance threshold of the neighborhood.
2) Core track point: a track point $p_i$ for which at least MinPts points are included within the radius ε, that is:
$$N(p_i) \ge MinPts$$
where MinPts is the threshold for the minimum number of trajectory points that must be included in the ε neighborhood. A track point satisfying the above formula is called a core track point.
International Journal of Machine Learning and Computing, Vol. 11, No. 4, July 2021
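The two definitions above can be sketched with a brute-force neighbor count. This is an illustrative sketch only: it uses Euclidean distance on plane coordinates, counts the point itself as part of its own neighborhood (as common DBSCAN implementations do, but an assumption here), and runs in quadratic time.

```python
import math


def neighbor_counts(points, eps):
    """For each point, count the points of the segment within distance eps.

    points: list of (x, y) plane coordinates. The point itself is counted.
    """
    return [sum(1 for q in points if math.dist(p, q) <= eps) for p in points]


def core_point_flags(points, eps, min_pts):
    """True for every point whose eps-neighborhood holds at least min_pts points."""
    return [count >= min_pts for count in neighbor_counts(points, eps)]
```

For a dense cluster such as `[(0, 0), (1, 0), (0, 1), (1, 1)]` with `eps=6` and `min_pts=4`, every point is a core point, whereas widely spaced points along a line yield none.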
The basic idea of the abnormal trajectory recognition algorithm for agricultural machinery is as follows. For the preprocessed spatiotemporal trajectory data set, each trajectory point in each track segment is taken as a center, and a circular region is drawn with the distance threshold ε as the radius. The values of ε and MinPts are determined according to the speed and track distribution characteristics of agricultural machinery. Counting the number of track points within the neighborhood of each track point gives different clustering results for road driving tracks and field operation tracks. The density of field operation track points is high, so there are more core track points; the road driving track is linear, so the number of core track points is small.
The core track points, that is, the points whose neighborhood reaches the minimum number threshold MinPts, are counted, and the distribution density of core track points is compared with a threshold:
$$\frac{N_{core}}{N} < \rho_{threshold}$$
where $N_{core}$ is the total number of core track points, $N$ is the total number of track points included in the track segment, and $\rho_{threshold}$ is the distribution density threshold of core track points. If the distribution density of the core track points is lower than $\rho_{threshold}$, the segment is determined to be a road driving track; otherwise it is determined to be an operation track. The algorithm flow of agricultural machinery abnormal trajectory recognition is shown in Fig. 2, and the specific implementation steps are described as follows:
Step 1: at the beginning of the algorithm, the track database is preprocessed to form a clean track data set, and the track is divided into "road track" and "field operation track" according to the attribute fields of the subsoiling operation, such as operation depth and machine status;
Step 2: set the algorithm parameter thresholds (ε, MinPts);
Step 3: traverse each track segment in the track data set, and calculate the number of points contained in the ε neighborhood of every track point in the segment. If this number is greater than or equal to MinPts, the point is a core track point. Mark the track point as visited, and continue to the next track point;
Step 4: after traversing all track points in the track segment, calculate the ratio of the total number of core track points to the total number of track points in the segment, and judge whether it is less than the set proportion threshold $\rho_{threshold}$. If it is, the segment is a road track and the algorithm proceeds to step 5; otherwise it is a field operation track and the algorithm goes to step 6;
Step 5: judge the attribute data of the road track. If a track point has an abnormal depth attribute, that is, Deep > 0, the segment is judged to be an abnormal agricultural machinery movement track; otherwise it is judged to be a normal agricultural machinery movement track. Go to step 7;
Step 6: judge the attribute data of the field operation track. If a track point has an abnormal depth attribute, that is, Deep = 0, the segment is judged to be an abnormal agricultural machinery movement track; otherwise it is judged to be a normal agricultural machinery movement track. Proceed to step 7;
Step 7: judge whether all tracks in the track data set have been identified. If yes, proceed to step 8; otherwise go to step 3;
Step 8: end of the algorithm.
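Steps 3 to 6 can be sketched for a single track segment as follows. This is a sketch under stated assumptions, not the authors' Java implementation: the function names are hypothetical, neighbors are counted by brute force with the point itself included, and a segment is flagged as abnormal when any one of its points carries the inconsistent depth value. The default parameters (6, 4) and 0.6 are the values reported in the experiments.

```python
import math


def core_ratio(points, eps, min_pts):
    """Share of points in the segment that are core points."""
    core = sum(
        1
        for p in points
        if sum(1 for q in points if math.dist(p, q) <= eps) >= min_pts
    )
    return core / len(points)


def classify_segment(points, depths, eps=6.0, min_pts=4, ratio_threshold=0.6):
    """Return (spatial_type, is_abnormal) for one non-empty track segment.

    points: list of (x, y) plane coordinates; depths: operation depth per point.
    """
    if core_ratio(points, eps, min_pts) < ratio_threshold:
        # Sparse, linear distribution: road track; any depth > 0 is inconsistent.
        return "road", any(d > 0 for d in depths)
    # Dense distribution: field operation track; depth == 0 is inconsistent.
    return "field", any(d == 0 for d in depths)


# Synthetic examples (not data from the paper).
field_pts = [(2.0 * i, 2.0 * j) for i in range(5) for j in range(5)]
road_pts = [(10.0 * i, 0.0) for i in range(10)]
field_result = classify_segment(field_pts, [30.0] * len(field_pts))
road_result = classify_segment(road_pts, [30.0] * len(road_pts))
```

In this synthetic case the dense grid with nonzero depth is a normal field operation track, while the sparse line that still reports a depth is flagged as abnormal.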

C. Algorithm Evaluation
In the evaluation of the abnormal track recognition algorithm in this paper, the road driving track is defined as the "positive example" and the field operation track is defined as the "negative example". In the results of abnormal trajectory identification of agricultural machinery, the following four situations may occur:
1) A road driving track is correctly identified as a road driving track, represented by TP;
2) A road driving track is misidentified as a field operation track, represented by FN;
3) A field operation track is correctly identified as a field operation track, represented by TN;
4) A field operation track is misidentified as a road driving track, represented by FP.
For the above recognition results, the accuracy and recall rate can be used to evaluate the recognition effectiveness. The accuracy is calculated as follows:
$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN}$$
Generally speaking, the higher the accuracy, the better the classifier. The recall rate is a measure of coverage, which measures how many of the actual positive cases are identified as positive. The recall rate is calculated as follows:
$$Recall = \frac{TP}{TP + FN}$$
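The two measures can be computed directly from the four counts. The sketch below uses the standard definitions of accuracy and recall; the counts in the usage lines are made-up illustrative numbers, not results from this paper.

```python
def accuracy(tp, tn, fp, fn):
    """Share of all tracks that were identified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def recall(tp, fn):
    """Share of actual positive (road driving) tracks identified as positive."""
    return tp / (tp + fn)


# Illustrative counts only.
example_accuracy = accuracy(tp=8, tn=9, fp=1, fn=2)
example_recall = recall(tp=8, fn=2)
```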

A. Test and Result Analysis
1) Test environment and test data
The test uses a computer with the Windows 10 operating system, a 3.20 GHz CPU, 8.00 GB of memory and 930 GB of disk space. The software environment for developing the data processing program includes MyEclipse 8.5.0, Navicat 8.0 and a MySQL database, and the program is implemented in Java.
In the experiment, part of the movement track data of 9 subsoiling machines in six counties of Yongzhou City, Hunan Province in 2017 was selected as the sample data set to build the test database. After data preprocessing, a clean data set with 164 track segments is obtained.

2) Data processing and analysis
A recognition test is carried out on the sample data set by using the above abnormal trajectory recognition algorithm.
According to the space-time distribution characteristics of the agricultural machinery movement track and the average movement speed of the track points, the neighborhood radius and the minimum number of points in the neighborhood, (ε, MinPts), are set to (6, 4). Based on experience, the proportion threshold $\rho_{threshold}$ is set to 0.6, and the 164 track segments in the sample data set are identified. To verify the correctness of the recognition results, the agricultural machinery track points are overlaid on remote sensing satellite map data for visual interpretation. The track shown in Fig. 3(a) has a highly concentrated and dense spatial distribution. The track points in the figure are marked green, indicating that the operation depth reaches the standard depth; the spatial distribution is consistent with the depth attribute, so the track is determined to be a normal field operation track. The track shown in Fig. 3(b) has the loose, linear spatial distribution of a road travel track, but its track points carry qualified operation depth values, so the track is determined to be abnormal. This visual interpretation method is used to verify the track recognition results and thereby test the effectiveness of the algorithm. In this experiment, the algorithm identifies eight abnormal tracks of agricultural machinery, all of which are confirmed to be abnormal by manual checking. Table II gives the basic information and the abnormal track identification results for the sample data set. The table shows that the mean accuracy of the abnormal trajectory recognition algorithm is 98.61% and the mean recall rate is 98.61%.
The test results show that the algorithm can effectively identify the abnormal trajectory of agricultural machinery.
To analyze the impact of different (ε, MinPts) thresholds on the trajectory recognition results, two groups of comparative tests are carried out, and the test results are shown in Table III. (1) Parameter ε threshold test: with MinPts = 4, the neighborhood distance threshold ε is set to 4, 6 and 8 in turn. With the thresholds (8,4), the accuracy of trajectory recognition decreases; (4,4) and (6,4) give the same accuracy, but with (6,4) the classification interval of trajectory recognition is the largest. (2) Parameter MinPts threshold test: with ε = 6, the minimum number of track points that must be included in the ε neighborhood, MinPts, is set to 3, 4 and 5 in turn. With the thresholds (6,3), the accuracy of track recognition decreases, and (6,4) and (6,5) give the same accuracy. However, for the same track, as MinPts increases, the number of core track points decreases and the proportion of core track points shows a decreasing trend, which affects the determination of the distribution density. The results of the two groups of comparative experiments show that the thresholds (6,4) give the better recognition results.
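The effect of MinPts on the proportion of core track points can be illustrated with a small parameter sweep. The sketch below uses synthetic points rather than the paper's data, a brute-force neighbor count that includes the point itself (an assumption), and hypothetical function names.

```python
import math


def core_ratio(points, eps, min_pts):
    """Share of points whose eps-neighborhood holds at least min_pts points."""
    core = sum(
        1
        for p in points
        if sum(1 for q in points if math.dist(p, q) <= eps) >= min_pts
    )
    return core / len(points)


def sweep(points, settings):
    """Map each (eps, min_pts) setting to the resulting core-point proportion."""
    return {s: core_ratio(points, *s) for s in settings}


# Synthetic track: 14 points spaced 3 units apart along a line.
points = [(float(x), 0.0) for x in range(0, 40, 3)]
ratios = sweep(points, [(6, 3), (6, 4), (6, 5)])
```

For a fixed ε, raising MinPts can only remove core points, so the proportion never increases; this matches the decreasing trend described above.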

V. CONCLUSION
This paper studies the abnormal track data in the massive agricultural machinery movement track data set obtained by the supervision service system for agricultural machinery subsoiling operations.
Based on an analysis of the space-time distribution characteristics of the agricultural machinery movement track, an abnormal track recognition algorithm is proposed. The operation depth of the track data and the running state of the implement are used to segment the tracks of the agricultural machinery into different kinds of tracks. Then an extended DBSCAN clustering algorithm is used to distinguish the spatial distribution of the different types of tracks. If the spatial distribution of the track data is inconsistent with the depth attribute, the tracks are judged to be abnormal.
A sample database of 164 track segments is constructed, and a verification experiment of the abnormal track recognition algorithm is conducted. The recognition effectiveness of the algorithm is verified through visual comparison of the agricultural machinery trajectory data using ArcGIS. The experimental results show that the accuracy and the recall rate of the algorithm are both 98.61%. In addition, the effects of different threshold conditions on the accuracy of abnormal trajectory recognition are compared.