Enhanced Numeral Recognition for Handwritten Multi-language Numerals Using Fuzzy Set-Based Decision Mechanism

Handwritten character and numeral recognition has gained interest in the research community as part of the broader field of machine learning. Writer-independent recognition systems are still a work in progress, and research is geared towards an optimized technique that can achieve this. In this paper, we propose a numeral recognition system that forms fuzzy sets of the features extracted using modified structural features for English, Arabic, Persian, and Devanagari numerals. The structural features extract the geometrical primitives that distinguish each image. After the feature extraction phase, the results are input into a classifier; we test two different classifiers, namely Neural Network and Naïve Bayes. To further enhance the recognition process with low overhead, the erroneously recognized numerals (confusion matrix) are processed through the fuzzy set-based decision mechanism. Results indicate that recognition is enhanced by applying the fuzzy set-based decision mechanism for both classifiers.


I. INTRODUCTION
Online and offline character and numeral recognition is a topic of significant importance in the field of image recognition. The applications for online and offline recognition vary with the end-product use. The difference between the two is well known: online recognition deals with a spatio-temporal representation, while offline recognition deals with a spatio-luminance representation. Although online recognition is somewhat more challenging than offline recognition, most systems proposed in previous literature are tested on offline datasets even if the system is proposed for an online setting. Dealing with Arabic and Persian digits in particular is different in some sense due to the absence of a comprehensive benchmark for these languages [1].
In this paper, we present an enhanced English, Arabic, Persian, and Devanagari numeral recognition system based on structural feature extraction and a fuzzy set-based decision mechanism (Trapezoidal Fuzzy sets). The proposed system is put through a rigorous test on more than 190,000 handwritten numerals that exist in four different benchmark datasets. The output scores of the classifiers are quantified using fuzzy membership functions. The parameters of the membership functions are optimized using particle swarm optimization. Results indicate that the use of fuzzy logic after classification results in a better recognition rate and reduces the percentage of numerals erroneously recognized.
The rest of the paper is organized as follows: Section II highlights relevant literature; Section III details the methodology used for numeral recognition and the fuzzy set-based decision mechanism (Trapezoidal Fuzzy sets) used to enhance the recognition process; Section IV lists and describes the datasets used in this paper; Section V details the results obtained prior to and after using the fuzzy set-based decision mechanism; and Section VI concludes the paper and sets the direction for future work.

II. LITERATURE REVIEW
Numeral recognition algorithms have been proposed in various published research. To date, an optimized numeral recognition algorithm has still not been found. Previous work has concentrated on both online and offline digit recognition; however, in most cases even those algorithms proposed for online recognition are tested on offline datasets. In [2], the authors proposed a generic recognition algorithm for Arabic digits with the three main phases of pre-processing, feature extraction, and classification. Their method was tested on a small dataset consisting of 100 samples from 100 users.
Due to the size of the dataset, a recognition rate of 98% was reported. In [3], a Bayesian fusion process was applied on certain parts of the confused characters to improve the classification rate; the obtained classification rate was 97.95%. In [4], the authors proposed a technique for Arabic digit recognition that separates the zero digit from the digits 1-9. The algorithm, which utilizes both spatial and temporal information, achieves an average recognition rate of 98.73%. The contributions of the authors also include the introduction of a training set of 30,000 samples. In [5], feature extraction using global features was proposed. These include global features extracted from the Discrete Cosine Transform (DCT), Discrete Fourier Transform (DFT), and Wavelet Transform (WT).
Local features were extracted from other sources such as the nth derivative, two-dimensional coordinates, angular features, and curvature. Experimental tests were done on global features, local features, and combined features. An average recognition accuracy of more than 95% was achieved using the combined features. In [6], Persian characters were the target of a recognition algorithm that divided the letters into 18 groups based on the structure and shape of the body of the character. The authors proposed the use of a Hidden Markov Model (HMM) for the recognition of main body variables, while delayed strokes with their own HMM were used for the final recognition. The proposed method achieved a recognition accuracy of 94.2%-95.9% for different groups. In [7], a comparative study was done between Multilayer Perceptron (MLP) and K-Nearest Neighbors (KNN) for the recognition of handwritten cursive Arabic numerals. Feature extraction utilized the morphological mathematical model. In this study, MLP produced better recognition rates compared with KNN. Another comparative study was done in [8] between SVM and HMM on the recognition of Arabic numerals. SVM prevailed over HMM in this study.
A Persian-Arabic numeral recognition algorithm was proposed in [9] based on a robust feature set and the use of KNN classification. The contour form of each digit was obtained, and three features were extracted using block-based methods. The method achieved a recognition accuracy between 99.82% and 99.90%. In [10], a new method was proposed for Persian-Arabic handwritten digit recognition. The feature set in the feature extraction phase was formed by combining four Histogram of Oriented Gradient (HOG) and directional Chain Code Histogram (CCH). Features were extracted at two levels, yielding a higher recognition rate. Their feature set had 164 dimensions. A radial basis function kernel SVM was utilized during classification, and an average recognition rate of 99.31% was achieved on test datasets. In [11], a Convolutional Neural Network (CNN) was proposed for Arabic numeral recognition, in addition to introducing a new dataset with 45,000 samples. This produced an average recognition rate of 95.7%. A set of 115 features for the recognition of Persian handwritten numerals was proposed in [12]. An ensemble classifier was used in the classification phase, achieving an average recognition rate of 95.28%. In [13], the authors proposed a feature set calculated from the vertical and horizontal directions of the image along with the Freeman chain code histogram (CCH).
SVM was used in the classification phase. The results indicated that recognition was enhanced using this method. Deep learning algorithms have also been proposed for the recognition of Arabic and Persian numerals. In [14], a Convolutional Neural Network (CNN) based on the LeNet-5 architecture was designed for the recognition of handwritten Arabic digits. The MADBase dataset was used for testing, and results showed significant improvement over different classification algorithms. In [15], the authors developed a CNN for the recognition of multi-language numerals in the following languages: English, Arabic, Persian, Urdu, and Devanagari. The overall accuracy on the combined dataset was 99.26% with a precision of 99.29%.
Fuzzy and fuzzy-based systems were also proposed for the recognition of numerals. In [16], the authors proposed a fuzzy-based technique for the recognition of Arabic numerals. Directional features were used for automatically generating the fuzzy models, in addition to automatic weight generation for the different segments. Automatic estimation of the fuzzy intervals using the training dataset was also included in their proposed algorithm. Classification was then done in two phases: the first phase performs zero/nonzero classification using features of the numerals and SVM as the classifier; the second phase classifies the numbers 1-9 using segment directional and histogram features and a syntactic fuzzy classifier. The first phase (zero/nonzero) produced an average accuracy of 99.55%, and the second phase (numbers 1-9) produced an average accuracy of 98.01%.
In Ref. [17], the authors present a technique for handwritten Hindi and English digit recognition using their exponential membership functions as a base for a fuzzy model. The recognition process consists of modifying the membership function fitted to the fuzzy sets, which are derived from the features calculated from the normalized distances using the box approach. The overall average recognition accuracy was found to be 98.4% for English digits and 95% for Hindi digits. In [18], the authors propose a digit recognition method based on a simplified structural classification through the use of a small set of primitive types along with fuzzy memberships. The algorithm extracts five types of primitive segments from each image based on three types of feature points.
An estimate of the likelihood of these primitives being close to the vertical boundaries of the image is calculated using the membership function. The classifier in their algorithm uses the primitives, extracted feature points and fuzzy memberships to classify the digits. Using the NIST dataset for testing, they achieved a recognition rate of 87.33%-88.72%. In [19], another algorithm is proposed that forms the fuzzy sets from the extracted features.
The modified exponential membership function of type-2 is used to represent the input fuzzy sets. Fuzzy measure theory is used to manage the interaction between the input fuzzy sets. The algorithm was tested on the recognition of English and Devanagari digits as well as English characters. In all cases, the algorithm achieved improved recognition over other methods. In [20], the recognition of Hindi handwritten digits is the target of the study. This algorithm is also based on the modified exponential membership function that is fitted to fuzzy sets. This is derived from the extracted features containing the normalized distances based on the box approach. The foraging model of E. coli bacteria is used for optimization. The algorithm uses two window sizes: one for zero and one for the rest of the digits. Tested on a small dataset, the average recognition rate was 96%.
In Ref. [21], during segmentation, the digits are segmented into primitives that are measured and labeled using fuzzy logic. The labeled primitives assist in forming strings of numerals. Conventional string matching along with the membership value of strings was used for classification and recognition. In [22], the target is the recognition of Odia handwritten digits using HMM. Feature extraction is done using polygonal approximation of the object contour. A fuzzy inference system is then used to generate aggregated feature vectors for each digit. The feature vector for each digit is then split into three levels, which are used as three different states for the HMM. The end result is ten three-state HMMs corresponding to the ten digit classes. Log-likelihood is then used to decide the class label. The overall accuracy achieved was 96.3%. In [23], a feature selection technique based on Axiomatic Fuzzy Sets (AFS) was proposed.
The technique allows the most powerful subset of the feature set to be selected based on the class separability power of combined features. Their method is tested on handwritten numerals in five languages, namely Arabic, Telugu, Roman, Devanagari, and Bangla. An SVM classifier was used in the study. Higher recognition rates with fewer features were reported. In [24], the authors presented two hybrid classification techniques for handwritten Arabic numerals. They proposed a serial hybrid classifier consisting of a first-level Fuzzy C-Means classifier followed by SVM and then by a pixel method classifier. They also proposed a parallel hybrid classifier that fuses the decisions of Fuzzy C-Means and a Neural Network in parallel to make the final decision. Hybrid classifiers were reported to outperform single classifiers. In [25], a CNN was combined with a fuzzy logic module based on a structural approach. The addition of fuzzy logic increased the recognition rate of handwritten numerals. In [26], a Quantum Neural Network (QNN), which combines fuzzy theoretical principles and neural modelling, is used for numeral recognition. An average recognition accuracy of 99.1% was reported.
Due to the noise accumulated from all recognition steps, some numerals will be confused with other numerals in the classifier's output. At this stage, conventional systems perform no further processing. We propose a method that can be plugged into the output of the classifier and that helps decrease the confusion among numerals that have close scores at the classifier level. Instead of using a binary threshold to make the final decision, we propose a fuzzy threshold using a trapezoidal membership function in the cases where the two highest scores are close to each other. Particle Swarm Optimization (PSO) was used to minimize the performance index (the number of confused numerals).

III. MATERIALS AND METHODS
The proposed methodology and system consist of four distinct phases: phase 1 is preprocessing; phase 2 is structural feature extraction; phase 3 is classification; and phase 4 enhances the classification by processing the misclassified digits using fuzzy logic. The details of the four phases of the proposed system are as follows.

A. Preprocessing
English, Persian, Arabic, and Devanagari digits were specifically chosen for this study due to the structural similarity between digits in these languages. The representation of the numerals 0-9 in these languages is shown in Table I.
The challenge in digit recognition is the variety of handwriting styles. For example, 52 writing classes have been identified for Persian and Arabic numerals [1]. As seen in Table I, some digits have similarities that make the recognition process more complex. Samples of handwritten numbers in English, Arabic, Persian, and Devanagari are shown in Fig. 1. It is clearly seen from the figure that, although the numerals in these languages were chosen for their structural similarity, this similarity makes the recognition process more difficult, both between numbers in the same language and between the different languages.
Preprocessing is a crucial phase in image recognition techniques. It is essential to make the images uniform in terms of size, location, and angle, in addition to removing noise from the images. In the proposed system, images are subject to the following preprocessing steps:
1. Grayscale-to-binary conversion through Otsu's thresholding method [27].
2. Noise removal using a 3 × 3 window of disk-shaped structure. Dilation and erosion (morphological operations) are applied for noise removal.
3. Bounding box applied to each image for segmentation and separation.
4. Centering the segmented numeral based on region centroids.
5. Normalization and converting the image to size 36×36.
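Part of the preprocessing chain above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function names are hypothetical, ink is assumed to be darker than the background, nearest-neighbor sampling is assumed for the resize, and the morphological noise removal and centroid-based centering steps are omitted.

```python
def otsu_threshold(pixels):
    """Return the gray level (0-255) that maximizes the between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(256):
        w_bg += hist[t]                 # number of pixels with level <= t
        if w_bg == 0:
            continue
        w_fg = total - w_bg
        if w_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(gray, ink_is_dark=True):
    """Map a grayscale image (list of rows) to 1 = ink, 0 = background."""
    t = otsu_threshold([p for row in gray for p in row])
    return [[1 if (p <= t) == ink_is_dark else 0 for p in row] for row in gray]


def crop_to_bounding_box(binary):
    """Keep only the rows and columns spanned by ink pixels."""
    rows = [r for r, row in enumerate(binary) if any(row)]
    if not rows:
        return binary
    cols = [c for c in range(len(binary[0])) if any(row[c] for row in binary)]
    return [row[cols[0]:cols[-1] + 1] for row in binary[rows[0]:rows[-1] + 1]]


def resize_nearest(binary, size=36):
    """Nearest-neighbor resize to size x size (an assumed interpolation)."""
    n_rows, n_cols = len(binary), len(binary[0])
    return [[binary[r * n_rows // size][c * n_cols // size]
             for c in range(size)] for r in range(size)]
```

A typical call chain would be `resize_nearest(crop_to_bounding_box(binarize(gray)))`, where `gray` is a list of rows of integer gray levels.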

B. Feature Extraction
As mentioned in the literature review, many feature extraction methods have been proposed. In our proposed system, we use a structural feature extraction method similar to those proposed in [1]. The feature extraction phase is the most critical phase, as it sets the basis of how digits differ from each other in terms of the measurements of the structural features chosen for this method. The similarity between the digits requires concentrating on local structural features rather than on the global features that could be extracted by applying histogram-based algorithms, DFT, and DCT. In this work, we propose a total of 37 structural features, which are extracted based on the calculations shown below. The extracted local features are calculated as follows, taking into account that the images input to the feature extraction phase are the preprocessed images.
1. In the image, the starting black pixel and ending black pixel are obtained horizontally over three locations (top, center, and bottom), as shown in Fig. 2(a). The same method is used for the vertical calculation of the starting black pixel and ending black pixel, as shown in Fig. 2(b). The total number of features extracted in this step is 12.
2. Another 3 horizontal and 3 vertical features are extracted by measuring the distance between the starting and ending black pixels, as shown in Fig. 2(a) and Fig. 2(b), respectively. Equations 1 and 2 are used for the measurement of these features.
$$H_{distance}(i) = H_{last}(i) - H_{first}(i) \quad (1)$$
$$V_{distance}(i) = V_{last}(i) - V_{first}(i) \quad (2)$$
where $H_{distance}$ and $V_{distance}$ are the horizontal distance and vertical distance for the defined axis vector $i$, $H_{first}$ and $H_{last}$ are the horizontal first and last appearance of a black pixel, respectively, and $V_{first}$ and $V_{last}$ are the vertical first and last appearance of a black pixel, respectively.
3. The aspect ratio of height distance to width distance (HW-Aspect) is also used as an additional feature, with Equation 3 used to calculate the ratio. This gives an extra 3 features for the height distance to width distance ratio (H-W_Ratio), measured using the horizontal distance and vertical distance as shown in Fig. 2(c).
$$HW\_Ratio(i) = \frac{V_{distance}(i)}{H_{distance}(i)} \quad (3)$$
For the per-block features, $S_i$ is the sum of the black pixels of segment $i$, and $h_i$ and $w_i$ are the segment height and segment width, respectively, for each block.
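The first three groups of structural features (start and end positions, distances, and height-to-width ratios) can be sketched as below. This is an illustrative sketch only: the function names are hypothetical, the three scan lines are assumed to lie at one quarter, one half, and three quarters of the image size, and a line with no black pixel is assumed to yield zeros. It covers 21 of the 37 features; the per-block features are not included.

```python
def first_last(line):
    """Indices of the first and last black (1) pixel, or (0, 0) if none."""
    idx = [i for i, v in enumerate(line) if v == 1]
    return (idx[0], idx[-1]) if idx else (0, 0)


def structural_features(binary):
    """Return 21 features: 12 positions, 6 distances, 3 height/width ratios."""
    n_rows, n_cols = len(binary), len(binary[0])
    # Assumed scan-line positions (top, center, bottom / left, center, right).
    scan_rows = [n_rows // 4, n_rows // 2, (3 * n_rows) // 4]
    scan_cols = [n_cols // 4, n_cols // 2, (3 * n_cols) // 4]

    feats, h_dist, v_dist = [], [], []
    for r in scan_rows:                       # horizontal scans
        first, last = first_last(binary[r])
        feats += [first, last]
        h_dist.append(last - first)           # horizontal distance
    for c in scan_cols:                       # vertical scans
        column = [binary[r][c] for r in range(n_rows)]
        first, last = first_last(column)
        feats += [first, last]
        v_dist.append(last - first)           # vertical distance
    feats += h_dist + v_dist
    # Height-to-width ratio; 0.0 is an assumed fallback for zero width.
    feats += [v / h if h else 0.0 for h, v in zip(h_dist, v_dist)]
    return feats
```

For a 36×36 preprocessed image the scan lines fall at rows and columns 9, 18, and 27 under these assumptions.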

C. Classification
In the classification phase, two classifiers were chosen for the proposed system: Naïve Bayes and Neural Network. The following is a brief description of each classifier.

Naïve Bayes
The basis for the Naïve Bayes classifier is Bayes' theorem for a probabilistic statistical model. The conditional probability of assigning a data point to a certain class is calculated using both the conditional and unconditional probabilities of the data belonging to each class in the whole dataset. The Naïve Bayes classifier has been extensively documented in previous literature [28]-[30].
Thus, the probability (likelihood function) values $P(x_i \mid C_k)$ are calculated as an estimate from the training samples of the numeral image set, with attribute $x_i$ represented by the value $s_{ik}$, using Equation 5.
$$P(x_i \mid C_k) = \frac{s_{ik}}{n_k} \quad (5)$$
The sum of the training samples for attribute $x_i$ in class $k$ is specified by $s_{ik}$, while $n_k$ is the number of training samples in class $k$. For simplicity, $P$ is modeled as a normal density function for each attribute. The Naïve Bayes classifier finds the class $k$ with the maximum probability, as specified by Equation 6.
$$\hat{k} = \arg\max_{k} \; P(C_k) \prod_{i} P(x_i \mid C_k) \quad (6)$$
In Naïve Bayes classification, the data are divided into training and testing sets. The training set is treated as prior information, and the model is constructed on the basis of the training set.
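The decision rule described above, with each attribute modeled by a normal density, can be sketched as follows. This is a generic Gaussian Naïve Bayes illustration rather than the exact implementation used in this work; the function names and the small variance floor `1e-6` are assumptions, and log-probabilities are used to avoid numerical underflow.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(samples, labels):
    """Estimate the prior, per-attribute mean and variance for each class."""
    by_class = defaultdict(list)
    for x, k in zip(samples, labels):
        by_class[k].append(x)
    model = {}
    for k, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # Small floor (assumed) keeps the variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-6
                     for col, m in zip(columns, means)]
        model[k] = (n / len(samples), means, variances)
    return model


def predict(model, x):
    """Return the class maximizing log P(class) + sum_i log P(x_i | class)."""
    def log_posterior(k):
        prior, means, variances = model[k]
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * var) - (xi - m) ** 2 / (2 * var)
            for xi, m, var in zip(x, means, variances))
        return math.log(prior) + log_likelihood
    return max(model, key=log_posterior)
```

In the setting of this paper, each sample would be a structural feature vector and each label one of the ten digit classes.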

Neural Networks (Multilayer Perceptron)
Multilayer Perceptron (MLP), a basic type of deep learning algorithm, is a classifier containing hidden interconnected layers. MLP has been extensively documented in previous literature [31]-[33]. In an MLP, a matrix $M_i$ encodes the transformation between consecutive layers through matrix multiplication. The transformation matrix connecting x neurons in a layer to y neurons of the consecutive layer has the dimension shown in Equation 7. The transformation layer maps the input to the output using Equations 8, 9, and 10.
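A forward pass through such a network can be sketched as below. This is a minimal illustration, not a reproduction of Equations 7 to 10: the sigmoid activation, the bias terms, the function names, and the convention of storing each weight matrix as y rows of x values are all assumptions.

```python
import math


def sigmoid(v):
    """Logistic activation (an assumed choice)."""
    return 1.0 / (1.0 + math.exp(-v))


def layer_forward(weights, bias, inputs):
    """Map x inputs to y outputs; weights has y rows, each with x values."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, bias)]


def mlp_forward(layers, inputs):
    """Apply each (weights, bias) pair in turn and return the output scores."""
    out = inputs
    for weights, bias in layers:
        out = layer_forward(weights, bias, out)
    return out
```

For numeral recognition, the last layer would have ten outputs, one score per digit class, and these scores are what the fuzzy decision stage later inspects.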

D. Fuzzy Set-Based Optimization
After the classification phase, some digits will still be misclassified. We therefore introduce a fourth layer, Fuzzy Set-Based Optimization, which processes the erroneously classified digits and reclassifies them, reducing the number of erroneously classified digits and increasing the overall accuracy and recognition rate of the proposed system. In this paper, we propose the use of Trapezoidal Fuzzy sets with a membership function with parameters w, x, y, and z, as shown in Equation 11 and Fig. 3.
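For reference, the usual textbook form of a trapezoidal membership function with breakpoints $w \le x \le y \le z$ is given below, with $s$ denoting the normalized classifier score. That Equation 11 uses exactly this parameterization is an assumption.

```latex
\mu(s; w, x, y, z) =
\begin{cases}
0, & s \le w \\[4pt]
\dfrac{s - w}{x - w}, & w < s < x \\[8pt]
1, & x \le s \le y \\[4pt]
\dfrac{z - s}{z - y}, & y < s < z \\[8pt]
0, & s \ge z
\end{cases}
```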
Trapezoidal fuzzy sets have been used and extensively explained in previous literature [34], [35]. The parameters of the trapezoidal function are optimized using PSO, which is considered an evolutionary optimization technique. The number of parameters for any two confused numerals is 8: four for each of the two membership functions. The objective of the fitness function is to minimize the confusion between the numerals. Let us assume that the two highest normalized scores at the classifier output are 0.51 for class 1 and 0.49 for class 2, and that the ground truth is class 2. To illustrate this as a two-class problem, we subtract the class 2 score from 1, which results in 0.51 (the same as the class 1 score). Then, if the resulting score is greater than 0.5, the decision goes to class 1; otherwise, it goes to class 2.
The decision in this case will be erroneously made for class 1. In the proposed approach, by contrast, we map this score through two membership functions, one for each class, as shown in Fig. 4. The mapped values for the score 0.51 are 0.73 and 0.82 for class 1 and class 2, respectively. Thus, the decision is correctly made for class 2. It should be noted that this method focuses on normalized scores (classifier output) between 0.45 and 0.55, as widening this range would affect the correctly classified classes. The two trapezoidal membership functions between any two classes in the numeral recognition problem are obtained, as mentioned earlier, by PSO. The parameters of PSO are as follows: the swarm size is 100, the number of generations is 100, and the cognitive and social acceleration coefficients are both set to 2.
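The fuzzy decision and its PSO tuning can be sketched as follows. This is a hedged illustration, not the authors' implementation: the textbook trapezoid form is assumed, `HAND_PARAMS` are hand-picked hypothetical values chosen only to reproduce the mapped values 0.73 and 0.82 for the score 0.51, `SAMPLES` is invented toy data, and the inertia weight, velocity clamp, and random seed are assumptions not stated in the text. Only the swarm size, generation count, and acceleration coefficients follow the values given above.

```python
import random


def trapezoid(s, w, x, y, z):
    """Trapezoidal membership of score s for breakpoints w <= x <= y <= z."""
    if s <= w or s >= z:
        return 0.0
    if s < x:
        return (s - w) / (x - w)
    if s <= y:
        return 1.0
    return (z - s) / (z - y)


def decide(score, params):
    """params holds 8 values: (w, x, y, z) for class 1, then for class 2."""
    mu1 = trapezoid(score, *sorted(params[:4]))
    mu2 = trapezoid(score, *sorted(params[4:]))
    return 1 if mu1 >= mu2 else 2


def errors(params, samples):
    """Fitness: number of (score, true_class) pairs that are decided wrongly."""
    return sum(decide(score, params) != label for score, label in samples)


def pso(samples, swarm=100, gens=100, c1=2.0, c2=2.0,
        inertia=0.7, v_max=0.2, seed=0):
    """Global-best PSO; inertia, v_max and seed are assumed values."""
    rng = random.Random(seed)
    dim = 8
    pos = [[rng.uniform(0.0, 1.0) for _ in range(dim)] for _ in range(swarm)]
    vel = [[0.0] * dim for _ in range(swarm)]
    pbest = [p[:] for p in pos]
    pbest_f = [errors(p, samples) for p in pos]
    g = min(range(swarm), key=pbest_f.__getitem__)
    gbest, gbest_f = pbest[g][:], pbest_f[g]
    for _ in range(gens):
        for i in range(swarm):
            for d in range(dim):
                v = (inertia * vel[i][d]
                     + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                     + c2 * rng.random() * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-v_max, min(v_max, v))   # clamp the velocity
                pos[i][d] += vel[i][d]
            f = errors(pos[i], samples)
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = pos[i][:], f
                if f < gbest_f:
                    gbest, gbest_f = pos[i][:], f
    return gbest, gbest_f


# Hand-picked (hypothetical) parameters: class 1 first, then class 2.
HAND_PARAMS = [0.437, 0.537, 1.0, 1.1, -0.1, 0.0, 0.492, 0.592]
# Hypothetical ambiguous samples: (normalized class 1 score, true class).
SAMPLES = [(0.46, 2), (0.48, 2), (0.49, 2), (0.51, 2),
           (0.52, 1), (0.53, 1), (0.54, 1)]

if __name__ == "__main__":
    print("decision for 0.51:", decide(0.51, HAND_PARAMS))
    best_params, best_errors = pso(SAMPLES)
    print("errors after PSO:", best_errors)
```

With `HAND_PARAMS`, the two membership curves cross slightly above 0.51, so the score 0.51 is assigned to class 2 even though a plain 0.5 threshold would pick class 1. Sorting each group of four parameters inside `decide` is a design choice that keeps every PSO particle a valid trapezoid without extra constraints.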

IV. EXPERIMENTAL DATA
The type and size of the reference database may affect the results obtained from any proposed technique. The datasets used in this paper are those documented in previous literature and contain a large number of samples, which makes the results obtained using the proposed model comparable to those published in previous literature. The proposed model is validated with four well-known large databases for the four targeted languages, totaling more than 190,000 numeral pattern samples. The first database used is the Modified Arabic Handwritten Digits Database (MADBase), collected from 700 individuals [36], containing 70,000 handwritten Arabic digits at 300 dpi resolution in 28x28 pixels. For English numerals, the Modified National Institute of Standards and Technology (MNIST) dataset was used, which consists of 70,000 samples collected from 250 participants [37]. For Persian, the HODA dataset was used, which is the first Farsi handwritten digit dataset, extracted from 12,000 registration forms. It consists of a total of 80,000 samples [38]. For the Devanagari numerals, the Devanagari Handwritten Character Dataset (DHCD) was used, which consists of a total of 20,000 samples collected from an unspecified number of documents [39].

V. RESULTS AND DISCUSSIONS
Table III shows the number of digits whose normalized scores are between 0.45 and 0.55 using both the Neural Network (NN) and Naïve Bayes (NB) classifiers. The number of digits with normalized scores in this range is much higher using NN than using NB. For example, in the training of Arabic numerals, 50 digits had a normalized score between 0.45 and 0.55 using NN, as opposed to only 16 digits using NB. We note here that these represent the total number of digits, meaning both those classified correctly and those misclassified by the system. Fig. 5 shows the average fitness value of the swarm in each generation for one of the confusion cases; it clearly shows that the parameters of the membership functions converged to values for which the number of misclassified numerals is zero for this particular case. Table IV shows the number of misclassified digits both with the fuzzy phase and without the fuzzy phase (classifiers only). It is evident in this table that the number of misclassified digits is reduced in all cases and in all datasets using the optimized Trapezoidal Fuzzy sets.
The percentage of this decrease is also shown for both training and testing. This results in a higher overall average recognition accuracy. Fig. 7 shows the confusion matrix using the NN alone and using the NN followed by the Trapezoidal Fuzzy sets phase. These matrices show that the number of confused digits is reduced when using the NN with the Trapezoidal Fuzzy sets phase. Fig. 6 shows the confusion matrix using the NB alone and using the NB followed by the Trapezoidal Fuzzy sets phase. These matrices show that the number of confused digits is reduced when using Naïve Bayes with the Trapezoidal Fuzzy sets phase.
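The two quantities reported in this section, the number of digits falling in the ambiguous score band and the percentage decrease in misclassified digits, can be computed as sketched below. The helper names are hypothetical, and normalizing the two highest scores so that they sum to 1 is an assumption about how the normalized scores are formed.

```python
def normalized_top_two(scores):
    """Return (best_class, runner_up_class, normalized best score)."""
    order = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
    best, second = order[0], order[1]
    total = scores[best] + scores[second]
    return best, second, scores[best] / total


def is_ambiguous(scores, low=0.45, high=0.55):
    """True when the normalized top score falls inside the fuzzy band."""
    _, _, top = normalized_top_two(scores)
    return low <= top <= high


def percent_decrease(before, after):
    """Percentage reduction in the number of misclassified digits."""
    return 100.0 * (before - after) / before
```

Only samples for which `is_ambiguous` returns true would be passed to the fuzzy decision stage; all other samples keep the classifier's original decision.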
This method aims to improve the classification rate in the handwritten numeral recognition problem regardless of the classifier's type. Therefore, a comparative analysis against other classifiers is not meaningful in this case, because the improvement is achieved after obtaining the result of the classifier, by attaching the fuzzy-based method to the classifier output.

VI. CONCLUSION
In this paper, we propose a four-phase recognition system, with the fourth phase being the Fuzzy Based Decision (Trapezoidal Fuzzy sets), rather than the usual three-phase recognition systems, which consist of preprocessing, feature extraction, and classification. The preprocessing phase consists of steps to normalize and denoise the images as well as to center the handwritten numeral. The feature extraction phase is based on the structural features of the handwritten numerals written in the four languages targeted in this paper, namely English, Arabic, Persian, and Devanagari. For classification, two well-known classifiers were chosen, namely MLP and the Naïve Bayes classifier. The fourth phase consists of the use of Trapezoidal Fuzzy sets with the membership functions specified in the body of the paper. It is evident from the results that the number of misclassified digits is further reduced by using the Fuzzy Set-Based Decision Mechanism, which increases the overall average recognition accuracy of the system. Future work will include designing a language-independent, writer-independent numeral recognition system using deep learning networks combined with the Fuzzy Set-Based Decision Mechanism.

CONFLICT OF INTEREST
The authors declare no conflict of interest.

AUTHOR CONTRIBUTIONS
A. Al-Hmouz, G. Latif, J. Alghazo conducted the research; A. Al-Hmouz, R. Al-Hmouz analyzed the data; all authors wrote the paper and approved the final version.

International Journal of Machine Learning and Computing, Vol. 10, No. 1, January 2020

achievements like full scholarship for FSc, BS-CS and MS-CS. He worked as an instructor at Prince Mohammad bin Fahd University, Saudi Arabia for 3 years in the CS Department and has 2 years of industry work experience. His research interests include image processing, artificial intelligence, neural networks, and medical image processing.

Jaafar Al. Ghazo obtained his PhD and MSc in computer engineering from Southern Illinois University Carbondale in 2004 and 2000, respectively. He joined Prince Mohammad Bin Fahd University (PMU) as a founding dean of the College of Computer Engineering and Science and held various positions, including dean of Graduate Studies and Research, dean of Institutional Relations, and dean of Continuing Education and Community Service. Currently he is an assistant professor at PMU. His research interests include modelling and realization of biological mechanisms using CAD and FPGAs, modelling and realization of arithmetic operations using CAD and FPGAs, low power cache design, and assistive technology for students with disabilities.