, and Kittisak Kerdprasop

Deep Learning for Financial Time-Series Data Analytics: An Image Processing Based Approach

Forex (Foreign Exchange) is the largest financial market, with a huge daily trading volume. Traditionally, the tools and strategies used by Forex traders fall into two groups: Fundamental Analysis and Technical Analysis. Today, advances in computing such as Artificial Intelligence play a significant role in the financial domain. Many existing studies apply Machine Learning and Deep Learning to build models that traders can use to develop their trading tools or strategies. However, most of these models target the stock market, and there is still a lack of research that applies modern Machine Learning or Deep Learning to predicting price movement in the Forex market. In this paper, we propose a novel prediction model based on a Deep Convolutional Neural Network (CNN) that Forex traders can use as a tool for making profits. We evaluate the proposed CNN model from two perspectives: the accuracy of its predictions and its ability to generate profit. The experimental results show that the proposed CNN model achieves a prediction accuracy of up to approximately 77%. This is similar to the accuracy of the C5.0 algorithm, a rule-based Machine Learning algorithm; the two results differ only by a trivial fraction. From the financial perspective, the proposed CNN model performs well, producing approximately $69K over about one and a half years (from January 2017 to September 2018).


I. INTRODUCTION
Foreign Exchange (Forex or FX) is the biggest financial market in the world [1], where pairs of currencies from around the world are traded. As in other financial markets, traders who invest in the Forex market need a good tool or trading strategy in order to make a profit. Typically, Forex traders can draw on two types of tools or strategies: Fundamental Analysis (FA) [2] and Technical Analysis (TA) [3]. FA attempts to measure the actual value of a security by examining related economic conditions, financial reports, and other qualitative and quantitative factors. TA, on the other hand, uses the information in the price chart, that is, the historical time-series data, to build measurement tools for evaluating the value of a stock or the quotation of a currency pair. In the Forex market, TA is much more popular than FA because it is easier to use [4] and it removes the complexity of analyzing financial terms [5]. TA holds that it is unnecessary to analyze financial or economic data, since all the relevant information is already included in the chart. A trader who employs TA can therefore focus on analyzing the chart itself for hints about where the quotation may be heading. Many tools, known as technical indicators, are now available for TA, such as Moving Average Convergence Divergence (MACD) [6], the Relative Strength Index (RSI) [7], and the Stochastic oscillator [8]. Although analyzing a chart with technical indicators is quick and easy, traders often ask which indicators should be used and which parameter settings are most suitable for each indicator.
In the last decade, Artificial Intelligence (AI) has been widely applied in the financial domain. Various Machine Learning algorithms, such as the Artificial Neural Network (ANN) [9]-[12], and optimization algorithms, such as the genetic algorithm (GA) [13], [14], have been applied to build models for predicting the volatility of price changes in the Forex market. The models developed with these algorithms yield returns of approximately 20 to 50 percent a year. In recent years, Deep Learning, a subfield of Machine Learning, has been introduced in the financial area [15], [16]. If Forex historical data are treated as time-series data, the Deep Learning algorithms suited to this type of data are Recurrent Neural Networks (RNN), in particular the Long Short-Term Memory (LSTM) algorithm. In 2017, two novel LSTM-based models were developed by Bao et al. [17] and Yang et al. [18]. Based on the information described above, there is a lack of research that applies AI technology, particularly Deep Learning algorithms, to the Forex market. Therefore, in this paper, we propose a novel model based on the CNN algorithm that aims to accurately predict the daily movement of the Forex price. The rest of this paper is organized as follows. Section II describes the data and the methodology used for this research. The empirical results are discussed in Section III. Finally, Section IV presents our conclusions and makes suggestions for future research.

A. Research Framework
In this paper, we propose a Deep Learning model based on an image processing approach for predicting the daily price movement of the Forex market. The accuracy of its predictions is compared with that of models built with Machine Learning algorithms, including C5.0 [22], Logistic Regression [23], Discriminant [24], ANN [25], CHAID [26] and C&R Tree [27]. For the Deep Learning model, the experimental process consists of data collection, data preparation, data transformation, data labeling, modeling, performance evaluation and comparison, and financial evaluation. For the Machine Learning models, the process consists of data collection, data preparation, data labeling, modeling, and performance evaluation and comparison. The framework used for this research is illustrated in Fig. 1.

B. Data Collection
The dataset used for this research is the Forex historical data of the EURUSD currency pair from 3 January 2000 to 14 September 2018. The data are exported from the MetaTrader 4 (MT4) Forex trading platform of FXCM [28] using the daily time frame. The dataset therefore consists of approximately 4,865 records.

C. Data Preparation
The dataset is divided into training and testing sets using the sliding-window technique [29], as shown in Fig. 1.
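As a rough illustration of the sliding-window idea, the sketch below slides a fixed-length training window across calendar years and tests on the period that immediately follows it. The function name `sliding_windows` and the window lengths (5 training years, 1 testing year, a step of 1 year) are assumptions chosen for illustration; they are not the configuration reported in the paper.

```python
def sliding_windows(first_year, last_year, train_years=5, test_years=1, step=1):
    """Yield (train_range, test_range) pairs of inclusive year ranges.

    Assumption: windows are defined on whole calendar years and the test
    period starts right after the training period ends.
    """
    start = first_year
    while start + train_years + test_years - 1 <= last_year:
        train = (start, start + train_years - 1)
        test = (train[1] + 1, train[1] + test_years)
        yield train, test
        start += step


# Hypothetical settings applied to the 2000-2018 span of the dataset.
windows = list(sliding_windows(2000, 2018))
for train, test in windows[:3]:
    print(f"train {train[0]}-{train[1]} -> test {test[0]}-{test[1]}")
```

Each window keeps the testing years strictly after the training years, so the model is never evaluated on data that precedes what it was trained on.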

D. Data Transformation
In this process, the data are converted into 16×16-pixel grayscale images, where each image represents one day of data from the dataset. To create an image of 16×16 pixels, we use 15 technical indicators plus chart information. For each indicator, 16 different parameter settings are used in order to obtain 16 different values of that indicator. Fig. 3 describes the structure of the daily data used to create a 16×16-pixel image. The 16th row (in Fig. 3), which holds the chart information, also consists of 16 columns. The first 9 columns contain information derived from the candlestick and the daily trading volume, as shown in Fig. 4; the remaining 7 columns are trends. We create trend values for 1, 3, 5, 7, 10, 15, and 30 days, where a value of 1 denotes an uptrend and a value of -1 denotes a downtrend, as illustrated in Fig. 5.

To create a grayscale image, the Min-Max Normalization technique [30] is employed to scale the data to the range 0-255. Min-max normalization, often known as feature scaling, increases or reduces the values of a numeric feature so that they fall between a new minimum and a new maximum. The new value of each observed value is therefore calculated using the following equation:

\[ v_{\mathrm{new}} = \frac{v - \mathrm{min}}{\mathrm{max} - \mathrm{min}} \, (\mathrm{max}_{\mathrm{new}} - \mathrm{min}_{\mathrm{new}}) + \mathrm{min}_{\mathrm{new}} \qquad (1) \]

where \(v\) is a member of the set of observed values, \(\mathrm{min}\) and \(\mathrm{max}\) are the minimum and maximum of the observed values, and \(\mathrm{max}_{\mathrm{new}}\) and \(\mathrm{min}_{\mathrm{new}}\) are the maximum and minimum of the new range.
The image data are then created using the PIL library in Python [31]. At the end of the data transformation process, we obtain 16×16-pixel grayscale images, as shown in Fig. 6.
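The scaling step can be sketched in a few lines of plain Python. This is a minimal illustration of Equation (1), not the authors' code: the helper names `min_max_scale` and `to_pixels` are hypothetical, the input grid is made-up data, and scaling each row (indicator) independently is an assumption, since the text does not state whether normalization is applied per row or over the whole grid.

```python
def min_max_scale(values, new_min=0.0, new_max=255.0):
    """Rescale values linearly into [new_min, new_max], as in Equation (1)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant row maps to new_min to avoid division by zero.
        return [new_min for _ in values]
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]


def to_pixels(rows):
    """Scale each row independently and round to integer gray levels 0..255."""
    return [[int(round(p)) for p in min_max_scale(row)] for row in rows]


# Toy 16x16 input: each row holds 16 made-up values on its own scale.
raw = [[(r + 1) * c for c in range(16)] for r in range(16)]
pixels = to_pixels(raw)

# Row-major byte string, one byte per pixel (256 bytes for a 16x16 image).
flat = bytes(p for row in pixels for p in row)
print(pixels[0][:4], len(flat))
```

With Pillow installed, a byte string of this form can be turned into a grayscale image with `Image.frombytes("L", (16, 16), flat)`, where mode `"L"` denotes 8-bit grayscale.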

E. Data Labeling
The main goal of this research is to predict the daily movement of Forex, which is either "Up" or "Down". Up means that the price closes above the open price, and vice versa for Down. If we know that tomorrow's price will move Up, we can send a Buy ticket today and realize a profit tomorrow. For this reason, we create a signal column, which is the target class that we attempt to predict. First, the values of the signal column are created from the daily open and close prices according to the definition above. Then, we lag the signal column by 1 day, so that today's information is paired with tomorrow's label according to the signal class. As a result, we can label each image as Buy or Sell, as illustrated in Fig. 7.
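The labeling rule described above can be sketched as follows. The function names are hypothetical, the prices are made up, and treating a day that closes exactly at its open as Sell is an assumption, since the text only defines the strictly-above and strictly-below cases.

```python
def daily_signal(open_price, close_price):
    """Return 'Buy' for an Up day (close above open), otherwise 'Sell'."""
    # Assumption: a day that closes exactly at its open is treated as 'Sell'.
    return "Buy" if close_price > open_price else "Sell"


def label_with_next_day(bars):
    """Pair each day's index with the signal of the following day.

    bars is a list of (open, close) tuples in chronological order.
    The last day has no following day, so it receives no label.
    """
    signals = [daily_signal(o, c) for o, c in bars]
    return [(day, signals[day + 1]) for day in range(len(bars) - 1)]


# Made-up EURUSD-like prices, for illustration only.
bars = [(1.1000, 1.1050), (1.1050, 1.1020), (1.1020, 1.1080), (1.1080, 1.1075)]
labels = label_with_next_day(bars)
print(labels)
```

Shifting the signal by one day means the final record in the dataset cannot be labeled, so it drops out of both training and testing.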

F. Modeling
After transforming the data into images, we use the training set of image data to build the prediction model. The model is based on a Convolutional Neural Network [32] (Fig. 8), one of the Deep Learning techniques for analyzing images. In this section, the details of the proposed CNN model are described.

Fig. 7. Example of images of the buy and sell signals (panels: Buy, Sell).

International Journal of Machine Learning and Computing, Vol. 10, No. 1, January 2020

The model is trained using 500 epochs.

G. Performance Evaluation
The performance of the proposed CNN model is evaluated in terms of prediction accuracy. The accuracy can be computed using the following equation:

\[ \mathit{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (3) \]

where TP is the number of true positives, TN is the number of true negatives, FP is the number of false positives, and FN is the number of false negatives.
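The accuracy measure follows directly from the four confusion-matrix counts defined above: correct predictions divided by all predictions. The counts in this small sketch are hypothetical and are not taken from the paper's confusion matrix.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of correct predictions: (TP + TN) / (TP + TN + FP + FN)."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


# Hypothetical counts, for illustration only.
acc = accuracy(tp=40, tn=35, fp=15, fn=10)
print(f"{acc:.2%}")
```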

H. Financial Evaluation
To evaluate the model from the financial perspective, trades are made according to the predicted Buy and Sell signals. The profit or loss of each transaction is then calculated using the following formula:

\[ \mathit{Profit}_i = \lvert \mathit{Close} - \mathit{Open} - \mathit{Spread} \rvert \times \mathit{Lot\_size} \qquad (4) \]

Note that if the predicted signal is wrong, \(\mathit{Profit}_i\) is considered a negative return; the value calculated by equation (4) is then multiplied by -1. In addition, we set the initial balance to $1,000 USD, and the trading lot size is 0.1 on an ECN account.
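The per-trade rule in equation (4), together with a running balance formed by adding every trade's profit to the initial balance, can be sketched as below. The sketch makes several assumptions that are not stated in the text: one EURUSD pip is taken as 0.0001, one standard lot is taken as 100,000 units of the base currency (so 0.1 lot is 10,000 units), and the names `trade_profit` and `total_balance` are hypothetical. The trades themselves are made up.

```python
PIP = 0.0001              # Assumption: size of one EURUSD pip
UNITS_PER_LOT = 100_000   # Assumption: units of base currency per standard lot


def trade_profit(open_price, close_price, predicted_correctly,
                 spread_pips=1.0, lots=0.1):
    """Per-trade profit in USD following equation (4) and its sign rule."""
    units = lots * UNITS_PER_LOT
    magnitude = abs(close_price - open_price - spread_pips * PIP) * units
    # A wrong prediction turns the magnitude into a loss.
    return magnitude if predicted_correctly else -magnitude


def total_balance(initial_balance, profits):
    """Initial balance plus the sum of all per-trade profits."""
    return initial_balance + sum(profits)


# Made-up trades: (open, close, was the predicted signal correct?)
trades = [(1.1000, 1.1051, True), (1.1051, 1.1021, False)]
profits = [trade_profit(o, c, ok) for o, c, ok in trades]
balance = total_balance(1000.0, profits)
print([round(p, 2) for p in profits], round(balance, 2))
```

The sketch follows equation (4) literally, so the spread is subtracted inside the absolute value regardless of the direction of the trade.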
At the end of the calculation, the total balance is obtained by summing the initial balance and the profit of each transaction, as shown in equation (5):

\[ \mathit{Total\ Balance} = \mathit{Initial\ Balance} + \sum_{i=1}^{n} \mathit{Profit}_i \qquad (5) \]

where \(n\) is the number of transactions.

Table III shows the results of the performance evaluation in terms of accuracy. As illustrated in Table III, the accuracy is consistent across all testing scenarios. The 14th scenario provides the highest accuracy, at approximately 77.48%, while the lowest accuracy is 70.71%, obtained when the years 2009-2013 are used as the training data and the year 2014 as the testing data. Fig. 9 shows the confusion matrix of the model that is trained on data from 2000 to 2016 and tested on data from 2017 to 2018. As can be seen in Fig. 9

B. Performance Comparison
The accuracy of the proposed CNN model is compared with that of other Machine Learning algorithms, as shown in Fig. 10.
As can be seen in Fig. 10, the model with the highest accuracy is C5.0 (77.67%), followed by our proposed CNN model (77.48%), the Logistic Regression model (75.93%), and the Discriminant model (75.93%). Based on this result, it can be concluded that the proposed model performs well compared to other Machine Learning models.
However, C5.0 performs better than the proposed CNN model. This may be because a rule-based algorithm can give better results than the image-processing-based approach when modeling time-series data. Nevertheless, the difference is extremely small (0.19 percentage points).

C. Financial Evaluation
In this section, we show the performance of the proposed CNN model from the financial perspective. The best model, from the 14th scenario, is used in this experiment. We set the initial balance to $1,000 USD. The trading account is an ECN account with 1:400 leverage, the spread is 1 pip, and the lot size is 0.1. The results show that from January 2017 to September 2018 (approximately 442 days), the proposed CNN model produces profits of approximately $68,353 USD, as shown in Fig. 11.

IV. CONCLUSION
In this paper, we proposed a novel model based on CNN that can be used effectively as a tool for predicting the daily movement of Forex. We proposed a technique for transforming financial time-series data into image data that can be used with the CNN algorithm. The results of the performance evaluation in terms of prediction accuracy show that the proposed CNN model performs well compared to models built with other Machine Learning algorithms. Although the C5.0 model performs better than the proposed CNN model, the difference is a trivial percentage. In addition, the results of the financial evaluation confirm that the proposed CNN model has great potential to make profits for Forex traders.
For future work, we aim to enhance the performance of the model by tuning the structure of the CNN. In addition, we expect to implement a system that can be used in the real world and that automatically provides a daily signal based on the predictions of the proposed model.

CONFLICT OF INTEREST
The authors declare no conflict of interest.

AUTHOR CONTRIBUTIONS
Watthana Pongsena conducted the research and wrote the paper. All authors approved the final version of the paper.