Using Deep Learning (CNN and GRU) for Time Series Forecasting in Banking
Hi, my name is Kreecha Puphaiboon, and I work for Krungsri Bank in Thailand (https://www.krungsri.com). I am co-writing this post with Jaturapol Tanarungrueng. Our main objective is to demonstrate how to use Deep Learning (DL) to forecast time series data, with Python code that anyone can use as a template (see the code on GitHub at the bottom of this blog).
A time series is a sequence of data points indexed in time order, usually taken at successive, equally spaced points in time. In banking, one example is planning how much cash to prepare for branches and ATMs. Demand differs from day to day: weekends, weekdays, lottery days and so on. These differences affect the amount of cash the bank needs to prepare and plan for. Dealing with time series is not easy.
When dealing with time series, we need to consider randomness, trends, cycles and seasonality. For example, every Valentine's Day, ATMs at branches in the central metropolitan area see higher demand for cash. Likewise, at the end of the month, when salaries are paid, all branches have to handle a large volume of withdrawals. There are many such time-dependent rules and complexities to handle. A Deep Learning technique can come in handy here, so that forecasts do not lead to overstocking or understocking cash.
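To make the calendar effects above concrete, here is a minimal sketch of how a date could be turned into numeric flags for a model. It is not our production feature set. The function name, the feature names, the use of the last day of the month as a stand-in for salary day, and the 1st and 16th as lottery days are all assumptions for illustration.

```python
import calendar
from datetime import date


def calendar_features(d: date) -> dict:
    """Turn one date into simple numeric flags (hypothetical feature names)."""
    last_day = calendar.monthrange(d.year, d.month)[1]
    return {
        "day_of_week": d.weekday(),                  # Monday=0 ... Sunday=6
        "is_weekend": int(d.weekday() >= 5),
        "is_month_end": int(d.day == last_day),      # assumed proxy for salary day
        "is_lottery_day": int(d.day in (1, 16)),     # assumed draw dates
        "is_valentines": int((d.month, d.day) == (2, 14)),
    }


print(calendar_features(date(2019, 2, 14)))
```

Each flag becomes one more column next to the historical cash amounts, so the model can learn that these days behave differently.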
In this blog, we will forecast only the next day. At work, we actually predict the next 28 days for better planning, but for simplicity and demonstration we will do just a one-day forecast.
As this is a regression problem, we will train on 30 days of historical data to predict the next day.
Columns A to L are the features you want to include (which ones is up to you), and the last column is the amount to forecast. The input and output are as follows, where T denotes time.
Input = Data(T(n-32) -> T(n-1))
Output = Data(T(n))
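The input/output pairing above is a sliding window over the history. Below is a minimal sketch of it for a single series of daily amounts. The function name make_windows and the toy numbers are made up for illustration, and the window length is left as a parameter so you can set it to whatever look-back you use.

```python
def make_windows(series, window):
    """Split a sequence into (input window, next value) training pairs."""
    inputs, targets = [], []
    for n in range(window, len(series)):
        inputs.append(series[n - window:n])  # T(n-window) .. T(n-1)
        targets.append(series[n])            # T(n)
    return inputs, targets


daily_cash = [10, 12, 11, 15, 14, 18]  # made-up toy values
X, y = make_windows(daily_cash, window=3)
for window_values, next_value in zip(X, y):
    print(window_values, "->", next_value)
```

With several feature columns, each element of the window would be a whole row of features instead of a single number, while the target stays the forecast amount.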
For forecasting with DL, we will use a Convolutional Neural Network (CNN) and a Gated Recurrent Unit (GRU). A CNN uses convolution operations that can handle the spatial or ordered information found in images or tabular data, while a GRU has memory that can store the temporal or repeated information found in time series data. The two models can also be combined.
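To show what "combined" can mean, here is a rough NumPy-only sketch of a forward pass: a 1D convolution slides over the time axis, and its output sequence is fed step by step into a GRU whose final memory state produces one number. This is not the model from our repository. The weights are random and untrained, and the sizes (30 time steps, 12 features, 8 filters, 16 GRU units) are assumptions for illustration. In practice you would use the ready-made layers of a framework such as Keras or PyTorch and train them.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def conv1d(x, kernels, bias):
    """Valid 1D convolution over time, followed by ReLU.
    x: (timesteps, features), kernels: (width, features, filters)."""
    width = kernels.shape[0]
    steps = x.shape[0] - width + 1
    out = np.empty((steps, kernels.shape[2]))
    for t in range(steps):
        # each filter looks at `width` consecutive days across all features
        out[t] = np.tensordot(x[t:t + width], kernels, axes=([0, 1], [0, 1])) + bias
    return np.maximum(out, 0.0)


def gru_last_state(x, W, U, b):
    """Run a GRU over x (timesteps, features) and return the final hidden state.
    Index 0 = update gate, 1 = reset gate, 2 = candidate state."""
    h = np.zeros(U.shape[1])
    for x_t in x:
        z = sigmoid(x_t @ W[0] + h @ U[0] + b[0])             # how much old memory to keep
        r = sigmoid(x_t @ W[1] + h @ U[1] + b[1])             # how much old memory to consult
        h_cand = np.tanh(x_t @ W[2] + (r * h) @ U[2] + b[2])  # proposed new memory
        h = z * h + (1.0 - z) * h_cand
    return h


rng = np.random.default_rng(0)
timesteps, features, filters, units = 30, 12, 8, 16      # assumed sizes
window = rng.normal(size=(timesteps, features))          # one made-up input window

kernels = rng.normal(scale=0.1, size=(3, features, filters))
conv_out = conv1d(window, kernels, np.zeros(filters))    # shape (28, 8)

W = rng.normal(scale=0.1, size=(3, filters, units))
U = rng.normal(scale=0.1, size=(3, units, units))
b = np.zeros((3, units))
h = gru_last_state(conv_out, W, U, b)                    # shape (16,)

w_out = rng.normal(scale=0.1, size=units)
forecast = float(h @ w_out)  # untrained, so the value itself is meaningless
print(conv_out.shape, h.shape, forecast)
```

The point of the sketch is the data flow: the convolution summarizes short local patterns across the features, and the GRU carries what it has seen along the time axis.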
Figure 2 above shows that the CNN alone can take more epochs than the GRU to minimize the error until it converges. That means you will have to wait longer, or pay more if you train on a cloud service from any vendor. Nonetheless, one advantage is that you can use the CNN to find the important features and then pass them on to the GRU (20 epochs when combined). The problem is open-ended: the right design depends on your algorithm and your problem-solving skills. This is just one example of how to combine the two.
Figure 3 below shows that the RMSE is reduced when you use CNN and GRU together.
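For readers new to the metric, RMSE (root mean squared error) is the square root of the average squared difference between the actual and forecast values, so lower is better. A minimal sketch with made-up numbers (these are not our results):

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


actual = [100.0, 120.0, 90.0]      # made-up cash amounts
predicted = [110.0, 115.0, 95.0]   # made-up forecasts
print(rmse(actual, predicted))     # about 7.07
```

Because the errors are squared before averaging, one badly missed day weighs more than several small misses, which suits cash planning where a large shortfall is costly.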
Once again, how you use DL to improve the model is up to you. In our real work, we have to select features and transform them into numbers for the model. Examples include anomaly events, probability distributions and feature weights, which help the DL model learn to forecast successfully. We have achieved really good results, and this is why we want to share this with the public. Note, however, that I cannot reveal too much detail, as that is part of my work ethic.
I hope you find this post fun and entertaining. Thank you to my teammate Jaturapol Tanarungrueng, who coded this example. We work together on the Krungsri AI team, by the way. Below is the code.