More and more people now participate in the stock market. A recent survey reveals a growing number of young people, especially university students, becoming involved in trading. We are no exception. Like many other investors, we are interested in forecasting stock prices using the trends, patterns and moving averages observed in historical data.
However, a number of people have criticized the use of past data. Among them, the French mathematician Louis Bachelier proposed more than a century ago that stock prices follow a random walk, an idea that later became the basis of the Efficient Market Hypothesis and that discourages the study of historical data. The claim is controversial and has led to a long-running dispute about the reliability of technical analysis. Nonetheless, people's curiosity about past data has never faded. Unlike the vast majority, who rely on conventional technical analysis, we have decided to use predictive data mining techniques, which we regard as interesting and accurate, in our forecasting.
Forecasting is an inherently uncertain process, so the accuracy of the chosen technique matters. There are many forecasting techniques. In general, they can be classified into three types: causal models, time-series models and smoothing techniques. Each has different features and is therefore suited to prediction under particular circumstances.
For causal models, the most commonly used technique is the simple linear regression model. To study the seasonal effect in addition to the trend, we also use decomposition analysis.
Many time-series models have been developed. The Box-Jenkins forecasting model is one of the best known and is relatively accurate. The univariate version of this model is a self-projecting time-series forecasting method. The underlying goal is to find an appropriate formula so that the residuals are as small as possible and exhibit no patterns. The model-building process involves a few steps, repeated as necessary, to arrive at a specific formula that replicates the patterns in the series as closely as possible and also produces accurate forecasts.
Finally, we go through the smoothing techniques, which are believed to be easy to use yet highly accurate. We look at several techniques, ranging from simple to advanced. Basic smoothing includes simple moving averages, weighted moving averages and first-order exponential smoothing. The more advanced technique we have chosen is Holt's linear exponential smoothing. After smoothing, we perform a one-step-ahead forecast to see whether the technique works well.
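To make the smoothing techniques concrete, the following is a minimal Python sketch of a simple moving average, a weighted moving average, first-order exponential smoothing and Holt's linear exponential smoothing with one-step-ahead forecasts. It is an illustration under stated assumptions, not the implementation used in this project: the function names, the initialisation of the level and trend (first observation and first difference), and the smoothing constants alpha and beta are our own choices, and the price list is hypothetical.

```python
def simple_moving_average(prices, window):
    """Mean of the last `window` observations, used as the next-period forecast."""
    if window <= 0 or window > len(prices):
        raise ValueError("window must be between 1 and len(prices)")
    return sum(prices[-window:]) / window


def weighted_moving_average(prices, weights):
    """Weighted mean of the last len(weights) observations (weights oldest first)."""
    if not weights or len(weights) > len(prices):
        raise ValueError("need between 1 and len(prices) weights")
    recent = prices[-len(weights):]
    return sum(w * p for w, p in zip(weights, recent)) / sum(weights)


def exponential_smoothing(prices, alpha):
    """First-order exponential smoothing; returns the final smoothed level."""
    level = prices[0]  # assumption: start from the first observation
    for price in prices[1:]:
        level = alpha * price + (1 - alpha) * level
    return level


def holt_one_step_forecasts(prices, alpha, beta):
    """Holt's linear exponential smoothing.

    Returns a list of len(prices) forecasts: element i is the one-step-ahead
    forecast of prices[i + 1], and the last element is the forecast for the
    first period after the data ends.
    """
    if len(prices) < 2:
        raise ValueError("need at least two observations")
    level = prices[0]              # assumed initial level
    trend = prices[1] - prices[0]  # assumed initial trend
    forecasts = []
    for price in prices[1:]:
        forecasts.append(level + trend)  # forecast made before seeing `price`
        previous_level = level
        level = alpha * price + (1 - alpha) * (level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
    forecasts.append(level + trend)      # forecast for the next, unseen period
    return forecasts


if __name__ == "__main__":
    closing_prices = [10.0, 10.4, 10.3, 10.9, 11.2, 11.0]  # hypothetical data
    print(simple_moving_average(closing_prices, window=3))
    print(weighted_moving_average(closing_prices, weights=[1, 2, 3]))
    print(exponential_smoothing(closing_prices, alpha=0.3))
    print(holt_one_step_forecasts(closing_prices, alpha=0.5, beta=0.3))
```

Comparing each one-step-ahead forecast with the price actually observed in that period gives the forecast errors, which is one way to judge whether the technique works well.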
In addition to the above models, which investigate historical stock prices in relation to time, we would like to find the correlation between one external factor and each stock. In our view, such external factors usually have a great impact on stock price movements. We hope to find out whether they can explain the noise that appears in the forecasts made in the previous parts.
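As a rough illustration of how such a correlation could be measured, the sketch below computes the Pearson correlation coefficient between two equally long series. The choice of the Pearson coefficient is an assumption on our part, as are the function name and the two sample series, which are hypothetical and stand in for a stock's prices and an external factor.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-deviations and sums of squared deviations from the means.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / math.sqrt(spread_x * spread_y)


if __name__ == "__main__":
    stock_prices = [10.0, 10.4, 10.3, 10.9, 11.2]     # hypothetical data
    external_factor = [60.0, 62.5, 61.0, 64.0, 66.5]  # hypothetical data
    print(pearson_correlation(stock_prices, external_factor))
```

A value close to 1 or -1 would suggest a strong linear relationship between the factor and the stock, while a value near 0 would suggest that the factor explains little of the price movement.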
2. Stock Selection
In this project, we investigate and analyze the trend of stock prices and try to forecast the upcoming trend using different data mining methods. In order to narrow down the source of information, we focus mainly on three comparatively large sectors: Energy, Properties and Financials. In each sector, we select one representative stock to study. We have chosen The China National Offshore Oil Corporation (0883.hk) for Energy, Henderson Land Development Company Limited (0012.hk) for Properties, and Hang Seng Bank (0011.hk) for the Financial sector. First of all, let us introduce the three stocks.
2.1. Energy -- The China National Offshore Oil Corporation (0883.hk)
The China National Offshore Oil Corporation...