DEVELOPMENT OF AN IMPROVED HIDDEN MARKOV MODEL BASED FUZZY TIME SERIES FORECASTING MODEL USING GENETIC ALGORITHM


CHAPTER ONE

INTRODUCTION

1.1 Background

A time series is a sequence of quantitative observations recorded at regular time intervals. Whether discrete or continuous, real-world time series are typically nonlinear and non-stationary, since they are sample functions realised from stochastic processes (Subanar & Abadi, 2011).

Time series forecasting is useful in a wide range of applications, including predicting university enrollments, stock prices, rainfall, blood pressure, and so on.

To project future outcomes, such forecasting typically draws on a series of historical data points measured sequentially over time (Sheng et al., 2009).

Various time series forecasting approaches have evolved in recent decades. Among them, models based on the Autoregressive Moving Average (ARMA) and the Autoregressive Integrated Moving Average (ARIMA) stand out.

They cannot, however, deal with ambiguity in time series or with linguistic terms (Song & Chissom, 1993). Furthermore, these statistical methods do not perform well on time series with small amounts of data (Tsaur et al., 2005).

In addition, standard probabilistic time series models require specific assumptions to hold, such as a sufficient number of observations, normally distributed data, and linearity (Egrioglu, 2014).

When these assumptions are not met, these models produce misleading forecasting results. As a result, non-probabilistic approaches to time series forecasting have been proposed as an alternative to probabilistic time series forecasting models (Egrioglu, 2015).

Fuzzy time series (FTS) have been developed and widely used to address such problems (Radmehr & Gharneh, 2012). Many academics have been drawn to FTS models in recent years because of their advantages: improved performance in several real-world forecasting problems, the ability to work with data expressed in linguistic terms (Song & Chissom, 1993), and the ability to combine with heuristic knowledge and models (Huarng, 2001).

The determination of fuzzy relations is one of the most critical difficulties in FTS models (Egrioglu et al., 2013). Many strategies for finding fuzzy relations have been proposed in the literature.

Fuzzy logic relationship groups (FLRG), artificial neural networks, fuzzy relation matrices derived from some fuzzy set operations, particle swarm optimisation, and evolutionary algorithms are examples (Egrioglu, 2014).

The FLRG is the most commonly used method because working with FLRG tables does not require complex matrix operations.

When the FLRG tables are used, however, fuzzy set membership values are ignored, since only the elements of the fuzzy set with the greatest membership value are considered (Aladag et al., 2012).

This results in information loss and may have a negative impact on forecasting accuracy. Because fuzzy relationships can be nonlinear and complex, an intelligent approach is required to determine them.
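The information loss described above can be illustrated with a minimal Python sketch. The triangular membership functions, the three fuzzy sets A1 to A3, the universe of discourse [0, 40], and the sample series are all hypothetical values chosen for illustration; they are not taken from this study.

```python
from collections import defaultdict

def triangular(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical fuzzy sets on an illustrative universe of discourse [0, 40].
FUZZY_SETS = {"A1": (0, 10, 20), "A2": (10, 20, 30), "A3": (20, 30, 40)}

def memberships(x):
    """Full membership vector of x: the information an FLRG table discards."""
    return {name: triangular(x, *abc) for name, abc in FUZZY_SETS.items()}

def fuzzify_max(x):
    """Keep only the label of the fuzzy set with the greatest membership."""
    m = memberships(x)
    return max(m, key=m.get)

def build_flrg(series):
    """Group the observed transitions label(t) -> label(t+1) by left-hand side."""
    labels = [fuzzify_max(x) for x in series]
    groups = defaultdict(set)
    for current, following in zip(labels, labels[1:]):
        groups[current].add(following)
    return {lhs: sorted(rhs) for lhs, rhs in groups.items()}

series = [12, 18, 24, 27, 16]  # illustrative observations
flrg = build_flrg(series)
print(flrg)
print(memberships(24))
```

In this sketch the observation 24 belongs to A2 with degree 0.6 and to A3 with degree 0.4, yet the table records it only as A2; the partial membership in A3 never reaches the forecasting step.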

To address these shortcomings, a Hidden Markov Model (HMM) based approach was developed and used to formulate the fuzzy relationships, with the model parameters estimated using a traditional search technique known as the Baum-Welch algorithm (Li & Cheng, 2010).

Because parameter learning in the Hidden Markov Model using the Baum-Welch algorithm is prone to becoming trapped in local optima, a technique is required that finds improved estimates of the fuzzy relations while avoiding local optima.

Artificial intelligence approaches have been employed in various stages of fuzzy time series methods in recent years (Egrioglu, 2014).

A Genetic Algorithm (GA) method was used in this work to obtain the best approximation of the inner fuzzy relations. GA is a well-known search heuristic that replicates the natural evolution process.

This heuristic is commonly used to find useful solutions to optimisation and search problems, such as the partition problem in fuzzy time series (Cai et al., 2013). A GA is made up of a population, chromosomes, genes, a fitness function, and genetic operators.

The population represents a collection of candidate solutions, and each member of the population represents a possible solution to the problem at hand.

This population representation defines the search space for the problem solution. Each of the factors that make up an individual is referred to as a chromosome. To construct the individual, the chromosomes are generally coded into a string.

A fitness function evaluates each individual in the population to assess how fit the solution is. The GA keeps track of n alternative solutions, i.e., individuals, with associated fitness values based on the fitness function (Koo et al., 1990).
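The GA components described above can be sketched in a few lines of Python. This is a generic illustration rather than the algorithm configuration used in this study: the toy fitness function, the real-valued encoding, tournament selection, one-point crossover, Gaussian mutation, elitism, and every parameter value are assumptions. In an HMM setting, the fitness function would instead score a candidate set of model parameters.

```python
import random

def fitness(chromosome):
    # Toy objective (assumption): higher is better, maximum when every gene is 0.5.
    return -sum((gene - 0.5) ** 2 for gene in chromosome)

def tournament(population, rng, k=3):
    """Select the fittest of k randomly drawn individuals."""
    return max(rng.sample(population, k), key=fitness)

def crossover(parent_a, parent_b, rng):
    """One-point crossover: head of parent_a joined to tail of parent_b."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]

def mutate(chromosome, rng, rate=0.1, sigma=0.1):
    """Perturb each gene with probability `rate`, clamping the result to [0, 1]."""
    return [
        min(1.0, max(0.0, gene + rng.gauss(0, sigma))) if rng.random() < rate else gene
        for gene in chromosome
    ]

def run_ga(n_individuals=20, n_genes=4, generations=50, seed=0):
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(n_genes)] for _ in range(n_individuals)]
    history = [max(map(fitness, population))]  # best fitness per generation
    for _ in range(generations):
        elite = max(population, key=fitness)  # elitism: carry the best individual over
        children = [elite]
        while len(children) < n_individuals:
            child = crossover(tournament(population, rng), tournament(population, rng), rng)
            children.append(mutate(child, rng))
        population = children
        history.append(max(map(fitness, population)))
    return max(population, key=fitness), history

best, history = run_ga()
print(best, history[0], history[-1])
```

Because the best individual is copied unchanged into each new generation, the best fitness in `history` can never decrease, which makes the search easy to monitor.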

1.2 Motivation

Many studies have been conducted to improve the accuracy of fuzzy time series forecasting models (Uslu et al., 2013; Bas et al., 2014; Zhang et al., 2013).

Many studies on enhancing the accuracy of FTS models using artificial intelligence optimisation algorithms have also been undertaken (Yolcu, 2014; Aladag et al., 2013; Haneen et al., 2014).

However, the challenge of adequately capturing the fuzzy relationships, and thereby improving the model’s forecasting accuracy, remains.

1.3 Statement Of Problem

Fuzzy time series methods are excellent time series forecasting approaches. Since its inception, the study of fuzzy time series (FTS) has gained popularity due to its ability to deal with the uncertainty and vagueness that are frequently inherent in real-world data due to measurement inaccuracies, incomplete sets of observations, or difficulties in obtaining measurements under uncertain conditions.

Forecasting relies heavily on the modelling of fuzzy relations derived from a fuzzy time series. Fuzzy logic relationship group (FLRG) tables have generally been preferred for determining fuzzy logic relationships in the analysis of time-invariant fuzzy time series.

The reason for this is that no complex matrix operations are required when these tables are employed. When FLRG tables are used, however, fuzzy set membership values are ignored: contrary to fuzzy set theory, only the elements of fuzzy sets with the highest membership value are considered.

This results in information loss, which reduces the model’s forecasting accuracy. In addition, the approach is prone to rule redundancy and processing overhead.

As a result, regardless of the non-linear character of the fuzzy time series data, there is a need for a strategy that can capture the relationships more accurately. Furthermore, the intrinsic uncertainty of time evolution makes state transitions in a system probabilistic.

To address the constraints of existing FTS models, a forecasting model based on the Hidden Markov Model (HMM) for fuzzy time series was used to realise the probabilistic state transition.

Relationship (parameter) estimation for an HMM is often conducted using iterative schemes that are well-defined yet prone to becoming trapped in local optima. Because of their ability to handle nonlinear interactions, genetic algorithms (GA) have gained popularity.

To improve the relationship representation, a GA-HMM based model was used to accurately capture the relationships and hence increase the model’s forecasting accuracy.

1.4 Significance Of Research

The purpose of this study is to create an improved Hidden Markov Model based fuzzy time series forecasting model that can increase forecasting accuracy by effectively estimating the fuzzy relationships that exist between the states of historical time series data.

1.5 Aims And Objectives

The goal of this study is to create an improved HMM-based FTS forecasting model utilising the Genetic Algorithm.

The following objectives were used to achieve the stated goal:

a) Construction of an HMM-based FTS model using the Baum-Welch estimation approach.

b) Development of an improved HMM-based FTS model by using GA to optimise the model parameters.

c) Model validation using bivariate benchmark FTS data of Taipei’s daily average temperature and cloud density, and comparison of the findings with those obtained by Li & Cheng (2012), using the MSE and AFEP performance measures.

d) Testing of the developed model on ABU Zaria’s Internet traffic data.

1.6 Methodology

The following highlights the approach used in this study to construct an improved hidden Markov model-based fuzzy time series forecasting model utilising a genetic algorithm.

a) The conventional HMM-based FTS forecasting model was developed using the relative frequency Baum-Welch estimation approach.

b) Re-estimation of model parameters using GA to improve the created HMM-based FTS model.

c) Model validation using bivariate benchmark FTS data of Taipei’s daily average temperature and cloud density, and comparison of the findings with those obtained by Li & Cheng (2012), using the MSE and AFEP performance measures.

d) Using the constructed model to forecast short-term Internet traffic data of ABU Zaria from February 29th to March 31st, 2016, collected from the ABU Zaria data centre, and evaluating its performance.
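The two performance measures named in step (c) can be sketched as follows. MSE is the standard mean squared error. AFEP is assumed here to denote an average forecasting error percentage, computed as the mean of the absolute errors relative to the actual values and expressed in percent; that definition is an assumption of this sketch and should be checked against the formula given in the methodology chapter. The sample values are illustrative only.

```python
def mse(actual, forecast):
    """Mean squared error between actual and forecast values."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and of equal length")
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)

def afep(actual, forecast):
    """Average forecasting error percentage (assumed definition).

    Mean of |actual - forecast| / |actual|, in percent; actual values must be non-zero.
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and of equal length")
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)

# Illustrative values only, not data from this study.
actual = [20.0, 25.0, 40.0]
forecast = [22.0, 24.0, 40.0]
print(mse(actual, forecast), afep(actual, forecast))
```

Lower values of both measures indicate forecasts that track the actual series more closely, which makes the two models directly comparable on the same data.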

1.7 Dissertation Organisation

The general introduction has been given in Chapter One. The remaining chapters are organised as follows: Chapter Two presents a detailed review of related literature and of the relevant core concepts, namely time series, fuzzy time series forecasting, the Markov Model (MM), Hidden Markov Models (HMM), and Genetic Algorithms (GA).

Chapter Three presents the detailed methodology and the essential mathematical models underlying the construction of the improved Hidden Markov Model based fuzzy time series forecasting model using a genetic algorithm.

Chapter Four presents the analysis, performance evaluation, and discussion of the results. Finally, Chapter Five provides the conclusion and recommendations for further work. The appendices provide the complete MATLAB routines.
