Enhancing Financial Decision Making Using Multi-Objective Financial Genetic Programming

Jin Li, Member, IEEE, Sope Taiwo

Abstract— This paper presents a multi-objective genetic programming based financial forecasting system, MOFGP. MOFGP is built upon our previous decision-making tool, FGP (Financial Genetic Programming) [1]-[5]. By taking advantage of the techniques of multi-objective evolutionary algorithms (MOEAs), MOFGP enhances FGP in a number of ways. Firstly, MOFGP is faster in obtaining the same quantity of diverse forecasting models optimized with respect to multiple conflicting objectives. This is attributed to the inherent property of MOEAs, i.e., a set of Pareto front solutions can be obtained in a single execution of its algorithm. Secondly, MOFGP is friendlier and simpler from the user’s perspective. It is friendlier because it eliminates a number of user-supplied parameters previously required by FGP. Consequently, it becomes simpler as the user no longer needs to have a priori domain knowledge required for the proper use of those parameters. Finally, compared with FGP, which exploits a canonical single-objective approach to tackle a multi-criterion financial forecasting problem, MOFGP demonstrates the above advantages without seriously sacrificing its forecasting performance, although it suffers from an inadequate generalization capability over the test data in this study. Given its strengths and weaknesses, MOFGP could be employed as a useful starting investigative tool for financial decision making.

I. INTRODUCTION

Many decision-making problems in financial forecasting are considered to be hard. Apart from the occurrence of unpredictable cataclysmic events such as World Wars I and II, or unexpected company scandals such as those of WorldCom and Enron, there are a number of other reasons for this. Firstly, a typical forecasting problem usually involves a large number of factors, many of which interact with each other. For example, the future movement of a company's share price may be affected by fundamental elements such as current economic conditions, reflected possibly in the current interest rate, and the performance of the company itself, reflected in the price/earnings ratio, net profit or gross profit per share. Future price movements may also be influenced by the share's past trading history, reflected in past trends of its closing prices and trading volumes over a certain period. To make a

Manuscript submitted January 31, 2006. This work was supported in part by the Advantage of West Midlands. Jin Li is with the Centre of Excellence for Research in Computational Intelligence and Applications (CERCIA), the School of Computer Science, the University of Birmingham, Edgbaston, Birmingham B15 2TT, UK (phone: 0044 121 4145142; fax: 0044 121 4142799; e-mail: j.li@ cs.bham.ac.uk). Sope Taiwo is with the School of Computer Science, the University of Birmingham, Edgbaston, Birmingham B15 2TT, UK (e-mail: [email protected]).

prediction, one usually has to identify the most pertinent influencing factors by searching through a potentially huge and complex search space constituted by those interacting factors. This searching task is clearly non-trivial. Secondly, many real-world financial forecasting problems require handling multiple, often conflicting criteria or objectives, rather than merely a single objective. Thus, it is quite common for practitioners in finance to seek assistance from forecasting models in making their investment decisions. Since significantly more accurate forecasts are usually very hard, if not impossible, to achieve, practitioners tend to seek improvements in alternative properties of the forecast models, and consequently they turn to alternative models that target the corresponding, sometimes conflicting, performance objectives. For example, the objective of one model might be to reduce the risk of investment failure, whilst that of another might be to avoid missing investment opportunities. For a practitioner interested in both objectives, the trade-off in performance is such that neither model can be considered strictly “better” than the other. However, different practitioners, who may have different trading attitudes, will have a specific preference for either model or for one that lies somewhere in between. The former might be of more interest to conservative, risk-averse traders, whereas the latter might be of value to more aggressive traders. Therefore, for such a multi-objective forecasting problem, the ultimate goal is usually to find a set of optimal trade-off models, also known as Pareto optimal front solutions: the set of models in which the performance of a model in one objective cannot be improved without sacrificing its performance in another objective.
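The notion of a Pareto optimal front can be made concrete with a small sketch. The code below is only an illustration, not part of the system described in this paper: it assumes each model is summarized by two error rates to be minimized, and the function names and the numeric values are invented for the example.

```python
from typing import List, Tuple

# A model is summarized by two objectives, both to be minimized.
# (Illustrative assumption: the pair is read as (risk of failure, missed chances).)
Model = Tuple[float, float]


def dominates(a: Model, b: Model) -> bool:
    """True if a is no worse than b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(models: List[Model]) -> List[Model]:
    """Keep only the models that are not dominated by any other model."""
    return [m for m in models if not any(dominates(o, m) for o in models)]


# Hypothetical values, invented for illustration only.
candidates = [(0.45, 0.10), (0.30, 0.60), (0.10, 0.95), (0.40, 0.70), (0.45, 0.20)]
front = pareto_front(candidates)
print(front)
```

In this toy set, the last two candidates are each beaten in both objectives by another candidate, so only the first three remain as trade-off solutions.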
The large number of interacting factors and the multiple conflicting objectives involved in a financial forecasting problem generally lead to a huge and complex search space. This makes it extremely difficult to find the Pareto optimal front with optimal trade-offs between objectives. In the past, most traditional heuristic optimization approaches solved this inherently multi-objective problem using single-objective optimization techniques [6]. The most common approach was to transform the multi-objective problem into a single-objective one by aggregating the objectives using weights, thereby enabling the application of all the traditional heuristic optimization methods. The disadvantage of this approach is two-fold. First, setting up weights is a tedious exercise because the results are sensitive to their values. In many cases, it may also require some prior domain knowledge on the part of the user, which restricts the user's ability to apply the method. Secondly, since the approach generates only one solution per execution, the very desirable diverse set of solutions can only be obtained through multiple runs, with no guarantee of success.

Recently, evolutionary algorithms (EAs) have received much attention owing to their intrinsic ability to handle optimization problems with both single and multiple objectives [7]-[9]. Unlike most conventional heuristic approaches, EAs inherently work with a population of solutions, which makes them more likely to carry out a thorough search in a largely unknown and complex solution landscape. Multi-objective evolutionary algorithms (MOEAs) extend EAs naturally by increasing the chance of finding a set of optimal trade-off (or Pareto front) solutions that vary with respect to multiple objectives. Owing to these advantages, recent years have witnessed numerous applications of EAs and MOEAs to financial problems. Whilst a large part of this research applies EAs to financial problems using single-objective techniques, e.g., [10]-[13], only a small part applies MOEAs to multi-objective financial problems, e.g., [15]-[16]. Moreover, while most of the latter address financial portfolio optimization, very few relate specifically to multi-objective financial forecasting.

In this paper, we present an evolutionary multi-objective system for financial forecasting, called MOFGP (Multi-Objective Financial Genetic Programming). MOFGP combines the principle of multi-objective evolutionary algorithms with the genetic programming technique, with the aim of finding a set of diverse forecasting models with varying performance in terms of different prediction criteria. MOFGP is a new extension built on FGP (Financial Genetic Programming), a genetic programming based decision-making tool for financial forecasting [1]-[5].
While predicting whether an index will rise by r% within the next n periods, FGP has been demonstrated to be capable of generating varied models by means of a canonical single-objective genetic programming approach, coupled with a novel constraint parameter incorporated in the fitness function [4], [11]. The objective of this study is to investigate whether MOFGP exhibits any advantages over FGP in handling this naturally multi-objective financial prediction problem.

The rest of this paper is organized as follows. Section II reviews FGP and its way of dealing with a financial forecasting problem with multiple criteria. In Section III, we introduce MOFGP with its architecture and some major components, and present how MOFGP tackles the same forecasting problem based on two conflicting objectives. Experiments and results are reported in Section IV, in comparison with those of FGP. We conclude in Section V with some future work.

II. OVERVIEW OF FGP

In this section, we shall briefly review FGP and its achievements so far. In particular, we shall focus on a

constrained aggregating fitness function, whereby a naturally multi-objective forecasting problem is transformed into a single-objective problem, enabling FGP, in multiple runs, to find a group of diverse forecasting models with respect to two different prediction criteria.

A. History of FGP

FGP [1] is a major implementation of the Evolutionary Dynamic Data Investment Evaluator (EDDIE) [10]-[13], an interactive genetic programming based financial forecasting tool. It aims to help analysts search more efficiently and effectively for promising financial forecasting models in a possibly large and complex space. FGP does this by accepting information about the factors that the user considers most relevant to future share price movements. These factors may be based on fundamental analysis (e.g., price-earnings ratios), technical analysis (e.g., moving averages) or any other relevant information. Taking advantage of genetic programming techniques, FGP then investigates the interaction between these factors in relation to future share price movements and ultimately returns decision trees that are indicative of the association between the user-supplied factors and the future prices. The decision trees generated by FGP are called Genetic Decision Trees (GDTs) and are understandable to human beings. In this way, human expertise is channeled into FGP through the factors/indicators supplied as input, and experts can experiment with a variety of indicators more easily. It is worth pointing out that FGP is an interactive decision support tool designed to aid expert practitioners in making investment decisions, not to supplant human expertise. The forecasting performance of FGP largely depends on the quality of the indicators used. FGP has been applied to a number of financial forecasting problems with demonstrated accuracy [1]. It is capable of improving forecasting accuracy by combining experts’ forecasts from different sources [5].
It is also capable of making fairly accurate predictions for stock indices, individual stocks, and stock index options and futures while tackling a set of prediction tasks: whether an index/stock will rise by r% or more within the next n periods, which is denoted by Pnr (see [1], [4], [11], [12]).

B. Multiple Forecasting Criteria

While tackling the prediction problem Pnr, we have already used a number of criteria to assess forecasting performance in our previous studies [4], [11]. We discuss them as follows. Suppose one is asked to make a forecast on a daily basis for Pnr. Each day can be classified as a positive position, where the target return will be achieved, or a negative position, where the target return will not be achieved. Given the predictions and the actual outcomes, one may construct the contingency table shown in Table I. Several prediction measures are defined as follows. The Rate of Correctness (RC) in a prediction is the number of all

correct predictions over the total number of predictions. The Rate of Failure (RF) is the number of positions that were wrongly predicted positive (FP) over the number of positive predictions (N+). The precision is 1 - RF, i.e., the proportion of positive predictions that were correct. The Rate of Missing Chances (RMC) is the number of positions wrongly predicted negative (FN) over the number of actual positives (O+):

$$RC = \frac{TP + TN}{O_{+} + O_{-}} = \frac{TP + TN}{N_{+} + N_{-}}; \qquad RF = 1 - \text{precision} = \frac{FP}{N_{+}}; \qquad RMC = \frac{FN}{O_{+}}$$

TABLE I
A CONTINGENCY TABLE FOR A TWO-CLASS CLASSIFICATION PREDICTION PROBLEM

                     Predicted negative    Predicted positive    Actual
Actual negative      TN                    FP                    O- = TN+FP
Actual positive      FN                    TP                    O+ = FN+TP
Predicted            N- = TN+FN            N+ = FP+TP            T = N+ + N-

TN, FP, FN and TP are the numbers of true negative, false positive, false negative and true positive positions. N- and N+ are the numbers of negative and positive positions predicted, O- and O+ are the actual numbers of negative and positive positions, and T is the total number of predictions.

RC is a major criterion for any prediction system. Ideally, one would like RC to be 1, i.e., 100% predictive accuracy. How closely RC can approach 1 is restricted by the quality of the factors considered (e.g., how relevant the input variables are to the forecast), as well as by the capability of the forecasting algorithm. In reality, a predictive accuracy beyond a certain level is impossible to achieve. Therefore, one tends to seek improvement in two alternative measures, RF and RMC. Both are also good indicators of forecasting performance from the perspective of an investor. A positive prediction here may mean an investment. If it is wrong, the investor will not achieve the desired return, and such a mistake could be costly. Therefore, reducing FP (or RF) would be highly desirable to an investor who does not want to take risks. On the other hand, RF can trivially be reduced to 0 if the system makes no positive predictions. This means that all available opportunities are missed, resulting in poor RMC performance. This is not acceptable to an investor who does not want to miss the few precious chances available during a certain short period. There is clearly a trade-off between RF and RMC.

In summary, for Pnr, the multiple objectives are to maximize RC while minimizing both RF and RMC. However, in reality, given a certain level of RC, a predictive model that performs better regarding RF usually performs worse regarding RMC, and vice versa. RF and RMC are therefore considered to be conflicting objectives in general.

C. A constrained aggregating fitness function

Given the multiple criteria discussed above, Pnr is naturally a multi-objective forecasting optimization problem. The ultimate goal of a forecasting system is to provide investors with a set of diverse near-optimal forecasting models that vary with regard to the different criteria. This allows investors to choose an appropriate model, based on their preferences, to assist in decision making. Traditional methods for obtaining a set of optimal solutions aggregate the objectives into a single objective, possibly subject to some constraints. Typical techniques include the weighting method and the constraint method [14].

Our previous studies [1], [4] combine both traditional approaches by introducing a constraint-driven linear objective fitness function to FGP, given by

$$f = w_{rc} \cdot RC - w_{rmc} \cdot RMC - w_{rf} \cdot RF \qquad (1)$$

It involves three performance measures, i.e., RC, RMC and RF, each of which is assigned its own weight, with 0 ≤ w_rc, w_rmc, w_rf ≤ 1. The goodness of a forecasting model is thus no longer assessed by one criterion alone, but by a composite value, the weighted sum of its three performance rates. By adjusting the three weights, one is in general able to place more emphasis on one criterion than on the others. For example, a model with a low RF is almost always a favorite choice for most investors. One possibility for achieving a low RF is to assign a high value to w_rf (e.g., 0.62) to penalize models with poor RF performance, to set w_rc to 1 to reward models with high RC performance, and to fix w_rmc at 0 so that RMC performance is not taken into account. Although our previous experiments [4] demonstrated that the linear aggregating objective function did work occasionally, it lacks robustness in achieving a low RF. To overcome this weakness, we introduced an additional parameter into f as a constraint, R = [Pmin, Pmax], which defines the minimum and maximum percentage of positive predictions that we instruct FGP to make on the training data (as with most machine learning methods, the assumption is that the test data exhibits similar characteristics). The efficacy of this constraint-based linear fitness function was reported in our previous studies [1], [4]. With varied non-overlapping constraint choices ranging from the looser [50%, 65%] to the tighter [5%, 10%], the function f effectively guides FGP to find forecasting models with gradually improved, i.e., lower, RF.
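To make the measures of Table I and the aggregated fitness of Equation (1) concrete, the following sketch computes RC, RF and RMC from contingency counts and then evaluates f. It is only an illustration: the text above does not spell out how the constraint R enters f, so the rule used here (dropping the RC reward when the share of positive predictions falls outside [Pmin, Pmax]) is an assumption made for the example, as are the function names and the counts.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Contingency:
    """Counts from the two-class contingency table (Table I)."""
    tn: int
    fp: int
    fn: int
    tp: int


def rates(c: Contingency) -> Tuple[float, float, float]:
    """Return (RC, RF, RMC) as defined in Section II-B."""
    total = c.tn + c.fp + c.fn + c.tp
    n_pos = c.fp + c.tp   # N+: positions predicted positive
    o_pos = c.fn + c.tp   # O+: positions actually positive
    rc = (c.tp + c.tn) / total
    rf = c.fp / n_pos if n_pos else 0.0   # no positive predictions -> RF = 0
    rmc = c.fn / o_pos if o_pos else 0.0
    return rc, rf, rmc


def constrained_fitness(c: Contingency, w_rc: float = 1.0, w_rmc: float = 0.0,
                        w_rf: float = 0.62, p_min: float = 0.05,
                        p_max: float = 0.10) -> float:
    """Equation (1) with the constraint R = [p_min, p_max].

    ASSUMPTION: a model whose share of positive predictions lies outside R
    loses its RC reward (w_rc is set to 0). This is one plausible reading,
    not a rule stated in the text.
    """
    rc, rf, rmc = rates(c)
    share_positive = (c.fp + c.tp) / (c.tn + c.fp + c.fn + c.tp)
    if not p_min <= share_positive <= p_max:
        w_rc = 0.0
    return w_rc * rc - w_rmc * rmc - w_rf * rf


# Hypothetical counts over 1000 days, invented for illustration only.
inside = Contingency(tn=500, fp=30, fn=420, tp=50)     # 8% positive predictions
outside = Contingency(tn=300, fp=230, fn=200, tp=270)  # 50% positive predictions
print(constrained_fitness(inside), constrained_fitness(outside))
```

The default weights and the [5%, 10%] range are the example values quoted in the text; everything else is a placeholder.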

Although our hybrid approach generates forecasting models more robustly than the same function without the constraint, it has some disadvantages in handling a naturally multi-objective optimization problem like Pnr. Firstly, setting up three appropriate weights is non-trivial, as the performance of FGP is sensitive to the three values. Secondly, deciding which constraint to use and choosing a value for the constraint R are both challenging, as they may require the user to have some prior knowledge about the problem itself and about the effect the constraint might have on the performance of FGP. Finally, this method requires multiple program runs to obtain a set of near-optimal forecasting models diversified with regard to RF and RMC, which proved to be time-consuming.

III. MULTI-OBJECTIVE FGP

A. Modifying the fitness function

The first step taken towards addressing the shortcomings of FGP described above is to reconsider the use of a linear objective fitness function and its underlying assumptions. We now assume that the solution surface and efficient frontier defined by the error measures are unknown to the user at the outset. Consequently, given the contingency table in Table I, we represent the multiple objectives by defining two simple fitness functions as follows:

$$f_{rf} = RF \qquad (2)$$

$$f_{rmc} = RMC \qquad (3)$$

The first objective is to minimize RF, whilst the second is to minimize RMC. The two objectives often conflict with each other in practice, though, in theory, both measures could approach 0 if a perfect prediction (i.e., 100% forecasting accuracy) were achievable. There is generally a trade-off between the two objectives, i.e., a lower RF may be available at the cost of a higher RMC, or vice versa. The goal of MOFGP is to find a set of optimal or near-optimal Pareto-front forecasting models in terms of the two objectives.

B. Solution Representation in MOFGP

As with FGP, a candidate solution is represented by a genetic decision tree (GDT). Its basic elements are rules and forecast values. Each rule consists of a technical indicator, which is a simple derivative of the time series, a relational operator such as “greater than” or “less than”, and a real-valued threshold. Interaction between single rules is realized through logic operators such as “Or”, “And”, “If-Then-Else” and “Not”. We use the same 3 types of technical analysis rules adopted in our previous work, namely Moving Averages, Filter Rules and Trade Range Break Rules (for more details of the rules, readers can refer to [4]). A simplistic example of a GDT is shown below, where a “Positive” prediction means that the goal can be achieved and “Negative” means otherwise.

(IF (MV_50 < -18.45) THEN Positive
 ELSE (IF ((TRB_5 > -19.48) AND (Filter_63 < 36.24)) THEN Negative
       ELSE Positive))

C. Architecture of MOFGP

The idea behind the MOFGP framework is to apply the technology to what it does well and to allow human expertise to do what technology cannot. With this in mind, the forecasting problem is structured as an iterative combination of multi-objective searching and interactive “conversation” with the user, in which he/she refines his/her preferences. The search algorithm is only concerned with giving the user a flavor of the optimal trade-off surface formed by the current user-defined objectives, by generating a set of diverse Pareto-optimal solutions. The idea is then to allow the user to analyze the attributes of the suggested solutions and decide which subrange of the optimal surface to explore further via a constraint-driven secondary search, which may be multi-objective or single-objective. In effect, the users steer the search process in directions that are of interest, while MOFGP provides the engine that transports them to their favored destinations. But because the user is at first unfamiliar with the solution terrain, he/she must be provided with an outline of the solution landscape, which is the job of the search engine. The overall architecture of MOFGP is depicted in Fig. 1. At the time of writing, both the steering interface and the secondary search optimization algorithm for targeting specific areas were still under development. The focus of this study is the performance of the search engine in its task of delineating the solution landscape.

Fig. 1. Block Diagram of MOFGP Architecture. The focus of this study is only on the MOFGP search engine and its benefits. The Steering Interface is currently still in development.
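As a hedged illustration of the GDT representation described in Section III-B, the sketch below encodes a decision tree as nested tuples and evaluates it on a dictionary of indicator values. The tuple encoding, the function names and the indicator values are assumptions made for this example and are not taken from FGP or MOFGP. The tree mirrors the example GDT, with its AND condition read as grouping the two inner rules.

```python
import operator
from typing import Mapping

# Relational operators allowed in a rule.
OPS = {"<": operator.lt, ">": operator.gt}


def holds(cond, indicators: Mapping[str, float]) -> bool:
    """Evaluate a condition: a single rule or a logical combination of conditions."""
    kind = cond[0]
    if kind == "rule":                       # ("rule", indicator, op, threshold)
        _, name, op, threshold = cond
        return OPS[op](indicators[name], threshold)
    if kind == "and":
        return all(holds(c, indicators) for c in cond[1:])
    if kind == "or":
        return any(holds(c, indicators) for c in cond[1:])
    if kind == "not":
        return not holds(cond[1], indicators)
    raise ValueError(f"unknown condition type: {kind}")


def predict(tree, indicators: Mapping[str, float]) -> str:
    """Walk ("if", cond, then_branch, else_branch) nodes until a leaf label is reached."""
    while not isinstance(tree, str):
        _, cond, then_branch, else_branch = tree
        tree = then_branch if holds(cond, indicators) else else_branch
    return tree


# The example GDT of Section III-B in the assumed tuple encoding.
EXAMPLE_GDT = (
    "if", ("rule", "MV_50", "<", -18.45), "Positive",
    ("if", ("and", ("rule", "TRB_5", ">", -19.48), ("rule", "Filter_63", "<", 36.24)),
     "Negative", "Positive"),
)

# Hypothetical indicator values for one trading day.
print(predict(EXAMPLE_GDT, {"MV_50": 0.0, "TRB_5": 0.0, "Filter_63": 10.0}))
```

A tree of this kind stays readable to a human analyst, which is the property the GDT representation is meant to preserve.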

D. The Multi-objective Search Engine

NSGA-II (Non-dominated Sorting Genetic Algorithm) [20], NPGA (Niched Pareto Genetic Algorithm) [21], and SPEA2 (Strength Pareto Evolutionary Algorithm) [22] are some typical examples of the many standardized state-of-the-art MOEAs available in the literature to choose from when building a multi-objective search engine. Outlined below are the key considerations we took into account when deciding which of these standardized algorithms was best suited to form the basis of our particular application.

1) Diversity of solutions: a good spread of Pareto optimal solutions is vital in order to give the users a clear picture of the trade-off surface and enable them to steer the search capably.
2) Quality of solutions: true optimality of the solutions produced.
3) A reasonable speed of convergence.
4) Ability to address large-scale problems.
5) Flexibility: not too sensitive to parameter settings.

Based on these factors, SPEA2 was selected to serve as the platform for the multi-objective search engine of MOFGP. The distinctive characteristics of SPEA2 were its ability to hang on to boundary solutions during each execution and to return sets of solutions with a consistently good level of diversity. The flow diagram of the multi-objective search engine of the resultant MOFGP is depicted in Fig. 2. Since the main features under consideration are the fitness assignment method and the associated selection processes, these are the aspects of FGP that are modified to obtain MOFGP.

E. Handling Overfitting using Validating Data

Our preliminary testing runs found that the performance of the search engine suffered from the problem of overfitting, i.e., the performance of solutions on the test data is not as good as that on the training data. This lack of generalization capability is a commonly encountered data-mining problem. To rectify this, we utilized a validation data set split from the training data. Here the training data is subdivided into two sets with an equal number of data points. During training, each candidate GDT in the population is evaluated with respect to both subsets, yielding two pairs of fitness values (RF1, RMC1 and RF2, RMC2). The worse RF and the worse RMC values are then selected to become the fitness of that individual. Thus, a generated model is not overfitted to either data set. In this way, a slightly improved generalization performance was observed in our experiments.
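The worst-case evaluation over two halves of the training data can be sketched as follows. This is a minimal illustration under stated assumptions: the text does not say how the data is divided, so the split into a first and a second half is assumed here, and the function names and the toy data are invented for the example.

```python
from typing import Sequence, Tuple


def rf_rmc(pred: Sequence[bool], actual: Sequence[bool]) -> Tuple[float, float]:
    """RF = FP / N+ and RMC = FN / O+ for one data subset."""
    tp = sum(p and a for p, a in zip(pred, actual))
    fp = sum(p and not a for p, a in zip(pred, actual))
    fn = sum(a and not p for p, a in zip(pred, actual))
    rf = fp / (fp + tp) if fp + tp else 0.0
    rmc = fn / (fn + tp) if fn + tp else 0.0
    return rf, rmc


def worst_case_fitness(pred: Sequence[bool], actual: Sequence[bool]) -> Tuple[float, float]:
    """Evaluate a model on two equal halves and keep the worse value of each objective.

    ASSUMPTION: the halves are the first and second half of the sequence.
    Both objectives are minimized, so "worse" means larger.
    """
    half = len(pred) // 2
    rf1, rmc1 = rf_rmc(pred[:half], actual[:half])
    rf2, rmc2 = rf_rmc(pred[half:], actual[half:])
    return max(rf1, rf2), max(rmc1, rmc2)


# Toy predictions and outcomes for eight days, invented for illustration only.
predictions = [True, True, False, False, True, False, True, True]
outcomes = [True, False, True, False, True, True, False, False]
print(worst_case_fitness(predictions, outcomes))
```

Taking the worse value per objective means a model can only score well if it performs acceptably on both halves, which is the intuition behind the validation scheme described above.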

Fig. 2. The algorithm of the search engine part of MOFGP. It is based on SPEA2.

IV. EXPERIMENTS AND RESULTS

The major goal of the experiments is to investigate what impact the integration of multi-objective functionality into FGP has on its forecasting performance. More specifically, we would like to assess the expected major benefit of MOFGP, namely, its capability of finding the true optimal trade-off Pareto front of forecast models in a single execution. We paid special attention to the spread of the optimal solutions generated in terms of RF and RMC, as this is indicative of MOFGP’s ability to comprehensively outline the options available to the user.

For the purpose of comparing MOFGP with FGP, the prediction task is the same one addressed by FGP in [4], which is to classify each trading day as a positive or negative position depending on whether an index/stock will rise by 2.2% or more within the next 21 days. We take the same DJIA closing index from 07/04/1969 to 09/04/1981 (3035 trading days) used by FGP. MOFGP was run on a Pentium PC (200MHz) using a population size of 2500, with a termination condition of 30 generations or 2 hours, whichever came first. SPEA2 is an elitist MOEA in the sense that an archive of the best overall solutions across all generations is preserved throughout each run. The archive size for MOFGP is set to 10 in our experiments. Operating parameters for crossover and mutation remain the same. A crossover rate of 90%, a reproduction rate of 10% and a mutation rate of 1% are used.
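For readers unfamiliar with how SPEA2 decides which solutions deserve a place in its archive, the sketch below computes the strength and raw fitness values of the published SPEA2 algorithm for a set of (RF, RMC) points. It is a simplified illustration rather than the MOFGP implementation: the density estimate and the archive truncation step of SPEA2 are omitted, and the function names and the points are invented for the example.

```python
from typing import List, Sequence, Tuple

Objectives = Tuple[float, float]  # (RF, RMC), both minimized


def dominates(a: Objectives, b: Objectives) -> bool:
    """a dominates b: no worse in every objective, strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def spea2_raw_fitness(points: Sequence[Objectives]) -> List[int]:
    """Raw fitness R(i): sum of the strengths of all points that dominate i.

    The strength S(j) is the number of points that j dominates.
    Non-dominated points receive a raw fitness of 0 (lower is better).
    """
    n = len(points)
    strength = [sum(dominates(points[i], points[j]) for j in range(n)) for i in range(n)]
    return [
        sum(strength[j] for j in range(n) if dominates(points[j], points[i]))
        for i in range(n)
    ]


# Hypothetical (RF, RMC) values for population plus archive, invented for illustration.
points = [(0.45, 0.10), (0.30, 0.60), (0.10, 0.95),
          (0.40, 0.70), (0.45, 0.20), (0.50, 0.80)]
raw = spea2_raw_fitness(points)
archive_candidates = [p for p, r in zip(points, raw) if r == 0]
print(raw, archive_candidates)
```

In full SPEA2, a nearest-neighbor density term is added to the raw fitness, and the archive is truncated or filled to its fixed size (10 in the experiments above).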

TABLE II
MEAN PERFORMANCE OF MOFGP IN COMPARISON TO FGP

Panel A: mean results of MOFGP on the training data. Panel B: mean results of MOFGP on the test data.

                 Panel A (training data)                 Panel B (test data)
R[Pmin,Pmax]     RF       RMC      Average # of models   RF       RMC      RC       # of PPP
[80, 100]        47.1%    6.6%     1.9                   47.8%    7.9%     51.9%    1043.6
[65, 80]         42.4%    23.2%    0.9                   46.6%    31.1%    52.3%    764.5
[50, 65]         39.5%    36.7%    1.1                   43.6%    46.4%    54.1%    563.7
[35, 50]         35.2%    51.5%    1.7                   42.0%    63.2%    53.1%    375.2
[20, 35]         30.8%    66.0%    1.2                   39.9%    76.9%    51.8%    229.1
[15, 20]         26.1%    76.5%    0.7                   39.1%    88.2%    50.0%    115.0
[10, 15]         23.1%    84.3%    0.9                   34.8%    92.4%    49.6%    69.6
[5, 10]          17.8%    92.3%    0.4                   23.7%    97.0%    48.8%    24.5
[0, 5]           6.2%     99.0%    2.2                   5.0%     99.6%    48.0%    3.4

MOFGP was run 10 times. We recorded the performance of the 10 non-dominated Pareto-front solutions in the archive for each run, so the performance of a total of 100 forecasting models (10 x 10) is reported in this paper. For the purpose of comparison with FGP, each model generated by MOFGP is allocated to one of 9 subranges based on the percentage of positions that it predicts as positive. In order of increasing tightness, the 9 subranges are [80, 100], [65, 80], [50, 65], [35, 50], [20, 35], [15, 20], [10, 15], [5, 10], and [0, 5]. All figures in brackets are percentages (%). It is worth pointing out that six of these subranges (i.e., [50, 65], [35, 50], [20, 35], [15, 20], [10, 15], and [5, 10]) were used in the preceding FGP study, as only these tighter constraints can effectively guide FGP to find models with increasingly lower rates of failure (RFs), as described in [4].

Table II gives a breakdown of the distribution of the resultant models from MOFGP over the 9 subranges. Panels A and B describe the performance of MOFGP on the training and test data respectively. Panel C provides the performance results of FGP in [4] on the same test data for comparison. The third column in Panel A gives the average number of models in the archive that fall into each subrange, based on the 10 runs over the training data. For example, MOFGP returns an average of 1.1 (out of 10) models per run that fall into the [50%, 65%] subrange. Over the 10 runs, no run leaves more than two subranges empty (i.e., with zero models returned), and only once do more than 2 models appear within a single subrange (the [5%, 10%] subrange in the 3rd run). These results demonstrate that, thanks to the SPEA2-based search engine, MOFGP can consistently generate a diverse range of models, which collectively cover the spectrum of Pareto optimal solutions available in the solution space.
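The allocation of archive models to subranges can be sketched as below. The text does not say how a model lying exactly on a shared boundary (e.g., 65%) is assigned, so this example assumes it goes to the looser of the two subranges. The function name and the percentages are likewise invented for illustration.

```python
from collections import Counter
from typing import List, Tuple

# The 9 subranges of Table II, in order of increasing tightness (percent).
SUBRANGES: List[Tuple[float, float]] = [
    (80, 100), (65, 80), (50, 65), (35, 50), (20, 35),
    (15, 20), (10, 15), (5, 10), (0, 5),
]


def subrange_of(pct_positive: float) -> Tuple[float, float]:
    """Return the subrange containing the percentage of positive predictions.

    ASSUMPTION: a value on a shared boundary goes to the looser subrange,
    because the subranges are scanned from loosest to tightest.
    """
    for low, high in SUBRANGES:
        if low <= pct_positive <= high:
            return (low, high)
    raise ValueError(f"percentage out of range: {pct_positive}")


# Hypothetical percentages of positive predictions for the 10 archive models of one run.
archive_pcts = [92.0, 71.5, 57.0, 41.0, 33.0, 18.0, 12.5, 7.0, 3.0, 1.5]
counts = Counter(subrange_of(p) for p in archive_pcts)
print(counts)
```

Averaging such per-run counts over the 10 runs would give figures of the kind reported in the third column of Panel A.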
The first two columns of Panel A give the RF and RMC performance of the models generated by MOFGP on the training data. Once again, the evenness of the spread/diversity of solutions is apparent. It is worth pointing out that the full spectrum of RMC values lies between 0 and 100%, whereas only a 0-47% RF range is possible, given the actual number of positive positions within the training data used and the solution landscape defined by the limited number of technical analysis rules functioning as building blocks for this study.

TABLE II (continued)
Panel C: mean results of FGP on the test data.

R[Pmin,Pmax]     RF       RMC      RC       # of PPP
[50, 65]         46.7%    45.5%    51.3%    606.2
[35, 50]         40.1%    65.7%    53.7%    338.9
[20, 35]         36.0%    75.3%    53.4%    229.8
[15, 20]         31.0%    85.7%    51.7%    125.1
[10, 15]         28.6%    94.1%    49.7%    49.3
[5, 10]          13.5%    99.1%    48.2%    6.2

[Figure: “Mean Performance (RMC vs RF) Comparison of MOFGP against FGP”. Horizontal axis: RMC, 0% to 100%. Vertical axis: RF, 0% to 50%. Series: MOFGP results on training data, MOFGP results on test data, FGP results on test data.]

Fig. 3: Plot of the performance of the archive of models generated by MOFGP on the training and the test data compared to results of FGP models (represented by triangular data points on the graph)

The average performance results of MOFGP on the test data are given in Panel B of Table II. The results show that MOFGP is capable of producing models with gradually decreasing RFs and correspondingly increasing RMCs, spread over the 9 subranges from top to bottom. For example, models in the loosest range [80%, 100%] have, on average, the worst RF (47.8%) but the best RMC (7.9%), whereas models in the tightest range [0%, 5%] have, on average, the best RF (5.0%) but the worst RMC (99.6%). Models in the middle ranges have intermediate performance in RF and RMC. These results confirm that RF and RMC conflict with each other and that there are trade-offs between the two objectives. The beauty of MOFGP lies in its capability of finding these trade-offs in single runs, enabling the user to choose among them according to his/her preference.

A direct comparison can be made with the corresponding average performance results of FGP within 6 subranges, given in Panel C. In most constraint ranges, FGP achieves, on average, slightly better results (lower RF and lower RMC for each subrange) than MOFGP, although for a few subranges MOFGP outperforms FGP. This is understandable, given that FGP dedicates more computational time and resources to converging on distinctly optimal solutions within a reduced search space. It is worth noting that these diversified models were generated by FGP only one at a time, through multiple runs using different constraint parameters (i.e., the subranges here), as explained earlier in Section II. In terms of the number of positive positions predicted (# of PPP, see the fourth columns of Panel B and Panel C), MOFGP shows no difference from FGP on the test data. Interestingly, a t-test on the mean difference in RF for each subrange indicates that there is no statistically significant difference between MOFGP and FGP, except for the range [15%, 20%]. We could therefore argue that MOFGP can quickly produce a spectrum of diverse models in one run without sacrificing much forecasting performance regarding RF and RMC. A plot of the performance of MOFGP on the training and the test data, along with the corresponding FGP performance, is given in Fig. 3. It is noted that although the employed validation strategy reduces the deviation between MOFGP's performance on the training and the test data, it does not eliminate it completely. Future efforts will be committed to improving this generalization aspect of MOFGP's performance.

V. DISCUSSIONS AND CONCLUSIONS

A. Discussion
The experimental results suggest that MOFGP has several advantages over FGP. Firstly, MOFGP is faster than FGP in generating the same quantity of diverse solutions. MOFGP can generate a set of diverse optimal trade-off solutions in a single execution of the algorithm, whereas FGP can only produce one solution at a time. Hence, to obtain a set of n diverse solutions, MOFGP needs one time unit, whereas FGP requires n time units. This advantage allows an analyst to make a choice immediately afterwards. More importantly, owing to the mechanisms of MOEAs, the resultant models are almost guaranteed to be diversified with respect to the different objectives in a single run. In contrast, FGP offers no such guarantee unless the dedicated constrained fitness function mentioned earlier is used.
Secondly, MOFGP is a simpler and more user-friendly algorithm. It eliminates four user-defined parameters required by FGP, i.e., the three weights and a constraint. This improves the appeal of the system, as the user no longer has to understand the effects of the four parameters on system performance. In other words, the user no longer needs prior knowledge to build a fitness function like f in Equation 1.
Finally, MOFGP gives users flexibility in the decision-making process. Users may make decisions before, during or after an MOFGP run. In contrast, FGP requires users to make decisions in advance, by setting up an investment preference through a weight for each objective and the constraint in a single-objective fitness function. This flexibility is especially useful where domain knowledge is scarce or not available at all.
Despite these advantages, the results of MOFGP in this study also reveal a disadvantage: for some ranges, the models created by MOFGP generalize slightly worse in terms of RF and RMC (see Fig. 3).
Given the major strengths and minor weaknesses of MOFGP, i.e., it is faster, simpler and more flexible in offering diverse models but somewhat lacking in generalization ability, we envisage that MOFGP would be particularly suitable as a starting investigative tool that quickly offers a variety of promising solutions over an unknown space across the users’ interests. Afterwards, the user could commit herself/himself to searching for more promising solutions by focusing on a specific region of high potential that has already been unveiled by MOFGP. Because this specific solution space is usually reduced in size and full of interesting solutions tailored to the users’ demands, any optimization model is more likely to obtain better solutions there, with improved generalization capability.
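The "one time unit versus n time units" argument can be illustrated with a toy sketch. It assumes a fixed pool of candidate models scored on (RF, RMC), both to be minimised. The weighted sum below is only a stand-in for a single-objective fitness and is not Equation 1; the pool values, the weight settings and the function names are all hypothetical.

```python
from typing import List, Tuple

Model = Tuple[float, float]  # (RF, RMC) in percent; lower is better for both

# Made-up candidate models; NOT results from the paper.
POOL: List[Model] = [(5.0, 95.0), (12.0, 60.0), (20.0, 35.0),
                     (22.0, 40.0), (30.0, 10.0), (14.0, 70.0)]


def weighted_pick(pool: List[Model], w_rf: float, w_rmc: float) -> Model:
    """Single-objective stand-in: one 'run' returns the single model that
    minimises a weighted sum of the two objectives."""
    return min(pool, key=lambda m: w_rf * m[0] + w_rmc * m[1])


def non_dominated(pool: List[Model]) -> List[Model]:
    """Multi-objective stand-in: one pass returns every model that no other
    model matches or beats on both objectives."""
    def dominates(a: Model, b: Model) -> bool:
        return a[0] <= b[0] and a[1] <= b[1] and a != b
    return sorted(m for m in pool if not any(dominates(o, m) for o in pool))


# n hand-chosen weight settings mean n separate runs, one model per run.
weight_settings = [(1.0, 0.0), (0.8, 0.2), (0.75, 0.25), (0.0, 1.0)]
single_objective = [weighted_pick(POOL, w_rf, w_rmc)
                    for w_rf, w_rmc in weight_settings]

# A single pass yields the whole set of trade-off models.
trade_offs = non_dominated(POOL)

print(len(weight_settings), "runs:", single_objective)
print("1 pass:", trade_offs)
```

In this toy case the four weight settings happen to recover the same four models as the single non-dominated pass, but each model needed its own run, and the weights had to be chosen by hand to land on distinct trade-offs.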

B. Conclusions
This paper presents a multi-objective genetic programming based system, MOFGP, for financial forecasting. We have applied MOFGP to predict whether a required rate of return can be achieved within a user-specified period, using more than 10 years of DJIA historical data. Experimental results indicate that MOFGP enhances FGP in a number of ways. The beauty of MOFGP lies in its capability of finding a diverse set of optimal or near-optimal Pareto front forecasting models in a single execution of the algorithm. Provided with a holistic picture of the diverse models that can be achieved, the user is in a better position to carry out further analysis focused on specific forecasting models that match the user’s own preferences. In addition, MOFGP is simpler and friendlier from the user’s perspective. It eliminates several parameters required in FGP. As a result, the user does not need any prior knowledge about those parameters. More importantly, MOFGP gains these benefits without sacrificing much forecasting performance in comparison to a dedicated single-objective model, FGP.
As for future work, there are several directions in which we would like to continue this study of MOFGP. Firstly, we would like to gain more knowledge of the characteristics of MOFGP as a multi-objective forecasting system. For example, its convergence, diversity and robustness are worth further investigation. Secondly, the impact of model parameter settings on these properties is also of interest to us. Thirdly, the generalization of MOFGP needs to be improved. Finally, we shall extend the principle of MOFGP to other decision-making application domains where multi-criteria decision making dominates and decision makers prefer to have an overall picture of the optimal trade-off solutions. Examples of such domains include medical diagnosis, fraud detection and network troubleshooting.

REFERENCES

[1] J. Li, “FGP: a Genetic Programming Based Tool for Financial Forecasting,” PhD Thesis, University of Essex, Colchester, Essex, UK, 2001.
[2] J. Li and E. P. K. Tsang, “Improving technical analysis predictions: an application of genetic programming,” in Proceedings of the 12th International Florida AI Research Society Conference, Orlando, Florida, 1999, pp. 108-11.
[3] J. Li and E. P. K. Tsang, “Investment decision making using FGP: a case study,” in Proceedings of the 1999 Congress on Evolutionary Computation, IEEE Press, Washington DC, USA, 1999, pp. 1253-1259.
[4] J. Li and E. P. K. Tsang, “Reducing failures in investment recommendations using Genetic Programming,” in Proceedings of the Sixth International Conference on Computing in Economics and Finance, Society for Computational Economics, Barcelona, 2000.
[5] E. P. K. Tsang and J. Li, “Combining ordinal financial predictions with genetic programming,” in Proceedings of the Second International Conference on Intelligent Data Engineering and Automated Learning (IDEAL-2000), Hong Kong, December 2000, pp. 13-15.
[6] D. B. Fogel and Z. Michalewicz, How to Solve It: Modern Heuristics. Springer, Heidelberg, 2000.
[7] T. Back, D. B. Fogel and Z. Michalewicz, Handbook of Evolutionary Computation. New York: Oxford, 1997.
[8] K. Deb, Multi-Objective Optimization using Evolutionary Algorithms. John Wiley & Sons, Chichester, UK, 2001.
[9] C. A. C. Coello and G. B. Lamont, Applications of Multi-Objective Evolutionary Algorithms. World Scientific, Singapore, 2004.
[10] E. P. K. Tsang, J. Li, and J. M. Butler, “EDDIE beats the bookies,” International Journal of Software, Practice and Experience, Vol. 28 (10), Wiley, 1998, pp. 1033-1043.
[11] E. P. K. Tsang and J. Li, “EDDIE for financial forecasting,” in S-H. Chen (ed.), Genetic Algorithms and Programming in Computational Finance, Kluwer Series in Computational Finance, Chapter 7, 2002, pp. 161-174.
[12] E. P. K. Tsang, J. Li, S. Markose, H. Er, A. Salhi and G. Iori, “EDDIE In Financial Decision Making,” Journal of Management and Economics, Vol. 4, No. 4, November 2000.
[13] E. P. K. Tsang, P. Yung, and J. Li, “EDDIE-automation, a decision support tool for financial forecasting,” Journal of Decision Support Systems 37, 2004, pp. 559-565.
[14] J. L. Cohon, Multiobjective Programming and Planning. New York: Academic Press, 1978.
[15] S-H. Chen (ed.), Evolutionary Computation in Economics and Finance. Springer, Heidelberg, 2002.
[16] S-H. Chen (ed.), Genetic Algorithms and Genetic Programming in Computational Finance, Kluwer Series in Computational Finance, 2002.
[17] F. Schlottmann and D. Seese, “Financial applications of multiobjective evolutionary algorithms: recent developments and future research directions,” in C. A. C. Coello and G. B. Lamont (eds.), Applications of Multi-Objective Evolutionary Algorithms, World Scientific, Singapore, Chapter 26, 2004, pp. 627-652.
[18] R. Subbu, P. P. Bonissone, N. Eklund, S. Bollapragada, and K. Chalermkraivuth, “Multiobjective Financial Portfolio Design: A Hybrid Evolutionary Approach,” in Proceedings of the 2005 Congress on Evolutionary Computation, IEEE Press, 2005, pp. 1722-1729.
[19] E. Zitzler, “Evolutionary Algorithms for Multiobjective Optimization: Methods and Applications,” PhD thesis, Swiss Federal Institute of Technology (ETH), Zurich, Switzerland, November 1999.
[20] K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan, “A Fast and Elitist Multiobjective Genetic Algorithm: NSGA–II,” IEEE Transactions on Evolutionary Computation, 6(2):182–197, April 2002.
[21] J. Horn, N. Nafpliotis, and D. E. Goldberg, “A Niched Pareto Genetic Algorithm for Multiobjective Optimization,” in Proceedings of the First IEEE Conference on Evolutionary Computation, IEEE World Congress on Computational Intelligence, volume 1, Piscataway, New Jersey, IEEE Service Center, June 1994, pp. 82-87.
[22] E. Zitzler, M. Laumanns and L. Thiele, “SPEA2: Improving the Strength Pareto Evolutionary Algorithm,” in K. Giannakoglou, D. Tsahalis, J. Periaux, P. Papailou and T. Fogarty (eds.), EUROGEN 2001, Evolutionary Methods for Design, Optimization and Control with Applications to Industrial Problems, Athens, Greece, 2002, pp. 95-100.
