Artificial Intelligence Techniques Applied to Reduction of Uncertainty in Decision Analysis Through Learning Jerry Felsen Operational Research Quarterly (1970-1977), Vol. 26, No. 3, Part 2. (Oct., 1975), pp. 581-598. Stable URL: http://links.jstor.org/sici?sici=0030-3623%28197510%2926%3A3%3C581%3AAITATR%3E2.0.CO%3B2-6 Operational Research Quarterly (1970-1977) is currently published by Operational Research Society.


Opl Res. Q., Pergamon Press 1975. Vol. 26, 3, ii, pp. 581 to 598. Printed in Great Britain

Artificial Intelligence Techniques Applied to Reduction of Uncertainty in Decision Analysis Through Learning JERRY FELSEN

University of Southwestern Louisiana, Lafayette, Louisiana 70501

Most decision-making in the real world takes place under conditions of uncertainty, because the probability laws characterizing the decision situation are usually initially unknown. Formal treatment of such situations requires programming "judgement" or "intelligence". This paper therefore presents a computer-oriented conceptual framework for decision analysis under conditions of uncertainty which enables the application of artificial intelligence techniques. We have developed a general model of the learning decision process which applies artificial intelligence techniques to programming decision-making by weighing evidence, under conditions of uncertainty, in recurrent situations. Specifically, we use generalized perceptron-type pattern recognition techniques, heuristic methods and learning system theory. An illustration is given in investment analysis, and the experimental results indicate that through machine learning algorithms we can gradually reduce uncertainty in decision analysis and improve the decision system's performance.

INTRODUCTION

Decisions are made under conditions of uncertainty if the probability laws characterizing the decision situation are unknown. If the relevant probabilities are known or can be accurately estimated, then decisions are made under conditions of risk. Most contemporary decision-theoretic literature deals with decision-making under risk. However, most decision-making in the real world is done under conditions of uncertainty, with the decision-maker using "intuition" or "judgement and intelligence". Thus automating or programming decision-making under uncertainty can generally be done only with the aid of techniques borrowed from artificial intelligence (AI). In this paper we develop a conceptual framework for decision analysis which enables the application of AI techniques.

The modern way of dealing with uncertainty is the learning approach: a learning mechanism is built into the decision system with the aim of gradually improving its performance and reducing the element of uncertainty. In the following sections we first develop a general model of the learning approach to decision analysis. Then, within the framework of this model, we develop an application of AI techniques to programming decision-making by weighing evidence, under conditions of uncertainty, in recurrent situations. Specifically, we program problem-solving processes where decisions are made by adding up evidence obtained from various measurements, experiments or

observations of the decision situation, where the probability laws are initially unknown. Examples of such decision problems include analysis and control of economic processes, medical diagnostics and therapy, many industrial process control situations and investment analysis. Such decision processes can be programmed with the aid of non-parametric pattern recognition techniques. For example, we may use various generalized perceptron-type pattern recognition devices, and the performance of the decision system can be gradually improved by a learning mechanism. (We use certain probabilistic iterative learning methods which are based on stochastic approximation techniques.) The machine learning algorithms are primarily responsible for the decision system's superior performance.

Investment analysis was used as an example where learning decision models for both stock market timing and investment selection were developed. No assumptions about the probability laws of the investment decision situation were made in either model. Their performance was then gradually improved through learning from past experiences, i.e. after the decision system had been implemented and put into operation, its performance was optimized under the direction of error-correcting feedback derived from the evaluation of actual investment decisions. Both models were tested in actual investment analysis. The experimental results indicated that through a learning process investment performance can be improved and uncertainty in the decision situation reduced.

THE GENERAL MODEL

Our conceptual framework for decision analysis, like many others, divides the decision process into four functional phases: (1) identification activity, (2) design of alternative solutions, (3) choice and implementation of the best alternative and (4) evaluation of results and learning from past decision-making experiences.

(1) The identification phase.
The purpose of the identification phase is to formally define the decision problem together with the relevant information needed for its solution. In other words, its aim is to find differences between the present (existing) state of the system and its desired state. Thus the identification phase consists of searching the environment for (1) conditions calling for decisions, and (2) relevant information describing the decision situation. This also includes the task of numerically encoding information for decision-making.

(2) Design phase. The design activity invents, develops and analyses alternative solutions to the decision problem. This includes constructing models of the decision process and the environment of the decision system. The decision models are then used to generate possible solutions. Thus the output of the design phase is one or several alternative courses of action.

(3) Choice and implementation. The function of the choice phase is to select the optimal solution from the set of available alternatives, and then to implement the solution. Comparison among alternatives usually uses the decision models from which the solutions were derived.

J. Felsen - Artificial Intelligence Techniques

(4) Performance evaluation and learning. Real-life decision situations are usually so complex, unstructured and poorly understood that optimal computer-based decision systems cannot be designed initially. So if an automated problem-solving system is to operate successfully in the real world, it must be implemented as a learning system which can gradually improve its performance by learning from past experiences. In particular, the improving mechanism itself should be improvable. The learning process may gradually modify the decision system's internal structure or its decision models, using current information, under the direction of performance feedback.

Automation of decision-making consists of three steps: (a) developing general formal methods for mechanizing the above four phases of the decision process, (b) applying these methods to designing formal procedures for specific decision processes and (c) programming these procedures on a computer.

Summary and notes on implementation

The learning decision system operates as follows. First, the identification activity searches the system and its environment for information for decision-making and defines the decision problem. Then, in the design phase, alternative solutions to the decision problem are developed and the best among them

FIG. 1. The anatomy of the learning decision process.

is selected and implemented. A subsequent evaluation generates learning feedback which is used to modify the decision system with the aim of improving its performance (Figure 1). Each of the four functional components of the decision system involves a combination of various elementary information processes which must be formalized.

THE MATHEMATICAL APPARATUS

In this section we outline a few design principles for the automation of decision analysis. In particular, decision-making is programmed by weighing evidence, in recurrent situations, under conditions of uncertainty, using methods of pattern recognition (PR). We start with the identification phase.

Programming the identification activity

Normally the decision problem is well defined, and so the identification activity consists of numerically encoding information for decision-making using the state vector x = (x_1, ..., x_r). We define the state of a system and its environment at any time as the information needed to determine the behaviour of the system from that time on. Hence the state variables x_i, i = 1, ..., r, represent observables or measurements characterizing the various features (properties or attributes) of the decision situation that have some predictive significance. Since the amount of computation needed to process an information pattern grows approximately exponentially with its size, through a mathematical Feature Space Transformation (FST) the state vector x is transformed into a feature vector φ(x) = (φ_1(x), ..., φ_s(x)). The FST greatly reduces the size of x without reducing its information content too much. That is, the FST performs aggregation, filtering and condensation of information, in order to emphasize the important aspects of the situation. φ(x) is a numerical encoding of the information for decision-making and is the input for the second phase of the decision process.
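As a concrete illustration of an FST, the sketch below condenses a state vector by averaging groups of related state variables into single features. The grouping and all numbers are invented for illustration; they are not the paper's actual transformation.

```python
import numpy as np

def feature_space_transformation(x, groups):
    """Condense an r-dimensional state vector x into an s-dimensional
    feature vector phi(x).  Each feature aggregates (here: averages)
    the state variables listed in one group; the grouping itself is a
    hypothetical design choice made by the analyst."""
    return np.array([x[list(idx)].mean() for idx in groups])

# Six state variables condensed into two features (toy numbers).
x = np.array([0.2, 0.4, 0.6, 1.0, 0.8, 0.9])
groups = [(0, 1, 2), (3, 4, 5)]
phi = feature_space_transformation(x, groups)  # -> approximately [0.4, 0.9]
```

Averaging is only one possible aggregation; filtering or weighted condensation would fit the same interface.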
Design and choice of the optimal policy

Decision-making is the process of converting information into action, i.e. it is a mapping from the state (or feature) space onto a policy space. When programming decision-making by weighing evidence, this mapping is obtained by perceptron-type pattern recognition techniques. Let the policy space be A = {a_i}, where a_i, i = 1, 2, 3, ..., are the alternative policies, and define linear discriminant functions

    g_i(x) = W_i · φ(x) = w_i1 φ_1(x) + ... + w_is φ_s(x),    (1)

where W_i = (w_i1, ..., w_is) is the weight vector. Then the choice mechanism (i.e. the mapping from information into a course of action) can be achieved by a


set of discriminant functions where a unique discriminant function is associated with every policy a_i: a_i ↔ g_i(x). Then for any decision situation the system selects that policy a_i for which the corresponding g_i(x) is highest, i.e.

    g_i(x) = max_j g_j(x).    (2)

In this way a unique policy is associated with every pattern of information x. If the policy space contains only two courses of action, we need only one discriminant g(x): a_1 is selected if g(x) > 0, and a_2 is chosen otherwise.

The learning mechanism

The purpose of the learning process is to gradually adjust certain system parameters so as to optimize the decision system's performance. The learning process is directed by information feedback derived through an evaluation of past decisions. The performance criterion can be chosen in several ways, e.g. the probability of error. By "error" we mean that a wrong decision has been made, e.g. the expected outcome was not attained. The decision system is then optimized by minimizing the probability of error, i.e. the average number of errors. The performance of the decision system is measured by a performance evaluation function V(W, x), which usually represents some measure of error. The expected value of V(W, x) is the performance index I(W) of the decision system, i.e. I(W) = E{V(W, x)}. The objective of the learning process is to find that setting of the system parameters W* for which the functional I(W) takes its extreme value, i.e. for which

    grad I(W) = E{grad V(W, x)} = 0,

where the gradient operation is taken with respect to the weight vector W. Learning system theory provides us with the following recursive algorithm for iteratively determining the optimal weight vector W*:

    W_(n+1) = W_n - a_n grad V(W_n, x_n),    (3)

where a_n is a constant and the subscript n represents the time when x and the corresponding V(W, x) was observed or measured. Let us assume for simplicity that our policy space contains only two courses of action, i.e. A = {a_1, a_2}. Define

    V(W, x) = (1/2) [sign(W · φ(x)) - y] W · φ(x),    (4)

where y = 1 if a_1 was the correct decision and y = -1 otherwise, and sign z = 1 if z > 0 and sign z = -1 otherwise. (The value of y represents the feedback

information determined by the performance evaluation of actual decisions after they were carried out. We also observe that V(W, x) has a non-zero value only when the decision system made an error. Thus I(W) is proportional to the error rate, and its minimization by algorithm (3) will reduce the decision system's probability of error.) Then substituting (4) into (3) we obtain the learning algorithm

    W_(n+1) = W_n + (1/2) a_n [y_n - sign(W_n · φ(x_n))] φ(x_n).    (5)

Figure 2 shows the general structure of a two-category learning PR system. Scheme (5) gives a computational procedure, or program, for gradually determining that configuration of system parameters W* for which the decision system's performance is best (i.e. I(W) is minimum), that is, for which the probability of making the wrong decision is smallest.


FIG. 2. Structure of a perceptron-like learning pattern recognition system.
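A minimal sketch of the error-correction scheme (5) for the two-policy case: weights change only when the decision sign(W · φ(x)) disagrees with the feedback y. The training pairs below are invented for illustration, not data from the paper.

```python
import numpy as np

def learn(samples, a=1.0, epochs=10):
    """Error-correction learning, scheme (5).  On an error the bracket
    [y - sign(W . phi)] equals 2y, so the update reduces to W + a*y*phi;
    on a correct decision the bracket is zero and W is unchanged."""
    W = np.ones(len(samples[0][0]))   # the paper initializes all weights to one
    for _ in range(epochs):
        for phi, y in samples:
            decision = 1 if W @ phi > 0 else -1
            if decision != y:         # V(W, x) is non-zero only on an error
                W = W + a * y * phi
    return W

# Invented two-policy training pairs (phi, y): y = +1 if a1 was correct.
samples = [
    (np.array([1.0, 0.5]), 1),
    (np.array([-0.5, -1.0]), -1),
    (np.array([0.8, -0.2]), 1),
    (np.array([-1.0, 1.5]), -1),
]
W = learn(samples)   # after training, every sample is classified correctly
```

With these toy data a single correction (on the fourth pair) already yields an error-free weight vector, illustrating how the error rate becomes stationary.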

THE LEARNING DECISION SYSTEM

The learning mechanism improves our initial poor understanding of and lack of knowledge about the decision situation, and in practice proceeds gradually in an iterative manner. The information describing the decision situation is first encoded in the form of the state vector, which is preprocessed and transformed into the vector of relevant problem features characterizing the decision situation. Next the optimal policy is selected using decision rule (2), and the decisions are carried out. Then the results of the actions taken are observed and evaluated, leading to an adjustment of the various parameters of the decision system (the weights) to improve the system's performance (Figure 3). The procedure continues until the error rate becomes essentially constant. A detailed description of the learning process will be given in the following section for a practical example.


FIG. 3. Flow diagram for the error-correction learning algorithm.

The decision system does not need explicit knowledge of a utility (or value) function, but only that the state of a goal being achieved can be recognized by the performance evaluation mechanism. It is apparent that some type of memory is needed to implement the learning algorithm. The memory is used to store the weight vector W_n and the characteristics of the current decision situation φ(x_n). Since only the most recent values of W_n and φ(x_n) need to be stored, the memory requirements of the proposed learning schemes are minimal. The proposed decision mechanism represents a realistic model of learning from experience because it has the ability to improve its performance using current information, step by step, as this information gradually becomes available. An advantage of this procedure is that the memory requirements are


quite small and its implementation is quite simple. Moreover, it can be used with relatively little a priori knowledge about the decision problem. The proposed scheme appears to be particularly well suited for mechanizing certain real-life decision processes. Firstly, the decision system can (theoretically) handle a large number of state variables or features, and if the decision situation presents great variety, the system can cope with it (within the limits of available computing resources) by using a large feature space. Secondly, a decision system based on a perceptron-type scheme is inherently quite reliable, in the sense that failures or errors in one or more state variables or features (or other components) may not markedly degrade system performance. In any real-life system, the information emanating from the situation will be "noisy": it will never be very precise and may always be subject to various environmental disturbances. This is in addition to the fact that available information is usually also "fuzzy" by nature. Fortunately, the perceptron may continue to function properly even if some piece of input information is missing or incorrect. In this sense the proposed decision system resembles the human brain: it functions in terms of patterns, and if some elements of the pattern are missing or are corrupted with noise, the brain in effect reconstructs them. Thirdly, this decision mechanism does not require that a utility function be known analytically, but only that the state of the objective being achieved is recognizable. This conforms with most real-life problems, where it may be impossible to assign utility values to outcomes of decisions, but in retrospect it is usually easy to select the best decision that should have been made.

APPLICATION TO AUTOMATION OF INVESTMENT ANALYSIS

The concepts are now applied to programming stock market forecasting and investment selection.
Since the methods come from various cybernetic disciplines, the term Cybernetic Investment Decision Systems (CIDS) is used to denote our programmed investment decision schemes, and the general methodology is called the cybernetic approach to investment analysis. The investment decision problem is far too complex to be handled by any simple problem-solving procedure, so we use the hierarchical approach to problem solving. The spirit of this approach is to divide and conquer: break up a complex problem into simpler subproblems that are within the reach of our capabilities, develop solutions (or problem-solving systems) for these subproblems, integrate them into a viable system and thus arrive at a solution to the original problem. Accordingly, our CIDS is hierarchically structured. We divide the global investment decision problem into two major subproblems: investment timing and selection. The timing problem is subdivided into two local subproblems: (a) general market timing and (b) individual stock transaction timing. Similarly, the selection problem is also divided into two local subproblems: (c) security analysis and (d) efficient portfolio diversification (Figure 4).


Subproblems (a), (b) and (c) involve decision-making by weighing evidence. For example, general market timing decisions (i.e. stock market forecasts) are made by weighing evidence obtained through the analysis of the market's (monetary, political and economic) fundamentals, technical factors, psychological measurements, observations of the news background and so on. Thus

FIG. 4. Hierarchical structure of the investment decision process.

subproblems (a), (b) and (c) can be programmed with the aid of pattern recognition techniques. Subproblem (d) can be handled by mathematical (quadratic) programming methods and therefore will not be considered any further. To each local subproblem (a), (b) or (c) corresponds a separate decision subsystem of the CIDS. Each subsystem is organized in the form of a functional three-layer hierarchy, so that higher layers determine or adjust some parameters on the lower levels.4 That is, the system can be functionally decomposed into three layers: (1) a layer of programmed decision processes, (2) a learning layer and (3) a layer of non-programmed decision processes (the global planning layer). This functional hierarchy emerges naturally in reference to three essential aspects of the investment decision process: (1) the search for a preferable or acceptable course of action under known, prespecified conditions; (2) the reduction of uncertainties and improvement of performance; and (3) the selection of strategies to be used in investment management. The flow of information and control between these three functional parts of the CIDS is shown in Figure 5.

General market timing

In this article we consider an application to forecasting the general market only, i.e. timing subproblem (a).

We define the general market state in terms of stock price trends and trend reversals, and choose to work with four states of the market: uptrend, top, downtrend and bottom.


FIG. 5. Functional multilayer hierarchy of an investment decision system.

Now, many market practitioners are trend followers: they prefer to buy at major market bottoms, hold during the uptrend, sell (or sell short) near a market top and do nothing (or hold short positions) during a downtrend. Thus trading in the stock market is a recurrent decision process whose policy space contains four well-defined courses of action, and we can program this decision process with the aid of PR techniques. The success of the programmed decision system depends on its ability to predict the future direction and changes of trends in stock prices. It is apparently impossible to predict future price changes through an analysis of past price changes alone. But if the information pattern for stock market analysis becomes very complex, and also includes fundamental state variables, psychological measurements, the news background, etc., in addition to technical factors, then

prediction of future price changes appears possible. The information pattern x, which serves as a basis for investment decisions, must necessarily be so complex that the human mind is generally unable to ascertain the correlation between present values of x and future changes in stock prices. But this relationship can be gradually determined through a machine learning algorithm.

AN ACTUAL IMPLEMENTATION

We have implemented a cybernetic general market timing mechanism. It has been experimentally tested in actual investment analysis since 1970. Our scheme may be regarded as an attempt to program the well-known General Indicator Approach to market timing. This method synthesizes investment decisions by weighing evidence obtained through the analysis of a broad spectrum of indicators, i.e. measurements or observations, describing the state of the market. We observe that the CIDS is a man-machine system which is tailored to the needs of its users. So its design will generally reflect the user's resources, such as the type of information to which he has access, his attitudes toward risk and expected returns on investments, available computational resources, etc. For these reasons there is an infinity of possible realizations of CIDS. Our choice was made for computational simplicity: the computations can in fact be performed by hand.

The decision system again operates in four phases. The first phase, the identification activity, consists of searching the environment and numerically encoding, both objectively and subjectively, all information characterizing the state of the general market. The relevant information is obtained by monitoring the various fundamental, technical, economic, monetary and psychological indicators; reading daily newspapers; watching the investment news background, etc. All these observations and measurements are represented by the state vector x = (x_1, ..., x_51).
So the state variables x_i, i = 1, 2, ..., 51, represent a numerical encoding of individual SM (stock market) indicators, observations of the news background, psychological measurements, etc. In short, x contains all the information about the market's past that is relevant for the prediction of its future. Through an FST the 51 state variables in x are transformed into feature vectors representing the fundamental, technical, psychological, etc. features of the general market. Now, different investment policies are generally used for different time horizons, and different information patterns are needed to predict the market's behaviour over different time horizons. Thus, since we work with three time horizons (short-, intermediate- and long-term) we need three feature vectors φ_i(x), i = 1, 2, 3, respectively. The elements of φ_i(x) represent a numerical encoding of the following general market features:

φ_i1(x): aggregation of political, economic and monetary fundamentals;
φ_i2(x): technical factors, e.g. the advance-decline line, volume and price trends, etc.;
φ_i3(x): aggregation of psychological measurements ascertaining the present attitudes of the investing public;
φ_i4(x): a quantitative measure of the news background underlying stock price changes;
φ_i5(x): measurements derived from the Elliott Wave Theory;
φ_i6(x): market statistics such as the average duration and extent of SM price trends;
φ_i7(x): a measure expressing the present market efficiency.

The short-term feature vector (i = 1) contains five of the above features, the intermediate-term vector (i = 2) contains all seven features and the long-term vector (i = 3) contains six features. All information needed for the SM identification activity is obtained from the Wall Street Journal and Barron's.

The design and choice mechanism, i.e. the mapping from the decision situation onto the policy space, is obtained by discriminant functions. Assume that our policy space contains four courses of action, e.g. buy, hold, sell (sell short) and do nothing (hold short positions). Since we need a policy space for each time horizon, we work with three policy spaces: A_i = {a_ij}, i = 1, 2, 3 and j = 1, ..., 4, where, e.g., a_12 represents short-term hold and a_31 means long-term buy. We program the choice mechanism by associating a unique discriminant with every trading policy: a_ij ↔ g_ij(x), where g_ij(x) = W_ij · φ_i(x). Then for any investment decision situation the system selects that set of trading policies a_ij, i = 1, 2, 3, for which the corresponding

    g_ij(x) = max_k g_ik(x),    k = 1, ..., 4.
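The per-horizon choice rule can be sketched in a few lines; the weight matrix and feature values below are toy numbers invented for illustration, not the system's actual parameters.

```python
import numpy as np

POLICIES = ("buy", "hold", "sell", "do nothing")

def choose_policies(phis, Ws):
    """For each time horizon i, select the policy a_ij whose discriminant
    g_ij(x) = W_ij . phi_i(x) is largest.  Ws[i] holds one weight row per
    policy for horizon i."""
    return [POLICIES[int(np.argmax(W @ phi))] for phi, W in zip(phis, Ws)]

# One hypothetical 4 x 2 weight matrix, shared here by all three horizons.
W = np.array([[ 1.0,  0.0],    # buy
              [ 0.0,  1.0],    # hold
              [-1.0,  0.0],    # sell
              [ 0.0, -1.0]])   # do nothing
phis = [np.array([1.0, -0.5]),   # short-term features
        np.array([0.2,  0.3]),   # intermediate-term features
        np.array([-0.4, 0.1])]   # long-term features
policies = choose_policies(phis, [W, W, W])  # -> ['buy', 'hold', 'sell']
```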

Performance of this choice mechanism is then gradually optimized through a learning procedure (Figure 6). To simplify the computations, we reduce the size of our three policy spaces to two courses of action: (1) buy (cover shorts), or hold long positions if bought earlier, and (2) sell (sell short), or do nothing (hold short positions) if sold earlier. In this case we need only three discriminant functions, one for each time horizon. Then, for example, a short-term trading strategy can be mechanized by the following program: (1) buy (cover shorts) as soon as g_1(x) > 0, after a decline, then (2) hold as long as g_1(x) > 0, then (3) sell (sell short) as soon as g_1(x) < 0, then (4) do nothing (hold short positions) as long as g_1(x) < 0, and when g_1(x) > 0, go to (1).

Through the application of heuristic methods we can also program more complex trading strategies involving more than one time horizon.
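The four-step short-term trading program can be sketched as a small state machine driven by the sign of g_1(x); the weekly discriminant values below are invented, not output of the actual system.

```python
def short_term_strategy(g_values):
    """Mechanize the four-step short-term rule: the position switches on
    the sign of g1(x).  g_values is a hypothetical weekly series of
    discriminant values."""
    position = "out"                      # start flat, after a decline
    actions = []
    for g in g_values:
        if position == "out" and g > 0:
            position = "long"
            actions.append("buy")         # (1) buy / cover shorts
        elif position == "long" and g > 0:
            actions.append("hold")        # (2) hold while g1(x) > 0
        elif position == "long" and g < 0:
            position = "out"
            actions.append("sell")        # (3) sell / sell short
        else:
            actions.append("do nothing")  # (4) wait while g1(x) < 0
    return actions

actions = short_term_strategy([-0.4, 0.2, 0.5, -0.1, -0.3, 0.6])
# -> ['do nothing', 'buy', 'hold', 'sell', 'do nothing', 'buy']
```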


FIG. 6. Flow diagram of the learning algorithm for trading in the stock market.

The error rate, i.e. the average number of wrong investment decisions, is then minimized through a learning process. The learning mechanism is initialized as follows. The three time counters n are set to one. (The unit of time is 1 week for the short-term decision subsystem, 4 weeks for the intermediate term and 16 weeks for the long term.) The weighting coefficients a_n are set to one, and all weight vectors W_i, i = 1, 2, 3, are initially set to one. The learning decision process proceeds gradually in an iterative manner, starting with the identification activity, which is carried out continuously. Every day the Wall Street Journal is read and other news media are scanned for information that may be relevant to the stock market. Any relevant information is recorded and later encoded in the state variables x_i, i = 1, ..., 51. At the appropriate time intervals the state vector is transformed into feature vectors φ_i(x), i = 1, 2, 3, and the corresponding set of trading policies is computed. These policies are then implemented and the results are observed and evaluated. If the evaluation of results indicates that the optimal policy has been followed,

no corrective actions are taken. If not, however, the decision system is corrected according to algorithm (5). For instance, suppose that trading policy a_21 (intermediate-term buy or hold) is carried out at time n, that is, g_2(x_n) > 0, but during the next several weeks the market suffers an intermediate-sized decline, resulting in a sharp decline in the value of the investment portfolio. Then the value of W_2 is "corrected" by computing

    W_2^(n+1) = W_2^(n) - a_n φ_2(x_n),    (6)

and the new value of W_2 is used in subsequent computations of the trading policy. Had the market not declined, the weight vector would remain unchanged (i.e. W_2^(n+1) = W_2^(n)), but four weeks later the time counter n would be increased by 1. The same correction takes place if the opposite (sell) policy is in error, but the sign in equation (6) is reversed. The weight vector remains unchanged until the next error is made. This error-correction learning process continues until the error rate becomes stationary. The same learning algorithm is employed for all three time horizons, but the required learning times differ approximately by a factor of four between time horizons. Thus if it takes on average 20 weeks to optimize the short-term decision system, it may take 80 weeks to optimize the intermediate-term system and approximately 6 years to optimize the long-term decision system. However, methods for accelerating the learning process do exist.1

EXPERIMENTAL RESULTS

The experimental results of the tests of CIDS are encouraging: they seem to indicate that with the cybernetic approach we can (1) program some important judgemental aspects of investment analysis; (2) improve the quality of investment decision-making, i.e. amplify the intellect of the human analyst; and (3) attain above-average investment performance.

General market forecasting

We have been testing two mechanical trading strategies. In the first of them (S-1) selling short is not permitted, i.e.
during declining markets a "do nothing" policy is followed. In the second strategy (S-2) selling short is permitted. An intermediate-term time horizon has been used, i.e. all decisions are made at intermediate market turning points, i.e. when the sign of g2(x) changes. The two major results of our experimental testing are: (1) Performance of the decision system was gradually improved by the learning algorithm. The improvement was considerable. For instance, during a 20-week period the error rate of the short-term timing mechanism dropped from almost 50 per cent to 20-30 per cent. A similar improvement of performance was observed on the intermediate- and long-term decision systems. The same learning algorithm is employed for all

three time horizons, but the required learning times differ approximately by a factor of four between time horizons. The short-term learning curve is shown in Figure 7. During a 49-week time period the short-term weight vector changed from the initial value W1(1) = (1, 1, 1, 1, 1) to W1(49) = (1.307, 1.145, 1.045, 0.850, 0.202).

[Learning-curve plot not reproduced; x-axis: time (n) in weeks, ticks at 10, 20, 30, 40.]

FIG. 7. Learning curve of the short-term decision subsystem.
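The error-correction step that produced this learning curve can be sketched as a perceptron-style update. The following is a minimal sketch, assuming equation (6) is the standard fixed-increment perceptron rule; the feature values and the buy/sell reading of the discriminant's sign are illustrative assumptions, not the paper's data.

```python
# Hedged sketch of the error-correction learning step described above.
# Assumption (not reproduced from the paper): equation (6) is the
# fixed-increment perceptron rule W(n+1) = W(n) -/+ phi(x_n), applied
# only when the policy derived from the discriminant g(x) = W.phi(x)
# turns out to be wrong.

def discriminant(w, phi):
    """g(x) = W . phi(x); g > 0 -> buy/hold, g <= 0 -> sell/do nothing."""
    return sum(wi * pi for wi, pi in zip(w, phi))

def correct(w, phi, market_rose):
    """Return the corrected weight vector after observing the outcome.

    If g > 0 (buy) but the market declined, phi is subtracted; if
    g <= 0 (sell) but the market rose, the sign is reversed and phi
    is added.  Otherwise the weights stay unchanged until the next error.
    """
    g = discriminant(w, phi)
    if g > 0 and not market_rose:      # bought, market declined: error
        return [wi - pi for wi, pi in zip(w, phi)]
    if g <= 0 and market_rose:         # sold, market rose: error
        return [wi + pi for wi, pi in zip(w, phi)]
    return w                           # correct decision: no change

# All five weights start at one, as in the initialization above;
# the feature vector is a made-up example.
w = [1.0] * 5
w = correct(w, [0.2, 0.1, 0.0, 0.3, 0.5], market_rose=False)  # erroneous buy
```

Repeating this correction at each decision point, and only on errors, is what drives the gradual drift of the weight vector reported above.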

(2) After some learning period, reversals of stock market trends were usually recognized sufficiently early for profitable action. The relationship between the values of the discriminants and the corresponding stock market trends is displayed in Figures 8 and 9. It is seen that near the start of an up-trend the value

FIG. 8. The sign of the discriminant is usually the same as the direction of the corresponding trend. The time horizon t may be short-, intermediate- or long-term.

of the corresponding discriminant becomes positive, it usually remains positive during the entire duration of the up-trend, and then it becomes negative near the beginning of the next down-trend. (But, of course, there are some errors from time to time.) Thus the trading policies generated by the decision system appear to result in above-average returns while taking no more than average risks. For example, from 1970 to 1973

strategies S-1 and S-2 were applied to the NYSE Composite Index. These strategies generated net (after commissions) returns of 32.5 and 38 per cent, respectively, while the general market return during the same

[Plot not reproduced; y-axis: discriminant values (BUY region above, -0.5 marked); x-axis: weekly dates from September 1972 to March 1973.]

FIG. 9. Values of short- and intermediate-term discriminants (g1(x) and g2(x), respectively) from September 1972 to March 1973 with subsequent changes of stock prices.

time period was 29 per cent (dividends excluded). A commission of 2.5 per cent per round trip (in and out) has been assumed. The risk of strategy S-1 is below average because approximately one-third of the time investible funds are held in riskless cash, while the risk of strategy S-2 is average.
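As a rough illustration of the bookkeeping behind such net-return figures, the sketch below compounds the gain of each completed trade and charges the assumed 2.5 per cent round-trip commission at every exit. The price and signal series are invented for illustration; they are not the NYSE data used in the paper.

```python
# Illustrative backtest of a sign-based timing strategy in the style of
# S-1: long while the discriminant is positive, in riskless cash
# otherwise.  Prices and signals below are made-up numbers.

ROUND_TRIP_COMMISSION = 0.025  # 2.5% per round trip (in and out), as assumed

def net_return(prices, signals):
    """Compound return of buying at each positive signal and selling at
    the next negative one, charging one round-trip commission per trade."""
    wealth = 1.0
    entry = None
    for price, positive in zip(prices, signals):
        if positive and entry is None:
            entry = price                  # enter the market
        elif not positive and entry is not None:
            wealth *= (price / entry) * (1.0 - ROUND_TRIP_COMMISSION)
            entry = None                   # back to riskless cash
    if entry is not None:                  # close any position still open
        wealth *= (prices[-1] / entry) * (1.0 - ROUND_TRIP_COMMISSION)
    return wealth - 1.0

prices = [100, 104, 110, 106, 101, 103, 112]
signals = [True, True, True, False, False, True, True]
r = net_return(prices, signals)
```

An S-2-style variant would additionally go short while the signal is negative; the compounding and commission logic is the same.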

Automating investment selection

We have implemented a simple programmed scheme for security analysis designed according to the principles outlined in this paper. The results attainable by such programmed selection systems appear to be superior to intuitively made selections. For example, we tested this scheme by participating in the 1972-73 Value Line Contest of Stock Market Judgment. The 25-stock portfolio selected by the CIDS placed 342nd out of a total of 89,744 entries. Within a 6-month period it outperformed the averages by almost 14 per cent while taking no more than average risks, and it won a cash prize.

It must be pointed out, however, that these results should be considered preliminary, because our experience with the cybernetic approach is rather limited to date. More research into this approach, and more experimental testing of our CIDS, will be performed in the future.

BIBLIOGRAPHICAL REMARKS

This article is an outline, or rather a "synopsis", of work which we have documented reasonably thoroughly in a series of three forthcoming publications.1-3 One of them1 includes about 200 references to relevant cybernetic literature and contains surveys of the relevant aspects of cybernetics, artificial intelligence in general and PR and learning system theory in particular, as well as decision theory and operations research. It also develops the mathematical details of Cybernetic Decision Theories, and considers the implementation of Cybernetic Decision Systems within the framework of Management Information Systems technology.

SUMMARY AND CONCLUSIONS

We have explored the concept of reducing uncertainty in decision analysis through learning. First, we presented a general model of the learning decision process. Then, within the framework of this model, methods for programming decision-making by weighing evidence under conditions of uncertainty were presented. Finally, to illustrate the usefulness of the learning approach, we developed its application to the automation of investment analysis. Investing in the stock market is a typical example of making decisions under conditions of uncertainty, since most investment analysts are unable to predict the market's future, not even in probabilistic terms. So we made no assumptions about the probability laws describing the investment decision situation. (In fact, it is not even necessary to assume that the decision situation is stationary.) Instead, a learning mechanism is incorporated into the decision system which gradually alleviates the problems caused by initial uncertainty.
Thus the modern way of developing programmed decision systems in the real world is via the learning approach. The learning scheme gradually compensates for uncertainties caused by poor understanding of the situation and thus improves the decision system's performance. We note that after the performance of the decision system has been optimized, i.e. after the performance rate becomes stationary, decisions are no longer made under uncertainty. Instead, decisions are then made under conditions of risk, because it is now usually possible to determine the probabilities of outcomes. For example, after our programmed investment decision system has been optimized, i.e. after the error rate has become stationary, investment decisions are no longer made under uncertainty because the probability of error is now known fairly accurately, e.g. it was about 20-30 per cent in our example.
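The transition from uncertainty to risk hinges on the error rate "becoming stationary". One hypothetical way to make that operational is a running error-rate estimate with a stationarity check; the window length and tolerance below are arbitrary illustrative choices, not taken from the paper.

```python
# Hypothetical sketch: track a running error rate over recent decisions
# and declare it stationary once it stops drifting.  Window and tolerance
# are illustrative assumptions.
from collections import deque

def running_error_rate(outcomes, window=20):
    """Error rate over the most recent `window` decisions (1 = error)."""
    recent = deque(maxlen=window)
    rates = []
    for err in outcomes:
        recent.append(err)
        rates.append(sum(recent) / len(recent))
    return rates

def is_stationary(rates, span=5, tol=0.02):
    """Treat the error rate as stationary once the last `span` running
    estimates vary by less than `tol`.  At that point the error
    probability is known fairly accurately, and decisions are made
    under risk rather than under uncertainty."""
    tail = rates[-span:]
    return len(tail) == span and max(tail) - min(tail) < tol
```

For instance, a long alternating error/success record settles at a running rate of 0.5, which the check flags as stationary, whereas a still-improving record does not pass.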


Thus through a learning process decision-making under uncertainty has been converted into decision-making under risk. We also observe that our illustrative example features decision-making in recurrent situations. It is a man-machine system, and the learning ability is limited in relevance to the process of searching for the solution of a single decision problem. Very rarely is such learning transferable between different decision problems. On the other hand, there is some evidence that general problem-solving can be learned. If a sufficiently high level of abstraction is reached, the same information processes take place in a large class of different problem-solving activities. Thus if the problem-solving system is sufficiently complex and general, it can also learn on non-repetitive problems. But more advanced learning techniques are needed. For instance, in creative thinking, the result of an intellectual experience is used not simply to adjust a few parameters but to construct a new way to represent something, or even to make a change in some administrative aspect of the decision system. So before we can design fully automated decision systems, we must first develop more general learning and problem representation techniques.

REFERENCES

1. J. FELSEN, Cybernetic Decision Systems. (To be published.)
2. J. FELSEN (1975) Cybernetic Approach to Stock Market Analysis versus Efficient Market Theory. Exposition Press, New York.
3. J. FELSEN, Decision Making Under Uncertainty: An Artificial Intelligence Approach. (To be published.)
4. M. D. MESAROVIC, D. MACKO and Y. TAKAHARA (1970) Theory of Hierarchical, Multilevel Systems. Academic Press, New York.
