
Three ways to construct adaptive WEB menus using ants
Gérard Kubryk, Maxime Kubryk

Abstract— Web engineering has become more and more important over the last years. The research community has identified the need for new methods and methodologies in order to build a good environment for developing web information systems and to offer users menus which are perfectly adapted to their requirements. WEB and audio services have to provide the best service possible. To achieve this goal, they have to find out what the customers are doing, without harming their privacy. This paper presents three ways to manage and build adaptive menus: the ACO model of Dorigo, and learning by an analogy with ants combined with two smoothing methods. These three methods are then compared on two criteria: efficiency (answering time and computer load) and accuracy with respect to customer expectations. The final step will be a psychological analysis of user activity, that is, "what is my perception of time within and between service consultations", to determine ways to set the parameters of such a system.

Index Terms— Web and audio services, requirements analysis, adaptive and customized menus, models and methods specifications, ants, artificial ants, intelligent agents.

I. INTRODUCTION
When a customer accesses a WEB site or uses a value-added service, he wants efficient access whereby he can reach information as quickly as possible. One way to achieve this goal is to remember what the user has previously done and then to offer him a menu or an access organization suiting his preferences perfectly, without wasting time. The customer could also expect that the agent responding to his request has a good knowledge of the topics he likes. Analyzing previous interactions with the service can provide what he wants. This means that an automatic system must be able to analyze a customer's actions and choose the right call center agent. It must also be able to provide this agent with easily understandable information about the caller's preferences. Of course, the whole process must be performed in real time. The development of a customer's loyalty, preferences and dislikes also has to be managed. We must also analyze the relations between themes.

Manuscript sent January 3, 2008. Gérard Kubryk, Laboratoire I3S/CNRS in Sophia Antipolis, Université de Nice, [email protected]. Maxime Kubryk, Université Pierre et Marie Curie (Paris 6), [email protected].

Themes can be close and reinforce each other, or be opposed and weaken each other. Finally, for effective marketing analysis, a company needs to know where its services are located on their own life curve. Characterization of a customer's action uses the age of the action, its length, its repetitions and the duration of the period of these repetitions. We have to ensure user privacy and take care of the way this information is stored. We also have to maintain the possibility for new services to be created and presented.

The main problem is to find a way to describe a phenomenon which is a function of 1/t but has two different equations:
- an original equation;
- an equation after the first consultation and the ones that follow.
To reach this aim we will compare three different approaches:
- the ACO model of Dorigo;
- learning by an analogy with ants, with smoothing using a sliding window;
- learning by an analogy with ants, with smoothing by integration using the trapeze method.
These methods will be compared to identify their ability to create the right menu, and to do so in real time for a large number of requests. The last step will consist in experimenting with the chosen method, and the parameters coming from it, with real users.

Research in this area mostly works in a static way: users are analyzed to define an "average user", or categories of users, and then the menu that will suit these users. M. Perkowitz [1] [2] published many works in this domain. Our way is very different, as we want to define the right menu for each user. Actually, the system we want to build is a particular case of a general learning system: users give information to the system and the system sends back a proposition that is the result of a learning process. The particularity in our case is that we study each user independently, and the object under learning is always a menu (a set of classified themes).


II. GENERAL ARCHITECTURE OF THE TARGET SYSTEM
The process begins with information input, which comes from three sources:
- information tickets coming from the telephonic platforms supporting audio services;
- tickets of WEB services;
- preferences given by the users and managed by the service providers.
This information is used to define users' preferences as expressed by them in previous accesses. These preferences allow the system to adapt the menu for the next access. The methods have to deal with all kinds of channels or terminals (i.e. Web, telephonic services, PDA, 3rd generation telephone, etc.). One problem remains: there is no possibility for a new service to be presented, so we need to introduce these services in a random way into the customer menu. One solution could be to introduce fully random access to new and non-used services, which could be adapted to fulfill needs. This function is not part of this paper. This architecture (shown in fig. 1) allows new services to be discovered automatically, as soon as they are used.


The problem we have to solve is an organization problem: managing, over time, the way users access WEB pages or data. So we need information that we can organize as a vector, so that the system can compute the best menu as previously defined [3]:

TABLE I
DESCRIPTION OF THE USED VECTOR

User ID / Theme ID: these two parts are the keys to access the right vector and allow all calculations.
Date / hour, f(t): this value is the function of time used to calculate the rank of the theme for the user.
f'(t): the derivative of the above function (the slope of the curve) is a good way to analyze the status of the theme for the user:
- if the slope is positive with a high gradient, we are in a period of strong user interest;
- if the slope is positive or negative but with a low gradient, the user's interest is decreasing;
- if the slope is negative with a high gradient, we are in a period of poor user interest.

The system has to create a vector for each user each time a theme is accessed. This vector Xi, for item i, includes information on the time and duration of this access and is used to calculate a new vector when there is a new access, which allows us to trace past actions. All the theme vectors of a given user give the organization of the best possible menu when the user re-enters the service.


Fig. 1. General architecture of the proposed system

According to psychology, we can assume two rules:
A - interest in a theme decreases with the time elapsed between the last access and the previous one;
B - interest in a theme increases with repetition.
We can describe this by an additive model, following the Bellman theory:

E(n) = f( E(n-1) , 1 / (tn - tn-1) )

where E(n) (the state at time n) is a function of the state E(n-1) and of 1 / (tn - tn-1), which is the inverse of the period between time n and time n-1.


(We define f(t) as the function allowing the modification of Xi according to the time elapsed between two accesses.) Vector Xi can have two forms, according to whether we keep one vector for each event (n vectors for each theme) or only one, obtained by recalculating the vector at each access (1 vector for each theme). In all cases, vectors must include the following components:
- user ID;
- URL or theme ID;
- time and date of the event;
- duration of the event.

III. MATHEMATICAL ANALYSIS

In the "1 for each theme" method, we need to add two more components, which are:
- f(t) at event time;
- f'(t) at event time.


This means that at each event for a “user – theme” couple, an f(t) and an f’(t) are evaluated.
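To make the "1 for each theme" record concrete, here is a minimal Python sketch of such a vector and of an update applying the two psychological rules above. The field names and the decay/boost constants are illustrative assumptions, not values taken from the paper; the exact form of f(t) is defined by the models studied later.

```python
from dataclasses import dataclass

@dataclass
class ThemeVector:
    """One '1 for each theme' vector: a single record per (user, theme) pair."""
    user_id: str
    theme_id: str
    last_access: float   # time of the last event (simulated time or epoch seconds)
    duration: float      # duration of the last event
    f: float             # current interest level f(t)
    f_prime: float       # slope f'(t) at the last event

def update_on_access(v: ThemeVector, now: float, duration: float,
                     decay: float = 0.1, boost: float = 1.0) -> ThemeVector:
    """Apply the two rules: interest decays with the time elapsed since the
    previous access (rule A) and increases with repetition (rule B).
    E(n) depends only on E(n-1) and 1/(tn - tn-1), so nothing else is stored."""
    elapsed = max(now - v.last_access, 1e-6)   # avoid division by zero
    decayed = v.f / (1.0 + decay * elapsed)    # rule A: decay with elapsed time
    new_f = decayed + boost                    # rule B: reward the new consultation
    return ThemeVector(v.user_id, v.theme_id, now, duration,
                       f=new_f, f_prime=(new_f - v.f) / elapsed)
```

Because the new record is computed only from the previous one, past accesses do not need to be kept, which is the privacy argument developed below.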



The vectors Xi are:
- n for each theme: user ID | theme ID | date / hour | duration
- 1 for each theme: user ID | theme ID | date / hour | duration | f(t) | f'(t)

The choice between the two forms has a strong effect on the way the system responds. In the case "n for each theme", we have to recalculate the whole string of events to get the final level and then the final rank. This can take a very long time, and this time increases at each new consultation. Therefore, we think that the solution "1 for each theme" is more efficient. It is also clear (if the computation is not reversible) that "1 for each theme" gives a better guarantee of user privacy, as it becomes impossible to go back and retrieve information about the user's previous accesses. We can use the same method to analyze a theme or an URL globally: in this case the access events of any user of the system are used, and the vector becomes: theme ID | date / hour | duration | f(t) | f'(t).

There is no "user ID", as the user has no meaning in this case. Thus, it becomes very easy (in the same way as described above) for a provider to have a real-time analysis of a product or a theme using f'(t), to know where it is on its life curve, and to act on it accordingly.

IV. THE ANTS

Current information systems depend on such a quantity of modules, data, network connections and inputs-outputs that it becomes impossible to predict or to control their interactions. The result is software full of bugs, corrupted data, security breaches, viruses and catastrophic edge effects. Besides, the complexity becomes such that a human mind cannot apprehend it anymore, which is aggravated by the acceleration of the modifications of the equipment (whose capacity grows in an exponential way), of the software and protocols, and by the growing expectations of the users. This observation leads to the perception of a chaos.

To ease this chaos and the impossibility, at the human scale, of grasping everything, the researchers of IBM think of systems widely independent from human supervision, adapting, healing and repairing themselves. They use the metaphor of the autonomous nervous system, which controls our body in an unconscious way, although the method to reach this control is badly defined. This amounts to saying that the system is self-organized.

For Heylighen et al. [4] [5], a self-organized system not only regulates or adapts its behavior but also generates its own organization, which differentiates it from most of the current man-made systems. Self-organization also means that a functional structure appears and remains spontaneously. The realization of a self-organized system needs a large number of components (or agents) interacting in the same way as insects or neurons. The system is dynamic: the state of the components changes constantly according to the state of the other components. Because of the interdependence between components, the changes are not made in an arbitrary way but according to certain preferential states. Certain states are strengthened by the interaction between components while the others are eliminated. The important point for Gershenson [6] is that a self-organized system typically has properties of a level superior to that of its elements, as a result of their interaction. One of the best-known examples of self-organized systems is given by the social insects, and among these the ants. That is why we are going to study, in the following paragraphs, the main laws and characteristics of this type of system, as well as the applications that ensue from it.

A. Real Ants
Ants are simple insects with a limited memory, only capable of simple actions. However, a colony presents a complex behavior bringing "intelligent" solutions to problems such as moving a big object, making a bridge or finding the shortest road between the ant-hill and the source of food. Numerous teams, among which those of Benzatti [7] and Bonabeau [8], studied the colonies of ants and their organization. Their conclusions brought important points for the transposition of these organizations to the world of data processing.

An isolated ant has no global knowledge of the current task: its actions are accomplished on a local basis and are therefore generally unpredictable. The intelligent behavior appears spontaneously as a consequence of the self-organization and of the communication between ants. This is what is called an "emergent intelligence".

To find the shortest road, ants communicate by depositing a track of pheromone on the road that they follow (see fig. 2). Through the interplay of the evaporation of the pheromone and the successive deposits made by the ants circulating on the road, they are capable of finding the shortest road by using only the information supplied by this track of pheromone deposited by all the ants, without any visual information.

The collective memory (of ants, for example) is, according to Lévy [9], a shared memory distributed at the same moment in the brains of the ants but also, more widely, on the surrounding objects (the track of pheromone in this example).


Fig. 2. Way ants manage an unexpected obstacle on their track

The evaporation is a complex phenomenon involving the temperature, the wind speed, the dew point, the nature of the liquid and its impurities, etc. This is why its modelisation is difficult. So we have used, during our work, the law defined by Dussutour [10], taking an evaporation rate of 1/40.

B. Dorigo algorithm
Dorigo defined one of the algorithms which brought many results by using analogies with ants. ACO (Ant Colony Optimization), as described by Dorigo [11] [12], is an algorithm based on an iterative search where, in every iteration, each of the artificial ants builds a solution by following a road in a decision tree. If ACO is applied to the problem of the traveling salesman (see fig. 3), we obtain the following description:
- m ants are positioned on m cities extracted from a set of n cities;
- for every ant, the city of departure is chosen in an unpredictable way;
- the memory Mk of every ant k contains all the already visited cities (initially it is empty); this memory serves to estimate the feasibility of the solution, to estimate its quality, and to allow the return of the ant;
- at every decision, the ant chooses the next not-yet-visited city by using the information (deposited pheromone) supplied by the previous ants which found a good solution.

Fig. 3. Example of simulation of traveling salesman problem with ants

The information on the quantity of pheromone is contained in a matrix where every element τij indicates the quantity of pheromone carried by the link from city i towards city j, for i, j ∈ [1:n]. The initial value of τij is set to a slightly positive value for all the links lij of the graph. In this structure, the probability of choosing a link is bound to the quantity of pheromone τij but also to a local heuristics ηij, whose value is:

ηij = 1 / Dij

where Dij represents the distance between cities i and j.

This means that the smaller the distance between cities i and j, the larger ηij is, and thus the more attractive the link is for the ant. The routing table of the ant k for the city i is given by all the values aij(t)(k), in which t represents the roundtrip:

aij(t)(k) = [τij(t)]^α [ηij]^β / Σ (l ∈ Ni) [τil(t)]^α [ηil]^β   for j ∈ Ni
aij(t)(k) = 0                                                    for j ∉ Ni



In this formula, Ni is the set of all the neighbors of i, and α and β are two parameters which control the respective weight of the track of pheromone and of the heuristics. If α = 0, only the distance i-j (Dij) is taken into account and the closest city is chosen. If β = 0, only the quantity of pheromone is taken into account; according to Dorigo, this quickly leads to a situation in which all the ants take the same road, which is generally not optimal. The probability Pij(t)(k) defines the choice of the city j in the iteration t by the ant k which is in the city i (for j ∈ Ni(k), Ni(k) representing all the cities accessible to the ant k and not yet visited):

Pij(t)(k) = aij(t) / Σ (l ∈ Ni(k)) ail(t)

When all the ants have ended their roundtrip, every ant k deposits a quantity of pheromone:

Δτk(t) = 1 / Dk(t)

where Dk(t) is the length of the roundtrip of the ant k in the iteration t. This gives:

τij(t+1) = τij(t) + Δτk(t)

This calculation of Δτk(t) means that the shorter the roundtrip, the larger the quantity of pheromone deposited on every link; this quantity is thus dependent on the performance of the ant. There are two methods of pheromone deposit:
- the step-by-step deposit, which is made as the ant moves along, as real ants do;
- the deferred deposit, which is made at the end of the roundtrip.
The difference is important, because the step-by-step deposit can only take into account the length of the link traversed at the moment t, while the deposit at the end of the roundtrip can integrate at the same time the length of every link and the total length of the route.

The analysis of Dorigo's algorithm shows that the length of a link is integrated only into the heuristics ηij; it intervenes only indirectly, through the length of the roundtrip, in the deposit of pheromone on every individual link (the same quantity of pheromone is deposited on every link). Taking Dij into account would make the algorithm more complex but also potentially more successful, by directly favoring the shortest links inside the same tour (and not only through the heuristics ηij) while continuing to favor the shortest solution. The formula of pheromone deposit then becomes:

Ψij(t)(k) = 1 / Dij(t)(k)
Δτij(t)(k) = Ψij(t)(k) · Δτk(t)

in which the quantity of deposited pheromone is inversely proportional to Dij. It would be necessary to test the results brought by this modification against the native algorithm of Dorigo. In the original algorithm, after the deposit of pheromone the ant dies, and the evaporation is applied to all the links lij of the graph. The evaporation follows the law:

τij(t+1) = (1 - ρ) τij(t)

where the rate of evaporation ρ ∈ [0,1]. Dorigo found that the optimal values for the functioning of this algorithm are ρ = 0.5, α = 1, β = 5. These values are surprising because the heuristics has a weight five times greater than that of the pheromone track.
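As an illustration of the scheme just described, here is a minimal Python sketch of one ACO iteration on the traveling-salesman example. The function and variable names are ours; it uses the deferred deposit (1/tour length on every link of the tour), proportional evaporation, and the parameter values α = 1, β = 5, ρ = 0.5 discussed above. dist is a square matrix of positive distances and tau a matrix initialised to a small positive value.

```python
import random

def aco_iteration(dist, tau, alpha=1.0, beta=5.0, rho=0.5, n_ants=None):
    """One ACO iteration for the traveling-salesman example of fig. 3:
    each ant builds a tour city by city with weights tau^alpha * eta^beta,
    then pheromone evaporates and each ant deposits 1/tour_length."""
    n = len(dist)
    n_ants = n_ants or n
    tours = []
    for _ in range(n_ants):
        start = random.randrange(n)             # departure city chosen at random
        tour, visited = [start], {start}
        while len(tour) < n:
            i = tour[-1]
            candidates = [j for j in range(n) if j not in visited]
            weights = [(tau[i][j] ** alpha) * ((1.0 / dist[i][j]) ** beta)
                       for j in candidates]
            j = random.choices(candidates, weights=weights, k=1)[0]
            tour.append(j)
            visited.add(j)
        length = sum(dist[tour[k]][tour[(k + 1) % n]] for k in range(n))
        tours.append((tour, length))
    # evaporation on every link, then deferred deposit proportional to 1/length
    for i in range(n):
        for j in range(n):
            tau[i][j] *= (1.0 - rho)
    for tour, length in tours:
        for k in range(n):
            i, j = tour[k], tour[(k + 1) % n]
            tau[i][j] += 1.0 / length
            tau[j][i] += 1.0 / length           # symmetric problem
    return min(tours, key=lambda t: t[1])       # best tour of this iteration
```

With the deferred deposit, the whole tour length determines the reward on every traversed link, exactly as in the description above.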

V. ANTS APPLICATIONS

We will now discuss the three ways we use to design WEB menus with ants, according to the above points.

A. Methodology
To test our algorithms we use simple but efficient strategies simulating users. We use a set of 10 items [A, J], which are associated to a rank by our system according to the past of the simulated user on the site. These ranks lie in [1, 10], where 10 is the highest rank. This means that rank differences are > 0 when the rank becomes better and < 0 when it becomes worse. There are three sets of strategies:
- the rank is constant all along the consultations;
- the rank changes according to the strategies described in tables II and III, in which the lines are the items A to J. In the strategy described in table II, the ranks are switched when going from a white column to a green one, and back when going from a green column to a white one (the switches occur at times {8, 15, 35, 42} of the simulated time, for a simulation of 156 steps). In the strategy described in table III, the ranks increase one step at a time up to 10, then decrease. There are some accidents in this, due to the impossibility of having two items at the same rank: in these cases we go directly from rank 10 to rank 1 or from rank 1 to rank 10;
- the rank is randomly 1 or 10, with a mean rank of 5.15.

TABLE II
INVERSION STRATEGY


Item:        A  B  C  D  E  F  G  H  I  J
White rank:  1  2  3  4  5  6  7  8  9  10
Green rank:  10 9  8  7  6  5  4  3  2  1

TABLE III
STRATEGY OF PROGRESSIVE GAP
(columns are successive consultation steps)
A: 1 2 3 4 5 6 7 8 9 10 9 8 7 6 5 4 3 2 1
B: 2 3 4 5 6 7 8 9 10 1 10 9 8 7 6 5 4 3 2
C: 3 4 5 6 7 8 9 10 1 2 1 10 9 8 7 6 5 4 3
D: 4 5 6 7 8 9 10 1 2 3 2 1 10 9 8 7 6 5 4
E: 5 6 7 8 9 10 1 2 3 4 3 2 1 10 9 8 7 6 5
F: 6 7 8 9 10 1 2 3 4 5 4 3 2 1 10 9 8 7 6
G: 7 8 9 10 1 2 3 4 5 6 5 4 3 2 1 10 9 8 7
H: 8 9 10 1 2 3 4 5 6 7 6 5 4 3 2 1 10 9 8
I: 9 10 1 2 3 4 5 6 7 8 7 6 5 4 3 2 1 10 9
J: 10 1 2 3 4 5 6 7 8 9 8 7 6 5 4 3 2 1 10
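The strategy families above can be sketched as simple rank generators. This is an illustrative Python reconstruction (names are ours): the switch times come from the description of table II, and the wrapping rule (direct jumps 10 to 1 and 1 to 10) from table III.

```python
import random

def constant_rank(rank, steps=156):
    """Strategy 1: the rank never changes."""
    return [rank] * steps

def inversion_rank(rank, steps=156, switches=(8, 15, 35, 42)):
    """Table II: the rank is inverted (r -> 11 - r) at each switch time."""
    seq, inverted = [], False
    for t in range(steps):
        if t in switches:
            inverted = not inverted
        seq.append(11 - rank if inverted else rank)
    return seq

def progressive_gap_rank(start, steps=19):
    """Table III: the rank moves up one step at a time for the first 10 steps
    (jumping directly 10 -> 1 when needed), then down (jumping 1 -> 10)."""
    seq, r = [], start
    for t in range(steps):
        seq.append(r)
        if t < 9:
            r = r + 1 if r < 10 else 1
        else:
            r = r - 1 if r > 1 else 10
    return seq

def random_rank(steps=156):
    """Strategy 3: the rank is randomly 1 or 10 (mean rank close to 5)."""
    return [random.choice((1, 10)) for _ in range(steps)]
```

For example, progressive_gap_rank(2) reproduces line B of table III.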

B. Application of the algorithm of Dorigo
The analysis of Dorigo's formula allows some simplifications in the particular context of this work. This analysis also leads to some necessary modifications in order to avoid problems such as a division by zero in certain cases.

The basic Dorigo formula is:

Pij(t)(k) = [τij(t)]^α [ηij]^β / Σ (l ∈ Ni) [τil(t)]^α [ηil]^β

In this study, we adapt this formula as follows:
- t, which in the algorithm of Dorigo expresses the complete roundtrip of the ant with a number x of stages between departure and arrival, here represents a time of the simulation;
- i is the rank at time t-1 and j the rank at time t;
- τij is the level of pheromone already deposited during the previous choices, for a given item of the menu, on the road going from rank i to rank j;
- ηij is a heuristics equal to 1 / (Dij + 0.5), where Dij is the distance between rank i and rank j; likewise ηil = 1 / (Dil + 0.5), where Dil is the distance between rank i and one of the ranks l belonging to n. The 0.5 term avoids a division by zero when the choice is the same at times t-1 and t (a case Dorigo does not have). It creates no confusion, as we have no decimal distance in our simulation;
- ρ, α and β are the parameters supplied by Dorigo, used at first with the values he defined (ρ = 0.5, α = 1, β = 5).

The analysis of the denominator of the above formula shows that, in our particular case, it is a constant. As n = [1, 10] and as at every stage, from every rank, all the ranks (including the current one) can be chosen, the sum over the ranks l belonging to n takes the same value whatever the rank of departure. The same analysis can be made for the pheromone factor: at a given time t of the simulation, it has the same value from a rank i towards all the possible ranks j (i itself included). The denominator is therefore not a factor discriminating between the curves of the various strategies and can be deleted from the formula without distorting the relationship between the various values of Pij. In the context of our particular case, the formula of Dorigo can thus be simplified, and the final formula used in our simulations is:

Pij(t) = [τij(t)]^α [ηij]^β
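A sketch of how this simplified choice rule can be evaluated for the menu problem (function and parameter names are ours; the 0.5 term and the exponents α, β are the ones discussed above, and the denominator is reintroduced here only to normalise the weights into probabilities):

```python
def choice_weight(tau_ij, rank_i, rank_j, alpha=1.0, beta=5.0):
    """Weight of moving an item from rank i (previous consultation) to rank j:
    Pij(t) ~ tau_ij^alpha * eta_ij^beta, with eta_ij = 1 / (|i - j| + 0.5).
    The 0.5 term avoids a division by zero when the rank does not change."""
    eta = 1.0 / (abs(rank_i - rank_j) + 0.5)
    return (tau_ij ** alpha) * (eta ** beta)

def transition_probabilities(tau_row, rank_i, alpha=1.0, beta=5.0):
    """Normalised probabilities over all possible new ranks j in 1..10.
    tau_row maps each candidate rank j to the pheromone level tau_ij."""
    weights = {j: choice_weight(tau_row[j], rank_i, j, alpha, beta)
               for j in range(1, 11)}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}
```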


This simplification had the direct effect of improving the global performance of the system. The analysis of the results of this method was nevertheless very disappointing. The obtained results were chaotic and not coherent, even after smoothing through a sliding window of a size equal to 50 steps of the simulated time. The most characteristic result is the one of fig. 3, where the strategy of unpredictable rank (average rank 5.15), which should give, after smoothing, a curve lying between those of rank 9 and rank 1, gives a completely displaced curve. To conclude, this method does not work for the construction of adaptive menus and we had to abandon it.

Fig. 3. Analysis of random strategy with Dorigo algorithm

C. Application of the real ants
We saw above that ants are capable of complex tasks by using limited means.


We postulate that we have a nest in which each of the items of the consulted site is the equivalent of a source of food for the ants, connected with the nest by a more or less long track. As for real ants, every consultation provokes the deposit of a quantity of pheromone, and this pheromone is subject to evaporation according to various modalities, which will be studied further.

A user going out of the nest (beginning a consultation) will thus find in front of him a range of tracks, each having a different level of pheromone, meaning a different attractiveness. This allows defining a classification, from the most attractive track to the least attractive.

A track can also have a different length which, through the duration of the ant's course alone, has a direct influence on the evaporation and thus on the attractiveness of the track. This attractiveness can also be directly connected to the distance, defined as the distance between the rank of the last consultation of this item and the rank attributed during the new consultation.

As for real ants, this attractiveness only defines a probability of consultation, which the user may not respect. If, by not respecting it, he uses a track having a weaker attractiveness, he will alter the elements allowing the realization of this classification. He will first alter the relationship between the levels of pheromone of the various tracks by bringing a quantity of pheromone to a less attractive track, which thereby risks becoming more attractive and thus improves its classification. The new rank will also modify the distance, defined as the distance between the ranks of the various consultations of the item in question. This set of elements thus shows us a system having a function of time (the evaporation) and endowed with a memory (the level of pheromone). This point brings us back to our initial definitions.

Through this set of information, the system gradually learns what the tastes of the user are, which menu to propose to him, and in which order, for his next visit to the site. The main problem we have to solve in this algorithm is the way we apply the pheromone deposit. There are three different situations according to the rank difference Δ: Δ > 0 when the rank increases, Δ < 0 when the rank decreases, Δ = 0 if the rank remains the same. The first idea was to put pheromone on the track only for positive Δ, as a reward, but this does not work: it creates a dissymmetry, which produces a situation where the curves cross themselves in an anarchic way when the strategy switches. To solve that, we create a negative deposit by using the rank difference in a natural way. This idea comes from Schoonderwoerd [13], who also used a negative contribution to solve a problem in his algorithm. This last version gives us the curves of fig. 4.

Fig. 4. Analysis of pheromone quantity according to different strategies (pheromone quantity over time for the consultation periods P = 8, 15, 35, 42, and for increasing and decreasing P)
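A minimal sketch of the deposit and evaporation rule behind the curves of fig. 4, assuming one update per step of simulated time. Reading the deposit as "the rank difference itself" (so a worse rank yields a negative deposit) is our interpretation of the text; the 1/40 evaporation rate is the one taken from Dussutour [10].

```python
def pheromone_step(level, evaporation=1.0 / 40.0, rank_change=None):
    """One simulated time step for an item's pheromone track.
    Evaporation is applied at every step; when the item is consulted,
    the deposit uses the rank difference directly, so a worse rank
    produces a negative deposit as described above."""
    level *= (1.0 - evaporation)      # proportional evaporation
    if rank_change is not None:       # a consultation occurred at this step
        level += rank_change          # > 0 reward, < 0 negative deposit, 0 neutral
    return level
```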


These curves, which cross in a complex way, are difficult to analyze. Additional tests varying the rate of evaporation or the pheromone contribution did not globally modify this aspect of things, which brought us to the methods of smoothing described below.

A last test nevertheless allowed validating the global coherence of these curves: the test using an unpredictable strategy. This test shows a curve oscillating around the one of the continuous strategy, with a coherent mean value (see fig. 5).

Fig. 5. Analysis of pheromone quantity according to unpredictable strategy (compared with the constant strategies of rank 9 and rank 5)

D. Smoothing using a sliding window
The first idea we had, to smooth these curves, was to compute the mean of the X last values using a sliding window. This method gives good results, as the curves of fig. 6 and fig. 7 show.

Fig. 6. Analysis of pheromone quantity after smoothing with a window of 50 steps of the simulated time, for different strategies

Fig. 7. Analysis of pheromone quantity after smoothing with a window of 50 steps of the simulated time, for the unpredictable strategy

Analysis of fig. 6 shows that after the smoothing the relationship between the curves becomes readable. This allows us very easily to define the rank of the items according to the different strategies. It also confirms that this method could give us a way to achieve our goal of creating real-time adaptive menus. Analysis of the curves of fig. 7 shows that the unpredictable strategy stays close to the strategy using the permanent rank 5, except at the beginning, because of the time needed to establish the smoothing. We can also see that when the curve moves away from the strategy of rank 5, it is because the curve answers an irregularity of the random series, with the delay introduced by the smoothing. These curves also confirm that the analogy with real ants allows us to build, in a dynamic way, menus that adapt themselves to the user according to his activity on a Web site. Finally, the results are in accordance with the premises of this study.
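The sliding-window mean used for figs. 6 and 7 can be sketched as follows (a minimal Python version, with the 50-step window of the figures as default):

```python
from collections import deque

def sliding_mean(values, window=50):
    """Smooth a pheromone curve with the mean of the last `window` values
    (50 steps of simulated time in figs. 6 and 7)."""
    buf, out = deque(maxlen=window), []
    for v in values:
        buf.append(v)
        out.append(sum(buf) / len(buf))
    return out
```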

However, a finer analysis of this method of smoothing shows two inconveniences, which brought us to pursue our study and to find another method:
- The smoothing is sensitive to the strategy of the user. If the user lengthens the time between two consultations, the smoothing becomes ineffective. The examination of fig. 6 shows the appearance of this phenomenon: while the curve representing the smoothing of a strategy with a period of 8 steps of the simulation is well smoothed, the curve for a strategy with a period of 42 steps remains irregular. This leads us either to use a very long window for the smoothing, with a risk of degradation of the performances, or to create an algorithm adjusting the period of smoothing automatically, with, here too, a risk of degradation of the performances.
- The smoothing requires keeping a large quantity of data on the user, which goes against the objectives presented at the beginning of this study. This conservation of data considerably complicates the protection of the private life of the users, thus imposes more important security measures, and thus is more expensive.

E. Smoothing by integration (first step)
The other way we have tested to smooth the curves is integration. We made this choice knowing the theory of filters (which can be mechanical, electronic, thermal, etc.), which predicts that when a filter integrates its input, the variations of this input are reduced (and out of phase). This is exactly what we are looking for in order to analyze our curves.


Therefore we make a numerical integration of the pheromone quantities shown in fig. 4, using the trapeze method, which gives the following formula:

Q(t) = Q(t-1) + (τij(t-1) + τij(t)) / 2

where Q(t) is the integrated quantity of pheromone at time t of the simulation (the step between two times being 1) and τij has the same meaning as in the Dorigo algorithm.
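A minimal sketch of this integration (the unit time step and the variable names are assumptions):

```python
def integrate_trapeze(tau_values):
    """Cumulative interest Q(t) obtained by integrating the pheromone curve
    with the trapeze rule: Q(t) = Q(t-1) + (tau(t-1) + tau(t)) / 2.
    Only the previous Q and the previous tau are kept, which is the
    memory and privacy advantage discussed below."""
    q, out, prev = 0.0, [], None
    for tau in tau_values:
        if prev is not None:
            q += (prev + tau) / 2.0
        out.append(q)
        prev = tau
    return out
```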


Fig. 8. Result of the integration of curves of fig. 4 (except for decreasing and increasing strategies)

We can see that there is a first problem: the model we have used until now always gives positive pheromone quantities, because τij(t+1) is a fraction of τij(t), except when an ant goes on the way ij, in which case it deposits a certain pheromone quantity. So the problem is that the integrals of our curves are always monotonic and increasing. The monotonic curves we obtained represent the evolution of the customer's interest for each theme, and do not allow us to conclude, because this interest never decreases. But we notice important advantages in using this smoothing method:
- the point of the curve representing the interest at time t+1 is calculated only from the quantities obtained at time t (the above formula can be rewritten using only the evaporation Ev and the pheromone deposit, if one occurs at this time of the simulation);
- with this method we do not have to memorize so many points, so the system's performances are better;
- the confidentiality required by the customer is respected.
This is why we want to adapt the real ants model so as to keep using this smoothing method in a way that could give us better results.

According to the first problem we met, we have to find a new model for pheromone evaporation which could give negative quantities (corresponding to decreasing parts of the curve after integration). The new model we are looking for may thus be quite far from the physical reality of an ants' colony. Since ants are not the problem we want to solve, but only a tool to build models of adaptive menus, we can allow ourselves to adapt this model to get conclusive results. To get negative pheromone quantities, we now use the following formula:

τij(t) = τij(t-1) - Ev + deposit(t)

where Ev is the pheromone amount which evaporates at each time step and deposit(t) is the pheromone quantity deposited if an ant goes on the way ij at time t. This is quite different from the previous evaporation, as there is no longer a relationship between the quantity of pheromone at time t-1 and the quantity evaporated at time t. Application of this gives the curves of fig. 9 for the quantity of pheromone, and the smoothing is shown in fig. 10.

Fig. 9. Quantity of pheromone


Fig. 10. Smoothing of fig. 9


Fig. 11. Quantity of pheromone with evolutionary evaporation

Those results are better: the curves describe first an increasing interest and then a decreasing interest. But they give illogical ranks for the four strategies we use, because the system does not forget the preference order: it is always the same. We would instead expect modifications in the order of the curves, showing that themes consulted with short periods are overtaken by those consulted with longer periods, because for the first ones the latest consultation occurs very soon and afterwards they are not consulted any more, whereas the others are still consulted. The order should evolve approximately as in fig. 6. This is why we need to sharpen our model again. Our idea is an adaptive evaporation, described below.

F. Smoothing by integration (second step)
In our new model, the evaporation value increases between two pheromone deposits and goes back to an initial value when there is a deposit, so that the simulated interest of themes which have not been consulted for a long time quickly goes below the other ones, whereas it keeps being higher for themes which are often consulted. We then use this new evaporation, ruled by the following formulas:

E(t) = E(t-1) + (t - Lt) * evap   between two pheromone deposits
E(t) = Ev                         if there is a deposit at time t

where Lt is the latest deposit time, evap is a constant which rules the increase of the evaporation and Ev is the initial evaporation value. The pheromone quantity then becomes:

τij(t) = τij(t-1) - E(t-1) + deposit(t)

where deposit(t) is the same as above. This model gives us new pheromone amount curves (fig. 11) which, smoothed by integration, give fig. 12.
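A sketch of this second model, combining the growing evaporation E(t) with the pheromone update above. The numerical values of Ev and evap are illustrative only, not taken from the paper.

```python
def adaptive_evaporation_run(deposits, ev=0.025, evap=0.001):
    """Pheromone with the adaptive evaporation of section F.
    deposits[t] is the pheromone deposited at step t (0 when the item
    is not consulted).  Between two deposits the evaporation E grows
    with the time elapsed since the latest deposit; it falls back to Ev
    when a deposit occurs.  The pheromone may become negative."""
    tau, e, last = 0.0, ev, 0
    curve = []
    for t, dep in enumerate(deposits):
        tau = tau - e + dep            # tau(t) = tau(t-1) - E(t-1) + deposit(t)
        curve.append(tau)
        if dep > 0:
            e, last = ev, t            # E(t) = Ev when there is a deposit at t
        else:
            e = e + (t - last) * evap  # E(t) = E(t-1) + (t - Lt) * evap
    return curve
```

The resulting curve can then be passed to integrate_trapeze() above to obtain the smoothed interest.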

Fig. 12. Smoothing of fig. 11

Here the results are conclusive: we obtain a simulated interest evolution quite similar to the one of fig. 6, which respects our requirements. Therefore we now have to run the second test on this model to confirm it. We simulate the random strategy, which makes a theme go randomly to ranks 1 or 10, and we smooth the curves by integration:


Fig. 13. Quantity of pheromone for unpredictable strategy

Fig. 14. Smoothing of fig. 13

We can see that, after smoothing, the curve corresponding to the random strategy is very close to the one corresponding to the strategy using the permanent rank 5. This confirms our model.

VI. CONCLUSION
In this work, we wanted to show how it is possible to build real-time individualized web pages, user by user. We showed how numerous works had tried to obtain presentations which are specific to groups of users. These methods use heavy statistical techniques leaning on groupings of users with the same mode of consultation, or of users having the same purposes. These groupings use user models and are for that reason impossible to adapt to the construction of individualized screens in real time. The general architecture of the system which we want to build is represented in fig. 1. It shows at the same time the entering information and the classes of possible usages of the results of the treatments.

Three methods to build pages using ants were presented; we now analyze them shortly:
- The Dorigo algorithm does not work in our specific situation. We think that this is due to the possibility of remaining at the same rank, because this obliged us to modify the algorithm, mainly to avoid a division by zero.
- The analogy with real ants also obliged us to introduce a non-natural modification when the rank becomes worse: the negative deposit of pheromone. Results are quite good but difficult to use, as the curves are very choppy and irregular. This obliged us to apply a smoothing method. Two methods were analyzed: the first one uses a sliding window of 50 steps of the simulation, the second one uses integration by the trapeze method. The results are:
o The sliding window gives us very good results, as can be seen in fig. 6 and fig. 7, but does not fit exactly what we want, because the result depends on the relationship between the access period of the user and the size of the window. If the period is too long compared with the window size, the smoothing no longer exists; this leaves us the choice between a very long window or a way to modify this size dynamically according to the user strategy. The second problem is privacy preservation, which becomes more difficult as we are obliged to keep more data about the user's previous actions; preserving these data is dangerous and expensive.
o The integration method gives us a better solution to preserve privacy. It also has the advantage of being independent of the user strategy, and thus solves the two problems previously exposed. We now have to check this model with regard to computer load and response time. This model also has to be confronted with real users, to be sure that the system follows what users expect.
The next step will be to make more tests. Before using real users, we feel that artificial users with more sophisticated strategies than the rustic ones used for the first tests will allow us to determine parameters closer to what is needed for the final system, and thus reduce the risk of putting off users. These artificial users are defined according to three parameters:


- the interval of time between two consultations, with three modalities (long, short, irregular);
- the variety in the choice of items, with two modalities (large diversity, weak diversity);
- the fidelity in the choices of the user, with two modalities (with fidelity, without).
This gives 12 kinds of users, which are summarized in table V, with the legend given in table IV.

TABLE IV
LEGEND OF TABLE V
(the three symbols used in table V stand respectively for: high, weak, irregular)
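The 12 combinations can be enumerated directly (a small Python illustration; the modality labels follow the list above):

```python
from itertools import product

# The 12 kinds of artificial users of table V: every combination of the three
# parameters (frequency of consultation, diversity of items, fidelity).
frequencies = ("long interval", "short interval", "irregular interval")
diversities = ("large diversity", "weak diversity")
fidelities = ("with fidelity", "without fidelity")

artificial_users = [
    {"frequency": f, "diversity": d, "fidelity": fi}
    for f, d, fi in product(frequencies, diversities, fidelities)
]
assert len(artificial_users) == 12   # 3 x 2 x 2 combinations
```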

TABLE V
STRATEGY OF ARTIFICIAL USERS
Users:      1 2 3 4 5 6 7 8 a b c d
Frequency:  (symbols as defined in table IV)
Diversity:  (symbols as defined in table IV)
Fidelity:   (symbols as defined in table IV)

What we expect from these standardized users is:
- realization of comparisons of performance between the various methods;
- realization of comparisons of the quality of the constructed pages, and thus of the quality of the realized learning;
- verification that the methods are useful whatever the mode of consultation of a site;
- refining the parameters of the method;
- verification, by using typical populations (created by fixing percentages of each of the modes of consultation for a set of users), of the capacity of the system to accept the load resulting from the used method.
The last work will be the check of the psychological perception of the method by the users, at the same time as the definition of the parameters to be applied to this model.

Among the questions which must be asked, and especially must receive an answer:
- Is it acceptable for the user to receive a modified page, even though his favorite items are placed at the head of the page?
- What is the risk of creating confusion for the user (the "lost in hyperspace" syndrome)?
- At what distance from its previous position can an item be placed during the evolution of the menu without provoking disorientation?
- What is acceptable for the users in terms of memorization of their past by the system?
- What is the meaning of the time used to read the page (the time between its display and the first click on it)? Can this time be analyzed as the opposite of the quality of the presented page, as the trace of a problem, whatever it is, between the user and the presented contents, or is it only a reading time?
- What is the meaning, for the user, of the period between two consultations? What is the law followed by the perception of time? Is it linear, exponential, logarithmic, or does it follow a more complex law to be determined?
- What value should be granted to the preferences declared by the user during his registration on the service?
- Is it necessary to grant a different importance to the consultations according to:
o the hour during the day?
o the day of the week?
o a longer period?

This document constitutes the first step of our work. We now have a method to construct menus dynamically for each user, but we have to assess it first with simulated users, then with real users. All these questions open a large field of research, which needs a web site and real users in order to be sure of giving the right answers: in many cases a wrong answer could lead to an inadequate solution for the users.

REFERENCES
[1] Mike Perkowitz, Oren Etzioni. Adaptive web sites: automatically synthesizing web pages. Proceedings of AAAI, pages 727-732, 1998.
[2] Mike Perkowitz, Oren Etzioni. Adaptive web sites: automatically learning from user access patterns. Proceedings of the 6th International WWW Conference, Santa Clara, 1997.
[3] Gérard Kubryk. Models' specifications to build adaptative menus. ICEIS 2005, Miami, May 2005.
[4] Francis Heylighen. Collective intelligence and its implementation on the Web: algorithms to develop a collective mental map. Computational and Mathematical Organization Theory, 5(3), pp 253-280, 1999.
[5] Francis Heylighen, Carlos Gershenson. The meaning of self-organization in computing. IEEE Intelligent Systems, section Trends & Controversies - Self-organization and information systems, May/June 2003.
[6] Carlos Gershenson, Francis Heylighen. When can we call a system self-organizing? Advances in Artificial Life, 7th European Conference (ECAL 2003), Dortmund, Germany, pp. 606-614, LNAI 2801, Springer, 2003.
[7] Danilo Benzatti. Ants colony and multi-agents: optimizing the ant system, simulation and results. AI Depot, Emergent intelligence, 2001-2003.
[8] Eric Bonabeau, Guy Theraulaz. Swarm smarts. Scientific American, March 2000, pp 72-79.
[9] Pierre Lévy. Collective Intelligence: Mankind's Emerging World in Cyberspace. Plenum, New York, 1997.
[10] Audrey Dussutour, Vincent Fourcassié, Dirk Helbing, Jean-Louis Deneubourg. Optimal traffic organisation in ants under crowded conditions. Nature, vol 428, pp 72-73, 4 March 2004.
[11] Marco Dorigo, Gianni Di Caro, L. M. Gambardella. Ant algorithms for discrete optimization. Artificial Life, 5(2), 137-172, 1999.
[12] Marco Dorigo, Gianni Di Caro. Ant colony optimization: a meta-heuristic. Proceedings of the Congress on Evolutionary Computation, vol 2, pp 1470-1477, IEEE Press, 1999.
[13] R. Schoonderwoerd, O. Holland, J. Bruten, L. Rothkrantz. Ant-based load balancing in telecommunications networks. Adaptive Behavior, 5(2), 169-207, 1996.

