

Sensory System as a Tool to Highlight Information Structure in the Sensorimotor Loop

Harold Roberto Martinez Salazar

Hidenubo Sumioka

Artificial Intelligence Laboratory Department of Informatics University of Zurich Zurich, Switzerland Email: [email protected]

Artificial Intelligence Laboratory Department of Informatics University of Zurich Zurich, Switzerland Email: [email protected]

Abstract: In an embodied agent, the sensor morphology is a fundamental element that shapes the information structure of the sensorimotor activity. In particular, the density and distribution of the sensors work as a filter that reduces the dimensionality of the sensory data. Results on a theoretical model show that the discretization of the sensory system is strongly related to the task the agent has to perform. Moreover, we can take advantage of this relation to define the sensory system, which drastically reduces the dimensionality of the system and highlights the information structure in the sensorimotor loop.

Keywords: embodied interaction, dimensionality reduction, information structure, sensory morphology

I. INTRODUCTION

In the field of developmental robotics, the engineering challenge is to build an artificial system that acquires cognitive capabilities through interaction with its environment. As such systems gain degrees of freedom and sensors, they are required to process high-dimensional, multimodal sensorimotor data more effectively. Although dimensionality reduction techniques such as PCA can be applied to this problem, they do not take into account the interrelations among the body, the control system, the sensory system, and the environment. These interrelations are the basic substrate on which cognitive capabilities are built [1]. For this reason, we need information processing that exploits the dynamic, reciprocal coupling among these elements.

To analyze these interrelations, several studies have sought to identify the information structure in sensorimotor activity [3], [4], [5]. They showed that an embodied agent, through coordinated and dynamically coupled sensorimotor activity, induces quantifiable statistical changes in the sensorimotor information, including decreased entropy and increased mutual information, integration, and complexity. Interestingly, experiments with retinal cells arranged in a log-polar geometry showed that the density and distribution of the cells work as a filter that reduces the dimension of the input data and highlights causal relations in the sensorimotor activity [4], [5]. This implies that sensor morphology can perform dimensionality reduction and computation.

In the area of robust control, it has been shown that when a control system is limited by a communication channel, there is an optimal discretization of actions that follows a logarithmic law and keeps the system stable [2]. In general terms, this finding shows that a coarse action suffices when the system is far from its goal, whereas a more careful action is needed when the system is close to the goal. We can regard the discretization of actions as a discretization of the sensor space, which implies that the sensory system can be used as a tool to provide good information to the controller.

In this study we present an algorithm to select a discrete sensory system that takes into account the interrelations among the body, the control system, the sensory system, and the environment. We show that the discretization is strongly related to the task the agent has to perform. Moreover, it drastically reduces the dimensionality of the system while keeping the information structure in the sensorimotor loop.

This paper is organized as follows. In Section II, we describe the models used in our simulations and define the measures used to compare the different sensor morphologies. In Section III, we show the results of the proposed algorithm and the performance of each sensory system. In Section IV, we discuss the implications of the different sensory systems generated with our algorithm. We conclude the paper in Section V.

II. METHODS

In this section we describe the methods used to develop the simulations and the measures used to compare the different sensory systems.

A. Value Iteration Algorithm

Consider the discrete-time, time-invariant system

$$x_{k+1} = f(x_k) + g(x_k) u_k, \tag{1}$$

where $x_k$ is the state vector at time step $k$ and $u_k$ is the action applied to the system by the controller. We can define a cost function $c_i$,

$$c_i(x_k) = x_k' Q x_k + u_k' R u_k, \tag{2}$$

where $'$ denotes the transpose, $Q$ and $R$ are positive definite matrices, and the cost in the goal state $x_{g0}$ is zero ($c_i(x_{g0}) = 0$).

The optimal control problem is to identify a policy $u_k = h(x_k)$ that minimizes the cost for each $x_k$. The total cost accumulated from $x_k$ until the system reaches the goal is known as the value function and can be written as

$$V(x_k) = x_k' Q x_k + u_k' R u_k + V(x_{k+1}), \tag{3}$$

where $V(x_k)$ is the value function. Using a linear combination of basis functions to approximate the value function and the controller, we can rewrite Eq. (3) as

$$\phi(x_k) W_v = x_k' Q x_k + (\phi(x_k) W_u)' R (\phi(x_k) W_u) + \phi(x_{k+1}) W_v, \tag{4}$$

where $\phi(x_k)$ is the vector of basis functions, $W_v$ and $W_u$ are the coefficients of the linear combinations, $\phi(x_k) W_u$ is the policy, and $\phi(x_k) W_v$ is the value function. This representation helps us to implement the value iteration algorithm to find the optimal control [6]. First, we initialize the policy $u_k = h_0(x_k)$ and the value function. Then we exploit the linear form of the approximation to calculate the new value function with the least-squares method, obtaining the new coefficients $W_v^{j+1}$ from

$$\phi(x_k) W_v^{j+1} = c_i(x_k) + \phi(x_{k+1}) W_v^{j}, \tag{5}$$

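then, based on the new $W_v^{j+1}$, we can calculate the new policy that minimizes the value function, as shown in Eq. (6):

$$W_u = \arg\min_{W_u} \left( c_i(x_k) + \phi(x_{k+1}) W_v^{j+1} \right), \tag{6}$$

$$\frac{\partial}{\partial W_u} \left( c_i(x_k) + \phi(x_{k+1}) W_v^{j+1} \right) = 0, \tag{7}$$

$$\phi(x_k) \left( 2 R \left( W_u' \phi(x_k) \right) + g(x_k)' \nabla\phi(x_{k+1})' W_v^{j+1} \right)' = 0. \tag{8}$$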
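The alternation between the least-squares value update of Eq. (5) and a gradient-based policy update for Eqs. (6)-(8) can be illustrated on a toy problem. The following is a minimal Python sketch, not the authors' MATLAB implementation. It assumes a scalar system x_next = a*x + b*u, a single quadratic basis function for the value (V(x) ~ w_v * x^2), and a linear policy (u = w_u * x). The function name `value_iteration`, the parameter values, the sample states, and the learning rate are all illustrative assumptions.

```python
def value_iteration(a=1.1, b=1.0, q=1.0, r=1.0, sweeps=60, gd_steps=100, lr=0.05):
    """Approximate value iteration for x_next = a*x + b*u with cost q*x^2 + r*u^2.

    Assumed approximators (illustrative): V(x) ~ w_v * x^2 and u = w_u * x.
    """
    xs = [0.5 * i for i in range(-10, 11) if i != 0]  # sample states in [-5, 5]
    sum_x2 = sum(x * x for x in xs)
    w_v, w_u = 0.0, 0.0
    for _ in range(sweeps):
        # Value update, Eq. (5): least-squares fit of
        # phi(x) * w_v_new = c(x) + phi(x_next) * w_v, with phi(x) = x^2.
        num = den = 0.0
        for x in xs:
            u = w_u * x
            x_next = a * x + b * u
            target = q * x * x + r * u * u + w_v * x_next * x_next
            num += (x * x) * target
            den += (x * x) ** 2
        w_v = num / den
        # Policy update: gradient descent on w_u for c(x) + phi(x_next) * w_v.
        for _ in range(gd_steps):
            grad = 0.0
            for x in xs:
                u = w_u * x
                x_next = a * x + b * u
                grad += 2.0 * r * u * x + 2.0 * w_v * x_next * b * x
            w_u -= lr * grad / sum_x2  # normalized so the step size is scale-free
    return w_v, w_u


if __name__ == "__main__":
    w_v, w_u = value_iteration()
    print(f"value coefficient w_v = {w_v:.5f}, policy gain w_u = {w_u:.5f}")
```

For this scalar case the result can be checked against the scalar discrete-time Riccati equation, P = q + a^2 P - a^2 b^2 P^2 / (r + b^2 P), whose solution should match w_v, with the corresponding gain -a b P / (r + b^2 P) matching w_u.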

The action selected in $x_k$ determines the next state $x_{k+1}$, so $W_u$ appears inside $\phi(x_{k+1})$ and Eq. (8) cannot be solved for $W_u$ in closed form. As a result, we use gradient descent to find the new policy that minimizes the value function. With the new policy we repeat the process until the coefficients that define the value function no longer change appreciably.

B. Sensory System Generation

To generate the sensory system, we introduce the sensory morphology as part of the policy. We let the value iteration algorithm converge as usual, and in the last iteration, when the optimal solution is computed, we use a genetic algorithm to minimize the value function. In this last step the sensory system is optimized while keeping the previous $W_u$. The sensory system used in our study discretizes the state variables $x_k$ of the discrete linear system

$$x_{k+1} = A x_k + B u_k, \tag{9}$$

where $x_k$ are the state variables, $u_k$ is the input to the system produced by the controller, and the numerical entries of the $2 \times 2$ matrix $A$ and the $2 \times 1$ vector $B$ are drawn from the values 0, 0, 0.1, 1, 0.3, and −1 (their arrangement is not recoverable from the extracted text). The solution of the algorithm was restricted to the range $[-5, 5]$. We limited the number of possible discretized intervals to {7, 9, 11, 13, 15, 17} for both state variables. The genetic algorithm started from a uniformly discretized sensory system and used the cost to identify the optimal discretization. The cost functions used to generate the different sensory systems are

$$c_1 = x_k' \begin{bmatrix} 1 & 0 \\ 0 & 9 \end{bmatrix} x_k, \quad \text{and} \tag{10}$$

$$c_2 = x_k' \begin{bmatrix} 9 & 0 \\ 0 & 1 \end{bmatrix} x_k. \tag{11}$$
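A discretized sensory system of the kind described in Section II-B can be sketched as a quantizer applied to each state variable. The Python sketch below is illustrative only: the function names (`uniform_edges`, `quantize`, `sense`), the choice of the interval midpoint as the sensor reading, and the hand-written non-uniform boundaries are assumptions, not the sensory systems produced by the genetic algorithm. The costs `c1` and `c2` follow Eqs. (10) and (11) as read here, with weights diag(1, 9) and diag(9, 1).

```python
from bisect import bisect_right


def uniform_edges(n_intervals, lo=-5.0, hi=5.0):
    """Boundaries of n_intervals equal-width intervals covering [lo, hi]."""
    step = (hi - lo) / n_intervals
    return [lo + i * step for i in range(n_intervals + 1)]


def quantize(x, edges):
    """Map x to the midpoint of the interval that contains it (clamped to range)."""
    x = min(max(x, edges[0]), edges[-1])
    i = min(bisect_right(edges, x) - 1, len(edges) - 2)
    return 0.5 * (edges[i] + edges[i + 1])


def sense(state, edges_per_variable):
    """Discretize each state variable with its own set of interval boundaries."""
    return tuple(quantize(x, e) for x, e in zip(state, edges_per_variable))


def c1(x):
    """State cost of Eq. (10): x' diag(1, 9) x."""
    return x[0] ** 2 + 9.0 * x[1] ** 2


def c2(x):
    """State cost of Eq. (11): x' diag(9, 1) x."""
    return 9.0 * x[0] ** 2 + x[1] ** 2


if __name__ == "__main__":
    uniform = uniform_edges(7)
    # Hypothetical non-uniform boundaries, finer near the goal state at 0.
    fine_near_goal = [-5.0, -2.5, -1.0, -0.25, 0.25, 1.0, 2.5, 5.0]
    state = (0.6, -3.0)
    print(sense(state, [uniform, uniform]))
    print(sense(state, [fine_near_goal, fine_near_goal]))
    print(c1(state), c2(state))
```

With an odd number of uniform intervals on a symmetric range, as in the set {7, 9, 11, 13, 15, 17}, the goal state at the origin falls at the midpoint of the central interval.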

We calculate the performance by evaluating the cost on a uniform grid of 100 × 100 initial states inside the region of interest. For each initial state we calculate the cost accumulated over 10 time steps. The performance is the sum of all these costs.

C. Information Metric

We adopted the mutual information (MI) to measure the deviation from statistical independence between $x_k$ and $x_{k+1}$. The MI quantifies the error we make in assuming that $x_k$ and $x_{k+1}$ are independent variables. The formal definition in terms of single and joint state probability distributions is

$$MI(x_k, x_{k+1}) = -\sum_{i=1}^{I} \sum_{j=1}^{J} P_{x_k x_{k+1}}(i, j) \log \frac{P_{x_k}(i) P_{x_{k+1}}(j)}{P_{x_k x_{k+1}}(i, j)}. \tag{12}$$

If $x_k$ and $x_{k+1}$ are statistically independent random variables, then $P_{x_k x_{k+1}}(i, j) = P_{x_k}(i) P_{x_{k+1}}(j)$ and $MI(x_k, x_{k+1}) = 0$. Conversely, any statistical dependence between $x_k$ and $x_{k+1}$ yields $MI(x_k, x_{k+1}) > 0$.

The model implementation and data analysis were carried out in MATLAB (R2010a, The MathWorks).

III. RESULTS

In this section we show the results of the algorithm used to generate the sensor morphology. Fig. 1 depicts the sensory systems generated for the two cost functions in the region [0, 5]. Each cost defines a family of sensors, independent of the number of discrete regions.

We applied the performance measure defined in Section II-B to compare three sensory systems using the controller for the cost in Eq. (10). The first sensory system is generated using the same cost function that is used to generate the controller. The second is a uniformly discretized sensory system, and the third is a sensory system generated for a different controller. Fig. 2 shows that the best performance under the controller for Eq. (10) is obtained with the sensory system designed for this controller.

We evaluated the MI between the current and future state of the system. A high MI identifies an effective pairing of sensory system and controller: the controller is effectively producing an action based on the current state, and it forces the system to evolve in the desired direction, toward the goal. Fig. 3 shows that the highest MI is obtained with the sensory system that is designed with the same cost function as the controller.
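The mutual information of Eq. (12) can be estimated from observed pairs of discretized states. The Python sketch below is a plug-in (relative-frequency) estimate, offered as an illustration under stated assumptions: the function name `mutual_information` is hypothetical, the inputs are taken to be discrete sensor readings, and the logarithm is taken in base 2 (the text does not specify the base).

```python
from collections import Counter
from math import log2


def mutual_information(pairs):
    """Plug-in estimate of MI (in bits) between the two elements of each pair.

    `pairs` is a list of (current, next) discrete symbols, e.g. the sensor
    readings of x_k and x_{k+1}.
    """
    n = len(pairs)
    joint = Counter(pairs)
    marg_a = Counter(a for a, _ in pairs)
    marg_b = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        p_a = marg_a[a] / n
        p_b = marg_b[b] / n
        # Same form as Eq. (12): minus P(i, j) * log(P(i) P(j) / P(i, j)).
        mi -= p_ab * log2(p_a * p_b / p_ab)
    return mi


if __name__ == "__main__":
    independent = [(0, 0), (0, 1), (1, 0), (1, 1)]
    dependent = [(0, 0), (1, 1)] * 2
    print(mutual_information(independent))
    print(mutual_information(dependent))
```

Pairs that never occur contribute nothing to the sum, so only observed combinations are visited.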

[Figure 1: discretization points of the two sensory systems; horizontal axis x and vertical axis y, both from 0 to 5.]

Fig. 1. The points represent the discretization for both state variables. The sensory systems generated with the cost function from Eq. (10) are in blue (dark gray) and the sensory systems generated with the cost function from Eq. (11) are in green (light gray). The results are the average of 10 trials.

[Figure 3: panel title "Mutual Information"; horizontal axis: discrete intervals; vertical axis: MI; legend: ctr 1 s1, ctr 1 s2, ctr 1 suniform.]

Fig. 3. MI for different sensor configurations and different numbers of intervals. The MI is measured between the current state ($x_k$) and the future state ($x_{k+1}$) of the system, under the controller defined for the cost function in Eq. (10). Three different sensory systems were evaluated: first, the sensory system generated with the cost function from Eq. (10) (blue (dark gray)); second, the sensory system generated with the cost function from Eq. (11) (magenta (gray)); and finally, a uniformly discretized sensory system (green (light gray)). The error bars represent the standard deviation of 10 optimization runs for each number of discrete intervals in {7, 9, 11, 13, 15, 17}.

[Figure 2: panel title "control 1"; horizontal axis: discrete intervals; vertical axis: cost (×10^7); legend: ctr 1 s1, ctr 1 s2, ctr 1 suniform.]

Fig. 2. Difference between performances for different sensor configurations and different numbers of intervals. The performance presented in this figure is calculated using the controller defined by the cost function in Eq. (10) and three different sensory systems. The sensory system in blue (dark gray) was generated with the cost function in Eq. (10). The sensory system in magenta (gray) was generated with the cost function in Eq. (11). The sensory system in green (light gray) is a uniformly discretized sensory system. The error bars represent the standard deviation of 10 optimization runs for each number of discrete intervals in {7, 9, 11, 13, 15, 17}.
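The performance measure of Section II-B, reported in Fig. 2, can be sketched as a sum of accumulated costs over a grid of initial states. The Python sketch below is a hedged illustration: the function name `performance` and its parameters are assumptions, the cost is taken to depend on the state only (as in Eq. (10)), and the accumulation is assumed to start at the initial state. The closed-loop map `toy_step` is a hypothetical stand-in and is not the system of Eq. (9) or the controller used in the paper.

```python
def performance(step, cost, n_grid=100, horizon=10, lo=-5.0, hi=5.0):
    """Sum of the costs accumulated over `horizon` steps from every grid point.

    `step` maps a state (x1, x2) to the next state under the closed loop
    (sensing, control, and dynamics); `cost` maps a state to a scalar.
    """
    axis = [lo + i * (hi - lo) / (n_grid - 1) for i in range(n_grid)]
    total = 0.0
    for x1 in axis:
        for x2 in axis:
            x = (x1, x2)
            for _ in range(horizon):
                total += cost(x)
                x = step(x)
    return total


def c1(x):
    """State cost with weights diag(1, 9), as in Eq. (10)."""
    return x[0] ** 2 + 9.0 * x[1] ** 2


def toy_step(x):
    """Hypothetical stable closed-loop map, NOT the system of Eq. (9)."""
    return (0.5 * x[0], 0.5 * x[1])


if __name__ == "__main__":
    print(performance(toy_step, c1))
```

A lower value indicates a better pairing of sensory system and controller, since less cost is accumulated on the way to the goal.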

IV. DISCUSSION

These results support the idea that the sensory system can be used for dimensionality reduction. The sensory system discretizes the state space in such a way that the controller achieves the best possible performance for the given number of quantization levels (Fig. 2). Moreover, the sensory morphology depends on the task (Fig. 1). For this reason, we can exploit this relation to implement an algorithm (see Section II) that selects the best sensory system. This binding relation between the controller and the sensory system can be measured in terms of information structure (Fig. 3). As a consequence, we hypothesize that it is possible to exploit the sensory system to implement a developmental algorithm: since the sensory system filters out data that are not relevant for a specific controller, the agent needs to identify the controller that yields high information structure in the sensorimotor loop.

V. CONCLUSION

This study shows that an embodied agent can accomplish a specific task with a discretized sensory system. We identified a relation between the sensory system and the controller in terms of performance and information structure. This one-to-one relation suggests the possibility of exploiting the sensory morphology to implement developmental algorithms that identify an appropriate controller by maximizing the information structure. We argue that the identification of this controller will be fast, because the sensory system reduces the dimension of the input data and highlights the information structure in the sensorimotor loop.

ACKNOWLEDGMENT

The authors would like to thank Juan Pablo Carbajal for the fruitful discussions about [2], which has been a great inspiration for this study. The research leading to these results has received funding from the European Community’s Seventh Framework Programme FP7/ICT-2007.2.2 no. 231864-ECCEROBOT and FP7/ICT-2009-4 no. 248311-AMARSi.

REFERENCES

[1] Pfeifer, R., Lungarella, M., and Iida, F., Self-organization, embodiment, and biologically inspired robotics, Science, 318, pp. 1088–1093, 2007
[2] Elia, N. and Mitter, S.K., Stabilization of linear systems with limited information, IEEE Transactions on Automatic Control, 46(9), pp. 1384–1400, 2001
[3] Lungarella, M., Pegors, T., Bulwinkle, D., and Sporns, O., Methods for quantifying the information structure of sensory and motor data, Neuroinformatics, 3, pp. 243–262, 2005
[4] Lungarella, M. and Sporns, O., Mapping information flow in sensorimotor networks, PLoS Computational Biology, 2(10), e144, 2006
[5] Martinez, H., Sumioka, H., Lungarella, M., and Pfeifer, R., On the influence of sensor morphology on vergence, in Proceedings of the 11th International Conference on Simulation of Adaptive Behavior, pp. 146–155, 2010
[6] Lewis, F. and Vrabie, D., Reinforcement learning and adaptive dynamic programming for feedback control, IEEE Circuits and Systems Magazine, 9(3), pp. 32–50, 2009
