
Orchestration of Advanced Motor Skills in a Group of Humans through an Elitist Visual Feedback Mechanism

Thrishantha Nanayakkara*, J. M. L. Chandana Piyathilaka**, A. Prasanna Siriwardana**, S. A. Akila Mike Subasinghe**, and Mo Jamshidi***

* Department of Mechanical Engineering, University of Moratuwa, Katubedda, Moratuwa, Sri Lanka
** Rinzen Laboratories (pvt) Ltd, 92A, G. H. Perera Mawatha, Rattanapitiya, Boralasgamuwa, Sri Lanka
*** Department of Electrical and Computer Engineering, University of Texas at San Antonio, Texas, USA

Abstract – A group of humans with diverse body dynamics and training backgrounds working on machines with different dynamics can be considered as a system of live systems. To the best of the authors' knowledge, there is no automated mechanism to evolve an elite skill in such a system of systems in a factory through interaction among the systems, nor is there a simple model that could explain the dynamics of the orchestration of skills through interaction. This paper presents a method that can be adopted to automate the evolution of an elite skill in a factory of workers operating a given type of machine to produce a given product, through successive induction of an evolved elite skill on other workers. It also proposes a simple model that can be used to explain complex phenomena that cannot be explained by conventional learning schemes. A wireless data exchange system was adopted to transmit the machine speed profiles to a central data server. A technical expert selected the current elite speed profile from the database of profiles registered by each worker in the server and broadcast the selected profile to the data terminals of all other workers. Every worker made an attempt to match the given elite profile. The crossover between the elite target profile and the natural skills of each worker successively generated machine speed profiles that could beat the given elite profile. This process was repeated until the average efficiency of the whole group converged to an optimum. The proposed method was implemented at a leading garment exporter in Sri Lanka. The results showed that the overall efficiency of the factory improved from 45% to 74%.

Index Terms – Evolution of elite skills, dynamics of learning as a group, linguistic optimality criteria, heuristic selection of an elite.

I. INTRODUCTION

Operating a machine to manufacture a product involves continuous generation of corrective motor commands that interact with the dynamics of the muscles and of the machine to make a product with the desired features. A motor skill consists of motor primitives that map a particular area in the perception space to a simple motor command [1] - [5]. A complex motor skill seems to be constructed by adaptive combination of these motor primitives [4]. Therefore, the skill to operate a machine to give desired results largely depends on the basis set of motor primitives, their shapes, and the way they are combined when a corrective action is needed. Humans acquire most motor skills through inspection. A good elaboration on how players may acquire the skill to play squash through inspection of expert players is given in [6]. Therefore, visual inspection must be helping the brain to construct new motor primitives. A model that translates visual extraction of movement features to internal motor primitives is suggested in [7]. Yet, to the best of the authors' knowledge, no work has been done on the dynamics of skill refinement in a group of humans with a common objective through exchanging visual feedback among each other in an optimal sense.

A group of humans operating a set of machines can be viewed as a live system of systems with diverse machine dynamics, human body dynamics, attitudes, and training backgrounds. Because each individual is characterized by his/her own attitudes that change over time, a live system of systems is highly unpredictable and chaotic. Given a common task to be optimized, each individual may give priority to different aspects of the vector of optimality criteria. Therefore, if there is an automatic mechanism to harmonize the activity in the group and guide individuals to improve their skills continuously, a large number of manufacturing companies and training centers can accrue immense benefits.

In this study we focus on the evolution of a machine operating skill in a group of workers of a garment factory. The skill was evaluated using three evaluation criteria: maximization of the smoothness of the machine speed profile, minimization of the time taken to finish the job, and minimization of non-value-adding time during the job. A data terminal attached to each machine sensed the speed of the machine, plotted it on a screen in front of the worker, and transmitted the data to a remote data server through a wireless data link. A technical expert visually inspected the speed profiles registered at the remote server and picked what he perceived to be the best profile. In this case, there was no explicit mathematical model that quantified each element of the vector of the multiobjective evaluation criteria. Once an elite profile was chosen, it was transmitted back to all the data terminals of the workers to be plotted on top of each machine speed profile. The workers, who had been verbally briefed about the evaluation criteria, then had to extract pattern mismatches and work out a learning strategy to improve their performance. It was observed that while trying to match the elite profile, some workers emerged as new elite workers, because new concepts emerged from the interaction between a given elite profile and the natural dynamics and attitudes of another worker. This process of skill innovation through interaction among complex human dynamics and thoughts continued until the live system of systems converged to a high standard of performance.

The rest of the paper is organized as follows. Section II elaborates the proposed model of internal model construction and learning. The hardware setup used for data collection and communication is described in Section III. Section IV analyses the experimental data collected in a real factory environment. Finally, a discussion and a conclusion are given in Section V.

II. THE PROPOSED MODEL OF INTERNAL MODEL CONSTRUCTION AND LEARNING

In the garment factory considered in this study, supervisors used three evaluation criteria to assess workers:

$$J_1 = \max(t_i), \quad \forall\, \omega(t_i) > 0$$

$$J_2 = \ddot{\omega}(t_i)$$

$$J_3 = \sum t_i, \quad \forall\, \omega(t_i) < \varepsilon$$

where $\omega(t_i)$ is the speed of the machine at time $t_i$, and $\varepsilon$ is a threshold for the machine speed. These correspond to the time taken to finish the job, the smoothness of the machine speed profile, and the non-value-adding (idle) time, respectively. Though the factory supervisors do assess the skill of the workers using these criteria, they find it very difficult to give specific instructions to the workers as to where they should improve and exactly what they should do to achieve that improvement, because there is no visual comparison among alternative solutions. In order to provide specific visual feedback, a data terminal attached to each machine transmitted the machine speed profiles of each worker to a remote data server. A technical expert selected the best speed profile from among all these machine speed profiles based on the three evaluation criteria mentioned above. The exact set of weights assigned to each objective in the multiobjective evaluation criteria depended on the individual expert. Once an elite profile was chosen, it was transmitted back to the data terminals of all other workers and displayed on top of their individual machine speed profiles. The workers combined this visual information with the supervisors' feedback to construct an internal model that associated features in the visual display of velocity profiles with the evaluation criteria. This internal model is known as an internal critic in a reinforcement based learning mechanism [8]. The internal critic helps the workers to predict the evaluation he/she will earn given a machine speed profile.

Here, we propose the skill innovation model shown in figure 1, where $\psi^*$ is the elite machine speed profile at a given time, $\psi$ is the actual machine speed profile of a given worker, $\Delta V$ is the visual error between the reference speed profile and the actual speed profile, $\alpha_V$ is the corresponding weight given by the subject, $\Delta J$ is the difference between the internal evaluation of the actual machine speed profile and that of the elite speed profile, $\alpha_J$ is the corresponding weight assigned by the subject, $\psi_m$ is the modified machine speed profile innovated by an internal model responsible for profile planning, and $u$ is the motor command computed by the motor primitives.

Since visual feedback of the elite speed profile and that of the worker concerned is given on the screen, the internal critic can calculate the difference between the evaluations of the two profiles, given by $\Delta J$. The subject may also decide to give more weight to the visual disparity $\Delta V$ between the elite reference speed profile and the one enacted by the subject. Both pieces of information can be used to update an internal model that explores novel machine speed profiles. The modified or innovated speed profile is kept in memory and compared with the resulting speed profile. The direct error between these two speed profiles can be used to update the parameters of the internal motor model that controls the muscles, in a supervised learning mechanism. The innovation process can continue until the worker is satisfied with the internal evaluation he/she receives.

Figure 1: The proposed reinforcement based innovation mechanism.

However, one should notice that the proposed learning mechanism admits many possible learning outcomes. The worker concerned can follow a conservative approach by trying to match the elite speed profile to maximize the evaluation he/she receives, or he/she can decide to go for an innovation. Though the choice to innovate is risky for the worker concerned, it can lead to revolutionary improvements in the whole factory. Figure 2 shows how two workers responded to such an elite profile across two trials. It is evident in figure 2 that worker 1 (subject 1) took an innovative approach to optimize the evaluation criteria, whereas worker 2 made an attempt to match the pattern given by the elite. Therefore, this way of evolving elite skills in a group of diverse humans is quite different from a colony of robots or machines that tries to mimic the salient features of an elite behaviour selected or demonstrated by a human. In contrast, the dynamics in a human group can be very explorative, even though the trigger to change is given by a particular machine speed profile chosen by another human technical expert. For instance, worker 1 had been operating the machine in a different manner until the particular elite profile was displayed on his/her screen. Therefore, the crossover of new information with the existing background training, attitudes, creativity, and of course the dynamics of the body and of the machine itself could produce a new elite that is much better than what is available at a given time.
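The three evaluation criteria of this section and the two error signals of figure 1 can be illustrated with a minimal Python sketch. Several choices here are assumptions made only for illustration and are not taken from the paper: the profile is uniformly sampled with step `dt`, the smoothness cost is aggregated as the mean absolute second difference, the scalar evaluation is an unweighted sum of the three costs, the visual disparity is an RMS difference, and the two errors are combined linearly. The function names `evaluate`, `disparities`, and `internal_error` are hypothetical.

```python
from math import sqrt
from typing import NamedTuple, Sequence, Tuple


class Costs(NamedTuple):
    total_time: float  # J1: last instant at which the machine is running
    roughness: float   # J2: aggregated second derivative of the speed
    idle_time: float   # J3: time spent below the speed threshold


def evaluate(speed: Sequence[float], dt: float, eps: float) -> Costs:
    """Approximate J1, J2, J3 for a uniformly sampled speed profile."""
    running = [i for i, w in enumerate(speed) if w > 0]
    if not running:
        return Costs(0.0, 0.0, 0.0)
    last = running[-1]
    total_time = last * dt
    # Second finite difference as a stand-in for the second time derivative.
    second = [
        (speed[i + 1] - 2 * speed[i] + speed[i - 1]) / dt ** 2
        for i in range(1, len(speed) - 1)
    ]
    # Assumed aggregate: mean absolute value over the profile.
    roughness = sum(abs(a) for a in second) / len(second) if second else 0.0
    # Idle time is counted only up to the last running sample (assumption).
    idle_time = dt * sum(1 for w in speed[: last + 1] if w < eps)
    return Costs(total_time, roughness, idle_time)


def disparities(worker: Sequence[float], elite: Sequence[float],
                dt: float, eps: float) -> Tuple[float, float]:
    """Return (delta_v, delta_j) between a worker profile and the elite."""
    n = min(len(worker), len(elite))
    if n == 0:
        raise ValueError("profiles must not be empty")
    # Visual disparity: RMS pointwise difference (assumed form).
    delta_v = sqrt(sum((worker[i] - elite[i]) ** 2 for i in range(n)) / n)
    # Evaluation difference: unweighted sum of the three costs (assumed form).
    delta_j = sum(evaluate(worker, dt, eps)) - sum(evaluate(elite, dt, eps))
    return delta_v, delta_j


def internal_error(delta_v: float, delta_j: float,
                   alpha_v: float, alpha_j: float) -> float:
    """Weighted error a subject might minimize (hypothetical linear form)."""
    return alpha_v * abs(delta_v) + alpha_j * abs(delta_j)
```

Under these assumptions, a subject with `alpha_v` much larger than `alpha_j` is driven mainly by pattern matching, while a subject with `alpha_v` much smaller than `alpha_j` is driven mainly by the difference in evaluation, regardless of how different the two curves look on the screen.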

Figure 2: Example of emergence of new machine operating skills.

III. DATA ACQUISITION AND WIRELESS COMMUNICATION

The motor speed of the machine was sensed using an optical encoder and then sent to the graph module as shown in figure 3. The resulting pulse train is counted using the internal hardware counter of the microcontroller. The sampling time is set proportional to the Standard Minute Value (SMV), which is the standard time taken by an average operator to complete the operation. Before starting the operation, the operator has to log in to the database server. A keyboard and the I-button input are used to enter information such as the style number, the operation number, and the EPF number of the operator. This information is sent to the main database server via a wireless link, and the server then sends the SMV back to the graph module. The SMV is then used to scale the X axis and the Y axis of the display. When the operator logs in, the data points of the standard graph, chosen beforehand by the manager as the best available graph for the current operation, are sent to the graph module via the wireless link. Once the data points are received by the graph module, a checksum is calculated, and if it is correct the standard graph is plotted in the background of the display. When the operator starts the operation, the machine speed profile is plotted while the standard curve is displayed in the background. Once the operator beats the standard curve ten times in a row, a new graph is sent to the main database server and saved until it is studied by a manager.

Figure 3: The data communication network (block labels: sewing machine, speed signal, graph module, RF wireless communication, database server).

Each graph consists of 120 Y scale values. Since the maximum number of data points that can be sent in a data packet is 28, 5 packets are sent for each graph.

IV. DYNAMICS OF SUCCESSIVE IMPROVEMENT OF GROUP SKILLS

Two stitching tasks, $T_A$ shown in figure 4 and $T_B$ shown in figure 5, were selected from among a set of actual operations handled by a particular group. For simplicity, we show the results of two workers, $W_1$ and $W_2$, for task $T_A$ to demonstrate two successful strategies, one conservative and one innovative. Worker $W_3$ for task $T_B$ was chosen to demonstrate that learnt skills can also be washed away through a visual feedback system.
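The analysis in this section tracks, trial by trial, the correlation coefficient between a worker's speed profile and the elite profile. The paper does not state which correlation coefficient was used; the sketch below assumes the Pearson coefficient over equally sampled profiles, truncated to the shorter of the two. The function names `correlation` and `correlation_trend` are hypothetical.

```python
from math import sqrt
from typing import List, Sequence


def correlation(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation between two sampled speed profiles."""
    n = min(len(a), len(b))
    if n < 2:
        raise ValueError("need at least two samples per profile")
    a, b = a[:n], b[:n]
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_a * var_b)


def correlation_trend(attempts: Sequence[Sequence[float]],
                      elite: Sequence[float]) -> List[float]:
    """Correlation of each successive attempt with the elite profile."""
    return [correlation(attempt, elite) for attempt in attempts]


# Illustrative, made-up profiles: the second attempt matches the elite.
elite = [0, 2, 4, 4, 2, 0]
attempts = [[0, 1, 4, 3, 2, 1], [0, 2, 4, 4, 2, 0]]
trend = correlation_trend(attempts, elite)
```

A rising sequence of coefficients corresponds to a worker who follows the elite pattern, and a falling sequence to a worker who moves away from it.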

Figure 4: The elite speed profile for task A.

Figure 5: The elite speed profile for task B.

Figure 6 shows the strategy of worker $W_1$ to improve performance once an elite speed profile was displayed on his/her screen. The increasing correlation coefficient suggests that the worker tried to follow the elite profile. This involves adaptation of the internal motor primitives to match those of the worker who produced the elite speed profile.

Figure 6: Correlation coefficient between the successive attempts of $W_1$ and the elite machine speed profile for task $T_A$.

Figure 7 shows how the individual cost components evolve with respect to those of the elite speed profile. They converge towards the elite cost components. Yet, it is unlikely that this type of strategy will lead to a new elite.

Figure 7: Variation of normalized individual cost components and the total cost for $W_1$. The dotted line shows the respective cost of the elite speed profile for task $T_A$ normalized for the data of worker $W_1$.

In contrast to worker $W_1$, $W_2$ adopted a strategy of deviating from the elite profile for task $T_A$. This is evident from the fact that the correlation coefficient in figure 8 reduces across trials. Figure 9 suggests that worker $W_2$ gave top priority to reduction of idle time during the task, followed by minimization of the total time spent to complete the task and then reduction of the jerk of the machine speed profile. This is a creative move triggered by the presence of the elite speed profile on the screen of worker $W_2$.

Figure 8: Correlation coefficient between the successive attempts of $W_2$ and the elite machine speed profile for task $T_A$.

Figure 9: Variation of normalized individual cost components and the total cost for $W_2$. The dotted line shows the respective cost of the elite speed profile for task $T_A$ normalized for the data of worker $W_2$.

Figure 10: Correlation coefficient between the successive attempts of $W_3$ and the elite machine speed profile for task $T_B$.

Figure 11: Variation of normalized individual cost components and the total cost for $W_3$. The dotted line shows the respective cost of the elite speed profile for task $T_B$ normalized for the data of worker $W_3$.

An interesting phenomenon can be seen in figures 10 and 11, where the subject $W_3$ tried to innovate a new machine speed profile to reach the cost of an inferior reference. The fact that the correlation coefficient between the speed profile of the reference and that of worker $W_3$ reduces across trials indicates that worker $W_3$ was deviating from the reference pattern on the screen.

Yet, the evolution of the cost components shown in figure 11 clearly shows that the worker was reaching the levels of the cost components attributed to the given reference. This phenomenon cannot be explained by traditional learning theories, because the fact that the cost of the speed profile of the worker concerned converged towards that of the reference should traditionally imply that the speed pattern also converged towards that of the reference. Therefore, this scenario suggests that there is a learning mechanism where the subject gives prominence to minimizing the difference between the internal evaluations of the two speed profiles without having to worry about the pattern mismatch he/she sees on the screen.

Let us refer to the model shown in figure 1 in order to explain the above phenomenon. The proposed model suggests that a subject has the choice to compare two speed profiles displayed on the screen by the pattern itself or by the evaluation given by the internal critic. The error between the cost estimated for the reference and that for the actual speed profile of the worker is given by $\Delta J$, and the visual disparity between the two speed profiles is given by $\Delta V$. The worker has a few options for the strategy he/she can take based on these internal signals. A normal subject would decide to modify the speed profile so that $\Delta V$ is minimized. Yet, a subject could also decide to give more weight to minimizing the norm of the error of the cost, given by $\Delta J$. Another subject might even decide to maximize $\Delta J$, so that a completely innovative elite solution emerges. The choice depends on the personality and attitudes of the subjects. The results shown in figures 10 and 11 can be explained only by assuming that the subject $W_3$ took a strategy where $\alpha_V \ll \alpha_J$ and wanted to minimize $\Delta J$. This led to the emergence of a machine speed profile that was different from the reference but, as far as the cost is concerned, similar to the reference speed profile.

V. DISCUSSION AND CONCLUSION

The orchestration of skills in a system of live systems differs from that of a system of mechanized systems in that the results cannot be predicted, owing to the many possible strategies the individual systems could take to interact with other systems. This paper proposes a simple model to explain this complex phenomenon. The experiments were done with human subjects in a garment factory for two tasks. Completely non-invasive techniques were adopted to collect data, and all data communications were done using a wireless network without adding constraints on the positioning of the machines. Results of three subjects have been analyzed to demonstrate the effectiveness of the model.

A simple model like the one proposed in this paper can be very useful in guiding a team of workers to evolve an advanced skill through exchange of visualized information. This can also be useful in training players in any sport where intricate details of the speed profile of an arm or leg movement could carry the biggest secrets of the talent. However, selecting the best speed profile out of a pool of profiles seemed to be very subjective, because different managers could hold different elements of the vector of movement evaluation criteria to be more important than others. This may sometimes lead to confusion among the workers. Our results showed that the plasticity of the brain could give room for a wrong elite speed profile to wash away some of the skills learnt so far. Therefore, when applying this technique in a factory environment, care should be taken to plan the training sessions well and give clear instructions to the workers and supervisors. It is our frank feeling that behavioral patterns in a system of live systems cannot be generalized and cannot be repeated, because the orchestration of novel skills largely depended upon the attitudes, training background, supervisors available, and of course the level of fatigue among the workers on a given day. In fact, this is the challenge in studying the dynamics of a system of systems.

REFERENCES

[1] E. Bizzi, F. A. Mussa-Ivaldi, and S. Giszter, "Computations underlying the execution of movement: A biological perspective", Science, vol. 253, pp. 287 – 291, 1991.
[2] M. Arbib, "Perceptual Structures and Distributed Motor Control" in V. B. Brooks, ed., Handbook of Physiology: Motor Control, The MIT press, pp. 809 – 813, 1981.
[3] K. Thoroughman and R. Shadmehr, "Learning of action through adaptive combination of motor primitives", Nature, 2001.
[4] R. Shadmehr and F. A. Mussa-Ivaldi, "Adaptive representation of dynamics during learning of a motor task", Journal of Neuroscience, vol. 14, pp. 3208 – 3224, 1994.
[5] D. M. Wolpert, Z. Ghahramani, and M. I. Jordan, "An internal model for sensory motor integration", Science, vol. 269, pp. 1880 – 1882, 1995.
[6] B. Abernethy, "Expertise, visual search, and information pick-up in squash", Perception, vol. 19, pp. 63 – 78, 1990.
[7] M. J. Mataric, V. B. Zordan, and Z. Mason, "Fixation behavior in observation and imitation of human movement", Cognitive Brain Research, vol. 7, no. 2, pp. 191 – 202, 1998.
[8] R. S. Sutton and A. G. Barto, "Reinforcement Learning", MIT press, Cambridge, Massachusetts, 1998.
