FAST FINGERTIP POSITIONING BY COMBINING PARTICLE FILTERING WITH PARTICLE RANDOM DIFFUSION

Ko-Jen Hsiao, Tse-Wei Chen, and Shao-Yi Chien

Media IC and System Lab
Graduate Institute of Electronics Engineering and Department of Electrical Engineering
National Taiwan University
BL-421, No. 1, Sec. 4, Roosevelt Rd., Taipei 106, Taiwan
{variant, sychien}@media.ee.ntu.edu.tw

ABSTRACT

A new and efficient algorithm for finding fingertips for human computer interfaces is proposed in this paper. The Fingertip Positioning algorithm combines particle filtering with Particle Random Diffusion to find the different fingertips quickly and robustly. Two special methods are used in this algorithm: Particle Random Diffusion and Fingertip Particle Selection. Because it does not check every pixel in each image of the video sequence, the algorithm has a low computational cost. It performs well in cluttered backgrounds and runs in real time. Based on the accurate positions of the fingertips, hand gesture recognition can be done efficiently and robustly.

1. INTRODUCTION

Hand gesture recognition has been a popular research topic in recent years. It enables people to control or interact with computers and consumer electronics [1]. There are many works on real-time hand tracking and hand gesture recognition [1, 2, 3, 4], and various hand models [5] have been constructed for different purposes. Several steps of hand gesture analysis, such as gesture segmentation, tracking, and feature extraction, are essential to gesture recognition, and there are many works related to these topics. Another important aspect of gesture recognition is giving commands accurately, which can be done when the positions of the fingertips and the center of the palm are known accurately [6, 7].
The contribution of this paper is a new method, the Fingertip Positioning algorithm, which estimates the position of each fingertip very quickly and recognizes hand gestures by calculating each finger-palm vector from the positions of the fingertips and the palm. A skin color Gaussian Mixture Model is used to establish a skin probability distribution model. After the frame is converted to a binarized image, the Pattern Matching Condensation algorithm [8] is employed to spread particles inside the hand. Particle Random Diffusion then makes these particles diffuse randomly to the contour of the hand. Next, a quick and efficient method called Fingertip Particle Selection, which distinguishes particles near fingertips, is used. Even though this method may not distinguish all the particles that are near the fingertips, the next step, Particle Grouping, can find each fingertip correctly.

One advantage of this algorithm is that it can work with a cluttered background or an incomplete skin color probability distribution model. This is because the Pattern Matching Condensation algorithm spreads most particles inside the largest skin color region at the first step, and in most cases they do not diffuse to the non-hand regions. So if the background is cluttered, even though many small regions may be misclassified as skin color areas, particles will not diffuse to these regions. Since most particles are located inside the hand, the algorithm can estimate the fingertip positions correctly.

The algorithm proposed in this paper can quickly find the different fingertips in each frame, so it can be made more robust and refined by using temporal information when recognizing hand gestures.

2. PROPOSED ALGORITHM

An overview of the proposed algorithm is illustrated in Fig. 1.

[Fig. 1 flowchart blocks: Each Frame of Video Sequence; Skin Color Detection; Pattern Matching Condensation Algorithm; Random Diffusion of Particles with Directional Constraint; Fingertip Particle Selection; Particle Grouping; Fingertips and Hand Center Estimation; Variance Estimation for Next Frame]
Fig. 1. An overview of the proposed algorithm.

Fig. 1 shows the flow of the Fingertip Positioning algorithm. Each step is described in the following subsections.

2.1. Skin Color Detection

Before the Fingertip Positioning algorithm starts, a skin color density distribution model, which is a Gaussian Mixture Model (GMM) [9, 10] in the YUV color space, is constructed with the Expectation Maximization algorithm [11]. This model is used to calculate, for each pixel, the probability of being skin color. With these per-pixel probabilities, a threshold is set and each image of the sequence is binarized. The result is shown in Fig. 2(b).

2.2. Pattern Matching Condensation Algorithm

To identify the fingertips quickly, the Pattern Matching Condensation algorithm [8], which is an application of particle filtering, is used at the beginning of the algorithm. At the first frame of the input sequence, N particles are generated and randomly located in the binarized image generated in the previous step. The Pattern Matching Condensation algorithm quickly spreads most of these N particles inside the hand in every image of the sequence. The result of this step is shown in Fig. 2(c). Even though the skin color detection of the previous step may mistake some non-skin areas for skin, the Pattern Matching Condensation algorithm can still locate most of the particles inside the hand. Because most of the particles are inside the hand, the following steps, Particle Random Diffusion and Fingertip Particle Selection, can be carried out successfully.

[Fig. 2 image panels (a) to (f)]

Fig. 2. The results of different hand gestures in different steps. (a) Original image, (b) Binarized image, (c) Particles' distribution after Condensation algorithm, (d) Random diffusion of particles, (e) After Fingertip Particle Selection, (f) Final results.

2.3. Particle Random Diffusion with Directional Constraint

After most particles are located inside the hand, a temporary hand center is estimated as the mean of the particle coordinates. Given the hand center position, the particles start to diffuse randomly toward the contour. The position of a particle after k diffusion steps, r_k, is

\[ r_k = \mathrm{center} + \sum_{i=1}^{k} v_i, \tag{1} \]

where center is the hand center position and v_i is the i-th diffusion vector, which can be written as (x_i, y_i); both x_i and y_i are random variables with a Gaussian probability density function. The variance of this Gaussian is directly proportional to the palm radius in the previous image. The method of palm radius estimation is described in Sec. 2.5.1. To make the particles diffuse outward to the contour, each diffusion vector must satisfy the constraint

\[ v_i \cdot c > 0, \tag{2} \]

where c = (r_{k-1} - center) is the vector from the hand center to the particle's position before the k-th step, so the constraint applies to the k-th vector (i = k). The left-hand side of (2) is the dot product of the diffusion vector and the vector c. Therefore the directions of the diffusion vectors always point outward. The diagram is shown in Fig. 3(a).

Each particle diffuses repeatedly until it reaches the non-skin area. When a particle reaches the non-skin area, it moves back to its previous position, which is in the skin area, and diffuses from this position again. After the particle has reached the non-skin area three times, it stops moving; see Fig. 3(b). A binary search on the binarized image between the positions S and T (in the labels of Fig. 3, S marks the particle inside the skin area and T the position reached in the non-skin area the third time) then finds a point F that is near the contour and inside the hand. Finally, the particle moves to this point F. After Particle Random Diffusion with Directional Constraint has been applied to all N particles, the particles have diffused to the contour. The result is shown in Fig. 2(d).

[Fig. 3 diagram labels: Hand center C; Particle; S; T; F; 1st, 2nd, and 3rd time to non-skin area; Diffusion Vector; sticks S1, S2, S3, S4]

Fig. 3. (a) The diagram of Particle Random Diffusion with Directional Constraint. (b) The diagram of the diffusion vector. (c) The diagram of the Fingertip Particle Selection.

2.4. Fingertip Particle Selection
To distinguish the particles on the fingertips from the other particles that diffused to the contour in the previous step, Fingertip Particle Selection is adopted. This technique consists of the following three steps:

Step 1: Generate K sticks for each particle near the contour. The positions of the two ends of each stick are generated with the Gaussian probability density function. Take one particle as an example. The particle is the center of the stick, and the vector from the particle to one end A is generated by the method used to generate the diffusion vector. The other end A' is the point symmetric to A about the particle. An example of a particle and its four sticks is shown in Fig. 3(c).

Step 2: Set an index, initialized to zero, for each particle. Determine whether the two ends of each stick are located in the skin area or the non-skin area. If both ends are in the skin area, the index of the particle decreases by one. If both ends are in the non-skin area, the index increases by one. If one end is in the skin area and the other is in the non-skin area, the index remains unchanged. In the example of Fig. 3(c), stick 1 and stick 2 each increase the index by one, stick 3 decreases the index by one, and stick 4 does not change the index. The final index is computed in this way from all K sticks.

Step 3: Classify a particle as a point on a fingertip if its index is larger than 0.75K, the threshold that gave the best results in experiments. The result is shown in Fig. 2(e).

2.5. Particle Grouping and Palm Scale Data Estimation

This section describes how the palm radius is estimated and how the positions of the different fingertips are found.
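The diffusion of Sec. 2.3 and the selection rule of Sec. 2.4 can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions: the binarized image is a boolean NumPy array indexed as mask[y, x]; the function names (is_skin, outward_vector, diffuse_to_contour, fingertip_index, is_fingertip) are hypothetical; sigma is a standard deviation standing in for the variance chosen from the palm radius; constraint (2) is enforced by rejection sampling; stick vectors are drawn without the directional constraint; the binary search stops at a one-pixel tolerance; and max_steps is only a safety limit.

```python
import numpy as np

def is_skin(mask, p):
    """True if point p = (x, y) rounds to a skin pixel inside the image."""
    x, y = int(round(float(p[0]))), int(round(float(p[1])))
    h, w = mask.shape
    return 0 <= x < w and 0 <= y < h and bool(mask[y, x])

def outward_vector(rng, sigma, c):
    """Gaussian diffusion vector v with v . c > 0 (Eq. 2), by rejection (assumed)."""
    while True:
        v = rng.normal(0.0, sigma, size=2)
        # c is zero on the first step from the center, so any direction is allowed.
        if not np.any(c) or np.dot(v, c) > 0:
            return v

def diffuse_to_contour(mask, center, sigma, rng, max_steps=10_000):
    """Move one particle from the hand center to a skin point F near the contour."""
    center = np.asarray(center, dtype=float)
    s = center.copy()              # S: last position inside the skin area
    t, hits = s, 0
    for _ in range(max_steps):
        t = s + outward_vector(rng, sigma, s - center)
        if is_skin(mask, t):
            s = t                  # accept the step (Eq. 1)
        else:
            hits += 1              # stay at S and diffuse from it again
            if hits == 3:          # T: third arrival in the non-skin area
                break
    else:
        return s                   # step budget used up without three exits
    lo, hi = s, t                  # binary search between S (skin) and T (non-skin)
    while np.linalg.norm(hi - lo) > 1.0:
        mid = (lo + hi) / 2.0
        if is_skin(mask, mid):
            lo = mid
        else:
            hi = mid
    return lo                      # F: inside the hand, within one pixel of non-skin

def fingertip_index(mask, p, sigma, rng, k_sticks):
    """Index of Sec. 2.4: +1 if both stick ends are non-skin, -1 if both are skin."""
    p = np.asarray(p, dtype=float)
    index = 0
    for _ in range(k_sticks):
        a = rng.normal(0.0, sigma, size=2)   # vector from the particle to end A
        end_a = is_skin(mask, p + a)
        end_b = is_skin(mask, p - a)         # A' is the mirror image of A
        if end_a and end_b:
            index -= 1
        elif not end_a and not end_b:
            index += 1
    return index

def is_fingertip(index, k_sticks):
    """Step 3: keep the particle if its index exceeds 0.75 K."""
    return index > 0.75 * k_sticks
```

In this sketch a particle deep inside a wide palm collects an index close to -K, because both ends of almost every stick stay on skin, while a particle at the end of a narrow finger collects an index close to +K, which is why a high threshold separates the two cases.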
2.5.1. Palm Radius Estimation

In Sec. 2.3, a temporary hand center was estimated. Since the particles at that step are placed by the Pattern Matching Condensation algorithm and most of them are located in the palm, this temporary hand center is a little lower than the real hand center. After Particle Random Diffusion, another temporary hand center is calculated. Since some particles are now close to the fingertips, this new hand center is a little higher than the real hand center. For better accuracy, the midpoint of these two temporary hand centers is taken as the final hand center. Starting from the final hand center, several vectors with random directions are generated and extended until their tips reach the non-skin area. The average length of these vectors is taken as the palm radius.

2.5.2. Particle Grouping

After Fingertip Particle Selection, the particles classified as points on the fingertips are kept and the others are eliminated. In this step, Particle Grouping is used to distinguish the fingertips of different fingers. The steps of Particle Grouping are as follows:

Step 1: Give each particle a group index, initialized to zero.

Step 2: Randomly choose a particle Q with zero group index, and set its group index to a new group number.

Step 3: Choose another particle with zero group index whose distance from Q is less than the palm radius. Using the binarized image data, check whether the points between Q and this particle are in the skin color area. If all of them are, set this particle's group index to be the same as Q's.

Step 4: Repeat Step 3 for all other particles whose group index is zero.

Step 5: Repeat Step 2 to Step 4 until no particle has a zero group index.

Step 6: Calculate the average position of the particles with the same group index. These averages are the positions of the different fingertips. If the number of particles with the same group number is too small (less than N/20), the group composed of these particles is ignored.

2.5.3. Variance Estimation by Palm Radius

After the radius of the palm is calculated, the scale of the hand is known. Based on this information, the variance of the Gaussian probability density function used in Particle Random Diffusion and Fingertip Particle Selection for the next frame can be determined. Since the lengths of the vectors and sticks change with the scale of the hand in the previous frame, the fingertips of hands with different scales can still be estimated quickly. The block diagram of this process is shown in Fig. 4.

[Fig. 4 block labels: Current Frame; Center of Particles after Condensation Algorithm; Center of Particles after Random Diffusion; Palm Scale Data Estimation; Variance Estimation; Choosing the Variance of Gaussian pdf for Next Frame; Next Frame; Pattern Matching Condensation Algorithm Mask Size; Random Distribution with Directional Constraint; Fingertip Particle Selection]

After the above steps, the positions of the different fingertips and the palm radius can be correctly estimated. The result for each frame is shown in Fig. 2(f).
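The palm radius estimate of Sec. 2.5.1 and the grouping steps of Sec. 2.5.2 can be sketched together, since the grouping uses the palm radius as its distance limit. This is a hedged illustration under stated assumptions, not the authors' code: the binarized image is a boolean NumPy array indexed as mask[y, x]; the names is_skin, palm_radius, segment_in_skin and group_fingertips are hypothetical; rays grow in one-pixel steps; the number of rays (n_rays) is a free parameter the paper does not specify; and the points between Q and a candidate are sampled about one pixel apart.

```python
import numpy as np

def is_skin(mask, p):
    """True if point p = (x, y) rounds to a skin pixel inside the image."""
    x, y = int(round(float(p[0]))), int(round(float(p[1])))
    h, w = mask.shape
    return 0 <= x < w and 0 <= y < h and bool(mask[y, x])

def palm_radius(mask, center, rng, n_rays=64, step=1.0):
    """Mean length of random-direction rays grown from the hand center (Sec. 2.5.1)."""
    center = np.asarray(center, dtype=float)
    lengths = []
    for theta in rng.uniform(0.0, 2.0 * np.pi, size=n_rays):
        d = np.array([np.cos(theta), np.sin(theta)])
        r = 0.0
        while is_skin(mask, center + (r + step) * d):
            r += step              # extend until the next step would leave the skin
        lengths.append(r)
    return float(np.mean(lengths))

def segment_in_skin(mask, p, q):
    """True if sampled points on the segment from p to q are all skin."""
    n = int(np.ceil(np.linalg.norm(q - p))) + 1
    return all(is_skin(mask, p + t * (q - p)) for t in np.linspace(0.0, 1.0, n + 1))

def group_fingertips(mask, particles, radius, rng, min_size):
    """Particle Grouping (Sec. 2.5.2); returns one mean position per kept group."""
    pts = np.asarray(particles, dtype=float)
    group = np.zeros(len(pts), dtype=int)          # Step 1: 0 means ungrouped
    n_groups = 0
    while np.any(group == 0):                      # Step 5
        free = np.flatnonzero(group == 0)
        q = rng.choice(free)                       # Step 2: random seed particle Q
        n_groups += 1
        group[q] = n_groups
        for i in free:                             # Steps 3 and 4
            if i == q:
                continue
            near = np.linalg.norm(pts[i] - pts[q]) < radius
            if near and segment_in_skin(mask, pts[q], pts[i]):
                group[i] = n_groups
    tips = []
    for g in range(1, n_groups + 1):               # Step 6
        members = pts[group == g]
        if len(members) >= min_size:               # drop groups that are too small
            tips.append(members.mean(axis=0))
    return tips
```

With N particles, the text's rule corresponds to calling group_fingertips with min_size = N/20. As in the steps above, each candidate is compared only with the seed particle Q, so two particles on the same fingertip may occasionally end up in different groups if neither is close enough to the other's seed.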
Fig. 4. Block diagram of variance estimation by palm radius.

3. EXPERIMENTAL RESULTS

Since the purpose of the fingertip tracking algorithm is to distinguish the different fingertips in the image and to recognize different hand gestures, the algorithm is applied in the experiments to several sequences composed of images with different numbers of fingers and various hand gestures. The accuracy of the algorithm is high and the computational time is low. The results are shown in Table 1, which gives the accuracy of distinguishing the different fingertips, and in Table 2, which gives the accuracy of recognizing different hand gestures.

Table 1. The accuracy of distinguishing different fingertips.

Fingertip Number   Test Frame Number   Correct Frame Number   Accuracy
0                  100                 90                     90%
1                  100                 98                     98%
2                  100                 95                     95%
3                  100                 98                     98%
4                  100                 99                     99%
5                  100                 95                     95%
Table 2. The accuracy of recognizing different hand gestures.

Hand Gesture   Test Frame Number   Correct Frame Number   Accuracy
Close          100                 90                     90%
Open           100                 98                     98%
Up             100                 98                     98%
Down           100                 99                     99%
Left           100                 92                     92%
Right          100                 93                     93%
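As a quick arithmetic check on the two tables, the per-row counts can be pooled into one overall accuracy per table. The snippet below only re-adds the published counts; pooling the rows with equal weight is an assumption, since the paper does not state how its overall figure was obtained, and the names used are illustrative.

```python
# Correct-frame counts copied from Table 1 and Table 2 (100 test frames per row).
TEST_FRAMES_PER_ROW = 100
table1_correct = {0: 90, 1: 98, 2: 95, 3: 98, 4: 99, 5: 95}
table2_correct = {"Close": 90, "Open": 98, "Up": 98, "Down": 99, "Left": 92, "Right": 93}

def overall_accuracy(correct):
    """Pooled accuracy in percent over all rows of one table."""
    return 100.0 * sum(correct.values()) / (TEST_FRAMES_PER_ROW * len(correct))

print(f"Table 1: {overall_accuracy(table1_correct):.1f}%")  # 575 of 600 frames
print(f"Table 2: {overall_accuracy(table2_correct):.1f}%")  # 570 of 600 frames
```

Pooled this way, Table 1 gives 575/600, about 95.8%, and Table 2 gives 570/600, exactly 95.0%.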
The experiments are performed on a Pentium IV 2.6GHz computer with 512MB memory. The video sequences are captured at 10 frames per second. The resolution of each image is 320×240 pixels. In all experiments, 1000 particles (N = 1000) are used in the proposed algorithm. The fingertips can be tracked in real time by the proposed algorithm with an overall accuracy of 95% or more (575 of 600 frames, about 95.8%, in Table 1, and 570 of 600 frames, 95.0%, in Table 2). Since the Fingertip Positioning algorithm has no search window and does not need to check all the pixels in the image, the computational cost remains low when it processes sequences with higher resolution.

Some other hand gesture recognition results are shown in Fig. 6, where fingertips pointing in various directions are found correctly. When the hand scale changes, the position of each fingertip can still be estimated.

As mentioned in Sec. 1, the Fingertip Positioning algorithm can work with a cluttered background and when some non-skin areas are misclassified as skin color areas. When there are other skin color areas in the background, such as the face or other hands, the method described in Sec. 2.2 keeps most particles out of these regions and moves them toward the largest skin color region, so the algorithm still works well. The experimental results are shown in Fig. 5, which shows that the algorithm can correctly estimate the positions of the different fingertips.
Fig. 5. The binarized images and the results when there are other skin color regions.
Fig. 6. Some other hand gesture pictures: (a) Original data, (b) Results.

4. CONCLUSIONS AND FUTURE WORK

The experiments show that the Fingertip Positioning algorithm proposed in this paper estimates the positions of fingertips effectively and efficiently. The algorithm successfully diffuses particles inside the hand to the contour by Particle Random Diffusion and distinguishes the particles on the fingertips from those elsewhere on the contour. In addition, the algorithm can find the fingertips when there are other small skin color regions in the background. Based on this Fingertip Positioning algorithm, one can communicate with computers through a human computer interface.

In future work, the Fingertip Positioning algorithm will be improved to find the fingertips of more than one hand by modifying the way the particles are spread inside the hand. Once the fingertips of more than one hand can be positioned, more applications of human computer interfaces can be integrated.

5. REFERENCES

[1] C. Shan, Y. Wei, T. Tan, and F. Ojardias, “Real time hand tracking by combining particle filtering and mean shift,” 2004, pp. 669–674.

[2] Y. Fang, K. Wang, J. Cheng, and H. Lu, “A real-time hand gesture recognition method,” July 2007, pp. 995–998.

[3] H. Zhai, X. Wu, and H. Han, “Research of a real-time hand tracking algorithm,” vol. 2, Oct. 2005, pp. 1233–1235.

[4] L. Bretzner, I. Laptev, and T. Lindeberg, “Hand gesture recognition using multi-scale colour features, hierarchical models and particle filtering,” May 2002, pp. 405–410.

[5] Y. Wu and T. S. Huang, “Hand modeling, analysis and recognition,” IEEE Signal Processing Magazine, vol. 18, pp. 51–60, May 2001.

[6] K. Oka, Y. Sato, and H. Koike, “Real-time fingertip tracking and gesture recognition,” IEEE Computer Graphics and Applications, vol. 22, no. 6, pp. 64–71, Nov. 2002.

[7] S. M. Dominguez, T. Keaton, and A. H. Sayed, “A robust finger tracking method for multimodal wearable computer interfacing,” IEEE Transactions on Multimedia, vol. 8, no. 5, pp. 956–972, Oct. 2006.

[8] E.-J. Holden and R. Owens, “Recognizing moving hand shapes,” in Proceedings of International Conference on Image Analysis and Processing, Sept. 2003, pp. 14–19.

[9] J. Verbeek, N. Vlassis, and B. Krose, “Efficient greedy learning of gaussian mixture models,” Neural Computation, vol. 15, no. 2, pp. 469–485, Feb. 2003.

[10] Z. Xu and M. Zhu, “Color-based skin detection: Survey and evaluation,” Jan. 2006.

[11] T. K. Moon, “The expectation-maximization algorithm,” IEEE Signal Processing Magazine, vol. 13, no. 6, pp. 47–60, Nov. 1996.