Optical Font Recognition Based on Global Texture Analysis

N. ZAGHDEN, M. CHARFI, A. M. ALIMI, Senior Member, IEEE
Research Group on Intelligent Machines (REGIM), University of Sfax, ENIS, 3038 Sfax, Tunisia
Email: [email protected], [email protected], [email protected]
Abstract: A new statistical approach based on global texture analysis is proposed for the widely neglected problem of font recognition. It aims at recognizing the Arabic font of a text from an image block without any knowledge of the content of that text. The recognition is based on the K Nearest Neighbour classifier and operates on a given set of known fonts. The effectiveness of the adopted approach has been evaluated on a set of 10 fonts, which are the most commonly used in Arabic writing. Font recognition accuracies of 100 percent were reached by two different methods (fractal dimension, wavelet). Experimental results are also included on the robustness of the methods against image degradation (varying resolution, Gaussian noise) and on the comparison with existing methods.

Keywords: Font recognition, fractal dimension, wavelet, K nearest neighbour classifier, image degradation.

I. INTRODUCTION
A considerable amount of research has been dedicated to optical character recognition (OCR) of printed texts. Up to now, there are no commercially available OCR machines that perform true omnifont recognition, although some do well on single fonts or on a small number of mixed fonts. Recent developments have been oriented toward omnifont recognition systems, which aim at recognizing characters of any font. In spite of the importance of font identification for the identification and analysis of documents, this problem is usually neglected [3]. Shi and Pavlidis [15] use structural features such as histograms of word length, histogram densities and stroke slopes for font feature extraction. Zramdini and Ingold [19] present a statistical approach for font recognition based on local typographical features. Zhu et al. [18] use Gabor filters to apply a global texture analysis to document images. Cruz et al. [3] use statistical moments to characterize the textures of documents. In this paper the global analysis approach is followed to perform a priori Optical Font Recognition (OFR). We use two main approaches to characterize Arabic fonts. In the first method we apply the fractal dimension to discriminate fonts. In the second method we extract font primitives using the best-known wavelets, such as Daubechies, Coiflets and Symlets. Section II describes the image base and the classifier used for font recognition. Section III describes the algorithms used to compute the fractal dimension of images. Section IV describes font feature extraction based on wavelet primitives. Experiments and results are discussed in section V, and conclusions are presented in section VI.

II. FONT RECOGNITION
The text to be analysed is contained in TIFF files. The images were saved at a resolution of 300 dpi. Ten fonts are considered in this work, the most commonly used in Arabic writing: Ahsa, Arabic Transparent, Diwani, Dammam, Buryidah, Kharj, Khobar, Koufi, Naskh and Tholoth (figure 1). Font identification is applied to text blocks without any preprocessing such as the text line location, text line normalization, spacing normalization and text padding used in [3, 18]. Font recognition from given feature vectors is a typical pattern recognition problem. To identify fonts, a K Nearest Neighbour classifier was chosen, and we tested the classification with three distances between training and test data: Hamming, Euclidean and Tchebychef. As this classifier is of the supervised type, two stages are required: a learning stage and a classification stage. For each font, 60 percent of the image base was taken as training data and the remaining images were used for classification.
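The classification stage described above can be sketched with a small nearest-neighbour routine. This is a minimal NumPy sketch, not the authors' implementation: the function name and the toy data are illustrative, and "tchebychef" denotes the Chebyshev (maximum-coordinate) distance.

```python
import numpy as np

def knn_classify(train_X, train_y, x, k=3, metric="euclidean"):
    """Classify one feature vector by majority vote among its k nearest
    training samples, using one of the three distances tested in the paper."""
    if metric == "euclidean":
        d = np.sqrt(((train_X - x) ** 2).sum(axis=1))
    elif metric == "hamming":
        # fraction of differing components (meaningful for binary features)
        d = (train_X != x).mean(axis=1)
    elif metric == "tchebychef":
        # Chebyshev distance: maximum absolute coordinate difference
        d = np.abs(train_X - x).max(axis=1)
    else:
        raise ValueError("unknown metric: %s" % metric)
    nearest = np.argsort(d)[:k]              # indices of the k closest samples
    labels, counts = np.unique(train_y[nearest], return_counts=True)
    return labels[np.argmax(counts)]         # majority vote
```

In the paper's setting, `train_X` holds the fractal or wavelet feature vectors of the training blocks and `train_y` their font labels.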
III. FRACTAL DIMENSION
The fractal dimension is a useful tool for quantifying the complexity of the detail present in an image. To this day there is no universally agreed definition of a fractal, but it is clear that fractals differ in many ways from Euclidean shapes. The fractal dimension is the main characteristic of fractals, and it is assumed to strictly exceed the topological dimension of fractal sets. In this paper we propose a new algorithm to estimate the fractal dimension of images and compare it with existing methods.
Figure 1. Examples of text blocks printed in ten Arabic fonts (a): Ahsa; (b): Arabic Transparent; (c): Buryidah; (d): Dammam; (e): Diwani; (f): Kharj; (g): Khobar; (h): Koufi; (i): Naskh; (j): Tholoth.
There are mainly two different methods to calculate the fractal dimension: box counting and dilation. Several algorithms have been derived from the box counting approach, such as differential box counting [12, 13] and reticular cell counting [1]. The main idea of box counting algorithms is to divide the image into boxes of equal size. The fractal dimension of the set can then be estimated by the equation D = log(Nr) / log(1/r), where Nr is the number of boxes covering the set, each scaled down by a ratio r from the whole (figure 2). Sarkar and Chaudhuri [12, 13] proposed the differential box counting (DBC) approach, which adds a third coordinate for 2-D images, corresponding to the gray level of the boxes. In each box (i, j) the authors compute the maximum and minimum gray values, L and K; the gray-level contribution of that box is then nr(i, j) = L - K + 1, and the total contribution of the image is the sum of the nr(i, j). The new method that we propose for estimating the fractal dimension is derived from the latter and is called the CDB method (Comptage de Densité par Boîte). We consider that the image of size M × M pixels is partitioned using boxes of size s × s, where M/2 > s > 1 and s is an integer, which gives the scale ratio r = s/M. The (x, y) space is partitioned into boxes (i, j) of size s × s, and on each box we compute the density of black pixels nr(i, j). Then

Nr = Σ_{i,j} nr(i, j)

represents the total contribution of the image.
Figure 2. Image decomposition by boxes
Nr is then counted for different values of r, and the fractal dimension is estimated from the least-squares linear fit of Log(Nr) against Log(1/r).
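The CDB estimate described above can be sketched as follows. This is a minimal NumPy sketch under stated assumptions: the set of box sizes and the binarization convention (nonzero = black) are illustrative, not the paper's exact parameters.

```python
import numpy as np

def cdb_fractal_dimension(img, box_sizes=(2, 4, 8, 16, 32)):
    """CDB-style fractal dimension estimate: for each box size s, sum the
    density of black pixels over all s x s boxes to obtain Nr, then fit
    log(Nr) against log(1/r) with r = s / M by least squares."""
    img = (np.asarray(img) > 0).astype(float)    # 1 marks a black pixel
    M = min(img.shape)
    img = img[:M, :M]                            # work on a square crop
    log_inv_r, log_Nr = [], []
    for s in box_sizes:
        if s >= M:
            continue
        m = (M // s) * s                         # drop the ragged border
        boxes = img[:m, :m].reshape(m // s, s, m // s, s)
        density = boxes.sum(axis=(1, 3)) / (s * s)   # nr(i, j) per box
        log_Nr.append(np.log(density.sum() + 1e-12)) # Nr = sum of densities
        log_inv_r.append(np.log(M / s))              # log(1/r)
    slope, _ = np.polyfit(log_inv_r, log_Nr, 1)  # least-squares linear fit
    return slope
```

For a completely filled binary image every box has density 1, so Nr = (1/r)^2 and the estimate approaches 2, the dimension of a filled plane region.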
Compared with the DBC method, our method (CDB) better represents the gray values of the different boxes. Suppose a box contains only two gray levels, 0 and 255; then nr(i, j) equals 256 with the DBC approach. The same value of nr can be obtained for another box that contains a larger number of black pixels than the first but also has 0 and 255 as its minimum and maximum gray levels. The dilation method was also used to calculate the fractal dimension. Its main idea is to turn every white pixel lying within distance n of the set's black pixels into a black pixel, and then to compute the surface area A(Xn) for dilation order n. The fractal dimension can be derived from the least-squares linear fit of Log(A(Xn)) against Log(n).

IV. WAVELET DECOMPOSITION
Two basic properties of the wavelet decomposition, space and frequency localisation, make it a very attractive tool in signal and image analysis. Taking the wavelet transform of an image involves filtering it with a pair of filters, one high-pass and one low-pass. This process is repeated as many times as the desired number of decomposition levels (figure 3). Beyond the first level, the decomposition is applied to the image approximation, which contains the main energy of the original image. We applied several types of wavelets to extract features from the text images. For each level we extract the mean and standard deviation of the approximation, diagonal, horizontal and vertical images, i.e. 8 features per level; decomposing the image up to the third level thus yields 24 parameters per image. With the aim of finding the wavelet that best describes the different fonts, we tested five wavelets: Daubechies, Coiflets, Symlets, Discrete Meyer and Bior.
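The 8-features-per-level extraction can be illustrated with a single filter pair. The sketch below uses the Haar transform as a stand-in for the Daubechies/Coiflet/Symlet filters of the paper (the Haar choice, the function names, and the subband orientation labels are assumptions for illustration only).

```python
import numpy as np

def haar_level(a):
    """One level of a 2-D Haar decomposition: approximation plus three
    detail subbands (orientation labels are nominal in this sketch)."""
    a = a[:a.shape[0] // 2 * 2, :a.shape[1] // 2 * 2]  # trim to even size
    lo = (a[:, ::2] + a[:, 1::2]) / 2                  # low-pass along rows
    hi = (a[:, ::2] - a[:, 1::2]) / 2                  # high-pass along rows
    cA = (lo[::2] + lo[1::2]) / 2                      # approximation
    cV = (lo[::2] - lo[1::2]) / 2                      # vertical detail
    cH = (hi[::2] + hi[1::2]) / 2                      # horizontal detail
    cD = (hi[::2] - hi[1::2]) / 2                      # diagonal detail
    return cA, cH, cV, cD

def wavelet_features(img, levels=3):
    """Mean and standard deviation of the four subbands at each level:
    8 features per level, hence 24 features for a 3-level decomposition."""
    feats, a = [], np.asarray(img, dtype=float)
    for _ in range(levels):
        a, cH, cV, cD = haar_level(a)   # next level works on the approximation
        for band in (a, cH, cV, cD):
            feats.extend([band.mean(), band.std()])
    return np.array(feats)
```

With a library such as PyWavelets the Haar pair would simply be replaced by the desired filter bank (e.g. Db3 or Coif2); the feature layout stays the same.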
V. EXPERIMENTAL RESULTS

Extensive experiments have been carried out to identify Arabic fonts. The two main feature families used to discriminate fonts are fractal and wavelet features.

A. Fractal features

We calculated the fractal dimensions of the images by three different methods: the dilation method, Differential Box Counting (DBC) [12, 13] and our own approach (CDB: Comptage de Densité par Boîte). For the DBC and CDB methods we computed the fractal dimension using five different maximum box sizes: 10, 15, 20, 30 and 40. For the dilation method we computed the fractal dimension using five values of the dilation order: 5, 10, 15, 20 and 30. Several studies have addressed the limits on box sizes [1, 2, 13]. According to [1], the maximum box size is L = M/G, where M is the image size and G the number of distinct gray levels in the image. Sarkar et al. [12, 13] considered M/2 as the maximum box size for an M × M image. In our approach (CDB) the images are converted to binary images, so the number of gray levels is two and the maximum box size is M/2; this limit satisfies the bounds proposed by both Bisoi and Sarkar. For the dilation method we chose low values of the dilation order; most researchers who calculate the fractal dimension rely on box counting, and the dilation method has received little attention. Table I presents the font recognition accuracies obtained with the three fractal dimension methods. Our method compares favourably with the other estimators; its best recognition rate is 96.5 percent.

TABLE I. FONT RECOGNITION RATE BY FRACTAL DIMENSION

Method      Rate (%)
Dilation    88
DBC         58.5
CDB         96.5

B. Wavelet features

We extracted features from the images using five wavelets: Db3, Coif2, Sym5, Dmey and Bior4.4. The font recognition accuracies vary between 97.5 and 100 percent; Table II presents the rate for each wavelet.

Figure 3. Wavelet decomposition (at level 3)

TABLE II. FONT RECOGNITION RATE BY USING WAVELET FEATURES

Wavelet     Rate (%)
Db3         97.5
Coif2       100
Sym5        99
Dmey        97.5
Bior4.4     98.5

The results obtained with the Coiflet wavelet are very surprising, so we modified the training and test data to measure their effect on font recognition. First we chose random image features for the training data; then we reduced the training data to 4 percent of the whole data set. In both cases the Coiflet wavelet still reached the same result (100%).

C. Wavelet and fractal features

We also combined fractal and wavelet primitives to characterize fonts. Table III presents the results obtained by combining the CDB features with each of the wavelets studied in this paper.

TABLE III. FONT RECOGNITION RATE BY USING CDB AND WAVELET FEATURES

Wavelet     Rate (%)
Db3         100
Dmey        99.5
Sym5        99.5
Bior4.4     98.5
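Combining the two feature families amounts to concatenating the vectors before classification. A minimal sketch (function and argument names are illustrative):

```python
import numpy as np

def combined_features(fractal_dims, wavelet_feats):
    """Concatenate CDB fractal estimates with wavelet statistics into a
    single feature vector, which is then fed to the K-NN classifier."""
    return np.concatenate([np.atleast_1d(fractal_dims),
                           np.atleast_1d(wavelet_feats)])
```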
D. Robustness test

All the above experiments were carried out on noise-free images. It is nevertheless important to consider noise, because real images are usually contaminated by it. For each of the 10 fonts, we use 60 noise-free samples for training and 40 noisy images for testing. We studied the effect of additive white Gaussian noise (zero mean) at different levels (figure 4). The signal-to-noise ratio characterizes the degree of noise and is defined by:

SNR = 10 log10 [ Σ_{m,n} (I_{m,n})^2 / Σ_{m,n} (I_{m,n} - J_{m,n})^2 ],

where I_{m,n} and J_{m,n} denote the original and the noisy image, respectively. We applied the CDB and the Coiflet wavelet as primitives characterizing the degraded images. Font recognition degrades markedly with the wavelet features, whereas the CDB method keeps good results for SNR values above twenty. Table IV shows that at noise level SNR = 20, where the image is significantly contaminated, the CDB method still achieves a recognition rate as high as 90.5 percent.

TABLE IV. FONT RECOGNITION RATE BY THE CDB METHOD AT DIFFERENT NOISE LEVELS

SNR         Rate (%)
No noise    96.5
50          96.5
40          96.5
30          96
20          90.5
10          30
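Assuming the usual base-10 logarithm in the SNR definition above, the noise model and the measurement can be sketched as follows (function names are illustrative; the fixed seed makes the sketch deterministic):

```python
import numpy as np

def snr_db(original, noisy):
    """SNR in dB as defined above: ratio of the original image's energy
    to the energy of the noise (the difference between the two images)."""
    original = np.asarray(original, dtype=float)
    noise = np.asarray(noisy, dtype=float) - original
    return 10 * np.log10((original ** 2).sum() / (noise ** 2).sum())

def add_gaussian_noise(img, target_snr_db, rng=np.random.default_rng(0)):
    """Add zero-mean white Gaussian noise scaled to hit a target SNR."""
    img = np.asarray(img, dtype=float)
    signal_power = (img ** 2).mean()
    noise_power = signal_power / 10 ** (target_snr_db / 10)
    return img + rng.normal(0.0, np.sqrt(noise_power), img.shape)
```

Degraded test sets at SNR = 50, 40, 30, 20 and 10 would then be generated by calling `add_gaussian_noise` with the corresponding target value.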
The robustness of the algorithm was also examined under varying resolutions. For each of the 10 fonts we used 300 dpi samples for training and tested at lower resolutions: 200, 100, 75, 60, 50 and 40 dpi. As with the degraded images, the Coiflet features yield poor recognition rates under varying resolution, whereas the fractal dimensions extracted by the CDB method give good results for resolutions above 60 dpi. The results are shown in Table V.

TABLE V. FONT RECOGNITION RATE BY THE CDB METHOD AT DIFFERENT RESOLUTIONS

Resolution (dpi)    Rate (%)
200                 96.5
100                 96.5
75                  96
60                  95
50                  95.5
40                  85
E. Comparison with existing methods

The results obtained in this paper can be compared with those of recent work on font identification. We obtained a recognition rate of 100 percent by two different methods, the Coiflet wavelet and the combination of the CDB method with the Daubechies wavelet, and the recognition was applied without any preprocessing stage. In [3] the font recognition rate is also 100 percent, but that study applies several preprocessing steps, such as partitioning the image blocks into text lines. Regarding robustness, we conclude that the wavelet features cannot discriminate fonts in degraded images, whereas the CDB method keeps better recognition rates on noisy images and under varying test resolutions. The results obtained at different noise levels are slightly worse than those of [3], whereas the results obtained at resolutions below 300 dpi are slightly better. It is also worth noting that the approaches described in this paper are equally applicable to other scripts and languages (Chinese, English and Spanish), whereas approaches based on local typographical features, such as [19], are likely to be script and language dependent.

VI. CONCLUSIONS

We have presented two main approaches for automatic font recognition. In the first we developed a new method for calculating the fractal dimension (CDB). In the second we extracted features from the images using five wavelets. Perfect font recognition accuracy (100%) was obtained both by the Coiflet wavelet and by the combination of the CDB method with the Daubechies wavelet. The CDB method has also been demonstrated to exhibit strong robustness against noise and resolution variations.
Figure 4. Document images at different noise levels: (a) : SNR=10; (b): SNR=20; (c): SNR=30; (d): SNR=40; (e): SNR=50; (f): SNR=0
REFERENCES

[1] A. K. Bisoi, J. Mishra, "On calculation of fractal dimension of images", Pattern Recognition Letters, vol. 22, no. 6-7, pp. 631-637, 2001.
[2] S. S. Chen, J. M. Keller and R. M. Crownover, "On the calculation of fractal features from images", IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), vol. 15, no. 1, pp. 1087-1090, 1993.
[3] C. A. Cruz, R. R. Kuopa, M. R. Ayala, A. A. Gonzalez, R. E. Perez, "High order statistical texture analysis - font recognition applied", Pattern Recognition Letters, vol. 26, no. 2, pp. 135-145, 2005.
[4] I. Daubechies, B. Han, A. Ron and Z. Shen, "Framelets: MRA-based constructions of wavelet frames", Applied and Computational Harmonic Analysis, vol. 14, no. 1, pp. 1-46, January 2003.
[5] X. C. Jin, S. H. Ong, Jayasooriah, "A practical method for estimating fractal dimension", Pattern Recognition Letters, vol. 16, pp. 457-564, May 1995.
[6] S. Kahan, T. Pavlidis and H. S. Baird, "On the recognition of printed characters of any font and size", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 9, no. 2, pp. 274-288, March 1987.
[7] J. M. Keller, R. Crownover and S. Chen, "Texture description and segmentation through fractal geometry", Computer Vision, Graphics and Image Processing, vol. 45, pp. 150-166, 1989.
[8] S. Khoubyari and J. Hull, "Font and function word identification in document recognition", Computer Vision and Image Understanding, vol. 63, no. 1, pp. 66-74, 1996.
[9] B. Mandelbrot, Fractales, hasard et finance, Flammarion, Paris, 1997.
[10] S. Peleg, J. Naor, R. Hartley and D. Avnir, "Multiple resolution texture analysis and classification", IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), vol. 6, no. 4, pp. 518-523, July 1984.
[11] A. P. Pentland, "Fractal-based description of natural scenes", IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), vol. PAMI-6, no. 6, pp. 661-674, November 1984.
[12] N. Sarkar and B. B. Chaudhuri, "An efficient approach to estimate fractal dimension of textural images", Pattern Recognition, vol. 25, no. 9, pp. 1035-1041, 1992.
[13] N. Sarkar and B. B. Chaudhuri, "An efficient differential box-counting approach to compute fractal dimension of image", IEEE Transactions on Systems, Man, and Cybernetics, vol. 24, no. 1, pp. 115-120, January 1994.
[14] A. Seropian, M. Grimaldi, N. Vincent, "Writer identification based on the fractal construction of a reference base", Proceedings of the Seventh International Conference on Document Analysis and Recognition (ICDAR 2003), vol. 2, pp. 1163-1167, 2003.
[15] H. Shi, T. Pavlidis, "Font recognition and contextual processing for more accurate text recognition", Proceedings of the International Conference on Document Analysis and Recognition (ICDAR '97), pp. 39-44, 1997.
[16] Y. Tao, E. C. L. Lam, Y. Y. Tang, "Feature extraction using wavelet and fractal", Pattern Recognition Letters, vol. 22, no. 3, pp. 271-287, 2001.
[17] N. Vincent, T. Frêche, "Gray level use in a handwriting fractal approach and morphological properties quantification", Proceedings of the International Conference on Document Analysis and Recognition (ICDAR '01), Seattle (USA), pp. 1113, September 2001.
[18] Y. Zhu, T. Tan and Y. Wang, "Font recognition based on global texture analysis", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 10, pp. 1192-1200, October 2001.
[19] A. Zramdini and R. Ingold, "Optical font recognition using typographical features", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, no. 8, pp. 877-882, August 1998.