Research Article Received: 3 July 2007,

Revised: 10 December 2007,

Accepted: 27 March 2008,

Published online in Wiley InterScience: 2008

(www.interscience.wiley.com) DOI: 10.1002/cem.1154

A pilot study on colonic mucosal tissues by fluorescence spectroscopy technique: Discrimination by principal component analysis (PCA) and artificial neural network (ANN) analysis†

Sudha D. Kamath (a), Claretta S. D’souza (a), Stanley Mathew (b), Sajan D. George (a), Santhosh C. (a) and K. K. Mahato (a)*

Pulsed laser-induced autofluorescence spectra of pathologically certified normal and malignant colonic mucosal tissues were recorded at 325 nm excitation. The spectra were analysed using three different methods for discrimination purposes. First, all the spectra were subjected to principal component analysis (PCA), and discrimination between normal and malignant cases was achieved using parameters such as spectral residuals, Mahalanobis distance and scores of factors. Second, to understand the changes in tissue composition between the two classes (normal and malignant), a difference spectrum was constructed by subtracting the simulated mean of all spectra of one class (normal or malignant) from the mean spectrum of the calibration set samples. Third, artificial neural network (ANN) analysis was carried out on the same set of spectral data by training the network with spectral features such as mean, median, spectral residual, energy, standard deviation and number of peaks for different thresholds (100, 250 and 500) after 1st-order differentiation of the training set samples, and discrimination between normal and malignant conditions was achieved. The specificity and sensitivity were determined in the PCA and ANN analyses and were found to be 100 and 91.3% in PCA, and 100 and 93.47% in ANN, respectively. Copyright © 2008 John Wiley & Sons, Ltd.

Keywords: colonic mucosa; laser-induced fluorescence; principal component analysis; artificial neural network

1. INTRODUCTION

Colonic cancer is one of the most common causes of death due to cancer in the developed world today. It occurs mostly in the elderly, with 40–50% of all cases occurring after the age of 60 [1]. Colon cancer incidence is higher in developed nations such as the USA and Japan and is lowest in the developing countries of Asia and Africa. In India, it is a fairly rare disease: the National Cancer Registry Programme in India does not list it as one of the 20 most common cancers seen in the Indian population [2]. Most (over 95%) colon cancers are adenocarcinomas, in which underlying genetic mutations result in sequential and progressive molecular and morphological changes now described as the adenoma–carcinoma sequence. Apart from age, the other risk factors associated with colon cancer are a personal history of colorectal cancer, a history of colon or rectal polyps, a personal history of chronic inflammatory bowel diseases such as ulcerative colitis, a family history of colorectal cancer and a diet mostly from animal sources. About 30% of people developing colorectal cancer have disease that is familial. About 3–5% of colorectal cancers are associated with hereditary nonpolyposis colorectal cancer (HNPCC) or Lynch syndrome, and about 1% are associated with familial adenomatous polyposis (FAP) [3].

One of the problems associated with colon cancer treatment is that the disease is frequently detected too late. Fifty-seven per cent of all colorectal cancers have, at the time of diagnosis, spread to surrounding tissue and, in some cases, to distant organs [3]. Colon cancer symptoms can be easily ignored since they are often trivial and non-specific until the disease has reached an advanced stage. Symptoms include progressive constipation, a sense of incomplete evacuation, bleeding per rectum, abdominal discomfort and unexplained anaemia [4]. The 5-year relative survival rate for people whose colorectal cancer is treated at an early stage, before it has spread, is greater than 90% [3]. However, only 39% of colorectal cancers are found at that early stage. Once the cancer has spread to nearby organs or lymph nodes, the 5-year relative survival rate goes down, and if the cancer has spread to distant organs (i.e. the liver or lung) the 5-year survival is less than 10%. Therefore, an effective and reliable early detection system is highly desirable.

* Correspondence to: K. K. Mahato, Centre for Laser Spectroscopy, Manipal University, Manipal-576 104, India. E-mail: [email protected]; E-mail: [email protected]
(a) S. D. Kamath, C. S. D’souza, S. D. George, S. Chidangil, K. K. Mahato: Centre for Laser Spectroscopy, Manipal Life Sciences Centre, Manipal University, Manipal-576 104, India
(b) S. Mathew: Department of General Surgery, Kasturba Medical College, Manipal University, Manipal-576 104, India
† Sudha D. Kamath and Claretta S. D’souza contributed equally to this work.

J. Chemometrics 2008; 22: 408–416

The commonly employed techniques for the detection of colon cancer are the faecal occult blood test (FOBT), flexible sigmoidoscopy, colonoscopy and double contrast barium enema (DCBE). Abnormal areas visualized on endoscopy are biopsied and subjected to histopathological examination for confirmation of the diagnosis. Various imaging techniques such as ultrasound, computed tomography (CT), magnetic resonance imaging (MRI), positron emission tomography (PET) and angiography are used as adjuncts to stage the disease and plan treatment. The mainstay of treatment is complete surgical excision (radical colectomy) followed by adjuvant radiation therapy and chemotherapy [5]. Nevertheless, colon cancer has a recurrence rate of about 40% within 3–5 years after surgery. Although tumour markers such as carcinoembryonic antigen (CEA) and CA 19-9 are used to corroborate the diagnosis of colon cancer, they have the disadvantage that enhanced levels of these markers may also be present in the blood of some people with ulcerative colitis, non-cancerous tumours of the intestines, or some types of liver disease or chronic lung disease. Smoking can also increase the CEA level in blood. Histopathological examination of samples taken as endoscopic biopsy, which is considered the gold standard, suffers from several disadvantages. Firstly, the method is only pertinent to the more advanced stages of the disease, as morphological changes of cells and tissue architecture form the basis of histopathological diagnosis. Further, considerable tissue processing is involved, including formalin fixation, paraffin embedding, microtoming of sections and staining with several dyes, all of which delay the diagnosis. There are often also problems associated with inadequate biopsy samples due to improper clinical examination and biopsies from non-representative areas.
Also, in the case of post-treatment relapses, frequent biopsies are uncomfortable for the patient [4–6]. All these practical difficulties demand the development of new non-invasive or minimally invasive techniques for the early detection of colon cancer.

Recent years have witnessed a steady growth of interest in optical methods for medical diagnostics and, if applied successfully, optical spectroscopy has the potential to contribute to both diagnostic and therapeutic medical applications. Spectroscopy is the study of the interaction of electromagnetic radiation with matter, and a spectroscopic measurement involves irradiating the sample with electromagnetic radiation and measuring its absorption, spontaneous emission (fluorescence, phosphorescence) and/or scattering (Rayleigh elastic and Raman inelastic scattering). Among these techniques, in the UV/VIS spectral regions, absorption and fluorescence spectroscopy have been explored extensively as diagnostic tools for pre-cancer and cancer detection in the surface epithelia of various organ sites (colon, cervix, bronchus, lung, brain, bladder, oesophagus, head and neck, skin, bile duct, breast, stomach and oral cancer tissue). The autofluorescence spectroscopic technique depends upon the excitation of fluorophores in tissue (NADH, flavins, porphyrins, collagen and elastin, as well as the amino acids tryptophan, tyrosine and phenylalanine present in the proteins) with laser light of a suitable wavelength and the consequent emission of fluorescence, followed by classification of normal and malignant tissue using appropriate algorithms. In the case of tissues, the fluorescence spectrum obtained after illumination with electromagnetic radiation at a particular wavelength is a composite spectrum of fluorescence from various fluorophores. Hence the fluorescence spectra can contain information about the changes in biochemical and biophysical properties due to variation in the relative contribution of fluorophores depending upon the stage of malignancy in the tissue [7–10].

Laser-induced fluorescence (LIF) spectroscopy has the potential for early detection of pre-malignant lesions in situ. Since the early 1980s this technique has been developed as a reliable, non-invasive and real time adjunct to endoscopy for gastrointestinal (GI) tissue diagnosis [11]. Advances in fibre optic probe designs now allow in situ and in vivo delivery of electromagnetic radiation to the tissue surface and the collection of autofluorescence from it. Normal and malignant tissues are distinguished by differences in fluorescence spectral features arising from autofluorescent tissue chromophores. The reported accuracy of LIF, with sensitivities, specificities and positive predictive values ranging from 80 to 100, 77 to 95 and 82 to 94%, respectively, approached that of histopathological interpretation of biopsy specimens, with 90% agreement in tissue diagnosis [8–18]. The development of various multivariate statistical tools such as principal component analysis (PCA) [19,20], support vector machine (SVM), etc. has enhanced the reliability and accuracy of classification of normal and malignant cases based on collected autofluorescence spectra. In this work, we focus on the classification of normal and malignant colon tissues excited at 325 nm using PCA and artificial neural network (ANN) [21–24] analysis. As the choice of an appropriate algorithm and wavelength has great influence on the specificity and sensitivity of analysis based on the LIF technique, the present work has practical significance, especially from the point of view of developing an accurate optical diagnostic tool.

2. MATERIALS AND METHODS

2.1. Sample collection and handling

Surgically resected colectomy specimens were cleaned, and representative samples of normal and malignant colonic mucosal tissue of approximate size 3 × 3 mm² were harvested in the operation theatre by the surgeon from the Department of General Surgery, Kasturba Hospital, Manipal University, Manipal. Mirror image samples of each of the normal and malignant cases used in this study were sent for histopathological investigation. During the recording of the spectra, the samples were always kept moist with saline. Trial runs have shown that the spectra remain unchanged for at least 2 h after biopsy if the sample is kept in saline. For each sample, spectra were recorded from several points, separated by a few hundred microns to a few millimetres. Details of the sample preparation and spectral recording are given elsewhere [19–24]. The sample details are given in Table I. No particular care was taken regarding the relative orientation of probe and sample, except to make sure that the probe was visually normal to the tissue surface and that the tissue to probe distance was maintained at nearly 1 mm for all spectral recordings. A change from this orientation by several degrees (say 20°), which would be easily noticed, would change the path length by only about 6% from that of normal orientation, causing only a small change in the spectrum [21].

2.2. Experimental

The schematic of the experimental set-up is shown in Figure 1. The 325 nm radiation from an Nd-YAG MOPO-FDO source (Spectra Physics, Quanta Ray, model: PRO 230 10, MOPO SL) was used for excitation at a 10 Hz pulse repetition rate and 100–200 mJ energy per pulse. The fluorescence emission from the tissue was collected using a fibre-optic probe, designed and fabricated in our laboratory, which is readily adaptable to in vivo studies [19–24]. The probe was a typical seven-fibre probe with a central fibre for excitation and six surrounding fibres for collection of fluorescence. The output from these fibres was coupled into an imaging spectrograph (Acton Spectra pro 150 spectrograph, 300 g/mm, 300 nm blazed grating) equipped with an ANDOR ICCD system (Andor Technology, Northern Ireland, ICCDBH501-25F-01) for spectral recording. The time delay between excitation and collection was provided by a DG 535 delay and gate generator (Stanford Research Systems), which was triggered by the Q-switch advance trigger from the laser power supply. A delay of 500 ns was used to compensate for the cable and other delay times. The total emission after the delay was recorded with a 55 ns gate, which is sufficient to record the total fluorescence. All spectra were recorded with the following parameters: slit width of 250 μm (5 nm band pass), average of 100 laser pulses (10 Hz) for each spectrum, laser pulse energy between 100 and 200 mJ and detector gate width (time of acquisition per shot) of 55 ns. In this study, we have recorded a total of 115 spectra (69 normal and 46 malignant) from 13 normal and 10 malignant colonic mucosal tissue samples. For an excitation wavelength of 325 nm, the emission spectra were recorded in the 350–600 nm region with 1024 data points in each recording.

Table I. Sample details

Spectrum No.   Sample type                 Mean age   Histopathology         Spectroscopy
1–15           Normal calibration set      35 ± 5     Normal                 Normal
16–30          Malignant calibration set   45 ± 10    Malignant              Malignant
31–84          Normal test set             30 ± 8     Normal                 Normal
85–115         Malignant test set          40 ± 8     Normal and malignant   Normal and malignant

Figure 1. Schematic experimental set-up used for laser-induced fluorescence measurement.

2.3. Data analysis

Previously, when spectra were recorded using the fluorescence technique [19–24] from various sites on a single tissue sample, the mean of all the spectra recorded from one sample was taken as representative of that sample for further data processing. In the present work, because of the possibility of site-to-site variation of the disease, all the 115 spectra have been treated as independent data [21]. The spectral analysis for discrimination between normal and malignant conditions was carried out using three different methods: PCA, difference spectra analysis and ANN analysis.

In the PCA method [25] (PLS Plus/IQ, Galactic Corporation, USA), the mean spectrum of the entire data set is first calculated and then the difference between the ith spectrum and this mean (ith spectrum − mean spectrum) is used, so that the spectral features common to all spectra are eliminated. These difference spectra are subjected to eigenvector–eigenvalue analysis, and the associated principal components (PCs) along with their eigenvalues are derived. Using the eigenvalues, contribution to total variance, eigenvectors, sum of squared spectral residuals (sum of squares of differences between observed and simulated spectra over all data points) and other parameters, the number of significant factors (PCs) required for correct data analysis is selected and the PCA based match/no match discrimination analysis is carried out [21,24].

The difference spectra approach determines the spectral alterations between normal and malignant cases. In this method, any given spectrum (normal/malignant) is simulated with the PCs derived from a calibration set (normal/malignant). The simulation reproduces correctly only the spectra of the same class as the calibration set and reproduces only partly the spectra of samples from the other class. The difference spectrum of a test sample (observed − simulated) can give information about the PCs used for the simulation of that sample, which in turn provides information about the changes in biochemical composition of the tissue.

Figure 2. Typical colonic mucosal tissue fluorescence spectra recorded at 325 nm excitation.

An ANN consists of an input layer, one or more hidden layers and an output layer, with a variable number of processing elements (artificial neurons or nodes) in each layer [26–28]. Each node, with the exception of the input neurons, receives multiple weighted inputs and produces an output which is usually a nonlinear function of the inputs. Unlike rule-based methods, an ANN is capable of learning from examples and generalizing; it has the power of pattern recognition and classification. The most widely used class of ANN is the feed-forward network trained with a back-propagation algorithm [21,26–28]. In the current study, a simple three-layer network has been used to discriminate between normal and malignant colonic conditions.
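The PCA and difference-spectrum steps described above (mean-centring, eigenvector–eigenvalue analysis, simulation of a spectrum from the retained PCs, and the observed − simulated residual) can be illustrated with a minimal sketch. This is not the PLS Plus/IQ implementation used in the study: the power-iteration routine, the five-point toy spectra and every function name below are assumptions chosen only to keep the example small and dependency-free.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def fit_pca(spectra, n_pcs, n_iter=200):
    """Mean-centre the spectra and extract leading PCs by power iteration."""
    n, p = len(spectra), len(spectra[0])
    mean = [sum(col) / n for col in zip(*spectra)]
    centred = [[x - m for x, m in zip(s, mean)] for s in spectra]
    # Covariance matrix of the mean-centred spectra.
    cov = [[sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(p)]
           for i in range(p)]
    eigenvalues, pcs = [], []
    for _pc in range(n_pcs):
        v = [1.0 + i for i in range(p)]           # arbitrary start vector
        for _step in range(n_iter):
            w = [dot(row, v) for row in cov]
            norm = math.sqrt(dot(w, w))
            if norm < 1e-12:                      # no variance left to explain
                return mean, eigenvalues, pcs
            v = [x / norm for x in w]
        lam = dot(v, [dot(row, v) for row in cov])    # Rayleigh quotient
        eigenvalues.append(lam)
        pcs.append(v)
        # Deflate: remove the variance captured by this PC.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(p)]
               for i in range(p)]
    return mean, eigenvalues, pcs

def simulate(spectrum, mean, pcs):
    """Rebuild a spectrum from the calibration mean and its PC scores."""
    centred = [x - m for x, m in zip(spectrum, mean)]
    scores = [dot(centred, pc) for pc in pcs]
    simulated = list(mean)
    for score, pc in zip(scores, pcs):
        simulated = [s + score * c for s, c in zip(simulated, pc)]
    return scores, simulated

def squared_residual(observed, simulated):
    """Sum of squares of (observed - simulated) over all data points."""
    return sum((o - s) ** 2 for o, s in zip(observed, simulated))

# Toy calibration set: one spectral band varies around a common mean (hypothetical data).
mean_true = [1.0, 2.0, 3.0, 2.0, 1.0]
band = [1.0, 0.0, -1.0, 0.0, 0.0]
calibration = [[m + t * b for m, b in zip(mean_true, band)]
               for t in (-2.0, -1.0, 0.0, 1.0, 2.0)]
mean, eigenvalues, pcs = fit_pca(calibration, n_pcs=1)

same_class = [m + 1.5 * b for m, b in zip(mean_true, band)]
other_class = [2.0, 2.0, 2.0, 2.5, 1.0]           # extra feature at the 4th point

residual_same = squared_residual(same_class, simulate(same_class, mean, pcs)[1])
residual_other = squared_residual(other_class, simulate(other_class, mean, pcs)[1])
```

In this toy case the spectrum that shares the calibration set's variation is reproduced almost exactly, whereas the spectrum carrying an additional feature leaves a residual at that feature, which is the behaviour the match/no match and difference-spectrum analyses rely on.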

3. RESULTS AND DISCUSSION

Typical fluorescence spectra recorded from normal and malignant colonic mucosal tissue samples are shown in Figure 2. Though the spectra of normal and malignant tissues can be very different, a visual examination can only identify the obviously different cases. Such identification is subjective and depends on the number of cases the clinician has seen. In the present study, in most of the cases, the spectral differences between normal and malignant are less well defined; therefore, PCA and ANN analyses with statistically well-defined parameters have been used and objective discrimination between the normal and malignant classes has been achieved.

3.1. Principal component analysis (PCA)

PCA is a classical statistical method which transforms the attributes of a data set into a new set of uncorrelated attributes called principal components. PCA can be used to reduce the dimensionality of a data set while still retaining as much of its variability as possible [20–25]. Applying PCA to the feature space matrix transforms the original data set into a set of PC scores. The contribution of each PC to the total variance of the spectral data is proportional to its eigenvalue. Higher order PCs often account for less than 1% of the total variance and represent mostly noise [20–25]. PCs that account for more than 1% of the variance in the spectral data are considered informative PCs.

First, the calibration sets for normal and malignant were prepared using pathologically certified samples, selecting 15 spectra from each class randomly. When PCA was performed on the 15 spectra of each set and ‘scores’ were plotted against ‘sample number’, the scores of some spectra were found to be inconsistent with the scores of the others in the set. The spectra with inconsistent scores were then replaced with other spectra of the same class having scores consistent with that set. In this way the calibration sets of normal and malignant were optimized, and they were then used to select the number of PCs required for diagnostic data analysis. Figure 3(a) and (b), respectively, show the eigenvalues and total percentage variance for the seven factors using normal calibration set samples. The figure shows that the first four factors contribute about 92.3% of the total variance in the calibration data, which suggests that these four factors are sufficient to describe all calibration data. Therefore, only these four factors are used for all final calculations.

After optimizing the calibration sets and selecting four informative PCs, the PCA based match/no match discrimination analysis was carried out using three parameters: scores of factors, spectral residuals and Mahalanobis distance. Mahalanobis distance [29] is expressed in units of standard deviation. The Mahalanobis distance, or M-distance, is given by

$$D^{2} = (S_{\mathrm{test}})\, M^{-1}\, (S_{\mathrm{test}})' \tag{1}$$

where $S_{\mathrm{test}}$ is the vector of scores and sum of squared spectral residuals for a given test sample, and $M$ is given by $S'S/(n-1)$, where $S$ contains the corresponding parameters for the calibration set of $n$ standards. In the discrimination analysis, the spectra that fell within the limits were labelled as a ‘match’ and all others as ‘no match’. The match/no match discrimination analysis described here was carried out with the calibration set of normal samples. When each of the 15 normal and 15 malignant calibration set samples was tested, normal samples were classified as normal and malignant samples as malignant. Of the test samples, all 54 normal spectra were classified as normal, 27 out of 31 malignant spectra were classified as malignant and the remaining 4 malignant spectra were classified as normal. In this analysis, all the normal spectra showed Mahalanobis distance values ≤3, 27 out of 31 malignant test spectra showed values >>3 and the remaining 4 malignant spectra showed values ≤3. The specificity and sensitivity of this technique are found to be 100 and 91.3%, respectively. The results of the match/no match analysis are shown in Table II.

Figure 3. PCA (a) eigenvalues, (b) total percentage variance for a model set of 15 colonic mucosal tissue spectra.

Table II. PCA based match/no match test of normal and malignant samples (spectra) against calibration set of 15 normal samples

Sample No.   Match   M-distance             Limit tests    Spectral residual
1            Yes     1.0162133              Pass (PPP)     0.789747
2–14         Yes     0.37048–1.3063789      Pass (PPP)     0.2063288–1.0571101
15           Yes     1.4576779              Pass (PPP)     0.1765312
16           No      17.462555              Fail (FFF)     7.6232792
17–29        No      3.5135585–22.627525    Fail (FFF)     1.6753106–9.7320539
30           No      21.378596              Fail (FFF)     9.2078907
31           Yes     0.7043262              Pass (PPP)     0.361631
32–83        Yes     0.280168–1.7013797     Pass (PPP)     0.208814–1.508202
84           Yes     1.6013382              Fail (FPP)     0.4433901
85           Yes     0.5352469              Pass (PPP)     0.50019
86–94        No      3.0931738–20.899335    Fail (FFF)     1.5312373–9.0930771
95           Yes     2.3128076              Pass (PPP)     0.4043505
96–101       No      4.1120131–6.3931158    Fail (FPF)     1.8904728–2.8163261
102          Yes     2.878605               Pass (PPP)     0.4671112
103–113      No      3.688031–13.528189     Fail (F?F)     1.8247696–5.863798
114          Yes     0.9610147              Pass (PPP)     0.4239971
115          No      6.799877               Fail (F?F)     3.2100605

P, pass; F, fail; ?, possible. The parameters in brackets are 1, score; 2, spectral residual; 3, M-distance.

Unlike earlier workers, who have tried to check the category of a sample using parameters such as scores derived from the combined normal and malignant data, in the present study we have developed the match/no match test with more than one parameter (Mahalanobis distance and spectral residuals), so that the possibility of misclassification from using only one parameter is avoided. Though we have used only two categories, normal and malignant, the method may be applicable to other categories also, when a sufficient number of pathologically certified samples for those categories becomes available for study.

Figure 4 shows a plot of M-distance against spectral residual for the 115 spectra (15 + 54 normal, 15 + 31 malignant) compared to the calibration set of normal spectra. It is clearly seen from the plot that all samples classified as normal are clustered together in the lower left-hand corner of the plot, while samples classified as malignant lie far outside.

Figure 4. Classification of 115 spectra (69 normal, and 46 malignant) compared to the calibration set of 15 normal spectra. A plot of M-distance versus spectral residual.

Similarly, the plot of M-distance against sample number for the 115 spectra (15 + 54 normal, 15 + 31 malignant) compared against the calibration set of normal spectra (Figure 5) shows that all normal spectra, having M-distance values ≤3, are clustered together in one group and 42 out of 46 malignant spectra, with M-distance values >>3, are clustered in another group. In both plots very few of the malignant samples (four) are found to overlap with the normal group, suggesting that the probability of a malignant sample falling outside the malignant cluster is 8.69% (4 of 46).

Figure 5. Classification of 115 spectra (15 + 54 normal, 15 + 31 malignant) compared to the calibration set of 15 normal spectra. A plot of M-distance versus sample number.

It is possible that the spectral response of the instrument may vary over long periods. This was checked by periodically recording spectra from standard samples such as Rhodamine 6G and tryptophan, and the spectra were found to remain unchanged over the course of this work. Also, to minimize spectral variation due to instrumental factors, the spectra of normal and malignant samples from the same subject have always been recorded within a short interval of time [19–24]. That all spectra from a given class remained more or less unchanged in spectral profile is clear from the low dispersion of the parameters (e.g. spectral residual) for individual spectra of the calibration set, taken over several months, when matched against that set.
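As a hedged illustration of Equation (1) and the match/no match decision, the sketch below computes the M-distance of a parameter vector against a calibration matrix S, following $M = S'S/(n-1)$ literally. The two-parameter toy calibration set, the function names and the use of 3 as the match limit are assumptions made for the example (the limit is suggested by the ≤3 values reported for normal spectra); the study itself used four scores plus the spectral residual within the PLS Plus/IQ package.

```python
import math

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def m_distance(s_test, calibration):
    """D = sqrt(s M^-1 s'), with M = S'S / (n - 1) built from the calibration rows."""
    n, p = len(calibration), len(s_test)
    m = [[sum(row[i] * row[j] for row in calibration) / (n - 1)
          for j in range(p)] for i in range(p)]
    d_squared = sum(s * x for s, x in zip(s_test, solve(m, s_test)))
    return math.sqrt(d_squared)

def is_match(s_test, calibration, limit=3.0):
    """Label a sample 'match' when its M-distance does not exceed the limit (assumed 3)."""
    return m_distance(s_test, calibration) <= limit

# Hypothetical calibration parameters: one row per standard, columns = (score, residual term).
S = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]

near = [1.0, 1.0]     # close to the calibration cloud
far = [3.0, 4.0]      # far from the calibration cloud

d_near = m_distance(near, S)
d_far = m_distance(far, S)
```

Because the distance is scaled by the spread of the calibration set, it is expressed in units of standard deviation, which is why a single numerical limit can be applied to all samples.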

3.2. Difference spectra

As already mentioned, with the PCs derived from any given calibration set, say normal, the spectrum of any sample (normal, malignant) can be simulated. Such a simulation will reproduce correctly only the spectra from normal samples and will only partly reproduce the spectra of other samples. This is evident from the small values of the spectral residuals for samples belonging to the calibration set and the large values for all other samples. The important point to note is that the simulated spectral contribution for a test spectrum presumably comes from the components of the calibration set present in the ‘test spectrum’. The difference spectrum for test samples (observed − simulated) thus corresponds to the changes in the components of the test sample.

To understand the changes in tissue composition between the two classes (normal and malignant), we simulated the mean of all spectra of the normal/malignant class with the normal calibration set samples. The difference spectra (mean of normal calibration set − simulated mean of all spectra of normal/malignant class) are shown in Figure 6. Figure 6(a) is the difference spectrum of the normal class (mean of normal calibration set − simulated mean of all 69 normal spectra), where the difference is almost entirely noise. This indicates that the two spectra are almost identical. Figure 6(b) shows the difference spectrum between the mean of the normal calibration set and the simulated mean of all 46 malignant spectra. The figure shows two positive peaks with approximate intensities 0.15 at 398 nm and 0.2 at 440 nm and a negative peak of approximate magnitude 0.14 at 533 nm. The intensity at 398 nm may be correlated to collagen and the intensity at 440 nm to NADH (bound). These positive residual peaks in the difference spectrum of the malignant class imply that the amounts of collagen and NADH are greater in the normal condition than in malignant cases. The negative peak at 533 nm indicates that the amount of flavin may be greater in malignancy than in the normal condition. Similar observations are made in the difference spectra obtained when all individual normal and malignant spectra are simulated with the normal calibration set samples. Therefore, from the difference spectra analysis, we conclude that the intensities of collagen, NADH (bound) and flavin may be used as discrimination parameters for classification of normal and malignant colonic conditions.

Figure 6. Plot shows difference spectra (mean of normal calibration set − simulated mean of all spectra of one class) of (a) normal (b) malignant.

3.3. ANN analysis

ANN analysis was carried out with the same set of normal and malignant samples as used in PCA. The calibration set samples were used for training the network and the test samples for validating the trained network.

3.3.1. Architecture of ANN diagnostic scheme

Baseline corrected fluorescence spectra were normalized using the norm function and filtered with a 21-order median filter to eliminate undesirable spikes due to noise or other disturbances. Each normal and malignant spectrum was then fed to the feature extraction module, and eight statistical features were extracted from it. These features are mean, median, spectral residual, energy, standard deviation and number of peaks for different thresholds (100, 250 and 500) after carrying out 1st-order differentiation on each spectrum. The numbers of peaks for the thresholds 100, 250 and 500 after 1st-order differentiation were treated as three independent features. For the spectral residual values, a 10th-degree polynomial curve was fitted to each normal and malignant spectrum and the residual values were obtained.
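The pre-processing and feature extraction described above can be sketched as follows. This is an illustrative reading of the text rather than the authors' code: the edge handling of the median filter, the exact definition of a "peak" in the differentiated spectrum, the units in which the thresholds apply and all function names are assumptions, and the eighth feature (the residual of a 10th-degree polynomial fit) is deliberately omitted to keep the example short.

```python
import math
import statistics

def normalise(spectrum):
    """Scale a spectrum to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in spectrum))
    return [x / norm for x in spectrum]

def median_filter(signal, order=21):
    """Sliding-window median; windows are truncated at the ends (assumed edge rule)."""
    half = order // 2
    return [statistics.median(signal[max(0, i - half): i + half + 1])
            for i in range(len(signal))]

def count_peaks(signal, threshold):
    """Count local maxima of the first difference that exceed the threshold (assumed definition)."""
    diff = [b - a for a, b in zip(signal, signal[1:])]
    return sum(1 for i in range(1, len(diff) - 1)
               if diff[i] > threshold and diff[i - 1] < diff[i] >= diff[i + 1])

def extract_features(spectrum, thresholds=(100, 250, 500)):
    """Seven of the eight features named in the text (polynomial residual omitted)."""
    features = {
        "mean": statistics.mean(spectrum),
        "median": statistics.median(spectrum),
        "energy": sum(x * x for x in spectrum),
        "std": statistics.pstdev(spectrum),
    }
    for t in thresholds:
        features[f"peaks_{t}"] = count_peaks(spectrum, t)
    return features

# Hypothetical raw-count signals used only to exercise the functions.
spiky = [0, 0, 0, 1000, 0, 0, 0]
despiked = median_filter(spiky, order=3)       # the isolated spike is removed

steps = [0, 0, 300, 300, 300, 900, 900]
features = extract_features(steps)
```

A median filter is a natural choice here because, unlike a moving average, it removes isolated spikes without smearing them into neighbouring data points.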

Feature extraction and ANN implementation were carried out using [email protected] algorithms [30–31].

Figure 7. Training of ANN and its convergence.

3.3.2. Back-propagation algorithm of ANN

Three-layer networks are sufficient to design any nonlinear network [21,26–38]. Therefore, in the present study we have used a three-layered feed-forward network with one input layer, one hidden layer and one output layer. The input layer consists of nodes with feature vectors as inputs, followed by a hidden layer consisting of seven neurons. For training of the network, the performance goal was set at 0.01 and was achieved using only one output neuron. During training, in order to minimize the error, each learning cycle updates the weights of the output layer first and then progresses backwards. The error function is the mean squared error (MSE), and the weights are updated along its negative gradient. Hence it is a ‘gradient descent’ back-propagation algorithm (delta learning rule). The second and third layers have an activation function for all the neurons. The activation function used in this case is a ‘tan–sigmoid’ function, given by

$$f(x) = \frac{1 - e^{-x}}{1 + e^{-x}} \tag{2}$$

The MSE function is defined as

$$E = \frac{1}{2} \sum_{k=1}^{K} (d_k - z_k)^{2} \tag{3}$$

The error function, conceived in the n-dimensional hyperspace, can be regarded as a bowl whose bottom-most point indicates the optimum set of weights. Weights are to be changed in the direction of the negative gradient of this error function. In the above network, the input passes through two processing elements to reach the output. Since f is a sigmoid function, the network (when trained) can represent a nonlinear classifier of any order. In the present case, a gradient descent algorithm with momentum is used for convergence of the training of the network. The details of ANN theory are explained elsewhere [21,26–38].

3.3.3. Training of ANN and prediction of spectral data

The ANN was trained with eight features from each of the normal and malignant calibration set samples (as used in PCA). The performance goal was set at 0.01 and the training of the ANN was achieved in 1322 epochs, as shown in Figure 7. The network was instructed to output a binary digit 1 for normal and 0 for malignant conditions. After training of the ANN, class prediction of new samples was carried out by feeding their corresponding eight features into the trained network. The trained network then matches these new features with the corresponding training set data and gives its judgment as binary 1 for normal and 0 for malignant. When each of the 15 normal and 15 malignant training samples was tested with the trained network, all normal samples were classified as normal and all malignant samples as malignant. Subsequently, when the 54 test normal and 31 test malignant spectra were tested, all normal spectra were classified as normal, while 28 out of 31 malignant spectra were classified as malignant. The remaining three malignant test spectra were classified as normal. The prediction of colonic mucosal data using ANN is shown in Table III. The specificity and sensitivity values obtained in this analysis are found to be 100 and 93.47%, respectively.
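The network of Section 3.3.2 can be sketched in a few lines: eight inputs, seven hidden neurons and one output neuron, the tan–sigmoid of Equation (2) in the hidden and output layers, the error of Equation (3) with $d$ taken as the desired output and $z$ as the network output, and gradient descent with momentum. The learning rate, momentum value, weight initialisation, toy training pairs and all names below are assumptions for illustration; the article does not report them, and this is not the authors' implementation.

```python
import math
import random

def tansig(x):
    """Tan-sigmoid of Eq. (2); tanh(x/2) equals (1 - e^-x)/(1 + e^-x) and avoids overflow."""
    return math.tanh(x / 2.0)

class ThreeLayerNet:
    def __init__(self, n_in=8, n_hidden=7, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0
        # Previous weight changes, kept for the momentum term.
        self.dw1 = [[0.0] * n_in for _ in range(n_hidden)]
        self.db1 = [0.0] * n_hidden
        self.dw2 = [0.0] * n_hidden
        self.db2 = 0.0

    def forward(self, x):
        h = [tansig(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        z = tansig(sum(w * hj for w, hj in zip(self.w2, h)) + self.b2)
        return h, z

    def gradients(self, x, d):
        """Back-propagate dE/dw for E = 0.5 * (d - z)^2, using f'(x) = 0.5 * (1 - f^2)."""
        h, z = self.forward(x)
        delta_out = (z - d) * 0.5 * (1.0 - z * z)
        delta_hid = [delta_out * w * 0.5 * (1.0 - hj * hj) for w, hj in zip(self.w2, h)]
        g_w1 = [[dh * xi for xi in x] for dh in delta_hid]
        g_w2 = [delta_out * hj for hj in h]
        return g_w1, delta_hid, g_w2, delta_out

    def train_step(self, x, d, lr=0.1, momentum=0.9):
        g_w1, g_b1, g_w2, g_b2 = self.gradients(x, d)
        for j in range(len(self.w2)):
            for i in range(len(x)):
                self.dw1[j][i] = momentum * self.dw1[j][i] - lr * g_w1[j][i]
                self.w1[j][i] += self.dw1[j][i]
            self.db1[j] = momentum * self.db1[j] - lr * g_b1[j]
            self.b1[j] += self.db1[j]
            self.dw2[j] = momentum * self.dw2[j] - lr * g_w2[j]
            self.w2[j] += self.dw2[j]
        self.db2 = momentum * self.db2 - lr * g_b2
        self.b2 += self.db2

def error(net, samples):
    """Eq. (3) summed over the given (features, desired output) pairs."""
    return 0.5 * sum((d - net.forward(x)[1]) ** 2 for x, d in samples)

# Hypothetical feature vectors: desired output 1 for normal, 0 for malignant.
samples = [([0.2] * 8, 1.0), ([-0.2] * 8, 0.0)]
net = ThreeLayerNet()
for epoch in range(2000):
    if error(net, samples) <= 0.01:      # performance goal quoted in the text
        break
    for x, d in samples:
        net.train_step(x, d)
```

The momentum term reuses a fraction of the previous weight change, which tends to smooth the descent along the "bowl" of the error surface and is a common remedy for slow convergence of plain gradient descent.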

Discrimination performance by different techniques

A few of the malignant spectra are classified as normal by both PCA and ANN analysis (four in PCA and three in ANN). Of the malignant spectra tested against the normal calibration set using PCA, four spectra (nos. 85, 95, 102 and 114) are classified as normal; all of these except sample number 102 (i.e. nos. 85, 95 and 114) are also classified as normal by the ANN analysis. Sample number 102, which is classified as normal by a very small margin in PCA, is correctly assigned to the malignant class by the ANN analysis. This suggests that the ANN analysis is able to pick up even smaller details of the sample and classify it properly, and that the ANN is a better classifier with higher sensitivity than PCA. However, PCA can be made more sensitive by fixing more stringent prediction limits when a larger number of certified samples becomes available for the study.

Table III. The calibration/test set of normal spectra (15 + 54) and the calibration/test set of malignant spectra (15 + 31) tested with the trained neural network

Spectrum No.    Desired output    Classifier output    Output characterization
1               1                 0.9523               Normal
2–14            1                 0.9494–0.9873        Normal
15              1                 0.9774               Normal
16–29           0                 0.0063–0.0578        Malignant
30              0                 0.0203               Malignant
31              1                 0.9732               Normal
32–84           1                 0.8220–0.9922        Normal
85              1                 0.9844               Normal
86–94           0                 0.0037–0.1268        Malignant
95              1                 0.9804               Normal
96–113          0                 0.0079–0.1818        Malignant
114             1                 0.9834               Normal
115             0                 0.0279               Malignant

Apart from distinguishing normal and malignant tissue spectra using PCA and ANN, the spectral data are also analysed to obtain information about the specific tissue components responsible for the characteristic spectra. For example, using the difference spectra approach, the difference in collagen content between normal and malignant tissues is determined. If malignant tissue samples at various stages of cancer can be obtained and analysed with this approach, and if there is a noticeable difference in the nature of the spectra at the various stages of the disease, then it may be possible to create a spectral profile of the progression of cancer. If this is successful, it could be used not only for diagnosis of cancer but also for its instant staging, thereby making it possible to plan the most effective path of treatment.

An important application of optical methods is surgical boundary demarcation. Since fluorescence spectroscopy is a sensitive and fast technique, site-to-site spectral variation in a single piece of tissue can be monitored in real time if required, and hence it can be used as an effective tool for surgical boundary demarcation. This may help the surgeon to identify, and thereby remove, all the malignant tissue from a lesion.

Fluorescence spectroscopy coupled with a fibre-optic probe can be easily adapted for in vivo screening of lesions. This has already been carried out by a research group that collected fluorescence spectra from colonic mucosal tissues by inserting a fibre-optic cable through the endoscope to the lesion during routine colonoscopy procedures [1–5]. But for any new method to be accepted for routine medical applications, it is essential that it is used and validated by different groups and with different techniques.
Also, the results obtained by different groups deal with entirely different ethnic groups, and those results are not necessarily applicable to other ethnic groups until so proved.

4.

CONCLUSIONS

There are practically no systematic studies on the effectiveness of LIF in discriminating between normal and malignant conditions in colonic mucosal tissue. One aim of our study was to see whether this can be done, and the results show that LIF can successfully discriminate between normal and malignant conditions. As a pilot study, the current work was therefore carried out on a limited number of normal and malignant colonic tissue samples, and their discrimination analysis was performed using PCA, ANN and difference spectra analysis. An important point to note in this study is that the classification ability of two different techniques, PCA and ANN, is cross-validated using the same set of spectral data. Also, since the algorithms are trained and validated blindly (blind to the machine) with appropriate features of each class, we believe that the trained algorithms will be able to determine the class of any unknown new sample tested against them. Another important point is that the PCA-based match/no-match analysis is carried out with more than one discrimination parameter (Mahalanobis distance and spectral residuals), so that the possibility of miscalculation from using only one parameter is avoided. This match/no-match classification is highly objective and accurate compared with single-parameter discrimination and, to our knowledge, has not been attempted earlier. The sensitivity obtained in this study is 91.3% using PCA and 93.47% using ANN. These results reveal that, in the cases in which PCA failed by a narrow margin to assign a sample to its correct class, the ANN picked up even very minor details from those samples and classified them properly. Based on this observation, it seems that ANN is a better classifier than PCA, though this conclusion may need to be re-evaluated when a larger number of subjects from both categories becomes available for the study.

Apart from the chemometric analysis, a qualitative difference spectra approach is also used to study the biochemical changes associated with diseased and non-diseased colonic conditions. The difference spectra approach gives information about the biochemical changes that occur during malignancy, and these may be correlated with the tissue architecture for disease diagnosis. This approach seems promising because compositional changes of tissue at different stages of a cancer can be monitored and exploited for further information about the disease. The real clinical problem in the early detection of various cancers is the lack of appropriate facilities for detecting them. The LIF technique, which is fast, requires only minimally trained technicians to operate and, in combination with statistical/mathematical data processing, can objectively evaluate the stage of a disease, can be of much interest in this direction, and the present work may be a valuable addition to the existing knowledge base.

Acknowledgements

The authors are thankful to Manipal University, Manipal, India, for its support during this study. The laser-induced fluorescence set-up used for this study, established by BRNS, Govt. of India (DAE/BRNS Project No. 98/34/14/BRNS Cell/1090 dated 4th March 1999), is also acknowledged.

REFERENCES

1. Braunwald E, Fauci A, Kasper DL, Hauser SL, Longo DL, Jameson JL (eds). Harrison's Principles of Internal Medicine (15th edn). McGraw Hill: New York; 2001.
2. Parkin DM, Bray F, Ferlay J, Pisani P. Estimating the world cancer burden: Globocan 2000. Int. J. Cancer 2001; 94: 153–156.
3. Burt R. Colon cancer screening. Gastroenterology 2000; 119: 837–853.
4. Levin TR, Palitz A, Grossman S, Conell C, Finkler L, Ackerson L, Rumore G, Selby JV. Predicting advanced proximal colonic neoplasia with screening sigmoidoscopy. JAMA 1999; 281(17): 1611–1617.
5. Alon U, Barkai N, Notterman DA, Gish K, Mack D, Levine AJ. Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proc. Natl. Acad. Sci. USA 1999; 96: 6745–6750.
6. Vo-Dinh T (Editor-in-chief). Biomedical Photonics Handbook. CRC Press: Boca Raton/London/New York/Washington, D.C., 2003.
7. Lakowicz JR. Principles of Fluorescence Spectroscopy (2nd edn). Kluwer Academic/Plenum Publishers: New York, 1999; 63–66.
8. Mehimann BM, Rick K, Stepp H, Grevers G, Baumgartner R, Leunig A. Autofluorescence imaging and spectroscopy of normal and malignant mucosa in patients with head and neck cancer. Lasers Surg. Med. 1999; 25: 323–334.
9. Hanpeng C, Jianan YQ, Pyres JS, Kwong D, William IW. Light-induced autofluorescence spectroscopy for detection of nasopharyngeal carcinoma in vivo. Appl. Spectrosc. 2002; 56: 1361–1367.
10. Kamath SD, Mahato KK. Optical pathology using oral tissue fluorescence spectra: classification by principal component analysis (PCA) and k-means nearest neighbour (k-NN) analysis. J. Biomed. Opt. 2007; 12(1): 014028-1–014028-9.
11. Kapadia CR, Cutruzzola FW, O'Brien KM, Stetz ML, Enriquez R, Deckelbaum LI. Laser-induced fluorescence spectroscopy of human colonic mucosa. Detection of adenomatous transformation. Gastroenterology 1990; 99(1): 150–157.
12. Schomacker KT, Frisoli JK, Compton CC, Flotte TJ, Richter JM, Deutsch TF, Nishioka NS. Ultraviolet laser-induced fluorescence of colonic polyps. Gastroenterology 1992; 102: 1155–1160.
13. Cothren RM, Richards-Kortum R, Sivak MV Jr, Fitzmaurice M, Rava RP, Boyce GA. Gastrointestinal tissue diagnosis by laser-induced fluorescence spectroscopy at endoscopy. Gastrointest. Endosc. 1990; 36(2): 105–111.
14. Cothren RM, Sivak MV Jr, Van Dam J, Petras RE, Fitzmaurice M, Crawford JM, Wu J, Brennan JF, Rava RP, Manoharan R, Feld MS. Detection of dysplasia at colonoscopy using laser-induced fluorescence: a blinded study. Gastrointest. Endosc. 1996; 44(2): 168–176.
15. Yakshe PN, Bonner RF, Cohen P, Leon MB, Fleischer DE. Laser-induced fluorescence spectroscopy may distinguish colon cancer from normal human colon. Gastrointest. Endosc. 1989; 35: 184.
16. Richards-Kortum R, Rava RP, Petras RE, Fitzmaurice M, Sivak M, Feld MS. Spectroscopic diagnosis of colonic dysplasia. Photochem. Photobiol. 1991; 53(6): 777–786.
17. Marchesini R, Brambilla M, Pignoli E, Bottiroli G, Croce AC, Dal Fante MSP, di Palma S. Light-induced fluorescence spectroscopy of adenomas, adenocarcinomas and non-neoplastic mucosa in human colon. J. Photochem. Photobiol. B 1992; 14(3): 219–230.
18. Wagnieres GA, Star WM, Wilson BC. In vivo fluorescence spectroscopy and imaging for oncological applications. Photochem. Photobiol. 1998; 68(5): 603–632.
19. Manjunath BK, Krishna C, Chidananda MS, Venkatkrishna K, Kartha VB. Autofluorescence of oral tissue for optical pathology in oral malignancy. J. Photochem. Photobiol. B 2004; 73: 49–58.
20. Chidananda MS, Satyamoorthy K, Lavanya R, Manjunath AP, Kartha VB. Optical diagnosis of cervical cancer by fluorescence spectroscopy technique. Int. J. Cancer 2006; 119: 139–145.
21. Nayak GS, Kamath SD, Keerthilatha MP, Sarkar A, Satadru R, Kurien J, D'Almeida L, Krishnanand BR, Chidangil S, Kartha VB, Mahato KK. Principal component analysis (PCA) and artificial neural network (ANN) analysis of oral tissue fluorescence spectra. Classification of normal pre-malignant and malignant pathological conditions. Biopolymers 2006; 82: 152–166.
22. Mahato KK, Kamath SD, Kamath DV, Ray S, Bhat RA. Optical pathology of photoacoustic spectra of ovarian tissue: classification using principal component analysis (PCA) and artificial neural network (ANN) analysis. http://www.symp16.nist.gov/pdf/p689.pdf.
23. Kamath SD, Nayak GS, Ray S, Kartha VB, Chidangil S, Mahato KK. Ovarian tissue fluorescence in different pathological conditions: classification using principal component analysis (PCA) and artificial neural network (ANN). (Under communication, 2008).
24. Chidananda SM, Satyamoorthy K, Lavanya R, Manjunath AP, Kartha VB. Optical diagnosis of cervical cancer by fluorescence spectroscopy technique. Int. J. Cancer 2006; 119: 139–145.
25. Jolliffe IT. Principal Component Analysis. Springer-Verlag: New York, 1986.
26. Baxt WG. Use of an artificial neural network for the diagnosis of myocardial infarction. Ann. Intern. Med. 1991; 115: 843–848.
27. Boulle A, Chandramohan D, Weller P. A case study of using artificial neural networks for classifying cause of death from verbal autopsy. Int. J. Epidemiol. 2001; 30: 515–520.
28. George RA, Michail M, Michael Z, George F, Giannis Z, Asterios NK, Theodore GP. Artificial neural networks for discriminating pathologic from normal peripheral vascular tissue. IEEE Trans. Biomed. Eng. 2001; 48: 1088–1097.
29. Mahalanobis PC. On the generalized distance in statistics. Proc. Natl. Inst. Sci. India 1936; 12: 49–55.
30. Rudra P. Getting Started With MATLAB 5. Oxford University Press: New Delhi, 2000.
31. Palm WJ. Introduction to MATLAB 6 for Engineers. McGraw Hill: Singapore, 2001.
32. Sigurdsson S, Peter AP, Lars KH, Jan L. Detection of skin cancer by classification of Raman spectra. IEEE Trans. Biomed. Eng. 2004; 51: 1784–1793.
33. Burke HB, Philip HG, David BR, Donald EH, John NW, Frank E, Harrell Jr, Jaffrey RM, David PW, David GB. Artificial neural networks improve the accuracy of cancer survival prediction. Cancer 1997; 79: 857–862.
34. Zurada JM. Introduction to Artificial Neural Systems (3rd edn). Jaico Publishing House: Mumbai, India, 2002.
35. Feldman J, Fanty MA, Goddard NH. Computing with structured neural networks. Computer 1988; 21: 91–103.
36. Hebb DO. The Organization of Behavior. John Wiley & Sons: New York, 1949.
37. Lippmann RP. An introduction to computing with neural nets. IEEE ASSP Mag. 1987; 4: 4–22.
38. Carpenter GA, Grossberg S. Pattern Recognition by Self Organizing Neural Networks. MIT Press: Cambridge, MA, 1991.

Copyright © 2008 John Wiley & Sons, Ltd.

J. Chemometrics 2008; 22: 408–416

www.interscience.wiley.com/journal/cem
