Computational Chemistry Approach to Protein Kinase Recognition Using 3D Stochastic van der Waals Spectral Moments

HUMBERTO GONZÁLEZ-DÍAZ,1 LIANE SAÍZ-URRA,2 REINALDO MOLINA,3 YENNY GONZÁLEZ-DÍAZ,1 ANGELES SÁNCHEZ-GONZÁLEZ4

1 Department of Organic Chemistry and Institute of Industrial Pharmacy, Faculty of Pharmacy, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
2 Abteilung für Organische Chemie, Institut für Chemie, Universität Rostock, Albert-Einstein-Strasse 3a, 18059 Rostock, Germany
3 REQUIMTE, Facultade de Ciências, Universidade do Porto, 4169-007 Porto, Portugal
4 Department of Inorganic Chemistry, Faculty of Pharmacy, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain

Received 27 April 2006; Revised 20 June 2006; Accepted 10 July 2006
DOI 10.1002/jcc.20649
Published online 31 January 2007 in Wiley InterScience (www.interscience.wiley.com).

Abstract: Three-dimensional (3D) protein structures now frequently lack functional annotations because structures are being solved faster than experimental knowledge of biological activity accumulates. As a result, predicting structure-function relationships for proteins is an active research field in computational chemistry and has implications in medicinal chemistry, biochemistry, and proteomics. In previous studies, stochastic spectral moments were used to predict protein stability or function (González-Díaz, H. et al. Bioorg Med Chem 2005, 13, 323; Biopolymers 2005, 77, 296). Nevertheless, these moments take into consideration only electrostatic interactions and ignore other important factors such as van der Waals interactions. The present study introduces a new class of 3D structure molecular descriptors for folded proteins named the stochastic van der Waals spectral moments ($^{o}\beta_k$). Among many possible applications, recognition of kinases was selected because previous computational chemistry studies in this area have not been reported, despite the widespread distribution of kinases. The best linear model found was $K_{act} = 9.44\,\beta_0(c) + 10.94\,\beta_5(c) - 2.40\,\beta_0(i) + 2.45\,\beta_5(m) + 0.73$, where core (c), inner (i), and middle (m) refer to specific spatial protein regions. The model, which has a high Matthews coefficient (0.79), correctly classified 206 out of 230 proteins (89.6%), including both training and predicting series. An area under the ROC curve of 0.94 differentiates our model from a random classifier. A subsequent principal components analysis of 152 heterogeneous proteins demonstrated that $^{o}\beta_k$ codifies information different from that of other descriptors used in protein computational chemistry studies. Finally, the model recognizes 110 out of 125 kinases (88.0%) in a virtual screening experiment, which can be considered an additional validation study (these proteins were not used in the training or predicting series).
© 2007 Wiley Periodicals, Inc.

J Comput Chem 28: 1042–1048, 2007

Key words: protein structure-function relationships; kinases; Markov models; moments; van der Waals interactions

This article contains supplementary material available via the Internet at http://www.interscience.wiley.com/jpages/0192-8651/suppmat
Correspondence to: H. González-Díaz; e-mail: [email protected]
Contract/grant sponsor: Xunta de Galicia, Programa Isidro Parga Pondal and PGIDIT research projects; contract/grant numbers: PGIDT05BTF20302PR-2, PXIB20304PR

Introduction

Three-dimensional (3D) protein structures now frequently lack functional annotations because structures are being solved faster than experimental knowledge of biological activity accumulates. As a result, the classification of proteins into families based on their 3D molecular structures is a very important goal.1 Combinatorial Extension, ProCat, SPASM, and VAST can be used to detect similar structures and folds and to carry out function annotation for 3D protein structures.2–6 Numerous methods use the spatial arrangement of atoms in protein functional sites to create a template.7 The majority of these methods for predicting protein function rely on identifying a similar protein and transferring its annotations to the query protein. This approach fails when a similar protein cannot be identified, or when the similar proteins identified also lack reliable annotations.8

As an alternative, one may use a method that assigns function from structure without relying on 3D alignment algorithms. This can be done using simple attributes (molecular descriptors) that can be calculated from any crystal structure.9–11 The models derived with these numerical indices are protein quantitative structure-activity relationship (QSAR) studies. Very successful QSAR methods are based on spectral moments of different matrices. However, not many applications of spectral moments that connect chemical structure with the biological function of large series of proteins have been reported to date.12–20

Our research group has introduced different molecular descriptors, based on Markov Model (MM) theory, for the characterization of both small molecules and biopolymers. These molecular descriptors describe changes in electrostatic potential,21–23 vibration frequency,24 electron distribution,25 or free energies.26 To this end, we have used different mathematical formulations such as entropies,27 absolute probabilities,28 electrostatic potentials,29 affinity constants,30 and stochastic spectral moments.31–33 However, a thorough inspection of this approach reveals that little attention has been paid to other effects that are also of major importance for biological activity, such as van der Waals (vdw) interactions.

In general, numerous studies related to protein structure, function, and bioinformatics continue to appear, and these cover many different proteins.34–36 For instance, interesting protein QSAR studies have been reported by Chou37–42 using classifiers such as support vector machines (SVM). Alternatively, Marrero-Ponce et al.43,44 reported very interesting molecular descriptors for protein QSAR studies, and these worked very well with Linear Discriminant Analysis (LDA) methods. However, protein kinase activity has not been studied in detail with linear QSAR techniques. Because of the widespread distribution and importance of kinases, we selected a large and heterogeneous database of kinases. The kinase family is one of the largest target families in the human genome. So vast is the extent of this field that it has been given a name of its own: Kinomics. The key function of this family of proteins in signal transduction for all organisms makes it a very attractive target. Kinases may act as therapeutic targets or play important roles in many diseases such as cancer, diabetes, inflammation, arthritis, alopecia, herpes virus infection, Alzheimer's disease, asthma, and malaria. However, a general QSAR model to discover new kinases based on 3D structural molecular descriptors of proteins had not been reported until we became interested in this area.45–57

In this study we define stochastic vdw spectral moments ($^{o}\beta_k$), which extend the previous moments to encompass vdw-type interactions in the description of 3D protein structures. The work also involved an investigation into the connection between the new moments and protein function as an alternative to 3D alignment-similarity procedures. A general-purpose classification model is validated that can predict whether or not a protein 3D structure can act as a kinase. We compare $^{o}\beta_k$ with alternative stochastic descriptors for protein structure. In addition, a Principal Component Analysis (PCA) was carried out to demonstrate that $^{o}\beta_k$ encodes structural information different from that of previous non-stochastic molecular descriptors. Both the QSAR studies and the PCA studies validate the main concept behind the present work: $^{o}\beta_k$ are new molecular descriptors with potential applications in protein QSAR studies.

Methods

The Total and Local vdw Stochastic Spectral Moments

In the work described here, we generalized our previous stochastic spectral moments32,33,58–61 to characterize protein 3D structure taking into consideration vdw interactions. The elements $^{1}p_{ij}$ of the new matrix $^{1}\mathbf{P}_{vdw}$ are the probabilities of direct vdw interaction between the amino acids $a_i$ and $a_j$, provided that $a_i$ and $a_j$ are placed at a distance shorter than a certain cut-off ($\alpha_{ij} = 1$); otherwise the element is equal to 0 ($\alpha_{ij} = 0$). The vdw energy takes a Lennard-Jones 12-6 potential11 form [eq. (1)]. The parameters A and B are defined in eqs. (2) and (3),11 where $\varepsilon$ is the well depth and R is one half of the separation at which the energy passes through a minimum (i.e., the van der Waals radius). The parameter $r_{ij}$ is the distance between a pair of amino acids, which are considered as rigid spheres. The Lennard-Jones potential is characterized by an attractive component that varies as $r^{-6}$ and a repulsive component that varies as $r^{-12}$. The energy function modeling the steric repulsion between pairs of amino acids becomes large and positive at interatomic distances r less than the sum of the van der Waals radii of the probe amino acid and the target amino acid. Molecular modeling studies by Professor Celda's group62–64 or by Leach65 demonstrated the utility of similar truncation for electrostatic or vdw interactions:

$$^{1}p_{ij} = \frac{\alpha_{ij}\,E_{ij}^{vdw}}{\sum_{k=1}^{\delta+1} \alpha_{ik}\,E_{ik}^{vdw}} = \frac{\alpha_{ij}\left(\dfrac{A}{r_{ij}^{12}} - \dfrac{B}{r_{ij}^{6}}\right)}{\sum_{k=1}^{\delta+1} \alpha_{ik}\left(\dfrac{A}{r_{ik}^{12}} - \dfrac{B}{r_{ik}^{6}}\right)} \tag{1}$$

$$A = \sqrt{\varepsilon_i \varepsilon_j}\,\left(R_i + R_j\right)^{12} \tag{2}$$

$$B = 2\sqrt{\varepsilon_i \varepsilon_j}\,\left(R_i + R_j\right)^{6} \tag{3}$$
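As a rough illustration of eqs. (1)–(3), the following Python sketch builds a truncated, row-normalised interaction matrix for a toy system. It is not the BIOMARKS implementation: the function names, the point-like pseudo-residues, and all numerical values are hypothetical, and leaving out the self-interaction term is an assumption made here because the text does not say how that term is treated. The sketch also does not guard against rows in which attractive and repulsive energies nearly cancel, other than returning zeros when the row sum is exactly zero.

```python
import math


def lj_energy(r, eps_i, eps_j, rad_i, rad_j):
    """Lennard-Jones 12-6 energy E = A/r**12 - B/r**6 for one residue pair."""
    eps = math.sqrt(eps_i * eps_j)
    a = eps * (rad_i + rad_j) ** 12        # repulsive coefficient, eq. (2)
    b = 2.0 * eps * (rad_i + rad_j) ** 6   # attractive coefficient, eq. (3)
    return a / r ** 12 - b / r ** 6


def vdw_stochastic_matrix(coords, eps, radii, cutoff):
    """Normalise the truncated pair energies row by row, as in eq. (1)."""
    n = len(coords)
    energy = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # assumption: the self term is left out
            r = math.dist(coords[i], coords[j])
            if r < cutoff:  # truncation: pairs beyond the cut-off contribute 0
                energy[i][j] = lj_energy(r, eps[i], eps[j], radii[i], radii[j])
    matrix = []
    for row in energy:
        total = sum(row)
        # assumption: a row whose energies sum to zero is returned as all zeros
        matrix.append([e / total if total != 0.0 else 0.0 for e in row])
    return matrix


# Toy input (hypothetical values): three pseudo-residues on a line, 4 Å apart.
coords = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (8.0, 0.0, 0.0)]
eps = [0.1, 0.1, 0.1]      # well depths
radii = [2.0, 2.0, 2.0]    # van der Waals radii
p_vdw = vdw_stochastic_matrix(coords, eps, radii, cutoff=5.0)
```

With a 5 Å cut-off, the two end residues (8 Å apart) receive a zero matrix element, while the central residue splits its interaction probability equally between its two neighbours.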



The vdw field truncation factor ($\alpha_{ij}$) is of major importance because it simplifies calculations by ignoring all long-range interaction terms. Norberg and Nilsson66,67 recently reviewed the use of truncation in biopolymers. However, even weak, long-range interactions may be important, especially if they act in a cooperative way. Consequently, we do not ignore long-range interactions altogether. Our model forbids long-range interactions in the direct interaction matrix $^{1}\mathbf{P}_{vdw}$, but we can take them into account by using the higher-order spectral moments of the matrix to describe the biopolymer 3D structure:

$$^{o}\beta_k = \mathrm{Tr}\left[\left(^{1}\mathbf{P}_{vdw}\right)^{k}\right] \tag{4}$$
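Equation (4) can be illustrated with a short, dependency-free Python sketch. The function names are hypothetical, the 3 × 3 matrix is a toy example, and two points are assumptions rather than statements of the authors' procedure: a local moment is taken here as the sum of the diagonal entries of the k-th matrix power over the residues of one orbit, and a residue at the largest distance from the centre of mass is assigned to the surface orbit. The orbit limits (25, 50, and 75% of the largest distance) are those given in the Methods.

```python
def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def spectral_moments(p, k_max, members=None):
    """Return [beta_0, ..., beta_k_max], beta_k = sum of diagonal entries of p**k.

    `members` optionally restricts the sum to a subset of residue indices
    (assumed here to be how a local, per-orbit moment is obtained).
    """
    n = len(p)
    idx = list(range(n)) if members is None else list(members)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # p**0
    moments = []
    for _ in range(k_max + 1):
        moments.append(sum(power[i][i] for i in idx))
        power = mat_mul(power, p)
    return moments


def orbit_index(d, d_max):
    """Map a distance to the centre of mass onto orbit 0-3 (core to surface)."""
    ratio = 100.0 * d / d_max
    if ratio < 25.0:
        return 0  # core
    if ratio < 50.0:
        return 1  # inner
    if ratio < 75.0:
        return 2  # middle
    return 3      # surface; assumption: the farthest residue (ratio 100) goes here


# Toy chain: residues 0 and 2 have no direct interaction (zero entries).
p = [[0.0, 1.0, 0.0],
     [0.5, 0.0, 0.5],
     [0.0, 1.0, 0.0]]
p2 = mat_mul(p, p)
total = spectral_moments(p, 2)               # [3.0, 0.0, 2.0]
local = spectral_moments(p, 2, members=[1])  # [1.0, 0.0, 1.0]
```

Although the direct matrix has a zero between residues 0 and 2, the corresponding entry of its square is 0.5, which shows how higher powers bring indirect, longer-range couplings into the moments.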


In eq. (4), Tr is the trace of the matrix, i.e., the sum of the probabilities on its main diagonal. The letter "o" signifies "orbit" and indicates that we can either consider all the amino acids (aa) in the protein and calculate a global or total moment, or consider only a specific collection of aa, named an orbit, and calculate a local moment. The orbits in this work are defined by calculating the ratio between the distance d(j) from the α-carbon of aa_j to the protein centre of mass and the largest of these distances, dmax(j). In the present study, the parameter orbit takes the values 0, 1, 2, 3, or 4, considering aa with a ratio r = 100 d(j)/dmax(j) within the following limits: 0 ≤ orbit0 < 25 ≤ orbit1 < 50 ≤ orbit2 < 75 ≤ orbit3 < 100%, or all of the aa together, orbit4. As the orbits are related to the position of the aa with respect to the centre of the protein, we have named them orbit0 = core, orbit1 = inner, orbit2 = middle, and orbit3 = surface (Fig. 1).68,69 In Figure 1 the different orbits are represented on a gray color scale to provide a clearer understanding of how they were calculated. In this example we used a kinase protein with PDB code 2C5X70 and the software Chimera.71 It should be noted that the orbits imply that the descriptors are calculated only for those aa that lie between a lower and an upper distance from the centre of the protein. The symbols for the corresponding spectral moments are $^{c}\beta_k$, $^{i}\beta_k$, $^{m}\beta_k$, and $^{s}\beta_k$, respectively. All of the calculations were carried out using our experimental software BIOMARKS.72

Figure 1. Graphical example showing the spatial distribution of the orbits used in the definition of the descriptors, for a kinase protein (PDB code 2C5X). The core orbit is labeled (i), the inner orbit (ii), the middle orbit (iii), and the surface orbit (iv). Orbits (i) and (iii) are light colored, whereas (ii) and (iv) are dark colored.

Experimental Data

The list of kinases was taken from a very recent review by Vieth et al.,73 who collected almost all publicly accessible kinase X-ray structures available in the PDB as of June 18, 2003. Non-kinase proteins were collected from a list of non-homologous proteins by Fleming and Richards.74 The X-ray files for both groups of proteins were taken from the Protein Data Bank (PDB).75

Results and Discussion

The Structure-Activity Relationship Study

The best equation found in this study to assign kinase family membership given the 3D folded structure of the protein was:

$$K_{act} = 9.44\,{}^{c}\beta_0 + 10.94\,{}^{c}\beta_5 - 2.40\,{}^{i}\beta_0 + 2.45\,{}^{m}\beta_5 + 0.73 \tag{5}$$

In eq. (5), Kact is a dummy variable76; i.e., Kact = 1 for proteins with kinase activity and Kact = −1 for non-active ones. The equation was derived with LDA, which is simpler than SVM and the other classifiers mentioned above. The Wilks' statistic (λ = 0.48), Fisher ratio (F(1.51) = 60.9), and significance level (p < 0.001) of the model were assessed.77 In addition, we monitored the Matthews coefficient (C = 0.79)78 and the ratio of cases to adjustable parameters (ρ = 17.2).79 It can be seen that the model is statistically significant (p < 0.05). The high value of the Matthews coefficient indicates a strong linear relationship between the molecular descriptors and the output of the model.80 Furthermore, we used Randić orthogonalization81–83 prior to seeking the model to avoid collinearity among the molecular descriptors. Finally, the high value of ρ = 17.2 shows that the model is not over-fitted by an excess of parameters; this ratio is expected to be >4.79 This discriminant model showed excellent results in the training and external prediction series used to validate the model, as can be seen in Table 1 (see Table 2 for some selected proteins and the supplementary material for details).

In this study we validated the utility of $^{o}\beta_k$ for kinase recognition not only with the aforementioned training and predictability results. We also performed a ROC curve analysis to show the difference between our classifier and a random one. A pronounced ROC curve is depicted in Figure 2, with an area under the curve of 0.954, which is markedly higher than 0.5, the area under the curve expected for a random classifier (diagonal line).84

The four most statistically important variables introduced in the development of the best classifier model for kinase proteins are $^{c}\beta_0$, $^{c}\beta_5$, $^{i}\beta_0$, and $^{m}\beta_5$. The variables with k = 5 describe the indirect vdw interactions between $a_i$ and $a_j$ at a distance k = 5 within the protein backbone, whereas those with k = 0 describe the initial unperturbed state of the amino acids, in which the interactions with other amino acids are not considered. Moreover, the variables are related to the core, inner, and middle orbits, which suggests that most of the active sites of the kinase proteins might lie in the middle orbit or close to it. Although changes in the core and inner orbits do not directly affect the active site, they might affect the protein as a whole.

Table 1. Training and Predictability Analysis Results.

              Training (89.0% total)           Prediction (89.7% total)
              %      Kinases   Nonkinases      %      Kinases   Nonkinases
Kinases       81.4   70        16              86.2   25        4
Nonkinases    96.5   3         83              96.6   1         28
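The training-series percentages in Table 1 and the reported Matthews coefficient can be rechecked from the four training counts. The short Python sketch below does so; the function name is hypothetical, and treating kinases as the positive class is an assumption made here for the purpose of the calculation.

```python
import math


def classification_summary(tp, fn, fp, tn):
    """Per-class and overall percentages plus the Matthews coefficient."""
    sensitivity = 100.0 * tp / (tp + fn)              # kinases recognised
    specificity = 100.0 * tn / (tn + fp)              # non-kinases recognised
    accuracy = 100.0 * (tp + tn) / (tp + fn + fp + tn)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {"sensitivity": sensitivity, "specificity": specificity,
            "accuracy": accuracy, "mcc": mcc}


# Training counts from Table 1 (kinases taken as the positive class).
training = classification_summary(tp=70, fn=16, fp=3, tn=83)
for name, value in training.items():
    print(f"{name}: {value:.2f}")
# Rounded: sensitivity 81.4, specificity 96.5, accuracy 89.0, mcc 0.79
```

The recomputed values agree with the 81.4%, 96.5%, and 89.0% entries of Table 1 and with the coefficient C = 0.79 quoted in the text.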


Table 2. Classification Results for Some Selected Kinases and Nonkinases in Training and Cross-Validation.

             pdb(a)    p(b)     pdb(a)    p(b)     pdb(a)    p(b)     pdb(a)    p(b)
Kinases      1cdk      99.99    1ql6      98.94    1h1r b    95.45    1ydt      49.33
             1e8x      99.99    1qcf      99.09    1h26 c    66.45    1apm      46.76
             1e9o      99.98    2src      99.18    1fq1      74.42    1jwh c    48.2
Nonkinases   2fox      51.81    2bbk h    78.91    1gof      87.15    1arb      93.26
             1bab      73.07    2sn3      80.67    2chs j    83.66    1trb      90.24
             1hvd      51.14    4blm      80.51    1tgs      84.95    1etb      93.67

See detailed results in supplementary material.
(a) PDB ID.
(b) Resulting probability (%). This value is higher than 50% for proteins predicted as kinases and lower than 50% for misclassified ones.
(c) Protein used for the external predicting subset.

Figure 2. ROC curve for the present study.

Kinase Virtual Screening

Finally, after analyzing the training and predictability of the model and assessing the relationships between the descriptors, we carried out an additional virtual screening experiment to illustrate the use of the model. The model correctly classified 88.0% of 125 kinases selected from the literature; these proteins were not used in the experiments described in the previous section. The supplementary material lists the PDB IDs and the resulting probabilities for these proteins. This last result supports the hypothesis of a linear dependence between the $^{o}\beta_k$ of a protein and the probability of recognizing it as a kinase. Thus, the present model can be used to connect protein structure with the probability that it carries out kinase biological functions.85

Comparison of $^{o}\beta_k$ with Other Stochastic Descriptors

As stated above, we selected the method involving spectral moments of the stochastic matrix $^{1}\mathbf{P}_{vdw}$ mainly because of its broad and successful use. However, other invariants of stochastic matrices similar to $^{1}\mathbf{P}_{vdw}$ have been used before. For instance, Marrero-Ponce et al.86,87 recently used stochastic-type quadratic forms of the pseudo-graph of small molecules. Recently, our group used the entropy and the mean potential calculated with different Markov matrices ($^{1}\mathbf{P}$),20,22,88,89 for both small molecules and macromolecules. It is therefore necessary to justify quantitatively the choice of spectral moments over other stochastic macromolecular invariants. First, we calculated the three kinds of invariants for a large data set containing all kinase and non-kinase proteins used in the training and validation series. The data set also contains all of the 125 kinases used in the virtual screening as well as additional non-kinases, included to avoid a strong imbalance between the two groups (kinases/non-kinases ratio). A summary of the classification results obtained with these three kinds of invariants is shown in Table 3. It should be noted that both entropy and potential give rise to lower overall classification percentages than the spectral moments.

A closer look at the results in Table 3 shows that these methodologies differ in how accurately they predict each class. For instance, the entropy descriptors recognize kinases more easily than non-kinases, as do the potential descriptors. The opposite is the case for the spectral moments. This could be due to the differences and similarities between these descriptors in the structural information that they are able to encode as a consequence of their mathematical definitions.29,68,69 All of these methodologies lead to different final equations despite the fact that they all consider vdw interactions. Nonetheless, in spite of predicting non-kinases more easily, the spectral moments have the smallest separation between the percentages of good classification for the two groups (95.55% (non-K) − 85.40% (K) = 10.15) and the best total percentage (89.71). The entropy approach gives a separation of 11.64 and a total percentage of 71.29, whereas the potential descriptors give rise to the worst separation (67.99) and the worst total percentage (69.38).

Table 3. Comparison Between $^{o}\beta_k$ and Other Stochastic Descriptors.

                                 Percent   Non kinases   Kinases
Entropy            Nonkinases    64.61     115           63
                   Kinases       76.25     57            183
                   Total         71.29
Potential          Nonkinases    30.34     54            124
                   Kinases       98.33     4             236
                   Total         69.38
Spectral moments   Nonkinases    95.55     170           8
                   Kinases       85.40     35            205
                   Total         89.71
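The class percentages, totals, and separations discussed for Table 3 can be recomputed from the classification counts. The sketch below is only a consistency check with hypothetical names; it reads each row of Table 3 as the observed class and each column as the predicted class, which is an assumption about the table layout.

```python
# Counts per descriptor family, read from Table 3 as
# (non-kinases -> non-kinase, non-kinases -> kinase,
#  kinases -> non-kinase, kinases -> kinase).
TABLE3 = {
    "entropy": (115, 63, 57, 183),
    "potential": (54, 124, 4, 236),
    "spectral moments": (170, 8, 35, 205),
}


def summarize(counts):
    """Return (% non-kinases correct, % kinases correct, % total, separation)."""
    nn, nk, kn, kk = counts
    pct_non = 100.0 * nn / (nn + nk)
    pct_kin = 100.0 * kk / (kn + kk)
    total = 100.0 * (nn + kk) / sum(counts)
    return pct_non, pct_kin, total, abs(pct_non - pct_kin)


summary = {name: summarize(counts) for name, counts in TABLE3.items()}
best_total = max(summary, key=lambda name: summary[name][2])
smallest_gap = min(summary, key=lambda name: summary[name][3])

for name, (pct_non, pct_kin, total, gap) in summary.items():
    print(f"{name}: non-K {pct_non:.2f}, K {pct_kin:.2f}, "
          f"total {total:.2f}, separation {gap:.2f}")
```

The recomputed percentages match the printed ones to within about 0.05 percentage points, and both `best_total` and `smallest_gap` select the spectral moments.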


Table 4. Three-Dimensional Molecular Descriptor Space Derived with PCA Analysis.

                  Folding factor   Bulky factor   van der Waals factor
I3                0.98             0.03           0.02
%H                0.96             0.06           0.09
%S                0.96             0.03           0.04
Naa               0.02             0.95           0.07
OSP               0.06             0.76           0.00
RG                0.08             0.93           0.01
$^{c}\beta_0$     0.03             0.53           0.09
$^{c}\beta_5$     0.03             0.04           0.92
$^{i}\beta_0$     0.05             0.00           0.98
$^{m}\beta_5$     0.05             0.08           0.81

On the other hand, we found that if one seeks a model using a pool of descriptors that encompasses spectral moments, entropies, and potentials, the same model [eq. (5)] is obtained as described above. All of these results indicate that spectral moments are superior to the other approaches involved in the comparison and that they alone are able to recognize protein kinases.

The Relationship of $^{o}\beta_k$ with Other Nonstochastic Descriptors

New methods to generate molecular descriptors are being introduced for protein QSAR.90 In such a process it is necessary to study the relationships between the new descriptors and previously reported ones. PCA91 is a technique that is classically used to build property spaces and to study the relationships between molecular descriptors. For this reason we carried out a PCA using the $^{o}\beta_k$ entered in the QSAR model together with different molecular descriptors previously calculated for a series of 152 non-homologous proteins.74,92 One of the descriptors used was Occluded Surface Packing, OSP, which measures the interatomic occluded surface area for each atom in the protein; the values for the 152 non-homologous proteins were reported by Fleming and Richards74 along with the mathematical definition. Another descriptor, the radius of gyration of the protein, RG, has been widely used as a measure of global compactness and of packing. Helix content, %H, strand content, %S, folding degree, I3, and the number of aa residues, n (Naa in Table 4), were also considered.92 Varimax normalized rotation of the factor space was carried out. The molecular descriptor space derived has three dimensions (principal components), which are denoted as the Bulky, Folding, and vdw factors. The eigenvalues for these factors were 3.3, 2.6, and 1.74, respectively, which correspond to 36.4, 28.6, and 19.3% of the overall variance (84.3% in total). The factor loadings of the PCA are given in Table 4.93 The names of the factors reflect the structural information codified by the variables that they group. For instance, the Bulky factor is so named because it groups bulk properties such as n. It can be seen from the table that, with the exception of $^{c}\beta_0$, the vdw moments $^{c}\beta_5$, $^{i}\beta_0$, and $^{m}\beta_5$ (not previously orthogonalized) encode information that is different from that encoded by the classic bulky (n, RG) and folding (%H, %S, I3) protein descriptors. This result demonstrates that the present descriptors are qualitatively different from the former ones.
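The grouping of descriptors described above can be read off Table 4 mechanically by assigning each descriptor to the factor on which it has its largest loading. The Python sketch below does this with the loadings as printed in Table 4 (magnitudes only); the dictionary keys such as `c_beta_0` are hypothetical plain-text names for the moments, and the "largest loading" rule is an illustrative assumption rather than the authors' stated criterion.

```python
FACTORS = ("Folding", "Bulky", "van der Waals")

# Loadings as printed in Table 4: (Folding, Bulky, van der Waals).
LOADINGS = {
    "I3": (0.98, 0.03, 0.02),
    "%H": (0.96, 0.06, 0.09),
    "%S": (0.96, 0.03, 0.04),
    "Naa": (0.02, 0.95, 0.07),
    "OSP": (0.06, 0.76, 0.00),
    "RG": (0.08, 0.93, 0.01),
    "c_beta_0": (0.03, 0.53, 0.09),
    "c_beta_5": (0.03, 0.04, 0.92),
    "i_beta_0": (0.05, 0.00, 0.98),
    "m_beta_5": (0.05, 0.08, 0.81),
}


def dominant_factor(row):
    """Name of the factor with the largest loading for one descriptor."""
    best = max(range(len(FACTORS)), key=lambda f: row[f])
    return FACTORS[best]


groups = {name: [] for name in FACTORS}
for descriptor, row in LOADINGS.items():
    groups[dominant_factor(row)].append(descriptor)

for factor, members in groups.items():
    print(f"{factor}: {', '.join(members)}")
```

Under this rule the core moment of order 0 falls in the Bulky group with Naa, OSP, and RG, while the other three moments form the van der Waals group on their own.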

Conclusions

This study demonstrates that $^{o}\beta_k$ are novel 3D structural parameters that are promising for protein structure-function studies in computational chemistry. Specifically, the model fitted by means of LDA accurately discriminates between protein kinases and non-kinase proteins on the basis of protein 3D structure. Comparison of $^{o}\beta_k$ with alternative stochastic invariants, such as entropies and potentials, shows that $^{o}\beta_k$ are more successful for this problem. The PCA carried out in this study showed that $^{o}\beta_k$ codifies information that is qualitatively different from the information encoded by previous non-stochastic protein molecular descriptors. All of these facts support the use of $^{o}\beta_k$ as a new and promising tool for protein structure-function analysis in computational chemistry.

Acknowledgments

H. González-Díaz acknowledges funding from the program "Axuda para a incorporacion de investigadores tecnologos/visitantes da CONSELLEREIA DE INOVACION, INDUSTRIA E COMERCIO, IN8061 2005/63-0". This author also acknowledges a contract as guest professor from the Department of Organic Chemistry of the University of Santiago de Compostela in Spain, as well as support from Vlaamse Interuniversitaire Raad, VLIR, USO, University Development Cooperation.

References

1. Najmanovich, R. J.; Torrance, J. W.; Thornton, J. M. BioTechniques 2005, 38, 847.
2. Shindyalov, I. N.; Bourne, P. E. Protein Eng 1998, 11, 739.
3. Gibrat, J.-F.; Madej, T.; Bryant, S. H. Curr Opin Struct Biol 1996, 6, 377.
4. Wallace, A. C.; Laskowski, R.; Thornton, J. M. Protein Sci 1996, 5, 1001.
5. Wallace, A. C.; Borkakoti, N.; Thornton, J. M. Protein Sci 1997, 6, 2308.
6. Kleywegt, G. J. J Mol Biol 1997, 273, 371.
7. Dobson, P. D.; Cai, Y.; Stapley, B. J.; Doig, A. J. Curr Med Chem 2004, 11, 2135.
8. Dobson, P. D.; Doig, A. J. J Mol Biol 2003, 330, 771.
9. Zbilut, J. P.; Giuliani, A.; Colosimo, A.; Mitchell, J. C.; Colafranceschi, M.; Marwan, N.; Webber, Ch. L., Jr.; Uversky, V. N. Proteome Res 2005, 3, 1243.
10. Dobson, P. D.; Doig, A. J. J Mol Biol 2005, 345, 187.
11. Todeschini, R.; Consonni, V. Handbook of Molecular Descriptors. Wiley VCH: Weinheim, Germany; 2000.
12. Morales, A. H.; González, M. P.; Rieumont, J. B. Polymer 2004, 45, 2045.
13. González, M. P.; Dias, L. C.; Morales, A. H. Polymer 2004, 15, 5353.
14. González, M. P.; Morales, A. H.; Molina, R. Polymer 2004, 45, 2773.


15. González, M. P.; Terán, M. C.; Fall, Y.; Dias, L. C.; Morales, A. H. Polymer 2005, 46, 2783.
16. Gutman, I.; Rosenfield, V. R. Theor Chim Acta 1996, 93, 191.
17. Lee, S. Acc Chem Res 1991, 24, 249.
18. Burdett, J. K.; Lee, S. J Am Chem Soc 1985, 107, 3063.
19. Estrada, E. Bioinformatics 2002, 18, 697.
20. Estrada, E. Proteins: Struct Funct Bioinf 2004, 54, 727.
21. González-Díaz, H.; Molina, R. R.; Uriarte, E. Polymer 2004, 45, 3845.
22. Ramos, R.; González-Díaz, H.; Molina, R. R.; Uriarte, E. Proteins: Struct Funct Bioinf 2004, 56, 715.
23. González-Díaz, H.; Molina, R. R.; Uriarte, E. Bioorg Med Chem Lett 2004, 14, 4691.
24. González-Díaz, H.; Ramos, R.; Molina, R. Bioinformatics 2003, 19, 2079.
25. González-Díaz, H.; Marrero, Y.; Hernández, I.; Bastida, I.; Tenorio, I.; Nasco, O.; Uriarte, E.; Castañedo, N.; Cabrera, M.; Aguila, E.; Marrero, O.; Morales, A.; Pérez, M. Chem Res Tox 2003, 16, 1318.
26. González-Díaz, H.; Agüero, G.; Cabrera, M. A.; Molina, R.; Santana, L.; Uriarte, E.; Delogu, G.; Castañedo, N. Bioorg Med Chem Lett 2005, 15, 551.
27. González-Díaz, H.; Tenorio, E.; Castañedo, N.; Santana, L.; Uriarte, E. Bioorg Med Chem 2005, 13, 1523.
28. González-Díaz, H.; Hernández, S. I.; Uriarte, E.; Santana, L. Comput Biol Chem 2003, 27, 217.
29. Saíz-Urra, L.; González-Díaz, H.; Uriarte, E. Bioorg Med Chem 2005, 13, 3641.
30. González-Díaz, H.; Cruz-Monteagudo, M.; Molina, R.; Tenorio, E.; Uriarte, E. Bioorg Med Chem 2005, 13, 1119.
31. González-Díaz, H.; Olazábal, E.; Castañedo, N.; Hernández, S. I.; Morales, A.; Serrano, H. S.; González, J. J Mol Mod 2002, 8, 237.
32. González-Díaz, H.; Uriarte, E. Biopolymers 2005, 77, 296.
33. González-Díaz, H.; Uriarte, E.; Ramos, R. Bioorg Med Chem 2005, 13, 323.
34. Setny, P.; Geller, M. Proteins: Struct Funct Bioinf 2005, 58, 511.
35. Srinivasan, A. N.; Krupa, A. Proteins: Struct Funct Bioinf 2005, 58, 180.
36. Verkhivker, G. M. Proteins: Struct Funct Bioinf 2005, 58, 706.
37. Chou, K.-C. Curr Protein Pept Sci 2002, 3, 615.
38. Chou, K.-C. Peptides 2001, 22, 1973.
39. Chou, K.-C. Anal Biochem 2000, 286, 1.
40. Chou, K.-C. J Biol Chem 1993, 268, 16938.
41. Chou, K.-C. Anal Biochem 1996, 233, 1.
42. Chou, K.-C.; Zhang, C. T. J Protein Chem 1993, 12, 709.
43. Marrero-Ponce, Y.; Medina-Marrero, R.; Castillo-Garit, J. A.; Romero-Zaldivar, V.; Torrens, F.; Castro, E. A. Bioorg Med Chem 2005, 13, 3003.
44. Marrero-Ponce, Y.; Medina-Marrero, R.; Castro, E. A.; Ramos de Armas, R.; González, D. H.; Romero-Zaldivar, V.; Torrens, F. Molecules 2004, 9, 1124.
45. Bossemeyer, D. FEBS Lett 1995, 369, 57.
46. Davis, S. T.; Benson, B. G.; Bramson, H. N.; Chapman, D. E.; Dickerson, S. H.; Dold, K. M.; Eberwein, D. J.; Edelstein, M.; Frye, S. V.; Gampe, R. T., Jr.; Grif, R. J.; Harris, P. A.; Hassell, A. M.; Holmes, W. D.; Hunter, R. N.; Knick, V. B.; Lackey, K.; Lovejoy, B.; Luzzio, M. J.; Murray, D.; Parker, P.; Rocque, W. J.; Shewchuk, L.; Veal, J. M.; Walker, D. H.; Kuyper, L. F. Science 2001, 291, 134.
47. Schang, L. M. Biochim Biophys Acta 2004, 1697, 197.
48. Moffat, J. F.; McMichael, M. A.; Leisenfelder, S. A.; Taylor, S. L. Biochim Biophys Acta 2004, 1697, 225.
49. Tsai, L.-H.; Lee, M.-S.; Cruz, J. Biochim Biophys Acta 2004, 1697, 137.


50. Droucheau, E.; Primot, A.; Thomas, V.; Mattei, D.; Knockaert, M.; Richardson, C.; Sallicandro, P.; Alano, P.; Jafarshad, A.; Baratte, B.; Kunick, C.; Parzy, D.; Pearl, L.; Doerig, C.; Meijer, L. Biochim Biophys Acta 2004, 1697, 181.
51. García-Echeverría, C.; Traxler, P.; Evans, D. B. Med Res Rev 2000, 20, 28.
52. Davies, S. P.; Reddy, H.; Caivano, M.; Cohen, P. Biochem J 2000, 351, 95.
53. Bain, J.; Mclauchlan, H.; Elliott, M.; Cohen, P. Biochem J 2003, 371, 199.
54. Wong, W. S. F.; Leong, K. P. Biochim Biophys Acta 2004, 1697, 53.
55. Chou, K.-C.; Watenpaugh, K. D.; Heinrikson, R. L. Biochem Biophys Res Commun 1999, 259, 420.
56. Zhang, J.; Luan, C.-H.; Chou, K.-C.; Johnson, G. V. W. Proteins 2002, 48, 447.
57. Chou, K.-C. Curr Med Chem 2004, 11, 2105.
58. González-Díaz, H.; Gia, O.; Uriarte, E.; Hernández, I.; Ramos, R.; Chaviano, M.; Seijo, S.; Castillo, J. A.; Morales, L.; Santana, L.; Akpaloo, D.; Molina, E.; Cruz, M.; Torres, L. A.; Cabrera, M. A. J Mol Mod 2003, 9, 395.
59. González-Díaz, H.; Ramos, R.; Molina, R. Bull Math Biol 2003, 65, 991.
60. Gia, O.; Magno, S. M.; González-Díaz, H.; Quezada, E.; Santana, L.; Uriarte, E.; DallaVia, L. Bioorg Med Chem 2005, 13, 809.
61. Ramos, R.; González-Díaz, H.; Molina, R.; González, M. P.; Uriarte, E. Bioorg Med Chem 2004, 12, 4815.
62. Navarro, E.; Fenude, E.; Celda, B. Biopolymers 2004, 73, 229.
63. Navarro, E.; Fenude, E.; Celda, B. Biopolymers 2002, 64, 198.
64. Monleon, D.; Celda, B. Biopolymers 2003, 70, 212.
65. Leach, A. R. Molecular modeling. Principles and applications. Longman: Singapore; 1996.
66. Norberg, J.; Nilsson, L. Biophys J 2000, 79, 1537.
67. Norberg, J.; Nilsson, L. Quart Rev Biophys 2003, 36, 257.
68. González-Díaz, H.; Saíz-Urra, L.; Molina, R.; Uriarte, E. Polymer 2005, 46, 2791.
69. González-Díaz, H.; Molina, R.; Uriarte, E. FEBS Lett 2005, 579, 4297.
70. Berman, H. M.; Westbrook, J.; Feng, Z.; Gilliland, G.; Bhat, T. N.; Weissig, H.; Shindyalov, I. N.; Bourne, P. E. Nucleic Acids Res 2000, 28, 235.
71. Pettersen, E. F.; Goddard, T. D.; Huang, C. C.; Couch, G. S.; Greenblatt, D. M.; Meng, E. C.; Ferrin, T. E. J Comput Chem 2004, 25, 1605.
72. González-Díaz, H.; Molina, R.; Hernández, I. BIOMARKS, version 2.0, 2005 (BIOinformatic MARKovian Studio). This is in-house software that is not commercially available; contact information: gonzalezdiazh@yahoo.es.
73. Vieth, M.; Higgs, R. E.; Robertson, D. H.; Shapiro, M.; Gragg, E. A.; Hemmerle, H. Biochim Biophys Acta 2004, 1697, 243.
74. Fleming, P. J.; Richards, F. M. J Mol Biol 2000, 299, 487.
75. Bernstein, F. C.; Koetzle, T. F.; Williams, G. J. B. J Mol Biol 1977, 112, 535.
76. Kowalski, R. B.; Wold, S. In Handbook of Statistics; Krishnaiah, P. R.; Kanal, L. N., Eds.; North Holland Publishing: Amsterdam, 1982; pp. 673–697.
77. Van Waterbeemd, H. In Method and Principles in Medicinal Chemistry, Vol. 2; Mannhold, R.; Krogsgaard-Larsen, H.; Timmerman, H., Eds.; Chemometric Methods in Molecular Design; Van Waterbeemd, H., Ed.; VCH: Weinheim, 1995; pp. 265–282.
78. Matthews, B. W. Biochim Biophys Acta 1975, 405, 442.
79. Garcia-Domenech, R.; de Julian-Ortiz, J. V. J Chem Inf Comput Sci 1998, 38, 445.

Journal of Computational Chemistry

DOI 10.1002/jcc


González-Díaz et al. • Vol. 28, No. 6 • Journal of Computational Chemistry

80. Yuan, Z. FEBS Lett 1999, 451, 23.
81. Randić, M. J Chem Inf Comput Sci 1991, 31, 311.
82. Randić, M. New J Chem 1991, 15, 517.
83. Randić, M. J Mol Struct (THEOCHEM) 1991, 233, 45.
84. Povoa, P.; Coelho, L.; Almeida, E.; Fernandes, A.; Mealha, R.; Moreira, P.; Sabino, H. Clin Microbiol Infect 2005, 11, 101.
85. Fersht, A. Structure and Mechanism in Protein Science; W. H. Freeman: New York, 1999.
86. Marrero-Ponce, Y.; Iyarreta-Veitía, M.; Montero-Torres, A.; Romero-Zaldivar, C.; Brandt, C. A.; Ávila, P. E.; Kirchgatter, K.; Machado, Y. J Chem Inf Model 2005, 45, 1082.
87. Marrero-Ponce, Y.; Montero-Torres, A.; Romero-Zaldivar, C.; Iyarreta-Veitía, M.; Mayón-Pérez; García-Sánchez, R. N. Bioorg Med Chem 2005, 13, 1293.
88. Santana, L.; Uriarte, E.; González-Díaz, H.; Zagotto, G.; Soto-Otero, R.; Méndez-Álvarez. J Med Chem 2006, 49, 1149.
89. Agüero-Chapin, G.; González-Díaz, H.; Molina, R.; Varona-Santos, J.; Uriarte, E.; González-Díaz, Y. FEBS Lett 2006, 580, 723.
90. Estrada, E.; Uriarte, E.; Vilar, S. J Proteome Res 2006, 5, 105.
91. Cramer, R. D., III. J Am Chem Soc 1980, 102, 1837.
92. Estrada, E. J Chem Inf Comput Sci 2004, 44, 1238.
93. Malinowski, E. R.; Howery, D. G. Factor Analysis in Chemistry; Wiley-Interscience: New York, 1980.

