Geochimica et Cosmochimica Acta 71 (2007) 3391–3392 www.elsevier.com/locate/gca

Response

Reply to Comment by Agrawal and Verma on “Tectonic classification of basalts with classification trees”

Pieter Vermeesch

ETH Zurich, Isotope Geology and Mineral Resources, Clausiusstrasse 25, NW C85, CH-8092 Zurich, Switzerland

Received 20 April 2007; accepted in revised form 27 April 2007; available online 3 May 2007

E-mail address: [email protected]
0016-7037/$ - see front matter © 2007 Elsevier Ltd. All rights reserved. doi:10.1016/j.gca.2007.04.021

Agrawal and Verma (2007) allege six problems with the use of classification trees for the tectonic discrimination of oceanic basalts, as proposed by Vermeesch (2006). In the following, I will demonstrate that the first five of their points are false, whereas the sixth is partially correct but can easily be fixed.

The results of Vermeesch (2006) are said to be irreproducible because a large number of data points are “unclassifiable” due to the absence of all the primary and surrogate variables. However, Section 4.2 of Vermeesch (2006) explains that in such cases a “follow the majority” decision should be made. For example, in the absence of TiO2, P2O5, and Zr (the primary and surrogate variables for the first split of the full tree; Figure 4 of Vermeesch, 2006), the sample should be sent to the “Yes”-side of the split, because this is where the majority (520/756) of the training data go. The “follow the majority” rule is an integral part of the classification tree method and all but obviates points (i)–(v) of the Comment by Agrawal and Verma (2007).

Thanks to the simple but effective way of dealing with missing data, the sparseness of the training data used by Vermeesch (2006) is not a problem. Agrawal and Verma (2007) remark that not even a single sample in the training set was analyzed for all the variables, and that only one MORB sample was analyzed for Sn. It is important to note that the variable Sn was used in neither of the two classification trees presented by Vermeesch (2006). The fact that classification trees are not hurt by sparse datasets should be seen as a positive feature and can hardly be considered a criticism of the method.

The sixth and final point of Agrawal and Verma (2007) is that geochemical analysis should consider only the relative and not the absolute values of its components. I welcome the opportunity to elaborate on this point here. It is true that the constant-sum constraint (“closure”) of compositional data implies some degree of correlation between the components. Loss or gain of any component causes a change in the concentration of all the other components. This problem is well known in geochemistry and is generally solved by taking ratios. For a parametric method such as discriminant analysis, Aitchison (1986) advocates taking log-ratios. However, taking logarithms is not necessary for non-parametric tools such as the classification tree. The latter only considers the order of the split variables, which is not affected by taking logarithms.

For the sake of illustration, a ratio-based tree was built using the same dataset as Vermeesch (2006), but converting major oxide concentrations (in weight percent) to elemental concentrations (in parts per million). The following variables were used: La/Ti, Ce/Ti, Nd/Ti, Sm/Ti, Eu/Ti, Gd/Ti, Tb/Ti, Dy/Ti, Ho/Ti, Er/Ti, Tm/Ti, Yb/Ti, Lu/Ti, Sc/Ti, V/Ti, Sr/Ti, Y/Ti, Zr/Ti, Nb/Ti, Hf/Ti, Ta/Ti, Th/Ti, U/Ti, Sr/Zr, Zr/Nb, Nb/Th, La/Sm, La/Yb, Gd/Yb, Th/Ta, Nb/La, Th/Yb, Th/U, Nb/U and Nb/Ta (Fig. 1). Only 751 of the original 756 data were used for the tree construction, because one IAB and four OIBs lacked all the necessary variables. Using the entire dataset of 756 training data, the resubstitution error of the ratio-based tree is 14% and its 10-fold cross-validation error is 18%. Because the surrogate variables are also composed of ratios (Table 1), they are subject to some degree of spurious correlation (Chayes, 1971). There is no way around this, but the cross-validation error estimate suggests that it only affects the performance of the tree to a minor degree.

To illustrate once again the use of surrogate variables and the “follow the majority” rule, consider a sample with a Sr/Ti ratio of 0.01 and lacking all other variables. The primary split variable (Sr/Zr) is missing, so the first surrogate variable must be used (Table 1). Because Sr/Ti = 0.01 < 0.02056053, the sample is sent to the right side of the first node. We have now arrived at the third node, and the primary and surrogate variables are La/Yb, La/Sm and Yb/Ti, respectively. All these variables are missing, so

Fig. 1. Example of a ratio-based tree (N = 751; class counts are given as IAB/MORB/OIB, with 255/241/255 at the root). The “heaviest” nodes are encircled as in Vermeesch (2006). [Tree diagram not reproduced here; the splits of its nine numbered nodes are listed in Table 1.]

Table 1
Surrogate splits for the ratio-based tree (Fig. 1)

Split  IAB/MORB/OIB  Primary split          Surrogate 1            Surrogate 2              Surrogate 3
1      255/241/255   Sr/Zr ≥ 1.86375        Sr/Ti ≥ 0.02056053     -                        -
2      249/56/224    Nb/La < 0.8085888      Zr/Nb ≥ 13.66719       Nb/Th < 4.770525         Sr/Zr ≥ 3.527116
3      6/185/31      La/Yb < 5.539706       La/Sm < 2.778826       Yb/Ti ≥ 0.0002107271     -
4      183/5/19      V/Ti ≥ 0.02497367      -                      -                        -
5      66/51/205     La/Yb < 5.606413       La/Sm < 2.480184       La/Ti < 0.0007888696     Ce/Ti < 0.002014049
6      30/33/16      Gd/Yb < 2.328818       La/Yb < 3.675245       -                        -
7      36/18/189     Sr/Zr ≥ 5.168944       -                      -                        -
8      25/28/6       Sr/Zr ≥ 3.228164       Sr/Ti ≥ 0.03181145     Zr/Ti < 0.008364239      La/Sm ≥ 1.078327
9      22/13/1       Sm/Ti < 0.002996233    Eu/Ti < 0.001157938    Yb/Ti < 0.00315352       La/Ti < 0.002535417

If all primary and surrogate splits are missing, a “follow the majority” rule is used.
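How Table 1 is meant to be read can be sketched in a few lines of Python. This is an illustrative sketch only, not the software used to build the tree: the dictionary layout and the function name `route` are hypothetical, only nodes 1 and 3 are encoded, and the convention that a satisfied condition sends the sample to the left is inferred from the worked example in the text. The majority directions follow from the class counts in Table 1 (529 of 751 training data go left at node 1; 197 of 222 go left at node 3).

```python
import operator

# Splits are tried in order: primary first, then surrogates (Table 1).
# Assumed convention: condition satisfied -> "left" ("Yes"), else "right".
NODES = {
    1: {
        "splits": [
            ("Sr/Zr", operator.ge, 1.86375),      # primary split
            ("Sr/Ti", operator.ge, 0.02056053),   # surrogate 1
        ],
        "majority": "left",   # 529 of 751 training data go left
    },
    3: {
        "splits": [
            ("La/Yb", operator.lt, 5.539706),     # primary split
            ("La/Sm", operator.lt, 2.778826),     # surrogate 1
            ("Yb/Ti", operator.ge, 0.0002107271), # surrogate 2
        ],
        "majority": "left",   # 197 of 222 training data go left
    },
}


def route(node, sample):
    """Return "left" or "right" for a sample given as {ratio name: value}."""
    for variable, compare, threshold in node["splits"]:
        value = sample.get(variable)
        if value is not None:
            # Use the first split variable that was actually measured.
            return "left" if compare(value, threshold) else "right"
    # Every primary and surrogate variable is missing: follow the majority.
    return node["majority"]


# The sample from the text: only Sr/Ti was measured.
sample = {"Sr/Ti": 0.01}
side_at_node_1 = route(NODES[1], sample)
side_at_node_3 = route(NODES[3], sample)
```

For this sample, `route` returns "right" at node 1 (through the first surrogate) and "left" at node 3 (through the majority rule), which is the branch that ends in the MORB leaf.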

we must use the “follow the majority” rule. Because 197 out of the 222 training data that arrived at node 3 were sent to the left, our sample is classified as MORB.

The misclassification rate of samples with missing data is worse than that of samples which were analyzed for all the components. However, provided that the sample of unknown tectonic affinity and the training data are comparatively sparse, the cross-validation error of 18% should be a reasonably accurate estimate of the true misclassification rate. For this reason, comparing the performance of a classification tree with that of a discriminant analysis lacking any missing data, as done by Verma et al. (2006), is fundamentally unfair.

APPENDIX A. SUPPLEMENTARY DATA

Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.gca.2007.04.021.
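The remark made in the text, that a classification tree only considers the order of a split variable and is therefore unaffected by taking logarithms, can be checked numerically. The following is a minimal sketch rather than part of the original analysis: the threshold is the La/Yb split of node 3 in Table 1, whereas the five sample values are invented for the example.

```python
import math

THRESHOLD = 5.539706  # La/Yb split of node 3 (Table 1)

# Hypothetical La/Yb ratios, made up for illustration only.
la_yb = [0.9, 3.2, 5.0, 6.1, 12.4]

# Split on the raw ratio.
raw_side = [x < THRESHOLD for x in la_yb]

# Split on the log-ratio, with the threshold transformed the same way.
log_side = [math.log(x) < math.log(THRESHOLD) for x in la_yb]

# The logarithm is strictly increasing, so both versions of the split
# send every sample to the same side of the node.
same_partition = raw_side == log_side
```

Because only the position of each sample relative to the threshold matters, a tree grown on log-ratios would partition the data in the same way as one grown on the ratios themselves.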

REFERENCES

Agrawal S., and Verma S. P. (2007) Comment on “Tectonic classification of basalts with classification trees” by Pieter Vermeesch (2006). Geochim. Cosmochim. Acta 71, 3388–3390.
Aitchison J. (1986) The Statistical Analysis of Compositional Data. Chapman and Hall.
Chayes F. (1971) Ratio Correlation; A Manual for Students of Petrology and Geochemistry. Chicago University Press.
Verma S. P., Guevara M., and Agrawal S. (2006) Discriminating four tectonic settings: five new geochemical diagrams for basic and ultrabasic volcanic rocks based on log-ratio transformation of major-element data. J. Earth Sys. Sci. 115, 485–528.
Vermeesch P. (2006) Tectonic discrimination of basalts with classification trees. Geochim. Cosmochim. Acta 70, 1839–1848.

Associate editor: Richard J. Walker
