JVS 5795 De Caceres Accepted-Revised

Numerical reproduction of traditional classifications and automatic vegetation identification

Miquel de Cáceres1,2,*, Xavier Font1, Paloma Vicente1, Francesc Oliva2

1 Department of Plant Biology, University of Barcelona, Avda. Diagonal 645, Barcelona, ES-08028, Spain; 2 Department of Statistics, University of Barcelona, Avda. Diagonal 645, Barcelona, ES-08028, Spain; * Corresponding author; E-mail: [email protected]

Abstract

Questions: Is it possible to develop an expert system to provide reliable automatic identifications of plant communities at the precision level of phytosociological associations? How can unreliable expert-based knowledge be discarded before applying supervised classification methods?

Material: We used 3677 relevés from Catalonia (Spain), belonging to eight orders of terrestrial vegetation. These relevés were classified by experts into 222 low-level units (associations or subassociations).

Methods: We reproduced low-level expert-defined vegetation units as independent fuzzy clusters by using the Possibilistic C-means algorithm. Those relevés detected as transitional between vegetation types were excluded in order to maximize the number of units numerically reproduced. Cluster centroids were then considered static and used to perform supervised classifications of vegetation data. Finally, we evaluated the classifier’s ability to correctly identify the unit of both typical (i.e. training) and transitional relevés.

Results: Only 166 of the 222 original units (75%) could be numerically reproduced. Almost all the unrecognized units were subassociations. Among the original relevés, 61% were deemed transitional or untypical. Typical relevés were correctly identified 95% of the time, whereas the efficiency of the classifier on transitional data was only 64%. However, if the classifier’s second choice was also considered, the rate of correct classification for transitional relevés was 80%.

Conclusions: Our approach stresses the transitional nature of relevé data coming from vegetation databases. Relevé selection is justified in order to adequately represent the vegetation concepts associated with expert-defined units.

Keywords: Fuzzy sets; Expert systems; Possibilistic C-means; Phytosociological data; Syntaxonomy.

Introduction

During recent years, there has been a renewed interest in vegetation classification, even in parts of the world with little phytosociological tradition (e.g. Rodwell et al. 1995, Jennings 2003). Nature managers are in need of consistent systems of vegetation classification. Indeed, assigning a meaningful vegetation type to the plant community observed in a sampling site is the first step in applied ecological studies, such as landscape mapping, vegetation conservation or restoration planning. Such assignment (i.e. the determination of the community type) would be a simpler task if the identification of possible types was done through the use of remote expert systems of vegetation classification (Noble 1987). To date, there is no expert system specifically designed for providing web-based vegetation classification services on the basis of species composition/abundance. Nevertheless, several local computer programs are already available for this purpose (van Tongeren 1986, Hill 1996, Pot 1997, Tichý 2002, van Tongeren et al. 2008), and Czech vegetation scientists distribute expert system configurations to be used locally within the JUICE program (Chytrý 2007).

Methodologically speaking, the act of identifying the predefined class or classes to which a given plant community may be assigned is usually called supervised classification. Standard statistical tools such as quadratic discriminant analysis (Ejrnæs et al. 2004) and especially artificial neural networks (Cerná & Chytrý 2005) have recently been advocated as efficient methodological approaches for the identification of plot data. Simpler but more easily interpretable approaches consist in calculating resemblance values between the target relevé and each of the predefined vegetation units. After that, the relevé is identified with the nearest unit(s). Relevé resemblance computation may be performed by combining information from species composition, abundance values, and/or the presence of diagnostic species (van Tongeren 1986, Hill 1989, Kocí et al. 2003, Tichý 2005, van Tongeren et al. 2008). Another approach consists in identifying potential units by progressing downward from higher to lower hierarchical levels (Pot 1997). In either case, a classifier is developed from a training data set of plot observations whose classification is previously known and is assumed to be valid. Such an assumption can be a source of problems in expert domains where it does not hold, or when there is no consensus on the classification of the training set. Traditional expert-based vegetation classifications usually suffer from several inconsistencies (i.e. different researchers used variable and sometimes not explicitly stated classification criteria) and/or contain loosely defined units (i.e. plant communities defined by the occurrence, dominance, or absence of particular species). Under this scenario, supervised classification methods may spread potentially wrong knowledge if traditional expert-defined classifications are not previously validated using a common classification criterion. Since contemporary vegetation scientists are increasingly using numerical clustering (i.e. unsupervised) methods to derive new vegetation units (Mucina & van der Maarel 1989, Mucina 1997), these methods should also be used to review traditional classifications. However, note that current conservation policies, like those of the Natura 2000 networking program, are based on habitat definitions (e.g. the CORINE biotopes manual, Devillers et al. 1991), which in turn rely on traditional phytosociological units. Therefore, drastic changes in regional/national vegetation classifications can be problematic and should be avoided. Even if traditional vegetation units are considered valid, we believe the classification criterion of supervised classifications should be congruent with the one used in the original classification of the training data set. Otherwise, the efficiency and/or the interpretation of results may be affected. This explains why supervised approaches emulating traditional phytosociological concepts perform better when the expert classification of the training set is used instead of that resulting from numerical clustering analyses (e.g. van Tongeren et al. 2008).

The aim of this paper is to propose a methodological framework for translating low-level expert-defined vegetation units into an automatic vegetation identifier. It consists of two main steps. First, we use Possibilistic C-means, a fuzzy unsupervised classification method, to reproduce expert-defined vegetation units. Second, the cluster centroids resulting from the first step are used to identify new observations by means of a fuzzy classifier. We use the traditional phytosociological classification of the Catalan vegetation to build numerical clusters and to evaluate the classifier’s ability to provide satisfactory answers at the precision level of association.

Material and Methods

Data sets and data transformations

We took the traditional phytosociological classification of terrestrial vegetation in Catalonia, north-eastern Spain. In order to span a broad range of vegetation types, we considered eight syntaxonomical orders (see Table 1), which include different types of grasslands, shrublands and forests. For each order we compiled all relevés from those phytosociological associations containing at least three representatives. Relevés were drawn from the Biodiversity Data Bank of Catalonia (Font 2008). The original authors of the relevés had assigned them to associations or subassociations that were fitted into the syntaxonomical classification made by Bolòs & Vigo (1984). Only Brometalia erecti grasslands had undergone a numerical revision, based on correspondence analyses, of the original expert-based classification (Font 1993). We did not perform any stratified re-sampling (Knollová et al. 2005), nor did we eliminate those relevés with unusually small or large plot sizes (Otýpková & Chytrý 2006). Relevé compilation resulted in eight distinct datasets, one corresponding to each order. Taken together, we considered 3677 relevés, which belong to 222 distinct low-level (i.e. association or subassociation) vegetation units. These vegetation types were the expert knowledge to be validated and emulated by means of numerical methods.

Species nomenclature was homogenized using a regional flora (Bolòs et al. 1990). Uncertain plant determinations, determinations not reaching the species level and taxon names not appearing in the flora were eliminated. Although they are not consistently reported, we kept cryptogam records because they are diagnostic for some vegetation units. Braun-Blanquet cover-abundance values were first transformed to the nine-degree ordinal scale (van der Maarel 1979). We then applied the Hellinger transformation (Legendre & Gallagher 2001). The Hellinger distance (Rao 1995, Legendre & Legendre 1998) is equal to the chord distance (Orlóci 1967) computed after taking the square root of the abundance values. The multivariate space provided by the Hellinger distance was used to define numerical clusters reproducing expert-defined vegetation units.
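The transformation described above can be sketched in a few lines. This is a minimal illustration, assuming the standard definition of the Hellinger transformation (square root of relative abundances per relevé) followed by Euclidean distance; the function names and the small data matrix are hypothetical and are not taken from the Araucaria programs.

```python
import numpy as np

def hellinger_transform(abundances):
    """Row-wise Hellinger transformation: square root of relative abundances."""
    x = np.asarray(abundances, dtype=float)
    row_totals = x.sum(axis=1, keepdims=True)
    if np.any(row_totals == 0):
        raise ValueError("every relevé needs at least one recorded species")
    return np.sqrt(x / row_totals)

def hellinger_distance(abundances):
    """Pairwise Euclidean distances between Hellinger-transformed relevés."""
    h = hellinger_transform(abundances)
    diff = h[:, None, :] - h[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

# Hypothetical data: three relevés (rows) by four species (columns),
# with abundances already on the nine-degree ordinal scale.
releves = [[9, 0, 0, 0],
           [0, 0, 3, 3],
           [3, 0, 0, 0]]
distances = hellinger_distance(releves)
print(np.round(distances, 3))
```

Relevés 1 and 3 have the same relative composition, so their distance is zero, while relevés sharing no species lie at the maximum distance, the square root of 2. Each transformed row has unit length, which is why this distance coincides with the chord distance on square-rooted abundances.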

Cluster model

In the opinion of many vegetation scientists, vegetation types are not crisp classes but types that are conceptually vague and fuzzy (e.g. Dale 1988, Moraczewski 1993, Willner 2006). Therefore, any numerical classification of vegetation should allow some degree of overlap and even allow leaving some relevés unclassified. Setting a hierarchical tree or a partition (either fuzzy or crisp) as classification model seemed excessively constraining to us. In addition, we wanted a cluster model where new clusters could be defined without changing all those clusters previously defined. For these two reasons, we turned our attention to the Possibilistic C-Means algorithm (PCM, Krishnapuram & Keller 1993, 1996), which implements a clustering model where clusters are both fuzzy and independent. The PCM algorithm originated from Fuzzy C-means (FCM, Bezdek 1981), an unsupervised partitioning clustering procedure well known among vegetation scientists (Marsili-Libelli 1989, Mucina 1997). Table 2 summarizes the mathematical differences between the PCM and FCM models. In the possibilistic approach, fuzzy membership values are not relative (i.e. probabilistic) as in FCM, but are interpreted as absolute cluster typicalities. Cluster independence is obtained because the partition constraint of FCM is eliminated. That is, for any object the sum of its possibilistic membership values does not have to be one. As a result of this fundamental difference, PCM is a mode-seeking algorithm. That is, in PCM each vegetation cluster corresponds to a dense region in the multivariate space of relations between plots. A single PCM run can be regarded as c independent runs of an algorithm looking for a single cluster (Davé & Krishnapuram 1997). The PCM model solves the FCM problem raised by Dale (1995), namely the possible data contamination resulting from types that are not well represented and whose centre lies outside the available data. Fig. 1 further illustrates the differences between the two models, by showing their corresponding results on relevé data from three xerophytic grassland associations.
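The contrast between relative and absolute memberships can be made concrete with a short sketch. It assumes the standard membership functions of FCM (Bezdek 1981) and PCM (Krishnapuram & Keller 1993), which we take to correspond to those summarized in Table 2; the function names, the distance matrix and the values m = 2 and η = 1 are hypothetical choices made for readability only.

```python
import numpy as np

def fcm_membership(dist, m=2.0):
    """FCM memberships from a (clusters x objects) matrix of positive distances.

    Memberships are relative: each column (object) sums to one.
    """
    d = np.asarray(dist, dtype=float)
    # ratio[i, k, j] = (d_ij / d_kj) ** (2 / (m - 1))
    ratio = (d[:, None, :] / d[None, :, :]) ** (2.0 / (m - 1.0))
    return 1.0 / ratio.sum(axis=1)

def pcm_typicality(dist, eta, m=2.0):
    """PCM typicalities: each cluster is evaluated independently of the others."""
    d = np.asarray(dist, dtype=float)
    eta = np.asarray(eta, dtype=float)[:, None]  # one size parameter per cluster
    return 1.0 / (1.0 + (d ** 2 / eta) ** (1.0 / (m - 1.0)))

# Hypothetical distances of three objects (columns) to two centroids (rows).
# The third object is an outlier, far from both centroids.
dist = [[0.2, 1.0, 3.0],
        [1.0, 1.0, 3.0]]
print(np.round(fcm_membership(dist), 3))
print(np.round(pcm_typicality(dist, eta=[1.0, 1.0]), 3))
```

Under FCM the outlier is split evenly between the two clusters (0.5 each) because memberships must sum to one. Under PCM it receives a low typicality (0.1) for both, so it can be left unclassified.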

Reproduction of traditional units into numerical clusters

Whenever possible, we created one possibilistic fuzzy cluster for each traditional low-level vegetation unit (syntaxonomical association or subassociation). One additional advantage of PCM over FCM is that it avoids the need to specify the number of clusters to be sought. Instead, distinct clusters are permitted as long as they represent distinguishable dense regions of the multivariate space. In our case, we considered two clusters as distinguishable when their amount of overlap was less than 10% (see below). We used this criterion to detect poorly defined vegetation units. Moreover, relevé databases may be plagued with noisy and transitional plot data. Indiscriminately including all the available relevés would prevent the PCM algorithm from distinguishing many expert-defined units. Therefore, some relevés were discarded during the reproduction process.

The following steps were performed for each of the eight datasets. We started by taking those relevés belonging to the first low-level vegetation unit. The one-cluster PCM algorithm was then run on this initial training relevé set, using the three closest relevés as starting cluster members. The fuzziness exponent was set to m = 1.03, which is a rather crisp value but allowed higher sensitivity of the algorithm. The PCM cluster size parameter (ηi in Table 2) was then progressively increased in order to make the cluster grow. This was done by using a method described in De Cáceres et al. (2006), which allows finding appropriate PCM cluster sizes. Once the cluster had grown, the relevés showing very low membership values (i.e. with uij < 0.0001) were excluded from the training data set and stored in a set of transitional (non-typical) relevés. The final cluster configuration was also stored. After reproducing this first unit, the relevés belonging to the next vegetation unit were included in the training relevé set, and we let the previously defined PCM cluster(s) “react” to the newly added relevés by running the PCM algorithm from its last configuration, also allowing for changes in the cluster size parameter (De Cáceres et al. 2006). It could happen that some of the newly added relevés became members (i.e. with a possibilistic fuzzy membership uij > 0.1) of any of the previous cluster(s). In this case, those relevés were deemed transitional, and they were also excluded from the training set and stored in the transitional set. We then reloaded the stored cluster configuration(s) and the “reacting” process was rerun without the noisy transitional relevés. This was repeated until all previously defined PCM cluster(s) were stable to the new relevés. Note that this process of discarding relevés could leave a given vegetation unit without enough relevés to be translated into a numerical cluster. If enough relevés were left, we used the three closest relevés as starting cluster members for a new PCM cluster, which was grown as described above.

Any given PCM cluster reproducing a low-level expert unit was only accepted when it fulfilled the following three conditions: (a) the sum of membership values for the fuzzy cluster set (i.e. its cardinality) was equal to or greater than 3; (b) all relevés with a membership value for the fuzzy cluster above 0.1 had been classified by experts into the same vegetation type (this condition ensured that the PCM cluster represented the proper vegetation concept); and (c) the proportion of overlap between the fuzzy cluster and any of the remaining PCM clusters was lower than 10%. Cluster overlap between any pair of clusters was calculated as the cardinality of the fuzzy intersection set divided by the cardinality of the fuzzy union set. Whenever a cluster failed to be accepted, a distinct set of three relevés was used as starting cluster configuration. The steps described above were iteratively repeated until all the traditional low-level vegetation units had been considered. Subassociations were given priority over associations as units to be reproduced.

This algorithm yielded three sets: (1) a final training set, made of typical relevés only (this is hereafter also referred to as the typical relevé set); (2) a transitional set, containing those relevés that were outliers or similar to more than one numerical cluster; and (3) a set of PCM fuzzy clusters corresponding to reproduced expert-defined vegetation units.
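The three acceptance conditions can be sketched as a single check. This is an illustration only: it assumes the usual minimum and maximum operators for fuzzy intersection and union, which the text does not specify, and the function names and example memberships are hypothetical.

```python
import numpy as np

def overlap(u_a, u_b):
    """Cardinality of the fuzzy intersection divided by that of the fuzzy union.

    Assumption: intersection = element-wise minimum, union = element-wise maximum.
    """
    u_a = np.asarray(u_a, dtype=float)
    u_b = np.asarray(u_b, dtype=float)
    union = np.maximum(u_a, u_b).sum()
    return float(np.minimum(u_a, u_b).sum() / union) if union > 0 else 0.0

def accept_cluster(u_new, expert_labels, target_unit, other_clusters,
                   min_cardinality=3.0, member_threshold=0.1, max_overlap=0.10):
    """Return True when a candidate cluster meets conditions (a), (b) and (c)."""
    u_new = np.asarray(u_new, dtype=float)
    labels = np.asarray(expert_labels)
    # (a) the fuzzy cardinality (sum of memberships) is at least 3
    enough_members = u_new.sum() >= min_cardinality
    # (b) every relevé with membership above 0.1 belongs to the same expert unit
    members = u_new > member_threshold
    same_unit = bool(np.all(labels[members] == target_unit))
    # (c) overlap with every other cluster stays below 10%
    distinct = all(overlap(u_new, u) < max_overlap for u in other_clusters)
    return bool(enough_members and same_unit and distinct)

# Hypothetical memberships of six relevés in a candidate cluster for unit "A"
u_candidate = [1.0, 0.9, 0.8, 0.7, 0.0, 0.0]
labels = ["A", "A", "A", "A", "B", "B"]
u_existing = [[0.0, 0.0, 0.0, 0.05, 0.9, 1.0]]
print(accept_cluster(u_candidate, labels, "A", u_existing))
```

In this example the candidate has cardinality 3.4, all of its members were assigned to unit A by the experts, and its overlap with the existing cluster is below 1%, so it is accepted.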

Supervised classification of relevés

We used the probabilistic approach of FCM to perform supervised classifications. In order to use this unsupervised method in a supervised mode, the centroid coordinates for each of the fuzzy clusters must be considered static (but see the leave-one-out procedure below). Supervised FCM classification of any relevé j was performed in two simple steps: (1) compute eij, the distance between the relevé j and each fixed cluster centroid i; and (2) compute uij, the relevé fuzzy membership to each cluster i, by using the FCM membership function (eq. 1 in Table 2). We set the fuzziness exponent to m = 1.2 in this case, as recommended by several authors (e.g. Marsili-Libelli 1989, Podani 1990, Escudero & Pajarón 1994).
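The two steps can be sketched as follows. The code assumes Euclidean distances in the Hellinger-transformed space and the standard FCM membership function, which we take to be eq. 1 of Table 2; the function name, the centroids and the target relevé are hypothetical.

```python
import numpy as np

def classify_releve(releve, centroids, m=1.2):
    """Supervised FCM step: memberships of one relevé to fixed cluster centroids.

    Returns the membership vector and the cluster indices ranked from the
    first choice to the last.
    """
    x = np.asarray(releve, dtype=float)
    v = np.asarray(centroids, dtype=float)
    # Step 1: distance e_ij between the relevé and each fixed centroid
    e = np.sqrt(((v - x) ** 2).sum(axis=1))
    # Step 2: FCM membership u_ij (a relevé lying exactly on a centroid
    # gets full membership to it)
    if np.any(e == 0):
        u = (e == 0).astype(float)
        u /= u.sum()
    else:
        ratio = (e[:, None] / e[None, :]) ** (2.0 / (m - 1.0))
        u = 1.0 / ratio.sum(axis=1)
    ranking = np.argsort(-u)
    return u, ranking

# Hypothetical centroids of three clusters in a three-species Hellinger space
centroids = [[1.0, 0.0, 0.0],
             [0.0, 1.0, 0.0],
             [0.0, 0.0, 1.0]]
target = np.sqrt([0.8, 0.2, 0.0])  # Hellinger-transformed relevé
u, ranking = classify_releve(target, centroids)
print(np.round(u, 4), ranking)
```

With m = 1.2 the exponent 2/(m - 1) equals 10, so memberships are close to crisp: the nearest centroid receives almost all the membership, and the ranking still provides a second choice.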

Evaluation of the classifier

Our objective was to assess the performance of the classifier by measuring its rate of correct identification at the precision level of association. If a given association (and its possible subassociations) had not been reproduced, it was not represented in the set of fuzzy clusters. Hence, its relevés could not be used to evaluate the classifier’s performance. However, if some subassociations of an association or the association itself had been reproduced, then all its subassociations were considered to be represented because in this case the classifier was capable of returning a correct answer at the level of association.

Both typical and transitional relevé sets were used for the evaluation of the classifier. Since relevés of the transitional set had been discarded in the definition of PCM clusters, they could be used as a test set. However, relevés of the typical (training) set exerted an attraction on the centroids, and thus their re-classification was biased. To remove this bias, we used a leave-one-out cross-validation procedure. Each training relevé to be classified was temporarily removed from the training set, and the PCM clusters were allowed to “react” as explained above. After this step, identification could be done without the influence of the target relevé on cluster centroids.

The classifier responses were homogenized at the level of association. For each represented association within each order we estimated the sensitivity and positive predictive power of the classifier (see Cerná & Chytrý 2005 for details). We also calculated rates of correct association identification for each of the eight datasets, and for all datasets taken together. In order to gain more detailed information on the classifier’s performance, we repeated this efficiency assessment also taking into account the second choice as an additional source of correct identification.
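These measures can be sketched in code under the usual definitions, which we assume match those of Cerná & Chytrý (2005): sensitivity is the share of an association's relevés that the classifier assigns to it, and positive predictive power is the share of relevés assigned to an association that truly belong to it. The function names and the toy labels are hypothetical.

```python
from collections import Counter

def association_metrics(expert, first_choice):
    """Sensitivity and positive predictive power for each expert association."""
    true_pos = Counter(e for e, p in zip(expert, first_choice) if e == p)
    n_expert = Counter(expert)
    n_predicted = Counter(first_choice)
    metrics = {}
    for assoc in n_expert:
        sensitivity = true_pos[assoc] / n_expert[assoc]
        # Positive predictive power is undefined if nothing was assigned to assoc
        ppp = (true_pos[assoc] / n_predicted[assoc]
               if n_predicted[assoc] else float("nan"))
        metrics[assoc] = (sensitivity, ppp)
    return metrics

def correct_rate(expert, first_choice, second_choice=None):
    """Rate of correct identification, optionally accepting the second choice."""
    hits = 0
    for k, assoc in enumerate(expert):
        if first_choice[k] == assoc:
            hits += 1
        elif second_choice is not None and second_choice[k] == assoc:
            hits += 1
    return hits / len(expert)

# Hypothetical expert assignments and classifier answers for five relevés
expert = ["A", "A", "A", "B", "B"]
first = ["A", "A", "B", "B", "A"]
second = ["B", "B", "A", "A", "C"]
print(association_metrics(expert, first))
print(correct_rate(expert, first), correct_rate(expert, first, second))
```

In this toy case the first-choice rate is 0.6 and rises to 0.8 when the second choice is also accepted.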

Results

Reproduction of traditional units

Among the 222 original low-level units, 166 (75%) could be numerically reproduced using the strategy described (see Table 1). Only two of the 56 non-reproduced units were associations. The remaining 54 non-reproduced units were subassociations, which means that in all these cases other subassociations of the same association could be reproduced. Approximately 39% of the original relevés were finally kept in the training set, but this percentage varied from 31% (for Fagetalia beech forests) to 57% (for Galio-Alliarietalia megaforb communities). Hence, nearly 25% of the expert-defined vegetation units and 61% of the relevés can be considered of transitional nature following our cluster building criteria.

Performance of the vegetation classifier

The two non-reproduced associations accounted for 27 relevés. The remaining 3650 relevés belonged to associations represented in the classifier, so they were used to assess its performance. We report detailed result tables on the sensitivity and positive predictive power for each association in App. 1. We show in Table 3 the rates of correct identification computed for the eight datasets independently and altogether. The overall rate of correct association identification for the typical relevés was very high: 95% of relevés were classified into the correct association in the first choice, and 99% taking into account the first and second choices of the classifier (see Table 3). This high rate of success is not surprising, since the relevés of this set were those which, by definition, were closest to cluster centroids. In contrast, the classifier identified the correct association for 64% of the relevés of the transitional set. Nevertheless, if we take into account the transitional nature of these relevés, the percentage of correct identification using the first and second choices may be a more realistic measure of performance. Over all phytosociological orders, this latter percentage was 79.5%. Identification of beech forests (Fagetalia sylvaticae) was the least successful (66%) and that of Quercus ilex forests and related communities (Quercetalia ilicis) the most successful (89.3%). When considering both typical and transitional relevés, the estimated overall efficiency of the classifier was 76.3% of correct identification on first choice, and 86.9% considering also the classifier’s second choice.
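As a rough consistency check, the overall rates can be approximated as averages of the rates for the typical and transitional sets, weighted by their shares of the relevés. The sketch below uses the rounded proportions quoted above (39% typical, 61% transitional), which is an assumption: those proportions refer to all 3677 relevés, whereas the evaluation used 3650, so only approximate agreement is expected.

```latex
\begin{align*}
\text{first choice:} \quad & 0.39 \times 0.95 + 0.61 \times 0.64 = 0.3705 + 0.3904 \approx 0.761 \\
\text{first or second choice:} \quad & 0.39 \times 0.99 + 0.61 \times 0.795 = 0.3861 + 0.4850 \approx 0.871
\end{align*}
```

Both values lie within a few tenths of a percentage point of the reported 76.3% and 86.9%.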

Discussion

Reproduction of traditional classifications

Previous attempts to reproduce traditional vegetation classifications usually forced all expert-defined units into the classifier (e.g. van Tongeren 1986, Hill 1989, van Tongeren et al. 2008). In the case of Kocí et al. (2003), the use of the Cocktail algorithm (Bruelheide 2000) allowed excluding poorly differentiated units, but their approach was still essentially expert-based (Chytrý 2007). Going a step further, we stressed here the necessity of validating traditional vegetation units through the use of an unsupervised clustering method. Although we tried to maximize the number of vegetation types that could be numerically reproduced, 25% of the original low-level units could not be reproduced. Subassociations turned out to be more difficult to reproduce because many of them are traditionally defined as a subclass of an association that shows a tendency towards an ecologically neighbouring association (in other words, they are transitional).

Moreover, in previous approaches relevé identification was usually performed using assignment rules that were different from the rules originally used in the classification of training data (e.g. Kocí et al. 2003, Tichý 2005, van Tongeren et al. 2008). We preferred to use the resemblance in species abundance values only, as a simple common criterion for both unsupervised and supervised classification. Using overall species composition instead of Cocktail’s species groups has the advantage that it allows reproducing units lacking differential species (i.e. ‘basal’ or ‘central’ communities). However, the classifier is not expected to provide accurate results with such units due to their high variability and number of transitional relevés.

Performance of the vegetation classifier

Whereas inconsistency in the original classification methods can be avoided by applying numerical clustering, it reappears when attempting to evaluate the efficiency of the classifier because the reference classification is expert-based. That is, the precision of the original assignments may affect the percentages of successful identification. In addition, relevés belonging to transitional subassociations were more difficult to classify correctly than relevés belonging to reproduced vegetation units (even if both were represented at the level of association). This occurred because the classifier lacked centroids to represent these units and hence their relevés were assigned to one of the neighbouring units. The high number of unrecognized subassociations in Fagetalia beech forests (see Table 1) may account for the low classifier results on this data set (Table 3). There are other possible sources of low supervised classification efficiency, derived from inconsistencies in the sampling methods that different authors use. Otýpková & Chytrý (2006) showed that smaller plots tend to produce less stable ordinations in data sets of low beta diversity. In terms of classification, their findings imply that relevés from small plots may be easily misclassified because of their higher degree of variability both in species presence and abundance. The same reasoning may be applied to the inconsistent recording of cryptogams.

Sampling and the appropriate representation of vegetation types

We carefully selected the relevés included in the training set, which certainly is a critical point in our approach and must be justified. Statistically speaking, such relevé selection is still a subjective decision that completely biases sampling and precludes any inference on the validity of groups. Hence, one cannot expect to accurately reflect the real patterns of vegetation. Moreover, Cerná & Chytrý (2005) found that selecting plots with diagnostic species as training set resulted in lower efficiency of neural network classifiers compared to using a randomly selected training set. Nevertheless, vegetation scientists nowadays generally agree that vegetation is mainly of a continuous nature. Therefore, as long as an optimal vegetation sampling theory is lacking, statistical inference on clustering results will remain a delicate subject (e.g. Rolecek et al. 2007). Meanwhile, vegetation classification should not aim at discovering true vegetation types, but should provide a knowledge basis for performing applied ecological studies. With this in mind, we considered it more important to keep the vegetation concept to be reproduced very clear. We set a specific point in the multivariate space (i.e. the cluster centroid) as the representative of the expert-defined unit. Not including transitional relevés in the centroid definition helped in keeping it as an ideal type. Ensuring that the nomenclatural type relevé (if available) shows a high membership to the unit would be a way to allow using the syntaxon name for the fuzzy cluster.

Limitations of the numerical cluster model

Note that our numerical cluster model assumes roughly spherical clusters, both when building PCM clusters and when executing the FCM classifier. One of Dale’s (1995) criticisms of FCM was its inability to cope with non-spherical cluster shapes. It is, however, possible to allow hyperellipsoidal clusters in the FCM and PCM algorithms (Krishnapuram & Keller 1993) by taking into account the cluster variance-covariance matrix. Another limitation of our approach is that the FCM membership function works better with clusters of similar size. The PCM typicality function may be used instead, but at the expense of obtaining values which cannot be interpreted as probabilities.

Final remarks and future work

In our opinion, vegetation scientists should decide whether they would prefer: (1) a vegetation classifier designed as an interface to communicate expert vegetation knowledge to non-experts; or (2) a computer program like the former, but which could also promote the revision of the expert knowledge itself. In the first case the program would simply run supervised classification methods from a knowledge that would be assumed to be true. In contrast, in the second case the system would allow doubting expert knowledge, and even changing the expert’s point of view. We believe this second model is more flexible and promising. We implemented our proposals in a set of related computer programs called Araucaria (see App. 2 and http://biodiver.bio.ub.es/vegana/araucaria). One of them allows experts to feed the classifier with new plot data, and see how the current set of PCM clusters “reacts” to this new information. Regarding future developments, we strongly believe that a comparison of vegetation classification methodologies is necessary, not only in terms of efficiency but also aiming at a unification of traditional and numerical approaches. Since vegetation classifications are regionally restricted, studying solutions for biogeographical issues (e.g. vicariant units) would be another interesting research topic. Nevertheless, large-scale vegetation expert systems (say, valid for all of Europe) will certainly be difficult to develop.

Acknowledgements

We would like to thank Lubomir Tichý and an anonymous reviewer for their very useful comments on a previous version of this manuscript. This study was supported by a Ph.D. grant awarded by the “Comissionat per a Universitats i Recerca” (1999SGR00059), of the “Departament d’Universitats, Recerca i Societat de la Informació de la Generalitat de Catalunya” (2001 FI 00269), and by a research project from the Spanish “Ministerio de Educación y Ciencia” (CGL2006-13421-C04-01/BOS).

14

329

References

Bezdek, J. C. 1981. Pattern recognition with fuzzy objective functions. Plenum Press, New York.

Bolòs, O. de & Vigo, J. 1984. Flora dels Països Catalans. Vol. 1. Ed. Barcino, Barcelona.

Bolòs, O. de, Vigo, J., Masalles, R. M. & Ninot, J. M. 1990. Flora Manual dels Països Catalans. 2nd ed. Pòrtic, Barcelona.

Braun-Blanquet, J. 1964. Pflanzensoziologie: Grundzüge der Vegetationskunde. Springer.

Bruelheide, H. 2000. A new measure of fidelity and its application to defining species groups. Journal of Vegetation Science 11(2): 167-178.

Cerná, L. & Chytrý, M. 2005. Supervised classification of plant communities with artificial neural networks. Journal of Vegetation Science 16: 407-414.

Chytrý, M. (ed.) 2007. Vegetation of the Czech Republic. 1. Grassland and Heathland Vegetation. Academia, Praha, 525 pp. http://www.sci.muni.cz/botany/vegsci/expertni_system.php?lang=en

Dale, M. B. 1988. Some fuzzy approaches to phytosociology. Ideals and instances. Folia geobotanica et phytotaxonomica 23: 239-274.

Dale, M. B. 1995. Evaluating classification strategies. Journal of Vegetation Science 6: 437-440.

Davé, R. N. & Krishnapuram, R. 1997. Robust clustering methods: a unified view. IEEE transactions on fuzzy systems 5: 270-293.

De Cáceres, M., Oliva, F. & Font, X. 2006. On relational possibilistic clustering. Pattern recognition 39: 2010-2024.

Devillers, P., Devillers-Terschuren, J. & Ledant, J.-P. 1991. CORINE biotopes manual. Habitats of the European Community. A method to identify and describe consistently sites of major importance for nature conservation. Data specifications - Part 2. Office for Official Publications of the European Communities, Luxembourg.

Ejrnæs, R., Bruun, H. H., Aude, E. & Buchwald, E. 2004. Developing a classifier for the Habitats Directive grassland types in Denmark using species lists for prediction. Applied Vegetation Science 7: 71-80.

Escudero, A. & Pajarón, S. 1994. Numerical syntaxonomy of the Asplenietalia petrarchae in the Iberian Peninsula. Journal of Vegetation Science 5: 205-214.

Font, X. 1993. Estudis geobotànics sobre els prats xeròfils de l’estatge montà dels pirineus. Institut d’Estudis Catalans, Barcelona, ES.

Font, X. 2008. Mòdul Flora i Vegetació. Banc de Dades de Biodiversitat de Catalunya. Generalitat de Catalunya i Universitat de Barcelona. http://biodiver.bio.ub.es/biocat/homepage.html

Hill, M. O. 1989. Computerized matching of relevés and association tables, with an application to the British National Vegetation Classification. Vegetatio 83: 187-194.

Hill, M. O. 1996. TABLEFIT version 1.0, for identification of vegetation types. Institute of Terrestrial Ecology, Huntingdon, UK.

Jennings, M. 2003. Guidelines for Describing Associations and Alliances of the US National Vegetation Classification. Ecological Society of America.

Knollová, I., Chytrý, M., Tichý, L. & Hajek, O. 2005. Stratified resampling of phytosociological databases: some strategies for obtaining more representative data sets for classification studies. Journal of Vegetation Science 16: 479-486.

Kocí, M., Chytrý, M. & Tichý, L. 2003. Formalized reproduction of an expert-based phytosociological classification: A case study of subalpine tall-forb vegetation. Journal of Vegetation Science 14: 601-610.

Krishnapuram, R. & Keller, J. M. 1993. A possibilistic approach to clustering. IEEE transactions on fuzzy systems 1: 98-110.

Krishnapuram, R. & Keller, J. M. 1996. The possibilistic c-means algorithm: Insights and recommendations. IEEE transactions on fuzzy systems 4: 385-393.

Legendre, P. & Gallagher, E. D. 2001. Ecologically meaningful transformations for ordination of species data. Oecologia 129: 271-280.

Legendre, P. & Legendre, L. 1998. Numerical Ecology. 2nd English ed. Elsevier.

Marsili-Libelli, S. 1989. Fuzzy clustering of ecological data. Coenoses 4: 95-106.

Moraczewski, I. R. 1993. Fuzzy logic for phytosociology: 1. Syntaxa as vague concepts. Vegetatio 106: 1-11.

Mucina, L. 1997. Classification of vegetation: Past, present and future. Journal of Vegetation Science 8: 751-760.

Mucina, L. & van der Maarel, E. 1989. Twenty years of numerical syntaxonomy. Vegetatio 81: 1-15.

Noble, I. R. 1987. The role of expert systems in vegetation science. Vegetatio 69: 115-121.

Orlóci, L. 1967. An agglomerative method for classification of plant communities. Journal of Ecology 55: 193-206.

Otýpková, Z. & Chytrý, M. 2006. Effects of plot size on the ordination of vegetation samples. Journal of Vegetation Science 17: 465-472.

Podani, J. 1990. Comparison of fuzzy classifications. Coenoses 5: 17-21.

Pot, R. 1997. SYNDIAT, SYNtaxonomical DIAgnostics Tool, a computer program based on the deductive method of community identification. Acta Botanica Neerlandica 46: 230.

Rao, C. R. 1995. A review of canonical coordinates and an alternative to correspondence analysis using Hellinger distance. Qüestiió (Quaderns d'Estadística i Investigació Operativa) 19: 23-63.

Rodwell, J. S., Pignatti, S., Mucina, L. & Schaminée, J. H. J. 1995. European Vegetation Survey: update on progress. Journal of Vegetation Science 6: 759-762.

Rolecek, J., Chytrý, M., Hájek, M., Lvoncik, S. & Tichý, L. 2007. Sampling in large-scale vegetation studies: Do not sacrifice ecological thinking to statistical puritanism. Folia Geobotanica 42: 199-208.

Tichý, L. 2002. JUICE, software for vegetation classification. Journal of Vegetation Science 13: 451-453.

Tichý, L. 2005. New similarity indices for the assignment of relevés to the vegetation units of an existing phytosociological classification. Plant Ecology 179: 67-72.

van der Maarel, E. 1979. Transformation of cover-abundance values in phytosociology and its effects on community similarity. Vegetatio 39: 97-114.

van Tongeren, O. 1986. FLEXCLUS, an interactive program for classification and tabulation of ecological data. Acta Botanica Neerlandica 35: 137-142.

van Tongeren, O., Gremmen, N. & Hennekens, S. M. 2008. Assignment of relevés to predefined classes by supervised clustering of plant communities using a new composite index. Journal of Vegetation Science 19: 525-536.

Willner, W. 2006. The association concept revisited. Phytocoenologia 36: 67-76.

Table 1. The eight phytosociological orders studied and results of the numerical reproduction of their low-level classification.

| Phytosociological order | Short description | Original units | Original relevés | Reproduced units | Non-reproduced units | Training (typical) rel. |
|---|---|---|---|---|---|---|
| Brometalia erecti | mesophytic or slightly xerophytic pastures | 30 | 531 | 26 | 4 | 231 |
| Origanetalia vulgaris | herb communities growing on forest fringes | 12 | 133 | 10 | 2 | 67 |
| Galio-Alliarietalia | megaforb sciophilous communities | 13 | 124 | 12 | 1 | 71 |
| Prunetalia spinosae | shrub communities growing on deciduous forest fringes | 18 | 353 | 16 | 2 | 161 |
| Populetalia albae | riverine meso-macroforests growing on wet fluvisols with high water-table | 17 | 199 | 10 | 7 | 107 |
| Quercetalia ilicis | mediterranean woodlands, scrublands and maquis | 31 | 753 | 25 | 6 | 254 |
| Quercetalia pubescentis | submediterranean deciduous oak woodlands | 41 | 651 | 30 | 11 | 243 |
| Fagetalia sylvaticae | beech forests | 60 | 933 | 37 | 23 | 286 |
| Total | | 222 | 3677 | 166 | 56 | 1420 |

Table 2: Main mathematical characteristics of the Fuzzy C-means (FCM) and Possibilistic C-means (PCM) clustering algorithms.

| | FCM | PCM |
|---|---|---|
| Fuzzy membership definition | $\sum_{i=1}^{c} u_{ij} = 1$ for all objects $j = 1, \dots, n$ | $\sum_{i=1}^{c} u_{ij} > 0$ for all objects $j = 1, \dots, n$ |
| Optimisation function | $J_{FCM} = \sum_{i=1}^{c} \sum_{j=1}^{n} (u_{ij})^{m} e_{ij}^{2}$ | $J_{PCM} = \sum_{i=1}^{c} \sum_{j=1}^{n} (u_{ij})^{m} e_{ij}^{2} + \sum_{i=1}^{c} \eta_i \sum_{j=1}^{n} (1 - u_{ij})^{m}$ |
| Membership function | $u_{ij} = 1 / \sum_{l=1}^{c} (e_{ij} / e_{lj})^{2/(m-1)}$ (1) | $u_{ij} = 1 / (1 + (e_{ij}^{2} / \eta_i)^{1/(m-1)})$ (2) |
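The two membership functions of Table 2 (Eqs. 1 and 2) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the example distances and the value m = 2 are chosen here only to keep the arithmetic easy to follow, and the name `etas` for the reference distance parameters follows the usual PCM notation rather than anything stated in the table. Distances e_ij are taken as already computed between one object j and each of the c centroids.

```python
def fcm_memberships(dists, m):
    """Eq. 1: FCM memberships of one object, given its distances to all c centroids."""
    zero = [i for i, e in enumerate(dists) if e == 0.0]
    if zero:
        # The object coincides with one or more centroids: share full membership among them.
        return [1.0 / len(zero) if i in zero else 0.0 for i in range(len(dists))]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((e_i / e_l) ** power for e_l in dists) for e_i in dists]


def pcm_memberships(dists, etas, m):
    """Eq. 2: PCM memberships; each value depends only on the object's own distance e_ij."""
    power = 1.0 / (m - 1.0)
    return [1.0 / (1.0 + (e * e / eta) ** power) for e, eta in zip(dists, etas)]


# Hypothetical distances of two relevés to three centroids.
typical = [0.5, 2.0, 2.0]       # close to the first centroid
transitional = [3.0, 3.0, 3.0]  # equally far from all centroids
etas = [1.0, 1.0, 1.0]          # assumed reference distance parameters

for name, d in (("typical", typical), ("transitional", transitional)):
    print(name, fcm_memberships(d, 2.0), pcm_memberships(d, etas, 2.0))
```

With these assumed values, the relevé lying at distance 3 from every centroid receives FCM memberships of 1/3 each, because FCM forces them to sum to 1, whereas its PCM memberships are 1/(1 + 9) = 0.1 each. Uniformly low PCM memberships of this kind are what allow transitional or untypical relevés to be detected.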

Table 3. Classification efficiency of the numerical classifier at the association level. Column blocks list the efficiency on the typical and transitional relevé sets, as well as the overall efficiency for the represented associations. Ass.: Number of represented associations. Rel.: Number of relevés. %: Percentage of relevés correctly classified; L/U: Lower/upper 95% confidence limits following the binomial distribution. Cells show % (L-U).

| Phytosociological order | Ass. | Typical: Rel. | Typical: 1st choice | Typical: 1st/2nd choice | Transitional: Rel. | Transitional: 1st choice | Transitional: 1st/2nd choice | Represented: Rel. | Represented: 1st choice | Represented: 1st/2nd choice |
|---|---|---|---|---|---|---|---|---|---|---|
| Brometalia erecti | 20 | 231 | 97.4 (94.4-99.0) | 99.1 (96.9-99.9) | 285 | 68.8 (63.5-74.6) | 85.6 (81.1-89.6) | 516 | 81.6 (78.4-85.2) | 91.7 (89.1-94.0) |
| Origanetalia vulgaris | 10 | 67 | 92.5 (83.4-97.5) | 100.0 (94.6-100.0) | 66 | 39.4 (27.6-52.2) | 78.8 (67.0-87.9) | 133 | 66.2 (57.5-74.1) | 89.5 (83.0-94.1) |
| Galio-Alliarietalia | 11 | 71 | 94.4 (86.2-98.4) | 97.2 (90.2-99.7) | 53 | 56.6 (42.3-70.2) | 73.6 (59.7-84.7) | 124 | 78.2 (69.9-85.1) | 87.1 (79.9-92.4) |
| Prunetalia spinosae | 9 | 161 | 96.3 (92.1-98.6) | 98.8 (95.6-99.8) | 192 | 72.9 (66.0-79.1) | 85.9 (80.2-90.5) | 353 | 83.6 (79.3-87.3) | 91.8 (88.4-94.4) |
| Populetalia albae | 7 | 107 | 92.5 (85.8-96.7) | 94.4 (88.2-97.9) | 92 | 64.1 (53.5-73.9) | 82.6 (73.3-89.7) | 199 | 79.4 (73.1-84.8) | 88.9 (83.7-92.9) |
| Quercetalia ilicis | 13 | 254 | 99.2 (97.2-99.9) | 99.2 (97.2-99.9) | 487 | 80.9 (79.4-86.7) | 89.3 (88.2-93.8) | 741 | 87.2 (86.7-91.5) | 92.7 (92.2-95.9) |
| Quercetalia pubescentis | 10 | 243 | 90.5 (86.1-93.9) | 98.8 (96.4-99.7) | 408 | 65.7 (60.9-70.3) | 82.1 (78.0-85.7) | 651 | 75.0 (71.4-78.2) | 88.3 (85.6-90.7) |
| Fagetalia sylvaticae | 22 | 286 | 96.2 (93.2-98.1) | 99.0 (97.0-99.8) | 647 | 49.0 (45.1-52.9) | 66.0 (62.2-69.6) | 933 | 63.5 (60.3-66.5) | 76.1 (73.2-78.8) |
| Total | 102 | 1420 | 95.4 (94.2-96.4) | 98.6 (97.8-99.1) | 2230 | 64.1 (62.1-66.2) | 79.5 (77.9-81.3) | 3650 | 76.3 (75.1-77.9) | 86.9 (86.0-88.2) |
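The caption of Table 3 states only that the confidence limits follow the binomial distribution. One common way to obtain such limits is the exact (Clopper-Pearson) interval, and the sketch below assumes that this is the method meant; it is an illustration, not the authors' code, and all function names are hypothetical. Given k correctly identified relevés out of n, it finds by bisection the proportions at which each binomial tail probability equals 2.5%.

```python
from math import exp, lgamma, log


def binom_pmf(i, n, p):
    """P(X = i) for X ~ Binomial(n, p), 0 < p < 1, computed in log space."""
    log_coef = lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
    return exp(log_coef + i * log(p) + (n - i) * log(1.0 - p))


def _root_of_decreasing(f, iters=50):
    """Bisection on (0, 1) for a function that falls from positive to negative."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def exact_binomial_limits(k, n, conf=0.95):
    """Clopper-Pearson limits for k correct identifications out of n relevés."""
    tail = (1.0 - conf) / 2.0
    if k == 0:
        lower = 0.0
    else:
        # Lower limit: P(X >= k | p) = tail; this tail probability grows with p.
        lower = _root_of_decreasing(
            lambda p: tail - sum(binom_pmf(i, n, p) for i in range(k, n + 1)))
    if k == n:
        upper = 1.0
    else:
        # Upper limit: P(X <= k | p) = tail; this tail probability shrinks with p.
        upper = _root_of_decreasing(
            lambda p: sum(binom_pmf(i, n, p) for i in range(0, k + 1)) - tail)
    return lower, upper


# 67 of 67 typical relevés correct (Origanetalia vulgaris, 1st/2nd choice).
print(exact_binomial_limits(67, 67))
```

When all n relevés are correct, the lower limit has the closed form 0.025^(1/n); for n = 67 this is about 0.946, which agrees with the 94.6% lower limit listed for Origanetalia vulgaris under the typical 1st/2nd choice.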

Fig. 1: Example of clustering results of FCM and PCM on relevés belonging to three grassland associations of Brometalia erecti. (a) Classical multidimensional scaling coordinates from Bray-Curtis distances, with the original vegetation units labelled using different symbols (filled circles: Koelerio-Avenuletum ibericae; squares: Adonido-Brometum erecti; diamonds: Lino viscosi-Brometum erecti; empty circles: intermediate artificial relevés created by averaging randomly selected relevés from the three groups). (b) FCM (m=1.2) solution with three groups. (c) PCM (m=1.09) solution with three groups, after setting appropriate reference distance parameters as described in De Cáceres et al. (2006). Symbol size and colour intensity are a function of the object’s largest membership value.
