
USING GIS, ARTIFICIAL NEURAL NETWORKS AND REMOTE SENSING TO MODEL URBAN CHANGE IN THE MINNEAPOLIS-ST. PAUL AND DETROIT METROPOLITAN AREAS

Bryan C. Pijanowski
Bradley A. Shellito
207 Manly Miles Building
College of Natural Science
Michigan State University
East Lansing, Michigan 48824
[email protected]
[email protected]

Marvin E. Bauer
Kali E. Sawaya
115 Green Hall
Department of Forest Resources
University of Minnesota
St. Paul, Minnesota 55108
[email protected]
[email protected]

ABSTRACT

We parameterized the GIS and neural net-based Land Transformation Model (LTM) for the Detroit and Twin Cities Metropolitan Areas (DMA and TCMA, respectively) using historical land use data derived from aerial photography. We built several neural net models and tested whether these models were transferable across the two metropolitan regions and whether a regional model provided as good a fit as a locally parameterized model. The overall accuracy of the model in predicting urban transitions was 37% and 33% for the TCMA and DMA, respectively. An “internal” versus “external” learning exercise resulted in models that appeared to be fairly transferable in one case (DMA applied to TCMA) and not well transferable in the other (TCMA applied to DMA). A “local” versus “regional” exercise produced results suggesting that learning from larger-scale spatial patterns does not reduce the ability of the model to predict smaller, local trends. We discuss the implications of these two learning exercises and suggest ways in which the models could be improved. The overall accuracy of the presented models is judged against previous LTM applications in Michigan’s Grand Traverse Bay Watershed and Kuala Lumpur, Malaysia.

ASPRS Proceedings 2001, St. Louis, Mo, April 21,26, 2001.

INTRODUCTION

Land use information provides valuable input to local, state and regional land use planning. The importance of accurate information describing the kind and extent of land features, past, present, and future, is increasing. There are a variety of ways to develop digital land use information. The most common is through the interpretation of aerial photography. Many local and regional government GIS and planning offices now develop land use maps on a regular timetable and store and manage this information in a GIS. For example, in the Minneapolis-St. Paul and Detroit Metropolitan Areas, land use/cover maps have been developed every 5-6 years for nearly 30 years.

Information about land use change, as analyzed from multiple time periods, can generate useful knowledge about the patterns of change and the possible factors driving change. This information can be used by planners and resource managers to make better decisions that affect the environment and local and regional economies.

A Land Transformation Model (Pijanowski et al., 1997; Pijanowski et al., 2000a; Pijanowski et al., in press) has been developed to simulate land use change in a variety of locations around the world. The Land Transformation Model (LTM) uses population growth, transportation factors, and the proximity or density of important landscape features such as rivers, lakes, recreational sites, and high-quality vantage points as inputs to model land use change. The model relies on GIS, artificial neural network routines, land use data from at least two time periods and customized geospatial tools. The model can be used to help understand what factors are most important to land use change. The artificial neural networks (hereafter, neural nets) “learn” about complex spatial relationships of factors that correlate with urban development. Information derived from a historical analysis of land use change is often used to conduct forecasting studies (Pijanowski, in press) or for coupling to other environmental models (e.g., Brown et al., 2000; Boutt et al., 2001) for use in planning and resource management.

The purpose of this paper is to: (1) illustrate how the LTM is parameterized and calibrated using land use change data derived from aerial remote sensing data; and (2) conduct two different learning exercises designed to test the ability of neural nets to generalize across locations and across time. We examine these by building models for the Minneapolis-St. Paul and Detroit Metropolitan Areas, which we then use to forecast where new urban uses will occur in the year 2020.

BACKGROUND

Artificial Neural Networks

Artificial neural networks are powerful tools that use a machine learning approach to numerically solve relationships between inputs and outputs. Neural nets are used in a variety of disciplines, such as economics (Fishman et al., 1991), medicine (Babaian et al., 1991), landscape classification (Brown et al., 1998), image analysis (Fukushima et al., 1983), pattern classification (Ritter et al., 1988), climate forecasting (Drummond et al., 1998), mechanical engineering (Kuo and Cohen, 1998), and remote sensing (Atkinson and Tatnall, 1997). The use of neural networks has increased substantially over the last several years because of advances in computing performance (Skapura, 1996) and the increased availability of powerful and flexible neural net software.

Neural nets are designed to emulate the functionality of neurons in order to achieve a high parallel processing potential. Neural nets typically consist of many simple processing units, which are wired together in a complex communication network. Each unit or node is a simplified model of a real neuron, which fires (sends off a new signal) if it receives a sufficiently strong input signal from the other nodes to which it is connected. The strength of these connections may be varied in order for the network to perform different tasks corresponding to different patterns of node firing activity.

One of the first neural nets, the perceptron (Rosenblatt, 1958), consists of a single node, which receives weighted inputs and thresholds the results according to a defined rule. This type of simple neural machine is capable of classifying linearly separable data and performing linear functions. The multi-layer perceptron (MLP) neural net described in Rumelhart et al. (1986) is one of the most widely used neural nets. The MLP consists of three layers (input, hidden, and output) and thus can be used to identify relationships that are non-linear in nature.
The input data could be the wavebands in a spectral analysis or the pixel intensity in an image (Skapura, 1996). In the case of a classification problem, the output from the MLP for a given input can be one of several different classes.

Neural net algorithms calculate weights for input values, input layer nodes, hidden layer nodes and output layer nodes by introducing the input in a feed-forward manner, which propagates through the hidden layer and to the output layer. The signals propagate from node to node and are modified by weights associated with each connection. The receiving node sums the weighted inputs from all of the nodes connected to it from the previous layer. The output of this node is then computed as a function of its input, called the “activation function.” The data move forward from node to node, with multiple weighted summations occurring before the output layer is reached.

Weights in a neural net are determined by using a training algorithm, the most popular of which is the back propagation (BP) algorithm. The BP algorithm randomly selects the initial weights, then compares the calculated output for a given observation with the expected output for that observation. The difference between the expected and calculated output values across all observations is summarized using the mean squared error. After all observations are presented to the network, the weights are modified according to a generalized delta rule (Rumelhart et al., 1986), so that the total error is distributed among the various nodes in the network. This process of feeding forward signals and back-propagating the errors is repeated iteratively (in some cases, many thousands of times) until the error stabilizes at a low level.

Neural nets differ from statistical or algorithm-based models in several respects (Skapura, 1996). First, neural nets do not require formal mathematical specification.
Iterative processes performed by the computer determine relationships between inputs and outputs. Thus, the weights derived between inputs, hidden nodes and the output(s) are not directly interpretable. Second, neural nets are not highly sensitive to noise in data, whereas statistical or mathematical algorithms treat noisy data in the same way as data of high quality. Third, neural nets generate information that can be applied to data that they have “not seen before”. Thus, there is the potential to develop models that can be “generalized” or transferred.
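As an illustration only, the feed-forward and back-propagation cycle described above can be sketched in a few lines of Python. This is not the neural net software used in this study; the function names, the network size (two inputs, two hidden nodes), the learning rate and the toy training data are all assumptions chosen for brevity. Weights are updated once per cycle, after all observations have been presented, as in the description above.

```python
import math
import random


def sigmoid(x):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-x))


def init_network(n_inputs, n_hidden, rng):
    """Random initial weights; the last weight in each list is the bias."""
    hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
              for _ in range(n_hidden)]
    output = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return hidden, output


def forward(network, inputs):
    """Feed the inputs forward; return hidden activations and the output."""
    hidden_w, output_w = network
    hidden = [sigmoid(sum(w * x for w, x in zip(ws[:-1], inputs)) + ws[-1])
              for ws in hidden_w]
    out = sigmoid(sum(w * h for w, h in zip(output_w[:-1], hidden))
                  + output_w[-1])
    return hidden, out


def mean_squared_error(network, samples):
    return sum((t - forward(network, x)[1]) ** 2
               for x, t in samples) / len(samples)


def train_cycle(network, samples, rate):
    """One cycle: accumulate delta-rule terms over all samples, then update."""
    hidden_w, output_w = network
    grad_h = [[0.0] * len(ws) for ws in hidden_w]
    grad_o = [0.0] * len(output_w)
    for inputs, target in samples:
        hidden, out = forward(network, inputs)
        delta_o = (target - out) * out * (1.0 - out)
        for j, h in enumerate(hidden):
            grad_o[j] += delta_o * h
            # error attributed to hidden node j through its output weight
            delta_h = delta_o * output_w[j] * h * (1.0 - h)
            for i, x in enumerate(inputs):
                grad_h[j][i] += delta_h * x
            grad_h[j][-1] += delta_h
        grad_o[-1] += delta_o
    n = len(samples)
    for j, ws in enumerate(hidden_w):
        for i in range(len(ws)):
            ws[i] += rate * grad_h[j][i] / n
    for j in range(len(output_w)):
        output_w[j] += rate * grad_o[j] / n


# Toy data (hypothetical): two normalized inputs, binary output.
samples = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
           ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
rng = random.Random(0)
net = init_network(2, 2, rng)
mse_before = mean_squared_error(net, samples)
for _ in range(500):
    train_cycle(net, samples, rate=0.5)
mse_after = mean_squared_error(net, samples)
```

Comparing `mse_before` with `mse_after` shows the error falling as the cycles are repeated, which is the stopping behavior described in the text.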


Land Transformation Model Framework

LTM modeling follows five sequential steps: (1) processing/coding of data to create spatial layers of predictor variables; (2) applying spatial rules that relate predictor variables to land use transitions for each location in an area (the resultant layers contain input variable values in grid format); (3) integrating all input grids using one of three techniques; (4) analyzing the difference between model outputs and real change; and (5) temporally scaling the amount of transitions in the study area in order to create a time series of possible future land uses.

In Step 1, processing of spatial data, inputs are generated from a series of base layers that are stored and managed within a GIS. These base layers represent land uses (such as agriculture parcels and urban areas) or features in the landscape (e.g., roads, rivers, and lakeshores). Grid cells are coded to represent predictors as either binary (presence=1 or absence=0) or continuous variables (e.g., distance of a cell from a road), depending on the type of attribute.

In Step 2, applying spatial transition rules, inputs are developed using a set of spatial transition rules that quantify the spatial effects that predictor cells have on land use transitions (see Pijanowski et al., 2000, for details). We use four classes of transition rules: (1) neighborhoods or densities; (2) patch size; (3) site-specific characteristics; and (4) distance from the location of a predictor cell. Neighborhood effects are based on the premise that the composition of surrounding cells has an effect on the tendency of a central cell to transition to another use. Patch sizes relate the variable values of all cells within a defined patch (e.g., parcel) to the likelihood of land use transition. Site-specific characteristics are values assigned to a cell based on a biophysical or social characteristic that is specific to each grid cell.
An example site-specific characteristic is the location of quality views. The distance spatial transition rule relates the effect of the Euclidean distance between each cell and the closest predictor variable.

Certain locations are coded so that they do not undergo transitions. This is necessary for areas within which development is prohibited, such as public lands. We code cells with a ‘4’ if a transition cannot occur; all other locations are assigned a ‘0’. All such layers are then multiplied together to generate one single layer of “exclusionary zones.”

In Step 3, integration of predictor variables, one of three different integration methods is used in the LTM (see Pijanowski et al., 2000; Pijanowski et al., in press): multi-criteria evaluation (MCE), artificial neural networks (ANN), and logistic regression (LR). Each integration procedure requires a different type of data normalization. With all of the integration methods, the cell size (100 x 100 meters in the present analysis) and analysis window are set to a fixed base layer. The output from this step is a map of “change likelihood values,” which specifies the relative likelihood of change for each cell based on the inputs and weights associated with the driving variable grids. Only the neural net method is used here.

Step 4, spatial error analysis, entails comparing the model forecasts against known land use changes that occurred during the same time interval. The GIS is used to calculate the percentage of cells that the model correctly identifies as transitioning to urban when compared with actual land use change data. Arc/INFO GRID is used to code cells using the following integer values: 0 = no observed change and no change predicted by the model; 1 = observed change and no change predicted by the model; 2 = no observed change but change predicted by the model; 3 = observed change and change predicted by the model.
The GIS is used to plot patterns of type ‘1’ and ‘2’ errors along with other spatial features in an area (locations of roads, rivers, lakes, and special sites such as casinos and marinas) to determine what factors might be missing from the model or what factors might be spatially miscoded by the GIS (i.e., whether a different spatial rule should be applied to a spatial feature). The process of adding, subtracting or reconfiguring the spatial nature of the model inputs is conducted in a ‘trial and error’ fashion. Once the inputs are deemed satisfactory, forecasting exercises are conducted (see Pijanowski et al., in press, for more details).

In Step 5, temporal indexing, the amount of land that is expected to transition to urban over a given time period is determined using a “principal index driver” or PID (Pijanowski et al., in press). We calculate the PID based on population growth (i.e., number of people) and historical population density calculations to determine the total amount of new urban land at a later time:

$$U_i(t) = \left(\frac{dP_i}{dt}\right) A_i(t) \qquad (1)$$

where U is the amount of new urban land required in the time interval t, i is the unit (e.g., county) for which the population statistics are available, dP/dt is the number of new people in a given area in a given time interval, and A is the per capita requirement for urban land.
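As a worked illustration of equation (1), the short Python sketch below converts population growth into a number of new urban cells for each unit and sums the result. The function name, the county labels, the population figures and the per capita land requirement are hypothetical values invented for the example; only the 100 m x 100 m cell size comes from the present analysis.

```python
CELL_AREA_M2 = 100 * 100  # 100 m x 100 m grid cells, as used in this analysis


def new_urban_cells(pop_start, pop_end, urban_m2_per_person):
    """Equation (1) for one unit i: new people times per capita urban land.

    The result is expressed in grid cells rather than square meters.
    """
    new_people = pop_end - pop_start
    return new_people * urban_m2_per_person / CELL_AREA_M2


# Hypothetical units: (population at start, population at end,
# square meters of urban land required per person).
units = {
    "County A": (100_000, 120_000, 500),
    "County B": (250_000, 260_000, 400),
}

cells_by_unit = {name: new_urban_cells(*values)
                 for name, values in units.items()}
total_new_cells = sum(cells_by_unit.values())
```

Calculating the demand unit by unit and then summing mirrors the county-by-county aggregation used for the metropolitan areas.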


METHODS AND MATERIALS

Study Sites

Data. Data for the Minneapolis-St. Paul Metropolitan Area were obtained from the University of Minnesota Remote Sensing Laboratory, the Twin Cities Metropolitan Council, the Minnesota data, Vol. 1, produced by the Land Management Information Center, and the Minnesota Department of Natural Resources Data Deli web site (http://deli.dnr.state.mn.us). Data on land use, transportation, natural features (e.g., locations of rivers and lakes), public lands, digital elevation and political boundaries were incorporated into the Arc/Info 8.0 Geographic Information System (ESRI, 1999). GIS data were converted from their original projection into an Albers Equal Area projection (datum 83) with units in meters.

The 1991 and 1997 Generalized Land Use data sets encompass the seven-county Twin Cities Metropolitan Area (TCMA) in Minnesota. The land use data sets were developed by the Twin Cities Metropolitan Council, a regional governmental organization that deals, in part, with regional issues and long-range planning for the Minneapolis-St. Paul area. The data sets include the following land use classifications: single family residential, multi-family residential, commercial, industrial, public and semi-public, airports, parks and recreation, vacant and agricultural, major four lane highways, open water bodies, farmsteads, extractive, public industrial, industrial parks not developed, and public and semi-public not developed. Land use was interpreted from 1:24,000 aerial photography scanned with 0.6 m resolution pixels. Most lines were digitized at an on-screen scale of no higher than 1:3,000. In highly urbanized areas, 1:1,500 was more common. The Metropolitan Council attempted to meet the National Map Accuracy Standards at 1:24,000 (within approximately 40 feet of actual location), although no testing has been conducted to verify this.
For the purposes of the modeling exercises, all TCMA data were converted to an Arc/INFO GRID (ESRI, 1999) format with cell sizes of 100 m x 100 m.

Detroit Metropolitan Area (DMA) data were obtained from several sources. Land use and transportation data from 1980 were acquired from Michigan State University’s Center for Remote Sensing and Geographic Information Science. Land use interpretation was made from 1:24,000 color-infrared and black-and-white aerial photographs by the Michigan Department of Natural Resources. These data depict approximately 52 categories of urban, agricultural, forest, wetland, and other land cover types for the entire state of Michigan. Each land use/cover category is depicted by a polygon and identified with a land cover code. The minimum digitized polygon size was 5 acres. Updated land use and transportation data were obtained from the Southeast Michigan Council of Governments’ (SEMCOG) GIS facility. Land use and other GIS data used for the modeling exercise were converted to an Arc/INFO GRID (ESRI, 1999) format with 100 m x 100 m cell sizes.

All land use classes for both metropolitan areas were also reclassified from their original classification to Anderson Level I (Anderson et al., 1976) for the modeling exercises. The resulting land use/cover classes were: urban, agriculture, open grassland, forest, open water, wetlands and barren. For the TCMA study area, a total of 770,502 cells were included in the modeling exercise. The DMA study area is nearly 55% larger, containing 1,192,590 cells. The study interval was 6 and 15 years for the TCMA and DMA, respectively.

Description. Figure 1 shows the areas in the Minneapolis-St. Paul and Detroit Metropolitan Areas that are modeled. The Minneapolis-St. Paul area encompasses seven counties: Anoka, Carver, Dakota, Hennepin, Ramsey, Scott and Washington. The modeled metropolitan area around the city of Detroit includes Livingston, Macomb, Monroe, Oakland, St. Clair, Washtenaw and Wayne Counties.

Based on a concurrent study of land cover change detection in the Twin Cities between 1991 and 1998 (Sawaya et al., in press), it is clear that the growth in this region for this time period is concentrated along the recently completed I-94 corridor connecting the metro area to western Wisconsin. The growth pattern to the north of the metropolitan area is also substantial but more dispersed. Other “hot spots” of development include an area south of Waconia, the southwestern perimeter of the metro area following the Minnesota River, and pockets along the northwest suburban perimeter near Plymouth and Maple Grove.

In the Detroit Metropolitan Area, the most substantial urban growth between 1980 and 1995 occurred in northern Oakland County north of the I-696 corridor and in Livingston County along the I-96 corridor, with most of the growth occurring between the cities of Brighton and Novi. Other development corridors occurred between the cities of Plymouth and Ann Arbor and between the cities of Plymouth and Northville.

LTM GIS Parameterization

To ensure that the input files for the neural networks were identical for the two metropolitan areas, the same spatial features (e.g., county roads) and spatial rules (e.g., distance to nearest county road) were applied to both areas. Data resulting from a GIS calculation are referred to as a driving variable grid. These driving variable grids are shown in Figure 2; the method of calculating them is summarized below.

[Figure 1: land use maps of the Detroit Metropolitan Area (St. Clair, Oakland, Macomb, Livingston, Wayne, Washtenaw and Monroe Counties; scale bar 0 to 60 kilometers) and the Twin Cities Metropolitan Area (Anoka, Washington, Hennepin, Ramsey, Carver, Scott and Dakota Counties; scale bar 0 to 30 kilometers). Legend: Landuse 1978; urban, agriculture, grassland, forest, water, wetlands, barren; Great Lakes; Counties.]

Figure 1. Land uses in the study areas for 1980 and 1991 in the DMA and TCMA, respectively.

Transportation. The absolute distance of each cell in the study area from the nearest (a) highway and (b) county road was stored in separate Arc/INFO GRID coverages. These two driving variable grids represented the potential accessibility of a location for new development. A third transportation coverage, the density of residential streets within a 5 km square window, was created to represent the tendency of an area to possess residential urban services (e.g., sewers, electricity, cable services) that could make further residential development more likely in the future.

Landscape Features. The distance from lakes and the distance from rivers were calculated and stored as separate driving variable grids in the GIS. Pijanowski et al. (in press) found that landscape topography is an influential factor contributing toward residential use. Thus, a “rolling hills” driving variable grid was created from a 90 m Digital Elevation Model (DEM). The amount of topographic variation surrounding each cell was estimated by calculating the standard deviation of all cell elevations within a 5 km square area. Cells containing larger values reflect landscapes that contain a greater amount of topographic relief around them.

Urban Services. The distance of each cell from the nearest urban cell at the start of the model period (TCMA = 1991, DMA = 1980) was calculated and stored as a separate driving variable grid. It is assumed that the costs of connecting to current urban services (e.g., sewers) increase with distance from existing urban areas.

Exclusionary Zones. In the TCMA, the following areas were treated as areas of non-development: local parks, existing urban areas, water, and public lands (including national wildlife refuges, national forests, state forests, and state parks). In the DMA, the following areas were considered areas of non-development: existing urban areas, water, and designated public lands. The locations of the exclusionary zones for both metropolitan areas are given in Figure 3.
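The “rolling hills” calculation described above is a moving-window standard deviation, which can be sketched as follows. This is a plain Python illustration, not the Arc/INFO GRID procedure used in the study. The function name is hypothetical, the use of the population standard deviation is an assumption, and windows are simply truncated at the grid edge, which is also an assumption.

```python
from statistics import pstdev


def topographic_variation(dem, half_window):
    """Standard deviation of elevations in a square window around each cell.

    dem is a list of rows of elevations; the window spans half_window cells
    on each side of the central cell and is truncated at the grid edges.
    """
    rows, cols = len(dem), len(dem[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                dem[rr][cc]
                for rr in range(max(0, r - half_window),
                                min(rows, r + half_window + 1))
                for cc in range(max(0, c - half_window),
                                min(cols, c + half_window + 1))
            ]
            out[r][c] = pstdev(window)
    return out


# Tiny hypothetical DEMs: a flat surface and one with a single raised cell.
flat = topographic_variation([[10, 10], [10, 10]], half_window=1)
bumpy = topographic_variation([[0, 0], [0, 2]], half_window=1)
```

With 100 m cells, a `half_window` of 25 cells gives a window of roughly 5 km on a side. Flat terrain yields zeros, while cells surrounded by varied elevations receive larger values.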

Neural Network Parameterization

Driving variable grids stored in the GIS were written to ASCII grid representations and then converted to a tabular format such that each location contained its spatial configuration value from each driving variable grid (i.e., each location was an input vector into the neural net). This reformatting to a tabular arrangement was necessary for input to the neural net software (see below for more details). The neural network model was based on Pijanowski et al. (in press). Briefly, each value in a driving variable grid was normalized to the range 0.0 to 1.0 by dividing it by the maximum value contained in that driving variable grid. Cells located within the exclusionary zone were removed.
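A minimal sketch of this reformatting step is given below, under stated assumptions: the grids are held as Python lists of rows rather than ASCII grid files, all values are non-negative, and the function names and the Boolean exclusion mask are hypothetical. It normalizes each driving variable grid by its own maximum and writes one input vector per cell outside the exclusionary zone.

```python
def normalize_grid(grid):
    """Scale a grid to 0.0-1.0 by dividing each value by the grid maximum."""
    max_value = max(max(row) for row in grid)
    if max_value == 0:
        return [[0.0 for _ in row] for row in grid]
    return [[value / max_value for value in row] for row in grid]


def grids_to_table(driver_grids, excluded):
    """Return one (row, col, input_vector) record per non-excluded cell."""
    normalized = [normalize_grid(grid) for grid in driver_grids]
    table = []
    for r, mask_row in enumerate(excluded):
        for c, is_excluded in enumerate(mask_row):
            if is_excluded:
                continue  # cells in the exclusionary zone are removed
            table.append((r, c, [grid[r][c] for grid in normalized]))
    return table


# Hypothetical 2 x 2 example with two driving variables.
distance_to_road = [[0, 50], [100, 200]]
street_density = [[4, 2], [0, 1]]
excluded = [[False, False], [True, False]]

table = grids_to_table([distance_to_road, street_density], excluded)
```

Each record keeps its row and column so that the neural net output can later be mapped back to the grid.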

[Figure 2 panels: LTM Driving Variables. Maps of the seven different driving variables created for the two study locations. The Detroit Metropolitan Area (DMA) is on the left and the Twin Cities Metropolitan Area (TCMA) is on the right of each pair. Pairwise driving variables are: a) distance from highways; b) distance from inland lakes; c) distance from rivers; d) distance to county roads; e) variation in topography; f) density of residential streets; g) distance from urban.]

Figure 2. Driving variable grids produced by the GIS. Each feature was processed identically for each metropolitan area. See text for details and Pijanowski et al., 2000, for descriptions of spatial rules of the Land Transformation Model.

Figure 3. Locations of the exclusionary zones for both metropolitan areas.

The Stuttgart Neural Network Simulator (SNNS) was used for training and testing. To reduce the possibility that the neural network would “overtrain”, every other cell was presented to the neural network. The SNNS “batchman” utility was used to create, train and test the neural network. A back propagation, feed-forward neural network with one input layer, one hidden layer and one output layer was used. The input layer contained seven nodes (one node for each driving variable) and the hidden layer also contained seven nodes (Figure 4). The output layer contained binary data that represented whether a cell location changed to urban (1 = change; 0 = no change) during the study period (1980 to 1995 for the DMA and 1991 to 1997 for the TCMA).

Following Pijanowski et al. (in press), the neural net was trained on the input and output data for 500 cycles. Previous experience with training on urban change data showed that 500 cycles gave a minimum mean squared error between the modeled output and the presented data. To reduce the possibility that the network would bias its learning based on the order of the data presented to it, the shuffle option in SNNS was used to present the data in random order during each training cycle. After training was completed, the network file containing all of the weights, biases and activation values was saved.

The testing exercise that followed used driving variable input from all cells in the study locations (except those located in the exclusionary zone), but with the output values removed. The network file generated from the training exercise was used to estimate output values for each location. The output was estimated as values from 0.0 (not likely to change) to 1.0 (likely to change); the output file created from this testing exercise is called a “result” file.
The actual number of cells undergoing transition during the study period for each county was then used to determine how many cells from each county “result” file should be selected to transition. Cells with values closest to 1.0 were selected as locations most likely to transition. A routine was written into the C programs that performed this calculation such that if cells possessed the same value but only a subset of them needed to be selected to transition, then the necessary number of cells of equal value were randomly selected from the pool of available cells.
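The selection rule just described can be sketched in Python as follows. This is an illustration, not the C routine used in the study: the function name, the dictionary of cell likelihoods and the seed argument are hypothetical. Ties are broken by pairing each cell with a random key before sorting, so that cells of equal value are drawn at random.

```python
import random


def select_transitions(likelihoods, n_cells, seed=None):
    """Select the n_cells with values closest to 1.0; ties broken at random.

    likelihoods maps a cell identifier to its value from the result file.
    """
    rng = random.Random(seed)
    # The random second key only matters when two cells share a value.
    keyed = [(value, rng.random(), cell)
             for cell, value in likelihoods.items()]
    keyed.sort(key=lambda item: (-item[0], item[1]))
    return {cell for _, _, cell in keyed[:n_cells]}


# Hypothetical county result values: cells "b" and "c" are tied.
county_result = {"a": 0.9, "b": 0.5, "c": 0.5, "d": 0.1}
selected = select_transitions(county_result, n_cells=2, seed=1)
```

Here `n_cells` would be set to the observed number of transitions in the county, so the model is asked only where change occurs, not how much.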

LTM Accuracy Assessment

Cells that were predicted to transition to urban (according to the model output) were compared with the cells that actually did transition during the time period of study. The number of cells falling into both categories was then divided by the actual number of cells transitioning to obtain a percent correct match (PCM) metric, calculated as follows:

$$\mathrm{PCM} = \frac{\text{\# cells correctly predicted to change}}{\text{\# cells actually transitioning}} \times 100.0 \qquad (2)$$
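The error coding used in the spatial error analysis (values 0 to 3) and the PCM of equation (2) can be combined in a short Python sketch. The function names and the flat lists of cell values are assumptions made for the example; in the study these operations were carried out on grids in the GIS.

```python
def error_code(observed, predicted):
    """0 = no change, none predicted; 1 = observed change only;
    2 = predicted change only; 3 = change observed and predicted."""
    return (1 if observed else 0) + (2 if predicted else 0)


def percent_correct_match(observed, predicted):
    """Equation (2): correctly predicted cells over cells actually changing."""
    codes = [error_code(o, p) for o, p in zip(observed, predicted)]
    correct = codes.count(3)
    actual = codes.count(1) + codes.count(3)
    if actual == 0:
        raise ValueError("no cells transitioned in the observed data")
    return correct / actual * 100.0


# Hypothetical cells: 1 = transition to urban, 0 = no transition.
observed = [1, 1, 1, 0, 0, 1]
predicted = [1, 0, 1, 1, 0, 0]

codes = [error_code(o, p) for o, p in zip(observed, predicted)]
pcm = percent_correct_match(observed, predicted)
```

In this example two of the four observed transitions are matched by the model, so the PCM is 50.0. Cells coded 1 and 2 are the errors that would be mapped to look for missing driving variables.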

Neural Network Learning Exercises

We conducted two different types of training and testing exercises, which are depicted graphically in Figure 5. Briefly, we tested (1) whether the neural nets could generate network files that could be applied between the two study areas (Figure 5A) and (2) whether “learning” from the entire regional dataset and applying it to a local dataset (one of the counties in the study area) was better than learning directly from driving variable grids created from a subset (the county or local area) of the regional dataset (Figure 5B). We refer to the first set of exercises as a test of “internal” versus “external” learning. The second set of exercises is referred to as the “regional” versus “local” learning exercises. We used the PCM metric described in equation (2) to compare the performance of models generated from the different approaches.

The “internal” and “external” learning exercises differed only in how the testing was accomplished. As illustrated in Figure 5A, “internal” learning exercises were tested using the network file generated from the same regional set of driving variable grids. The “external” learning exercise, on the other hand, was tested by swapping the network files created by training on the TCMA and DMA regional driving variable grid sets. This “swapping” of the network file between locations was possible because the same driving variables were used for each region and were calculated in the same manner.

[Figure 4 diagram: Neural Network Training Exercise. Seven driving variable inputs (Distance to Road, Density of Streets, Distance to Hwy, Distance to Urban, Distance to Lake, Distance to River, Rolling Hills) feed the input, hidden and output layers. Weights are assigned to estimate the output (presence=1 or absence=0 of a transition), the error is estimated from the observed dataset (back-propagated errors), and the process is repeated for hundreds or thousands of iterations in order to minimize error. The weights are then saved to a network file (Local, TCMA or DMA).]

Figure 4. Illustration of how the neural nets were used with the driving variable grids (e.g., distance to roads) and their relationship to the output and network files.

A “local” versus “regional” learning exercise examined whether the network file generated from regional training, within the same metropolitan area where the local area (e.g., county) is located, was as good as a network file created by training and testing on the single county. For each of the two regions, the county with the lowest PCM was selected for this exercise (Scott County for the TCMA and Washtenaw County for the DMA).

PID Calculation Methodology

Principal Index Driver (PID) values were calculated on a county-by-county basis using equation (1) and then aggregated to each metropolitan area. Population estimates for the year 2020 for each county were obtained from the US Census Bureau. The PID was then calculated using population density estimates for the entire metropolitan area, per equation (1) above.

RESULTS AND DISCUSSION

Summary of Land Use Changes

Within the TCMA, urban use was 21% of the entire study area in 1991 and 23% in 1997, which represents a total of 35,588 cells transitioning to an urban land use during the 6-year period. In the DMA, urban use comprised 26% of the study area in 1980 and 33% in 1995, with 87,684 cells transitioning to urban during the 15-year period.

“Internal” and “External” Learning Exercises Figure 6 summarizes the results of the LTM runs for each of the two study areas and for the “internal” versus “external” learning exercises. Shown are the PCM metrics described in equation (2). PCM, as grouped by county, range from the highest of 44% (Dakota County in the TCMA) to the lowest of both metropolitan areas of 18% (Washtenaw County in the DMA). All of the counties for the TCMA study area had greater than 30% PCM while only three of the seven for DMA had an accuracy of 30% PCM or more. PCM on average was 37% and 34% for TCMA and DMA, respectively. The three counties in the with the lowest PCM, Monroe, Washtenaw, and Livingston Counties might be considered “outliers” as urban transitions occurring in these counties are likely to be related to spatial factors that the neural network was not provided. For example, Monroe, a county along the Michigan-Ohio border, is likely influenced by Toledo, Ohio. Washtenaw and Livingston Counties are also influenced by larger regional trends occurring in medium sized metropolitan areas of Jackson and Lansing. Factors such as these were not provided to the neural network. Alternatively, it is possible that an important driving variable or more is missing from the DMA model. For example, development along the Great

ASPRS Proceedings 2001, St. Louis, Mo, April 21,26, 2001.

8

A. Neural Network Testing Exercise – Internal vs. External Learning Internal Network File

DMA 1980 Land Use

DMA

DMA 1995 Urban Use – Predicted

Network File

TCMA External

DMA 1995 Urban Use – Observed

Proportion of Matches in Predicted and Observed

B. Neural Network Testing Exercise – Local vs.Regional Learning Local Network File

Washtenaw 1980 Land Use

Local

Washtenaw 1995 Urban Use – Predicted

Network File

Proportion of Matches in Predicted and Observed

DMA

Regional

Washtenaw 1995 Urban Use – Observed

Figure 5. Systems representation of how the two neural net exercises were accomplished using input from land use (and driving variable grids) and the various network files generated from the training exercises. The last step also illustrates how the percent correct match (PCM) are made.


Lakes shoreline in St. Clair and Monroe Counties may not be represented in the driving variable grids developed for the DMA. Thus, providing the neural network with additional driving variables may improve the ability of the neural nets to learn about urban transitions in these areas.

[Figure 6 bar chart: PCM (horizontal axis, 0 to 50) for the DMA average, the TCMA average, and each county (Monroe, Washtenaw, Livingston, St. Clair, Macomb, Wayne, Oakland, Washington, Scott, Ramsey, Hennepin, Dakota, Carver, Anoka), with separate bars for the internal and external exercises.]

Figure 6. The PCM for each of the counties, by internal and external learning exercises.

Figure 6 also summarizes the results from the “external” learning exercise, with PCM for each county shown in purple. Applying the DMA network file to the TCMA driving variable grids yielded PCM values for the TCMA nearly as good as the “internal” DMA learning exercise. Interestingly, the network file generated from the TCMA was much less accurate when applied to the DMA data. There are several reasons why this swapping exercise may have had this result. First, the DMA neural net may be more generalized than the TCMA neural net for the number of training cycles used (500). More generalized neural nets are known to perform better on data that they have never seen (Skapura, 1996). Second, the pattern of urban development may be less complex in the TCMA than in the DMA; this would explain why the TCMA outperformed the DMA in its overall fit of the PCM metric. Third, a strong driving variable that is not included may be very important to development patterns in the DMA but not in the TCMA. Given that some of the counties in the DMA lie along the shoreline of the Great Lakes (Monroe County is along Lake Erie and St. Clair County is along Lake St. Clair), this is entirely possible. Pijanowski et al.
(in press) found that distance to lakeshore was an important driving variable in the Grand Traverse Bay Watershed, which contains six counties, five of which are located along a Great Lakes coastline. It would be expected that the pattern of urban development along a coastline would differ greatly from development outward from a city core.

Regional and Local Learning Exercises
The counties of Scott (TCMA) and Washtenaw (DMA) were selected for the “regional” versus “local” learning exercise. The “regional” network file applied to the testing of Scott County, TCMA, resulted in a 31.9% PCM; training and testing only on driving variable grid data carved out of the regional dataset for Scott County, as part of the local learning exercise, yielded a 33.3% PCM, which is only a slight improvement in model performance. Likewise, the regional network file applied to the testing of Washtenaw County, DMA, resulted in an 18.6% PCM; training and testing only on data from Washtenaw County produced a slightly larger improvement in model performance, with a 23.8% PCM. The results of the “local” versus “regional” training and testing exercise suggest that the neural net is not biased when presented with a dataset from a larger area. It is thus able to generalize across a large region and can perform about as well as when it learns only about transitions that occur in the smaller, localized area. The lack of a significant improvement in PCM using a “local” learning model versus a “regional” learning model suggests that, at least for some of the counties in the metropolitan areas, one or more driving variables may need to be introduced into the final model to improve its ability to predict future urban transitions. More attention should be given to examining the locations where the model did not predict transitions accurately in order to identify possible new driving variables.
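The comparisons in these two learning exercises all reduce to scoring one observed transition grid against predictions made with different network files. The sketch below illustrates that scoring step only. Equation (2) is not reproduced in this section, so the PCM definition used here (cells both predicted and observed to transition, divided by cells observed to transition, times 100) is an assumption, and the function names and toy grids are hypothetical rather than part of the LTM code.

```python
def percent_correct_match(predicted, observed):
    """Return PCM: percent of observed urban transitions that were also predicted.

    Both arguments are equal-length sequences of 0/1 flags, one per grid cell,
    where 1 marks a transition to urban. This definition is an assumption made
    for illustration; it is not copied from the paper's equation (2).
    """
    if len(predicted) != len(observed):
        raise ValueError("grids must contain the same number of cells")
    observed_transitions = sum(observed)
    if observed_transitions == 0:
        raise ValueError("no observed transitions to score against")
    matches = sum(1 for p, o in zip(predicted, observed) if p == 1 and o == 1)
    return 100.0 * matches / observed_transitions


def compare_network_files(observed, predictions_by_network):
    """Score several predictions of the same area against one observed grid."""
    return {
        name: percent_correct_match(predicted, observed)
        for name, predicted in predictions_by_network.items()
    }


# Hypothetical ten-cell study area with four observed transitions.
observed = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predictions = {
    # Network trained on the same region it is tested on.
    "internal": [1, 1, 0, 0, 1, 1, 0, 0, 0, 0],
    # Network trained on the other region, then applied here.
    "external": [1, 0, 0, 0, 1, 1, 1, 0, 0, 0],
}
scores = compare_network_files(observed, predictions)
print(scores)
```

The same helper could score a “local” against a “regional” network file by passing the two county-level predictions under those labels.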

Urban Transitions to 2020
According to the U.S. Census Bureau, the seven-county TCMA region will have a population of 2,905,880 in the year 2020. This represents an increase of over 617,000 people above the 1990 census figures. Assuming the same population density, a total of 53,000 new urban cells will be needed in the TCMA by the year 2020. The seven-county DMA, on the other hand, is projected to have a population of 4,765,900 in the year 2020. Based on the 1995 population density estimates, another 169,000 new urban cells will be needed between 1995 and 2020.

Future urban growth in the TCMA (Figure 7) appears to be focused east of the city of Eagan in Dakota County and northeast of the city of Woodbury in Washington County. In addition, a great deal of dispersed development is anticipated in the western portion of the metropolitan area around the many inland lakes in southwestern Hennepin County. Dispersed development may also occur in the northern portion of the TCMA, especially along the US-12 corridor.

[Figure 7 map of the Twin Cities: legend shows cities, counties, water and parks, and urban use over time (1991, 1997, and 2020); scale bar 0 to 20 km.]

Figure 7. Time series of urban uses in the TCMA.

New urban use in the DMA (Figure 8) that may occur between 1995 and the year 2020 appears most prominently in the northwestern portion of Oakland County and around the city of Howell in Livingston County. The model predicts that these development patterns will be dispersed rather than aggregated. A concentrated expansion of urban growth is anticipated north of the city of Rochester in Macomb County. What is interesting from these forecasting results is that, for both metropolitan areas, the model predicts some areas of clumped or aggregated urban development as well as areas of dispersed development. The results also illustrate how urban development occurs along transportation corridors.

It should be noted that the PCM used throughout this study does not reflect the overall performance of the model. The metric was developed and used because it is sensitive to urban change across different types of study regions where urban development occurs at different rates (Pijanowski et al., in press). A more complete measure of overall model performance would also take into account the accuracy of the model in predicting cells with no urban transition, which make up the majority of cells in most study areas. For example, the TCMA and the DMA experienced no change in land use in over 75% of the entire study area. When this is taken into account in the overall accuracy assessment, the model predicts the occurrence of urban and non-urban transitions with an accuracy of 93% or more (see Pijanowski et al., in press, for more details). In comparison to other LTM applications that have used the GIS and neural net framework, the TCMA and DMA models perform less well than either the Grand Traverse Bay Watershed application, which produced a 49% PCM, or the Kuala Lumpur, Malaysia application, which produced a 78% PCM.
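To illustrate why PCM and overall accuracy can differ so much, the following sketch scores one toy prediction both ways. It is not the authors' assessment procedure: the PCM definition (hits divided by observed transitions), the function name, and the 100-cell grid are assumptions chosen only to show how counting correctly predicted non-transition cells raises the overall figure.

```python
def accuracy_summary(predicted, observed):
    """Return (pcm, overall) in percent for two equal-length 0/1 sequences.

    pcm     : observed transitions that were also predicted (assumed definition)
    overall : all cells, transition or not, whose predicted state matches
    """
    if len(predicted) != len(observed):
        raise ValueError("grids must contain the same number of cells")
    pairs = list(zip(predicted, observed))
    hits = sum(1 for p, o in pairs if p == 1 and o == 1)
    correct_non_transitions = sum(1 for p, o in pairs if p == 0 and o == 0)
    observed_transitions = sum(observed)
    if observed_transitions == 0:
        raise ValueError("no observed transitions to score against")
    pcm = 100.0 * hits / observed_transitions
    overall = 100.0 * (hits + correct_non_transitions) / len(pairs)
    return pcm, overall


# Hypothetical grid: 100 cells, 10 of which actually became urban.
observed = [1] * 10 + [0] * 90
# The model also places 10 new urban cells, but only 4 in the right locations.
predicted = [1] * 4 + [0] * 6 + [1] * 6 + [0] * 84

pcm, overall = accuracy_summary(predicted, observed)
print(f"PCM = {pcm:.1f}%, overall accuracy = {overall:.1f}%")
```

In this toy case most cells never change, so the overall figure is dominated by correctly predicted non-transitions even though fewer than half of the actual transitions are located correctly.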

CONCLUSIONS
We parameterized the GIS and neural net-based LTM for the Detroit and Twin Cities Metropolitan Areas. We built several neural net models and attempted to test whether these models were transferable across the two metropolitan regions and whether a regional model provided as good a fit as a locally parameterized model. The overall accuracy of the model to predict urban transitions was 37% and 33% for the TCMA and DMA, respectively. Our “internal” versus “external” learning exercise resulted in models that appeared to be fairly transferable in one case (DMA applied to TCMA) and not well transferable in the other case (TCMA applied to DMA). The “local” versus “regional” exercise produced results suggesting that learning from larger scale spatial patterns does not reduce the ability of the model to predict smaller, local trends. The results of both modeling exercises suggest that some important driving variables are currently missing from the models. We suggest that information on the distance of cells from the Great Lakes shoreline in the DMA study area may improve model performance. Larger, more regional trends in the DMA may also be important, as many of the small towns in the DMA (e.g., Howell, Brighton, Monroe) undergoing urbanization are also influenced by the medium-sized metropolitan areas of Lansing, Jackson, and Toledo, Ohio. A more rigorous examination of model errors may also help identify what is missing from the current model configurations.

[Figure 8 map of the Detroit area: legend shows cities, counties, Great Lakes, water and parks, and urban use over time (1980, 1995, and 2020); scale bar 0 to 20 km.]

Figure 8. Time series of urban use for the DMA.

ACKNOWLEDGEMENTS
This work was supported in part by NASA grant NAG13-99002 to the University of Minnesota, University of Wisconsin and Michigan State University entitled “An Upper Great Lakes Regional Environmental Science Applications Center (RESAC)”. We also acknowledge the cooperation of the Southeast Michigan Council of Governments (SEMCOG) and the Twin Cities Metropolitan Council (TCMC) for providing the land use and land features data used in the modeling exercises presented here. We would also like to thank Snehal Pithadia (MSU), Gaurav Manik (MSU), and Fei Yuan (UM), who worked on the project developing databases and writing customized programs used in the modeling work.

REFERENCES
Anderson, J.R., E.E. Hardy, J.T. Roach, & R.E. Witmer. (1976). A Land Use and Land Cover Classification System for Use with Remote Sensor Data. U.S. Geological Survey, Professional Paper 964, p 28, Reston, VA.
Atkinson, P. & A. Tatnall. (1997). Neural networks in remote sensing. International Journal of Remote Sensing, 18(4), 699-709.
Babaian, R., H. Miyashita, R. Evans, A. Eshenbach, & E. Ramimrez. (1997). Early detection program for prostate cancer: Results and identification of high-risk patient population. Urology, 37(3), 193-197.
Boutt, D.F., D.W. Hyndman, B.C. Pijanowski, & D.T. Long. (in press). Identifying Potential Land Use-derived Solute Sources to Stream Base flow Using Ground Water Models and GIS. Groundwater.
Brown, D.G., B.C. Pijanowski & J.D. Duh. (2000). Modeling the Relationships between Land-Use and Land-Cover on Private Lands in the Upper Midwest, USA. Journal of Environmental Management.
Brown, D.G., D.P. Lusch, & K.A. Duda. (1998). Supervised classification of glaciated landscape types using digital elevation data. Geomorphology, 21(3-4), 233-250.
Drummond, S., A. Joshi, & K. Sudduth. (1998). Application of Neural Networks: Precision Farming. IEEE Transactions on Neural Networks, 211-215.
Environmental Systems Research Institute. (1999). Cell-based modeling with GRID. Environmental System Research Institute, Atlanta, Georgia.
Fishman, M., M. Barr, S. Dean, & W.J. Loick. (1991). Using neural nets in market analysis. Technical Analysis of Stocks & Commodities, 4, 18-21.
Fukushima, K., S. Miyake, & T. Takayuki. (1983). Neocognitron: A neural network model for a mechanism of visual pattern recognition. IEEE Transactions on Systems, Man, and Cybernetics, SMC-13(5), 826-834.
Pijanowski, B.C., D.T. Long, S.H. Gage and W.E. Cooper. (1997). A Land Transformation Model: Conceptual Elements, Spatial Object Class Hierarchies, GIS Command Syntax and an Application to Michigan's Saginaw Bay Watershed. Land Use Modeling Workshop. Sioux Falls, South Dakota, June 3-5, 1997.
Pijanowski, B.C., S.H. Gage, and D.T. Long. (2000). A Land Transformation Model: Integrating Policy, Socioeconomics and Environmental Drivers using a Geographic Information System. In Landscape Ecology: A Top Down Approach, Larry Harris and James Sanderson eds.
Pijanowski, B.C., D. Brown, B. Shellito and G. Manik. (in press). Using neural networks and GIS to forecast land use changes: A Land Transformation Model. Computers, Environment and Urban Systems.
Ritter, N., N. Logan & N. Bryant. (1988). Integration of neural network technologies with geographic information systems. Proceedings of the GIS Symposium: Integrating Technology and Geoscience Applications, Denver, Colorado, 102-103.
Rosenblatt, F. (1958). The perceptron: a probabilistic model for information storage and organization in the brain. Psychological Review, 65, 386-408.
Rumelhart, D., G. Hinton, & R. Williams. (1986). Learning internal representations by error propagation. Parallel Distributed Processing: Explorations in the Microstructures of Cognition, 1, edited by D.E. Rumelhart and J.L. McClelland (Cambridge: MIT Press), pp. 318-362.
Sawaya, K., F. Yuan and M. Bauer. (2001). Monitoring landscape change with Landsat classifications. ASPRS Proceedings. St. Louis, Mo. April 21-26, 2001.
Skapura, D. (1996). Building neural networks. New York: ACM Press.
